Propagation-vectors for trees (PVT): Concise yet effective summaries for hierarchical data and trees

Venkata S. Cherukuri, Kasim Candan

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

3 Citations (Scopus)

Abstract

Summarization of hierarchical data and metadata is a fundamental operation in applications in many domains. In particular, similarity search of hierarchical data, such as XML, would benefit greatly from concise and indexable summaries. This is especially true in P2P scenarios, where the search needs to be done in a distributed fashion on multiple peers. This situation requires summaries which are small, yet effective in identifying potential peers that need to be further explored. In this paper, we propose a method, called propagation-vectors for trees (PVT), which constructs very concise and accurate summaries of hierarchical data, such as XML trees. We then show how to use this summary to perform similarity search on summarized data. The proposed summarization scheme relies on a label-propagation mechanism, which constructs an n-dimensional vector from a given tree with n unique data labels. Experimental results have shown that the constructed PVT summaries capture the structure of the input trees very accurately, that the representations are highly concise, and that the search based on these summaries is faster than the existing approaches.
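
The abstract does not spell out the propagation rule, so the following is only a minimal sketch of the general idea, not the authors' PVT algorithm. It assumes a hypothetical scheme in which each node contributes a weight to its label's dimension and that weight halves with every level of depth; the names `propagation_vector`, `cosine`, and the `decay` parameter are illustrative assumptions. It shows how a tree with n unique labels can be reduced to an n-dimensional vector and how two such vectors can be compared.

```python
import math
from collections import defaultdict


def propagation_vector(tree, decay=0.5):
    """Summarize a tree as a {label: weight} vector.

    `tree` is a nested tuple (label, [children]). The depth-decay rule used
    here is an assumption for illustration, not the rule from the paper.
    """
    vec = defaultdict(float)

    def visit(node, weight):
        label, children = node
        vec[label] += weight  # one dimension per unique label
        for child in children:
            visit(child, weight * decay)  # weight shrinks with depth

    visit(tree, 1.0)
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse {label: weight} vectors."""
    dot = sum(w * v.get(label, 0.0) for label, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical XML-like trees used only as example input.
book = ("book", [("title", []), ("author", [("name", [])])])
movie = ("movie", [("director", [])])

book_vec = propagation_vector(book)
movie_vec = propagation_vector(movie)
print(book_vec)
print(cosine(book_vec, movie_vec))
```

In a P2P setting, a peer could publish such a vector instead of its full tree, and a query would be forwarded only to peers whose vectors are similar enough to the query's vector; the threshold and the routing policy are outside this sketch.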

Original language: English (US)
Title of host publication: International Conference on Information and Knowledge Management, Proceedings
Pages: 3-10
Number of pages: 8
DOI: https://doi.org/10.1145/1458469.1458481
State: Published - 2008
Event: 2008 ACM Workshop on Large-Scale Distributed Systems for Information Retrieval, LSDS-IR'08, Co-located with the 17th ACM Conference on Information and Knowledge Management, CIKM'08 - Napa Valley, CA, United States
Duration: Oct 26 2008 to Oct 30 2008

Fingerprint

Propagation
Summarization
Peers
Similarity search
Metadata
Scenarios
Propagation mechanism

ASJC Scopus subject areas

  • Business, Management and Accounting (all)
  • Decision Sciences (all)

Cite this

Cherukuri, V. S., & Candan, K. (2008). Propagation-vectors for trees (PVT): Concise yet effective summaries for hierarchical data and trees. In International Conference on Information and Knowledge Management, Proceedings (pp. 3-10). https://doi.org/10.1145/1458469.1458481

@inproceedings{1b007f1da7324e2fb0e2d6e4f9790a2f,
title = "Propagation-vectors for trees (PVT): Concise yet effective summaries for hierarchical data and trees",
abstract = "Summarization of hierarchical data and metadata is a fundamental operation in applications in many domains. In particular, similarity search of hierarchical data, such as XML, would benefit greatly from concise and indexable summaries. This is especially true in P2P scenarios, where the search needs to be done in a distributed fashion on multiple peers. This situation requires summaries which are small, yet effective in identifying potential peers that need to be further explored. In this paper, we propose a method, called propagation-vectors for trees (PVT), which constructs very concise and accurate summaries of hierarchical data, such as XML trees. We then show how to use this summary to perform similarity search on summarized data. The proposed summarization scheme relies on a label-propagation mechanism, which constructs an n-dimensional vector from a given tree with n unique data labels. Experimental results have shown that the constructed PVT summaries capture the structure of the input trees very accurately, that the representations are highly concise, and that the search based on these summaries is faster than the existing approaches.",
author = "Cherukuri, {Venkata S.} and Kasim Candan",
year = "2008",
doi = "10.1145/1458469.1458481",
language = "English (US)",
isbn = "9781605582542",
pages = "3--10",
booktitle = "International Conference on Information and Knowledge Management, Proceedings",

}

TY - GEN

T1 - Propagation-vectors for trees (PVT)

T2 - Concise yet effective summaries for hierarchical data and trees

AU - Cherukuri, Venkata S.

AU - Candan, Kasim

PY - 2008

Y1 - 2008

N2 - Summarization of hierarchical data and metadata is a fundamental operation in applications in many domains. In particular, similarity search of hierarchical data, such as XML, would benefit greatly from concise and indexable summaries. This is especially true in P2P scenarios, where the search needs to be done in a distributed fashion on multiple peers. This situation requires summaries which are small, yet effective in identifying potential peers that need to be further explored. In this paper, we propose a method, called propagation-vectors for trees (PVT), which constructs very concise and accurate summaries of hierarchical data, such as XML trees. We then show how to use this summary to perform similarity search on summarized data. The proposed summarization scheme relies on a label-propagation mechanism, which constructs an n-dimensional vector from a given tree with n unique data labels. Experimental results have shown that the constructed PVT summaries capture the structure of the input trees very accurately, that the representations are highly concise, and that the search based on these summaries is faster than the existing approaches.

AB - Summarization of hierarchical data and metadata is a fundamental operation in applications in many domains. In particular, similarity search of hierarchical data, such as XML, would benefit greatly from concise and indexable summaries. This is especially true in P2P scenarios, where the search needs to be done in a distributed fashion on multiple peers. This situation requires summaries which are small, yet effective in identifying potential peers that need to be further explored. In this paper, we propose a method, called propagation-vectors for trees (PVT), which constructs very concise and accurate summaries of hierarchical data, such as XML trees. We then show how to use this summary to perform similarity search on summarized data. The proposed summarization scheme relies on a label-propagation mechanism, which constructs an n-dimensional vector from a given tree with n unique data labels. Experimental results have shown that the constructed PVT summaries capture the structure of the input trees very accurately, that the representations are highly concise, and that the search based on these summaries is faster than the existing approaches.

UR - http://www.scopus.com/inward/record.url?scp=70349331234&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70349331234&partnerID=8YFLogxK

U2 - 10.1145/1458469.1458481

DO - 10.1145/1458469.1458481

M3 - Conference contribution

AN - SCOPUS:70349331234

SN - 9781605582542

SP - 3

EP - 10

BT - International Conference on Information and Knowledge Management, Proceedings

ER -