AFilter: Adaptable XML filtering with prefix-caching and suffix-clustering

K. Selçuk Candan, Wang Pin Hsiung, Songting Chen, Junichi Tatemura, Divyakant Agrawal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

63 Scopus citations

Abstract

The XML message filtering problem involves searching for instances of a given, potentially large, set of patterns in a continuous stream of XML messages. Since the messages arrive continuously, it is essential that the filtering rate matches the data arrival rate. Therefore, the given set of filter patterns needs to be indexed appropriately to enable real-time processing of the streaming XML data. In this paper, we propose AFilter, an adaptable, and thus scalable, path expression filtering approach. AFilter has a base memory requirement linear in the filter expression and data size. Furthermore, when additional memory is available, AFilter can exploit prefix commonalities in the set of filter expressions using a loosely coupled prefix caching mechanism, as opposed to the tightly coupled active state representation of alternative approaches. Unlike existing systems, AFilter can also exploit suffix commonalities across filter expressions, while simultaneously leveraging the prefix commonalities through the cache. Finally, AFilter uses a triggering mechanism to prevent excessive consumption of resources by delaying processing until a trigger condition is observed. Experimental results show that AFilter provides significantly better scalability and runtime performance when compared to state-of-the-art filtering systems.

Original language: English (US)
Title of host publication: VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases
Publisher: Association for Computing Machinery
Pages: 559-570
Number of pages: 12
ISBN (Print): 1595933859, 9781595933850
State: Published - 2006
Externally published: Yes
Event: 32nd International Conference on Very Large Data Bases, VLDB 2006 - Seoul, Korea, Republic of
Duration: Sep 12 2006 → Sep 15 2006

Publication series

Name: VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases

Other

Other: 32nd International Conference on Very Large Data Bases, VLDB 2006
Country/Territory: Korea, Republic of
City: Seoul
Period: 9/12/06 → 9/15/06

ASJC Scopus subject areas

  • Information Systems and Management
  • Hardware and Architecture
  • Information Systems
  • Software
