HCS: Hierarchical cut selection for efficiently processing queries on data columns using hierarchical bitmap indices

Parth Nagarkar, Kasim Candan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

When data are large and query processing workloads consist of data selection and aggregation operations (as in online analytical processing), column-oriented data stores are generally the preferred choice of data organization, because they enable effective data compression, leading to significantly reduced IO. Most column-store architectures leverage bitmap indices, which themselves can be compressed, for answering queries over data columns. Column domains (e.g., geographical data, categorical data, biological taxonomies, organizational data) are hierarchical in nature, and it may be more advantageous to create hierarchical bitmap indices that can help answer queries over different sub-ranges of the domain. However, given a query workload, it is critical to choose the appropriate subset of bitmap indices from the given hierarchy. Thus, in this paper, we introduce the cut-selection problem, which aims to help identify a subset (cut) of the nodes of the domain hierarchy, with the appropriate bitmap indices. We discuss inclusive, exclusive, and hybrid strategies for cut-selection and show that the hybrid strategy can be efficiently computed and returns optimal (in terms of IO) results in cases where there are no memory constraints. We also show that when there is a memory availability constraint, the cut-selection problem becomes difficult and, thus, present efficient cut-selection strategies that return close to optimal results, especially in situations where the memory limitations are very strict (i.e., the data and the hierarchy are much larger than the available memory). Experimental results confirm the efficiency and effectiveness of the proposed cut-selection algorithms.
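The abstract names inclusive, exclusive, and hybrid strategies without defining them. The toy sketch below illustrates one plausible reading, which is an assumption here rather than the paper's definition: an inclusive plan ORs the bitmaps of the queried leaves, an exclusive plan starts from an ancestor's bitmap and clears the non-queried leaves, and a hybrid plan picks whichever reads fewer bitmaps. The column values, the single-parent hierarchy, the function names, and the use of "bitmaps read" as a stand-in for IO cost are all hypothetical. Memory constraints and the cut-selection algorithms themselves are not modeled.

```python
from functools import reduce

# Toy column: each row holds one leaf value of a hypothetical hierarchy.
column = ["AZ", "CA", "NV", "AZ", "TX", "CA", "NV", "CA"]
leaves = ["AZ", "CA", "NV", "TX"]  # children of one internal node, "US"


def leaf_bitmap(value):
    """Bitmap as a Python int: bit i is set when row i equals `value`."""
    return sum(1 << i for i, v in enumerate(column) if v == value)


bitmaps = {leaf: leaf_bitmap(leaf) for leaf in leaves}
# The internal node's bitmap is the OR of its children's bitmaps.
bitmaps["US"] = reduce(lambda a, b: a | b, (bitmaps[l] for l in leaves))


def inclusive(query):
    """OR together the bitmaps of the queried leaves."""
    result = reduce(lambda a, b: a | b, (bitmaps[l] for l in query), 0)
    return result, len(query)  # (answer bitmap, bitmaps read)


def exclusive(query, parent="US"):
    """Start from the parent's bitmap and clear the non-queried leaves."""
    others = [l for l in leaves if l not in query]
    result = bitmaps[parent]
    for l in others:
        result &= ~bitmaps[l]
    return result, 1 + len(others)  # parent bitmap plus each excluded leaf


def hybrid(query):
    """Pick whichever plan reads fewer bitmaps (assumed cost proxy)."""
    return min(inclusive(query), exclusive(query), key=lambda r: r[1])
```

In this toy setup, a query covering three of the four leaves is cheaper under the exclusive plan (two bitmaps read instead of three), while a single-leaf query is cheaper under the inclusive plan. That asymmetry is the kind of trade-off a hybrid choice would exploit.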

Original language: English (US)
Title of host publication: Advances in Database Technology - EDBT 2014: 17th International Conference on Extending Database Technology, Proceedings
Publisher: OpenProceedings.org, University of Konstanz, University Library
Pages: 271-282
Number of pages: 12
ISBN (Electronic): 9783893180653
DOIs: https://doi.org/10.5441/002/edbt.2014.26
State: Published - 2014
Event: 17th International Conference on Extending Database Technology, EDBT 2014 - Athens, Greece
Duration: Mar 24 2014 → Mar 28 2014

Other

Other: 17th International Conference on Extending Database Technology, EDBT 2014
Country: Greece
City: Athens
Period: 3/24/14 → 3/28/14

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Software


  • Cite this

    Nagarkar, P., & Candan, K. (2014). HCS: Hierarchical cut selection for efficiently processing queries on data columns using hierarchical bitmap indices. In Advances in Database Technology - EDBT 2014: 17th International Conference on Extending Database Technology, Proceedings (pp. 271-282). OpenProceedings.org, University of Konstanz, University Library. https://doi.org/10.5441/002/edbt.2014.26