Supporting OLAP operations over imperfectly integrated taxonomies

Yan Qi, Kasim Candan, Junichi Tatemura, Songting Chen, Fenglin Liao

Research output: Chapter in Book/Report/Conference proceedingConference contribution

11 Scopus citations

Abstract

OLAP is an important tool in decision support. With the help of domain knowledge, such as hierarchies of attribute values, OLAP helps the user observe the effects of various decisions. One assumption of most OLAP operations is that the available domain knowledge is precise. In particular, they assume that the hierarchy of values over which the user can navigate forms a taxonomy. In this paper, we first note that when multiple lieterog sources are involved in the gathering of the data and the associated domain knowledge, the integrated knowledge-base, constructed by combining locally available taxonomies based on the concept matchings, may not be a taxonomy itself. Specifically, existence of intersections among concepts from different sources compromises the tree-structure of the integrated taxonomy and prevents effective use of hierarchical navigation techniques, such as drill-down and roll-up. To cope with this, we introduce concept un-classification, where a select few of the concepts are eliminated to ensure that the remaining structure is a navigable taxonomy, without concept intersections. Since un-classifying an originally classified data is not desirable, we consider ways to minimize un-classification in the process. We introduce a cost model which captures the imprecision caused by the un-classification process and we formulate the problem of finding an un-classification strategy which eliminates intersections and which adds minimal imprecision to the resulting structure. We show that, when performed naively, this task can be very costly and thus we propose a bottom-up preprocessing strategy which supports basic navigational analytics operations, such as drill-down and roll-up. Experiments over synthetic and real-life data verified the effectiveness and efficiency of our approach.

Original languageEnglish (US)
Title of host publicationSIGMOD 2008
Subtitle of host publicationProceedings of the ACM SIGMOD International Conference on Management of Data 2008
Pages875-888
Number of pages14
DOIs
StatePublished - 2008
Event2008 ACM SIGMOD International Conference on Management of Data 2008, SIGMOD'08 - Vancouver, BC, Canada
Duration: Jun 9 2008Jun 12 2008

Publication series

NameProceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print)0730-8078

Other

Other2008 ACM SIGMOD International Conference on Management of Data 2008, SIGMOD'08
Country/TerritoryCanada
CityVancouver, BC
Period6/9/086/12/08

Keywords

  • Imperfect integration
  • Imprecise data
  • OLAP
  • Taxonomy correction

ASJC Scopus subject areas

  • Software
  • Information Systems

Fingerprint

Dive into the research topics of 'Supporting OLAP operations over imperfectly integrated taxonomies'. Together they form a unique fingerprint.

Cite this