Feedback-driven result ranking and query refinement for exploring semi-structured data collections

Huiping Cao, Yan Qi, Kasim Candan, Maria Luisa Sapino

Research output: Chapter in Book/Report/Conference proceedingConference contribution

23 Scopus citations

Abstract

Feedback process has been used extensively in document-centric applications, such as text retrieval and multimedia retrieval. Recently, there have been efforts to apply feedback to semi-structured XML document collections as well. In this paper, we note that feedback can also be an effective tool for exploring (through result ranking and query refinement) large semi-structured data collections. In particular, in large scale data sharing and curation environments, where the user may not know the structure of the data, queries may initially be overly vague. Given a path query and a set of results identified by the system to this query over the data, we consider two types of feedback: Soft feedback captures the user's preference for some features over the others. Hard feedback, on the other hand, expresses users' assertions regarding whether certain features should be further enforced or, in contrast, are to be avoided. Both soft and hard feedback can be "positive" or "negative". For soft feedback, we develop a probabilistic feature significance measure and describe how to use this for ranking results in the presence of dependencies between the path features. To deal with the hard feedback efficiently (i.e., fast enough for interactive exploration), we present finite automata based query refinement solutions. In particular, we present a novel LazyDFA+ algorithm for managing hard feedback. We also describe optimizations that leverage the inherently iterative nature of the feedback process. We bring together these techniques in AXP, a system for adaptive and exploratory path retrieval. The experimental results show the effectiveness of the proposed techniques.

Original languageEnglish (US)
Title of host publicationAdvances in Database Technology - EDBT 2010 - 13th International Conference on Extending Database Technology, Proceedings
Pages3-14
Number of pages12
DOIs
StatePublished - 2010
Event13th International Conference on Extending Database Technology: Advances in Database Technology - EDBT 2010 - Lausanne, Switzerland
Duration: Mar 22 2010Mar 26 2010

Publication series

NameAdvances in Database Technology - EDBT 2010 - 13th International Conference on Extending Database Technology, Proceedings

Other

Other13th International Conference on Extending Database Technology: Advances in Database Technology - EDBT 2010
Country/TerritorySwitzerland
CityLausanne
Period3/22/103/26/10

Keywords

  • Data-centric XML
  • Feature cover
  • Inter-dependent structural feature
  • Relevance feedback

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Software

Fingerprint

Dive into the research topics of 'Feedback-driven result ranking and query refinement for exploring semi-structured data collections'. Together they form a unique fingerprint.

Cite this