CuTeX: A system for extracting data from text tables

Hasan Davulcu, Saikat Mukherjee, Arvind Seth, I. V. Ramakrishnan

Research output: Contribution to journal › Conference article › peer-review

Abstract

A system for extracting data from irregular text tables is designed and implemented. The system, CuTeX, is based on an association between every item in a column. It is implemented in Java and comprises approximately 3000 lines of code. The system automatically partitions the set of input text tables into directories containing correct and incorrect extractions. This paper focuses on a demonstration that illustrates the robustness of the clustering algorithm and the iterative process of improving its extraction yield.

Original language: English (US)
Number of pages: 1
Journal: SIGIR Forum (ACM Special Interest Group on Information Retrieval)
State: Published - Dec 1 2002
Externally published: Yes
Event: Proceedings of the Twenty-Fifth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Tampere, Finland
Duration: Aug 11 2002 - Aug 15 2002

ASJC Scopus subject areas

  • Management Information Systems
  • Hardware and Architecture
