Automatic Table ground truth generation and a background-analysis-based table structure extraction method

Yalin Wang, Ihsin T. Phillips, Robert Haralick

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

31 Citations (Scopus)

Abstract

In this paper, we first describe an automatic table ground truth generation system which can efficiently generate a large amount of accurate table ground truth suitable for the development of table detection algorithms. Then a novel background-analysis-based, coarse-to-fine table identification algorithm and an X-Y cut table decomposition algorithm are described. We discuss an experimental protocol to evaluate the table detection algorithms. For a total of 1,125 document pages having 518 table entities and a total of 10,941 cell entities, our table detection algorithm takes line and word segmentation results as input and achieves a correct cell detection rate of around 90%.

Original language: English (US)
Title of host publication: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Publisher: IEEE Computer Society
Pages: 528-532
Number of pages: 5
Volume: 2001-January
ISBN (Print): 0769512631
DOI: https://doi.org/10.1109/ICDAR.2001.953845
State: Published - 2001
Externally published: Yes
Event: 6th International Conference on Document Analysis and Recognition, ICDAR 2001 - Seattle, United States
Duration: Sep 10 2001 to Sep 13 2001

Other

Other: 6th International Conference on Document Analysis and Recognition, ICDAR 2001
Country: United States
City: Seattle
Period: 9/10/01 to 9/13/01

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

Cite this

Wang, Y., Phillips, I. T., & Haralick, R. (2001). Automatic Table ground truth generation and a background-analysis-based table structure extraction method. In Proceedings of the International Conference on Document Analysis and Recognition, ICDAR (Vol. 2001-January, pp. 528-532). [953845] IEEE Computer Society. https://doi.org/10.1109/ICDAR.2001.953845

@inproceedings{b344848af0d248a197a62f7cd86ba111,
title = "Automatic Table ground truth generation and a background-analysis-based table structure extraction method",
abstract = "In this paper, we first describe an automatic table ground truth generation system which can efficiently generate a large amount of accurate table ground truth suitable for the development of table detection algorithms. Then a novel background-analysis-based, coarse-to-fine table identification algorithm and an X-Y cut table decomposition algorithm are described. We discuss an experimental protocol to evaluate the table detection algorithms. For a total of 1,125 document pages having 518 table entities and a total of 10,941 cell entities, our table detection algorithm takes line, word segmentation results as input and obtains around 90{\%} cell correct detection rates.",
author = "Yalin Wang and Phillips, {Ihsin T.} and Robert Haralick",
year = "2001",
doi = "10.1109/ICDAR.2001.953845",
language = "English (US)",
isbn = "0769512631",
volume = "2001-January",
pages = "528--532",
booktitle = "Proceedings of the International Conference on Document Analysis and Recognition, ICDAR",
publisher = "IEEE Computer Society",

}

TY  - GEN
T1  - Automatic Table ground truth generation and a background-analysis-based table structure extraction method
AU  - Wang, Yalin
AU  - Phillips, Ihsin T.
AU  - Haralick, Robert
PY  - 2001
Y1  - 2001
AB  - In this paper, we first describe an automatic table ground truth generation system which can efficiently generate a large amount of accurate table ground truth suitable for the development of table detection algorithms. Then a novel background-analysis-based, coarse-to-fine table identification algorithm and an X-Y cut table decomposition algorithm are described. We discuss an experimental protocol to evaluate the table detection algorithms. For a total of 1,125 document pages having 518 table entities and a total of 10,941 cell entities, our table detection algorithm takes line, word segmentation results as input and obtains around 90% cell correct detection rates.
UR  - http://www.scopus.com/inward/record.url?scp=84951779521&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84951779521&partnerID=8YFLogxK
U2  - 10.1109/ICDAR.2001.953845
DO  - 10.1109/ICDAR.2001.953845
M3  - Conference contribution
AN  - SCOPUS:84951779521
SN  - 0769512631
VL  - 2001-January
SP  - 528
EP  - 532
BT  - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
PB  - IEEE Computer Society
ER  -