Zone content classification and its performance evaluation

Yalin Wang, Robert Haralick, Ihsin T. Phillips

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

12 Scopus citations

Abstract

This paper presents an improved zone content classification method and its performance evaluation. We added two new features to the feature vector of a previously published method [1]. We assumed different independence relationships in two zone sets: in one set we used an optimized binary decision tree to estimate the maximum zone content class probability, and in the other we used the Viterbi algorithm to find the optimal solution for a zone sequence. The training, pruning, and testing data sets for the algorithm comprise 1,600 images drawn from the UWCDROM-III document image database. The classifier assigns each given scientific and technical document zone to one of nine classes: two text classes (font sizes 4-18 pt and 19-32 pt), math, table, halftone, map/drawing, ruling, logo, and others. Compared with our previous work [2], it raised the accuracy rate from 97.53% to 98.52% and reduced the mean false alarm rate from 1.26% to 0.53%.
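The abstract says only that the Viterbi algorithm is used to find the optimal labeling of a zone sequence; it does not give the model. As a minimal sketch, assuming a first-order Markov chain over consecutive zones, the decoding step could look like the following. The function name `viterbi`, the two-label set, and every probability below are illustrative assumptions, not values or code from the paper.

```python
import math

def viterbi(obs_loglik, log_init, log_trans):
    """Return the most probable label sequence and its log-probability.

    obs_loglik: list of dicts, one per zone: label -> log P(features | label)
    log_init:   dict: label -> log P(label of the first zone)
    log_trans:  dict: (prev_label, label) -> log P(label | prev_label)
    """
    labels = list(log_init)
    # best[t][c] is the best log-score of any path ending in label c at zone t.
    best = [{c: log_init[c] + obs_loglik[0][c] for c in labels}]
    back = []  # back[t-1][c] is the predecessor of c on that best path
    for t in range(1, len(obs_loglik)):
        scores, pointers = {}, {}
        for c in labels:
            prev = max(labels, key=lambda p: best[-1][p] + log_trans[(p, c)])
            scores[c] = best[-1][prev] + log_trans[(prev, c)] + obs_loglik[t][c]
            pointers[c] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best final label back to the first zone.
    last = max(labels, key=lambda c: best[-1][c])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

# Toy example with made-up numbers: two labels, three zones.
log = math.log
log_init = {"text": log(0.5), "table": log(0.5)}
log_trans = {
    ("text", "text"): log(0.9), ("text", "table"): log(0.1),
    ("table", "text"): log(0.1), ("table", "table"): log(0.9),
}
obs_loglik = [
    {"text": log(0.9), "table": log(0.1)},
    {"text": log(0.4), "table": log(0.6)},  # weak evidence for "table"
    {"text": log(0.9), "table": log(0.1)},
]
path, score = viterbi(obs_loglik, log_init, log_trans)
# path == ["text", "text", "text"]
```

In this toy case the middle zone would be labeled "table" if each zone were classified independently, but the sequence model prefers "text" throughout because the assumed transition probabilities penalize label changes. That is the kind of contextual correction a sequence decoder can provide.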

Original language: English (US)
Title of host publication: Proceedings - 6th International Conference on Document Analysis and Recognition, ICDAR 2001
Publisher: IEEE Computer Society
Pages: 540-544
Number of pages: 5
ISBN (Electronic): 0769512631
State: Published - 2001
Externally published: Yes
Event: 6th International Conference on Document Analysis and Recognition, ICDAR 2001 - Seattle, United States
Duration: Sep 10 2001 to Sep 13 2001

Publication series

Name: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
Volume: 2001-January
ISSN (Print): 1520-5363

Other

Other: 6th International Conference on Document Analysis and Recognition, ICDAR 2001
Country/Territory: United States
City: Seattle
Period: 9/10/01 to 9/13/01

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
