TY - GEN
T1 - Zone content classification and its performance evaluation
AU - Wang, Yalin
AU - Haralick, Robert
AU - Phillips, Ihsin T.
N1 - Publisher Copyright: © 2001 IEEE.
PY - 2001
Y1 - 2001
AB - This paper presents an improved zone content classification method and its performance evaluation. We added two new features to the feature vector of a previously published method [1]. We assumed different independence relationships in two zone sets. We used an optimized binary decision tree to estimate the maximum zone content class probability in one set, and the Viterbi algorithm to find the optimal solution for a zone sequence in the other set. The training, pruning, and testing data sets for the algorithm include 1,600 images drawn from the UWCDROM III document image database. The classifier assigns each given scientific and technical document zone to one of nine classes: two text classes (font size 4-18 pt and font size 19-32 pt), math, table, halftone, map/drawing, ruling, logo, and others. Compared with our previous work [2], it raised the accuracy rate from 97.53% to 98.52% and reduced the mean false alarm rate from 1.26% to 0.53%.
UR - http://www.scopus.com/inward/record.url?scp=1342297868&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=1342297868&partnerID=8YFLogxK
U2 - 10.1109/ICDAR.2001.953847
DO - 10.1109/ICDAR.2001.953847
M3 - Conference contribution
AN - SCOPUS:1342297868
T3 - Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
SP - 540
EP - 544
BT - Proceedings - 6th International Conference on Document Analysis and Recognition, ICDAR 2001
PB - IEEE Computer Society
T2 - 6th International Conference on Document Analysis and Recognition, ICDAR 2001
Y2 - 10 September 2001 through 13 September 2001
ER -