SqueezedText: A real-time scene text recognition by binary convolutional encoder-decoder network

Zichuan Liu, Yixing Li, Fengbo Ren, Wang Ling Goh, Hao Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

22 Scopus citations

Abstract

This paper proposes a new approach to real-time scene text recognition that combines a novel binary convolutional encoder-decoder network (B-CEDNet) with a bidirectional recurrent neural network (Bi-RNN). The B-CEDNet serves as a visual front-end that provides elaborated character detection, and a back-end Bi-RNN performs character-level sequential correction and classification based on learned contextual knowledge. The front-end B-CEDNet can process multiple regions containing characters in a one-off forward operation, and it is trained under binary constraints with significant compression, which yields both a remarkable inference run-time speedup and reduced memory usage. With the elaborated character detection, the back-end Bi-RNN only needs to process a low-dimensional feature sequence carrying the category and spatial information of the extracted characters for sequence correction and classification. Trained on over 1,000,000 synthetic scene text images, the B-CEDNet achieves a recall of 0.86, a precision of 0.88 and an F-score of 0.87 on ICDAR-03 and ICDAR-13. With correction and classification by the Bi-RNN, the proposed real-time scene text recognition achieves state-of-the-art accuracy while consuming less than 1 ms of inference run-time. The processing flow is realized on GPU with a small network size of 1.01 MB for the B-CEDNet and 3.23 MB for the Bi-RNN, which is much faster and smaller than existing solutions.
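The reported detection figures can be cross-checked with a minimal sketch, assuming the F-score in the abstract is the standard balanced F1 (the harmonic mean of precision and recall); the abstract does not state which F-measure is used, and the function name `f_score` is illustrative only.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-score" is F1; the source does not say so.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the B-CEDNet.
reported_precision = 0.88
reported_recall = 0.86

f1 = f_score(reported_precision, reported_recall)
print(round(f1, 2))  # prints 0.87
```

Under that assumption, the recall of 0.86 and precision of 0.88 are consistent with the stated F-score of 0.87.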

Original language: English (US)
Title of host publication: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Publisher: AAAI press
Pages: 7194-7201
Number of pages: 8
ISBN (Electronic): 9781577358008
State: Published - 2018
Event: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 - New Orleans, United States
Duration: Feb 2 2018 – Feb 7 2018

Publication series

Name: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018

Other

Other: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Country: United States
City: New Orleans
Period: 2/2/18 – 2/7/18

ASJC Scopus subject areas

  • Artificial Intelligence

  • Cite this

    Liu, Z., Li, Y., Ren, F., Goh, W. L., & Yu, H. (2018). SqueezedText: A real-time scene text recognition by binary convolutional encoder-decoder network. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 7194-7201). (32nd AAAI Conference on Artificial Intelligence, AAAI 2018). AAAI press.