MHDT: A Deep-Learning-Based Text Detection Algorithm for Unstructured Data in Banking
Chapter
Accepted version
Permanent link
http://hdl.handle.net/11250/2599801
Publication date
2019
Original version
Proceedings of the International Conference on Machine Learning and Computing 2019, 11, 295-300. 10.1145/3318299.3318327
Abstract
Text detection in natural scene images is in growing demand for processing unstructured data in banking. In this paper, we propose a new deep learning algorithm, MSER, Hu-moment and Deep learning for Text detection (MHDT), based on Maximally Stable Extremal Regions (MSER) and Hu-moment features. First, we extract MSERs as candidate characters. Second, a character classifier using Hu-moment features filters these candidates to reduce the number of inputs to clustering. After single-linkage clustering groups the remaining candidates into candidate text regions, a text classifier trained with a Deep Belief Network removes the non-text regions. The proposed algorithm is evaluated on the ICDAR database, and the experimental results show that it achieves high precision and recall.
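Two of the intermediate steps named in the abstract, Hu-moment features and single-linkage clustering, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not say which Hu invariants are used, which distance drives the clustering, or where the cut-off lies, so the function names `hu_phi1_phi2` and `single_linkage`, the choice of the first two invariants, the Euclidean distance between candidate centres, and the `max_dist` parameter are all assumptions made for illustration. MSER extraction and the two classifiers are left out.

```python
from itertools import combinations
from math import hypot


def hu_phi1_phi2(pixels):
    """First two Hu invariants of a binary region given as (x, y) pixels.

    Assumption: the paper may use more (or other) invariants; these two
    are enough to show translation and rotation invariance.
    """
    n = len(pixels)  # zeroth moment m00 of a binary region
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n

    def eta(p, q):
        # Central moment mu_pq, normalised for scale invariance.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / n ** (1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return e20 + e02, (e20 - e02) ** 2 + 4 * e11 ** 2


def single_linkage(points, max_dist):
    """Group points whose single-linkage distance is at most max_dist.

    Cutting a single-linkage dendrogram at max_dist is equivalent to taking
    connected components of the graph that links every pair of points no
    farther apart than max_dist, so a union-find structure is sufficient.
    max_dist is a hypothetical parameter, not a value from the paper.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        (x1, y1), (x2, y2) = points[i], points[j]
        if hypot(x1 - x2, y1 - y2) <= max_dist:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # A 6x2 bar, the same bar rotated by 90 degrees, and a shifted copy.
    bar = [(x, y) for x in range(6) for y in range(2)]
    rotated = [(-y, x) for x, y in bar]
    print(hu_phi1_phi2(bar), hu_phi1_phi2(rotated))

    # Centres of five candidate characters forming two words on one line.
    centres = [(0, 0), (10, 0), (20, 0), (100, 0), (110, 0)]
    print(single_linkage(centres, max_dist=12))  # [[0, 1, 2], [3, 4]]
```

The bar and its rotated copy produce the same feature values up to floating-point error, which is the property that makes Hu moments attractive for classifying character candidates regardless of orientation. The clustering call returns index groups, so each group can be mapped back to the candidate regions it contains before being passed to a text classifier.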