Abstract
As an important step in text-based information extraction systems, scene text detection has become a popular subject of research in recent years. In this study, the authors present a novel approach to robustly detect text that varies in scale, colour, font, language and orientation in scene images. To segment candidate text connected components (CCs) from images, both local contrast and colour consistency are considered at the superpixel level. To filter out non-text CCs, a hierarchical model is designed: it groups the CCs into three cascaded stages and equips each stage with a dedicated classifier. Experimental results on the public ICDAR 2005 dataset and the MSRA-TD500 dataset show that their approach achieves better performance than other state-of-the-art methods.
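A minimal sketch of the pipeline described above, not the authors' implementation: SLIC is used here as a stand-in for the paper's superpixel segmentation, and the contrast threshold, the `candidate_text_ccs`/`cascade_filter` names and the three stage classifiers are hypothetical placeholders for illustration only.

```python
# Hypothetical sketch: superpixel-level candidate extraction followed by a
# cascaded filter. Not the authors' method; thresholds and classifiers are
# placeholders chosen for illustration.
import numpy as np
from skimage import io
from skimage.segmentation import slic
from skimage.measure import label, regionprops


def candidate_text_ccs(image, n_segments=600, contrast_thresh=25.0):
    """Group high-contrast, colour-consistent superpixels into candidate CCs."""
    segments = slic(image, n_segments=n_segments, compactness=10)
    global_mean = image.reshape(-1, image.shape[-1]).mean(axis=0)

    # Keep superpixels whose mean colour differs strongly from the global mean
    # (a crude proxy for "local contrast"), then label the resulting mask to
    # obtain connected components.
    mask = np.zeros(segments.shape, dtype=bool)
    for sp in np.unique(segments):
        sp_mean = image[segments == sp].mean(axis=0)
        if np.linalg.norm(sp_mean - global_mean) > contrast_thresh:
            mask[segments == sp] = True
    return regionprops(label(mask))


def cascade_filter(ccs, stage_classifiers):
    """Pass CCs through cascaded stages; a CC must survive every stage."""
    for clf in stage_classifiers:
        ccs = [cc for cc in ccs if clf(cc)]
    return ccs


# Example usage with trivial placeholder stage classifiers.
img = io.imread("scene.jpg")
ccs = candidate_text_ccs(img)
stages = [
    lambda cc: cc.area > 50,            # geometric stage (placeholder)
    lambda cc: cc.eccentricity < 0.99,  # shape stage (placeholder)
    lambda cc: True,                    # appearance stage (placeholder)
]
text_ccs = cascade_filter(ccs, stages)
```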