ETV: efficient text vision for text localization in natural scene images
Abstract
Extracting and interpreting textual information from images has become a pivotal task, and the exponential growth of text documents makes efficient processing and analysis imperative. Text localization in images nevertheless remains challenging: complex backgrounds, uneven illumination, diverse text styles, and perspective distortions render traditional optical character recognition (OCR) techniques inadequate. To address these challenges, this paper proposes an integrated method named efficient text vision (ETV). ETV combines the OCR capabilities of Tesseract with the efficient and accurate scene text detector (EAST) algorithm, supplemented by non-maximum suppression (NMS). The Tesseract OCR component extracts and identifies individual characters, while EAST efficiently detects and localizes complete text regions. NMS then improves localization accuracy by eliminating redundant or overlapping bounding boxes.
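The NMS step named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes axis-aligned boxes given as `(x1, y1, x2, y2)`, a per-box confidence score, and an illustrative IoU threshold of 0.5. The function names `iou` and `non_max_suppression` are hypothetical. EAST itself can also output rotated boxes, which this simplified version does not handle.

```python
from typing import List, Sequence, Tuple

# Assumed box format (not specified in the abstract): (x1, y1, x2, y2)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # illustrative value, not from the paper
) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    # Visit boxes from most to least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate detections of one word plus a separate detection.
demo_boxes = [(10, 10, 110, 40), (12, 12, 112, 42), (200, 50, 300, 80)]
demo_scores = [0.9, 0.8, 0.7]
print(non_max_suppression(demo_boxes, demo_scores))  # prints [0, 2]
```

In this example the second box overlaps the first with an IoU of about 0.84, so it is suppressed, while the third box does not overlap either and is retained.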
Keywords
Deep learning; Scene text understanding; Text localization; Text recognition; Unconstrained conditions
Full Text: PDF
DOI: http://doi.org/10.11591/ijeecs.v41.i2.pp812-822

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).