Computationally efficient handwritten Telugu text recognition

Buddaraju Revathi, M. V. D. Prasad, Naveen Kishore Gattim

Abstract


Optical character recognition (OCR) for regional languages is difficult because of their complex orthographic structure, the scarcity of datasets, the large number of characters, and the structural similarity between characters. Telugu is a widely spoken language in the Indian states of Andhra Pradesh and Telangana. Telugu exhibits a distinct separation between the characters within a word, so a character-level dataset is sufficient, and a comparatively small dataset can cover a large number of words. However, challenges arise when training on compound characters, which are combinations of vowels and consonants. Depending on the vattus and dheerghams attached to the base character, a compound character is treated as two or more characters. To address this challenge, each compound character is encoded as a numerical value that is used as the label during training and mapped back to the character during recognition. Varying handwriting styles cause characters to overlap, which complicates segmentation at the character level; to handle this, we propose a segmentation algorithm based on the features of the language. To improve word-level accuracy, a dictionary-based model was devised. A neural network built on the inception module extracts features at multiple scales and achieves a word-level accuracy of 78% with fewer trainable parameters.
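The encoding of compound characters and the dictionary-based correction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual encoding scheme or dictionary model, so everything here is an assumption: the function names (`build_label_maps`, `decode_word`, `correct_with_dictionary`), the sorted integer labelling, and the use of Python's standard `difflib` similarity matching as a stand-in for the dictionary model.

```python
import difflib


def build_label_maps(compound_chars):
    """Assign each distinct compound character an integer label (hypothetical scheme)."""
    ordered = sorted(set(compound_chars))
    char_to_id = {ch: i for i, ch in enumerate(ordered)}
    id_to_char = {i: ch for ch, i in char_to_id.items()}
    return char_to_id, id_to_char


def decode_word(label_ids, id_to_char):
    """Map predicted integer labels back to characters and join them into a word."""
    return "".join(id_to_char[i] for i in label_ids)


def correct_with_dictionary(word, lexicon, cutoff=0.6):
    """Return the closest lexicon entry, or the word unchanged if nothing is close enough."""
    matches = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


# Each class is a base character together with its vowel sign.
classes = ["తె", "లు", "గు", "కా"]
char_to_id, id_to_char = build_label_maps(classes)

# Encode a segmented word as integer labels, then recover it.
labels = [char_to_id[ch] for ch in ["తె", "లు", "గు"]]
recognized = decode_word(labels, id_to_char)

# A misrecognized word (one vowel sign missing) is snapped to the lexicon.
lexicon = ["తెలుగు", "తెలంగాణ"]
corrected = correct_with_dictionary("తెలగు", lexicon)
```

Treating each compound character as a single class keeps the classifier output a plain integer, and the dictionary step only touches words that fall close to a known entry; words with no sufficiently similar entry are returned unchanged.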


Keywords


Character recognition; Feature extraction; Inception; Neural network; Orthographic features



DOI: http://doi.org/10.11591/ijeecs.v34.i3.pp1618-1626



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
