Information extraction model from Ge’ez texts
Abstract
Nowadays, voluminous unstructured textual data is available on the Internet that could provide valuable information for many institutions, including those concerned with health care, business, training, religion, culture, and history. Such alarming growth of unstructured data fosters the need for methods and techniques to extract valuable information from it. However, exploring helpful information to satisfy stakeholders' needs becomes a problem due to information overload on the Internet. This paper, therefore, presents an effective model for extracting named entities from Ge'ez text using deep learning algorithms. A data set with a total of 5,270 sentences was used for training and testing. Two experimental setups, long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM), were evaluated empirically with a training and testing split ratio of 80% to 20%, respectively. Experimental results showed that the proposed model could be a practical solution for building information extraction (IE) systems using Bi-LSTM, reaching training, validation, and testing accuracies as high as 98.59%, 97.96%, and 96.21%, respectively. The performance evaluation results reflect a promising performance of the model compared with resource-rich languages such as English.
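The evaluation setup described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's actual code: the function names, tag labels, and placeholder data are assumptions; only the 5,270-sentence corpus size, the 80/20 split, and token-level accuracy as a metric come from the abstract.

```python
import random

def train_test_split(sentences, train_ratio=0.8, seed=42):
    """Shuffle and split tagged sentences into train/test portions (assumed 80/20)."""
    rng = random.Random(seed)
    shuffled = sentences[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def token_accuracy(gold_tags, pred_tags):
    """Fraction of tokens whose predicted entity tag matches the gold tag."""
    total = correct = 0
    for gold, pred in zip(gold_tags, pred_tags):
        for g, p in zip(gold, pred):
            total += 1
            correct += (g == p)
    return correct / total if total else 0.0

# With 5,270 sentences, an 80/20 split yields 4,216 training and 1,054 test sentences.
sentences = list(range(5270))  # placeholders for tagged Ge'ez sentences
train, test = train_test_split(sentences)
print(len(train), len(test))  # 4216 1054
```

The actual tagger in the paper is a Bi-LSTM sequence labeler; only the split and the accuracy metric are sketched here, since the abstract does not specify the network's layer sizes or tag set.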
Keywords
Bi-LSTM; Deep learning; Entity extraction; Ge’ez text; Information extraction
Full Text: PDF
DOI: http://doi.org/10.11591/ijeecs.v30.i2.pp787-795
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).