Language models and deep neural networks for Arabic named entity recognition

Somia Khedimi, Abdelghani Bouziane

Abstract


Identifying token types lies at the core of named entity recognition (NER): it lets a model distinguish named entities from non-entity tokens and thereby better capture sentence meaning. This paper presents a deep learning approach to Arabic NER that leverages deep neural networks and pretrained language models. The proposed model combines the AraELECTRA language model with a bidirectional long short-term memory (BiLSTM) network. We use the WojoodNER dataset, which provides fine-grained annotations of Arabic text across 21 entity types. The results are encouraging: the model reaches an accuracy of 98.29% and an F1-score of 87%.
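
The abstract reports both an accuracy and an F1-score but does not say how either was computed. The sketch below is only an illustration of why the two numbers can differ so much in NER. It assumes token-level accuracy and exact-match, entity-level F1 over flat BIO tags; the function names, the toy tag sequences, and the `PERS`/`ORG` labels are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type); a hypothetical helper type
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a flat BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        # Close the open span unless this tag continues it.
        if start is not None and tag != f"I-{label}":
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def token_accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match F1: a predicted span counts only if boundaries and type match."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    true_positives = len(gold_spans & pred_spans)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_spans)
    recall = true_positives / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Toy example (assumed data): one tag wrong out of ten tokens.
gold = ["B-PERS", "I-PERS", "O", "O", "O", "B-ORG", "I-ORG", "I-ORG", "O", "O"]
pred = ["B-PERS", "I-PERS", "O", "O", "O", "B-ORG", "I-ORG", "O", "O", "O"]

print(token_accuracy(gold, pred))  # 0.9
print(entity_f1(gold, pred))       # 0.5
```

In this toy case a single mislabeled token leaves accuracy at 90% but invalidates a whole entity span, so F1 drops to 50%. Because most tokens in running text are non-entity tokens, accuracy tends to sit well above entity-level F1 under these assumptions.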

Keywords


AraELECTRA; BiLSTM; Deep learning; Language models; Named entity recognition in Arabic; Wojood dataset


DOI: http://doi.org/10.11591/ijeecs.v42.i1.pp142-148


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
