Language models and deep neural networks for Arabic named entity recognition

Somia Khedimi; Abdelghani Bouziane

doi:10.11591/ijeecs.v42.i1.pp142-148

Abstract

Token type identification lies at the core of named entity recognition, allowing models to distinguish named entities from non-entity tokens and thereby better capture sentence meaning. This paper presents a deep learning approach to Arabic named entity recognition that combines the AraELECTRA pretrained language model with a bidirectional long short-term memory (BiLSTM) network. We use the WojoodNER dataset, which provides fine-grained annotations of Arabic text across 21 entity types. The approach yields encouraging results: an accuracy of 98.29% and an F1-score of 87%.
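The gap between the reported accuracy and F1-score is typical of sequence labeling, where most tokens carry the non-entity tag. The following is a minimal sketch of that effect on a toy example, not the authors' evaluation code. It assumes BIO tagging and exact-match, entity-level F1; the abstract does not say how either metric was computed. The helper names and the toy sentence are hypothetical.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def span_f1(gold, pred):
    """Exact-match F1 over entity spans (boundaries and type must both agree)."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical 20-token sentence with two entities; tag names are illustrative.
gold = ["O"] * 8 + ["B-PERS", "I-PERS"] + ["O"] * 7 + ["B-GPE"] + ["O"] * 2
# The prediction misses only the second token of the person name.
pred = ["O"] * 8 + ["B-PERS", "O"] + ["O"] * 7 + ["B-GPE"] + ["O"] * 2

print("token accuracy:", token_accuracy(gold, pred))
print("entity-level F1:", span_f1(gold, pred))
```

In this toy case a single mislabeled token leaves token accuracy at 0.95 but drops entity-level F1 to 0.5, because the truncated person span no longer matches the gold span exactly.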

Keywords

AraELECTRA; BiLSTM; Deep learning; Language models; Named entity recognition in Arabic; Wojood dataset

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES).
