Sentiment analysis of Malayalam tweets using bidirectional encoder representations from transformers: a study

Syam Mohan Elankath, Sunitha Ramamirtham


Sentiment analysis of views and opinions expressed in Indian regional languages has become a current focus of research. However, compared with a globally accepted language such as English, research on sentiment analysis in Indian regional languages such as Malayalam remains scarce. One of the major hindrances is the lack of publicly available Malayalam datasets. This work focuses on building a Malayalam dataset to facilitate sentiment analysis of Malayalam texts and on studying the efficiency of a pre-trained deep learning model in analyzing the sentiments latent in those texts. To this end, a Malayalam dataset was created by extracting 2,000 tweets from Twitter. Bidirectional encoder representations from transformers (BERT) is a pre-trained model that has been used for various natural language processing tasks, and this work employs a transformer-based BERT model for Malayalam sentiment analysis. The efficacy of BERT was studied by comparing its performance with that of various machine learning and deep learning models. The results show a substantial 5% increase in accuracy for BERT over Bi-GRU, the next best-performing model.


Keywords: BERT; Deep learning; Machine learning; Malayalam tweets; Sentiment analysis


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
