Multilabel sentiment analysis for classification of the spread of COVID-19 in Indonesia using machine learning
Abstract
This study uses Twitter data to gauge public opinion on the spread of the coronavirus in Indonesia through sentiment analysis. The results are intended to benefit the community by helping the Indonesian government take strategic measures to prevent and counter the spread of COVID-19. The research proceeded in four stages. First, data were collected by crawling Indonesian-language (Bahasa Indonesia) tweets relating to the spread of COVID-19. Second, the tweets were labeled manually. Third, the tweets were preprocessed by removing characters, symbols, and Twitter-specific features. Finally, the tweets were classified using three machine learning methods: K-nearest neighbor (K-NN), Naïve Bayes, and decision tree. The study analyzed the sentiment of 1,119 valid tweets and found that the K-NN algorithm achieved the highest accuracy of the three, at 95.10%. However, 78.19% of the analyzed tweets fell into the negative category, and only 13.85% expressed positive public opinion. This indicates that most tweets posted by Indonesians on Twitter do not indicate the spread of COVID-19 in a particular place.
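The preprocessing and K-NN classification stages described above can be illustrated with a minimal sketch. The abstract does not state the feature representation, the value of k, or the distance measure, so the bag-of-words counts, cosine similarity, and k = 3 used here are assumptions, as are the function names `preprocess`, `cosine`, and `knn_predict`. The four training tweets and their labels are invented for illustration and are not taken from the study's dataset.

```python
import re
from collections import Counter
from math import sqrt


def preprocess(tweet):
    """Lowercase a tweet and strip URLs, mentions, and non-letter symbols."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits, punctuation, '#'
    return text.split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_predict(train, tweet, k=3):
    """Return the majority label among the k most similar training tweets."""
    query = Counter(preprocess(tweet))
    ranked = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, manually labeled examples (not from the study's data).
labeled = [
    ("Kasus COVID-19 bertambah di Jakarta https://t.co/abc", "positive"),
    ("Pasien corona baru di Surabaya hari ini", "positive"),
    ("@teman ayo pakai masker dan cuci tangan", "negative"),
    ("Bosan di rumah, nonton film saja #dirumahaja", "negative"),
]
train = [(Counter(preprocess(text)), label) for text, label in labeled]

prediction = knn_predict(train, "Kasus corona bertambah di Bandung")
```

With real data, accuracy would be estimated on held-out tweets, and the same preprocessed features could be passed to Naïve Bayes and decision tree classifiers for the comparison the study reports.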
Keywords
COVID-19; Decision tree; Indonesia; K-nearest neighbor; Naïve Bayes; Sentiment analysis
DOI: http://doi.org/10.11591/ijeecs.v31.i2.pp968-978
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).