Effects of using wordnet and spelling checker on classification methods in sentiment analysis for datasets using Bahasa
Abstract
Sentiment analysis is a system for recognizing and extracting opinions in documents. It has two weaknesses. First, preprocessing cannot recognize slang words, so important words that should be recognized are missed. Second, feature extraction recognizes words only by their surface form and cannot recognize similar words. In this paper, we propose a spelling checker and wordnet to address these weaknesses. We also use k-nearest neighbor (KNN), Naïve Bayes, and decision tree as methods to classify the text. The purpose of this research is, first, to determine the effects of using wordnet and a spelling checker in sentiment analysis and, second, to improve the data processing in existing sentiment analysis. The dataset used in the research is a list of tweets in Bahasa. The results show that wordnet and the spelling checker decrease the values of false positives, false negatives, and true negatives in the confusion matrix. They increase the accuracy of KNN from 43% to 72%, Naïve Bayes from 41% to 71%, and decision tree from 47% to 75%.
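The two preprocessing steps described in the abstract can be illustrated with a minimal sketch. It assumes a dictionary lookup for slang correction and a synonym-to-canonical mapping standing in for wordnet; the lexicon entries and the names `SLANG_TO_STANDARD`, `SYNONYM_TO_CANONICAL`, `normalize`, and `accuracy` are hypothetical and are not taken from the paper, whose actual resources and implementation are not given here.

```python
# Toy lexicons: hypothetical entries for illustration only,
# not the spelling checker or wordnet resources used in the paper.
SLANG_TO_STANDARD = {"gk": "tidak", "bgs": "bagus", "bgt": "banget"}
SYNONYM_TO_CANONICAL = {"baik": "bagus", "jelek": "buruk"}


def normalize(tweet):
    """Lowercase and split a tweet, then map slang and synonyms to one form."""
    tokens = tweet.lower().split()
    # Step 1 (spelling-checker stand-in): replace known slang spellings.
    tokens = [SLANG_TO_STANDARD.get(t, t) for t in tokens]
    # Step 2 (wordnet stand-in): collapse synonyms onto a canonical word.
    tokens = [SYNONYM_TO_CANONICAL.get(t, t) for t in tokens]
    return tokens


def accuracy(tp, tn, fp, fn):
    """Accuracy from confusion-matrix counts: correct predictions over all."""
    return (tp + tn) / (tp + tn + fp + fn)


# Two tweets with different surface forms yield the same token list,
# so a classifier would see identical features for both.
print(normalize("Filmnya bgs bgt"))
print(normalize("filmnya baik banget"))
```

After normalization, both tweets share the same tokens, which is the effect the abstract attributes to combining the spelling checker with wordnet before classification with KNN, Naïve Bayes, or a decision tree.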
Keywords
Bahasa; Sentiment analysis; Slang words; Wordnet; Spelling checker
DOI: http://doi.org/10.11591/ijeecs.v25.i3.pp1662-1671
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).