Experimental of information gain and AdaBoost feature for machine learning classifier in media social data

Jasmir Jasmir, Dodo Zaenal Abidin, Fachruddin Fachruddin, Willy Riyadi

Abstract


This study applies several machine learning methods together with feature selection to social media data, specifically restaurant reviews. The feature selection approach combines information gain (IG) with adaptive boosting (AdaBoost), and the aim of the research is to measure its effect on the classification performance of Naïve Bayes (NB), K-nearest neighbor (KNN), and random forest (RF). NB is simple and efficient but very sensitive to feature selection. KNN has known weaknesses, including biased k values, high computational cost, memory limitations, and ignoring irrelevant attributes. RF also has weaknesses, among them that its evaluation scores can change significantly with only small changes in the data. In text classification, feature selection can improve scalability, efficiency, and accuracy. Tests on these classifiers with the combined IG and AdaBoost feature selection show that RF is the best classifier. RF improves significantly after IG and AdaBoost are applied: accuracy increases by 10%, precision by 12.43%, recall by 8.14%, and F1-score by 10.37%. RF also yields balanced scores after IG and AdaBoost are applied, with an accuracy of 84.5%, precision of 85.58%, recall of 86.36%, and F1-score of 85.97%.
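
As an illustration of the IG step only, the following is a minimal sketch of how information gain can rank terms for a text classifier. It is not the authors' implementation: the abstract does not describe their preprocessing or tooling, so the toy reviews, the labels, and the helper names (`entropy`, `information_gain`, `top_k_terms`) are all hypothetical, and each term is treated as a simple present/absent feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG of the binary feature 'term is present in the document'."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    # Class entropy remaining after splitting on presence/absence of the term.
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest IG (ties broken alphabetically)."""
    vocab = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocab]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [t for _, t in scored[:k]]


# Hypothetical toy data, invented for this sketch (not from the study).
reviews = [
    "great food friendly staff",
    "great service tasty food",
    "terrible food slow service",
    "terrible staff cold food",
]
labels = ["pos", "pos", "neg", "neg"]
docs = [set(r.split()) for r in reviews]

selected = top_k_terms(docs, labels, 2)
print(selected)
```

In this toy data, a term such as "food" appears in every review and therefore carries no information about the class, while terms that occur in only one class score highest. In a setup like the one the abstract describes, the retained terms would then be passed to the classifiers (NB, KNN, RF), with AdaBoost applied on top; that stage is omitted here because the abstract gives no details on how it was configured.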

Keywords


AdaBoost; Information gain; Machine learning; Social media; Text classification



DOI: http://doi.org/10.11591/ijeecs.v36.i2.pp1172-1181



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
