The effect of the TF-IDF algorithm in times series in forecasting word on social media
Abstract
Forecasting is one of the main topics in data mining or machine learning in which forecasting, a group of data used, has a label class or target. Thus, many algorithms for solving forecasting problems are categorized as supervised learning with the aim of conducting training. In this case, the things that were supervised were the label or target data playing a role as a 'supervisor' who supervise the training process in achieving a certain level of accuracy or precision. Time series is a method that is generally used to forecast based on time and can forecast words in social media. In this study had conducted the word forecasting on twitter with 1734 tweets which were interpreted as weighted documents using the TF-IDF algorithm with a frequency that often comes out in tweets so the TF-IDF value is getting smaller and vice versa. After getting the word weight value of the tweets, a time series forecast was performed with the test data of 1734 tweets that the results referred to 1203 categories of Slack words and 531 verb tweets as training data resulting in good accuracy. The division of word forecasting was classified into two groups i.e. inactive users and active users. The results obtained were processed with a MAPE calculation process of 50% for inactive users and 0.1980198% for active users.
Keywords
Forecasting; Social media; TF-IDF; Times series; Word
Full Text:
PDFDOI: http://doi.org/10.11591/ijeecs.v22.i2.pp976-984
Refbacks
- There are currently no refbacks.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Indonesian Journal of Electrical Engineering and Computer Science (IJEECS)
p-ISSN: 2502-4752, e-ISSN: 2502-4760
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).