FastText Word Embedding and Random Forest Classifier for User Feedback Sentiment Classification in Bahasa Indonesia

  • Yehezkiel Gunawan
  • Julio Christian Young
  • Andre Rusli, Universitas Multimedia Nusantara

Abstract

User feedback has become a channel through which software developers identify and understand user requirements, preferences, and complaints. It is important for developers to identify the problems raised in that feedback. As a piece of software grows, so does its number of users, and reading and classifying each piece of feedback manually wastes time and energy. To address this, a sentiment analysis system was built that uses a Random Forest classifier, with word embeddings as the feature extraction step, to classify feedback as positive, neutral, or negative. The Random Forest algorithm was chosen because it gives the best performance, even though it needs more resources. Furthermore, word embeddings make it possible to detect words that are semantically or syntactically similar. Word embedding does not require stemming or stop word removal, so the context of each sentence is preserved. This research implements word embedding to classify the sentiment of user feedback using a Random Forest classifier. The system reaches 70.27% accuracy, 80% precision, 54% recall, and a 54% F1 score when the BYU dataset (200 dimensions) is used as the embedding dataset with an 80:20 train-test ratio.
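
As a rough illustration of the feature extraction step described in the abstract, the sketch below turns a piece of feedback into a single fixed-length vector by averaging the vectors of its words. Averaging is an assumption made here for illustration; the abstract does not state how word vectors are combined. The tiny three-dimensional embedding table, the name TOY_EMBEDDINGS, and the helper sentence_vector are all hypothetical stand-ins for a pretrained 200-dimensional FastText model.

```python
from statistics import fmean

# Hypothetical toy embedding table with made-up values. A real setup would
# load pretrained FastText vectors (e.g. 200 dimensions) instead.
TOY_EMBEDDINGS = {
    "aplikasi": [0.2, 0.1, 0.0],
    "bagus": [0.8, 0.6, 0.1],
    "lambat": [-0.7, -0.5, 0.2],
}


def sentence_vector(text, embeddings, dim):
    """Average the vectors of all known tokens in `text`.

    Assumption: simple whitespace tokenization and mean pooling.
    Tokens missing from the table are skipped; if no token is known,
    a zero vector of length `dim` is returned.
    """
    vectors = [embeddings[tok] for tok in text.lower().split() if tok in embeddings]
    if not vectors:
        return [0.0] * dim
    # zip(*vectors) groups the i-th component of every word vector together.
    return [fmean(component) for component in zip(*vectors)]


# No stemming or stop word removal is applied before the lookup.
features = sentence_vector("Aplikasi bagus", TOY_EMBEDDINGS, dim=3)
print(features)
```

In a full pipeline, one such vector per feedback item, together with its positive, neutral, or negative label, would be split 80:20 into training and test sets and passed to a Random Forest classifier. Skipping unknown tokens is a simplification of the toy table: FastText itself can build vectors for unseen words from character n-grams.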

Published
2022-01-23
How to Cite
Gunawan, Y., Young, J., & Rusli, A. (2022). FastText Word Embedding and Random Forest Classifier for User Feedback Sentiment Classification in Bahasa Indonesia. Ultimatics : Jurnal Teknik Informatika, 13(2), 101-107. https://doi.org/10.31937/ti.v13i2.2124