A Comparison of Traditional Machine Learning Approaches for Supervised Feedback Classification in Bahasa Indonesia

  • Andre Rusli
  • Alethea Suryadibrata Universitas Multimedia Nusantara
  • Samiaji Bintang Nusantara
  • Julio Christian Young

Abstract

The advancement of machine learning and natural language processing (NLP) techniques holds significant opportunities to improve existing software engineering activities, including requirements engineering. Instead of manually reading all submitted user feedback to understand the evolving requirements of their product, developers could use an automatic text classification program to reduce the required effort. Supervised machine learning approaches have already been applied in many areas of text classification and show promising performance. This paper applies NLP techniques for basic text preprocessing, followed by traditional (non-deep-learning) machine learning classification algorithms: Logistic Regression, Decision Tree, Multinomial Naïve Bayes, K-Nearest Neighbors, Linear SVC, and Random Forest. The performance of each algorithm in classifying the feedback in our dataset into several categories is then evaluated using three F1 score metrics: the macro-, micro-, and weighted-average F1 scores. Results show that Logistic Regression is generally the most suitable classifier in most cases, followed by Linear SVC. However, the performance gap is not large, and with different configurations and requirements, other classifiers could perform equally well or even better.
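As an illustration of how the three F1 averages mentioned above differ, the following is a minimal sketch in plain Python. It is not the authors' evaluation code; the function name `f1_averages` and the category labels ("bug", "feature", "praise") are hypothetical and chosen only for the example. It assumes single-label multiclass classification, with the weighted average using each class's number of true instances as its weight.

```python
from collections import Counter


def f1_averages(y_true, y_pred):
    """Return macro-, micro-, and weighted-average F1 for single-label data."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(t, p, n):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    support = Counter(y_true)  # number of true instances per class

    # Macro: unweighted mean of per-class F1 (every class counts equally)
    macro = sum(per_class.values()) / len(labels)
    # Micro: F1 computed from counts pooled over all classes
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Weighted: per-class F1 averaged with class support as the weight
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return {"macro": macro, "micro": micro, "weighted": weighted}


# Hypothetical, imbalanced toy labels (not taken from the paper's dataset)
y_true = ["bug", "bug", "bug", "bug", "feature", "praise"]
y_pred = ["bug", "bug", "bug", "bug", "bug", "praise"]
scores = f1_averages(y_true, y_pred)
```

On this toy input the macro average is about 0.630, the micro average about 0.833, and the weighted average about 0.759. The macro score is pulled down by the completely missed minority class, while the micro score (which equals accuracy in the single-label case) is dominated by the majority class. This is why reporting all three gives a fuller picture on imbalanced feedback data.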

Published
2020-07-02
How to Cite
Rusli, A., Suryadibrata, A., Nusantara, S., & Young, J. (2020). A Comparison of Traditional Machine Learning Approaches for Supervised Feedback Classification in Bahasa Indonesia. IJNMT (International Journal of New Media Technology), 7(1), 28-32. https://doi.org/10.31937/ijnmt.v1i1.1485