FastText Word Embedding and Random Forest Classifier for User Feedback Sentiment Classification in Bahasa Indonesia
Abstract
User feedback nowadays become a platform for software developer to identify and understand user requirements, preferences, and user’s complaints. It is important for the developer to identify the problem that exist in user feedback. According to software growth, user amount also growth. Read and classify one by one manually are wasting time and energy. As the solution for the problem, sentiment analysis system using Random Forest Classifier which use word embedding as the feature extraction is made to help to classify which feedback is positive, neutral, or negative. Random Forest Algorithm is chosen because it gives the best performance, even its need the larger resources. Furthermore, with word embedding, the words which has semantic or syntactic similarities will be detected. Word embedding does not need stemming and stop word removal, so the context of the sentences keep remains. This research is made to implement word embedding to classify sentiment of user feedbacks using Random Forest Classifier. 70.27% accuracy, 80% precision, 54 recall and 54% F1 score is reached when BYU dataset (200 dimension) as embedding dataset with the train and test ratio 80:20.
Downloads
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike International License (CC-BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
Copyright without Restrictions
The journal allows the author(s) to hold the copyright without restrictions and will retain publishing rights without restrictions.
The submitted papers are assumed to contain no proprietary material unprotected by patent or patent application; responsibility for technical content and for protection of proprietary material rests solely with the author(s) and their organizations and is not the responsibility of the ULTIMATICS or its Editorial Staff. The main (first/corresponding) author is responsible for ensuring that the article has been seen and approved by all the other authors. It is the responsibility of the author to obtain all necessary copyright release permissions for the use of any copyrighted materials in the manuscript prior to the submission.