Deteksi Komentar Spam Bahasa Indonesia Pada Instagram Menggunakan Naive Bayes

Antonius Rachmat C; Yuan Lukito

doi:10.31937/ti.v9i1.564

Authors

Antonius Rachmat C, Universitas Kristen Duta Wacana, Yogyakarta
Yuan Lukito, Universitas Kristen Duta Wacana, Yogyakarta

Abstract

Instagram is among the most popular web and mobile applications for sharing pictures and videos. Instagram users publish picture posts that their followers can comment on. Indonesian public figures such as actors, actresses, and musicians use Instagram to promote their activities to their followers. Unfortunately, Instagram contains many spam comments that need special attention and have to be removed. This research collects Instagram comments from Indonesian public figures who have more than one million followers and builds a dataset from them. After preprocessing (tokenization, stop word removal, and stemming) and TF-IDF weighting, the Naive Bayes method, a supervised learning technique, is used to detect spam comments in Indonesian. Naive Bayes achieves an accuracy of 74.31% on the unbalanced dataset and 77.25% on the balanced dataset. These results show that Naive Bayes can be used to build an automatic detector of Indonesian spam comments on Instagram with a high accuracy rate. The novelty of this research lies in applying Naive Bayes to spam detection on our own dataset of Indonesian Instagram comments.

Index Terms: Instagram, Naive Bayes, Indonesian spam comments, spam comments detection.
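
The pipeline described in the abstract (preprocessing, TF-IDF weighting, Naive Bayes classification) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the stop word list, the six training comments, and all function names are hypothetical, stemming is omitted, and a multinomial Naive Bayes with Laplace smoothing over TF-IDF weights is assumed here as one common way to combine the two techniques.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop word list; a real system would use a full Indonesian list.
STOP_WORDS = {"yang", "di", "dan", "ini", "itu", "ya", "kak"}

# Hypothetical toy comments, NOT taken from the paper's dataset.
TRAIN = [
    ("cek ig kami jual pemutih kulit murah", "spam"),
    ("promo followers murah cek bio kami", "spam"),
    ("jual obat pelangsing promo murah", "spam"),
    ("cantik banget kak fotonya", "not_spam"),
    ("semangat terus kak sukses selalu", "not_spam"),
    ("lagunya bagus banget suka", "not_spam"),
]

def preprocess(text):
    """Lowercase, tokenize, and drop stop words (stemming is omitted here)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def fit_idf(docs):
    """Smoothed inverse document frequency for every training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

def tfidf(doc, idf):
    """TF-IDF weights of one document; terms unseen in training are ignored."""
    tf = Counter(t for t in doc if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}

def train_nb(vectors, labels, vocab):
    """Multinomial Naive Bayes with Laplace smoothing over TF-IDF weights."""
    sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            sums[label][term] += weight
    model = {}
    for label, n_docs in Counter(labels).items():
        total = sum(sums[label].values()) + len(vocab)
        log_like = {t: math.log((sums[label][t] + 1) / total) for t in vocab}
        model[label] = (math.log(n_docs / len(labels)), log_like)
    return model

def predict(model, vec):
    """Return the class with the highest log prior plus weighted log likelihood."""
    scores = {
        label: prior + sum(w * log_like[t] for t, w in vec.items())
        for label, (prior, log_like) in model.items()
    }
    return max(scores, key=scores.get)

docs = [preprocess(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels, set(idf))

def classify(text):
    """Run the full pipeline on one raw comment."""
    return predict(model, tfidf(preprocess(text), idf))

print(classify("promo pemutih murah cek bio"))
print(classify("fotonya bagus banget"))
```

On this toy data the first comment is labeled as spam and the second as not spam. The accuracy figures reported in the abstract come from the authors' own dataset and setup and cannot be reproduced with this sketch.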

Published

2017-04-26

How to Cite

C, A. R., & Lukito, Y. (2017). Deteksi Komentar Spam Bahasa Indonesia Pada Instagram Menggunakan Naive Bayes. Ultimatics : Jurnal Teknik Informatika, 9(1), 50–58. https://doi.org/10.31937/ti.v9i1.564

Issue

Vol. 9 No. 1 (2017): Ultimatics: Jurnal Ilmu Teknik Informatika

Section

Articles

License

Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike International License (CC-BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.

Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.

Copyright without Restrictions

The journal allows the author(s) to hold the copyright without restrictions and will retain publishing rights without restrictions.

Submitted papers are assumed to contain no proprietary material unprotected by patent or patent application; responsibility for technical content and for protection of proprietary material rests solely with the author(s) and their organizations and is not the responsibility of ULTIMATICS or its Editorial Staff. The main (first/corresponding) author is responsible for ensuring that the article has been seen and approved by all the other authors. It is the responsibility of the author to obtain all necessary copyright release permissions for the use of any copyrighted materials in the manuscript prior to submission.
