Multiclass Emotion Detection on YouTube Comments Using IndoBERT

A Web-Based Incremental Learning System with Multiple Data Split Evaluation

Authors

  • Naufal, Faculty of Science and Technology, Universitas PGRI Yogyakarta
  • Nurirwan Saputra

DOI:

https://doi.org/10.31937/ti.v17i2.4558

Abstract

YouTube comments contain rich emotional expressions, but their large volume makes manual analysis inefficient. This study proposes a multiclass emotion classification approach for Indonesian YouTube comments using the IndoBERT model integrated with a database-driven incremental learning system.

Comment data were collected through the YouTube Data API and labeled into six emotion categories: anger, sadness, happiness, fear, surprise, and neutral. Text preprocessing included lowercasing, text cleaning, and normalization of informal Indonesian words. The model was fine-tuned using three training–testing split scenarios (60:40, 70:30, and 80:20).
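
The preprocessing steps and the three split scenarios described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `SLANG` dictionary, the cleaning rules, the helper names `preprocess` and `stratified_split`, and the choice to stratify by label are all assumptions made for the example, since the abstract does not specify them.

```python
import random
import re
from collections import defaultdict

# Hypothetical normalization lexicon; the study's actual word list is not given.
SLANG = {"gk": "tidak", "ga": "tidak", "bgt": "banget", "yg": "yang"}


def preprocess(text: str) -> str:
    """Lowercase, clean, and normalize one comment (assumed rules)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation, emoji, symbols
    tokens = [SLANG.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def stratified_split(labels, train_ratio, seed=42):
    """Return (train_indices, test_indices), splitting each class separately.

    Stratification is an assumption here; it keeps minority emotions
    present on both sides of an imbalanced split.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    train, test = [], []
    for label in sorted(by_label):
        idxs = by_label[label]
        rng.shuffle(idxs)
        cut = round(len(idxs) * train_ratio)
        train.extend(idxs[:cut])
        test.extend(idxs[cut:])
    return sorted(train), sorted(test)


if __name__ == "__main__":
    labels = ["anger"] * 10 + ["neutral"] * 20  # toy, imbalanced labels
    for ratio in (0.6, 0.7, 0.8):
        train_idx, test_idx = stratified_split(labels, ratio)
        print(ratio, len(train_idx), len(test_idx))
```

Each resulting train/test index pair would then select the comments used to fine-tune and evaluate the model for that scenario.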

The 80:20 split achieved the highest accuracy, 68%. This figure was influenced by an imbalanced class distribution in which minority classes were underrepresented. In addition, the system supports continuous data storage and incremental retraining, allowing the model to learn from new data without retraining from scratch. This adaptive mechanism makes the proposed system suitable for long-term emotion analysis on YouTube comments.
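
One way to picture the database-driven incremental retraining is a store that flags which labeled comments the model has already seen, so each round trains only on new rows. The sketch below uses SQLite from the Python standard library purely for illustration. The table schema, the function names, and the `update_model` callback (a stand-in for continuing fine-tuning from a saved checkpoint) are hypothetical and not taken from the paper.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the comment store; the schema is an assumed example."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL, "
        "label TEXT NOT NULL, trained INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_comment(conn, text, label):
    """Store a newly labeled comment; it starts out flagged as untrained."""
    conn.execute("INSERT INTO comments (text, label) VALUES (?, ?)", (text, label))
    conn.commit()


def fetch_untrained(conn):
    """Return rows the model has not yet been trained on, oldest first."""
    return conn.execute(
        "SELECT id, text, label FROM comments WHERE trained = 0 ORDER BY id"
    ).fetchall()


def mark_trained(conn, ids):
    """Flag rows as consumed so later rounds skip them."""
    conn.executemany(
        "UPDATE comments SET trained = 1 WHERE id = ?", [(i,) for i in ids]
    )
    conn.commit()


def incremental_round(conn, update_model):
    """Run one retraining round on new rows only; return how many were used.

    `update_model` is a hypothetical callback that would resume fine-tuning
    from the previously saved weights on the given (text, label) pairs.
    """
    batch = fetch_untrained(conn)
    if batch:
        update_model([(text, label) for _, text, label in batch])
        mark_trained(conn, [row_id for row_id, _, _ in batch])
    return len(batch)
```

Training only on new rows is the simplest policy. A real system might also replay a sample of older rows in each round to reduce forgetting of earlier classes, which matters when minority emotions are rare.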

Published

2026-01-22

How to Cite

Naufal, & Saputra, N. (2026). Multiclass Emotion Detection on YouTube Comments Using IndoBERT: A Web-Based Incremental Learning System with Multiple Data Split Evaluation. Ultimatics : Jurnal Teknik Informatika, 17(2), 263–269. https://doi.org/10.31937/ti.v17i2.4558