Implementation of Classification Algorithms for Sentiment Analysis of TikTok Social Media in 2025
DOI: https://doi.org/10.55606/jutiti.v5i2.5644
Keywords: Classification, Naive Bayes, Sentiment Analysis, SVM, TikTok
Abstract
TikTok has emerged as one of the fastest-growing social media platforms in 2025, especially among younger users. Beyond creative content sharing, it has become an important venue for public opinion, expressed primarily through user comments. As engagement grows, sentiment analysis of TikTok comments becomes increasingly important for understanding public perception of issues, trends, public figures, and brands. This study analyzes sentiment in TikTok user comments using machine learning classification and compares three algorithms widely used in text classification: Naive Bayes, Support Vector Machine (SVM), and Random Forest. A dataset of 5,000 public TikTok comments was collected by web scraping trending videos from January to March 2025. The comments, written in Indonesian, were normalized through tokenization, stopword removal, and stemming, and the TF-IDF method was then applied to extract numerical features. A stratified split divided the dataset into training (80%) and testing (20%) subsets, preserving the sentiment class distribution in both. Performance was evaluated using accuracy, precision, recall, and F1-score. SVM achieved the highest accuracy, 89.7%, and outperformed Naive Bayes and Random Forest on all metrics. These results indicate that SVM is well suited to classifying short, informal text such as TikTok comments. The findings contribute to sentiment analysis for Indonesian-language social media data, specifically on TikTok, and offer practical guidance for industry stakeholders, marketers, and academic researchers who want to apply machine learning to data-driven public opinion analysis on emerging social media platforms.
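The data-handling steps named in the abstract (stopword removal, TF-IDF weighting, a stratified 80/20 split, and the four evaluation metrics) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the stopword list, the function names, the `tf * ln(N / df)` weighting variant, and macro-averaging of the metrics are all hypothetical choices, since the abstract does not specify them. Stemming and the three classifiers are omitted.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical, tiny stopword list for illustration; the study's list is not given.
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}


def preprocess(text):
    """Lowercase, split on whitespace, and drop stopwords (no stemming here)."""
    return [tok for tok in text.lower().split() if tok not in STOPWORDS]


def tfidf(docs_tokens):
    """Return one {term: weight} dict per document, using tf * ln(N / df).

    Assumption: this is one common TF-IDF variant; libraries often smooth it.
    """
    n_docs = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    vectors = []
    for tokens in docs_tokens:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def stratified_split(labels, test_ratio=0.2, seed=42):
    """Split indices so each class keeps roughly the same share in both subsets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_ratio)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 (assumed averaging)."""
    classes = sorted(set(y_true))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(classes)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Toy labels standing in for the sentiment classes of a comment dataset.
    labels = ["pos"] * 10 + ["neg"] * 5 + ["neu"] * 5
    train_idx, test_idx = stratified_split(labels)
    print(len(train_idx), len(test_idx))
    print(tfidf([preprocess("Video ini bagus"), preprocess("Video yang jelek")]))
```

In practice the feature vectors from `tfidf` would be fed to a classifier trained on the `train` indices and scored on the `test` indices with `macro_metrics`; a library implementation would normally replace each of these hand-written helpers.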
License
Copyright (c) 2025 Jurnal Teknik Informatika dan Teknologi Informasi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.





