Comparative Analysis of Naive Bayes and Support Vector Machine for Sentiment Classification of Indonesian-Language Mobile Application Reviews on Google Play Store
DOI:
https://doi.org/10.69533/informatech.volume3number1.515
Keywords:
Sentiment Analysis, Google Play Store, Naive Bayes, Support Vector Machine, TF-IDF
Abstract
This study compares the performance of Multinomial Naive Bayes and a Support Vector Machine (SVM) with a linear kernel for classifying the sentiment of Indonesian-language mobile application reviews collected from the Google Play Store. A total of 2,847 reviews of the GoPay digital wallet application were gathered by web scraping with the google-play-scraper library. After preprocessing (case folding, cleansing, tokenization, stopword removal, and stemming with the Sastrawi library), the final dataset comprised 2,634 usable reviews. Sentiment labels were assigned automatically from star ratings: ratings of 4 and 5 were labeled positive (1,841 reviews, 69.9%), and ratings of 1 and 2 were labeled negative (793 reviews, 30.1%). Features were extracted using TF-IDF, yielding a vocabulary of 8,432 unique terms. Models were trained and evaluated on a stratified 80:20 train-test split. The SVM used kernel=linear and C=1.0; Naive Bayes used alpha=1.0 (Laplace smoothing). SVM achieved an accuracy of 88.3%, precision of 0.89, recall of 0.88, and F1-score of 0.88, while Naive Bayes achieved an accuracy of 82.1%, precision of 0.84, recall of 0.82, and F1-score of 0.83. SVM outperformed Naive Bayes on all four evaluation metrics, and the gap was largest in the F1-score for the negative class (SVM: 0.71 vs. Naive Bayes: 0.56). These findings indicate that, on this dataset, SVM is more robust than Naive Bayes to class imbalance in informal Indonesian-language review data.
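The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' actual code: the toy reviews are hypothetical stand-ins for already preprocessed text, the helper `label_from_rating` and the `random_state` value are assumptions, and the abstract does not say how 3-star reviews were handled (here they are simply dropped). Only the hyperparameters (linear kernel, C=1.0, alpha=1.0) and the stratified 80:20 split come from the abstract.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def label_from_rating(rating):
    """Map a star rating to a sentiment label (hypothetical helper).

    4-5 -> "positive", 1-2 -> "negative", as stated in the abstract.
    Returning None for 3 stars is an assumption, not the authors' rule.
    """
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return None


# Hypothetical (text, stars) pairs standing in for preprocessed reviews;
# the study itself used 2,634 GoPay reviews with a roughly 70:30 class ratio.
reviews = [
    ("aplikasi bagus mudah pakai", 5),
    ("transaksi cepat sangat bantu", 5),
    ("mantap bayar jadi praktis", 4),
    ("suka fitur lengkap promo banyak", 5),
    ("layan bagus aplikasi lancar", 4),
    ("top up mudah cepat masuk", 5),
    ("puas pakai aplikasi ini", 4),
    ("biasa saja", 3),
    ("saldo hilang aplikasi error", 1),
    ("gagal login terus kecewa", 1),
    ("transaksi gagal uang tidak kembali", 2),
]

# Apply the rating-based labels and drop reviews without a label.
labeled = [(text, label_from_rating(stars)) for text, stars in reviews]
texts = [text for text, label in labeled if label is not None]
labels = [label for _, label in labeled if label is not None]

# Stratified 80:20 split keeps the class ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# Both classifiers share the same TF-IDF feature extraction step.
models = {
    "SVM (linear, C=1.0)": make_pipeline(
        TfidfVectorizer(), SVC(kernel="linear", C=1.0)
    ),
    "Multinomial NB (alpha=1.0)": make_pipeline(
        TfidfVectorizer(), MultinomialNB(alpha=1.0)
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        # Per-class F1 for the minority (negative) class.
        "f1_negative": f1_score(
            y_test, predictions, pos_label="negative", zero_division=0
        ),
    }
    print(name, results[name])
```

Fitting the vectorizer inside each pipeline means the TF-IDF vocabulary is learned from the training split only, which avoids leaking test-set terms into the features. Scores on a toy sample like this carry no meaning; the sketch only shows how the reported per-class F1 for the negative class could be computed alongside overall accuracy.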
License
Copyright (c) 2026 Norris Elden Salassa, Arpen Patanduk, Ade Yusupa, Yaulie Deo Y. Rindengan (Author)

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.










