Comparative Analysis of the Random Forest, SVM, and Logistic Regression Algorithms to Determine the Best Model for Diabetes Prediction

Authors

  • Alghifar Firgiawan, Universitas Bina Sarana Informatika
  • Fauzan Nawwir Andriansyah, Universitas Bina Sarana Informatika
  • Raihan Naufal Ramadhan, Universitas Bina Sarana Informatika
  • Sumanto Sumanto, Universitas Bina Sarana Informatika
  • Imam Budiawan, Universitas Bina Sarana Informatika
  • Roida Pakpahan, Universitas Bina Sarana Informatika

DOI:

https://doi.org/10.55606/jutiti.v5i3.6213

Keywords:

data mining, machine learning, Random Forest, Support Vector Machine, Logistic Regression

Abstract

Diabetes is a chronic metabolic disorder characterized by elevated blood glucose levels caused by the body’s inability to produce insulin or respond to it effectively. The increasing prevalence of diabetes in Indonesia calls for accurate, data-driven early detection systems to support the diagnostic process. This study compares the performance of three machine learning algorithms, Support Vector Machine (SVM), Random Forest, and Logistic Regression, in predicting diabetes from patient clinical data. The dataset, titled 100,000 Diabetes Clinical Dataset, was obtained from Kaggle. The analysis was carried out in the Orange Data Mining software in several stages: data preprocessing, One-Hot Encoding, model training, and evaluation with 10-fold cross-validation. Random Forest achieved the best performance with an accuracy of 97.1%, followed by Logistic Regression at 96.0% and SVM at 92.3%. These findings indicate that, on this dataset, an ensemble-based method such as Random Forest produced more stable and accurate predictions for diabetes diagnosis than the other two models.
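
The workflow described in the abstract (One-Hot Encoding, training three classifiers, 10-fold cross-validation) can be sketched in code. The study itself used Orange Data Mining, so the scikit-learn version below is a substitute for illustration, not the authors' implementation. The column names, the synthetic data generator, the stratified folds, the feature scaling step, and all hyperparameters are assumptions; the accuracies it produces are unrelated to the figures reported above.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC


def make_synthetic_patients(n_rows=600, seed=0):
    """Build a small synthetic table. This is NOT the Kaggle dataset;
    every column name here is hypothetical."""
    rng = np.random.default_rng(seed)
    frame = pd.DataFrame({
        "gender": rng.choice(["Female", "Male"], size=n_rows),
        "smoking_history": rng.choice(["never", "former", "current"], size=n_rows),
        "age": rng.uniform(18, 80, size=n_rows),
        "bmi": rng.normal(27, 5, size=n_rows),
        "blood_glucose_level": rng.normal(140, 40, size=n_rows),
    })
    # Synthetic label: higher glucose and BMI raise the chance of a positive case.
    score = 0.03 * (frame["blood_glucose_level"] - 140) + 0.1 * (frame["bmi"] - 27)
    prob = 1.0 / (1.0 + np.exp(-score))
    frame["diabetes"] = (rng.uniform(size=n_rows) < prob).astype(int)
    return frame


def compare_models(frame, categorical, target="diabetes", n_splits=10, seed=0):
    """Return the mean k-fold accuracy of each of the three classifiers."""
    X = frame.drop(columns=[target])
    y = frame[target]
    numeric = [c for c in X.columns if c not in categorical]

    models = {
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "SVM": SVC(kernel="rbf"),
        "Logistic Regression": LogisticRegression(max_iter=1000),
    }
    # Stratified, shuffled folds are an assumption about the evaluation setup.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)

    results = {}
    for name, model in models.items():
        # One-hot encode categorical columns and scale numeric ones.
        # A fresh transformer per model keeps the pipelines independent.
        preprocess = ColumnTransformer([
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
            ("num", StandardScaler(), numeric),
        ])
        pipeline = Pipeline([("prep", preprocess), ("clf", model)])
        fold_scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")
        results[name] = float(np.mean(fold_scores))
    return results


if __name__ == "__main__":
    data = make_synthetic_patients()
    for name, accuracy in compare_models(data, ["gender", "smoking_history"]).items():
        print(f"{name}: {accuracy:.3f}")
```

Placing the encoder and scaler inside the pipeline means they are refit on the training part of each fold, so no information from the held-out fold leaks into preprocessing.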

Published

2025-11-26

How to Cite

Firgiawan, A., Andriansyah, F. N., Ramadhan, R. N., Sumanto, S., Budiawan, I., & Pakpahan, R. (2025). Analisis Perbandingan Algoritma Random Forest, SVM, dan Logistic Regression untuk Menentukan Model Terbaik Prediksi Penyakit Diabetes. Jurnal Teknik Informatika Dan Teknologi Informasi, 5(3), 113–130. https://doi.org/10.55606/jutiti.v5i3.6213
