Application of SMOTE Random Forest Classification and Gradient Boosting on Imbalanced Tuberculosis Data
Abstract
Tuberculosis (TB) is an infectious disease that remains a serious problem in Indonesia because of its spread, and the available case data are imbalanced. This study compares the performance of the Random Forest and Gradient Boosting algorithms in classifying tuberculosis on imbalanced data. The Synthetic Minority Oversampling Technique (SMOTE) is applied as the data balancing method, and the models are evaluated using accuracy, precision, sensitivity, specificity, and AUC. The results show that Gradient Boosting without SMOTE gives the best performance, with an accuracy of 93% and an AUC of 0.91, whereas applying SMOTE reduces the model's performance. Random Forest gives stable results both with and without SMOTE, with an accuracy of 93% and an AUC of 0.89. It can therefore be concluded that Gradient Boosting without SMOTE provides the best classification results and can serve as a basis for developing classification methods for imbalanced tuberculosis data.
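As an illustration of the two building blocks named in the abstract, the following minimal sketch shows the interpolation idea behind SMOTE and the confusion-matrix metrics used for evaluation. It is not the authors' implementation: the function names `smote_sketch` and `binary_metrics`, the parameter defaults, and the toy data are assumptions made for this example, and a real analysis would more likely rely on an established library. AUC is left out because it requires predicted scores rather than hard labels.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity and specificity for labels in {0, 1},
    where 1 is treated as the positive (minority) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


if __name__ == "__main__":
    # Toy two-feature minority class; values are made up for illustration.
    toy_minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
    print(smote_sketch(toy_minority, n_new=3, k=2, seed=1))
    print(binary_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1]))
```

On imbalanced data, accuracy alone can look high even when few minority cases are detected, which is why sensitivity and specificity are reported alongside it.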
Copyright (c) 2025 Mutiara Amanda, Ismail Husein

This work is licensed under a Creative Commons Attribution 4.0 International License.











