Effectiveness of XGBoost, LightGBM, and CatBoost on an Imbalanced Predictive Maintenance Dataset

Authors

  • Moeng Sakmar, Universitas Jenderal Soedirman
  • Nurul Tiara Kadir, Universitas Jenderal Soedirman
  • Puteri Awaliatush Shofo, Universitas Jenderal Soedirman
  • Agus Darmawan, Universitas Jenderal Soedirman

DOI:

https://doi.org/10.61124/sinta.v3i1.145

Keywords:

predictive maintenance, class imbalance, SMOTE, machine learning

Abstract

In the era of Industry 4.0, unexpected machine failures remain a critical challenge for the manufacturing sector, causing unplanned downtime and significant financial losses. A fundamental obstacle in developing Machine Learning-based Predictive Maintenance systems is class imbalance: failure events occur far less often than normal operation, so models become biased toward the majority class and miss the anomalies that matter most. This study evaluates the effectiveness of the Synthetic Minority Over-sampling Technique (SMOTE) in improving failure detection on the AI4I 2020 dataset, using a comparative approach with three Gradient Boosting algorithms: XGBoost, LightGBM, and CatBoost. It highlights the Accuracy Paradox in scenarios without resampling, where a spuriously high accuracy masks the model's inability to detect failures (low Recall). The findings show that integrating SMOTE reshapes the models' decision boundaries and significantly increases sensitivity to the minority class. Based on an in-depth Confusion Matrix analysis, XGBoost combined with SMOTE was identified as the best-performing model: compared with its competitors, it best balanced the critical trade-off between high Recall, which protects asset safety, and few false alarms (False Positives), which otherwise reduce technician work efficiency. The study concludes that addressing data imbalance is a decisive step in building a predictive maintenance system that is not only technically precise but also reliable and safe to deploy in real industrial environments.
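The two ideas the abstract relies on, the Accuracy Paradox and SMOTE-style oversampling, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the labels, the class ratio, the sample points, and the helper names (`confusion_counts`, `accuracy_and_recall`, `smote_like_oversample`) are all hypothetical, and the oversampler is a simplified stand-in for SMOTE that only shows the core interpolation step. A real experiment would more likely use an established library such as imbalanced-learn together with the boosting libraries named above.

```python
import random


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means failure."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy_and_recall(y_true, y_pred):
    """Compute accuracy and recall (sensitivity to the failure class)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style sketch (needs at least two minority points).

    Each synthetic point lies on the line segment between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Sort the remaining minority points by squared distance to base.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical imbalanced labels: 30 failures among 1000 machine cycles.
y_true = [1] * 30 + [0] * 970

# A degenerate "model" that always predicts the majority (normal) class.
always_normal = [0] * 1000
acc, rec = accuracy_and_recall(y_true, always_normal)
print(f"accuracy={acc:.2f} recall={rec:.2f}")

# Hypothetical two-feature minority samples, oversampled by interpolation.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
synthetic = smote_like_oversample(minority, n_new=6)
print(len(synthetic), "synthetic minority points generated")
```

The always-normal predictor scores well on accuracy while detecting no failures at all, which is why Recall and the Confusion Matrix are the more informative lens here. In practice, oversampling would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.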

Author Biographies

Nurul Tiara Kadir, Universitas Jenderal Soedirman

Department of Informatics

Puteri Awaliatush Shofo, Universitas Jenderal Soedirman

Department of Informatics

Agus Darmawan, Universitas Jenderal Soedirman

Department of Informatics

Published

12/31/2025

How to Cite

Moeng Sakmar, Kadir, N. T., Puteri Awaliatush Shofo, & Agus Darmawan. (2025). Evektivitas Xgboost Lightgbm dan Catboost pada Dataset Imbalanced Predictive Maintenance. Jurnal SINTA: Sistem Informasi Dan Teknologi Komputasi, 3(1), 36–44. https://doi.org/10.61124/sinta.v3i1.145