Diabetes Prediction Using Machine Learning: Methods, Challenges, and Insights from a Systematic Literature Review (SLR)

Authors

  • Mustafa M. Abd zaid, Islamic University, Najaf, Iraq
  • Ahmed Abed Mohammed, College of Computer Science and Information Technology, University of Al-Qadisiyah

DOI:

https://doi.org/10.31642/JoKMC/2018/120204

Keywords:

Diabetes Prediction, Machine Learning, SLR, Research Gaps, Data Preprocessing

Abstract

This study systematically reviews and critically evaluates the current state of machine learning models for diabetes prediction, addressing key methodologies, challenges, and insights. Despite the growing body of research, a comprehensive systematic literature review (SLR) on this topic has been lacking. By consulting five major scientific databases (IEEE, ScienceDirect, Springer, Scopus, and ACM), this paper offers an in-depth analysis of existing studies' strengths, limitations, and research gaps. Key challenges discussed include data collection and preprocessing, handling missing values, feature importance assessment, standardization, and addressing class imbalance in datasets. Additionally, the review identifies underexplored areas and highlights opportunities for future research, such as developing standardized preprocessing frameworks and exploring advanced hybrid models. This SLR aims to guide researchers by summarizing existing evidence, resolving conflicts in the literature, and providing actionable directions for advancing diabetes prediction through machine learning.
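The preprocessing challenges named in the abstract (missing values, standardization, class imbalance) can be illustrated with a minimal sketch. This is not the authors' method or any reviewed study's pipeline; it is one common baseline, assuming median imputation, z-score scaling, and inverse-frequency class weights. The function names and the toy `glucose` and `labels` values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean, median, pstdev


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


# Hypothetical toy data: one feature with gaps and an imbalanced binary label.
glucose = [148, 85, None, 89, 137, None]
labels = [1, 0, 1, 0, 0, 0]

glucose_filled = impute_median(glucose)
glucose_scaled = standardize(glucose_filled)
weights = balanced_class_weights(labels)
```

In this toy case the minority class (label 1) receives twice the weight of the majority class. In a real study the imputation and scaling statistics would normally be computed on the training split only and then applied to the test split, to avoid leaking test information into the model.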

Published

2026-01-05

How to Cite

Abd zaid, M. M., & Mohammed, A. A. (2026). Diabetes Prediction Using Machine Learning: Methods, Challenges, and Insights from a Systematic Literature Review (SLR). Journal of Kufa for Mathematics and Computer, 12(2), 21-27. https://doi.org/10.31642/JoKMC/2018/120204
