Diabetes Prediction Using Machine Learning: Methods, Challenges, and Insights from a Systematic Literature Review (SLR)
DOI:
https://doi.org/10.31642/JoKMC/2018/120204
Keywords:
Diabetes Prediction, Machine Learning, SLR, Research Gaps, Data Preprocessing
Abstract
This study systematically reviews and critically evaluates the current state of machine learning models for diabetes prediction, addressing key methodologies, challenges, and insights. Despite the growing body of research, a comprehensive systematic literature review (SLR) on this topic has been lacking. By consulting five major scientific databases (IEEE, ScienceDirect, Springer, Scopus, and ACM), this paper offers an in-depth analysis of existing studies' strengths, limitations, and research gaps. Key challenges discussed include data collection and preprocessing, handling missing values, feature importance assessment, standardization, and addressing class imbalance in datasets. Additionally, the review identifies underexplored areas and highlights opportunities for future research, such as developing standardized preprocessing frameworks and exploring advanced hybrid models. This SLR aims to guide researchers by summarizing existing evidence, resolving conflicts in the literature, and providing actionable directions for advancing diabetes prediction through machine learning.
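Three of the preprocessing challenges named above (missing values, standardization, and class imbalance) can be illustrated with a minimal sketch. The code below is not taken from the reviewed studies: the toy glucose readings, the labels, and the helper names `impute_median`, `standardize`, and `balanced_class_weights` are hypothetical and chosen only for illustration. It assumes median imputation, z-score scaling, and inverse-frequency class weights, which are common choices but not the only ones.

```python
from collections import Counter
from statistics import mean, median, pstdev


def impute_median(column):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


# Hypothetical toy data: glucose readings with gaps, and binary outcomes
# (1 = diabetic, 0 = non-diabetic) with the positive class in the minority.
glucose = [148, 85, None, 89, 137, None, 78, 197]
labels = [1, 0, 1, 0, 1, 0, 0, 0]

glucose_filled = impute_median(glucose)
glucose_scaled = standardize(glucose_filled)
weights = balanced_class_weights(labels)
```

In a real study the imputation value and the scaling statistics would normally be computed on the training split only and then applied to the test split, so that no information leaks from held-out data into the model.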
License
Copyright (c) 2025 Mustafa M. Abd zaid, Ahmed Abed Mohammed

This work is licensed under a Creative Commons Attribution 4.0 International License, which allows users to copy, create extracts, abstracts, and new works from the Article, alter and revise the Article, and make commercial use of the Article (including reuse and/or resale of the Article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made, and does not represent the licensor as endorsing the use made of the work.









