An Analysis of Datasets for Student Performance Evaluation Based on Machine Learning

Authors

  • Husam Kadhim Gharkan, Ministry of Education, Al-Qadisiyah Directorate of Education, Al-Qadisiyah, 5009 Iraq
  • Mustafa Radif, College of Computer Science and Information Technology, University of Al-Qadisiyah, Al-Qadisiyah, 5009 Iraq
  • Ali Hakem Alsaeedi, College of Computer Science and Information Technology, University of Al-Qadisiyah, Al-Qadisiyah, 5009 Iraq, https://orcid.org/0000-0003-0966-9993

DOI:

https://doi.org/10.112222/ijits.v1.i1.19507

Keywords:

Student performance prediction, data analysis, machine learning, data mining, student performance dataset

Abstract

The prediction of student academic performance through data analysis and machine learning has become increasingly significant for improving educational systems and student outcomes. This study explores three publicly available datasets—Open University Learning Analytics Dataset (OULAD), xAPI-Edu-Data, and the UCI Student Performance dataset—to identify critical factors influencing academic success across secondary and higher education levels. These datasets include diverse data sources such as institutional records, behavioral engagement logs, and survey responses. The analysis employs various machine learning and deep learning techniques to evaluate the predictive performance of traditional models, ensemble methods, and hybrid architectures. An experimental validation was performed using a Random Forest classifier on the xAPI-Edu-Data dataset, achieving 80.56% accuracy and supporting the reliability of ensemble approaches. Results show that advanced ensemble and hybrid models like RF-SVM and ECNN-ResNet outperform classical methods in terms of accuracy, precision, recall, and F1-score. This research underscores the potential of intelligent systems to enable early identification of at-risk students, facilitate personalized learning interventions, and inform strategic academic decisions. These findings contribute to a growing body of work that promotes data-driven, equitable, and effective educational practices.
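
The abstract compares models by accuracy, precision, recall, and F1-score. As a minimal sketch of how those four figures can be derived from a classifier's predictions, the Python below computes them with macro averaging over classes. The function name `macro_metrics`, the three class labels, and the toy predictions are illustrative assumptions, not data from the study, and the abstract does not state which averaging scheme the authors used.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Tally per-class true positives, false positives, and false negatives.
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1
            fn[truth] += 1

    def safe_div(num, den):
        # Treat an undefined ratio (zero denominator) as 0.0.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1_scores = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n_classes = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n_classes),
        "recall": safe_div(sum(recalls), n_classes),
        "f1": safe_div(sum(f1_scores), n_classes),
    }


# Hypothetical performance classes and predictions, for illustration only.
y_true = ["low", "low", "medium", "medium", "high", "high"]
y_pred = ["low", "medium", "medium", "medium", "high", "low"]

report = macro_metrics(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Macro averaging weights every class equally, which matters when at-risk students form a small minority: a model can score high accuracy while still missing most of them, and per-class recall makes that visible.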

Published

2025-05-22

How to Cite

[1]
H. K. Gharkan, M. Radif, and A. H. Alsaeedi, “An Analysis of Datasets for Student Performance Evaluation Based on Machine Learning”, Iraqi j. inf. technol. syst., vol. 1, no. 1, pp. 46–53, May 2025, doi: 10.112222/ijits.v1.i1.19507.
