An Analysis of Datasets for Student Performance Evaluation Based on Machine Learning
DOI:
https://doi.org/10.112222/ijits.v1.i1.19507
Keywords:
Student performance prediction, data analysis, machine learning, data mining, student performance dataset
Abstract
The prediction of student academic performance through data analysis and machine learning has become increasingly significant for improving educational systems and student outcomes. This study explores three publicly available datasets—Open University Learning Analytics Dataset (OULAD), xAPI-Edu-Data, and the UCI Student Performance dataset—to identify critical factors influencing academic success across secondary and higher education levels. These datasets include diverse data sources such as institutional records, behavioral engagement logs, and survey responses. The analysis employs various machine learning and deep learning techniques to evaluate the predictive performance of traditional models, ensemble methods, and hybrid architectures. An experimental validation was performed using a Random Forest classifier on the xAPI-Edu-Data dataset, achieving 80.56% accuracy and supporting the reliability of ensemble approaches. Results show that advanced ensemble and hybrid models like RF-SVM and ECNN-ResNet outperform classical methods in terms of accuracy, precision, recall, and F1-score. This research underscores the potential of intelligent systems to enable early identification of at-risk students, facilitate personalized learning interventions, and inform strategic academic decisions. These findings contribute to a growing body of work that promotes data-driven, equitable, and effective educational practices.
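The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a multi-class setting with macro averaging; the abstract does not state which averaging the authors used. The function name `classification_metrics` and the toy labels are hypothetical and are not data or code from the study.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption made for this sketch; the source
    does not say how per-class scores were combined.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels (L = low, M = middle, H = high); not study data.
y_true = ["L", "M", "H", "M", "H", "L"]
y_pred = ["L", "M", "M", "M", "H", "L"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can hide weak performance on a minority class, such as at-risk students, which is why per-class precision and recall are usually reported alongside it.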
License
Copyright (c) 2025 Husam Kadhim Gharkan, Mustafa Radif, Ali Hakem Alsaeedi

This work is licensed under a Creative Commons Attribution 4.0 International License, which allows users to copy, create extracts, abstracts, and new works from the Article, alter and revise the Article, and make commercial use of the Article (including reuse and/or resale of the Article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made, and does not represent the licensor as endorsing the use made of the work.





