A Machine Learning Model for Cancer Disease Diagnosis using Gene Expression Data

Authors

  • Suhaam Adnan Abdul kareem Department of Postgraduate of Affairs, University of Baghdad
  • Zena Fouad Rasheed University of Baghdad https://orcid.org/0000-0002-0190-164X

DOI:

https://doi.org/10.31642/JoKMC/2018/100227

Keywords:

Cancer, Machine Learning, SVM, Decision trees, Random Forest.

Abstract

Cancer is one of the leading causes of death globally. Recently, microarray gene expression data have been used to aid the effective and early detection of cancer. Numerous research teams have investigated the use of machine learning techniques in biomedicine and bioinformatics to categorize cancer patients into high- or low-risk groups. To be useful, machine learning tools must be able to recognize important features in complex datasets. Here we present a machine learning approach to cancer detection and to the identification of genes critical for the diagnosis of cancer. We used the Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbor (KNN), and Gradient Boosting (GB) algorithms, which provide results that are more accurate than those of current models. The accuracies of SVM, KNN, RF, and GB were 97.41%, 89.3%, 88.1%, and 85.7%, respectively. SVM achieved the highest accuracy among the four algorithms. By creating a machine learning-based predictive system for early detection, our findings can help to decrease the prevalence of cancer.
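As a rough illustration of the comparison described in the abstract, the sketch below trains the same four classifier families with scikit-learn and reports test-set accuracy. It is not the authors' pipeline: the synthetic data from `make_classification`, the 70/30 split, the linear SVM kernel, and every hyperparameter are assumptions chosen only to make the example runnable, and the helper name `compare_classifiers` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_classifiers(X, y, seed=0):
    """Fit SVM, KNN, RF, and GB on one split and return test accuracies.

    Hypothetical helper: the split ratio and all hyperparameters are
    illustrative assumptions, not values taken from the paper.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        # Distance- and margin-based models are scaled first.
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="linear")),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        # Tree ensembles do not need feature scaling.
        "RF": RandomForestClassifier(n_estimators=200, random_state=seed),
        "GB": GradientBoostingClassifier(random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    # Synthetic stand-in for a microarray matrix: few samples, many features.
    X, y = make_classification(
        n_samples=120, n_features=200, n_informative=20, random_state=0
    )
    for name, acc in compare_classifiers(X, y).items():
        print(f"{name}: {acc:.3f}")
```

Accuracies from this sketch depend entirely on the synthetic data and should not be compared with the figures reported above; on real expression data, cross-validation and gene selection performed inside each training fold would give a more reliable estimate.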

Published

2023-08-31

How to Cite

Abdul kareem, S. A., & Rasheed, Z. F. (2023). A Machine Learning Model for Cancer Disease Diagnosis using Gene Expression Data. Journal of Kufa for Mathematics and Computer, 10(2), 179–185. https://doi.org/10.31642/JoKMC/2018/100227
