An Approach for Solving Missing Values in Data Set Using Clustering-Curve Fitting Technique
DOI:
https://doi.org/10.31642/JoKMC/2018/0202012
Keywords:
Data Mining, Missing Values, Clustering, Curve Fitting.
Abstract
Missing values in data sets are one of the greatest challenges in analyzing data to extract knowledge. This paper presents a new approach to the missing values problem that merges two different techniques: clustering (K-means and Expectation Maximization) and curve fitting. More than twenty thousand records of real health data collected from different Iraqi hospitals were used to build and test the proposed approach, which gave better results than the most popular techniques for estimating missing values, such as the most common value, overall average, class average, and class most common value. Several software tools were used in this work, including WEKA (Waikato Environment for Knowledge Analysis), Matlab, Excel, and C++.
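The abstract does not give the details of how clustering and curve fitting are combined, so the following is only a minimal sketch of one plausible reading: records are grouped by K-means on a fully observed attribute, and each missing value is estimated from a curve fitted to the complete records of the same cluster. The function names (`kmeans_1d`, `fit_line`, `impute`), the use of a single numeric attribute `x`, and the choice of a straight line as the fitted curve are all assumptions made for illustration, not the authors' method. Expectation Maximization clustering is not shown.

```python
import random

def kmeans_1d(xs, k, iters=50, seed=0):
    """Plain k-means on one numeric attribute; returns one cluster label per value."""
    rng = random.Random(seed)
    centers = rng.sample(sorted(set(xs)), k)  # k distinct starting centers
    labels = [0] * len(xs)
    for _ in range(iters):
        # Assignment step: nearest center wins.
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in xs]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def fit_line(points):
    """Least-squares fit of y = a + b*x; falls back to the mean of y if x is constant."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b

def impute(records, k=2):
    """records: list of (x, y) pairs where y may be None (assumed layout).

    Returns a new list in which each missing y is replaced by the value of
    the line fitted to the complete records of the same cluster.
    """
    xs = [x for x, _ in records]
    labels = kmeans_1d(xs, k)
    known_all = [(x, y) for x, y in records if y is not None]
    filled = []
    for (x, y), lab in zip(records, labels):
        if y is None:
            known = [(xi, yi) for (xi, yi), l in zip(records, labels)
                     if l == lab and yi is not None]
            # If a cluster has no complete records, fit on all complete records.
            a, b = fit_line(known or known_all)
            y = a + b * x
        filled.append((x, y))
    return filled

if __name__ == "__main__":
    # Toy data with two well-separated groups and one missing y in each.
    data = [(1, 2), (2, 4), (3, None), (4, 8), (5, 10),
            (100, 400), (101, 399), (102, None), (103, 397), (104, 396)]
    for row in impute(data, k=2):
        print(row)
```

A single overall average would assign the same estimate to both missing entries in the toy data, whereas the per-cluster fit follows the local trend of each group. This is the kind of difference the comparison with the baseline techniques is meant to capture.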
License
Copyright (c) 2023 Kadhim AlJanabi AlJanabi, Mansoor Habeebi Habeebi, Nawras Riyadh Neamah
This work is licensed under a Creative Commons Attribution 4.0 International License, which allows users to copy, create extracts, abstracts, and new works from the Article, alter and revise the Article, and make commercial use of the Article (including reuse and/or resale of the Article by commercial entities), provided the user gives appropriate credit (with a link to the formal publication through the relevant DOI), provides a link to the license, indicates if changes were made, and does not represent the licensor as endorsing the use made of the work.