Takagi-Sugeno-Kang (zero-order) model for the diagnosis of hepatitis disease

The aim of this paper is to use the Takagi-Sugeno-Kang (zero-order) model as a fuzzy neural network for the medical diagnosis of hepatitis diseases, which represent a major public health problem all around the world. To further improve the accuracy and speed of the diagnosis, the Microarray Attribute Reduction Scheme (MARS) was used to reduce the features (attributes) and the Mean Imputation (MI) method was used to treat the missing values. The hepatitis data were taken from the UCI machine learning repository. After the missing values are treated with the MI method, the dataset is partitioned into three training–testing partitions (30%–70%, 40%–60% and 20%–80%), and MARS is applied with different values of thr (from 0.1 to 0.9) in order to determine the number of attributes, which is the number of inputs to the fuzzy neural network. The results are recorded for each value of thr and each partition. The highest diagnosis accuracy was achieved for the 40%–60% training–testing partition, namely 100% for training and 95.77% for testing, with thr equal to 0.4 and with fewer training cycles and fewer fuzzy sets. This work was implemented in MATLAB 7.0.


Introduction
Medical diagnosis can be described as the process of determining or identifying a possible disease or disorder. A clinician uses several sources of data and classifies these data in order to find the disorder. A medical diagnosis is made by a physician based on an assessment of symptoms and diagnostic tests [1].
Nowadays, the use of computer technology in the field of medicine has increased greatly [2]. Intelligent systems such as neural networks, fuzzy logic, genetic algorithms and fuzzy neural systems have helped considerably in complex and uncertain medical tasks such as the diagnosis of diseases [3]. Over the last few decades, neural networks and fuzzy systems have established their reputation as alternative approaches to intelligent information processing. Both have certain advantages over classical methods, especially when vague data or prior knowledge is involved. However, their applicability has suffered from several weaknesses of the individual models. Therefore, combinations of neural networks with fuzzy systems have been proposed, in which the two models complement each other.
Fuzzy neural hybridization results in a hybrid intelligent system that synergizes these two techniques by combining the human-like reasoning style of fuzzy systems with the learning and connectionist structure of neural networks [4]. The basic idea of combining fuzzy systems and neural networks is to design an architecture that uses a fuzzy system to represent knowledge in an interpretable manner and the learning ability of a neural network to optimize its parameters [5].
Section 2 presents the method used for treating missing values. The scheme for attribute reduction is presented in Section 3. The structure and the parameter learning of the fuzzy neural network are presented in Sections 4 and 5. Sections 6 and 7 describe the dataset used and this work. The experimental results are described in Section 8. The conclusion and future work are given in Section 9.

Mean Imputation
Mean imputation (MI) is one of the most frequently used methods for treating missing values. It consists of replacing the missing data for a given feature (attribute) by the mean of all known values of that attribute in the class to which the instance with the missing attribute belongs. If the value $x_{ij}$ of the k-th class, $C_k$, is missing, then it is replaced by

$$\hat{x}_{ij} = \frac{1}{n_k} \sum_{i:\, x_{ij} \in C_k} x_{ij},$$

where $n_k$ represents the number of non-missing values in the j-th feature of the k-th class. According to Little and Rubin [6], the drawbacks of mean imputation are:
1) Sample size is overestimated.
2) Variance is underestimated.
3) Correlation is negatively biased.
4) The distribution of new values is an incorrect representation of the population values, because the shape of the distribution is distorted by adding values equal to the mean.
Replacing all missing records with a single value deflates the variance and artificially inflates the significance of any statistical tests based on it. Surprisingly, though, mean imputation has given good experimental results on datasets used for supervised classification [7].
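The class-wise replacement described above can be sketched in a few lines of Python. This is a minimal illustration and not the MATLAB implementation used in this work; the function name `mean_impute`, the use of `None` as the missing-value marker and the toy rows are assumptions made for the example.

```python
from collections import defaultdict


def mean_impute(rows, labels, missing=None):
    """Replace each missing entry by the mean of the known values of the
    same feature within the same class (class-wise mean imputation)."""
    sums = defaultdict(float)  # (class, feature) -> sum of known values
    counts = defaultdict(int)  # (class, feature) -> n_k, count of known values
    for row, k in zip(rows, labels):
        for j, value in enumerate(row):
            if value is not missing:
                sums[(k, j)] += value
                counts[(k, j)] += 1

    imputed = []
    for row, k in zip(rows, labels):
        new_row = []
        for j, value in enumerate(row):
            if value is missing:
                if counts[(k, j)] == 0:
                    raise ValueError(f"no known value for feature {j} in class {k}")
                value = sums[(k, j)] / counts[(k, j)]
            new_row.append(value)
        imputed.append(new_row)
    return imputed


# Toy data (hypothetical, not taken from the hepatitis dataset).
rows = [[1.0, None], [3.0, 4.0], [None, 10.0], [5.0, 20.0]]
labels = ["LIVE", "LIVE", "DIE", "DIE"]
print(mean_impute(rows, labels))
```

Keeping the sums and counts per (class, feature) pair makes the role of $n_k$ explicit: only the known values of a feature within a class contribute to its mean.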

Microarray Attribute Reduction Scheme
The Microarray Attribute Reduction Scheme (MARS) [8] is a scheme for attribute reduction that works as follows:
1. Given a training dataset S with n samples and m features, every sample in S is associated with a class label k ∈ {1, 2, ..., c}, where c is the total number of classes.
2. For each feature $x_i$, the mean $\mu_{ik}$ of the i-th feature over class k is calculated as

$$\mu_{ik} = \frac{1}{|S_k|} \sum_{j \in S_k} x_{ij},$$

where $x_{ij}$ is the value of the i-th feature in sample j and $|S_k|$ represents the number of samples in the k-th class.
3. The mean $\mu_i$ of the i-th attribute over all samples is

$$\mu_i = \frac{1}{n} \sum_{j=1}^{n} x_{ij}.$$

4. The score(i, k), indicating the ability of the i-th feature to identify the samples associated with class label k, is then obtained from a weighted voting scheme $v(x_{ij}, k)$. By measuring the distance between class k and class l ∈ {1, 2, ..., c} under the i-th feature, the score(i, k) metric reflects the fact that the higher the score, the better the i-th feature distinguishes the samples of class k.

Those features whose scores exceed a threshold value thr are characterized as the most discriminatory features and are recorded in the vector $v_g = (v_1, v_2, \ldots, v_i, \ldots, v_m)^T$, where the attribute $v_i$ is defined as

$$v_i = \begin{cases} 1, & \text{if the score of the } i\text{-th feature exceeds } thr, \\ 0, & \text{otherwise.} \end{cases}$$

Each $v_i$ indicates whether or not the i-th attribute is a selected attribute.
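The parts of MARS that can be stated without the scoring formula, namely the class means, the overall means and the thresholding step, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the per-feature `scores` list is assumed to be computed elsewhere by the score(i, k) metric of [8], which is not reproduced here.

```python
def class_means(rows, labels):
    """mu_ik: mean of feature i over the samples of class k.
    Returns a dict mapping each class label to a list of feature means."""
    groups = {}
    for row, k in zip(rows, labels):
        groups.setdefault(k, []).append(row)
    return {
        k: [sum(col) / len(col) for col in zip(*members)]
        for k, members in groups.items()
    }


def overall_means(rows):
    """mu_i: mean of feature i over all samples."""
    return [sum(col) / len(col) for col in zip(*rows)]


def select_features(scores, thr):
    """Selection vector v_g: v_i is 1 when the score of feature i exceeds thr.
    `scores` is assumed to hold one already-computed score per feature."""
    return [1 if s > thr else 0 for s in scores]


# Toy data (hypothetical): three samples, two features, two classes.
rows = [[1.0, 2.0], [3.0, 2.0], [5.0, 8.0]]
labels = [1, 1, 2]
print(class_means(rows, labels))
print(overall_means(rows))
print(select_features([0.35, 0.9, 0.4], thr=0.4))
```

The number of ones in the selection vector is the number of inputs passed to the fuzzy neural network, so a larger thr yields a smaller network.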

Structure of the fuzzy neural network
The biggest advantage of neural network methods is that they are general. Their disadvantages are that they are notoriously slow and that it is very difficult to determine how the network makes its decision.
In the field of artificial intelligence, a fuzzy neural network is a combination of artificial neural networks and fuzzy logic. Fuzzy neural hybridization results in a hybrid intelligent system that synergizes these two techniques by combining the human-like reasoning style of fuzzy systems with the learning and connectionist structure of neural networks. The main strength of fuzzy neural systems is that they are universal approximators with the ability to solicit interpretable IF-THEN rules [9].
Figure 1 shows the structure of the Takagi-Sugeno-Kang fuzzy neural network, which consists of four layers described as follows [10,11]:
Layer 1: Each node in this layer only transmits the input values directly to the next layer.
In the layer descriptions, the following notation is used:
$c_{ij}$ is the center of the membership function of fuzzy set j for input i.
$b_{ij}$ is the variance of the membership function of fuzzy set j for input i.
N is the number of inputs.
fsetno is the number of fuzzy sets for each input.

Parameter learning of the fuzzy neural network
Based on the above structure, a learning algorithm is used to determine the proper centers and variances of the fuzzy sets in the system. In this paper, the backpropagation algorithm is used to tune the parameters of the fuzzy neural network:
• The weights (w) in layer 4 are updated as given in [10,11].
• The centers and variances of the fuzzy sets in layer 2 are updated as given in [10,11].
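As an illustration of what a backpropagation step for the layer 4 weights can look like, the sketch below applies plain gradient descent to the squared error E = (y - d)^2 / 2 of a zero-order model whose output y is the sum of the normalized firing strengths multiplied by the consequent weights. This is the textbook update under those assumptions, not necessarily the exact rule of [10,11]; the function name, the learning rate `lr` and the toy numbers are hypothetical.

```python
def update_consequents(consequents, normalized, target, lr=0.1):
    """One gradient-descent step on the consequent weights.

    Assumes y = sum_r normalized[r] * consequents[r] and E = 0.5 * (y - target)**2,
    so dE/dw_r = (y - target) * normalized[r].
    """
    y = sum(n * w for n, w in zip(normalized, consequents))
    error = y - target
    return [w - lr * error * n for w, n in zip(consequents, normalized)]


# Toy step (hypothetical values): two rules with equal normalized strength.
weights = [0.0, 1.0]
strengths = [0.5, 0.5]
new_weights = update_consequents(weights, strengths, target=1.0, lr=0.5)
print(new_weights)
```

The centers and variances in layer 2 would be tuned in the same spirit, by propagating the same output error back through the normalization and product layers.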

Dataset description
The hepatitis database was taken from the UCI machine learning repository [10]. The dataset contains 155 samples.
Each sample in the dataset has 20 attributes, including the class attribute (the output), which takes the values DIE and LIVE, distributed as DIE: 32 and LIVE: 123. The output shows whether a patient with hepatitis is alive or dead.
Table 1 shows the details of the hepatitis data. There are a number of missing values in the attributes (indicated by "?"); Table 2 describes these missing attribute values.

Experimental results
This section evaluates the performance of this work on the diagnosis of hepatitis disease. The dataset representing this problem was obtained from the UCI machine learning benchmark repository and consists of real-world data. Tables 1 and 2 describe the dataset. In this work, as shown in Table 3, the number of training patterns is smaller than the number of testing patterns.
The diagnosis accuracy on the testing data for the reduced feature subset is shown in Tables 5-9 for different values of thr and the three partition sets. In general, high diagnosis accuracy was achieved for the 40%-60% training-testing partition at every value of thr, and the highest accuracy, namely 95.77%, was achieved with thr equal to 0.4, with fewer training cycles and fewer fuzzy sets.

Layer 2:
Each node in this layer corresponds to one fuzzy set (linguistic label) of one of the input variables in Layer 1. The operation in this layer is called fuzzification: each crisp input value is converted to a fuzzy value by a membership function (in this work, a Gaussian function defined by its center and variance).
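The fuzzification step can be sketched as follows. The exact form of the Gaussian used in this work is not recoverable here, so the common form exp(-(x - c)^2 / (2b)), with center c and variance b, is assumed; the function names and toy values are hypothetical.

```python
import math


def gaussian_membership(x, c, b):
    """Degree of membership of crisp value x in a Gaussian fuzzy set.
    Assumed form: exp(-(x - c)**2 / (2 * b)), c = center, b = variance."""
    return math.exp(-((x - c) ** 2) / (2.0 * b))


def fuzzify(inputs, centers, variances):
    """Layer 2 sketch: centers[i][j] and variances[i][j] describe fuzzy set j
    of input i; returns memberships[i][j] for the given crisp inputs."""
    return [
        [gaussian_membership(x, c, b) for c, b in zip(cs, bs)]
        for x, cs, bs in zip(inputs, centers, variances)
    ]


# Toy example (hypothetical): two inputs, two fuzzy sets each.
memberships = fuzzify(
    inputs=[1.0, 2.0],
    centers=[[1.0, 2.0], [0.0, 2.0]],
    variances=[[0.5, 0.5], [1.0, 1.0]],
)
print(memberships)
```

A crisp value equal to a center gets membership 1, and the membership decays smoothly as the value moves away from the center, at a rate set by the variance.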

Layer 3:
Nodes in this layer are rule nodes and constitute the antecedents of the fuzzy rule base. Every node in this layer is a fixed node labeled Π, whose output is the product of all incoming signals, that is, the firing strength of the rule.

Layer 4:
This layer acts as the defuzzifier and normalizer, and is called "normalization and summation of firing strengths". To obtain the output of the fuzzy neural network, every node in this layer computes the ratio of a rule's firing strength to the sum of all rules' firing strengths:

$$\bar{w}_i = \frac{w_i}{\sum_{j} w_j},$$

where $w_i$ is the firing strength of rule i.
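The forward pass through layers 3 and 4 can be sketched as below. Two assumptions are made that are not confirmed by the text: rule j combines the j-th fuzzy set of every input (so the number of rules equals fsetno), and the zero-order output is the sum of the normalized firing strengths multiplied by constant consequent weights. The function names and toy values are hypothetical.

```python
import math


def rule_strengths(memberships):
    """Layer 3 sketch: firing strength of rule j as the product over all
    inputs i of memberships[i][j] (assumed rule structure)."""
    n_rules = len(memberships[0])
    return [math.prod(row[j] for row in memberships) for j in range(n_rules)]


def tsk_output(strengths, consequents):
    """Layer 4 sketch: normalize the firing strengths and sum them,
    weighted by the constant (zero-order) consequents."""
    total = sum(strengths)
    normalized = [s / total for s in strengths]
    return sum(n * w for n, w in zip(normalized, consequents))


# Toy example (hypothetical): two inputs, two rules.
memberships = [[0.5, 1.0], [0.5, 0.5]]
strengths = rule_strengths(memberships)
print(strengths)
print(tsk_output(strengths, consequents=[0.0, 1.0]))
```

Because the strengths are normalized before the weighted sum, the output always lies between the smallest and largest consequent weight.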

To start training the fuzzy model, the initial weights are chosen as follows:
o The following steps show how the membership function parameters (c is the center and b is the variance of the membership function) and the weights for the consequent part are initialized.
• rulehno represents the number of rules in layer 3.
• Desireout ($y_a$) represents the desired output in the training-testing partition whose number of training patterns is patternno (N).
• fsetno represents the number of fuzzy sets for each input.
• After the training and testing patterns are prepared by applying the MI and MARS algorithms to the original patterns (samples), the fuzzy neural model is run many times, each time with a different number of fuzzy sets (6, 7, 8, 9, 10 or 11). The training and testing rates, the errors in the training and testing stages, the number of fuzzy sets and the number of training cycles are recorded for the two best runs in each training-testing partition. Tables 5-9 show the accuracy for the three different training and testing sets at different values of thr.

Table (5) Accuracy for three different training and testing sets in case thr = 0.1

Conclusion and future work
• In order to keep the whole real-world data as it is, any missing values must be treated; this work uses MI for this purpose.
• The number of fuzzy sets of the fuzzy neural network, which is selected by trial and error, plays an important role in achieving higher diagnosis accuracy, as shown in Tables 5-9. For a particular input, some features of the hepatitis problem may not be effective with respect to the number of fuzzy sets. By extracting the relevant features with MARS, the training time can be minimized. As future work, we will try to extend the algorithm to improve backpropagation, to use another feature selection algorithm and another algorithm for treating missing values, and to use a genetic algorithm to predict a suitable number of fuzzy sets.
• This paper focuses on the algorithms used for reducing the attributes and treating the missing values, and ignores the other factors that improve the performance of the backpropagation algorithm.
• As Tables 5-9 show, the above methods are effective in diagnosis; their accuracy may be reduced when handling datasets with a very large number of attributes or with very few attributes.