Analysis of Breast Cancer Data using Kaplan-Meier Survival Analysis

The Kaplan-Meier estimator is a very popular method that provides better estimates of the median survival time when the sample size is reasonably large. The aim of this research was to study, analyze, and estimate the survival time from real data on breast cancer patients in Iraq.


Introduction
The Kaplan-Meier procedure is a method of estimating time-to-event models in the presence of censored cases. An intrinsic characteristic of survival data is the possibility of censoring of observations (that is, the actual time until the event is not observed). Such censoring can arise from withdrawal from the experiment or termination of the experiment [8]. The Kaplan-Meier model is based on estimating conditional probabilities at each time point when an event occurs and taking the product limit of those probabilities to estimate the survival rate at each point in time. The Kaplan-Meier estimator may be obtained as the limiting case of the classical actuarial estimator, and it seems to have been first proposed by Bohmer (1912) [5]. Kaplan and Meier (1958) were the first to solve the problem of estimating the survival curve in a simple way while accounting for right censoring.
Bland and Altman (1998) [4] provided statistical notes on survival probabilities (the Kaplan-Meier method). Tovey et al. (2009) [9] presented Kaplan-Meier survival curves using breast cancer-specific death as the outcome endpoint, with log-rank testing. Rajaeefard et al. (2009) [7] concluded that higher stage, grade, age, and a history of benign tumor were the most important risk factors correlated with mortality in breast cancer patients. Zino (2010) [10] also applied Kaplan-Meier analysis, and the Kaplan-Meier method with comparisons made by the log-rank test was studied by Al-A'bidy (2011) [2].
The remainder of this paper is organized as follows. Section 2 reviews the data and methods, and Section 3 explains the Kaplan-Meier method and the log-rank test. Section 4 presents the analysis of results comparing the breast tumor groups, followed by the conclusion in Section 5.

Data and Methods
The data were collected as a simple random sample from the specialized breast diseases clinic in Al- ... The other tumors group comprised 100 patients aged between 16 and 70 years. The data were summarized using tables and graphs. Status was coded as (1 = patient is still in remission, 0 = censored).

Kaplan-Meier Survival Analysis
The Kaplan-Meier estimator (K-M) is a nonparametric estimator which may be used to estimate the survival distribution function from censored data. The Kaplan-Meier estimator is also called the product-limit (PL) estimator because of its typical product structure [1].
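The estimator is similar to the actuarial estimator except that the lengths of the intervals $I_i$ are variable. In fact, let $t_i$, the right endpoint of $I_i$, be the $i$-th ordered censored or uncensored observation. We observe the pairs $(Y_1, \delta_1), \ldots, (Y_n, \delta_n)$, where $\delta_i$ indicates whether $Y_i$ is an uncensored ($\delta_i = 1$) or censored ($\delta_i = 0$) observation. For now, assume no ties. Let $Y_{(1)} < Y_{(2)} < \cdots < Y_{(n)}$ be the order statistics of $Y_1, \ldots, Y_n$, and with an abuse of notation, define $\delta_{(i)}$ to be the value of $\delta$ associated with $Y_{(i)}$, that is, $\delta_{(i)} = \delta_j$ when $Y_{(i)} = Y_j$. Note that $\delta_{(1)}, \ldots, \delta_{(n)}$ are not ordered. Let $R(t)$ denote the risk set at time $t$, which is the set of subjects still alive at time $t^-$, and let

$$n_i = \#R(Y_{(i)}), \qquad d_i = \#\{\text{deaths at } Y_{(i)}\}, \qquad p_i = P(\text{surviving through } I_i \mid \text{alive at the beginning of } I_i), \qquad q_i = 1 - p_i.$$

From the estimates $\hat{q}_i = d_i / n_i$ and $\hat{p}_i = 1 - \hat{q}_i$, the product-limit estimate of the survival function is

$$\hat{S}(t) = \prod_{Y_{(i)} \le t} \hat{p}_i = \prod_{Y_{(i)} \le t} \left(1 - \frac{d_i}{n_i}\right).$$

The variance of the estimator is given by:

$$\widehat{\mathrm{Var}}[\hat{S}(t)] = \hat{S}(t)^2 \sum_{Y_{(i)} \le t} \frac{d_i}{n_i (n_i - d_i)}.$$

This is known as Greenwood's formula [6].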
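The product-limit construction and Greenwood's formula described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used for the analysis in this paper: the function name `kaplan_meier`, its arguments, and the toy data are hypothetical; events are assumed to be coded 1 = event observed and 0 = censored; and a subject censored at time t is counted as at risk at t.

```python
def kaplan_meier(times, events):
    """Product-limit estimate with Greenwood variance.

    times  : observed times Y_i (event or censoring)
    events : indicators delta_i (1 = event observed, 0 = censored)
    Returns a list of (time, survival, variance) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    gw_sum = 0.0  # running sum of d_i / (n_i * (n_i - d_i))
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0        # events at time t
        leaving = 0  # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            leaving += 1
            i += 1
        if d > 0:
            surv *= 1.0 - d / n_at_risk
            if n_at_risk > d:
                gw_sum += d / (n_at_risk * (n_at_risk - d))
                var = surv ** 2 * gw_sum
            else:
                # Everyone at risk failed: the Greenwood term is undefined.
                var = float("nan")
            curve.append((t, surv, var))
        n_at_risk -= leaving
    return curve


# Hypothetical toy data: five subjects, two of them censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
for t, s, var in curve:
    print(t, round(s, 4), round(var, 4))
```

For this toy sample the estimate steps down at times 1, 3 and 4, taking the values 4/5, 8/15 and 4/15. The censored observation at time 2 contributes no step but reduces the number at risk for the later factors.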

The Log-rank Test
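The log-rank test is a non-parametric method for testing the null hypothesis that the groups being compared are samples from the same population as regards survival experience. The first step is to arrange the survival times, both observed and censored, in rank order. Suppose, for illustration, that there are two groups, A and B. For each time $t_j$ with a failure we calculate the numbers at risk in each group ($n_{Aj}$ and $n_{Bj}$) and the numbers of observed failures ($d_{Aj}$ and $d_{Bj}$). If there are $d_{Aj}$ and $d_{Bj}$ failures at time $t_j$ in groups A and B, respectively, then the data can be arranged in a $2 \times 2$ table as follows [3]:

$$\begin{array}{lccc}
 & \text{Died} & \text{Survived} & \text{Total} \\
\text{Group A} & d_{Aj} & n_{Aj} - d_{Aj} & n_{Aj} \\
\text{Group B} & d_{Bj} & n_{Bj} - d_{Bj} & n_{Bj} \\
\text{Total} & d_j & n_j - d_j & n_j
\end{array}$$

Except for tied survival times, $d_j = 1$ and each of $d_{Aj}$ and $d_{Bj}$ is 0 or 1. Note also that if a subject is censored at $t_j$ then that subject is considered at risk at that time and so included in $n_j$.
Under the null hypothesis that the risk of death is the same in the two groups, we would expect the number of deaths at any time to be distributed between the two groups in proportion to the numbers at risk. That is,

$$E(d_{Aj}) = \frac{n_{Aj}\, d_j}{n_j}, \qquad \mathrm{var}(d_{Aj}) = \frac{d_j (n_j - d_j)\, n_{Aj}\, n_{Bj}}{n_j^2 (n_j - 1)}. \qquad (4)$$

Summing over all times of death, $t_j$, gives

$$O_A = \sum_j d_{Aj}, \qquad E_A = \sum_j E(d_{Aj}), \qquad V_A = \sum_j \mathrm{var}(d_{Aj}).$$

Similar sums can be obtained for group B and it follows from (4) that $E_A + E_B = O_A + O_B$. A test statistic for the equivalence of the death rates in the two groups is

$$X^2 = \frac{(O_A - E_A)^2}{V_A},$$

which is approximately a $\chi^2_{(1)}$. The log-rank statistic approaches a chi-square distribution with one degree of freedom [1]. The hazard ratio and its sampling variability can also be estimated from these sums.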
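The two-group calculation described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the procedure used to produce the results in this paper: the function name `logrank_test`, the group labels, and the toy data are hypothetical, and events are assumed to be coded 1 = event observed, 0 = censored. The hazard ratio $(O_A/E_A)/(O_B/E_B)$ and the standard error $\sqrt{1/E_A + 1/E_B}$ of its logarithm are one commonly quoted approximation, assumed here because the text does not spell the formulas out. The sketch also assumes that $V_A$, $E_A$, $E_B$ and $O_B$ are all positive.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test based on observed and expected event counts."""
    group_a = list(zip(times_a, events_a))
    group_b = list(zip(times_b, events_b))
    # Distinct times at which at least one event occurred in either group.
    event_times = sorted({t for t, e in group_a + group_b if e})
    o_a = o_b = e_a = v_a = 0.0
    for t in event_times:
        n_a = sum(1 for y, _ in group_a if y >= t)  # at risk in A just before t
        n_b = sum(1 for y, _ in group_b if y >= t)  # at risk in B just before t
        d_a = sum(1 for y, e in group_a if y == t and e)
        d_b = sum(1 for y, e in group_b if y == t and e)
        n, d = n_a + n_b, d_a + d_b
        o_a += d_a
        o_b += d_b
        e_a += n_a * d / n
        if n > 1:  # the variance term is zero when only one subject is at risk
            v_a += d * (n - d) * n_a * n_b / (n * n * (n - 1))
    e_b = o_a + o_b - e_a  # because E_A + E_B = O_A + O_B
    chi2 = (o_a - e_a) ** 2 / v_a
    p_value = erfc(sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    hazard_ratio = (o_a / e_a) / (o_b / e_b)  # assumed form, see lead-in
    se_log_hr = sqrt(1.0 / e_a + 1.0 / e_b)   # assumed form, see lead-in
    return {"O_A": o_a, "E_A": e_a, "V_A": v_a, "chi2": chi2,
            "p": p_value, "HR": hazard_ratio, "se_log_HR": se_log_hr}


# Hypothetical toy data with no censoring.
result = logrank_test([1, 3, 5], [1, 1, 1], [2, 4, 6], [1, 1, 1])
print(result)
```

A small statistic, and hence a large p-value, indicates that the observed number of events in group A is close to what would be expected if both groups shared the same survival experience.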

Results
Zino (2010) [10] examined a potential role for sirtuins in breast cancer disease (including anti-tumor treatment); the Kaplan-Meier analysis and Cox regression analysis demonstrated the relevant pathological prognostic markers. The survivorship function is estimated for three distinct groups of breast tumor patients (malignant, benign, and other tumors) by using the Kaplan-Meier method.

Figure (1)
Figure (1) shows a 95% confidence interval for the survival time of each group by remission status.
The graph also allows us to represent visually the median survival time and survival rates from the life tables, such as the 1-year survival rate. In Figure (2), the horizontal axis shows the time to event, while the vertical axis shows the probability of survival and the cumulative hazard. In this plot, the survival curve falls to zero and the cumulative hazard curve rises above 1.5. Thus, any point on the survival curve shows the probability that a patient with a given diagnosis will not have experienced relief by that time. The plot for malignant tumor lies below that of benign tumor throughout most of the trial, which suggests that malignant tumor may give faster relief than benign tumor. To determine whether these differences are due to chance, look at the comparison tables.
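Reading the median survival time, a survival rate at a fixed time, or the cumulative hazard off a Kaplan-Meier step curve can be sketched as follows. This is a minimal illustration, not the procedure behind the figures: the helper names and the curve values are hypothetical, the curve is assumed to be a list of (time, survival) pairs sorted by time, and the cumulative hazard is assumed to be computed as $-\ln \hat{S}(t)$.

```python
from math import inf, log


def survival_at(curve, t):
    """Value of the step function S(t): 1.0 before the first event time,
    otherwise the most recent estimate at or before t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def median_survival(curve):
    """Smallest event time at which S(t) <= 0.5, or None if never reached."""
    for time, surv in curve:
        if surv <= 0.5:
            return time
    return None


def cumulative_hazard_at(curve, t):
    """Cumulative hazard taken as -ln S(t) (an assumption of this sketch)."""
    s = survival_at(curve, t)
    return inf if s == 0.0 else -log(s)


# Hypothetical step curve: (event time, estimated survival probability).
curve = [(1, 0.8), (3, 8 / 15), (4, 4 / 15)]
print(median_survival(curve))
print(survival_at(curve, 3.5))
print(cumulative_hazard_at(curve, 4))
```

With this convention the median is the first time the curve reaches or falls below 0.5, which corresponds to where a horizontal line at 0.5 meets the plotted step function.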
With the Kaplan-Meier survival analysis procedure, the distribution of time to effect was examined for two or more different groups. The comparison tests show that there is a statistically significant difference in survival times between the malignant and benign tumor groups only.
[9] ... Mallon, EA, Cooke, TG and Edwards (2009). "Poor Survival Outcomes in HER2-Positive Breast Cancer Patients with Low-Grade, Node-Negative Tumors". British Journal of Cancer, Cancer Research UK, Vol. 100, No. 5, 680-683.
[10] Zino, S.M.W. (2010). "Investigations into the Expression of Sirtuins in Breast Cancer: In Vivo and In Vitro Studies". Ph.D. Thesis, University of Glasgow.
Test of Equality of Survival Distributions for the Different Levels of Diagnosis.
The table offers a quick numerical comparison of the "typical" times to effect for each of the tumors. Since there is a lot of overlap in the confidence intervals, it is unlikely that there is much difference in the "average" survival time, as shown in Tables (2) and (3).