Comparing Survival Curves
We are often interested in assessing whether there are differences in survival (or cumulative incidence of event) among different groups of participants. For example, in a clinical trial with a survival outcome, we might be interested in comparing survival between participants receiving a new drug as compared to a placebo (or standard therapy). In an observational study, we might be interested in comparing survival between men and women, or between participants with and without a particular risk factor (e.g., hypertension or diabetes). There are several tests available to compare survival among independent groups.
The Log Rank Test
The log rank test is a popular test of the null hypothesis of no difference in survival between two or more independent groups. The test compares the entire survival experience between groups and can be thought of as a test of whether the survival curves are identical (overlapping) or not. Survival curves are estimated for each group, considered separately, using the Kaplan-Meier method and compared statistically using the log rank test. It is important to note that there are several variations of the log rank test statistic that are implemented by various statistical computing packages (e.g., SAS, R^{4,6}). We present one version here that is linked closely to the chi-square test statistic and compares observed to expected numbers of events at each time point over the follow-up period.
Example:
A small clinical trial is run to compare two combination treatments in patients with advanced gastric cancer. Twenty participants with stage IV gastric cancer who consent to participate in the trial are randomly assigned to receive chemotherapy before surgery or chemotherapy after surgery. The primary outcome is death and participants are followed for up to 48 months (4 years) following enrollment into the trial. The experiences of participants in each arm of the trial are shown below.
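| Chemotherapy Before Surgery: Month of Death | Chemotherapy Before Surgery: Month of Last Contact | Chemotherapy After Surgery: Month of Death | Chemotherapy After Surgery: Month of Last Contact |
|---|---|---|---|
| 8 | 8 | 33 | 48 |
| 12 | 32 | 28 | 48 |
| 26 | 20 | 41 | 25 |
| 14 | 40 | | 37 |
| 21 | | | 48 |
| 27 | | | 25 |
| | | | 43 |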
Six participants in the chemotherapy before surgery group die over the course of follow-up as compared to three participants in the chemotherapy after surgery group. Other participants in each group are followed for varying numbers of months, some to the end of the study at 48 months (in the chemotherapy after surgery group). Using the procedures outlined above, we first construct life tables for each treatment group using the Kaplan-Meier approach.
Life Table for Group Receiving Chemotherapy Before Surgery
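| Time, Months | Number at Risk N_{t} | Number of Deaths D_{t} | Number Censored C_{t} | Survival Probability |
|---|---|---|---|---|
| 0 | 10 | | | 1 |
| 8 | 10 | 1 | 1 | 0.900 |
| 12 | 8 | 1 | | 0.788 |
| 14 | 7 | 1 | | 0.675 |
| 20 | 6 | | 1 | 0.675 |
| 21 | 5 | 1 | | 0.540 |
| 26 | 4 | 1 | | 0.405 |
| 27 | 3 | 1 | | 0.270 |
| 32 | 2 | | 1 | 0.270 |
| 40 | 1 | | 1 | 0.270 |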
Life Table for Group Receiving Chemotherapy After Surgery
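| Time, Months | Number at Risk N_{t} | Number of Deaths D_{t} | Number Censored C_{t} | Survival Probability |
|---|---|---|---|---|
| 0 | 10 | | | 1 |
| 25 | 10 | | 2 | 1.000 |
| 28 | 8 | 1 | | 0.875 |
| 33 | 7 | 1 | | 0.750 |
| 37 | 6 | | 1 | 0.750 |
| 41 | 5 | 1 | | 0.600 |
| 43 | 4 | | 1 | 0.600 |
| 48 | 3 | | 3 | 0.600 |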
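The two life tables can be reproduced with a few lines of code. The following is a minimal sketch, not a reference implementation: the function name `kaplan_meier` and the tuple layout of each row are illustrative choices, and it assumes (as the tables do) that a participant censored at time t is still counted as at risk at time t.

```python
from collections import Counter

def kaplan_meier(death_times, censor_times):
    """Return rows (time, n_at_risk, deaths, censored, survival) for one group."""
    deaths = Counter(death_times)
    censored = Counter(censor_times)
    n_at_risk = len(death_times) + len(censor_times)
    survival = 1.0
    rows = []
    for t in sorted(set(deaths) | set(censored)):
        d, c = deaths[t], censored[t]
        # Product-limit step: multiply by the fraction surviving past time t.
        survival *= (n_at_risk - d) / n_at_risk
        rows.append((t, n_at_risk, d, c, survival))
        # Deaths and censored observations both leave the risk set after t.
        n_at_risk -= d + c
    return rows

# Months of death and months of last contact from the example data.
before = kaplan_meier([8, 12, 26, 14, 21, 27], [8, 32, 20, 40])
after = kaplan_meier([33, 28, 41], [48, 48, 25, 37, 48, 25, 43])

for label, table in (("Before surgery", before), ("After surgery", after)):
    print(label)
    for t, n, d, c, s in table:
        print(f"  t={t:2d}  N={n:2d}  D={d}  C={c}  S={s:.3f}")
```

Unlike the tables, the sketch omits the time 0 row, where the survival probability is 1 by definition.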
The two survival curves are shown below.
Survival in Each Treatment Group
The survival probabilities for the chemotherapy after surgery group are higher than the survival probabilities for the chemotherapy before surgery group, suggesting a survival benefit. However, these survival curves are estimated from small samples. To compare survival between groups we can use the log rank test. The null hypothesis is that there is no difference in survival between the two groups or that there is no difference between the populations in the probability of death at any point. The log rank test is a nonparametric test and makes no assumptions about the survival distributions. In essence, the log rank test compares the observed number of events in each group to what would be expected if the null hypothesis were true (i.e., if the survival curves were identical).
H_{0}: The two survival curves are identical (or S_{1t} = S_{2t}) versus H_{1}: The two survival curves are not identical (or S_{1t} ≠ S_{2t}, at any time t) (α=0.05).
The log rank statistic is approximately distributed as a chi-square test statistic. There are several forms of the test statistic, and they vary in terms of how they are computed. We use the following:
Χ^{2} = Σ_{j} (ΣO_{jt} - ΣE_{jt})^{2} / ΣE_{jt}
where ΣO_{jt} represents the sum of the observed number of events in the j^{th} group over time (e.g., j=1,2) and ΣE_{jt} represents the sum of the expected number of events in the j^{th} group over time.
The observed and expected numbers of events are computed at each event time and then summed within each comparison group. The log rank statistic has degrees of freedom equal to k-1, where k represents the number of comparison groups. In this example, k=2 so the test statistic has 1 degree of freedom.
To compute the test statistic we need the observed and expected numbers of events at each event time. The observed numbers of events come from the sample, and the expected numbers of events are computed assuming that the null hypothesis is true (i.e., that the survival curves are identical).
To generate the expected numbers of events we organize the data into a life table with rows representing each event time, regardless of the group in which the event occurred. We also keep track of group assignment. We then estimate the proportion of events that occur at each time (O_{t}/N_{t}) using data from both groups combined under the assumption of no difference in survival (i.e., assuming the null hypothesis is true). We multiply these estimates by the number of participants at risk at that time in each of the comparison groups (N_{1t} and N_{2t} for groups 1 and 2 respectively).
Specifically, we compute for each event time t the number at risk in each group, N_{jt} (where j indicates the group, j=1, 2), and the number of events (deaths), O_{jt}, in each group. The table below contains the information needed to conduct the log rank test to compare the survival curves above. Group 1 represents the chemotherapy before surgery group, and group 2 represents the chemotherapy after surgery group.
Data for Log Rank Test to Compare Survival Curves
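| Time, Months | Number at Risk in Group 1 N_{1t} | Number at Risk in Group 2 N_{2t} | Number of Events (Deaths) in Group 1 O_{1t} | Number of Events (Deaths) in Group 2 O_{2t} |
|---|---|---|---|---|
| 8 | 10 | 10 | 1 | 0 |
| 12 | 8 | 10 | 1 | 0 |
| 14 | 7 | 10 | 1 | 0 |
| 21 | 5 | 10 | 1 | 0 |
| 26 | 4 | 8 | 1 | 0 |
| 27 | 3 | 8 | 1 | 0 |
| 28 | 2 | 8 | 0 | 1 |
| 33 | 1 | 7 | 0 | 1 |
| 41 | 0 | 5 | 0 | 1 |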
We next total the number at risk, N_{t} = N_{1t}+N_{2t}, at each event time and the number of observed events (deaths), O_{t} = O_{1t}+O_{2t}, at each event time. We then compute the expected number of events in each group. The expected number of events is computed at each event time as follows:
E_{1t} = N_{1t}*(O_{t}/N_{t}) for group 1 and E_{2t} = N_{2t}*(O_{t}/N_{t}) for group 2. The calculations are shown in the table below.
Expected Numbers of Events in Each Group
| Time, Months | Number at Risk in Group 1 N_{1t} | Number at Risk in Group 2 N_{2t} | Total Number at Risk N_{t} | Number of Events in Group 1 O_{1t} | Number of Events in Group 2 O_{2t} | Total Number of Events O_{t} | Expected Number of Events in Group 1 E_{1t} = N_{1t}*(O_{t}/N_{t}) | Expected Number of Events in Group 2 E_{2t} = N_{2t}*(O_{t}/N_{t}) |
|---|---|---|---|---|---|---|---|---|
| 8 | 10 | 10 | 20 | 1 | 0 | 1 | 0.500 | 0.500 |
| 12 | 8 | 10 | 18 | 1 | 0 | 1 | 0.444 | 0.556 |
| 14 | 7 | 10 | 17 | 1 | 0 | 1 | 0.412 | 0.588 |
| 21 | 5 | 10 | 15 | 1 | 0 | 1 | 0.333 | 0.667 |
| 26 | 4 | 8 | 12 | 1 | 0 | 1 | 0.333 | 0.667 |
| 27 | 3 | 8 | 11 | 1 | 0 | 1 | 0.273 | 0.727 |
| 28 | 2 | 8 | 10 | 0 | 1 | 1 | 0.200 | 0.800 |
| 33 | 1 | 7 | 8 | 0 | 1 | 1 | 0.125 | 0.875 |
| 41 | 0 | 5 | 5 | 0 | 1 | 1 | 0.000 | 1.000 |
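The expected counts in this table can be generated directly from the raw follow-up times. The sketch below is illustrative only: `at_risk` and `logrank_rows` are hypothetical helper names, each group is represented as a pair of lists (death times, censoring times), and a participant is treated as at risk at time t if their last observed time is t or later.

```python
def at_risk(times, t):
    """Count participants whose death or last contact occurs at or after t."""
    return sum(1 for x in times if x >= t)

def logrank_rows(group1, group2):
    """Return rows (t, N1, N2, O1, O2, E1, E2), one per distinct event time."""
    deaths1, censored1 = group1
    deaths2, censored2 = group2
    rows = []
    for t in sorted(set(deaths1) | set(deaths2)):
        n1 = at_risk(deaths1 + censored1, t)
        n2 = at_risk(deaths2 + censored2, t)
        o1, o2 = deaths1.count(t), deaths2.count(t)
        # Pooled event proportion under the null hypothesis: O_t / N_t.
        pooled = (o1 + o2) / (n1 + n2)
        rows.append((t, n1, n2, o1, o2, n1 * pooled, n2 * pooled))
    return rows

# Group 1: chemotherapy before surgery; group 2: chemotherapy after surgery.
before = ([8, 12, 26, 14, 21, 27], [8, 32, 20, 40])
after = ([33, 28, 41], [48, 48, 25, 37, 48, 25, 43])

rows = logrank_rows(before, after)
for t, n1, n2, o1, o2, e1, e2 in rows:
    print(f"t={t:2d}  N1={n1:2d}  N2={n2:2d}  O1={o1}  O2={o2}  E1={e1:.3f}  E2={e2:.3f}")
print("Sum E1 =", round(sum(r[5] for r in rows), 3))
print("Sum E2 =", round(sum(r[6] for r in rows), 3))
```

Because the code sums unrounded values, its totals can differ in the last decimal place from totals obtained by adding the rounded table entries.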
We next sum the observed numbers of events in each group (ΣO_{1t} and ΣO_{2t}) and the expected numbers of events in each group (ΣE_{1t} and ΣE_{2t}) over time. These are shown in the bottom row of the table below.
Total Observed and Expected Numbers of Events in Each Group
| Time, Months | Number at Risk in Group 1 N_{1t} | Number at Risk in Group 2 N_{2t} | Total Number at Risk N_{t} | Number of Events in Group 1 O_{1t} | Number of Events in Group 2 O_{2t} | Total Number of Events O_{t} | Expected Number of Events in Group 1 E_{1t} = N_{1t}*(O_{t}/N_{t}) | Expected Number of Events in Group 2 E_{2t} = N_{2t}*(O_{t}/N_{t}) |
|---|---|---|---|---|---|---|---|---|
| 8 | 10 | 10 | 20 | 1 | 0 | 1 | 0.500 | 0.500 |
| 12 | 8 | 10 | 18 | 1 | 0 | 1 | 0.444 | 0.556 |
| 14 | 7 | 10 | 17 | 1 | 0 | 1 | 0.412 | 0.588 |
| 21 | 5 | 10 | 15 | 1 | 0 | 1 | 0.333 | 0.667 |
| 26 | 4 | 8 | 12 | 1 | 0 | 1 | 0.333 | 0.667 |
| 27 | 3 | 8 | 11 | 1 | 0 | 1 | 0.273 | 0.727 |
| 28 | 2 | 8 | 10 | 0 | 1 | 1 | 0.200 | 0.800 |
| 33 | 1 | 7 | 8 | 0 | 1 | 1 | 0.125 | 0.875 |
| 41 | 0 | 5 | 5 | 0 | 1 | 1 | 0.000 | 1.000 |
| Total | | | | 6 | 3 | | 2.620 | 6.380 |
We can now compute the test statistic:
Χ^{2} = (6 - 2.620)^{2}/2.620 + (3 - 6.380)^{2}/6.380 = 4.360 + 1.791 = 6.151
The test statistic is approximately distributed as chi-square with 1 degree of freedom. Thus, the critical value for the test can be found in the table of Critical Values of the Χ^{2} Distribution.
For this test the decision rule is to Reject H_{0} if Χ^{2} > 3.84. We observe Χ^{2} = 6.151, which exceeds the critical value of 3.84. Therefore, we reject H_{0}. We have significant evidence, at α=0.05, to show that the two survival curves are different.
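The arithmetic behind this decision can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the statistic is the sum over groups of (observed - expected)^2 / expected using the totals from the table, and the p-value uses the identity P(Χ^{2} > x) = erfc(sqrt(x/2)), which holds only for 1 degree of freedom.

```python
import math

def logrank_chi2(observed_totals, expected_totals):
    """Sum of (O - E)^2 / E over the comparison groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed_totals, expected_totals))

def chi2_upper_tail_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

# Totals from the bottom row of the table: observed 6 and 3, expected 2.620 and 6.380.
chi2 = logrank_chi2([6, 3], [2.620, 6.380])
p_value = chi2_upper_tail_1df(chi2)

print(f"chi-square = {chi2:.3f}")
print(f"p-value    = {p_value:.4f}")
print("Reject H0" if chi2 > 3.84 else "Do not reject H0")
```

Reporting a p-value alongside the comparison with the critical value is optional; the decision at α=0.05 is the same either way.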
Example:
An investigator wishes to evaluate the efficacy of a brief intervention to prevent alcohol consumption in pregnancy. Pregnant women with a history of heavy alcohol consumption are recruited into the study and randomized to receive either the brief intervention focused on abstinence from alcohol or standard prenatal care. The outcome of interest is relapse to drinking. Women are recruited into the study at approximately 18 weeks gestation and followed through the course of pregnancy to delivery (approximately 39 weeks gestation). The data are shown below and indicate whether women relapse to drinking and if so, the time of their first drink measured in the number of weeks from randomization. For women who do not relapse, we record the number of weeks from randomization that they are alcohol free.
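| Standard Prenatal Care: Relapse | Standard Prenatal Care: No Relapse | Brief Intervention: Relapse | Brief Intervention: No Relapse |
|---|---|---|---|
| 19 | 20 | 16 | 21 |
| 6 | 19 | 21 | 15 |
| 5 | 17 | 7 | 18 |
| 4 | 14 | | 18 |
| | | | 5 |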
The question of interest is whether there is a difference in time to relapse between women assigned to standard prenatal care as compared to those assigned to the brief intervention.
 Step 1.
Set up hypotheses and determine level of significance.
H_{0}: Relapse-free time is identical between groups versus
H_{1}: Relapse-free time is not identical between groups (α=0.05)
 Step 2.
Select the appropriate test statistic.
The test statistic for the log rank test is
Χ^{2} = Σ_{j} (ΣO_{jt} - ΣE_{jt})^{2} / ΣE_{jt}
 Step 3.
Set up the decision rule.
The test statistic follows a chi-square distribution, so we find the critical value in the table of critical values for the Χ^{2} distribution for df=k-1=2-1=1 and α=0.05. The critical value is 3.84 and the decision rule is to reject H_{0} if Χ^{2} > 3.84.
 Step 4.
Compute the test statistic.
To compute the test statistic, we organize the data according to event (relapse) times and determine the numbers of women at risk in each treatment group and the number who relapse at each observed relapse time. In the following table, group 1 represents women who receive standard prenatal care and group 2 represents women who receive the brief intervention.
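| Time, Weeks | Number at Risk - Group 1 N_{1t} | Number at Risk - Group 2 N_{2t} | Number of Relapses - Group 1 O_{1t} | Number of Relapses - Group 2 O_{2t} |
|---|---|---|---|---|
| 4 | 8 | 8 | 1 | 0 |
| 5 | 7 | 8 | 1 | 0 |
| 6 | 6 | 7 | 1 | 0 |
| 7 | 5 | 7 | 0 | 1 |
| 16 | 4 | 5 | 0 | 1 |
| 19 | 3 | 2 | 1 | 0 |
| 21 | 0 | 2 | 0 | 1 |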
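We next total the number at risk, N_{t} = N_{1t}+N_{2t}, at each event time and the number of observed events (relapses), O_{t} = O_{1t}+O_{2t}, at each event time, and determine the expected number of relapses in each group at each event time using E_{1t} = N_{1t}*(O_{t}/N_{t}) and E_{2t} = N_{2t}*(O_{t}/N_{t}).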
We then sum the observed numbers of events in each group (ΣO_{1t} and ΣO_{2t}) and the expected numbers of events in each group (ΣE_{1t} and ΣE_{2t}) over time. The calculations for the data in this example are shown below.
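| Time, Weeks | Number at Risk Group 1 N_{1t} | Number at Risk Group 2 N_{2t} | Total Number at Risk N_{t} | Number of Relapses Group 1 O_{1t} | Number of Relapses Group 2 O_{2t} | Total Number of Relapses O_{t} | Expected Number of Relapses in Group 1 E_{1t} = N_{1t}*(O_{t}/N_{t}) | Expected Number of Relapses in Group 2 E_{2t} = N_{2t}*(O_{t}/N_{t}) |
|---|---|---|---|---|---|---|---|---|
| 4 | 8 | 8 | 16 | 1 | 0 | 1 | 0.500 | 0.500 |
| 5 | 7 | 8 | 15 | 1 | 0 | 1 | 0.467 | 0.533 |
| 6 | 6 | 7 | 13 | 1 | 0 | 1 | 0.462 | 0.538 |
| 7 | 5 | 7 | 12 | 0 | 1 | 1 | 0.417 | 0.583 |
| 16 | 4 | 5 | 9 | 0 | 1 | 1 | 0.444 | 0.556 |
| 19 | 3 | 2 | 5 | 1 | 0 | 1 | 0.600 | 0.400 |
| 21 | 0 | 2 | 2 | 0 | 1 | 1 | 0.000 | 1.000 |
| Total | | | | 4 | 3 | | 2.890 | 4.110 |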
We now compute the test statistic:
Χ^{2} = (4 - 2.890)^{2}/2.890 + (3 - 4.110)^{2}/4.110 = 0.426 + 0.300 = 0.726
 Step 5.
Conclusion. Do not reject H_{0} because 0.726 < 3.84. We do not have statistically significant evidence at α=0.05 to show that the time to relapse is different between groups.
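Steps 4 and 5 can also be carried out end to end from the raw data. The sketch below is one possible arrangement rather than a definitive implementation: `logrank_statistic` is a hypothetical function name, each group is given as a pair of lists (relapse weeks, weeks of relapse-free follow-up for women who did not relapse), and women are counted as at risk at week t if their recorded week is t or later.

```python
def logrank_statistic(groups):
    """Return (observed totals, expected totals, chi-square) for k groups.

    groups: list of (event_times, censor_times) pairs, one pair per group.
    """
    event_times = sorted({t for events, _ in groups for t in events})
    observed = [len(events) for events, _ in groups]
    expected = [0.0] * len(groups)
    for t in event_times:
        # Number at risk in each group and total events at this event time.
        n = [sum(1 for x in events + censored if x >= t) for events, censored in groups]
        o_t = sum(events.count(t) for events, _ in groups)
        for j, n_j in enumerate(n):
            expected[j] += n_j * o_t / sum(n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return observed, expected, chi2

standard_care = ([19, 6, 5, 4], [20, 19, 17, 14])
brief_intervention = ([16, 21, 7], [21, 15, 18, 18, 5])

observed, expected, chi2 = logrank_statistic([standard_care, brief_intervention])
print("Observed:", observed)
print("Expected:", [round(e, 3) for e in expected])
print(f"chi-square = {chi2:.3f}")
print("Reject H0" if chi2 > 3.84 else "Do not reject H0")
```

Since the code does not round the expected counts at each event time, its result can differ from the hand calculation in the third decimal place; the conclusion is unaffected.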
The figure below shows the survival (relapse-free time) in each group. Notice that the survival curves do not show much separation, consistent with the non-significant findings in the test of hypothesis.
Relapse-Free Time in Each Group
As noted, there are several variations of the log rank statistic. Some statistical computing packages use the following test statistic for the log rank test to compare two independent groups:
Χ^{2} = (ΣO_{1t} - ΣE_{1t})^{2} / ΣVar(E_{1t})
where ΣO_{1t} is the sum of the observed number of events in group 1, and ΣE_{1t} is the sum of the expected number of events in group 1 taken over all event times. The denominator is the sum of the variances of the expected numbers of events at each event time, which is computed as follows:
Var(E_{1t}) = N_{1t} N_{2t} O_{t} (N_{t} - O_{t}) / (N_{t}^{2} (N_{t} - 1))
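To illustrate how this variance-based version differs from the one used in the examples, the sketch below applies it to the gastric cancer data. It assumes the usual hypergeometric variance at each event time, N_{1t} N_{2t} O_{t} (N_{t} - O_{t}) / (N_{t}^{2} (N_{t} - 1)); the function name is hypothetical, and event times with only one participant at risk are skipped because the variance is then undefined.

```python
def logrank_variance_form(rows):
    """Two-group log rank statistic: (sum O1 - sum E1)^2 / sum Var(E1).

    rows: iterable of (N1, N2, O1, O2), one tuple per event time.
    """
    o1_total = e1_total = var_total = 0.0
    for n1, n2, o1, o2 in rows:
        n, o = n1 + n2, o1 + o2
        o1_total += o1
        e1_total += n1 * o / n
        if n > 1:
            # Assumed hypergeometric variance of the group 1 event count.
            var_total += n1 * n2 * o * (n - o) / (n ** 2 * (n - 1))
    return (o1_total - e1_total) ** 2 / var_total

# (N1, N2, O1, O2) at each event time from the chemotherapy example.
chemo_rows = [
    (10, 10, 1, 0), (8, 10, 1, 0), (7, 10, 1, 0),
    (5, 10, 1, 0), (4, 8, 1, 0), (3, 8, 1, 0),
    (2, 8, 0, 1), (1, 7, 0, 1), (0, 5, 0, 1),
]

chi2_variance_form = logrank_variance_form(chemo_rows)
print(f"chi-square (variance form) = {chi2_variance_form:.3f}")
```

The value is not expected to match the 6.151 obtained earlier, since the two versions use different denominators, but both are compared against the same chi-square critical value with 1 degree of freedom.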
There are other versions of the log rank statistic as well as other tests to compare survival functions between independent groups.^{7-9} For example, a popular test is the modified Wilcoxon test, which is sensitive to larger differences in hazards earlier as opposed to later in follow-up.^{10}