Two-sample Tests
Given the variation within each sample, how likely is it that our two sample means were drawn from populations with the same average? To answer this question, we work out the probability of observing a difference between the sample means at least as large as ours if the two samples were indeed drawn from populations with the same mean. If this probability (the p-value) is very low, then we can be reasonably certain that the means really are different from one another.
Two-sample Paired Test
Paired tests are used when there are two measurements on the same experimental unit. The paired t-test has the same assumptions of independence and normality as a one-sample t-test. Let us look at a data set on weight change (anorexia), also from the MASS library. The data are from 72 young female anorexia patients. The three variables are treatment (Treat), weight before study (Prewt), and weight after study (Postwt). Here we are interested in finding out whether there is a placebo effect (i.e. patients who do not get treated gain some weight in the study).
> detach(Boston) ### important
> attach(anorexia)
> dif <- Postwt - Prewt
> dif.Cont <- dif[which(Treat=="Cont")]
Apply the summary() function to variable dif.Cont and comment on the summary statistics.
Conducting a paired t-test is equivalent to conducting a one-sample test on the element-wise differences. Both the parametric paired t-test and the non-parametric Wilcoxon signed-rank test are shown below.
> t.test(dif.Cont)
One Sample t-test
data: dif.Cont
t = -0.2872, df = 25, p-value = 0.7763
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-3.676708 2.776708
sample estimates:
mean of x
-0.45
- H0: There is, on average, no difference in mean weight before and after the study period for those in the Control group.
- Ha: There is, on average, a difference in mean weight before and after the study period for those in the Control group.
We fail to reject the null hypothesis (t = -0.29, df = 25, p-value = 0.7763) that there is no difference in mean weight before and after the study in the Control group. The sample mean difference is -0.45 with a 95% confidence interval of (-3.68, 2.78).
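The numbers in this output can be reproduced by hand from the sample mean and standard deviation of the differences. The following is a minimal sketch, assuming the MASS library is installed; the names dif_cont, se, t_stat, p_val and ci are illustrative and not part of the original script, and with() is used in place of attach().

```r
library(MASS)  # provides the anorexia data frame

# Weight change (Postwt - Prewt) for the control group
dif_cont <- with(anorexia, (Postwt - Prewt)[Treat == "Cont"])

n      <- length(dif_cont)
se     <- sd(dif_cont) / sqrt(n)              # standard error of the mean difference
t_stat <- mean(dif_cont) / se                 # test statistic under H0: mean = 0
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)    # two-sided p-value
ci     <- mean(dif_cont) + c(-1, 1) * qt(0.975, df = n - 1) * se  # 95% CI

print(c(t = t_stat, df = n - 1, p.value = p_val))
print(ci)
```

These values should agree with the t, df, p-value and confidence interval printed by t.test(dif.Cont).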
> wilcox.test(dif.Cont)
Wilcoxon signed rank test with continuity correction
data: dif.Cont
V = 150, p-value = 0.7468
alternative hypothesis: true location is not equal to 0
- H0: The median difference in weight before and after the study period for those in the Control group is equal to 0.
- Ha: The median difference in weight before and after the study period for those in the Control group is not equal to 0.
We thus fail to reject the null hypothesis (V = 150, p-value = 0.7468) that the median difference in weight before and after the study in the Control group is 0.
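The statistic V reported by the signed-rank test is the sum of the ranks of the absolute differences that belong to positive differences, after zero differences are dropped. A small sketch of that calculation follows, assuming the MASS library is installed; the names d, r and V are illustrative. Here exact = FALSE requests the normal approximation directly, which is what R falls back to when the data contain ties.

```r
library(MASS)  # provides the anorexia data frame

dif_cont <- with(anorexia, (Postwt - Prewt)[Treat == "Cont"])

d <- dif_cont[dif_cont != 0]   # zero differences carry no sign information
r <- rank(abs(d))              # rank the absolute differences (ties get average ranks)
V <- sum(r[d > 0])             # sum of ranks attached to positive differences

fit <- wilcox.test(dif_cont, exact = FALSE)
print(c(by_hand = V, wilcox.test = unname(fit$statistic)))
```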
It is not necessary to create the derived difference variable. Instead, you may turn on the paired argument in the R command as follows:
> t.test(Postwt[which(Treat=="Cont")], Prewt[which(Treat=="Cont")], paired=TRUE)
> wilcox.test(Postwt[which(Treat=="Cont")], Prewt[which(Treat=="Cont")], paired=TRUE)
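The equivalence between the paired call and the one-sample call on the differences can be checked directly. This is a sketch only, assuming the MASS library is installed; cont, diff_test, paired_test and swapped are illustrative names. It also shows that the order of the two vectors only changes the sign of the t statistic, not the p-value.

```r
library(MASS)  # provides the anorexia data frame

cont <- subset(anorexia, Treat == "Cont")   # control group only

diff_test   <- t.test(cont$Postwt - cont$Prewt)               # one-sample test on differences
paired_test <- t.test(cont$Postwt, cont$Prewt, paired = TRUE) # paired test, same direction
swapped     <- t.test(cont$Prewt, cont$Postwt, paired = TRUE) # paired test, reversed direction

# Same statistic and p-value for the first two calls
print(all.equal(unname(diff_test$statistic), unname(paired_test$statistic)))
print(all.equal(diff_test$p.value, paired_test$p.value))

# Reversing the order flips the sign of t but leaves the p-value unchanged
print(all.equal(unname(swapped$statistic), -unname(paired_test$statistic)))
print(all.equal(swapped$p.value, paired_test$p.value))
```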
Conduct an appropriate test to determine whether the treatment is effective in the anorexia dataset. (Hint: Create a new variable called trt that takes the value "Control" if the patient was not given treatment and "Treatment" otherwise.)
Parametric Two-sample T-test
Now, we will analyze the Pima.tr dataset. The US National Institute of Diabetes and Digestive and Kidney Diseases collected data on 532 women who were at least 21 years old, of Pima Indian heritage and living near Phoenix, Arizona, who were tested for diabetes according to World Health Organization criteria. The Pima.tr data frame in the MASS library holds a subset of 200 of these women. One simple question is whether the plasma glucose concentration is higher in diabetic individuals than it is in non-diabetic individuals.
To do this, we will perform a two-sample t-test which makes the following assumptions:
- Independent observations.
- Normal distribution for each of the two groups.
- Equal variance for each of the two groups.
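The test statistic is
$$t_{\text{two-sample}} = \frac{(\bar{Y}_1 - \bar{Y}_2) - D_0}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \sim T_{n_1 + n_2 - 2}$$
(usually $D_0$ is just 0), where the pooled variance is
$$S_p^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$$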
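The pooled statistic can be worked out step by step in R. The sketch below assumes the MASS library is installed and takes D0 = 0; the names glu_no, glu_yes, sp2, t_pooled and df_pooled are illustrative. Note that the denominator is the square root of the pooled variance times (1/n1 + 1/n2).

```r
library(MASS)  # provides the Pima.tr data frame

glu_no  <- Pima.tr$glu[Pima.tr$type == "No"]    # group 1: non-diabetic
glu_yes <- Pima.tr$glu[Pima.tr$type == "Yes"]   # group 2: diabetic
n1 <- length(glu_no)
n2 <- length(glu_yes)

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(glu_no) + (n2 - 1) * var(glu_yes)) / (n1 + n2 - 2)

# Two-sample t statistic with D0 = 0 and its degrees of freedom
t_pooled  <- (mean(glu_no) - mean(glu_yes)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df_pooled <- n1 + n2 - 2
p_pooled  <- 2 * pt(-abs(t_pooled), df = df_pooled)   # two-sided p-value

print(c(t = t_pooled, df = df_pooled, p.value = p_pooled))
```

The result should match t.test(glu ~ type, data = Pima.tr, var.equal = TRUE), which is the equal-variance form of the test.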
> detach(anorexia)
> attach(Pima.tr)
> ?Pima.tr
> t.test(glu ~ type)
Welch Two Sample t-test
data: glu by type
t = -7.3856, df = 121.756, p-value = 2.081e-11
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-40.51739 -23.38813
sample estimates:
mean in group No mean in group Yes
113.1061 145.0588
- H0: The mean glucose for those who are diabetic is the same as those who are not diabetic.
- Ha: The mean glucose for those who are diabetic is not the same as those who are not diabetic.
Here we reject the null hypothesis that the mean glucose for those who are diabetic is the same as for those who are not diabetic (t = -7.39, df = 121.76, p-value = 2.081e-11). The average glucose for those who are diabetic is 145.06 and for those who are not diabetic is 113.11. The 95% confidence interval for the difference in mean glucose (non-diabetic minus diabetic) is (-40.52, -23.39).
One thing to remember about the t.test() function is that it assumes the variances are different by default. The argument var.equal=T can be used to accommodate the scenario of homogeneous variances.
(The unequal-variances version is known as Welch's t-test; its degrees of freedom are approximated using Satterthwaite's formula.)
cf. http://apcentral.collegeboard.com/apc/public/repository/ap05_stats_allwood_fin4prod.pdf
> t.test(glu ~ type, var.equal=T)
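To see where the fractional degrees of freedom in the default (unequal variances) output come from, Satterthwaite's approximation can be computed by hand. This is a sketch under the assumption that the MASS library is installed; v1, v2, t_welch and df_welch are illustrative names.

```r
library(MASS)  # provides the Pima.tr data frame

glu_no  <- Pima.tr$glu[Pima.tr$type == "No"]
glu_yes <- Pima.tr$glu[Pima.tr$type == "Yes"]
n1 <- length(glu_no)
n2 <- length(glu_yes)

# Per-group variance of the sample mean
v1 <- var(glu_no) / n1
v2 <- var(glu_yes) / n2

# Unequal-variance t statistic: no pooling of the two variances
t_welch <- (mean(glu_no) - mean(glu_yes)) / sqrt(v1 + v2)

# Satterthwaite's approximate degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

print(c(t = t_welch, df = df_welch))
```

The approximate degrees of freedom always lie between min(n1, n2) - 1 and n1 + n2 - 2, so the unequal-variance test is never more liberal in its degrees of freedom than the pooled one.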
Choosing between the two versions requires deciding whether the groups share the same variance. As we did when checking normality, we can collect information from summary statistics, plots, and formal tests, and then make a final judgment call.
Are the variances of the plasma glucose concentration the same between diabetic individuals and non-diabetic individuals? Use the summary statistics and plots to support your argument.
Comparison of Variance
R provides the var.test() function for testing the assumption that the variances are the same. It tests whether the ratio of the two variances is equal to 1. The test is called in the same way as t.test():
> var.test(glu ~ type)
F test to compare two variances
data: glu by type
F = 0.7821, num df = 131, denom df = 67, p-value = 0.2336
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.5069535 1.1724351
sample estimates:
ratio of variances
0.7821009
- H0: The variance in glucose for diabetics is equal to the variance in glucose for non-diabetics.
- Ha: The variance in glucose for diabetics is not equal to the variance in glucose for non-diabetics.
We fail to reject the null hypothesis that the variance in glucose for diabetics is equal to the variance in glucose for non-diabetics (F = 0.7821, num df = 131, denom df = 67, p-value = 0.2336). The ratio of the variances (non-diabetic to diabetic) is estimated to be 0.78 with a 95% confidence interval of (0.51, 1.17).
So here for our t-test we would use the var.equal=T option.
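The F test is simple enough to reproduce by hand, which makes clear which group is in the numerator. The following sketch assumes the MASS library is installed; F_stat, p_val and ci are illustrative names. With the formula glu ~ type, the first factor level ("No") supplies the numerator variance.

```r
library(MASS)  # provides the Pima.tr data frame

glu_no  <- Pima.tr$glu[Pima.tr$type == "No"]
glu_yes <- Pima.tr$glu[Pima.tr$type == "Yes"]
df1 <- length(glu_no) - 1    # numerator degrees of freedom
df2 <- length(glu_yes) - 1   # denominator degrees of freedom

# Ratio of sample variances: non-diabetic over diabetic
F_stat <- var(glu_no) / var(glu_yes)

# Two-sided p-value: double the smaller tail probability
lower <- pf(F_stat, df1, df2)
p_val <- 2 * min(lower, 1 - lower)

# 95% confidence interval for the true ratio of variances
ci <- F_stat / qf(c(0.975, 0.025), df1, df2)

print(c(F = F_stat, p.value = p_val))
print(ci)
```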
Non-parametric Wilcoxon Test
To perform a nonparametric equivalent of the two-independent-sample t-test, we use the Wilcoxon rank sum test. In R, we either pass a formula to the wilcox.test() function or provide two vectors. The script below shows one example:
> wilcox.test(glu ~ type)
Wilcoxon rank sum test with continuity correction
data: glu by type
W = 1894, p-value = 2.240e-11
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(glu[type=="No"], glu[type=="Yes"]) # alternative way to call the test (same group order as the formula)
- H0: The median glucose for those who are diabetic is the same as the median of glucose for those who are not diabetic.
- Ha: The median of glucose for those who are diabetic is not the same as the median of glucose for those who are not diabetic.
We reject the null hypothesis that the median glucose for those who are diabetic is equal to the median glucose for those who are not diabetic (W = 1894, p-value = 2.24e-11).
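The W statistic is built from the ranks of the pooled observations: it is the rank sum of the first group minus the smallest value that sum could take. A minimal sketch of this follows, assuming the MASS library is installed; r, is_no, W and W_swapped are illustrative names, and exact = FALSE requests the normal approximation that R uses when ties are present.

```r
library(MASS)  # provides the Pima.tr data frame

r     <- rank(Pima.tr$glu)          # ranks over both groups combined (ties averaged)
is_no <- Pima.tr$type == "No"       # first factor level, i.e. the first group
n1    <- sum(is_no)
n2    <- sum(!is_no)

# Rank sum of the first group minus its minimum possible value
W <- sum(r[is_no]) - n1 * (n1 + 1) / 2

# Listing the groups in the other order gives the complementary statistic
W_swapped <- n1 * n2 - W

fit <- wilcox.test(glu ~ type, data = Pima.tr, exact = FALSE)
print(c(by_hand = W, wilcox.test = unname(fit$statistic), swapped = W_swapped))
```

Swapping the two groups changes W to n1 * n2 - W but leaves the two-sided p-value unchanged, so the group order matters only when comparing reported W values.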