Tests for More Than Two Samples
In this section, we consider comparisons among more than two groups parametrically, using analysis of variance (ANOVA), as well as non-parametrically, using the Kruskal-Wallis test.
Parametric Analysis of Variance (ANOVA)
To test if the means are equal for more than two groups we perform an analysis of variance test. An ANOVA test will determine if the grouping variable explains a significant portion of the variability in the dependent variable. If so, we would expect that the mean of your dependent variable will be different in each group. The assumptions of an ANOVA test are as follows:
- Independent observations
- The dependent variable follows a normal distribution in each group
- Equal variance of the dependent variable in each group
Here, we will use the Pima.tr dataset. According to National Heart Lung and Blood Institute (NHLBI) website (http://www.nhlbisupport.com/bmi/), BMI can be classified into 4 categories:
- Underweight: < 18.5
- Normal weight: 18.5 ~ 24.9
- Overweight: 25 ~ 29.9
- Obesity: >= 30
|
An Aside |
---|
In this Pima.tr dataset the BMI is stored in numerical format, so we need to categorize BMI first since we are interested in whether categorical BMI is associated with the plasma glucose concentration. In the Exercise, you can use an "if-else-" statement to create the bmi.catvariable. Alternatively, we can use cut()function as well. Since we have very few individuals with BMI < 18.5, we will collapse categories "Underweight" and "Normal weight" together.
> bmi.label <- c("Underweight/Normalweight", "Overweight", "Obesity") > summary(bmi) > bmi.break <- c(18, 24.9, 29.9, 50) > bmi.cat <- cut(bmi, breaks=bmi.break, labels = bmi.label) > table(bmi.cat) bmi.cat Underweight/Normal weight Overweight Obesity 25 43 132 > tapply(glu, bmi.cat, mean) Normal/under weight Overweight Obesity 108.4800 116.6977 129.2727
|
Suppose we want to compare the means of plasma glucose concentration for our four BMI categories. We will conduct analysis of variance using bmi.catvariable as a factor.
> bmi.cat <- factor(bmi.cat)
> bmi.anova <- aov(glu ~ bmi.cat)
Before looking at the result, you may be interested in checking each category's glucose concentration average. One way it can be done is using the tapply() function. But alternatively, we can also use another function.
> print(model.tables(bmi.anova, "means"))
Tables of means
Grand mean
123.97
bmi.cat
Underweight/Normal weight Overweight Obesity
108.5 116.7 129.3
rep 25.0 43.0 132.0
Apparently, the glucose level varies in different categories. We can now request the ANOVA table for this analysis to check if the hypothesis testing result matches our observation in summary statistics.
> summary(bmi.anova)
Df Sum Sq Mean Sq F value Pr(>F)
bmi.cat 2 11984 5992 6.2932 0.002242 **
Residuals 197 187575 952
- H0: The mean glucose is equal for all levels of bmi categories.
- Ha: At least one of the bmi categories has a mean glucose that is not the same as the other bmi categories.
We see that we reject the null hypothesis that the mean glucose is equal for all levels of bmi categories (F2,197 = 6.29, p-value = 0.002242). The plasma glucose concentration means in at least two categories are significantly different.
Naturally, we will want to know which category pair has different glucose concentrations. One way to answer this question is to conduct several two-sample tests and then adjust for multiple testing using the Bonferroni correction.
Performing many tests will increase the probability of finding one of them to be significant; that is, the p-values tend to be exaggerated (our type I error rate increases). A common adjustment method is the Bonferroni correction, which adjusts for multiple comparisons by changing the level of significance α for each test to α / (# of tests). Thus, if we were performing 10 tests to maintain a level of significance α of 0.05 we adjust for multiple testing using the Bonferroni correction by using 0.05/10 = 0.005 as our new level of significance.
A function called pairwise.t.test computes all possible two-group comparisons.
> pairwise.t.test(glu, bmi.cat, p.adj = "none")
Pairwise comparisons using t tests with pooled SD
data: glu and bmi.cat
Underweight/Normalweight Overweight
Overweight 0.2910 -
Obesity 0.0023 0.0213
P value adjustment method: none
From this result we reject the null hypothesis that the mean glucose for those who are obese is equal to the mean glucose for those who are underweight/normal weight (p-value = 0.0023). We also reject the null hypothesis that the mean glucose for those who are obese is equal to the mean glucose for those who are overweight (p-value = 0.0213). We fail to reject the null hypothesis that the mean glucose for those who are overweight is equal to the mean glucose for those who are underweight (p-value = 0.2910).
We can also make adjustments for multiple comparisons, like so:
> pairwise.t.test(glu, bmi.cat, p.adj = "bonferroni")
Pairwise comparisons using t tests with pooled SD
data: glu and bmi.cat
Underweight/Normal weight Overweight
Overweight 0.8729 -
Obesity 0.0069 0.0639
P value adjustment method: bonferroni
However, the Bonferroni correction is very conservative. Here, we introduce an alternative multiple comparison approach using Tukey's procedure:
> TukeyHSD(bmi.anova)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = glu ~ bmi.cat)
$bmi.cat
diff lwr upr p adj
Overweight-Underweight/Normalweight 8.217674 -10.1099039 26.54525 0.5407576
Obesity-Underweight/Normal weight 20.792727 4.8981963 36.68726 0.0064679
Obesity-Overweight 12.575053 -0.2203125 25.37042 0.0552495
From the pairwise comparison, what do we find regarding the plasma glucose in the different weight categories?
It is important to note that when testing the assumptions of an ANOVA, the var.test function can only be performed for two groups at a time. To look at the assumption of equal variance for more than two groups, we can use side-by-side boxplots:
> boxplot(glu~bmi.cat)
To determine whether or not the assumption of equal variance is met we look to see if the spread is equal for each of the groups.
We can also conduct a formal test for homogeneity of variances when we have more than two groups. This test is called Bartlett's Test, which assumes normality. The procedure is performed as follows:
> bartlett.test(glu~bmi.cat)
Bartlett test of homogeneity of variances
data: glu by bmi.cat
Bartlett's K-squared = 3.6105, df = 2, p-value = 0.1644
H0: The variability in glucose is equal for all bmi categories.
Ha: The variability in glucose is not equal for all bmi categories.
We fail to reject the null hypothesis that the variability in glucose is equal for all bmi categories (Bartlett's K-squared = 3.6105, df = 2, p-value = 0.1644).