Tests for More Than Two Samples

In this section, we consider comparisons among more than two groups parametrically, using analysis of variance (ANOVA), as well as non-parametrically, using the Kruskal-Wallis test.

Parametric Analysis of Variance (ANOVA)

To test whether the means are equal for more than two groups, we perform an analysis of variance test. An ANOVA test determines whether the grouping variable explains a significant portion of the variability in the dependent variable. If so, we would expect the mean of the dependent variable to differ between at least two of the groups. The assumptions of an ANOVA test are as follows:

Independent observations
The dependent variable follows a normal distribution in each group
Equal variance of the dependent variable in each group
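
The idea that ANOVA asks how much of the variability is explained by the grouping variable can be sketched by hand. The example below is only an illustration: the vectors y and g are hypothetical toy data (not from Pima.tr), and the variable names are assumptions made for this sketch. It splits the total sum of squares into a between-group and a within-group part and compares the resulting F statistic with the one reported by aov().

```r
# Hypothetical toy data: three groups of five observations each (not from Pima.tr).
y <- c(4.1, 5.0, 4.6, 5.3, 4.8,
       5.9, 6.4, 6.1, 5.7, 6.6,
       7.2, 6.8, 7.5, 7.9, 7.1)
g <- factor(rep(c("A", "B", "C"), each = 5))

grand.mean  <- mean(y)
group.means <- tapply(y, g, mean)
group.n     <- tapply(y, g, length)

# Between-group sum of squares: variability explained by the grouping.
ss.between <- sum(group.n * (group.means - grand.mean)^2)
# Within-group (residual) sum of squares: ave() returns each observation's group mean.
ss.within  <- sum((y - ave(y, g))^2)
# Total sum of squares around the grand mean.
ss.total   <- sum((y - grand.mean)^2)

k <- nlevels(g)   # number of groups
n <- length(y)    # number of observations
f.stat <- (ss.between / (k - 1)) / (ss.within / (n - k))

# The hand computation should agree with the F value from aov().
fit <- summary(aov(y ~ g))[[1]]
print(c(by.hand = f.stat, aov = fit[["F value"]][1]))
```

A large F value means the between-group mean square is large relative to the within-group mean square, which is what "explains a significant portion of the variability" refers to.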

Here, we will use the Pima.tr dataset. According to the National Heart, Lung, and Blood Institute (NHLBI) website (http://www.nhlbisupport.com/bmi/), BMI can be classified into four categories:

Underweight: < 18.5
Normal weight: 18.5 ~ 24.9
Overweight: 25 ~ 29.9
Obesity: >= 30

Create a categorical variable bmi.new to categorize the continuous bmi variable into four classes based on the definition shown above. Note that we have very few underweight individuals, so collapse underweight and normal weight into "Normal/under weight."
Report the number of individuals in each category.
Calculate the average glucose concentration in each category.

An Aside

In the Pima.tr dataset, BMI is stored in numerical format, so we need to categorize it first, since we are interested in whether categorical BMI is associated with plasma glucose concentration. In the exercise, you can use an if-else statement to create the bmi.cat variable. Alternatively, we can use the cut() function. Since we have very few individuals with BMI < 18.5, we will collapse the categories "Underweight" and "Normal weight" together.

> bmi.label <- c("Underweight/Normal weight", "Overweight", "Obesity")
> summary(bmi)
> bmi.break <- c(18, 24.9, 29.9, 50)
> bmi.cat <- cut(bmi, breaks = bmi.break, labels = bmi.label)
> table(bmi.cat)
bmi.cat
Underweight/Normal weight                Overweight                   Obesity
                       25                        43                       132
> tapply(glu, bmi.cat, mean)
Underweight/Normal weight                Overweight                   Obesity
                 108.4800                  116.6977                  129.2727
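
The if-else approach mentioned for the exercise could look roughly like the sketch below, which uses nested ifelse() calls. The vector bmi.demo holds hypothetical BMI values chosen for illustration (it is not Pima.tr data), and the names bmi.demo.cat and bmi.demo.cut are assumptions made for this sketch. The last lines compare the result with cut() using the same breaks as above.

```r
# Hypothetical BMI values for illustration only (not taken from Pima.tr).
bmi.demo <- c(18.2, 22.0, 24.9, 25.0, 27.3, 29.9, 30.0, 35.4)
bmi.label <- c("Underweight/Normal weight", "Overweight", "Obesity")

# Nested ifelse(): below 25 is underweight/normal, below 30 is overweight,
# everything else is obesity.
bmi.demo.cat <- ifelse(bmi.demo < 25, bmi.label[1],
                ifelse(bmi.demo < 30, bmi.label[2], bmi.label[3]))

# Store as a factor with the levels in a meaningful order.
bmi.demo.cat <- factor(bmi.demo.cat, levels = bmi.label)
print(table(bmi.demo.cat))

# The same classification with cut(); intervals are closed on the right,
# so 24.9 falls in the first class and 29.9 in the second.
bmi.demo.cut <- cut(bmi.demo, breaks = c(18, 24.9, 29.9, 50), labels = bmi.label)
print(all(as.character(bmi.demo.cat) == as.character(bmi.demo.cut)))
```

For values recorded to one decimal place, the two approaches give the same classes; cut() is more compact once there are more than two or three categories.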

Suppose we want to compare the mean plasma glucose concentration across our three BMI categories. We will conduct an analysis of variance using the bmi.cat variable as a factor.

> bmi.cat <- factor(bmi.cat)

> bmi.anova <- aov(glu ~ bmi.cat)

Before looking at the result, you may want to check the average glucose concentration in each category. One way is the tapply() function; alternatively, model.tables() reports the same means from the fitted model.

> print(model.tables(bmi.anova, "means"))
Tables of means
Grand mean

123.97

 bmi.cat
    Underweight/Normal weight Overweight Obesity
                        108.5      116.7   129.3
rep                      25.0       43.0   132.0

The mean glucose level appears to differ across categories. We can now request the ANOVA table for this analysis to check whether the hypothesis test agrees with what we see in the summary statistics.

> summary(bmi.anova)
             Df Sum Sq Mean Sq F value   Pr(>F)
bmi.cat       2  11984    5992  6.2932 0.002242 **
Residuals   197 187575     952

H₀: The mean glucose is equal for all levels of bmi categories.
H_a: At least one of the bmi categories has a mean glucose that is not the same as the other bmi categories.

We reject the null hypothesis that the mean glucose is equal for all levels of BMI categories (F(2, 197) = 6.29, p-value = 0.002242). The plasma glucose concentration means in at least two categories are significantly different.
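
As a check on how the ANOVA table fits together, the F value and p-value can be recomputed from the sums of squares and degrees of freedom printed above. This is only a sketch: the numbers are copied from the rounded table, so small differences from the printed F value are expected, and the variable names are assumptions made for this example.

```r
# Values copied from the (rounded) ANOVA table above.
ss.bmi   <- 11984    # Sum Sq for bmi.cat
df.bmi   <- 2        # Df for bmi.cat
ss.resid <- 187575   # Sum Sq for Residuals
df.resid <- 197      # Df for Residuals

# Mean squares are sums of squares divided by their degrees of freedom.
ms.bmi   <- ss.bmi / df.bmi
ms.resid <- ss.resid / df.resid

# The F statistic is the ratio of the two mean squares.
f.value <- ms.bmi / ms.resid

# The p-value is the upper tail of the F distribution with (2, 197) df.
p.value <- pf(f.value, df.bmi, df.resid, lower.tail = FALSE)

print(c(F = f.value, p = p.value))
```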

Naturally, we will want to know which pairs of categories have different glucose concentrations. One way to answer this question is to conduct several two-sample tests and then adjust for multiple testing using the Bonferroni correction.

Performing many tests increases the probability that at least one of them is significant by chance alone; that is, the overall type I error rate increases. A common adjustment method is the Bonferroni correction, which adjusts for multiple comparisons by changing the level of significance α for each test to α / (# of tests). Thus, if we perform 10 tests and want to maintain an overall level of significance α of 0.05, we use 0.05/10 = 0.005 as the level of significance for each test.
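
The arithmetic behind this adjustment can be sketched as follows. The calculation of the chance of at least one false positive assumes the 10 tests are independent, which is a simplifying assumption made for this illustration; the variable names are also assumptions.

```r
alpha   <- 0.05   # desired overall level of significance
n.tests <- 10     # number of tests, as in the example above

# Chance of at least one false positive if every test is run at alpha
# (assumes the tests are independent).
fwer.unadjusted <- 1 - (1 - alpha)^n.tests

# Bonferroni: run each test at alpha / (# of tests) instead.
alpha.bonf <- alpha / n.tests
fwer.bonf  <- 1 - (1 - alpha.bonf)^n.tests

print(c(unadjusted = fwer.unadjusted, bonferroni = fwer.bonf))
```

Under this assumption, running all 10 tests at 0.05 gives roughly a 40% chance of at least one false positive, while the Bonferroni level of 0.005 per test keeps the overall rate just below 0.05.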

A function called pairwise.t.test computes all possible two-group comparisons.

> pairwise.t.test(glu, bmi.cat, p.adj = "none")

        Pairwise comparisons using t tests with pooled SD

data:  glu and bmi.cat

           Underweight/Normal weight Overweight
Overweight 0.2910                    -
Obesity    0.0023                    0.0213

P value adjustment method: none

From this result we reject the null hypothesis that the mean glucose for those who are obese is equal to the mean glucose for those who are underweight/normal weight (p-value = 0.0023). We also reject the null hypothesis that the mean glucose for those who are obese is equal to the mean glucose for those who are overweight (p-value = 0.0213). We fail to reject the null hypothesis that the mean glucose for those who are overweight is equal to the mean glucose for those who are underweight/normal weight (p-value = 0.2910).

We can also make adjustments for multiple comparisons, like so:

> pairwise.t.test(glu, bmi.cat, p.adj = "bonferroni")

        Pairwise comparisons using t tests with pooled SD

data:  glu and bmi.cat

           Underweight/Normal weight Overweight
Overweight 0.8729                    -
Obesity    0.0069                    0.0639

P value adjustment method: bonferroni
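
To see where the adjusted values come from, the unadjusted p-values can be passed to p.adjust(), which for the Bonferroni method multiplies each p-value by the number of comparisons (capped at 1). The sketch below uses the rounded p-values printed in the unadjusted output, so the results match the adjusted table only up to rounding; the vector names are assumptions made for this example.

```r
# Unadjusted p-values copied from the pairwise.t.test output with p.adj = "none".
p.raw <- c("Overweight vs Underweight/Normal weight" = 0.2910,
           "Obesity vs Underweight/Normal weight"    = 0.0023,
           "Obesity vs Overweight"                   = 0.0213)

# Bonferroni adjustment: each p-value times the number of comparisons, capped at 1.
p.bonf <- p.adjust(p.raw, method = "bonferroni")
print(p.bonf)

# Equivalent hand computation.
p.by.hand <- pmin(1, p.raw * length(p.raw))
```

After the adjustment, only the comparison between the obese and underweight/normal weight groups remains below 0.05.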

However, the Bonferroni correction is very conservative. Here, we introduce an alternative multiple comparison approach using Tukey's procedure:

> TukeyHSD(bmi.anova)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = glu ~ bmi.cat)

$bmi.cat
                                          diff         lwr      upr     p adj
Overweight-Underweight/Normal weight  8.217674 -10.1099039 26.54525 0.5407576
Obesity-Underweight/Normal weight    20.792727   4.8981963 36.68726 0.0064679
Obesity-Overweight                   12.575053  -0.2203125 25.37042 0.0552495

From the pairwise comparison, what do we find regarding the plasma glucose in the different weight categories?
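
For readers curious how the Tukey output is built, the sketch below recomputes the adjusted p-value and confidence interval for the Obesity versus Underweight/Normal weight comparison from the studentized range distribution, using the Tukey-Kramer standard error for unequal group sizes. All inputs are copied from the rounded outputs above, so the results should agree with the table only approximately; the variable names are assumptions made for this example.

```r
# Values copied from the outputs above.
ms.resid  <- 187575 / 197   # residual mean square from the ANOVA table
df.resid  <- 197            # residual degrees of freedom
n.obese   <- 132            # group sizes from table(bmi.cat)
n.under   <- 25
mean.diff <- 20.792727      # Obesity minus Underweight/Normal weight

# Standard error on the studentized range scale (Tukey-Kramer form).
se <- sqrt((ms.resid / 2) * (1 / n.obese + 1 / n.under))
q  <- abs(mean.diff) / se

# Adjusted p-value: upper tail of the studentized range for 3 group means.
p.adj <- ptukey(q, nmeans = 3, df = df.resid, lower.tail = FALSE)

# 95% family-wise confidence interval for the difference.
half.width <- qtukey(0.95, nmeans = 3, df = df.resid) * se
ci <- c(lwr = mean.diff - half.width, upr = mean.diff + half.width)

print(p.adj)
print(ci)
```

An interval that excludes zero corresponds to an adjusted p-value below 0.05 for that pair.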

It is important to note that, when checking the assumptions of an ANOVA, the var.test function can only compare two groups at a time. To look at the assumption of equal variance for more than two groups, we can use side-by-side boxplots:

> boxplot(glu~bmi.cat)

To determine whether or not the assumption of equal variance is met we look to see if the spread is equal for each of the groups.
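
A numerical companion to the boxplots is to compute a measure of spread within each group. The sketch below does this with tapply() on simulated data; the vectors y.demo and g.demo are hypothetical stand-ins for glu and bmi.cat (they are not Pima.tr data), and the group sizes are chosen only to mimic the unbalanced design above.

```r
set.seed(1)  # make the simulated example reproducible

# Hypothetical stand-ins for glu and bmi.cat (simulated, not Pima.tr data).
g.demo <- factor(rep(c("Underweight/Normal weight", "Overweight", "Obesity"),
                     times = c(25, 43, 132)),
                 levels = c("Underweight/Normal weight", "Overweight", "Obesity"))
y.demo <- rnorm(length(g.demo), mean = 120, sd = 30)

# Standard deviation and interquartile range within each group.
sd.by.group  <- tapply(y.demo, g.demo, sd)
iqr.by.group <- tapply(y.demo, g.demo, IQR)

print(round(rbind(sd = sd.by.group, IQR = iqr.by.group), 1))
```

With the real data, the same calls would be tapply(glu, bmi.cat, sd) and tapply(glu, bmi.cat, IQR); similar values across groups are consistent with the equal variance assumption.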

We can also conduct a formal test for homogeneity of variances when we have more than two groups. This test is called Bartlett's Test, which assumes normality. The procedure is performed as follows:

> bartlett.test(glu ~ bmi.cat)

        Bartlett test of homogeneity of variances

data:  glu by bmi.cat
Bartlett's K-squared = 3.6105, df = 2, p-value = 0.1644

H₀: The variability in glucose is equal for all bmi categories.

H_a: The variability in glucose is not equal for all bmi categories.

We fail to reject the null hypothesis that the variability in glucose is equal for all bmi categories (Bartlett's K-squared = 3.6105, df = 2, p-value = 0.1644).
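
The p-value reported by Bartlett's test comes from comparing the K-squared statistic with a chi-squared distribution whose degrees of freedom equal the number of groups minus one. As a small check, using the statistic printed above (the variable names are assumptions made for this sketch):

```r
# Values copied from the bartlett.test output above.
k.squared <- 3.6105   # Bartlett's K-squared
n.groups  <- 3        # number of BMI categories
df        <- n.groups - 1

# Upper-tail probability of the chi-squared distribution.
p.value <- pchisq(k.squared, df = df, lower.tail = FALSE)
print(p.value)
```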
