Introduction
The previous module focused on using t-tests to compare continuously distributed outcomes, such as body mass index, but how can you compare categorical outcomes?
This module introduces the chi-square test of independence, which tests for associations between categorical exposures and categorical outcomes. After completing this section, you should be able to perform the chi-square test of independence by hand or using the R statistical package.
The chi-square test of independence can be used with several types of variables that were introduced in Module 1:
- Categorical variables: Variables that fall into two or more categories that do not have any inherent ranking or ordering, such as race and ethnicity (e.g., white, black, Hispanic, Asian, etc.)
- Dichotomous variables: Variables that have just two possible values (e.g., male or female; occupational exposure to asbestos: Yes or No; death: Yes or No; developed coronary heart disease: Yes or No)
- Ordinal variables: Categorical variables that have more than two ranked or ordered values (e.g., physical activity: <30 minutes/week, 30-180 minutes/week, >180 minutes/week; amount of current smoking: none, <10/day, 10-20/day, 21-30/day, >30/day; or number of past heart attacks: 0, 1, 2, 3, etc.)
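As a preview of what "by hand" and "using R" look like in practice, here is a minimal sketch for two dichotomous variables. The counts, the variable names (smoker, chd), and the object names are hypothetical and invented purely for illustration; they are not data from this course. The sketch assumes the usual Pearson chi-square statistic, computed once from the expected counts and once with R's built-in chisq.test().

```r
# Hypothetical 2x2 table of counts (illustrative only, not real study data).
# Rows: exposure (smoker Yes/No); columns: outcome (CHD Yes/No).
counts <- matrix(c(30, 70,
                   20, 80),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(smoker = c("Yes", "No"),
                                 chd    = c("Yes", "No")))

# "By hand": expected count = (row total * column total) / grand total
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((counts - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(counts) - 1) * (ncol(counts) - 1)

# Upper-tail p-value from the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Built-in test; correct = FALSE turns off the Yates continuity correction
# so that the statistic matches the hand calculation above
result <- chisq.test(counts, correct = FALSE)

print(chi_sq)            # statistic from the hand calculation
print(result$statistic)  # statistic reported by chisq.test()
```

With these made-up counts, every expected cell count is 25 or 75, and both approaches give a chi-square statistic of about 2.67 on 1 degree of freedom. Note that chisq.test() applies the continuity correction to 2x2 tables by default, so leaving out correct = FALSE would give a somewhat smaller statistic than the hand calculation.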
Essential Questions
- How do we test the statistical significance of associations between exposures and categorical outcomes?
- How do we determine whether an exposure increases or decreases the risk of a particular health outcome?
- When is it most appropriate to assess data categorically?
Learning Objectives
After completing this module, you will be able to:
- Identify situations in which a chi-square test is appropriate
- Perform chi-square tests by hand and using the R statistical package
- Interpret results of chi-square tests correctly
- Identify the appropriate hypothesis testing procedure for evaluating categorical data
- Communicate the results of chi-square tests to a non-technical/lay audience