Selection of a Comparison Group
The major challenge for the Air Force Health Study (AFHS) and other special cohort studies is selecting an appropriate comparison group. The goal of analytic studies is to compare health outcomes in exposed and unexposed groups that are otherwise as similar as possible, i.e., that have the same distributions of all other factors that could be associated with the health outcomes. We will see that intervention studies in which large numbers of subjects are randomly assigned to two or more treatment groups (exposures) can usually achieve this, so that the groups being compared have similar distributions of age, sex, smoking, physical activity, etc. Random assignment does not occur in cohort studies, however.
Suppose that a cohort study had smokers who were older than the non-smokers. It is well established that the risk of heart disease increases with age, i.e., age is an independent risk factor for heart disease. If the smokers are older, they carry an additional risk factor, and the study will overestimate the association between smoking and heart disease. This phenomenon, called confounding, occurs when the exposure groups being compared differ in the distribution of other determinants of the outcome of interest. Another concern is that the exposure groups may differ in the quality or accuracy of the data collected, which can also bias the results (so-called information bias).
Confounding and bias will be discussed later in the course, but for now it is important to recognize the value of selecting a comparison group that differs in exposure status but is as similar as possible to the exposed group in all other ways, including:
- Other factors that can influence the health outcome
- The quality and accuracy of their data
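The smoking and age example can be made concrete with a small calculation. The sketch below uses invented counts (they are assumptions for illustration, not data from any study) in which smokers have twice the risk of heart disease within each age group, but most smokers are in the older group. The stratum labels and the helper names `risk_ratio` and `pooled` are likewise hypothetical.

```python
# Hypothetical counts, invented for illustration: (cases, total) by age stratum.
strata = {
    "under 50": {"smokers": (4, 200), "non_smokers": (8, 800)},
    "50 and over": {"smokers": (80, 800), "non_smokers": (10, 200)},
}


def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    exposed_cases, exposed_total = exposed
    unexposed_cases, unexposed_total = unexposed
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)


def pooled(group):
    """Add up cases and totals for one exposure group, ignoring age."""
    cases = sum(stratum[group][0] for stratum in strata.values())
    total = sum(stratum[group][1] for stratum in strata.values())
    return cases, total


# Risk ratio within each age stratum (age held roughly constant).
stratum_rr = {
    age: risk_ratio(groups["smokers"], groups["non_smokers"])
    for age, groups in strata.items()
}

# Crude risk ratio: all ages combined, so the age difference is not accounted for.
crude_rr = risk_ratio(pooled("smokers"), pooled("non_smokers"))

for age, rr in stratum_rr.items():
    print(f"Risk ratio, {age}: {rr:.2f}")
print(f"Crude risk ratio (ages combined): {crude_rr:.2f}")
```

With these made-up numbers the risk ratio is 2.0 within each age group, yet the crude risk ratio is about 4.67, because the smokers are concentrated in the older, higher-risk group.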
The figure below depicts three studies of cardiovascular disease illustrating the general approaches to selecting a comparison group for a cohort study.
As noted earlier, general cohorts employ an internal comparison group, e.g., by dividing the cohort into quintiles of BMI or quintiles of activity and using the quintile with the lowest BMI or the lowest activity as the reference group. This is the best comparison group for a general cohort study because the subjects are likely to resemble one another in many respects, but they may still differ with respect to potentially confounding factors. For example, nurses who exercise regularly may be generally more health conscious (e.g., less likely to smoke, more likely to eat a healthier diet, more likely to take vitamins).
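As a rough sketch of how an internal comparison might be set up, the code below divides a constructed cohort into BMI quintiles and compares each quintile's risk with that of the lowest quintile. The cohort is artificial (100 hypothetical subjects whose outcome is built to become more common as BMI rises), and the variable names are assumptions made for this illustration.

```python
import bisect
import statistics

# Hypothetical cohort, constructed (not real data): 100 subjects with BMI rising
# from 18.0 in steps of 0.2, and an outcome that becomes more common as BMI rises.
cohort = [(18.0 + 0.2 * i, (i % 10) < (1 + i // 20)) for i in range(100)]

# Four cut points split the BMI values into five equal-sized groups (quintiles).
cut_points = statistics.quantiles([bmi for bmi, _ in cohort], n=5)

# Tally cases and subjects per quintile; index 0 is the lowest-BMI quintile.
cases = [0] * 5
totals = [0] * 5
for bmi, had_event in cohort:
    quintile = bisect.bisect_right(cut_points, bmi)
    totals[quintile] += 1
    cases[quintile] += int(had_event)

# Risk in each quintile, then risk ratio relative to the lowest quintile.
risks = [c / n for c, n in zip(cases, totals)]
reference_risk = risks[0]
risk_ratios = [r / reference_risk for r in risks]

for number, (r, rr) in enumerate(zip(risks, risk_ratios), start=1):
    print(f"Quintile {number}: risk = {r:.2f}, risk ratio vs. quintile 1 = {rr:.1f}")
```

The reference quintile has a risk ratio of 1 by definition. A comparison like this is still crude: it does not adjust for differences among the quintiles in smoking, diet, or other potential confounders.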
The second method is to use an external comparison group. A special exposure cohort consisting of workers in a rayon factory was selected to study the association between carbon disulfide exposure and risk of cardiovascular disease, and the comparison group consisted of workers in a paper mill. These two groups may be similar in age distribution, socioeconomic status, and other factors, but they may also differ with respect to other confounding factors. In addition, paper mills have their own mix of occupational exposures, which might also affect the likelihood of cardiovascular disease and bias the results.
The third approach is to use the general population as a comparison group, for example, to determine whether workers in a rayon factory had higher mortality rates than the population at large. This approach is less costly, and it is sometimes used for studies of occupational exposures when it is difficult to find an appropriate internal or external comparison group. However, using rates of death or disease in the general population has a number of limitations:
- Studies that rely on general population data are frequently limited to mortality, since accurate rates for specific health outcomes may not be available.
- General population rates reflect both exposed and unexposed individuals, so the comparison group is not entirely unexposed.
- The general population is not really comparable because there are many confounding variables that cannot be controlled for.
- The general population includes people who are unable to work because of disease or disability, so an employed cohort tends to be healthier than the general population to begin with (the "healthy worker effect," which is discussed in the module on bias).
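One common way to make a comparison against the general population, which this section does not describe in detail, is to apply general population death rates to the workers' follow-up time to obtain an expected number of deaths, and then divide the observed deaths by that expected number (often called a standardized mortality ratio). The sketch below assumes this approach, and every number in it is invented for illustration.

```python
# Hypothetical inputs, invented for illustration (not from any real study).
# Person-years of follow-up among the workers, by age band.
worker_person_years = {"40-49": 5000, "50-59": 3000, "60-69": 1000}

# Annual death rates per person in the general population, same age bands.
population_death_rates = {"40-49": 0.002, "50-59": 0.005, "60-69": 0.012}

# Deaths actually observed among the workers during follow-up.
observed_deaths = 30

# Deaths expected if the workers had died at general population rates.
expected_deaths = sum(
    worker_person_years[age] * population_death_rates[age]
    for age in worker_person_years
)

# Ratio of observed to expected deaths.
smr = observed_deaths / expected_deaths

print(f"Expected deaths: {expected_deaths:.1f}")
print(f"Observed / expected: {smr:.2f}")
```

With these made-up numbers the ratio is about 0.81. Given the limitations listed above, particularly the healthy worker effect, a ratio below 1 would not by itself show that the exposure is harmless.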