Confidence Intervals
Strictly speaking, a 95% confidence interval means that if the same population were sampled an infinite number of times and a confidence interval were estimated each time, the resulting intervals would contain the true population parameter in approximately 95% of cases, assuming that there was no systematic error (bias or confounding). However, because we don't sample the same population or repeat exactly the same study numerous (much less infinite) times, we need an interpretation of a single confidence interval. That interpretation turns out to be surprisingly complex, but for the purposes of our course we will use the following: a confidence interval is a range around a point estimate within which the true value is likely to lie with a specified degree of probability, assuming there is no systematic error (bias or confounding).
If the sample size is small, the estimate is subject to more random error and is less precise, so the confidence interval will be wide. In contrast, with a large sample size the confidence interval is narrower, indicating less random error and greater precision. One can, therefore, use the width of a confidence interval to gauge the amount of random error in an estimate.
The most frequently used confidence intervals specify either 95% or 90% likelihood, although one can calculate intervals for any level between 0 and 100%. Confidence intervals can be computed for many point estimates: means, proportions, rates, odds ratios, risk ratios, etc. For this course we will primarily use 95% confidence intervals for a) a proportion in a single group and b) estimated measures of association (risk ratios, rate ratios, and odds ratios), which are based on a comparison of two groups.
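The repeated-sampling definition given at the start of this section can be illustrated with a small simulation. The sketch below is not part of the course materials: the true proportion (0.5), the sample size (200), the number of simulated studies (2,000), and the function names are all assumptions chosen for illustration, and the interval uses the simple normal-approximation (Wald) formula rather than whatever method the course spreadsheet uses.

```python
import math
import random


def wald_ci(events, n, z=1.96):
    """Normal-approximation (Wald) 95% CI for a single proportion."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def coverage(true_p, n, studies, seed=0):
    """Fraction of simulated studies whose CI contains the true proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(studies):
        # One simulated study: n subjects, each an event with probability true_p
        events = sum(rng.random() < true_p for _ in range(n))
        lower, upper = wald_ci(events, n)
        if lower <= true_p <= upper:
            hits += 1
    return hits / studies


# Hypothetical settings; the printed fraction should be close to 0.95
print(coverage(true_p=0.5, n=200, studies=2000))
```

The simulation contains random error only. It has no bias or confounding, which is why the observed coverage lands near the nominal 95%.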
It is important to note that 95% confidence intervals only address random error; they do not take into account known or unknown biases or confounding, which invariably occur in epidemiologic studies. Consequently, Rothman cautions that it is better to regard confidence intervals as a general guide to the amount of random error in the data. Overlooking the fact that a confidence interval says nothing about systematic error is a common mistake, and it leads to incorrect interpretations of study results.
Confidence Interval for a Proportion
In the example above in which I was interested in estimating the case-fatality rate among humans infected with bird flu, I was dealing with just a single group, i.e., I was not making any comparisons. Lye et al. performed a search of the literature in 2007 and found a total of 170 cases of human bird flu that had been reported in the literature. Among these there had been 92 deaths, meaning that the overall case-fatality rate was 92/170 = 54%. How precise is this estimate?
Link to the article by Lye et al.
There are several methods of computing confidence intervals, and some are more accurate and more versatile than others. The Epi_Tools.XLSX spreadsheet created for this course has a worksheet entitled "CI - One Group" that will calculate confidence intervals for a point estimate in one group. The top part of the worksheet calculates confidence intervals for proportions, such as prevalence or cumulative incidence, and the lower portion computes confidence intervals for an incidence rate in a single group.
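To make the point about differing methods concrete, here is a minimal sketch comparing two common formulas for a proportion: the normal-approximation (Wald) interval and the Wilson score interval. The choice of these two methods, the function names, and the example data (1 event among 10 people) are assumptions for illustration; this section does not say which method the spreadsheet implements.

```python
import math

Z95 = 1.96  # standard normal multiplier for a 95% interval


def wald_ci(events, n, z=Z95):
    """Wald interval: p +/- z * sqrt(p(1-p)/n)."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def wilson_ci(events, n, z=Z95):
    """Wilson score interval, which stays inside 0..1 for small samples."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical small sample: 1 event among 10 people
for name, method in (("Wald", wald_ci), ("Wilson", wilson_ci)):
    lower, upper = method(1, 10)
    print(f"{name}: {lower:.3f} to {upper:.3f}")
```

With so few observations the Wald lower limit falls below zero, which is impossible for a proportion, while the Wilson limits remain between 0 and 1. This is one sense in which some methods are more accurate than others.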
A Quick Video Tour of "Epi_Tools.XLSX" (9:54)
Link to a transcript of the video
Spreadsheets are a valuable professional tool. To learn more about the basics of using Excel or Numbers for public health applications, see the online learning module linked below.
Link to online learning module on Using Spreadsheets - Excel (PC) and Numbers (Mac & iPad)
Use "Epi_Tools" to compute the 95% confidence interval for the overall case-fatality rate from bird flu reported by Lye et al. (NOTE: You should download the Epi_Tools spreadsheet to your computer; there is also a link to Epi_Tools under Learn More in the left side navigation of the page.) Open Epi_Tools.XLSX and compute the 95% confidence interval; then compare your answer to the one below.
How would you interpret this confidence interval in a single sentence? Jot down your interpretation before looking at the answer.
In the hypothetical case series described on page two of this module, there were 8 human cases of bird flu, and 4 of these died. Use Epi_Tools to compute the 95% confidence interval for this proportion. How does this confidence interval compare to the one you computed from the data reported by Lye et al.?
The key to reducing random error is to increase sample size. The table below illustrates this by showing the 95% confidence intervals that would result for point estimates of 30%, 50% and 60%. For each of these, the table shows what the 95% confidence interval would be as the sample size is increased from 10 to 100 or to 1,000. As you can see, the confidence interval narrows substantially as the sample size increases, reflecting less random error and greater precision.
Observed Frequency | Sample Size = 10 | Sample Size = 100 | Sample Size = 1,000
---|---|---|---
0.30 | 0.11 - 0.60 | 0.22 - 0.40 | 0.27 - 0.33
0.50 | 0.24 - 0.76 | 0.40 - 0.60 | 0.47 - 0.53
0.60 | 0.31 - 0.83 | 0.50 - 0.69 | 0.57 - 0.63
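
The narrowing shown in the table can be checked with a few lines of code. The sketch below uses the Wilson score interval with z = 1.96. That choice is an assumption, since this section does not state the spreadsheet's method, although it does give the same limits as the table when rounded to two decimals. The function names are illustrative only.

```python
import math


def wilson_ci(events, n, z=1.96):
    """Wilson score 95% CI for a proportion (assumed method, see lead-in)."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def format_ci(events, n):
    """Render the interval the way the table does, e.g. '0.11 - 0.60'."""
    lower, upper = wilson_ci(events, n)
    return f"{lower:.2f} - {upper:.2f}"


# One row per observed frequency, one column per sample size
for p in (0.30, 0.50, 0.60):
    cells = [format_ci(round(p * n), n) for n in (10, 100, 1000)]
    print(f"{p:.2f} | " + " | ".join(cells))
```

Each tenfold increase in sample size shrinks the width of the interval by roughly a factor of three, because the random error term scales with the square root of the sample size.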
Video Summary - Confidence Interval for a Proportion in a Single Group (5:11)
Link to transcript of the video