Confidence Interval for a Population Proportion
Suppose we want to estimate the proportion of subjects on anti-hypertensive medication in the Framingham Offspring Study, along with a 95% confidence interval for that estimate.
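The 95% confidence interval for a proportion is:
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where $\hat{p}$ is the sample proportion and $n$ is the sample size.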
This formula is appropriate whenever the sample includes at least 5 subjects with the outcome and at least 5 without it. Always use Z-scores (not t-scores) to compute the confidence interval for a proportion. If either count is less than 5, there is a correction that can be used in R, which will be illustrated below.
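A minimal sketch of the small-count case follows. It assumes the correction mentioned above is the continuity correction that prop.test applies when correct=TRUE; the text does not say so explicitly. The counts (3 of 40) are hypothetical and chosen only for illustration. The exact interval from binom.test is included as a common alternative for small counts, not as something the text prescribes.

```r
# Hypothetical small-sample counts, chosen only for illustration
x <- 3    # subjects with the outcome (fewer than 5)
n <- 40   # total subjects

# Interval from prop.test without and with the continuity correction
res_plain     <- prop.test(x, n, correct = FALSE)
res_corrected <- prop.test(x, n, correct = TRUE)

# Exact (Clopper-Pearson) interval, an alternative for small counts
res_exact <- binom.test(x, n)

# Collect the three intervals side by side for comparison
intervals <- rbind(
  uncorrected = as.numeric(res_plain$conf.int),
  corrected   = as.numeric(res_corrected$conf.int),
  exact       = as.numeric(res_exact$conf.int)
)
colnames(intervals) <- c("lower", "upper")
intervals
```

The corrected interval is wider than the uncorrected one, which is the conservative behavior one would want when the normal approximation is doubtful.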
Example:
In the Framingham Offspring Study, 1,219 of 3,532 subjects were on anti-hypertensive medication. Therefore, the point estimate is computed as follows:
$$\hat{p} = \frac{1219}{3532} = 0.345$$
The 95% confidence interval is computed as follows:
$$0.345 \pm 1.96\sqrt{\frac{0.345(1-0.345)}{3532}} = 0.345 \pm 0.016 = (0.329,\ 0.361)$$
Interpretation: We are 95% confident that the true proportion of patients on anti-hypertensives is between 33% and 36%.
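The hand calculation above can be reproduced in R as a check. This is only a sketch of the normal-approximation formula using the counts from the example; the variable names (x, n, p_hat, se, z, ci) are illustrative and not part of the original material.

```r
# Counts from the Framingham Offspring example in the text
x <- 1219   # subjects on anti-hypertensive medication
n <- 3532   # total subjects

p_hat <- x / n                          # point estimate of the proportion
se    <- sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
z     <- qnorm(0.975)                   # Z-score for 95% confidence (about 1.96)

# Lower and upper limits of the 95% confidence interval
ci <- c(lower = p_hat - z * se, upper = p_hat + z * se)

round(c(estimate = p_hat, ci), 3)
# estimate    lower    upper
#    0.345    0.329    0.361
```

Using qnorm(0.975) rather than a hard-coded 1.96 makes it straightforward to change the confidence level, for example qnorm(0.95) for a 90% interval.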
Computing the 95% Confidence Interval for a Proportion Using R
R makes it easy to compute a proportion and its 95% confidence interval.
Example:
> prop.test(1219,3532,correct=FALSE)
Output:
1-sample proportions test without continuity correction
data: 1219 out of 3532, null probability 0.5
X-squared = 338.855, df = 1, p-value < 2.2e-16
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
0.3296275 0.3609695
sample estimates:
p
0.3451302
R also generates a p-value here. By default, it tests the null hypothesis that the true proportion is 0.5, i.e., that subjects are equally likely to be on or off medication.
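When only the estimate and interval are needed, they can be pulled out of the object that prop.test returns. The sketch below uses the same counts as the example. The null value of 0.3 and the 90% level in the second call are arbitrary choices for illustration. Note that prop.test reports a Wilson score interval rather than the simple normal-approximation formula, which is why its limits differ slightly from a hand calculation.

```r
# Same counts as in the example above
res <- prop.test(1219, 3532, correct = FALSE)

res$estimate   # sample proportion
res$conf.int   # lower and upper 95% confidence limits
res$p.value    # p-value for the default null hypothesis, p = 0.5

# Illustrative variation: a different null value and confidence level
# (0.3 and 0.90 are arbitrary choices, not taken from the text)
res90 <- prop.test(1219, 3532, p = 0.3, conf.level = 0.90, correct = FALSE)
res90$conf.int
```

The null value p affects only the test statistic and p-value; the confidence interval depends on the counts and conf.level.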
Test Yourself
A sample of n=100 patients free of diabetes has their body mass index (BMI) measured. Of these patients, 32% have a BMI ≥30 and meet the criteria for obesity. Generate a 95% confidence interval for the proportion of patients free of diabetes who are obese.