Confidence Intervals for RRs and ORs in R

Base R does not include a function to calculate confidence intervals for risk ratios (RRs) or odds ratios (ORs). However, supplemental packages can be loaded into R to add analytical tools, including confidence intervals for RRs and ORs. These tools are in the "epitools" package.

You must install the package on your computer once, and then load it in each R session in which you want to use it.

Installing the epitools Package into R

Type the following to install the epitools package (this only needs to be done once):

>install.packages("epitools")

You should see the following message as a response in red:
Installing package into 'C:/Users/healeym/Documents/R/win-library/3.3'
(as 'lib' is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.3/epitools_0.5-7.zip'
Content type 'application/zip' length 228486 bytes (223 KB)
downloaded 223 KB
package 'epitools' successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Users\yourusername\AppData\Local\Temp\RtmpsLajiU\downloaded_packages

Loading the epitools Package When You Want to Use It

You only have to install the epitools package once, but you have to load it in each new R session before you use it.

>library(epitools)
Warning message: package 'epitools' was built under R version 3.4.2
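The install-once, load-every-session pattern can be wrapped in a few lines at the top of a script. The following is a minimal sketch, not part of the original instructions: the helper name ensure_package is made up for illustration and does not exist in R or epitools. It relies only on the base functions requireNamespace(), install.packages(), and library().

```r
# Hypothetical helper (not part of R or epitools): install a package only
# if it is missing, then attach it for the current session.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # runs only when the package is not yet installed
  }
  library(pkg, character.only = TRUE)  # still needed in every new R session
  invisible(TRUE)
}
```

Calling ensure_package("epitools") at the start of a script would then cover both the one-time installation and the per-session loading step.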

RRs and ORs from R

Situation #1: Starting with Counts from a Contingency Table:

If you are given only the counts in a contingency table, i.e., you do not have the raw data set, you can re-create the table in R and then compute the risk ratio and its 95% confidence limits using the riskratio.wald() function in the epitools package.

          No CVD    CVD    Total
No HTN      1017    165     1182
HTN         2260    992     3252
Total       3277   1157     4434

The orientation of the contingency table is critical here: the unexposed (reference) group must be in the first row, and the subjects without the outcome must be in the first column.
We create the contingency table in R using the matrix() function, which fills the table column by column, so we enter the counts for the 1st column and then the 2nd column. Note that we only enter the observed counts for each of the exposure-disease categories; we do not enter the totals in the margins. The solution in R is as follows:

Risk Ratio and Confidence Interval in R

R Code:

# The 1st line below creates the contingency table; the 2nd line prints the table so you can check the orientation
> RRtable<-matrix(c(1017,2260,165,992),nrow = 2, ncol = 2)
> RRtable
     [,1] [,2]
[1,] 1017  165
[2,] 2260  992
# The next line asks R to compute the RR and 95% confidence interval
> riskratio.wald(RRtable)
$data
          Outcome
Predictor  Disease1 Disease2 Total
Exposed1      1017      165  1182
Exposed2      2260      992  3252
Total         3277     1157  4434

$measure
          risk ratio with 95% C.I.
Predictor  estimate    lower    upper
Exposed1 1.000000       NA       NA
Exposed2 2.185217 1.879441 2.540742

$p.value
          two-sided
Predictor  midp.exact fisher.exact  chi.square
Exposed1         NA           NA          NA
Exposed2          0 7.357611e-31 1.35953e-28

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"

The risk ratio and 95% confidence interval are listed in the output under $measure.
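To see where these numbers come from, the same estimate can be reproduced with base R alone. The sketch below assumes the usual Wald approach, in which the limits are computed on the log scale from the standard error of ln(RR); the row and column labels and the variable names (tab, rr, se_log, rr_ci) are added here for illustration and do not appear in the original example.

```r
# Counts from the HTN/CVD table above, entered column by column.
# The row and column labels are added for readability only.
tab <- matrix(c(1017, 2260, 165, 992), nrow = 2, ncol = 2,
              dimnames = list(Exposure = c("No HTN", "HTN"),
                              Outcome  = c("No CVD", "CVD")))

# Check the orientation and the margins against the printed table.
addmargins(tab)

# Risk (cumulative incidence) of CVD in each exposure group.
risk_unexposed <- tab["No HTN", "CVD"] / sum(tab["No HTN", ])
risk_exposed   <- tab["HTN", "CVD"]    / sum(tab["HTN", ])

# Risk ratio and the standard error of its natural log.
rr     <- risk_exposed / risk_unexposed
se_log <- sqrt(1 / tab["HTN", "CVD"]    - 1 / sum(tab["HTN", ]) +
               1 / tab["No HTN", "CVD"] - 1 / sum(tab["No HTN", ]))

# 95% Wald limits: exponentiate ln(RR) +/- z * SE.
z     <- qnorm(0.975)
rr_ci <- exp(log(rr) + c(-1, 1) * z * se_log)

round(c(estimate = rr, lower = rr_ci[1], upper = rr_ci[2]), 3)
```

The rounded result should agree with the Exposed2 row under $measure (about 2.185, with limits of about 1.879 and 2.541). Labeling the rows and columns also makes it easier to confirm that the reference group is in the first row and the non-diseased subjects are in the first column.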

Odds Ratio and 95% Confidence Interval in R

Case-control studies use an odds ratio as the measure of association, but this procedure is very similar to the analysis above for RR.

> ORtable<-matrix(c(1017,2260,165,992),nrow = 2, ncol = 2)
> ORtable
     [,1] [,2]
[1,] 1017  165
[2,] 2260  992
> oddsratio.wald(ORtable)
$data
          Outcome
Predictor  Disease1 Disease2 Total
 Exposed1     1017      165  1182
 Exposed2     2260      992  3252
 Total        3277     1157  4434

$measure
          odds ratio with 95% C.I.
Predictor  estimate    lower    upper
Exposed1 1.000000       NA       NA
Exposed2 2.705455 2.258339 3.241093

$p.value
          two-sided
Predictor  midp.exact fisher.exact  chi.square
Exposed1         NA           NA          NA
Exposed2          0 7.357611e-31 1.35953e-28

$correction
[1] FALSE

attr(,"method")
[1] "Unconditional MLE & normal approximation (Wald) CI"
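As a cross-check, the odds ratio and its Wald limits can also be computed by hand in base R. This is a sketch under the usual assumption that the standard error of ln(OR) is the square root of the summed reciprocals of the four cell counts; the variable names are illustrative only.

```r
# Cell counts from the same 2x2 table (HTN = exposed, CVD = outcome).
exposed_cases      <- 992
exposed_noncases   <- 2260
unexposed_cases    <- 165
unexposed_noncases <- 1017

# Odds ratio as the cross-product ratio.
odds_ratio <- (exposed_cases * unexposed_noncases) /
              (exposed_noncases * unexposed_cases)

# Standard error of ln(OR): square root of the sum of reciprocal counts.
se_log_or <- sqrt(1 / exposed_cases + 1 / exposed_noncases +
                  1 / unexposed_cases + 1 / unexposed_noncases)

# 95% Wald limits on the log scale, then back-transformed.
or_ci <- exp(log(odds_ratio) + c(-1, 1) * qnorm(0.975) * se_log_or)

round(c(estimate = odds_ratio, lower = or_ci[1], upper = or_ci[2]), 3)
```

The rounded values should match the Exposed2 row of the oddsratio.wald() output (about 2.705, with limits of about 2.258 and 3.241).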

Situation #2: Starting with Counts from a Raw Data Set:

If you have a raw data set, computing risk ratios and odds ratios and their corresponding 95% confidence intervals is even easier, because the contingency table can be created using the table() command instead of the matrix function.

 

For example, if I have data from the Framingham Heart Study and I want to compute the risk ratio for the association between type 2 diabetes and risk of being hospitalized with a myocardial infarction, I first use the table() command.

> table(diabetes,hospmi)
        hospmi
diabetes    0    1
       0 2557  210
       1  183   48

Then, to compute the risk ratio and confidence limits, I pass the same table() call to the riskratio.wald() function:

> riskratio.wald(table(diabetes,hospmi))
$data
        hospmi
diabetes    0   1 Total
   0     2557 210  2767
   1      183  48   231
   Total 2740 258  2998

$measure
        risk ratio with 95% C.I.
diabetes estimate    lower   upper
       0  1.00000       NA      NA
       1  2.73791 2.062282 3.63488

Using the same data, I can similarly compute an odds ratio and its confidence interval using the oddsratio.wald() function:

> oddsratio.wald(table(diabetes,hospmi))
$data
        hospmi
diabetes    0   1 Total
   0     2557 210  2767
   1      183  48   231
   Total 2740 258  2998

$measure
        odds ratio with 95% C.I.
diabetes estimate    lower    upper
       0 1.000000       NA       NA
       1 3.193755 2.256038 4.521233

Note that, since this is a cohort study, it makes sense to compute the risk ratio, but I also have the option of computing an odds ratio. In a case-control study, by contrast, one can only calculate an odds ratio. Notice also that in the example above, the odds ratio (3.19) was somewhat more extreme, i.e., farther from 1, than the risk ratio (2.74).
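If the raw Framingham file is not at hand, the example can still be followed by rebuilding one record per subject from the four cell counts shown in the table() output. The sketch below does this with rep(); the reconstructed vectors are stand-ins that reproduce the counts, not the actual data set, and the names tab, risk, odds, rr, and odds_ratio are illustrative.

```r
# Rebuild one record per subject from the four cell counts shown above.
# These vectors stand in for the raw data; they are not the actual file.
cell_counts <- c(2557, 210, 183, 48)
diabetes <- rep(c(0, 0, 1, 1), times = cell_counts)
hospmi   <- rep(c(0, 1, 0, 1), times = cell_counts)

# Cross-tabulate exposure (rows) by outcome (columns).
tab <- table(diabetes, hospmi)
tab

# Risk of hospitalization for MI in each group, and the risk ratio.
risk <- tab[, "1"] / rowSums(tab)
rr   <- unname(risk["1"] / risk["0"])

# Odds of hospitalization for MI in each group, and the odds ratio.
odds       <- tab[, "1"] / tab[, "0"]
odds_ratio <- unname(odds["1"] / odds["0"])

round(c(RR = rr, OR = odds_ratio), 3)
```

The two point estimates should agree with the epitools output above. Because the odds ratio equals the risk ratio multiplied by (1 - risk in the unexposed) / (1 - risk in the exposed), it moves farther from 1 than the risk ratio whenever the outcome is not rare in the exposed group, which is what happens here.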

Test Yourself

Problem #1

A clinical trial was conducted to compare a new blood pressure-lowering medication to a placebo. Patients were enrolled and randomized to receive either the new medication or a placebo. The data below were collected at the end of the 6-week study.

                                     Treatment (n=100)   Placebo (n=100)
Systolic Blood Pressure, mean (sd)   120.2 (15.4)        131.4 (18.9)
Hypertensive, %                      14%                 22%
Side Effects, %                      6%                  8%

Generate a point estimate and 95% confidence interval for the risk ratio of side effects in patients assigned to the experimental group as compared to placebo. Use both the hand calculation method and the method using R to see if you get the same answers. Interpret the results in a sentence or two.

Link to Answer in a Word file

 

Problem #2

The table below summarizes parental characteristics for children of normal weight and children classified as overweight or obese. Perform a chi-square test by hand to determine if there is an association between the mother's BMI and the child's weight status. Compute the p-value and report your conclusion.

Characteristics               Child - Normal Weight (n=62)   Child - Overweight/Obese (n=38)   Total (n=100)
Mean (SD) Age, years          13.4 (2.6)                     11.1 (2.9)                        12.5 (2.7)
% Male                        45%                            51%                               47%
Mother's BMI
  Normal (BMI<25)             40 (65%)                       16 (41%)                          56 (56%)
  Overweight (BMI 25-29.9)    15 (24%)                       14 (38%)                          29 (29%)
  Obese (BMI > 30)            7 (11%)                        8 (21%)                           15 (15%)
Father's BMI
  Normal (BMI<25)             34 (55%)                       16 (41%)                          50 (50%)
  Overweight (BMI 25-29.9)    20 (32%)                       14 (38%)                          34 (34%)
  Obese (BMI > 30)            8 (13%)                        8 (21%)                           16 (16%)
Mean (SD) Systolic BP         123 (15)                       139 (12)                          129 (14)
Mean (SD) Total Cholesterol   186 (25)                       211 (28)                          196 (26)

 Link to Answer in a Word file