Exercise 4 - Analysis of a Cohort Study


Is there an association between High Density Lipoprotein Cholesterol (HDLC) levels and myocardial infarction?

In Handouts on the right-hand side of the spreadsheet module there is an Excel file with actual data from the Framingham Heart Study. In this data set there is a variable labeled HDLC in column R. This is the subject's value for high density lipoprotein cholesterol at the beginning of the study (baseline). HDLC is sometimes called the 'good' cholesterol, because higher levels of HDLC seem to be associated with a lower risk of atherosclerotic heart disease. [For more on lipoproteins and heart disease, see the discussion of this topic in the PH709 module on Atherosclerosis.]

In this analysis you will examine the association between low HDLC (i.e., values < 40) and risk of myocardial infarction. In the spreadsheet with the Framingham data set there is a variable called MI_FCHD in column W, which indicates whether the subject was hospitalized for a myocardial infarction or died of a myocardial infarction during this phase of the study. This is the outcome of interest for this particular analysis. TIMEMIFC in column AD indicates the disease-free observation time (in days) for having a fatal or non-fatal myocardial infarction (heart attack). This time represents the number of days elapsed from baseline until one of three things happened: 1) the subject had a myocardial infarction, 2) the subject was lost to follow-up, or 3) the subject reached the last date for this phase of the study without an event. Note that many subjects have 8,766 days of TIMEMIFC; these are all subjects who had the maximum number of observation days because they were observed from baseline until the end of this phase of the study (8,766 days) and did not have a myocardial infarction during that entire period.

Example

This is a 21-minute video that walks you through a similar analysis comparing the risk of myocardial infarction between subjects with high LDL (> 130) and low LDL (< 130). It provides detailed instructions on performing an analysis in a cohort study.

Detailed Instructions:

Open the Framingham data set in the Handouts section of the online module and save it to your computer using "File", "Save as". Then analyze the association between low HDLC (HDLC <40) and myocardial infarction (MI_FCHD). Compute BOTH the cumulative incidence and the incidence rates in the exposed and unexposed groups. Use these results to compute the Risk Ratio (using cumulative incidence) and the Rate Ratio (using the incidence rates).

[Note: If any individuals do not have a recorded value for HDLC, exclude them from the analysis.]

You will need to compute:

  1. The number of 'events' (MI_FCHD) in exposed and unexposed groups.
  2. The number of subjects in exposed (HDLC<40) and unexposed (HDLC≥40) groups.
  3. The person-time (in years) for the appropriate groups.
  4. The incidence rate (events per person-year) for each group.
  5. The cumulative incidence for each group.
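
The five quantities listed above can also be tabulated outside Excel. The following Python sketch is only an illustration under stated assumptions: it assumes MI_FCHD is coded 1 for an event and 0 otherwise, it converts days to years by dividing by 365.25 (so 8,766 days is 24 years), and it uses a handful of made-up rows rather than actual Framingham records. The names `summarize`, `GroupSummary`, and `toy_rows` are hypothetical.

```python
from dataclasses import dataclass

# Assumption: average year length of 365.25 days (8,766 days = 24 years).
DAYS_PER_YEAR = 365.25


@dataclass
class GroupSummary:
    events: int          # number of myocardial infarctions in the group
    subjects: int        # number of subjects in the group
    person_years: float  # total disease-free observation time in years

    @property
    def cumulative_incidence(self) -> float:
        # Events divided by the number of subjects at risk at baseline.
        return self.events / self.subjects

    @property
    def incidence_rate(self) -> float:
        # Events divided by total person-years of observation.
        return self.events / self.person_years


def summarize(rows, cutoff=40):
    """Tabulate rows of (HDLC, MI_FCHD, TIMEMIFC in days) by exposure group.

    Exposed means HDLC < cutoff; unexposed means HDLC >= cutoff.
    Rows with a missing HDLC value (None) are excluded.
    """
    totals = {"exposed": [0, 0, 0.0], "unexposed": [0, 0, 0.0]}
    for hdlc, mi_fchd, days in rows:
        if hdlc is None:
            continue  # no recorded HDLC: exclude from the analysis
        group = totals["exposed" if hdlc < cutoff else "unexposed"]
        group[0] += mi_fchd               # assumes 1 = event, 0 = no event
        group[1] += 1
        group[2] += days / DAYS_PER_YEAR  # convert days to years
    return {name: GroupSummary(*vals) for name, vals in totals.items()}


# Made-up rows for illustration only; these are NOT Framingham data.
toy_rows = [
    (35, 1, 4383),    # low HDLC, had an MI after 12 years
    (38, 0, 8766),    # low HDLC, no MI, full follow-up
    (None, 0, 8766),  # missing HDLC, excluded
    (52, 0, 8766),
    (61, 1, 2922),
    (45, 0, 8766),
]

summary = summarize(toy_rows)
for name, g in summary.items():
    print(name, g.events, g.subjects, g.person_years,
          g.cumulative_incidence, g.incidence_rate)
```

The same counts can be obtained in Excel by sorting on HDLC and summing the relevant columns; the sketch simply makes each step explicit.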

With this information, complete your analysis by using the Cohort Study worksheet in Epi_Tools.xls to compute the following:

  1. Risk ratio (using cumulative incidences), 95% confidence interval for the risk ratio, and the p-value for this comparison.
  2. Rate ratio (using incidence rates), 95% confidence interval for the rate ratio, and the p-value for this comparison.
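
As a cross-check on the worksheet output, the two ratios and approximate 95% confidence intervals can be sketched in Python. This sketch assumes the common large-sample (log-based) interval formulas; it is not confirmed that Epi_Tools.xls uses exactly these, so small differences are possible. The p-values are omitted here. The function names and the counts in the usage lines are hypothetical, not results from the Framingham data.

```python
import math

Z_95 = 1.96  # standard normal critical value for a 95% confidence interval


def risk_ratio(a, n1, c, n0):
    """Risk ratio with a log-based 95% CI (large-sample approximation).

    a, n1: events and subjects in the exposed group
    c, n0: events and subjects in the unexposed group
    """
    rr = (a / n1) / (c / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - Z_95 * se_log)
    upper = math.exp(math.log(rr) + Z_95 * se_log)
    return rr, lower, upper


def rate_ratio(a, pt1, c, pt0):
    """Rate ratio with a log-based 95% CI (large-sample approximation).

    a, pt1: events and person-years in the exposed group
    c, pt0: events and person-years in the unexposed group
    """
    irr = (a / pt1) / (c / pt0)
    se_log = math.sqrt(1 / a + 1 / c)
    lower = math.exp(math.log(irr) - Z_95 * se_log)
    upper = math.exp(math.log(irr) + Z_95 * se_log)
    return irr, lower, upper


# Hypothetical counts for illustration only.
rr, rr_lo, rr_hi = risk_ratio(a=30, n1=200, c=40, n0=800)
irr, irr_lo, irr_hi = rate_ratio(a=30, pt1=4000, c=40, pt0=16000)
print(f"Risk ratio {rr:.2f} (95% CI {rr_lo:.2f} to {rr_hi:.2f})")
print(f"Rate ratio {irr:.2f} (95% CI {irr_lo:.2f} to {irr_hi:.2f})")
```

Both functions need at least one event in each group, since the interval formulas divide by the event counts.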

Finally, using the Excel data file that you used to sort and tabulate the data, insert a text box that summarizes the results of your analysis, and interpret your findings in a few sentences. Label the summary worksheet tab as "Summary."