# An Exercise Using R to Compute Chi-Squared from a Data Set

In this exercise you will examine associations between baseline health characteristics measured during the first visit in the Framingham Heart Study and the development of heart disease and death over 20 years of follow-up. You will use a subset (n=500) of the data saved in the file FramHSn500.csv.

The data is coded as shown in the table below:

| Variable Name | Description |
|---|---|
| RANDID | subject ID number |
| SEX | 1=male, 2=female |
| AGE | age in years |
| SYSBP | systolic blood pressure in mmHg |
| DIABP | diastolic blood pressure in mmHg |
| TOTCHOL | total cholesterol in mg/dL |
| CURSMOKE | current smoking status, 1=smoker, 0=nonsmoker |
| FRAM_BMI | body mass index measured in kg/m² |
| COFFEE | cups of coffee per day (6 = 6 or more) |
| DIABETES | developed diabetes over 20 years of follow-up, 1=yes, 0=no |
| HEARTDIS | developed heart disease over 20 years of follow-up, 1=yes, 0=no |
| ANYDEATH | death (any cause) over 20 years of follow-up, 1=died, 0=living |

Save your R code and output as you answer the following questions.

1. Are smokers at higher risk of death over follow-up than non-smokers?  Test this with a chi-square test, reporting the proportion of smokers and non-smokers who have died over 20 years of follow-up, the risk ratio, the value of the chi-square statistic, degrees of freedom, and p-value.  Summarize your conclusions.
2. Are smokers at higher risk of death over follow-up than non-smokers?  Find the risk ratio of death for smokers vs. non-smokers, and the 95% confidence interval for this risk ratio (remember that the orientation of the table matters when finding a RR).  Given this confidence interval, do smokers have significantly higher risk of death?  Explain.
3.  Is there an association between coffee consumption and death over the 20-year follow-up?  Test this with a chi-square statistic, reporting the proportion who have died in each category of coffee consumption, the value of the test statistic, degrees of freedom, and p-value. Summarize your conclusions.
4. Do those who develop heart disease have a higher risk of death over follow-up?  What percent of those with and without heart disease die over follow-up?  Test through a chi-square statistic, reporting the value of the test statistic, degrees of freedom, and p-value.  Summarize your conclusions.
5. Do those who develop heart disease have a higher risk of death over follow-up?  Find the risk ratio of death for those with vs. without heart disease, and the 95% confidence interval for this risk ratio (remember that the orientation of the table matters when finding a RR).  Given this confidence interval, do those who develop heart disease have significantly higher risk of death?  Explain.
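As a starting point for the questions above, here is a minimal sketch in base R of one way to get the group proportions, the risk ratio with a Wald 95% confidence interval on the log scale, and an uncorrected chi-square test. The helper name `rr_summary` is hypothetical (it is not part of the course materials), and the `toy` data frame is made up purely to show the mechanics; it is not the Framingham data. Your course may expect a different approach (for example a contributed package), so treat this as an illustration only.

```r
# Hypothetical helper: risk ratio of the outcome for exposed vs. unexposed,
# with a Wald confidence interval computed on the log scale.
# Both arguments are assumed to be coded 0/1, as in the variable table above.
rr_summary <- function(exposed, outcome, conf = 0.95) {
  tab <- table(exposed = exposed, outcome = outcome)
  a  <- as.numeric(tab["1", "1"])   # events among the exposed
  n1 <- sum(tab["1", ])             # total exposed
  c0 <- as.numeric(tab["0", "1"])   # events among the unexposed
  n0 <- sum(tab["0", ])             # total unexposed
  rr <- (a / n1) / (c0 / n0)
  se_log_rr <- sqrt(1 / a - 1 / n1 + 1 / c0 - 1 / n0)
  z <- qnorm(1 - (1 - conf) / 2)
  test <- chisq.test(tab, correct = FALSE)  # no continuity correction
  list(
    risk_exposed   = a / n1,
    risk_unexposed = c0 / n0,
    rr       = rr,
    ci_lower = exp(log(rr) - z * se_log_rr),
    ci_upper = exp(log(rr) + z * se_log_rr),
    chisq    = unname(test$statistic),
    df       = unname(test$parameter),
    p_value  = test$p.value
  )
}

# Made-up toy data (NOT the Framingham values): 50 smokers, 50 non-smokers.
toy <- data.frame(
  CURSMOKE = rep(c(1, 0), each = 50),
  ANYDEATH = c(rep(1, 20), rep(0, 30), rep(1, 10), rep(0, 40))
)
toy_result <- rr_summary(toy$CURSMOKE, toy$ANYDEATH)
print(toy_result)

# With the real file in your working directory, the same call would be:
# fram <- read.csv("FramHSn500.csv")
# rr_summary(fram$CURSMOKE, fram$ANYDEATH)
# For a multi-category exposure such as COFFEE:
# coffee_tab <- table(fram$COFFEE, fram$ANYDEATH)
# prop.table(coffee_tab, 1)   # row proportions: share who died per category
# chisq.test(coffee_tab)
```

Putting the exposed group in the numerator is what the reminder about table orientation refers to: swapping the rows gives the reciprocal risk ratio.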

Test Yourself

#1 - Starship Enterprise

Now let's return to the question of mortality rates on the Enterprise. Here is the data. Is the mortality rate significantly greater among the Red crew members? Use R to analyze these data.

| Color | Areas | Crew | Fatalities |
|---|---|---|---|
| Blue | Science and Medical | 136 | 7 |
| Gold | Command and Helm | 55 | 9 |
| Red | Operations, Engineering, and Security | 239 | 24 |
| Ship's total | All | 430 | 40 |
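One possible way to set this up in base R is sketched below. The object names are illustrative. The Red fatality count is entered as 24 so that the three rows add up to the ship's total of 40 fatalities. Whether to test all three colors at once or to compare Red against everyone else is a judgment call, so both are shown.

```r
# Crew sizes and fatalities by uniform color, from the table above.
crew   <- c(Blue = 136, Gold = 55, Red = 239)
deaths <- c(Blue = 7,   Gold = 9,  Red = 24)  # 24 makes the rows sum to 40

# 3 x 2 table of outcomes: one row per color.
enterprise <- cbind(Died = deaths, Survived = crew - deaths)

# Mortality proportion in each group.
mortality <- deaths / crew
print(round(mortality, 3))

# Overall test of association between uniform color and death.
overall_test <- chisq.test(enterprise)
print(overall_test)

# Red vs. all other crew members, as a 2 x 2 table.
red_vs_other <- rbind(
  Red   = enterprise["Red", ],
  Other = colSums(enterprise[c("Blue", "Gold"), ])
)
red_test <- chisq.test(red_vs_other, correct = FALSE)  # no continuity correction
print(red_test)
```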

#2 - The Titanic

Now let's reconsider whether 1st, 2nd, or 3rd class passengers differed significantly in their risk of death after the Titanic struck an iceberg. Here is the data.

| Women | Alive | Dead | Total |
|---|---|---|---|
| 1st Class | 137 | 4 | 141 |
| 2nd Class | 79 | 13 | 92 |
| 3rd Class | 88 | 91 | 179 |

Use R to compute the risk ratio of death for 2nd class compared to 1st class (reference group) and for 3rd class compared to 1st class. Also compute the 95% confidence intervals for these risk ratio estimates and the p-values.

Was the risk of death significantly higher in 2nd class and 3rd class passengers?
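A minimal base R sketch of these comparisons follows. The function `rr_vs_ref` is a hypothetical helper written for this illustration; it uses a Wald confidence interval on the log scale and takes the p-value from a chi-square test on the corresponding 2 x 2 sub-table. Note that `chisq.test` applies a continuity correction to 2 x 2 tables by default; it is switched off here, which is an assumption about what the exercise intends.

```r
# Counts from the table above: rows are classes, columns are outcomes.
titanic <- rbind(
  "1st Class" = c(Alive = 137, Dead = 4),
  "2nd Class" = c(Alive = 79,  Dead = 13),
  "3rd Class" = c(Alive = 88,  Dead = 91)
)

# Hypothetical helper: risk ratio of death for `group` relative to `ref`.
rr_vs_ref <- function(tab, group, ref = "1st Class", conf = 0.95) {
  d1 <- tab[group, "Dead"]; n1 <- sum(tab[group, ])  # comparison group
  d0 <- tab[ref, "Dead"];   n0 <- sum(tab[ref, ])    # reference group
  rr <- (d1 / n1) / (d0 / n0)
  se_log_rr <- sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
  z <- qnorm(1 - (1 - conf) / 2)
  p <- chisq.test(tab[c(ref, group), ], correct = FALSE)$p.value
  c(RR = rr,
    lower = exp(log(rr) - z * se_log_rr),
    upper = exp(log(rr) + z * se_log_rr),
    p_value = p)
}

second_vs_first <- rr_vs_ref(titanic, "2nd Class")
third_vs_first  <- rr_vs_ref(titanic, "3rd Class")
print(rbind("2nd vs 1st" = second_vs_first,
            "3rd vs 1st" = third_vs_first))
```

A confidence interval that excludes 1 corresponds to a statistically significant difference in risk at the chosen level.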