Application of the Chi-Squared Test of Independence


The chi-squared test of independence can be used to analyze data from cross-sectional surveys, retrospective and prospective cohort studies, randomized clinical trials, and case-control studies.

A Cross-sectional Survey

Consider the results of a cross-sectional survey in which assistant professors at colleges were asked to indicate their sex (the exposure of interest) and whether their starting salary was greater or less than $60,000 per year.

            < $60,000    > $60,000
Male           122           75
Female          64           50

The prevalence of a salary less than $60,000 per year was 122/(122+75) = 0.619 = 61.9% in males compared to a prevalence of 64/(64+50) = 0.561 = 56.1% in females. The prevalence ratio for lower salary in males compared to females was therefore:

Prevalence ratio = 0.619 / 0.561 = 1.10
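The prevalences and their ratio can be reproduced directly from the four cell counts. The following is a minimal sketch; the object names (salary, prev, prevalence_ratio) and the row and column labels are illustrative choices, not part of the original analysis.

```r
# Counts from the survey table: rows = sex, columns = salary category.
# matrix() fills column by column, so the first two values are the "<60K" column.
salary <- matrix(c(122, 64, 75, 50), nrow = 2,
                 dimnames = list(sex = c("Male", "Female"),
                                 salary = c("<60K", ">60K")))

# Prevalence of a starting salary below $60,000 within each sex
prev <- salary[, "<60K"] / rowSums(salary)
round(prev, 3)

# Prevalence ratio, males relative to females
prevalence_ratio <- prev[["Male"]] / prev[["Female"]]
round(prevalence_ratio, 2)
```

A ratio close to 1 means the two groups have similar prevalences; here it comes to about 1.10.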

I can compare the frequencies with the following code:

> salarytable<-matrix(c(122,64,75,50),nrow=2,ncol=2)

> salarytable

[,1] [,2]

[1,] 122 75

[2,] 64 50

> chisq.test(salarytable,correct=FALSE)

Pearson's Chi-squared test

data: salarytable

X-squared = 1.0066, df = 1, p-value = 0.3157

Since the p-value is 0.3157, there is not sufficient evidence to conclude that the frequency of a starting salary above $60,000 per year differs between males and females.
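To make clear where the test statistic comes from, the calculation can also be carried out by hand. This is a sketch under the usual definition of the Pearson statistic (no continuity correction, matching correct=FALSE); the variable names are illustrative.

```r
# Observed counts, same layout as the salary table
observed <- matrix(c(122, 64, 75, 50), nrow = 2)

# Expected count in each cell = (row total x column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson statistic: sum over cells of (observed - expected)^2 / expected
x2 <- sum((observed - expected)^2 / expected)

# Degrees of freedom = (rows - 1) x (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability from the chi-squared distribution
p <- pchisq(x2, df = df, lower.tail = FALSE)
round(c(statistic = x2, df = df, p.value = p), 4)
```

The statistic and p-value should agree with the chisq.test output above.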

A Prospective Cohort Study

Antonia Trichopoulou, M.D., et al: Adherence to a Mediterranean Diet and Survival in a Greek Population. N Engl J Med 2003;348:2599-608.

From 1994 to 1999 a study was conducted to identify nutritional and lifestyle behaviors associated with survival in Greek adults. A total of 28,572 participants, 20 to 86 years old, were recruited from all regions of Greece. One goal was to study the extent to which close adherence to a traditional Mediterranean (Greek) diet was associated with survival, but the investigators also examined a number of other potential risk factors. After enrollment (i.e., at the baseline or beginning of the study), subjects completed extensive questionnaires administered in person by specially trained interviewers. The dietary questionnaire documented food intake during the past year using a semi-quantitative food-frequency questionnaire that included 150 foods and beverages commonly consumed in Greece. Adherence to the traditional Mediterranean diet was assessed by a 10-point Mediterranean-diet scale. Some of the results in men are shown in the table below.

Adherence to Greek Diet    Died During Study    Not Dead    Total
Low                                74             2383       2457
Medium                             61             3747       3808
High                               44             2586       2630

> diettable<-matrix(c(74,61,44,2383,3747,2586),nrow=3,ncol=2)

> diettable

[,1] [,2]

[1,] 74 2383

[2,] 61 3747

[3,] 44 2586

> chisq.test(diettable,correct=FALSE)

Pearson's Chi-squared test

data: diettable

X-squared = 17.2361, df = 2, p-value = 0.0001808

Therefore, I would reject the null hypothesis and conclude that the frequency of death among Greek males does differ significantly among the three categories of dietary adherence.
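A significant overall test does not say which groups depart from what independence would predict. One way to look at this, sketched below, is to inspect the expected counts and Pearson residuals that chisq.test returns; the object names and dimension labels are illustrative.

```r
# Rows = adherence group, columns = outcome (filled column by column)
diet <- matrix(c(74, 61, 44, 2383, 3747, 2586), nrow = 3,
               dimnames = list(adherence = c("Low", "Medium", "High"),
                               outcome = c("Died", "Alive")))

res <- chisq.test(diet, correct = FALSE)

# Counts expected in each cell if death were independent of adherence
round(res$expected, 1)

# Pearson residuals: (observed - expected) / sqrt(expected);
# large absolute values mark the cells that contribute most to X-squared
round(res$residuals, 2)
```

The low-adherence group shows more deaths than expected under independence, which is consistent with the comparison of cumulative incidences in the text.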

I could also compute risk ratios by using the men with high adherence as a reference group and comparing the other two categories to them. For example, the cumulative incidence in men with low adherence was 74/2457 = 0.0301 = 30.1 deaths per 1,000 men over the five years of observation. The cumulative incidence in men with high adherence was 44/2630 = 0.0167 = 16.7 deaths per 1,000 men over the five years of observation. Therefore, the risk ratio for men with low adherence compared to those with high adherence was as follows:

Risk ratio = 0.0301 / 0.0167 = 1.80

One might interpret these results as follows: Men with low adherence to a traditional Greek diet had 1.8 times the risk of dying during a five-year period of observation compared with men with high adherence, and the difference in mortality across the adherence groups was statistically significant (p=0.0002).
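The same calculation extends to both comparison groups at once. The sketch below computes the cumulative incidence in each adherence group and divides by the incidence in the high-adherence reference group; the names diet, risk, and risk_ratio are illustrative.

```r
# Rows = adherence group, columns = outcome (filled column by column)
diet <- matrix(c(74, 61, 44, 2383, 3747, 2586), nrow = 3,
               dimnames = list(adherence = c("Low", "Medium", "High"),
                               outcome = c("Died", "Alive")))

# Cumulative incidence of death in each group
risk <- diet[, "Died"] / rowSums(diet)
round(risk * 1000, 1)   # deaths per 1,000 men

# Risk ratio for each group relative to the high-adherence group
risk_ratio <- risk / risk[["High"]]
round(risk_ratio, 2)
```

The reference group has a ratio of exactly 1 by construction, so only the Low and Medium entries carry information.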

A Randomized Clinical Trial

In 1982 the Physicians' Health Study enrolled over 22,000 male physicians in the US between the ages of 40 and 84 in order to study whether low-dose aspirin (one tablet every other day) was protective against myocardial infarctions (heart attacks). The subjects were randomly assigned to take low-dose aspirin or a placebo, and they were followed for about five years. One of the endpoints of interest was whether aspirin reduced the incidence of fatal myocardial infarctions. Their findings are summarized in the table below.

           Fatal MI    No Fatal MI    Total
Aspirin       10         11,027       11,037
Placebo       26         11,008       11,034

 

> MItable<-matrix(c(10,26,11027,11008),nrow=2,ncol=2)

> MItable

[,1] [,2]

[1,] 10 11027

[2,] 26 11008

> chisq.test(MItable, correct=FALSE)

Pearson's Chi-squared test

data: MItable

X-squared = 7.1271, df = 1, p-value = 0.007593

The cumulative incidence in the men treated with aspirin was 10/11,037 = 0.000906, or about 9 per 10,000 over 5 years.

The cumulative incidence in the men receiving the placebo was 26/11,034 = 0.002356, or about 24 per 10,000 over 5 years.

Therefore the risk ratio was:

Risk ratio = 0.000906 / 0.002356 = 0.38

Interpretation: Male physicians who took an aspirin every other day had 0.38 times the risk of a fatal myocardial infarction (a 62% reduction in risk) compared to male physicians treated with placebo (p=0.008).
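The risk ratio and the percent reduction in risk can be checked from the table counts. This is a minimal sketch with illustrative object names; it works from the unrounded incidences, which gives a ratio of roughly 0.38, whereas rounding the two incidences to whole numbers per 10,000 before dividing can shift the second decimal.

```r
# Rows = treatment group, columns = outcome (filled column by column)
mi <- matrix(c(10, 26, 11027, 11008), nrow = 2,
             dimnames = list(group = c("Aspirin", "Placebo"),
                             outcome = c("FatalMI", "NoFatalMI")))

# Cumulative incidence of fatal MI in each group
risk <- mi[, "FatalMI"] / rowSums(mi)
round(risk * 10000, 1)   # events per 10,000 men

# Risk ratio, aspirin relative to placebo
risk_ratio <- risk[["Aspirin"]] / risk[["Placebo"]]
round(risk_ratio, 2)

# Relative risk reduction, expressed as a percentage
round((1 - risk_ratio) * 100)
```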

A Case-Control Study

D'Souza et al. conducted a study on the association between human papillomavirus and oropharyngeal cancer (N Engl J Med 2007;356:1944-56). They identified 100 patients with newly diagnosed squamous-cell carcinomas of the head and neck in Baltimore from 2000 through 2005. The comparison group consisted of 200 patients without a history of cancer who were seen for benign conditions between 2000 and 2005 in the same clinic. All patients completed a computer-assisted self-administered interview that recorded information about demographic characteristics, past oral hygiene, medical history, family history of cancer, lifetime sexual behaviors, and lifetime history of marijuana, tobacco, and alcohol use. Part of their results focused on the association between oral hygiene and oropharyngeal cancer, as shown in this table.

 

Tooth Loss    Patients with Oropharyngeal Cancer (N=100)    Control Patients (N=200)
None                            62                                    163
Some                            16                                     20
Complete                        22                                     17

 

> CCtable<-matrix(c(62,16,22,163,20,17),nrow=3,ncol=2)

> CCtable

[,1] [,2]

[1,] 62 163

[2,] 16 20

[3,] 22 17

> chisq.test(CCtable,correct=FALSE)

Pearson's Chi-squared test

data: CCtable

X-squared = 14.7262, df = 2, p-value = 0.0006342

Since this is a case-control study, I cannot calculate the incidence, and I cannot calculate a risk ratio per se. However, I can compute odds ratios, for example, using the subjects with no tooth loss as the reference and comparing each of the other two exposure groups to them. For example, comparing the group with complete tooth loss to those with no tooth loss, the odds ratio is as follows:

Odds ratio = (22/17) / (62/163) = (22 × 163) / (17 × 62) = 3.40

Interpretation: Those who had complete tooth loss had 3.4 times the odds of having oropharyngeal cancer compared to those with no tooth loss.
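Both odds ratios (some tooth loss and complete tooth loss, each compared with none) can be computed in one step. The following is a sketch with illustrative object names; it takes the odds of being a case within each exposure category and divides by the odds in the reference category.

```r
# Rows = tooth loss category, columns = case/control status (filled column by column)
cc <- matrix(c(62, 16, 22, 163, 20, 17), nrow = 3,
             dimnames = list(tooth_loss = c("None", "Some", "Complete"),
                             status = c("Case", "Control")))

# Odds of being a case within each tooth-loss category
odds <- cc[, "Case"] / cc[, "Control"]

# Odds ratio for each category relative to no tooth loss
odds_ratio <- odds / odds[["None"]]
round(odds_ratio, 2)
```

These are crude (unadjusted) odds ratios computed from the table alone.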