The dataset Outbreak contains information from an investigation of an outbreak of acute gastrointestinal illness on a national handicapped sports day in Thailand in 1990. The dataset is available in the epicalc package for R.
Dichotomous variables for exposures and symptoms were coded as follows:
- 0 = no
- 1 = yes
- 9 = missing or unknown
Outbreak is a data frame with 1094 observations on the following variables:
- sex: a numeric vector (0 = female, 1 = male)
- age: a numeric vector (age in years)
- beefcurry: a numeric vector (whether the subject had eaten beef curry)
- saltegg: a numeric vector (whether the subject had eaten salted eggs)
- eclair: a numeric vector (pieces of eclair eaten; 80 = ate but could not remember how much, 90 = missing)
- water: a numeric vector (whether the subject had drunk water)
- nausea: a numeric vector
- vomiting: a numeric vector
- abdpain: a numeric vector (abdominal pain)
- diarrhea: a numeric vector
[Reference: Thaikruea, L., Pataraarechachai, J., Savanpunyalert, P., Naluponjiragul, U. 1995. An unusual outbreak of food poisoning. Southeast Asian J Trop Med Public Health 26(1):78-85.]
From these data we can try to answer a number of questions about tracing the cause of the food poisoning outbreak and comparing its severity among groups with different exposures.
It was agreed among the investigators that a food poisoning case should be defined as a person who had any of the four symptoms: nausea, vomiting, abdominal pain or diarrhea. A case can then be computed as follows (attach the data set before you do this):
> case <- ifelse((nausea==1)|(vomiting==1)|(abdpain==1)|(diarrhea==1),1,0)
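To see how this rule treats the codes 0, 1 and 9, it can help to try it on a few made-up rows first. The sketch below uses a hypothetical data frame called toy (its values are invented, not taken from Outbreak) and with() in place of attach(), purely so that the example is self-contained.

```r
# Toy data: hypothetical values, NOT taken from the Outbreak dataset
toy <- data.frame(
  nausea   = c(0, 1, 0, 9),
  vomiting = c(0, 0, 0, 9),
  abdpain  = c(0, 0, 1, 9),
  diarrhea = c(0, 0, 0, 9)
)

# Same rule as above: 1 if any of the four symptoms equals 1, otherwise 0
toy_case <- with(toy, ifelse((nausea == 1) | (vomiting == 1) |
                             (abdpain == 1) | (diarrhea == 1), 1, 0))

# One value per row of toy
toy_case
```

Under this rule a coded 9 (missing or unknown) is simply "not equal to 1", so the last toy person, whose four symptoms are all unknown, is counted as a non-case. Whether that is appropriate is a judgment call that depends on the analysis.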
Append the case status as a new variable in the data set, called "case", and attach the new data set after detaching the previous one.
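One possible way to do the append-and-re-attach step in base R is sketched below. It again uses a small hypothetical data frame, toy, with only two symptom columns so that it runs on its own; with the real data you would use Outbreak and all four symptoms.

```r
# Toy data: hypothetical values, NOT taken from the Outbreak dataset
toy <- data.frame(nausea = c(0, 1, 9), vomiting = c(0, 0, 1))

attach(toy)
case <- ifelse((nausea == 1) | (vomiting == 1), 1, 0)
detach(toy)          # detach the old version before changing the data frame

toy$case <- case     # append case status as a new column called "case"
rm(case)             # drop the free-standing copy so it cannot mask the column

attach(toy)          # attach the updated data frame
table(case)          # "case" is now found inside the attached toy
detach(toy)
```

Removing the free-standing case vector is optional, but if it is left in the workspace it takes precedence over the column of the same name in the attached data frame, and R prints a masking message when you attach.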
We can now look at some of the relative frequencies of cases among different exposure groups. For instance, let us first tabulate the frequencies of cases according to whether people at the sports day ate salted eggs.
> eggcase <- table(case, saltegg)
> prop.table(eggcase, 2) ## why is this 2?
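If the role of the second argument of prop.table is not obvious, it can be explored on a small made-up table. In the sketch below the vectors toy_case and toy_saltegg are hypothetical (six invented people), and the sums are computed only to show which direction each version adds up to 1.

```r
# Toy data: hypothetical values, NOT taken from the Outbreak dataset
toy_case    <- c(1, 1, 0, 0, 1, 0)
toy_saltegg <- c(1, 1, 1, 0, 0, 0)

# Rows are case status, columns are salted egg exposure
toy_tab <- table(case = toy_case, saltegg = toy_saltegg)

by_row <- prop.table(toy_tab, 1)   # proportions within each row (case status)
by_col <- prop.table(toy_tab, 2)   # proportions within each column (exposure)

rowSums(by_row)   # each row of by_row sums to 1
colSums(by_col)   # each column of by_col sums to 1
```

Comparing by_row and by_col should suggest which of the two answers the question "what proportion of people in each exposure group became cases?".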
Using appropriate commands, tabulate the frequencies of cases among people at the sports day who had/hadn't eaten beef curry (you can include the missing individuals in your table). Also display the proportions through an appropriate bar chart.