Interquartile Range (IQR)
When a data set has outliers or extreme values, we summarize a typical value using the median rather than the mean, and we often summarize variability using a statistic called the interquartile range, which is the difference between the third and first quartiles. The first quartile, denoted Q_{1}, is the value in the data set that holds 25% of the values below it. The third quartile, denoted Q_{3}, is the value in the data set that holds 25% of the values above it. The quartiles can be determined following the same approach that we used to determine the median, but we now consider each half of the data set separately. The interquartile range is defined as follows:
Interquartile Range = Q_{3} - Q_{1}
With an Even Sample Size:
For the sample (n=10) the median diastolic blood pressure is 71 (50% of the values are above 71, and 50% are below). The quartiles can be determined in the same way we determined the median, except we consider each half of the data set separately.
Figure 9  Interquartile Range with Even Sample Size
There are 5 values below the median (the lower half); the middle value is 64, which is the first quartile. There are 5 values above the median (the upper half); the middle value is 77, which is the third quartile. The interquartile range is 77 - 64 = 13; it is the range of the middle 50% of the data.

With an Odd Sample Size:
When the sample size is odd, the median and quartiles are determined in the same way. Suppose in the previous example, the lowest value (62) were excluded, and the sample size was n=9. The median and quartiles are indicated below.
Figure 10  Interquartile Range with Odd Sample Size
When the sample size is 9, the median is the middle number, 72. The quartiles are determined in the same way, looking at the lower and upper halves, respectively. There are 4 values in the lower half, so the first quartile is the mean of the 2 middle values in the lower half ((64+64)/2=64). The same approach is used in the upper half to determine the third quartile ((77+81)/2=79).
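The split-halves procedure described for even and odd sample sizes can be sketched in Python as follows. This is a minimal sketch, not a standard library routine: the function name `quartiles_by_halves` is hypothetical, and the data values are illustrative assumptions chosen to reproduce the quartiles quoted in the text rather than values copied from the figures.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (Q1, median, Q3) by taking the median of each half of the data."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]       # the lowest n//2 values
    upper = data[n - half:]   # the highest n//2 values
    # When n is odd, the middle observation belongs to neither half.
    return median(lower), median(data), median(upper)

# Illustrative values only (an assumption, not the study data in the figures).
even_sample = [62, 63, 64, 64, 70, 72, 76, 77, 81, 81]   # n = 10
odd_sample = even_sample[1:]                             # lowest value (62) excluded, n = 9

for sample in (even_sample, odd_sample):
    q1, med, q3 = quartiles_by_halves(sample)
    print(f"n={len(sample)}: Q1={q1}, median={med}, Q3={q3}, IQR={q3 - q1}")
```

Statistical packages often use interpolation rules that differ from this split-halves convention, so their quartiles may not match the hand calculation exactly for small samples.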
Outliers and Tukey Fences:
When there are no outliers in a sample, the mean and standard deviation are used to summarize a typical value and the variability in the sample, respectively. When there are outliers in a sample, the median and interquartile range are used to summarize a typical value and the variability in the sample, respectively.
Table 13 displays the means, standard deviations, medians, quartiles and interquartile ranges for each of the continuous variables in the subsample of n=10 participants who attended the seventh examination of the Framingham Offspring Study.
Table 13  Summary Statistics on n=10 Participants
Characteristic | Mean | Standard Deviation | Median | Q1 | Q3 | IQR
Systolic Blood Pressure | 121.2 | 11.1 | 122.5 | 113.0 | 127.0 | 14.0
Diastolic Blood Pressure | 71.3 | 7.2 | 71.0 | 64.0 | 77.0 | 13.0
Total Serum Cholesterol | 202.3 | 37.7 | 206.5 | 163.0 | 227.0 | 64.0
Weight | 176.0 | 33.0 | 169.5 | 151.0 | 206.0 | 55.0
Height | 67.175 | 4.205 | 69.375 | 63.0 | 70.0 | 7.0
Body Mass Index | 27.26 | 3.10 | 26.60 | 24.9 | 29.6 | 4.7
Table 14 displays the observed minimum and maximum values along with the limits used to identify outliers under the quartile rule for each of the variables in the subsample of n=10 participants. A value below the lower limit or above the upper limit is a suspected outlier. Are there outliers in any of the variables? Which statistics are most appropriate to summarize the average or typical value and the dispersion?
Table 14  Limits for Assessing Outliers in Characteristics Measured in the n=10 Participants
Characteristic | Minimum | Maximum | Lower Limit^{1} | Upper Limit^{2}
Systolic Blood Pressure | 105 | 141 | 92 | 148
Diastolic Blood Pressure | 62 | 81 | 44.5 | 96.5
Total Serum Cholesterol | 150 | 275 | 67 | 323
Weight | 138 | 235 | 68.5 | 288.5
Height | 60.75 | 72.00 | 52.5 | 80.5
Body Mass Index | 22.8 | 31.9 | 17.85 | 36.65
^{1} Determined by Q_{1} - 1.5(Q_{3} - Q_{1})
^{2} Determined by Q_{3} + 1.5(Q_{3} - Q_{1})
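In Table 14, every observed minimum is above its lower limit and every observed maximum is below its upper limit, so there are no suspected outliers in the subsample of n=10 participants. The mean and standard deviation are therefore the most appropriate statistics to summarize average values and dispersion, respectively, of each of these characteristics.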
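The limits in Table 14 can be recomputed from the quartiles in Table 13. The sketch below does this in Python; the function name `tukey_fences` and the dictionary layout are illustrative assumptions, while the numbers are copied from the two tables.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower limit, upper limit) = (Q1 - k*IQR, Q3 + k*IQR)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# (Q1, Q3, observed minimum, observed maximum) copied from Tables 13 and 14.
subsample = {
    "Systolic Blood Pressure": (113.0, 127.0, 105, 141),
    "Diastolic Blood Pressure": (64.0, 77.0, 62, 81),
    "Total Serum Cholesterol": (163.0, 227.0, 150, 275),
    "Weight": (151.0, 206.0, 138, 235),
    "Height": (63.0, 70.0, 60.75, 72.00),
    "Body Mass Index": (24.9, 29.6, 22.8, 31.9),
}

suspected = {}
for name, (q1, q3, observed_min, observed_max) in subsample.items():
    lower, upper = tukey_fences(q1, q3)
    # If the minimum and maximum are inside the limits, every value is.
    suspected[name] = observed_min < lower or observed_max > upper
    print(f"{name}: limits ({lower:.2f}, {upper:.2f}), "
          f"suspected outliers: {suspected[name]}")
```

Checking only the minimum and maximum is enough here because any value outside the limits would also push the minimum or maximum outside them.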
The Full Framingham Cohort
For clarity, we have so far used a very small subset of the Framingham Offspring Cohort to illustrate calculations of summary statistics and determination of outliers. For your interest, Table 15 displays the means, standard deviations, medians, quartiles and interquartile ranges for each of the continuous variables displayed in Table 13 in the full sample (n=3,539) of participants who attended the seventh examination of the Framingham Offspring Study.
Table 15  Summary Statistics on Sample of (n=3,539) Participants
Characteristic | Mean | Standard Deviation (s) | Median | Q1 | Q3 | IQR
Systolic Blood Pressure | 127.3 | 19.0 | 125.0 | 114.0 | 138.0 | 24.0
Diastolic Blood Pressure | 74.0 | 9.9 | 74.0 | 67.0 | 80.0 | 13.0
Total Serum Cholesterol | 200.3 | 36.8 | 198.0 | 175.0 | 223.0 | 48.0
Weight | 174.4 | 38.7 | 170.0 | 146.0 | 198.0 | 52.0
Height | 65.957 | 3.749 | 65.750 | 63.000 | 68.750 | 5.75
Body Mass Index | 28.15 | 5.32 | 27.40 | 24.5 | 30.8 | 6.3
Table 16 displays the observed minimum and maximum values along with the limits to determine outliers using the quartile rule for each of the variables in the full sample (n=3,539).
Table 16  Limits for Assessing Outliers in Characteristics Presented in Table 15
Characteristic | Minimum | Maximum | Tukey Fences: Lower Limit^{1} | Tukey Fences: Upper Limit^{2}
Systolic Blood Pressure | 81.0 | 216.0 | 78 | 174
Diastolic Blood Pressure | 41.0 | 114.0 | 47.5 | 99.5
Total Serum Cholesterol | 83.0 | 357.0 | 103 | 295
Weight | 90.0 | 375.0 | 68.0 | 276.0
Height | 55.00 | 78.75 | 54.4 | 77.4
Body Mass Index | 15.8 | 64.0 | 15.05 | 40.25
^{1} Determined by Q_{1} - 1.5(Q_{3} - Q_{1})
^{2} Determined by Q_{3} + 1.5(Q_{3} - Q_{1})
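To read Table 16 against the quartile rule, one can compare each observed minimum and maximum with the corresponding limit. The following Python sketch does so using the values in Tables 15 and 16; the helper name `outlier_sides` and the "low"/"high" labels are assumptions made for illustration.

```python
# (Q1, Q3, observed minimum, observed maximum) copied from Tables 15 and 16.
full_sample = {
    "Systolic Blood Pressure": (114.0, 138.0, 81.0, 216.0),
    "Diastolic Blood Pressure": (67.0, 80.0, 41.0, 114.0),
    "Total Serum Cholesterol": (175.0, 223.0, 83.0, 357.0),
    "Weight": (146.0, 198.0, 90.0, 375.0),
    "Height": (63.0, 68.75, 55.00, 78.75),
    "Body Mass Index": (24.5, 30.8, 15.8, 64.0),
}

def outlier_sides(q1, q3, observed_min, observed_max, k=1.5):
    """List the sides ("low", "high") on which values fall outside the limits."""
    iqr = q3 - q1
    sides = []
    if observed_min < q1 - k * iqr:   # below the lower limit
        sides.append("low")
    if observed_max > q3 + k * iqr:   # above the upper limit
        sides.append("high")
    return sides

sides_by_variable = {name: outlier_sides(*row) for name, row in full_sample.items()}
for name, sides in sides_by_variable.items():
    print(f"{name}: {', '.join(sides) if sides else 'no suspected outliers'}")
```

With the tabled values, every characteristic has a maximum above its upper limit, and diastolic blood pressure and total serum cholesterol also have minimums below their lower limits. Under the guideline stated earlier, this suggests the median and interquartile range as the summaries for the full sample.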
