Wilcoxon Signed Rank Test
Another popular nonparametric test for matched or paired data is called the Wilcoxon Signed Rank Test. Like the Sign Test, it is based on difference scores, but in addition to analyzing the signs of the differences, it also takes into account the magnitude of the observed differences.
Let's use the Wilcoxon Signed Rank Test to re-analyze the data in Example 4 on page 5 of this module. Recall that this study assessed the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with autism. A total of 8 children with autism enroll in the study, and the amount of time that each child is engaged in repetitive behavior during three-hour observation periods is measured both before treatment and again after taking the new medication for 1 week. The data are shown below.
Child | Before Treatment | After 1 Week of Treatment |
---|---|---|
1 | 85 | 75 |
2 | 70 | 50 |
3 | 40 | 50 |
4 | 65 | 40 |
5 | 80 | 20 |
6 | 75 | 65 |
7 | 55 | 40 |
8 | 20 | 25 |
First, we compute difference scores for each child.
Child | Before Treatment | After 1 Week of Treatment | Difference (Before-After) |
---|---|---|---|
1 | 85 | 75 | 10 |
2 | 70 | 50 | 20 |
3 | 40 | 50 | -10 |
4 | 65 | 40 | 25 |
5 | 80 | 20 | 60 |
6 | 75 | 65 | 10 |
7 | 55 | 40 | 15 |
8 | 20 | 25 | -5 |
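The difference scores above can be reproduced with a short Python sketch. The variable names are illustrative and not part of the original module.

```python
# Repetitive-behavior times from the table above, in child order 1..8.
before = [85, 70, 40, 65, 80, 75, 55, 20]
after = [75, 50, 50, 40, 20, 65, 40, 25]

# Difference = Before - After, so a positive value means less
# repetitive behavior after treatment.
differences = [b - a for b, a in zip(before, after)]
print(differences)
```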
The next step is to rank the difference scores. We first order the absolute values of the difference scores and assign ranks from 1 through n to the smallest through largest absolute values. When there are ties in the absolute values, each tied value receives the mean of the ranks it would otherwise occupy.
Observed Differences | Ordered Absolute Values of Differences | Ranks |
---|---|---|
10 | -5 | 1 |
20 | 10 | 3 |
-10 | -10 | 3 |
25 | 10 | 3 |
60 | 15 | 5 |
10 | 20 | 6 |
15 | 25 | 7 |
-5 | 60 | 8 |
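The mean-rank rule for ties can be sketched as follows. The helper `mean_ranks` is a hypothetical name introduced for illustration. It returns the ranks in the original child order rather than in sorted order.

```python
def mean_ranks(values):
    """Rank values 1..n from smallest to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end (0-based) occupy ranks start+1..end+1.
        shared = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


differences = [10, 20, -10, 25, 60, 10, 15, -5]
ranks = mean_ranks([abs(d) for d in differences])
print(ranks)
```

The three differences with absolute value 10 would occupy ranks 2, 3, and 4, so each receives the mean rank 3.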
The final step is to attach the signs ("+" or "-") of the observed differences to each rank as shown below.
Observed Differences | Ordered Absolute Values of Difference Scores | Ranks | Signed Ranks |
---|---|---|---|
10 | -5 | 1 | -1 |
20 | 10 | 3 | 3 |
-10 | -10 | 3 | -3 |
25 | 10 | 3 | 3 |
60 | 15 | 5 | 5 |
10 | 20 | 6 | 6 |
15 | 25 | 7 | 7 |
-5 | 60 | 8 | 8 |
Similar to the Sign Test, hypotheses for the Wilcoxon Signed Rank Test concern the population median of the difference scores. The research hypothesis can be one- or two-sided. Here we consider a one-sided test.
H0: The median difference is zero versus
H1: The median difference is positive
α=0.05
Test Statistic for the Wilcoxon Signed Rank Test
The test statistic for the Wilcoxon Signed Rank Test is W, defined as the smaller of W+ (sum of the positive ranks) and W- (sum of the negative ranks). If the null hypothesis is true, we expect to see similar numbers of lower and higher ranks that are both positive and negative (i.e., W+ and W- would be similar). If the research hypothesis is true we expect to see more higher and positive ranks (in this example, more children with substantial improvement in repetitive behavior after treatment as compared to before, i.e., W+ much larger than W-).
In this example, W+ = 32 and W- = 4. Recall that the sum of the ranks (ignoring the signs) will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 8(9)/2 = 36 which is equal to 32+4. The test statistic is W = 4.
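The signed ranks, W+, W-, and the n(n+1)/2 check can be sketched in Python as below. The function name `signed_ranks` is illustrative. The sketch assumes there are no zero differences, which holds for this example; how to treat zeros is not covered here.

```python
from collections import defaultdict


def signed_ranks(differences):
    """Attach the sign of each difference to the mean rank of its absolute value."""
    abs_sorted = sorted(abs(d) for d in differences)
    positions = defaultdict(list)
    for pos, value in enumerate(abs_sorted, start=1):
        positions[value].append(pos)
    # Tied absolute values share the mean of the positions they occupy.
    mean_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [mean_rank[abs(d)] if d > 0 else -mean_rank[abs(d)] for d in differences]


differences = [10, 20, -10, 25, 60, 10, 15, -5]
sr = signed_ranks(differences)

w_plus = sum(r for r in sr if r > 0)    # sum of the positive ranks
w_minus = -sum(r for r in sr if r < 0)  # sum of the negative ranks, as a positive number
w = min(w_plus, w_minus)                # test statistic

n = len(differences)
assert w_plus + w_minus == n * (n + 1) / 2  # check on the rank assignment
print(w_plus, w_minus, w)
```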
Next we must determine whether the observed test statistic W supports the null or research hypothesis. This is done following the same approach used in parametric testing. Specifically, we determine a critical value of W such that if the observed value of W is less than or equal to the critical value, we reject H0 in favor of H1, and if the observed value of W exceeds the critical value, we do not reject H0.
Table of Critical Values of W
The critical value of W can be found in the table below:
To determine the appropriate one-sided critical value, we need the sample size (n=8) and our one-sided level of significance (α=0.05). For this example, the critical value of W is 6, and the decision rule is to reject H0 if W ≤ 6. Thus, we reject H0, because 4 ≤ 6. We have statistically significant evidence at α=0.05 to show that the median difference is positive (i.e., that repetitive behavior improves).
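As a cross-check on the table lookup, the one-sided p-value can be computed by enumeration. This is a sketch under one assumption: under H0 each rank is equally likely to carry a "+" or "-" sign, and the observed ranks (including the tied ranks of 3) are held fixed. It is not the method used to build the table of critical values.

```python
from itertools import product

ranks = [1, 3, 3, 3, 5, 6, 7, 8]  # ranks of the absolute differences in this example
w_minus_observed = 4              # observed sum of the negative ranks

count = 0  # sign patterns with W- at least as small as observed
total = 0  # all equally likely sign patterns under H0 (2**8 of them)
for signs in product([0, 1], repeat=len(ranks)):
    # A 1 marks a rank that carries a negative sign in this pattern.
    w_minus = sum(r for r, s in zip(ranks, signs) if s == 1)
    total += 1
    if w_minus <= w_minus_observed:
        count += 1

p_one_sided = count / total
print(count, total, p_one_sided)
```

Under these assumptions the p-value falls below 0.05, which agrees with the decision to reject H0.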
Note that when we analyzed the data previously using the Sign Test, we failed to find statistical significance. However, when we use the Wilcoxon Signed Rank Test, we conclude that the treatment results in a statistically significant improvement at α=0.05. The results differ because the Sign Test uses only the signs of the differences, not their magnitudes, and is therefore a less powerful test.
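For comparison, here is a rough sketch of a one-sided Sign Test p-value on the same differences. The earlier Sign Test analysis is not reproduced in this section, so the calculation below is an assumption: it uses a Binomial(n, 0.5) model for the number of negative signs.

```python
from math import comb

differences = [10, 20, -10, 25, 60, 10, 15, -5]

n_pos = sum(1 for d in differences if d > 0)
n_neg = sum(1 for d in differences if d < 0)
n = n_pos + n_neg  # zero differences would be excluded; there are none here

# One-sided p-value: probability of n_neg or fewer negative signs
# when each sign is "+" or "-" with probability 0.5.
p_sign = sum(comb(n, k) for k in range(n_neg + 1)) / 2 ** n
print(n_pos, n_neg, p_sign)
```

Only the counts of positive and negative signs enter this calculation, so the one very large improvement (60) counts the same as a small one (10).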
Example:
A study is run to evaluate the effectiveness of an exercise program in reducing systolic blood pressure in patients with pre-hypertension (defined as a systolic blood pressure between 120-139 mmHg or a diastolic blood pressure between 80-89 mmHg). A total of 15 patients with pre-hypertension enroll in the study, and their systolic blood pressures are measured. Each patient then participates in an exercise training program where they learn proper techniques and execution of a series of exercises. Patients are instructed to do the exercise program 3 times per week for 6 weeks. After 6 weeks, systolic blood pressures are again measured. The data are shown below.
Patient | Systolic Blood Pressure Before Exercise Program | Systolic Blood Pressure After Exercise Program |
---|---|---|
1 | 125 | 118 |
2 | 132 | 134 |
3 | 138 | 130 |
4 | 120 | 124 |
5 | 125 | 105 |
6 | 127 | 130 |
7 | 136 | 130 |
8 | 139 | 132 |
9 | 131 | 123 |
10 | 132 | 128 |
11 | 135 | 126 |
12 | 136 | 140 |
13 | 128 | 135 |
14 | 127 | 126 |
15 | 130 | 132 |
Is there a difference in systolic blood pressures after participating in the exercise program as compared to before?
- Step 1. Set up hypotheses and determine level of significance.
H0: The median difference is zero versus
H1: The median difference is not zero
α=0.05
- Step 2. Select the appropriate test statistic.
The test statistic for the Wilcoxon Signed Rank Test is W, defined as the smaller of W+ and W- which are the sums of the positive and negative ranks, respectively.
- Step 3. Set up the decision rule.
The critical value of W can be found in the table of critical values. To determine the appropriate critical value from Table 7, we need the sample size (n=15) and our two-sided level of significance (α=0.05). The critical value for this two-sided test with n=15 and α=0.05 is 25, and the decision rule is as follows: Reject H0 if W ≤ 25.
- Step 4. Compute the test statistic.
Because the before and after systolic blood pressure measures are paired, we compute difference scores for each patient.
Patient | Systolic Blood Pressure Before Exercise Program | Systolic Blood Pressure After Exercise Program | Difference (Before-After) |
---|---|---|---|
1 | 125 | 118 | 7 |
2 | 132 | 134 | -2 |
3 | 138 | 130 | 8 |
4 | 120 | 124 | -4 |
5 | 125 | 105 | 20 |
6 | 127 | 130 | -3 |
7 | 136 | 130 | 6 |
8 | 139 | 132 | 7 |
9 | 131 | 123 | 8 |
10 | 132 | 128 | 4 |
11 | 135 | 126 | 9 |
12 | 136 | 140 | -4 |
13 | 128 | 135 | -7 |
14 | 127 | 126 | 1 |
15 | 130 | 132 | -2 |
The next step is to rank the ordered absolute values of the difference scores using the approach outlined in Section 10.1. Specifically, we assign ranks from 1 through n to the smallest through largest absolute values of the difference scores, respectively, and assign the mean rank when there are ties in the absolute values of the difference scores.
Observed Differences | Ordered Absolute Values of Differences | Ranks |
---|---|---|
7 | 1 | 1 |
-2 | -2 | 2.5 |
8 | -2 | 2.5 |
-4 | -3 | 4 |
20 | -4 | 6 |
-3 | -4 | 6 |
6 | 4 | 6 |
7 | 6 | 8 |
8 | -7 | 10 |
4 | 7 | 10 |
9 | 7 | 10 |
-4 | 8 | 12.5 |
-7 | 8 | 12.5 |
1 | 9 | 14 |
-2 | 20 | 15 |
The final step is to attach the signs ("+" or "-") of the observed differences to each rank as shown below.
Observed Differences | Ordered Absolute Values of Differences | Ranks | Signed Ranks |
---|---|---|---|
7 | 1 | 1 | 1 |
-2 | -2 | 2.5 | -2.5 |
8 | -2 | 2.5 | -2.5 |
-4 | -3 | 4 | -4 |
20 | -4 | 6 | -6 |
-3 | -4 | 6 | -6 |
6 | 4 | 6 | 6 |
7 | 6 | 8 | 8 |
8 | -7 | 10 | -10 |
4 | 7 | 10 | 10 |
9 | 7 | 10 | 10 |
-4 | 8 | 12.5 | 12.5 |
-7 | 8 | 12.5 | 12.5 |
1 | 9 | 14 | 14 |
-2 | 20 | 15 | 15 |
In this example, W+ = 89 and W- = 31. Recall that the sum of the ranks (ignoring the signs) will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 15(16)/2 = 120 which is equal to 89 + 31. The test statistic is W = 31.
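The full calculation for the blood pressure data, from difference scores to W, can be sketched in Python as below. The helper name `signed_ranks` is illustrative, and the sketch assumes no zero differences, which holds for these 15 patients.

```python
from collections import defaultdict

before = [125, 132, 138, 120, 125, 127, 136, 139, 131, 132, 135, 136, 128, 127, 130]
after = [118, 134, 130, 124, 105, 130, 130, 132, 123, 128, 126, 140, 135, 126, 132]
differences = [b - a for b, a in zip(before, after)]  # Before - After


def signed_ranks(diffs):
    """Mean rank of each absolute difference, carrying the sign of the difference."""
    positions = defaultdict(list)
    for pos, value in enumerate(sorted(abs(d) for d in diffs), start=1):
        positions[value].append(pos)
    # Tied absolute values share the mean of the positions they occupy.
    mean_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [mean_rank[abs(d)] if d > 0 else -mean_rank[abs(d)] for d in diffs]


sr = signed_ranks(differences)
w_plus = sum(r for r in sr if r > 0)
w_minus = -sum(r for r in sr if r < 0)
w = min(w_plus, w_minus)

n = len(differences)
assert w_plus + w_minus == n * (n + 1) / 2  # 15(16)/2 = 120
print(w_plus, w_minus, w)
```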
- Step 5. Conclusion.
We do not reject H0 because 31 > 25. Therefore, we do not have statistically significant evidence at α=0.05 to show that the median difference in systolic blood pressures is not zero (i.e., that there is a significant difference in systolic blood pressures after the exercise program as compared to before).
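As a rough cross-check on this conclusion, a two-sided p-value can be computed by counting sign patterns. The sketch assumes that under H0 each rank is equally likely to be positive or negative, with the observed (tied) ranks held fixed. Ranks are doubled so that the half-ranks become integers. This is an illustration, not the method behind the table of critical values.

```python
ranks = [1, 2.5, 2.5, 4, 6, 6, 6, 8, 10, 10, 10, 12.5, 12.5, 14, 15]
w_observed = 31  # smaller of W+ and W-

# Double every rank so all sums are integers.
doubled = [int(2 * r) for r in ranks]
max_sum = sum(doubled)

# ways[s] = number of sign patterns whose negative ranks sum to s / 2.
ways = [0] * (max_sum + 1)
ways[0] = 1
for r in doubled:
    # Go downward so each rank is used at most once per pattern.
    for s in range(max_sum, r - 1, -1):
        ways[s] += ways[s - r]

total = 2 ** len(ranks)
lower_tail = sum(ways[: 2 * w_observed + 1]) / total  # P(W- <= 31)

# The null distribution is symmetric, so the two-sided p-value doubles one tail.
p_two_sided = min(1.0, 2 * lower_tail)
print(round(p_two_sided, 4))
```

Under these assumptions the p-value exceeds 0.05, which agrees with the decision not to reject H0.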