Wilcoxon Signed Rank Test


Another popular nonparametric test for matched or paired data is called the Wilcoxon Signed Rank Test. Like the Sign Test, it is based on difference scores, but in addition to analyzing the signs of the differences, it also takes into account the magnitude of the observed differences.

Let's use the Wilcoxon Signed Rank Test to re-analyze the data in Example 4 on page 5 of this module. Recall that this study assessed the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with autism. A total of 8 children with autism enroll in the study, and the amount of time that each child is engaged in repetitive behavior during three-hour observation periods is measured both before treatment and again after taking the new medication for a period of 1 week. The data are shown below.

| Child | Before Treatment | After 1 Week of Treatment |
|---|---|---|
| 1 | 85 | 75 |
| 2 | 70 | 50 |
| 3 | 40 | 50 |
| 4 | 65 | 40 |
| 5 | 80 | 20 |
| 6 | 75 | 65 |
| 7 | 55 | 40 |
| 8 | 20 | 25 |

First, we compute difference scores for each child.  

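| Child | Before Treatment | After 1 Week of Treatment | Difference (Before - After) |
|---|---|---|---|
| 1 | 85 | 75 | 10 |
| 2 | 70 | 50 | 20 |
| 3 | 40 | 50 | -10 |
| 4 | 65 | 40 | 25 |
| 5 | 80 | 20 | 60 |
| 6 | 75 | 65 | 10 |
| 7 | 55 | 40 | 15 |
| 8 | 20 | 25 | -5 |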
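The difference scores can be reproduced with a few lines of Python. This is only a sketch; the list names `before`, `after`, and `differences` are illustrative and do not come from the module.

```python
# Time engaged in repetitive behavior for the 8 children, copied from the table.
before = [85, 70, 40, 65, 80, 75, 55, 20]
after = [75, 50, 50, 40, 20, 65, 40, 25]

# Difference score = before - after; a positive value means less
# repetitive behavior after treatment.
differences = [b - a for b, a in zip(before, after)]
print(differences)  # [10, 20, -10, 25, 60, 10, 15, -5]
```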

The next step is to rank the difference scores. We first order the absolute values of the difference scores and assign ranks from 1 through n to the smallest through largest absolute values, assigning the mean rank when there are ties in the absolute values of the difference scores. Here, for example, three differences have an absolute value of 10 and occupy positions 2, 3 and 4, so each receives the mean rank of 3.

| Observed Differences | Ordered Absolute Values of Differences | Ranks |
|---|---|---|
| 10 | -5 | 1 |
| 20 | 10 | 3 |
| -10 | -10 | 3 |
| 25 | 10 | 3 |
| 60 | 15 | 5 |
| 10 | 20 | 6 |
| 15 | 25 | 7 |
| -5 | 60 | 8 |
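The mean-rank rule for ties can be sketched as a small helper. The function name `average_ranks` is hypothetical, and this is one possible way to implement the rule, not necessarily how any particular statistics package does it. Ranks are returned in the original child order rather than in sorted order.

```python
def average_ranks(values):
    """Rank values from smallest to largest, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + 1 + end + 1) / 2  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

differences = [10, 20, -10, 25, 60, 10, 15, -5]
abs_ranks = average_ranks([abs(d) for d in differences])
print(abs_ranks)  # [3.0, 6.0, 3.0, 7.0, 8.0, 3.0, 5.0, 1.0]
```

The three children whose differences have absolute value 10 each receive rank 3.0, matching the table.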

The final step is to attach the signs ("+" or "-") of the observed differences to each rank as shown below.

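| Observed Differences | Ordered Absolute Values of Difference Scores | Ranks | Signed Ranks |
|---|---|---|---|
| 10 | -5 | 1 | -1 |
| 20 | 10 | 3 | 3 |
| -10 | -10 | 3 | -3 |
| 25 | 10 | 3 | 3 |
| 60 | 15 | 5 | 5 |
| 10 | 20 | 6 | 6 |
| 15 | 25 | 7 | 7 |
| -5 | 60 | 8 | 8 |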
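Attaching the signs amounts to copying the sign of each difference onto its rank. In the sketch below the ranks are typed in by hand in child order (an assumption made to keep the example short), and no difference is zero, as in this data set.

```python
differences = [10, 20, -10, 25, 60, 10, 15, -5]
ranks = [3, 6, 3, 7, 8, 3, 5, 1]  # ranks of |difference|, in the same child order

# A positive difference keeps its rank; a negative difference gets a negative rank.
signed_ranks = [r if d > 0 else -r for d, r in zip(differences, ranks)]
print(signed_ranks)  # [3, 6, -3, 7, 8, 3, 5, -1]

# Sorting by absolute value reproduces the "Signed Ranks" column of the table.
print(sorted(signed_ranks, key=abs))  # [-1, 3, -3, 3, 5, 6, 7, 8]
```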

Similar to the Sign Test, hypotheses for the Wilcoxon Signed Rank Test concern the population median of the difference scores. The research hypothesis can be one- or two-sided. Here we consider a one-sided test.

 

H0: The median difference is zero, versus

H1: The median difference is positive, with α=0.05.

Test Statistic for the Wilcoxon Signed Rank Test

The test statistic for the Wilcoxon Signed Rank Test is W, defined as the smaller of W+ (sum of the positive ranks) and W- (sum of the negative ranks). If the null hypothesis is true, we expect to see similar numbers of lower and higher ranks that are both positive and negative (i.e., W+ and W- would be similar). If the research hypothesis is true we expect to see more higher and positive ranks (in this example, more children with substantial improvement in repetitive behavior after treatment as compared to before, i.e., W+ much larger than W-).

In this example, W+ = 32 and W- = 4. Recall that the sum of the ranks (ignoring the signs) will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 8(9)/2 = 36 which is equal to 32+4. The test statistic is W = 4.
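The sums and the n(n+1)/2 check can be written out directly. This is a minimal sketch using the signed ranks from the table; the variable names are illustrative.

```python
signed_ranks = [-1, 3, -3, 3, 5, 6, 7, 8]
n = len(signed_ranks)

w_plus = sum(r for r in signed_ranks if r > 0)    # sum of the positive ranks
w_minus = sum(-r for r in signed_ranks if r < 0)  # sum of the negative ranks, sign dropped

# The ranks, ignoring signs, must add up to n(n+1)/2.
assert w_plus + w_minus == n * (n + 1) // 2

w = min(w_plus, w_minus)
print(w_plus, w_minus, w)  # 32 4 4
```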

Next we must determine whether the observed test statistic W supports the null or research hypothesis. This is done following the same approach used in parametric testing. Specifically, we determine a critical value of W such that if the observed value of W is less than or equal to the critical value, we reject H0 in favor of H1, and if the observed value of W exceeds the critical value, we do not reject H0.

Table of Critical Values of W

The critical value of W can be found in the table below:

 


To determine the appropriate one-sided critical value we need the sample size (n=8) and our one-sided level of significance (α=0.05). For this example, the critical value of W is 6 and the decision rule is to reject H0 if W < 6. Thus, we reject H0, because 4 < 6. We have statistically significant evidence at α=0.05 to show that the median difference is positive (i.e., that repetitive behavior improves).

Note that when we analyzed the data previously using the Sign Test, we failed to find statistical significance. However, when we use the Wilcoxon Signed Rank Test, we conclude that the treatment results in a statistically significant improvement at α=0.05. The discrepant results arise because the Sign Test uses only the signs of the differences and ignores their magnitudes, which makes it a less powerful test.
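To see the contrast numerically, the sketch below computes a one-sided Sign Test p-value for the same data. It assumes the usual binomial form of the Sign Test (each sign equally likely under H0); the calculation on the earlier page of the module is not reproduced in this section, so treat this as an illustration rather than that page's exact procedure.

```python
from math import comb

n = 8          # children, none with a zero difference
negatives = 2  # children 3 and 8 have negative differences

# One-sided p-value: probability of 2 or fewer negative signs out of 8
# when positive and negative signs are equally likely.
p_value = sum(comb(n, k) for k in range(negatives + 1)) / 2 ** n
print(p_value)  # 0.14453125
```

Under this assumption the p-value exceeds 0.05, which agrees with the Sign Test's failure to reach significance.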

Example:

A study is run to evaluate the effectiveness of an exercise program in reducing systolic blood pressure in patients with pre-hypertension (defined as a systolic blood pressure between 120-139 mmHg or a diastolic blood pressure between 80-89 mmHg). A total of 15 patients with pre-hypertension enroll in the study, and their systolic blood pressures are measured. Each patient then participates in an exercise training program where they learn proper techniques and execution of a series of exercises. Patients are instructed to do the exercise program 3 times per week for 6 weeks. After 6 weeks, systolic blood pressures are again measured. The data are shown below. 

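| Patient | Systolic Blood Pressure Before Exercise Program | Systolic Blood Pressure After Exercise Program |
|---|---|---|
| 1 | 125 | 118 |
| 2 | 132 | 134 |
| 3 | 138 | 130 |
| 4 | 120 | 124 |
| 5 | 125 | 105 |
| 6 | 127 | 130 |
| 7 | 136 | 130 |
| 8 | 139 | 132 |
| 9 | 131 | 123 |
| 10 | 132 | 128 |
| 11 | 135 | 126 |
| 12 | 136 | 140 |
| 13 | 128 | 135 |
| 14 | 127 | 126 |
| 15 | 130 | 132 |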

Is there a difference in systolic blood pressures after participating in the exercise program as compared to before?

H0: The median difference is zero, versus

H1: The median difference is not zero, with α=0.05.

The test statistic for the Wilcoxon Signed Rank Test is W, defined as the smaller of W+ and W- which are the sums of the positive and negative ranks, respectively.  

The critical value of W can be found in the table of critical values. To determine the appropriate critical value from Table 7 we need sample size (n=15) and our two-sided level of significance (α=0.05). The critical value for this two-sided test with n=15 and α=0.05 is 25 and the decision rule is as follows: Reject H0 if W < 25.

Because the before and after systolic blood pressure measures are paired, we compute difference scores for each patient.


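| Patient | Systolic Blood Pressure Before Exercise Program | Systolic Blood Pressure After Exercise Program | Difference (Before - After) |
|---|---|---|---|
| 1 | 125 | 118 | 7 |
| 2 | 132 | 134 | -2 |
| 3 | 138 | 130 | 8 |
| 4 | 120 | 124 | -4 |
| 5 | 125 | 105 | 20 |
| 6 | 127 | 130 | -3 |
| 7 | 136 | 130 | 6 |
| 8 | 139 | 132 | 7 |
| 9 | 131 | 123 | 8 |
| 10 | 132 | 128 | 4 |
| 11 | 135 | 126 | 9 |
| 12 | 136 | 140 | -4 |
| 13 | 128 | 135 | -7 |
| 14 | 127 | 126 | 1 |
| 15 | 130 | 132 | -2 |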

The next step is to rank the ordered absolute values of the difference scores using the approach outlined in Section 10.1. Specifically, we assign ranks from 1 through n to the smallest through largest absolute values of the difference scores, respectively, and assign the mean rank when there are ties in the absolute values of the difference scores.  

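| Observed Differences | Ordered Absolute Values of Differences | Ranks |
|---|---|---|
| 7 | 1 | 1 |
| -2 | -2 | 2.5 |
| 8 | -2 | 2.5 |
| -4 | -3 | 4 |
| 20 | -4 | 6 |
| -3 | -4 | 6 |
| 6 | 4 | 6 |
| 7 | 6 | 8 |
| 8 | -7 | 10 |
| 4 | 7 | 10 |
| 9 | 7 | 10 |
| -4 | 8 | 12.5 |
| -7 | 8 | 12.5 |
| 1 | 9 | 14 |
| -2 | 20 | 15 |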

The final step is to attach the signs ("+" or "-") of the observed differences to each rank as shown below. 

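| Observed Differences | Ordered Absolute Values of Differences | Ranks | Signed Ranks |
|---|---|---|---|
| 7 | 1 | 1 | 1 |
| -2 | -2 | 2.5 | -2.5 |
| 8 | -2 | 2.5 | -2.5 |
| -4 | -3 | 4 | -4 |
| 20 | -4 | 6 | -6 |
| -3 | -4 | 6 | -6 |
| 6 | 4 | 6 | 6 |
| 7 | 6 | 8 | 8 |
| 8 | -7 | 10 | -10 |
| 4 | 7 | 10 | 10 |
| 9 | 7 | 10 | 10 |
| -4 | 8 | 12.5 | 12.5 |
| -7 | 8 | 12.5 | 12.5 |
| 1 | 9 | 14 | 14 |
| -2 | 20 | 15 | 15 |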

In this example, W+ = 89 and W- = 31. Recall that the sum of the ranks (ignoring the signs) will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 15(16)/2 = 120 which is equal to 89 + 31. The test statistic is W = 31.

We do not reject H0 because 31 > 25. Therefore, we do not have statistically significant evidence at α=0.05 to show that the median difference in systolic blood pressures is not zero (i.e., that there is a significant difference in systolic blood pressures after the exercise program as compared to before).
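The whole procedure for this example can be sketched end to end in Python. The function name `wilcoxon_signed_rank` is hypothetical, the code assumes there are no zero differences (true for these data), and the critical value of 25 is taken from the text rather than computed.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W+, W-, W) for paired data, using mean ranks for tied |differences|.

    Assumes no difference is exactly zero.
    """
    diffs = [b - a for b, a in zip(before, after)]
    abs_sorted = sorted(abs(d) for d in diffs)
    # Mean rank of each distinct absolute value: the average of its 1-based positions.
    mean_rank = {}
    for v in set(abs_sorted):
        first = abs_sorted.index(v) + 1
        count = abs_sorted.count(v)
        mean_rank[v] = first + (count - 1) / 2
    w_plus = sum(mean_rank[abs(d)] for d in diffs if d > 0)
    w_minus = sum(mean_rank[abs(d)] for d in diffs if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)

before = [125, 132, 138, 120, 125, 127, 136, 139, 131, 132, 135, 136, 128, 127, 130]
after = [118, 134, 130, 124, 105, 130, 130, 132, 123, 128, 126, 140, 135, 126, 132]

w_plus, w_minus, w = wilcoxon_signed_rank(before, after)
critical_value = 25  # from the text: n=15, two-sided alpha=0.05

print(w_plus, w_minus, w)  # 89.0 31.0 31.0
print(w < critical_value)  # False, so H0 is not rejected
```

The comparison uses the strict inequality exactly as the decision rule is written for this example.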