A Mann-Whitney's U test is also known as the Wilcoxon rank-sum test, and is basically a non-parametric version of the unpaired t test. You want to use a Mann-Whitney's U test when
So, you see people use a Mann-Whitney's U test when they have ordinal dependent variables or when they have only a small sample size (and thus cannot assume normality). However, a Mann-Whitney's U test still assumes equality of variances between the two groups.
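Although a Mann-Whitney's U test can be considered a non-parametric version of the t test, it is usually interpreted as comparing the medians of the two groups, not the means. Strictly speaking, it tests whether values in one group tend to be larger than values in the other, and this amounts to a comparison of medians when the two distributions have the same shape.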
Before looking at an example of a Mann-Whitney's U test, let's take a look at what the test does. The key point is that it treats the data as ordinal: you can order the values, but the distance between two adjacent values is not assumed to be the same everywhere on the scale. So instead of using the values as-is, a Mann-Whitney's U test calculates the rank of each value. Let's think about some data from a 5-point Likert scale question and say you have the following data.
|Group A||1||3||2||4||2|
|Group B||3||5||5||2||4|
Then, you assign a rank (R) to each value across both groups, from the smallest to the largest. So,
|Group A||1 (R1)||3 (R6)||2 (R2)||4 (R7)||2 (R4)|
|Group B||3 (R5)||5 (R9)||5 (R10)||2 (R3)||4 (R8)|
For now, I just broke the ties arbitrarily. But obviously this may cause a problem if we want to do a fair statistical test. One thing we can do is to give tied values the average of the ranks they occupy. For instance, the value 2 occupies ranks 2, 3, and 4 in this example. Instead of deciding which data point gets the higher rank, we give all three the average of those ranks, which is 3. Likewise, the two 3s share rank 5.5, the two 4s share rank 7.5, and the two 5s share rank 9.5. Thus, with this correction, this example becomes
|Group A||1 (R1)||3 (R5.5)||2 (R3)||4 (R7.5)||2 (R3)|
|Group B||3 (R5.5)||5 (R9.5)||5 (R9.5)||2 (R3)||4 (R7.5)|
The mean ranks of Group A and Group B are 4.0 and 7.0, respectively. The null hypothesis of a Mann-Whitney's U test is that the samples of both groups came from the same population. Intuitively, if the null hypothesis holds, there should be no difference in the mean ranks between the two groups, because both groups have the same chance of receiving low and high ranks. Thus, if the mean ranks differ enough, you can say that you have a significant effect.
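As a quick check of the table above, here is a minimal R sketch that reproduces the tie-averaged ranks and the mean ranks. It relies on base R's `rank()`, which averages tied ranks by default; the variable names are only for illustration.

```r
# Likert responses from the example table above
a <- c(1, 3, 2, 4, 2)   # Group A
b <- c(3, 5, 5, 2, 4)   # Group B

# Rank the pooled values; ties receive the average of the ranks they occupy
r <- rank(c(a, b))
rank_a <- r[seq_along(a)]
rank_b <- r[length(a) + seq_along(b)]

rank_a        # 1.0 5.5 3.0 7.5 3.0
rank_b        # 5.5 9.5 9.5 3.0 7.5
mean(rank_a)  # 4
mean(rank_b)  # 7
```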
Please remember that this is not exactly what a Mann-Whitney test does. It calculates a statistic called the U value. The U value for each group is calculated by subtracting the minimum rank sum the group could possibly have, n(n+1)/2 for a group of n samples, from the group's actual rank sum, and the smaller of the two U values is used for the test. The distribution of the standardized U value is known to be close to the standard normal distribution when the sample size is more than 20. Thus, if the observed standardized U value is far from the center of that distribution (= 0), the test will reject the null hypothesis.
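The arithmetic described above can be sketched in R for the ten-value example. This is only an illustration with made-up variable names. The standardization at the end uses the plain normal-approximation formula without a tie correction, so its value is only approximate for data with ties (and five samples per group is far too few for the approximation to be trusted).

```r
a <- c(1, 3, 2, 4, 2)   # Group A from the example table
b <- c(3, 5, 5, 2, 4)   # Group B from the example table
n_a <- length(a)
n_b <- length(b)

r <- rank(c(a, b))                    # pooled ranks, ties averaged
sum_a <- sum(r[seq_len(n_a)])         # rank sum of Group A: 20
sum_b <- sum(r[n_a + seq_len(n_b)])   # rank sum of Group B: 35

# Subtract the smallest rank sum each group could have, n(n+1)/2
u_a <- sum_a - n_a * (n_a + 1) / 2    # 20 - 15 = 5
u_b <- sum_b - n_b * (n_b + 1) / 2    # 35 - 15 = 20
u <- min(u_a, u_b)                    # 5; u_a + u_b always equals n_a * n_b

# Standardize U under the null hypothesis (no tie correction)
mu <- n_a * n_b / 2
sigma <- sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
z <- (u - mu) / sigma
z
```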
The calculation of the effect size of a Mann-Whitney's U test is fairly easy: divide the Z value by the square root of the sample size.
$$r = \frac{Z}{\sqrt{N}}$$
where Z is the standardized test statistic and N is the total number of the samples. Here are the standard values of r for small, medium, and large sizes. The sign does not contain much information, so we often just report the absolute value of r.
|small size||medium size||large size|
|0.1||0.3||0.5|
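As a small illustration of this effect size, the sketch below computes r as the absolute Z value divided by the square root of N and attaches a rough label using the conventional thresholds of 0.1, 0.3, and 0.5. The function name `effect_size_r`, the input numbers, and the "negligible" label for values below 0.1 are assumptions made for this example.

```r
# Effect size r from a standardized statistic z and total sample size n
effect_size_r <- function(z, n) abs(z) / sqrt(n)

# Hypothetical input: z = -2.11 with 20 samples in total
r <- effect_size_r(-2.11, 20)
round(r, 2)  # 0.47

# Rough label based on the conventional thresholds
label <- cut(r, breaks = c(0, 0.1, 0.3, 0.5, Inf),
             labels = c("negligible", "small", "medium", "large"),
             right = FALSE)
as.character(label)  # "medium"
```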
Let's prepare the data. Create data like the results from a 5-point Likert scale question (the response is 1, 2, 3, 4, or 5), with two groups (Group) to compare.
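The data listing itself is not shown here, so the following is only a stand-in: hypothetical responses for ten participants per group, chosen so that the medians match the 2.5 and 3.5 quoted in the report at the end of this section. The vector names `GroupA` and `GroupB`, the group size, and the values are all assumptions.

```r
# Hypothetical 5-point Likert responses (assumed values, 10 per group)
GroupA <- c(2, 4, 3, 1, 2, 3, 3, 2, 3, 1)
GroupB <- c(3, 5, 4, 2, 4, 3, 5, 5, 3, 2)

median(GroupA)  # 2.5
median(GroupB)  # 3.5
```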
Then, do Mann-Whitney's U test.
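One way to run the test in base R is `wilcox.test()`; this is a sketch rather than a reproduction of the original listing. The two vectors are the same hypothetical stand-in data, repeated so that the block runs on its own.

```r
# Hypothetical 5-point Likert responses (assumed values, 10 per group)
GroupA <- c(2, 4, 3, 1, 2, 3, 3, 2, 3, 1)
GroupB <- c(3, 5, 4, 2, 4, 3, 5, 5, 3, 2)

# The data contain ties, so R warns that an exact p value cannot be computed
res <- wilcox.test(GroupA, GroupB)
res

# W is the U value of the first group: its rank sum minus n(n+1)/2
res$statistic  # W = 23 for these data
```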
And you get the result.
However, as you can see here, the exact p value cannot be calculated because of ties. This step is still necessary because it gives you the U value (reported as “W” in the results), and it is not straightforward to recover the U value from the Z value, particularly when the sample size is small. The Z value, in turn, is what you need for calculating the effect size. Now I will show you how to calculate the Z value and the exact p value.
Then, do another Mann-Whitney test, this time with the coin package. You have to reformat the data for the coin version of the test.
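A minimal sketch of this step, assuming the coin package is installed: `wilcox_test()` takes a formula with a numeric response and a grouping factor, so the two vectors are stacked into one data frame first. The data are the same hypothetical stand-in values, and the column names `Value` and `Group` are assumptions.

```r
library(coin)  # assumes the coin package is installed

# Hypothetical 5-point Likert responses (assumed values, 10 per group)
GroupA <- c(2, 4, 3, 1, 2, 3, 3, 2, 3, 1)
GroupB <- c(3, 5, 4, 2, 4, 3, 5, 5, 3, 2)

# Long format: one row per response, with the group as a factor
d <- data.frame(
  Value = c(GroupA, GroupB),
  Group = factor(rep(c("A", "B"), times = c(length(GroupA), length(GroupB))))
)

# Exact conditional test, which copes with ties
res <- wilcox_test(Value ~ Group, data = d, distribution = "exact")
res

statistic(res)  # standardized Z value
pvalue(res)     # exact p value
```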
Now you get another result.
Thus, we have a significant effect of Group. You can also calculate the mean rank for each group as follows.
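The mean ranks can be obtained by ranking the pooled responses and averaging within each group. The sketch below uses the same hypothetical stand-in data; with these values it gives the mean ranks quoted in the report at the end of this section.

```r
# Hypothetical 5-point Likert responses (assumed values, 10 per group)
GroupA <- c(2, 4, 3, 1, 2, 3, 3, 2, 3, 1)
GroupB <- c(3, 5, 4, 2, 4, 3, 5, 5, 3, 2)

# Rank all 20 responses together (ties averaged), then split by group
r <- rank(c(GroupA, GroupB))
mean_rank_a <- mean(r[seq_along(GroupA)])
mean_rank_b <- mean(r[length(GroupA) + seq_along(GroupB)])

mean_rank_a  # 7.8
mean_rank_b  # 13.2
```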
And calculate the effect size.
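The effect size then follows from the Z value of the coin test divided by the square root of the total number of samples. This is again a sketch on the hypothetical stand-in data, assuming the coin package is installed; the column and variable names are illustrative.

```r
library(coin)  # assumes the coin package is installed

# Hypothetical 5-point Likert responses (assumed values, 10 per group)
GroupA <- c(2, 4, 3, 1, 2, 3, 3, 2, 3, 1)
GroupB <- c(3, 5, 4, 2, 4, 3, 5, 5, 3, 2)
d <- data.frame(
  Value = c(GroupA, GroupB),
  Group = factor(rep(c("A", "B"), times = c(length(GroupA), length(GroupB))))
)

res <- wilcox_test(Value ~ Group, data = d, distribution = "exact")

z <- as.numeric(statistic(res))  # standardized Z value
n <- nrow(d)                     # total number of samples (20)
r <- abs(z) / sqrt(n)            # report the absolute value
round(r, 2)                      # 0.47
```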
You can report the results of Mann-Whitney's U test as follows:
The medians of Group A and Group B were 2.5 and 3.5, respectively. We ran a Mann-Whitney's U test to evaluate the difference in the responses to our 5-point Likert scale question. We found a significant effect of Group (the mean ranks of Group A and Group B were 7.8 and 13.2, respectively; U = 23, Z = -2.11, p < 0.05, r = 0.47).
For the effect size, please see: Field, A. Discovering statistics using SPSS. (2nd edition).