Switching samples in Mann-Whitney test changes p-value

Posted 6 months ago

I was putting together a demonstration of the Mann-Whitney test for students and found some weird behavior. Take two samples of 6 values each: the first is 1, 2, 3, 4, 5, 6 and the second is 7, 8, 9, 10, 11, 12.

data1 = {1, 2, 3, 4, 5, 6};
data2 = {7, 8, 9, 10, 11, 12};

Now perform the Mann-Whitney test on these datasets:

MannWhitneyTest[{data1, data2}]
MannWhitneyTest[{data2, data1}]

Surprisingly, these two tests return different p-values: the first returns 0.0030528 and the second returns 0.00507487.

The same is true for WolframAlpha: MannWhitneyTest[{data1, data2}], MannWhitneyTest[{data2, data1}]

Any ideas why the test implementation is not symmetrical?

8 Replies

Hi, just an update on this. The computation wasn't symmetric because the previously chosen convention was only asymptotically symmetric. I have now modified it so that the case AlternativeHypothesis->"Unequal" is symmetric for any Method. I have also added an "Exact" method that follows the original sources (even though they are missing boundary conditions for the derivation) and matches the tables in those sources. I have chosen a convention that is as consistent as possible with the preexisting "Asymptotic" computation. Unfortunately, it won't make it into 14.0.

POSTED BY: Eduardo Serna
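As a numerical aside on "only asymptotically symmetric": the two p-values reported in the question coincide with a two-sided normal approximation for U in which a half-unit shift is applied in opposite directions. The Python sketch below (standard library only) is an independent cross-check, not a description of the internal MannWhitneyTest code. That the shift is a continuity correction applied in a fixed direction is an assumption that merely happens to reproduce the numbers. The names `normal_sf`, `p_shift_toward_mean`, and `p_shift_away_from_mean` are made up for this illustration.

```python
import math


def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


n1 = n2 = 6
mean_u = n1 * n2 / 2.0                            # 18
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # sqrt(39), no ties in this data

# With completely separated samples, U is 0 for one ordering and 36 for the
# other, so the distance from the mean is 18 either way.
dist = abs(0 - mean_u)

# Assumption: a half-unit shift is applied, in one direction or the other.
p_no_shift = 2 * normal_sf(dist / sd_u)
p_shift_toward_mean = 2 * normal_sf((dist - 0.5) / sd_u)     # about 0.00507
p_shift_away_from_mean = 2 * normal_sf((dist + 0.5) / sd_u)  # about 0.00305

print(p_no_shift, p_shift_toward_mean, p_shift_away_from_mean)
```

The half-unit shift becomes negligible relative to the standard deviation of U as the samples grow, which is consistent with an asymmetry that only vanishes asymptotically.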

Thanks so much for this input, it helps a lot. I have added it to the test report.

POSTED BY: Eduardo Serna

There is the same problem with SignedRankTest.

By default, MannWhitneyTest uses asymptotic statistics, which are not appropriate for such small samples. Using the "Permutations" method instead, with a reasonable number of permutations (50, 100), I obtained a p-value of zero in both cases.

POSTED BY: Claude Mante
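A p-value of exactly zero from 50 or 100 random permutations is what one would usually expect here, because the exact two-sided p-value for completely separated samples of size 6 and 6 is about 0.002. The Python sketch below estimates how likely it is that no sampled permutation is as extreme as the observed data. It assumes the permutations are drawn independently and that the p-value is estimated as the plain fraction of extreme permutations; how the "Permutations" method actually works internally is not stated in this thread. The function name `prob_zero_estimate` is hypothetical.

```python
from math import comb

# Exact two-sided p-value for complete separation with 6 + 6 values:
# 2 extreme splits out of C(12, 6) equally likely splits.
p_exact = 2 / comb(12, 6)


def prob_zero_estimate(p, n_perm):
    """Probability that none of n_perm independent random permutations
    is at least as extreme as the observed one."""
    return (1.0 - p) ** n_perm


for n_perm in (50, 100, 1000, 10000):
    print(n_perm, round(prob_zero_estimate(p_exact, n_perm), 3))
```

With 50 or 100 permutations the estimate comes out as zero most of the time (roughly 90% and 80% of runs under these assumptions), so resolving a p-value of this size needs thousands of permutations or an exact enumeration.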

Yes, this does seem to be a bug. In the M-W test, the test statistic U is the smaller of U1 and U2, but the WL function seems to just take the value of U1, which is the incorrect value if the first group is larger than the second. So in your example, the first version with {data1, data2} gives the correct answer.

POSTED BY: Gareth Russell
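To make U1 and U2 concrete, here is a minimal Python sketch that computes both statistics for the data in the question. It uses one common convention (U for a sample counts the pairs in which that sample's value exceeds the other's, with ties counted as one half). Other texts define U through rank sums, which swaps the labels but not the minimum. The function name `u_statistics` is made up for this example.

```python
def u_statistics(x, y):
    """Return (U_x, U_y), where U_x counts pairs (a, b) with a from x and
    b from y such that a > b; ties contribute one half."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u_y = len(x) * len(y) - u_x  # U_x + U_y always equals n_x * n_y
    return u_x, u_y


data1 = [1, 2, 3, 4, 5, 6]
data2 = [7, 8, 9, 10, 11, 12]

forward = u_statistics(data1, data2)
swapped = u_statistics(data2, data1)

print(forward, min(forward))  # (0.0, 36.0) 0.0
print(swapped, min(swapped))  # (36.0, 0.0) 0.0
```

Swapping the samples exchanges U1 and U2, so any p-value based on min(U1, U2), or equivalently on the distance of U from its mean n1*n2/2, does not depend on the order of the arguments.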

Is there any way to report this bug? It has serious consequences. I can't imagine how many datasets from animal studies I've processed with M-W in Mathematica in the last 5 years... It will be a nightmare to revisit all that data, especially since a lot of it is already published. I'm shocked.

Thank you for reporting this behavior; I am currently investigating. The implementation is certainly intended to be symmetric, but something is wrong. I will try to get it fixed soon.

POSTED BY: Eduardo Serna

Thank you very much. One more request: since you will be fixing the MannWhitneyTest function, it would be great to also fix the behavior of Method->"Automatic". Currently it always calculates the p-value from the asymptotic normal distribution of U. For small samples (<20), however, an exact p-value could be calculated from permutations (not Monte Carlo, but a direct formula or table). For instance, in the reported case of data1 and data2, the exact p-value is the probability of the 2 extreme cases: all ranks of sample 1 below those of sample 2, or all ranks of sample 1 above those of sample 2. That probability is exactly 2(6!6!)/(6+6)! = 1/462 ≈ 0.00216. The asymptotic normal approximation for U gives 0.003053. This difference is not a big deal if a single M-W test is performed, as both results are significant. But if multiple comparisons are required, then multiple M-W tests are applied and the p-values are subsequently corrected by, for instance, the Holm–Bonferroni method. And here the difference between 0.002 and 0.003 might become important.
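The exact value 1/462 can be checked by brute force. The Python sketch below enumerates every way of splitting the 12 pooled values into groups of 6 and counts the splits whose U is at least as far from its mean as the observed one. It is an illustration under a stated assumption (two-sided p-value defined through the distance of U from n1*n2/2), not the "Exact" method added to MannWhitneyTest. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def doubled_u(xs, ys):
    """Twice the U statistic of xs against ys (ties score 1, wins score 2),
    kept as an integer to avoid rounding."""
    return sum(2 if a > b else 1 if a == b else 0 for a in xs for b in ys)


def exact_two_sided_p(x, y):
    """Exact permutation p-value; 'extreme' means at least as far from the
    mean of U as the observed split (assumed two-sided convention)."""
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    center = n1 * n2  # twice the mean of U
    observed = abs(doubled_u(x, y) - center)
    extreme = total = 0
    for idx in combinations(range(n1 + n2), n1):
        chosen = set(idx)
        xs = [pooled[i] for i in idx]
        ys = [pooled[i] for i in range(n1 + n2) if i not in chosen]
        total += 1
        if abs(doubled_u(xs, ys) - center) >= observed:
            extreme += 1
    return Fraction(extreme, total)


data1 = [1, 2, 3, 4, 5, 6]
data2 = [7, 8, 9, 10, 11, 12]

print(exact_two_sided_p(data1, data2))  # 1/462
print(exact_two_sided_p(data2, data1))  # 1/462
```

Only 924 splits are visited for two samples of 6, so full enumeration is cheap at these sizes, and the result is the same for either order of the arguments.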
