WOLFRAM COMMUNITY

GROUPS:

Mathematics Wolfram|Alpha Wolfram Language Statistics and Probability

Switching samples in Mann-Whitney test changes p-value

Evgeniy Evtushenko

Posted 1 year ago

I was preparing a demonstration of the Mann-Whitney test for students and found some odd behavior. Take two samples of six values each: the first is 1, 2, 3, 4, 5, 6 and the second is 7, 8, 9, 10, 11, 12.

    data1 = {1, 2, 3, 4, 5, 6};
    data2 = {7, 8, 9, 10, 11, 12};

Now perform the Mann-Whitney test on these datasets:

    MannWhitneyTest[{data1, data2}]
    MannWhitneyTest[{data2, data1}]

Surprisingly, the two calls return different p-values: the first returns 0.0030528, the second 0.00507487.

The same is true for WolframAlpha: MannWhitneyTest[{data1, data2}], MannWhitneyTest[{data2, data1}]

Any ideas why the test implementation is not symmetric?

POSTED BY: Evgeniy Evtushenko

8 Replies

Eduardo Serna, Wolfram

Posted 1 year ago

Hi, just an update on this. The computation wasn't symmetric because the previously chosen convention was only asymptotically symmetric. I have now modified it so that the case `AlternativeHypothesis->"Unequal"` is symmetric for any `Method`. I have also added an `"Exact"` method that follows the original sources (even though their derivation is missing boundary conditions) and matches the tables in those sources. I have chosen a convention that is as consistent as possible with the preexisting `"Asymptotic"` computation. Unfortunately, it won't make it into 14.0.
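One convention that is only asymptotically symmetric is a continuity correction applied in a fixed direction. The Python sketch below is an independent check outside Mathematica, not the actual implementation. It assumes that U counts the pairs in which a value from the first sample exceeds a value from the second, that the normal approximation is used without a tie correction, and that the old convention always subtracted 1/2 from U minus its mean; the function names are made up for illustration.

```python
from math import erfc, sqrt

def u_statistic(x, y):
    """Count pairs (a, b) with a from x, b from y and a > b; ties count 1/2."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)

def normal_p(x, y, symmetric):
    """Two-sided normal-approximation p-value for U (no tie correction)."""
    n1, n2 = len(x), len(y)
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    diff = u_statistic(x, y) - mean
    if symmetric:
        # Shrink |U - mean| by 1/2, whichever side of the mean U falls on.
        z = max(abs(diff) - 0.5, 0.0) / sd
    else:
        # Assumed fixed-direction convention: always subtract 1/2.
        z = (diff - 0.5) / sd
    return erfc(abs(z) / sqrt(2))

data1 = [1, 2, 3, 4, 5, 6]
data2 = [7, 8, 9, 10, 11, 12]

for first, second, label in [(data1, data2, "data1, data2"),
                             (data2, data1, "data2, data1")]:
    print(label,
          "| fixed shift:", round(normal_p(first, second, symmetric=False), 7),
          "| symmetric:", round(normal_p(first, second, symmetric=True), 7))
```

Under these assumptions the fixed shift gives roughly 0.00305 for one order and 0.00507 for the other, which agrees with the two values in the original post, while the symmetric variant returns the same value for both orders. The shift of 1/2 becomes negligible relative to the standard deviation as the samples grow, which is why such a convention is symmetric only asymptotically.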

POSTED BY: Eduardo Serna

Eduardo Serna, Wolfram

Posted 1 year ago

Thanks so much for this input; it helps a lot. I have added it to the test report.

POSTED BY: Eduardo Serna

Evgeniy Evtushenko

Posted 1 year ago

Thank you very much. One more request: since you are fixing the MannWhitneyTest function, it would also be great to fix the behavior of Method->"Automatic". Currently it always calculates the p-value from the asymptotic normal distribution for U. For small samples (<20), however, an exact p-value could be calculated from permutations (not Monte Carlo, but a direct formula or table). For instance, in the reported case of data1 and data2, the exact p-value is the probability of the 2 extreme cases: all ranks of sample 1 below those of sample 2, or all ranks of sample 1 above those of sample 2. That is exactly 2(6!6!)/(6+6)! = 1/462 ≈ 0.00216. The asymptotic normal approximation for U gives 0.003053.

This difference is not a big deal if a single M-W test is performed, as both results are significant. But if multiple comparisons are required, multiple M-W tests are applied and the p-values are then corrected by, for instance, the Holm–Bonferroni method. There the difference between 0.002 and 0.003 might become important.
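The exact value quoted above can be checked by brute force. The following Python sketch is an independent check, not what MannWhitneyTest does internally: it enumerates every way of splitting the 12 pooled values into two groups of 6 and counts the splits whose U is at least as far from its null mean as the observed one. The function names, and the choice of 20 comparisons for the Holm–Bonferroni illustration, are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def twice_u(x, y):
    """Return 2*U, where U counts pairs a > b (a from x, b from y); ties add 1/2."""
    return sum(2 * (a > b) + (a == b) for a in x for b in y)

def exact_two_sided_p(x, y):
    """Share of all splits whose U is at least as far from the null mean as observed."""
    pooled = list(x) + list(y)
    n1, n = len(x), len(x) + len(y)
    centre = n1 * (n - n1)  # twice the null mean n1*n2/2
    observed = abs(twice_u(x, y) - centre)
    hits = total = 0
    for idx in combinations(range(n), n1):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        hits += abs(twice_u(first, second) - centre) >= observed
    return Fraction(hits, total)

data1 = [1, 2, 3, 4, 5, 6]
data2 = [7, 8, 9, 10, 11, 12]
p_exact = exact_two_sided_p(data1, data2)
print(p_exact, float(p_exact))

# Holm-Bonferroni compares the smallest p-value with alpha / m first.
alpha, m = 0.05, 20  # m = 20 comparisons is a hypothetical choice
first_holm_threshold = alpha / m
print("exact passes:", float(p_exact) <= first_holm_threshold)
print("asymptotic passes:", 0.003053 <= first_holm_threshold)
```

With 20 comparisons the first Holm–Bonferroni threshold is 0.0025, so the exact value 1/462 would pass while the asymptotic value 0.003053 would not. Full enumeration is only practical for small samples, since the number of splits grows combinatorially.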

POSTED BY: Evgeniy Evtushenko

Eduardo Serna, Wolfram

Posted 1 year ago

Thank you for reporting this behavior; I am currently investigating. The implementation is certainly intended to be symmetric, but something is wrong. I will try to get it fixed soon.

POSTED BY: Eduardo Serna

Gareth Russell, New Jersey Institute of Technology

Posted 1 year ago

That sucks! https://support.wolfram.com/12507?src=mathematica

POSTED BY: Gareth Russell

Claude Mante, Retired

Posted 1 year ago

There is the same problem with SignedRankTest. By default, MannWhitneyTest uses asymptotic statistics, which is not appropriate for such small samples. Using the "Permutations" method instead, with a reasonable number of permutations (50, 100), I obtained a p-value of zero in both cases.
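A zero from 50 or 100 random permutations is what one would expect if the exact two-sided p-value is 1/462, as computed elsewhere in this thread. The Python sketch below is a back-of-the-envelope calculation, not a description of how the "Permutations" method works internally: it assumes the permutations are drawn independently and uniformly, and the function names are illustrative.

```python
def prob_no_extreme(p_exact, n_perm):
    """Chance that none of n_perm independent random permutations is as extreme as the data."""
    return (1 - p_exact) ** n_perm

def add_one_estimate(hits, n_perm):
    """Common (hits + 1) / (n_perm + 1) estimator, which can never return zero."""
    return (hits + 1) / (n_perm + 1)

p_exact = 1 / 462  # exact two-sided p-value quoted in this thread

for n_perm in (50, 100):
    print(n_perm, "permutations:",
          "P(estimate is 0) =", round(prob_no_extreme(p_exact, n_perm), 3),
          "| add-one estimate with 0 hits =", round(add_one_estimate(0, n_perm), 4))
```

Under these assumptions the plain hit-count estimate is zero about 90% of the time with 50 permutations and about 80% of the time with 100, so a reported zero says little beyond "smaller than roughly 1/100". Resolving a p-value near 0.002 needs several thousand random permutations or an exact enumeration.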

POSTED BY: Claude Mante

Evgeniy Evtushenko

Posted 1 year ago

Is there any way to report this bug? It is extremely consequential. I can't imagine how many datasets from animal studies I've processed with M-W in Mathematica in the last 5 years... It will be a nightmare to revisit all the data once again, considering that a lot of it is already published. I'm shocked.

POSTED BY: Evgeniy Evtushenko

Gareth Russell, New Jersey Institute of Technology

Posted 1 year ago

Yes, this does seem to be a bug. In the M-W test, the test statistic U is the smaller of U1 and U2, but the WL function seems to just take the value of U1, which is the incorrect value if the first group has the larger values. So in your example, the first version with {data1, data2} gives the correct answer.
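For reference, U1 and U2 can be computed from rank sums, and the smaller of the two does not depend on the order of the samples. The Python sketch below illustrates the textbook definitions only and makes no claim about what the WL function computes; it assumes there are no tied values, and the function name is made up for illustration.

```python
def rank_sum_u(x, y):
    """Return (U1, U2) from the rank sum of x in the pooled sample (assumes no ties)."""
    pooled = sorted(list(x) + list(y))
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # ranks start at 1
    n1, n2 = len(x), len(y)
    r1 = sum(rank[value] for value in x)
    u1 = r1 - n1 * (n1 + 1) // 2  # pairs in which a value of x exceeds a value of y
    u2 = n1 * n2 - u1             # U1 + U2 always equals n1 * n2
    return u1, u2

data1 = [1, 2, 3, 4, 5, 6]
data2 = [7, 8, 9, 10, 11, 12]

print(rank_sum_u(data1, data2), min(rank_sum_u(data1, data2)))
print(rank_sum_u(data2, data1), min(rank_sum_u(data2, data1)))
```

Swapping the samples swaps U1 and U2, so a two-sided test built on min(U1, U2), or equivalently on the distance of either U from n1*n2/2, gives the same result for both orders.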

POSTED BY: Gareth Russell
