While preparing a demonstration of the Mann-Whitney test for students, I ran into some strange behavior. Take two samples of six values each: the first is 1, 2, 3, 4, 5, 6 and the second is 7, 8, 9, 10, 11, 12.
data1 = {1, 2, 3, 4, 5, 6};
data2 = {7, 8, 9, 10, 11, 12};
Now perform the Mann-Whitney test on these datasets, once in each order:
MannWhitneyTest[{data1, data2}]
MannWhitneyTest[{data2, data1}]
Surprisingly, these two tests return different p-values: the first returns 0.0030528 and the second returns 0.00507487.
The same happens in WolframAlpha for both MannWhitneyTest[{data1, data2}] and MannWhitneyTest[{data2, data1}].
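For comparison, here is why I expected the order not to matter, worked by hand with the textbook definition of the U statistic. This is only the usual definition, assumed for illustration; I do not know what MannWhitneyTest computes internally. With $n_1 = n_2 = 6$, no ties, and every value of data1 below every value of data2:

$$
U_1 = \#\{(x, y) : x \in \text{data1},\ y \in \text{data2},\ x > y\} = 0,
\qquad
U_2 = n_1 n_2 - U_1 = 36 .
$$

Under the null hypothesis all $\binom{12}{6} = 924$ ways of assigning the ranks to the first sample are equally likely, and exactly one of them produces each extreme, so the exact two-sided p-value would be

$$
p = \frac{2}{\binom{12}{6}} = \frac{2}{924} \approx 0.00216 .
$$

Swapping the samples only swaps $U_1$ and $U_2$, so by this reasoning the p-value should be the same in either order.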
Any ideas why the test implementation is not symmetric with respect to the order of the samples?