I ran the matrix multiply on M8 and M10 (M9was uninstalled).
M10 is marginally faster, but repeating the tests a few times shows that the best M8 run can beat the worst M10 run.
Code attached. Chrome freezes every time I insert code, so I'll show a sample picture.
I ran both Timing and AbsoluteTiming, with matrix sizes 1050 and 2050.
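For anyone who can't open the attachment, here is a rough sketch of the kind of test I mean. It is not the attached code itself: the random real matrices, the use of Dot, the variable names a, b and n, and the Print format are all assumptions chosen to match the output lines below.

```mathematica
(* Hypothetical reconstruction; the attached notebook may differ. *)
(* Assumes dense random real matrices multiplied with Dot. *)
Do[
  a = RandomReal[{0, 1}, {n, n}];
  b = RandomReal[{0, 1}, {n, n}];
  (* Timing reports CPU time used by the kernel, in seconds *)
  Print[$Version, " Time = ", First[Timing[a . b;]]];
  (* AbsoluteTiming reports elapsed wall-clock time, in seconds *)
  Print[$Version, " (Abs)Time = ", First[AbsoluteTiming[a . b;]]],
  {n, {1050, 2050}}]
```

The trailing semicolon inside Timing and AbsoluteTiming suppresses the large result matrix, so only the time is kept. Running the loop a couple of times in each version gives lines in the same format as the results below.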
10.0 for Microsoft Windows (64-bit) (June 29, 2014) Time = 0.374402
10.0 for Microsoft Windows (64-bit) (June 29, 2014) Time = 0.436803
8.0 for Microsoft Windows (64-bit) (October 7, 2011) Time = 0.406
8.0 for Microsoft Windows (64-bit) (October 7, 2011) Time = 0.468
10.0 for Microsoft Windows (64-bit) (June 29, 2014) (Abs)Time = 0.456026
10.0 for Microsoft Windows (64-bit) (June 29, 2014) (Abs)Time = 0.493028
8.0 for Microsoft Windows (64-bit) (October 7, 2011) (Abs)Time = 0.5020287
8.0 for Microsoft Windows (64-bit) (October 7, 2011) (Abs)Time = 0.5210298
10.0 for Microsoft Windows (64-bit) (June 29, 2014) Time = 2.714417
10.0 for Microsoft Windows (64-bit) (June 29, 2014) Time = 2.839218
8.0 for Microsoft Windows (64-bit) (October 7, 2011) Time = 2.855
8.0 for Microsoft Windows (64-bit) (October 7, 2011) Time = 2.949
10.0 for Microsoft Windows (64-bit) (June 29, 2014) (Abs)Time = 2.938168
10.0 for Microsoft Windows (64-bit) (June 29, 2014) (Abs)Time = 3.058175
8.0 for Microsoft Windows (64-bit) (October 7, 2011) (Abs)Time = 3.1111780
8.0 for Microsoft Windows (64-bit) (October 7, 2011) (Abs)Time = 3.1811819
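To make the overlap concrete, the first four Time values from the listing above can be compared directly. This is only a quick check on the posted numbers, and the names m10 and m8 are just labels I picked for the two versions.

```mathematica
(* First group of Time values, copied from the listing above *)
m10 = {0.374402, 0.436803};
m8 = {0.406, 0.468};

(* Does the best M8 run beat the worst M10 run? *)
Min[m8] < Max[m10]   (* True for these four values *)

(* Ratio of mean times; below 1 means M10 is faster on average *)
Mean[m10]/Mean[m8]
```

With only two runs per version this says little about the real spread, so more repetitions would be needed before reading much into the difference.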