MNIST task solved with Wolfram Mathematica - Accuracy of 96.31%

Posted 7 years ago

While learning data science, I solved the MNIST task with Wolfram Mathematica using a very simple method: each test digit is compared pixel by pixel with every training digit, and it receives the label of the training digit with the smallest total absolute pixel difference (in effect, a 1-nearest-neighbor classifier). There are 10 classes, 0 through 9 (the target value), and each digit is a 28x28 pixel matrix (784 pixel values). The original dataset has 60,000 examples in the training set and 10,000 examples in the test set. For comparison, a linear separator achieves 93% accuracy. The whole calculation took 3 hours.

Here is the code:

Clear["Global`*"]
SetDirectory[$UserDocumentsDirectory];

<<JLink`;
InstallJava[];
ReinstallJava[JVMArguments->"-Xmx2048m"]

(* TRAIN SET *)
a0=Import["mnist_train.csv"];
{Dimensions[a0],Dimensions[Flatten[a0]]}
(* Out: {{60000,785},{47100000}} *)

(* TEST SET *)
a=Import["mnist_test.csv"];
{Dimensions[a],Dimensions[Flatten[a]]}
(* Out: {{10000,785},{7850000}} *)

t0=a0[[#]][[1]]&/@Table[k,{k,1,Dimensions[a0][[1]],1}];
b0=Drop[a0[[#]],1]&/@Table[k,{k,1,Dimensions[a0][[1]],1}];
o0=Partition[b0[[#]],28]&/@Table[k,{k,1,Dimensions[a0][[1]],1}];

(* TEST SET: labels t, flattened pixel rows b, 28x28 matrices o *)
t=a[[#]][[1]]&/@Table[k,{k,1,Dimensions[a][[1]],1}];
b=Drop[a[[#]],1]&/@Table[k,{k,1,Dimensions[a][[1]],1}];
o=Partition[b[[#]],28]&/@Table[k,{k,1,Dimensions[a][[1]],1}];
(* t2: total absolute pixel difference between the first test image and every training image; t3: position(s) of the closest training image *)
t2=Total[Abs[Flatten[o[[1]]]-Flatten[o0[[#]]]]]&/@Table[k,{k,1,Dimensions[o0][[1]],1}];
t3=Flatten[Position[t2,Min[t2]]];
(* f[x] repeats that lookup for test image x; t55 keeps the first position when several training images tie; tt holds the predicted labels *)
f[x_]:=Flatten[Position[Total[Abs[Flatten[o[[x]]]-Flatten[o0[[#]]]]]&/@Table[k,{k,1,Dimensions[o0][[1]],1}],Min[Total[Abs[Flatten[o[[x]]]-Flatten[o0[[#]]]]]&/@Table[k,{k,1,Dimensions[o0][[1]],1}]]]];
t5=f/@Table[k,{k,1,Dimensions[t][[1]],1}];
t55=Flatten[If[Dimensions[t5[[#]]][[1]]>1,t5[[#]][[1]],t5[[#]]]&/@Table[k,{k,1,Dimensions[t5][[1]],1}]];
tt=t0[[#]]&/@Flatten[t55];

(* ACCURACY: fraction of test labels that match the predicted labels *)
N[Count[t-tt,0]/Dimensions[t][[1]]]
(* Out: 0.9631 *)
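The classification rule used above can be seen in isolation on a toy example. The following is only a minimal sketch, not the code that produced the 96.31% figure: the four-pixel "images" and the names trainVectors, trainLabels, testVectors, testLabels, l1 and classify are made up for illustration, and FirstPosition assumes Mathematica 10 or later.

```mathematica
(* Toy stand-in for the MNIST arrays (hypothetical data): each row is a flattened image *)
trainVectors = {{0, 0, 0, 0}, {255, 255, 255, 255}, {0, 255, 0, 255}};
trainLabels = {0, 1, 2};
testVectors = {{10, 0, 5, 0}, {250, 255, 240, 255}, {0, 200, 10, 255}};
testLabels = {0, 1, 2};

(* Sum of absolute pixel differences between two flattened images *)
l1[u_, v_] := Total[Abs[u - v]];

(* Label of the closest training image; the distance list is computed once,
   and FirstPosition keeps the first index when several images tie *)
classify[x_] := With[{d = (l1[x, #] &) /@ trainVectors},
   trainLabels[[First[FirstPosition[d, Min[d]]]]]];

predicted = classify /@ testVectors   (* {0, 1, 2} *)

(* Same accuracy measure as in the post: share of zero differences *)
accuracy = N[Count[testLabels - predicted, 0]/Length[testLabels]]
```

Computing the distance list once per test image, rather than once for Position and once more for Min, is a design choice of this sketch; on the full data it should roughly halve the work of the lookup, though that has not been timed here.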

(* SAMPLE OF 1,000 OUTPUTS: true label, test image, nearest training image *)
{t[[#]],MatrixPlot[o[[#]],ImageSize->80],MatrixPlot[o0[[t55[[#]]]],ImageSize->80]}&/@Table[k,{k,1,1000,1}]

Examples of the output are shown below: the test-set digit is on the left and the predicted digit (its nearest training image) is on the right.

[Image: MNIST task]

[Image: MNIST task 2]

POSTED BY: Rubens Zimbres
3 Replies

That's quite good for a method that is fairly simple. Alternatives of varying complexity appear in this prior Community post. The biggest bottleneck I found when working on this was actually a preprocessing step that sharpened the images (needed, it seems, in that it slightly improved recognition). The dimension reduction and lookup steps were actually quite fast.

POSTED BY: Daniel Lichtblau

I see Java Link being loaded as:

<<JLink`

but I do not see it used explicitly in the rest of the WL code. How is JLink relevant to the task?

POSTED BY: Sam Carrettie

Nice result. Thanks for sharing! I have a feeling the Wolfram Language (Mathematica) is not your primary programming language.

For example:

t0=a0[[#]][[1]]&/@Table[k,{k,1,Dimensions[a0][[1]],1}];
b0=Drop[a0[[#]],1]&/@Table[k,{k,1,Dimensions[a0][[1]],1}];
o0=Partition[b0[[#]],28]&/@Table[k,{k,1,Dimensions[a0][[1]],1}];

can be simplified to:

t0 = a0[[All, 1]]
b0 = a0[[All, 2 ;;]]
o0 = Partition[#, 28] & /@ b0[[;; Length[a0]]]

In general:

Table[k,{k,1,x,1}]

is simply:

Range[x]

and

Dimensions[x][[1]]
Dimensions[x,1]
Length[x]

All give the same number (although Dimensions[x,1] returns it wrapped in a list, {n}), but the last one is much more legible.
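These equivalences are easy to check on a small matrix. The sketch below uses a made-up 3x5 matrix m (not the MNIST data) whose first column plays the role of the label; the name m and its values are assumptions for illustration only.

```mathematica
(* Hypothetical 3x5 matrix: first column is the "label", the other four are "pixels" *)
m = {{7, 0, 0, 1, 4}, {2, 5, 5, 0, 6}, {9, 3, 3, 8, 1}};

(* Index lists: Table with an explicit step versus Range *)
Table[k, {k, 1, Length[m], 1}] === Range[Length[m]]          (* True *)

(* Number of rows *)
Dimensions[m][[1]] === Length[m]                              (* True *)
Dimensions[m, 1]                                              (* {3}: a one-element list *)

(* Column extraction: mapping over row indices versus Part with All and Span *)
(m[[#]][[1]] & /@ Range[Length[m]]) === m[[All, 1]]           (* True *)
(Drop[m[[#]], 1] & /@ Range[Length[m]]) === m[[All, 2 ;;]]    (* True *)

(* Reshape each row of "pixels" into a 2x2 matrix, as Partition[#, 28] does for MNIST *)
Partition[#, 2] & /@ m[[All, 2 ;;]]
```

Each comparison uses SameQ (===), so it returns True only when the two forms produce identical expressions.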

POSTED BY: Sander Huisman