I thought it would be helpful to extend Sebastian's great solution just a bit for images.
First, make and export the net (using the example from the docs):
resource = ResourceObject["MNIST"];
trainingData = ResourceData[resource, "TrainingData"];
testData = ResourceData[resource, "TestData"];
lenet = NetChain[{
    ConvolutionLayer[20, 5], Ramp, PoolingLayer[2, 2],
    ConvolutionLayer[50, 5], Ramp, PoolingLayer[2, 2],
    FlattenLayer[], 500, Ramp, 10, SoftmaxLayer[]},
  "Output" -> NetDecoder[{"Class", Range[0, 9]}],
  "Input" -> NetEncoder[{"Image", {28, 28}, "Grayscale"}]
];
trained = NetTrain[lenet, trainingData, ValidationSet -> testData,
  MaxTrainingRounds -> 3]
Export["~/mnist.json", trained, "MXNet"]
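As an aside, the "MXNet" export writes a pair of files: the network graph as JSON (mnist.json) and the learned weights alongside it (mnist.params, which the Python code below loads). The JSON half describes the graph as a list of nodes; a minimal sketch of that layout, using a made-up two-layer fragment rather than the real exported file:

```python
import json

# Made-up fragment in the MXNet symbol-JSON layout; the real mnist.json is
# much larger but has the same shape. Nodes with op "null" are inputs and
# parameters rather than layers.
sym_json = """{"nodes": [
    {"op": "null",        "name": "Input",     "inputs": []},
    {"op": "null",        "name": "1.Weights", "inputs": []},
    {"op": "Convolution", "name": "1",         "inputs": [[0, 0], [1, 0]]},
    {"op": "Activation",  "name": "2",         "inputs": [[2, 0]]}
]}"""
# keep only the actual layers, skipping inputs/parameters
layers = [n["name"] for n in json.loads(sym_json)["nodes"] if n["op"] != "null"]
print(layers)
```

Inspecting the real file this way is a quick check that the layer names (and the "Input" name used below) survived the export.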
Then in Python, import the net, run it over the mnist test set, and measure the results:
import mxnet as mx
import numpy as np
sym = mx.symbol.load("/home/ubuntu/mnist.json")
nd = mx.nd.load("/home/ubuntu/mnist.params")
def eval(img):
    # manual 'NetEncoder' step: scale 0-255 pixels to [0, 1], invert,
    # and shape the flat vector into (batch, channel, 28, 28)
    mxnet_img = 1 - (img / 255.)
    mxnet_img = mxnet_img.reshape(28, 28)
    img_inputND = mx.nd.array([[mxnet_img]])
    nd["Input"] = img_inputND
    e = sym.bind(mx.cpu(), nd)
    out = e.forward()
    # manual 'NetDecoder' step: pick the most probable class
    return np.argmax(out[0].asnumpy())
# Now test it
from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')
results = []
for img in mnist.data[60000:]:
    results.append(eval(img))
expected = mnist.target[60000:]
from sklearn import metrics
print(metrics.classification_report(expected, results))
print("Confusion matrix:\n%s" % metrics.confusion_matrix(expected, results))
Which outputs
and, compared to ClassifierMeasurements on the Mathematica side, yields the same results, as expected:
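As a side note, classification_report gives per-class precision and recall; if you just want a single accuracy figure, closer to what ClassifierMeasurements reports with its "Accuracy" property, it is simply the fraction of matching predictions. A minimal sketch, with toy arrays standing in for the real expected/results built above:

```python
import numpy as np

# toy stand-ins for the expected/results arrays built above
expected = np.array([7, 2, 1, 0, 4, 1, 4, 9, 5, 9])
results = np.array([7, 2, 1, 0, 4, 1, 4, 9, 6, 9])
# elementwise comparison gives booleans; their mean is the accuracy
accuracy = np.mean(expected == results)
print(accuracy)  # 9 of the 10 toy predictions match -> 0.9
```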
NetEncoder and NetDecoder are among the best abstractions in the Wolfram neural network framework. But since we don't have them in MXNet land, I wanted to highlight how careful we must be to do that part by hand: accounting for image representation, colorspace, mean, image normalization, etc. I really think additional Python usage examples showing how to use MXNet models trained inside Mathematica are needed for more sophisticated layers and NetModels. Examples covering recurrent models involving GRUs, LSTMs, and attentional constructs would be especially welcome, mainly because I've built and trained some really cool things but haven't figured out how to use them in any real way, that is, to run them in Python.
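To make the hand-rolled encoder/decoder above a bit more reusable, here is a sketch of the two steps as standalone functions, mirroring what NetEncoder[{"Image", {28, 28}, "Grayscale"}] and NetDecoder[{"Class", Range[0, 9]}] do for this particular net. The inversion (1 - x) matches the eval code above; other nets will need different mean, scale, and colorspace handling, so treat this as a template rather than a general solution:

```python
import numpy as np

def encode(img_bytes):
    """Manual stand-in for NetEncoder[{"Image", {28, 28}, "Grayscale"}]:
    scale 0-255 pixels to [0, 1], invert, and shape to (batch, channel, h, w)."""
    arr = 1.0 - (np.asarray(img_bytes, dtype=np.float32) / 255.0)
    return arr.reshape(1, 1, 28, 28)

def decode(probabilities, classes=range(10)):
    """Manual stand-in for NetDecoder[{"Class", Range[0, 9]}]:
    pick the class with the highest probability."""
    return list(classes)[int(np.argmax(probabilities))]

# toy check: a flat 784-pixel "image" and a made-up probability vector
batch = encode(np.zeros(784))
print(batch.shape)  # should be the NCHW shape the bound executor expects
probs = [0.01] * 10
probs[3] = 0.91
print(decode(probs))  # the argmax class
```

With these in place, the eval function reduces to encode, a forward pass, and decode, which makes it easier to swap in a different net's preprocessing later.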