I often take networks I train and convert them to run in other frameworks and environments, but for the first time I'm trying to do this with Mathematica and I'm running into trouble. The process of converting an MXNet model (a pair of -symbol.json and .params files) into Apple's CoreML format has worked for me many times, but is now failing on networks exported from Mathematica.
I have a trained NetGraph that Mathematica successfully exported into MXNet format:

Here are the exported network's files: mynet.json and mynet.params (on dropbox).
We then install this tool from the Apache MXNet project for converting MXNet models into CoreML models:
pip install mxnet-to-coreml
Before running the tool we need to rename the two files slightly from what Mathematica names them on export:
RenameFile["mynet.json", "mynet-symbol.json"];
RenameFile["mynet.params", "mynet-0000.params"];
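The renaming is needed because mx.model.load_checkpoint, which the converter calls, looks for files named after a prefix and an epoch number. A minimal Python sketch of that naming convention (checkpoint_paths is a hypothetical helper for illustration, not part of mxnet):

```python
def checkpoint_paths(prefix, epoch):
    """Return the (symbol, params) file names expected for a given
    model prefix and epoch under the MXNet checkpoint convention."""
    symbol_file = "%s-symbol.json" % prefix
    params_file = "%s-%04d.params" % (prefix, epoch)
    return symbol_file, params_file


# The prefix and epoch used later on the converter command line.
print(checkpoint_paths("mynet", 0))
```

So --model-prefix='mynet' together with --epoch=0 should resolve to exactly the two renamed files.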
Now we can run the conversion utility with the following command:
mxnet_coreml_converter.py --model-prefix='mynet' --epoch=0 --input-shape='{"Input":"3,299,299"}' --mode=regressor --output-file="mynet.mlmodel"
The arguments respectively indicate:
- The base name of the model's two files
- The epoch number, matching the 0000 in the renamed params file
- The model takes a 299x299 color image as input
- The model produces numeric regression values
- The resulting converted model filename
And sadly, it fails with this output:
[15:49:14] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v0.9.5. Attempting to upgrade...
[15:49:14] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!
Traceback (most recent call last):
File "/usr/local/bin/mxnet_coreml_converter.py", line 106, in <module>
label_names=label_names
File "/usr/local/lib/python2.7/site-packages/converter/utils.py", line 55, in load_model
sym, arg_params, aux_params = mx.model.load_checkpoint(model_name, epoch_num)
File "/usr/local/lib/python2.7/site-packages/mxnet/model.py", line 425, in load_checkpoint
tp, name = k.split(':', 1)
ValueError: need more than 1 value to unpack
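The failing line, tp, name = k.split(':', 1), suggests that at least one key in the params file contains no colon, whereas load_checkpoint appears to expect keys of the form arg:<name> or aux:<name>. A minimal sketch for spotting such keys; find_unprefixed_keys is a hypothetical helper and the example key names are made up for illustration:

```python
def find_unprefixed_keys(keys):
    """Return the keys that are not of the form 'arg:<name>' or 'aux:<name>'."""
    bad = []
    for k in keys:
        parts = k.split(":", 1)
        if len(parts) != 2 or parts[0] not in ("arg", "aux"):
            bad.append(k)
    return bad


# Made-up key names, only to show the check in action.
example_keys = ["arg:conv0_weight", "aux:bn0_moving_mean", "conv1_weight"]
print(find_unprefixed_keys(example_keys))

# With mxnet installed, the real keys could be inspected along these lines
# (assuming mx.nd.load returns a dict for this file):
#   import mxnet as mx
#   saved = mx.nd.load("mynet-0000.params")
#   print(find_unprefixed_keys(list(saved.keys())))
```

If the real file turns out to contain unprefixed keys, that would explain the ValueError independently of anything in the symbol file.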
I thought a developer in the community might have some helpful insights into the nuances of exporting, or might have seen similar issues in the past. Here are some possible ideas:
- The json file contains
"mxnet_version":["int",905]
Does that mean I need to downgrade my mxnet from v0.11 to something else?
- Perhaps we can use some internal functions like NeuralNetworks`ToMXJSON, which would make the load_checkpoint method work?
- Perhaps there's a way of tweaking the exported symbol file or the arguments to the conversion utility?
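On the first idea: the loader log above reports v0.9.5 for a file whose version field is 905, which is consistent with the version being packed as major*10000 + minor*100 + patch. A small sketch under that assumption (decode_mxnet_version is a hypothetical helper):

```python
def decode_mxnet_version(packed):
    """Unpack an integer version, assuming major*10000 + minor*100 + patch."""
    major, rest = divmod(packed, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


print(decode_mxnet_version(905))   # value found in the exported json file
print(decode_mxnet_version(1100))  # what v0.11.0 would be under this packing
```

Since the log also says the symbol was successfully upgraded, the version field by itself may not be what breaks load_checkpoint.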
Once I get this working I'll post a tutorial.
Updates
It looks like we need to manually edit the symbol file to change the activation layers from entries like this
{"op":"relu","name":".Nodes.1.Nodes.conv_Activation$0","attr":{},"inputs":[[7,0,0]]}
to
{"op":"Activation","name":".Nodes.1.Nodes.conv_Activation$0","attr":{},"inputs":[[7,0,0]], "param": {"act_type": "relu"}}
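Doing this by hand is tedious for a large network, so here is a rough Python sketch of the same rewrite. It assumes the symbol file is a JSON object with a top-level "nodes" list, and rewrite_relu_nodes is a hypothetical helper; the small graph below is a stand-in, not the real exported network:

```python
import json


def rewrite_relu_nodes(graph):
    """Turn every {"op": "relu"} node into an Activation node with
    act_type "relu", editing the graph in place. Returns the number
    of nodes that were changed."""
    changed = 0
    for node in graph.get("nodes", []):
        if node.get("op") == "relu":
            node["op"] = "Activation"
            node.setdefault("param", {})["act_type"] = "relu"
            changed += 1
    return changed


# Tiny stand-in graph with made-up node names.
graph = {
    "nodes": [
        {"op": "null", "name": "Input", "inputs": []},
        {"op": "relu", "name": "act0", "attr": {}, "inputs": [[0, 0, 0]]},
    ]
}
print(rewrite_relu_nodes(graph))
print(json.dumps(graph["nodes"][1], sort_keys=True))

# On the real file this could be applied roughly as:
#   with open("mynet-symbol.json") as f:
#       graph = json.load(f)
#   rewrite_relu_nodes(graph)
#   with open("mynet-symbol.json", "w") as f:
#       json.dump(graph, f)
```

Keeping a backup of the original symbol file before overwriting it is probably wise, since I have only confirmed this rewrite for the relu case shown above.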
Also, "SoftmaxActivation" and "slice_axis" are not supported layers, but "SoftmaxOutput" is, so I'm not sure what to do about those layers.
Details & References:
I'm using the latest of everything: Mathematica v11.2.1, mxnet-0.11.0, and the latest mxnet-to-coreml package.