Converting Models for the Wolfram Neural Net Repository

Posted 6 months ago
It has been a long time since I posted on Community. When I was converting this particular model (MobileNetV2) almost a year back, I had all the motivation to convert it, put it in the repository, and train it on facial features to build Snapchat-style filters for handheld devices. Somehow I lost the motivation and got busy with rather mundane grown-up activities. Recently, we had a team meeting where we discussed how we are looking for more user submissions and how we could encourage our very talented community members to convert models and submit them to the Wolfram Neural Net Repository (through the Contact Us button). This also gives me the opportunity to appreciate our long-time user Julian Francis, who has already converted 6 models (3 published and 3 in curation):

https://blog.wolfram.com/2018/12/06/deep-learning-and-computer-vision-converting-models-for-the-wolfram-neural-net-repository

Disclaimer

Just a disclaimer: in this post we are going to discuss the non-automated way of converting models to the Wolfram Language. This approach is ideal for anyone who is starting to create their own models in the Wolfram Language or trying to learn another framework (rather than just submitting and running code).

Step 1: Figure out the architecture

My aim was to convert MobileNetV2, so the first step was to thoroughly study the architecture of the model. While going through the paper gives a general overview, one needs to actually get their hands dirty by closely examining the nitty-gritty details of the code. For example, it took me a whole week to figure out the details of the TensorFlow code for the MobileNet architecture's convolution blocks.

Most times, using TensorBoard to visualize the graph helps one figure out the connections, while the code helps to figure out the details and the parameters. Once you have reviewed the code in the framework you are converting from, you can start designing the architecture in the Wolfram Language.

Step 2: Coding it in Wolfram Language

If you closely examine the TensorFlow code, you will notice that there are repeating units, which we want to exploit in our code as well. In this case, there are mobile units, each of which chains a ConvolutionLayer, a BatchNormalizationLayer and a ReLU6 activation. There are three types of these units: one with symmetric padding, one with no activation (the linear units), and one with asymmetric padding (used at the beginning of the network).

mobileunit[prefix_, nchannels_, kernel_, stride_, pad_, ngroup_, type_] :=
    Which[
       type == 1,
         NetGraph[<|
           "conv" <> prefix -> ConvolutionLayer[nchannels, kernel,
              "Stride" -> stride, "PaddingSize" -> pad, "ChannelGroups" -> ngroup],
           "conv" <> prefix <> "_bn" -> BatchNormalizationLayer[],
           "relu" <> prefix -> ElementwiseLayer[Min[Max[0, #], 6] &]
           |>,
          {NetPort["Input"] -> 1 -> 2 -> 3}],
       type == 2,
         NetGraph[<|
           "conv" <> prefix -> ConvolutionLayer[nchannels, kernel,
              "Stride" -> stride, "PaddingSize" -> pad, "ChannelGroups" -> ngroup],
           "conv" <> prefix <> "_bn" -> BatchNormalizationLayer[]
           |>,
          {NetPort["Input"] -> 1 -> 2}],
       type == 3,
         NetGraph[<|
           "conv" <> prefix -> ConvolutionLayer[nchannels, kernel,
              "Stride" -> stride,
              "PaddingSize" -> {{Ceiling[kernel[[1]]/2] - 2, Ceiling[kernel[[2]]/2]},
                 {Ceiling[kernel[[1]]/2] - 2, Ceiling[kernel[[2]]/2]}},
              "ChannelGroups" -> ngroup],
           "conv" <> prefix <> "_bn" -> BatchNormalizationLayer[],
           "relu" <> prefix -> ElementwiseLayer[Min[Max[0, #], 6] &]
           |>,
          {NetPort["Input"] -> 1 -> 2 -> 3}]
    ]

Our next job is to put these units into the inverted residual units, which simply chain an expansion block, a depthwise block (with the channel-wise expansion parameter) and a linear block.

invresunit[prefix_, nchannels_, kernel_, stride_, pad_, ngroup_, type_] :=
    Which[
       type == 1,
         NetChain[{
           mobileunit[prefix <> "_expand", ngroup, 1, 1, 0, 1, 1],
           mobileunit[prefix <> "_dwise", ngroup, kernel, stride, pad, ngroup, 1],
           mobileunit[prefix <> "_linear", nchannels, 1, 1, 0, 1, 2]}],
       type == 2,
         NetChain[{
           mobileunit[prefix <> "_dwise", ngroup, kernel, stride, pad, ngroup, 1],
           mobileunit[prefix <> "_linear", nchannels, 1, 1, 0, 1, 2]}],
       type == 3,
         NetChain[{
           mobileunit[prefix <> "_expand", ngroup, 1, 1, 0, 1, 1],
           mobileunit[prefix <> "_dwise", ngroup, kernel, stride, pad, ngroup, 3],
           mobileunit[prefix <> "_linear", nchannels, 1, 1, 0, 1, 2]}]
    ]

There are also mobilenetblock units, which are very similar to invresunits except that they have a residual skip connection (i.e. a ThreadingLayer that adds the input to the output):

mobilenetblock[prefix_,nchannels_,kernel_,stride_,pad_,ngroup_]:=
    NetGraph[{mobileunit[prefix<>"_expand", ngroup, 1, 1, 0, 1, 1],
         mobileunit[prefix<>"_dwise", ngroup, kernel, stride, pad, ngroup, 1],
         mobileunit[prefix<>"_linear", nchannels, 1, 1, 0, 1, 2],
         ThreadingLayer[Plus]},
         {1->2->3->4, NetPort["Input"]->4}]

Finally, we put the blocks together to create the MobileNet:

genmobilenet[c1_, c2_, c3_, c4_, c5_, c6_, c7_, c8_, c9_,
    g1_, g2_, g3_, g4_, g5_, g6_, g7_, p_, dim_] :=
  NetChain[<|
    "1" -> mobileunit["1", c1, {3, 3}, {2, 2}, {0, 0}, 1, 3],
    "2_1" -> invresunit["2_1", c2, {3, 3}, {1, 1}, {1, 1}, g1, 2],
    "2_2" -> invresunit["2_2", c3, {3, 3}, {2, 2}, {0, 0}, g2, 3],
    "3_1" -> mobilenetblock["3_1", c3, {3, 3}, {1, 1}, {1, 1}, g3],
    "3_2" -> invresunit["3_2", c4, {3, 3}, {2, 2}, {0, 0}, g3, 3],
    "4_1" -> mobilenetblock["4_1", c4, {3, 3}, {1, 1}, {1, 1}, g4],
    "4_2" -> mobilenetblock["4_2", c4, {3, 3}, {1, 1}, {1, 1}, g4],
    "4_3" -> invresunit["4_3", c5, {3, 3}, {2, 2}, {0, 0}, g4, 3],
    "4_4" -> mobilenetblock["4_4", c5, {3, 3}, {1, 1}, {1, 1}, g5],
    "4_5" -> mobilenetblock["4_5", c5, {3, 3}, {1, 1}, {1, 1}, g5],
    "4_6" -> mobilenetblock["4_6", c5, {3, 3}, {1, 1}, {1, 1}, g5],
    "4_7" -> invresunit["4_7", c6, {3, 3}, {1, 1}, {1, 1}, g5, 1],
    "5_1" -> mobilenetblock["5_1", c6, {3, 3}, {1, 1}, {1, 1}, g6],
    "5_2" -> mobilenetblock["5_2", c6, {3, 3}, {1, 1}, {1, 1}, g6],
    "5_3" -> invresunit["5_3", c7, {3, 3}, {2, 2}, {0, 0}, g6, 3],
    "6_1" -> mobilenetblock["6_1", c7, {3, 3}, {1, 1}, {1, 1}, g7],
    "6_2" -> mobilenetblock["6_2", c7, {3, 3}, {1, 1}, {1, 1}, g7],
    "6_3" -> invresunit["6_3", c8, {3, 3}, {1, 1}, {1, 1}, g7, 1],
    "6_4" -> mobileunit["6_4", c9, {1, 1}, {1, 1}, {0, 0}, 1, 1],
    "pool6" -> PoolingLayer[{p, p}, {1, 1}, "Function" -> Mean],
    "fc7" -> ConvolutionLayer[1001, {1, 1}],
    "reshape" -> ReshapeLayer[{1001}],
    "prob_softmax" -> SoftmaxLayer[]
    |>,
   "Input" -> {3, dim, dim}]

For a depth multiplier of 1.4, the channel counts are:

{c1, c2, c3, c4, c5, c6, c7, c8, c9} = {48, 24, 32, 48, 88, 136, 224, 448, 1792}

and for an input size of 224, the pooling size is p = 7.
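The group widths g1 through g7 are not listed above; in MobileNetV2 they follow the expansion factor of 6 (each gi is 6 times the input channel count of the corresponding stage, except the first depthwise block, which operates directly on the stem's 48 channels). Under that assumption, the full net for this 1.4-depth, 224-input configuration can be instantiated as:

```
mobilenet = genmobilenet[48, 24, 32, 48, 88, 136, 224, 448, 1792,
   48, 144, 192, 288, 528, 816, 1344, 7, 224]
```

Evaluating this should display an uninitialized NetChain whose layer dimensions can be checked against the TensorFlow graph before any weights are linked.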

Step 3: Importing the Weights

Once the architecture is built, the next step is to get the pre-trained weights of the model. Thanks to ExternalEvaluate, this can now be done easily from within the Wolfram Language. Note that you need to have Python, NumPy, TensorFlow and all the other libraries that the model depends on already installed in the correct path.

session = StartExternalSession["Python-NumPy"]
weights = ExternalEvaluate[session, "import tensorflow as tf
import numpy as np
import sys
sys.path.append('/home/tuseetab/models/research/slim')
from nets.mobilenet import mobilenet_v2

height = 224
width = 224
channels = 3
slim = tf.contrib.slim
X = tf.placeholder(tf.float32, shape=[None, height, width, channels])
with slim.arg_scope(mobilenet_v2.training_scope(is_training=False)):
    logits, end_points = mobilenet_v2.mobilenet(X, num_classes=1001)

saver = tf.train.Saver()
with tf.Session() as sess:
    # checkpoint_dir must point to the folder containing the downloaded checkpoint
    saver.restore(sess, checkpoint_dir + 'mobilenet_v2_1.4_224.ckpt')
    var = tf.global_variables()
    weights = sess.run(var)

my_dict = {}
for i in range(len(var)):
    my_dict[var[i].name] = weights[i]
my_dict"]
DeleteObject[session]

Step 4: Parsing the Weights

The weights then need to be manually parsed. In this case the layers containing parameters are the ConvolutionLayers (weights) and the BatchNormalizationLayers (biases, scaling, moving means and moving variances). While the BatchNormalization parameters are quite simple to parse (they are mostly vectors), the convolution weight matrices often need to be transposed (to follow the convention of how these matrices are stored in different frameworks). Please see the following code for clarification.

convw = Flatten[{
    {Transpose[Normal@weights[[1]][[1, 2]], {4, 3, 2, 1}],
     Transpose[Normal@weights[[1]][[4, 1, 2]], {4, 3, 1, 2}],
     Transpose[Normal@weights[[1]][[4, 2, 2]], {4, 3, 2, 1}],
     Transpose[Normal@weights[[1]][[5, 2, 2]], {4, 3, 2, 1}],
     Transpose[Normal@weights[[1]][[5, 1, 2]], {4, 3, 1, 2}],
     Transpose[Normal@weights[[1]][[5, 3, 2]], {4, 3, 2, 1}]},
    Flatten[
      Table[{Transpose[Normal@weights[[1]][[i, 2, 2]], {4, 3, 2, 1}],
         Transpose[Normal@weights[[1]][[i, 1, 2]], {4, 3, 1, 2}],
         Transpose[Normal@weights[[1]][[i, 3, 2]], {4, 3, 2, 1}]},
        {i, 13, Length@weights[[1]]}], 1],
    Flatten[
      Table[{Transpose[Normal@weights[[1]][[i, 2, 2]], {4, 3, 2, 1}],
         Transpose[Normal@weights[[1]][[i, 1, 2]], {4, 3, 1, 2}],
         Transpose[Normal@weights[[1]][[i, 3, 2]], {4, 3, 2, 1}]},
        {i, 6, 12}], 1],
    {Transpose[Normal@weights[[1]][[2, 2]], {4, 3, 2, 1}]}}, 1];


(* The four BatchNormalization parameter sets (beta, gamma, moving mean and
   moving variance) follow exactly the same indexing pattern, differing only
   in the last part index, so we extract them with a single helper: *)

bnparam[j_] := Flatten[{
     {weights[[1]][[1, 1, j]]},
     Table[weights[[1]][[4, i, 1, j]], {i, 2}],
     Table[weights[[1]][[5, i, 1, j]], {i, {2, 1, 3}}],
     Flatten[Table[weights[[1]][[i, k, 1, j]],
        {i, 13, Length@weights[[1]]}, {k, {2, 1, 3}}], 1],
     Flatten[Table[weights[[1]][[i, k, 1, j]],
        {i, 6, 12}, {k, {2, 1, 3}}], 1],
     {weights[[1]][[2, 1, j]]}}, 1];

{bnbeta, bngamma, movmean, movvar} = bnparam /@ Range[4];

Step 5: Linking the Weights

Yes, the next piece of code is quite tedious, prone to errors, and probably required the most iterations to figure out. Nevertheless, this is the final code:

pref = {"2_2", "3_1", "3_2", "4_1", "4_2", "4_3", "4_4", "4_5", "4_6",
    "4_7", "5_1", "5_2", "5_3", "6_1", "6_2", "6_3"};
pref2 = {"_expand", "_dwise", "_linear"};
pref3 = {"_expand_bn", "_dwise_bn", "_linear_bn"};

mobilenet2 = NetReplacePart[mobilenet, Flatten[Join[

    (* convolution weights of the regular blocks *)
    Thread[
      Flatten[
        Table[{i,
           Which[k == "_expand", 1, k == "_dwise", 2, k == "_linear", 3],
           "conv" <> i <> k, "Weights"},
          {i, pref}, {k, pref2}], 1]
       -> Table[convw[[i]], {i, 4, 51}]],

    (* batch-normalization parameters of the regular blocks *)
    Thread[
      Flatten[
        Table[
          Table[{i,
             Which[k == "_expand_bn", 1, k == "_dwise_bn", 2,
               k == "_linear_bn", 3],
             "conv" <> i <> k, param},
            {param, {"Biases", "Scaling", "MovingMean", "MovingVariance"}}],
          {i, pref}, {k, pref3}], 2]
       -> Flatten[
           Table[{bnbeta[[i]], bngamma[[i]], movmean[[i]], movvar[[i]]},
             {i, 4, 51}], 1]],

    (* special-case blocks at the beginning and end of the network *)
    {{"1", "conv1", "Weights"} -> convw[[1]],
     {"1", "conv1_bn", "Biases"} -> bnbeta[[1]],
     {"1", "conv1_bn", "Scaling"} -> bngamma[[1]],
     {"1", "conv1_bn", "MovingMean"} -> movmean[[1]],
     {"1", "conv1_bn", "MovingVariance"} -> movvar[[1]],

     {"2_1", 1, "conv2_1_dwise", "Weights"} -> convw[[2]],
     {"2_1", 1, "conv2_1_dwise_bn", "Biases"} -> bnbeta[[2]],
     {"2_1", 1, "conv2_1_dwise_bn", "Scaling"} -> bngamma[[2]],
     {"2_1", 1, "conv2_1_dwise_bn", "MovingMean"} -> movmean[[2]],
     {"2_1", 1, "conv2_1_dwise_bn", "MovingVariance"} -> movvar[[2]],

     {"2_1", 2, "conv2_1_linear", "Weights"} -> convw[[3]],
     {"2_1", 2, "conv2_1_linear_bn", "Biases"} -> bnbeta[[3]],
     {"2_1", 2, "conv2_1_linear_bn", "Scaling"} -> bngamma[[3]],
     {"2_1", 2, "conv2_1_linear_bn", "MovingMean"} -> movmean[[3]],
     {"2_1", 2, "conv2_1_linear_bn", "MovingVariance"} -> movvar[[3]],

     {"6_4", "conv6_4", "Weights"} -> convw[[52]],
     {"6_4", "conv6_4_bn", "Biases"} -> bnbeta[[52]],
     {"6_4", "conv6_4_bn", "Scaling"} -> bngamma[[52]],
     {"6_4", "conv6_4_bn", "MovingMean"} -> movmean[[52]],
     {"6_4", "conv6_4_bn", "MovingVariance"} -> movvar[[52]],
     {"fc7", "Weights"} ->
       Transpose[Normal@weights[[1]][[3, 1, 2]], {4, 3, 2, 1}],
     {"fc7", "Biases"} -> Normal@weights[[1]][[3, 1, 1]]}
    ], 1]];

Step 6: Making the Tests

To test the conversion, we evaluate the model in TensorFlow (the original framework) on a zero input and on a random input, and save the results. We then compute the difference between the TensorFlow outputs (imported as a structured HDF5 file) and the outputs from evaluating the Wolfram Language model on a zero input and the exact same random input used in TensorFlow. A snippet of the code that computes the differences is below:

randomInput = Transpose[Normal@tests["RandomInput"], {3, 2, 1}];
fileOutZero = Normal@tests["OutputForZeros"][[All, 1]];
fileOutRandom = Normal@tests["OutputForRandom"][[All, 1]];

netOutZero = Normal@bareNet[ConstantArray[0, Dimensions@randomInput]];
netOutRandom = Normal@bareNet[randomInput];

diffZero = Abs[netOutZero - fileOutZero];
diffRandom = Abs[netOutRandom - fileOutRandom];

zeroTest = <|
   "MaxAbsoluteDifference" -> Max[diffZero],
   "MaxRelativeDifference" ->
    Max[diffZero/Clip[Abs@netOutZero, {10^-8., Infinity}]]
   |>;

randomTest = <|
   "MaxAbsoluteDifference" -> Max[diffRandom],
   "MaxRelativeDifference" ->
    Max[diffRandom/Clip[Abs@netOutRandom, {10^-8., Infinity}]]
   |>;

Here tests contains the data imported from the HDF5 file produced by TensorFlow: (a) the output obtained by running the model on zeros, (b) the random input and (c) the output obtained by running the model on that random input.
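For completeness, tests can be obtained simply by importing that HDF5 file as an association (the file name here is just an illustration; depending on your Wolfram Language version, dataset names may carry a leading "/"):

```
tests = Import["mobilenet_v2_tests.h5", "Data"];
Keys[tests]  (* should include "RandomInput", "OutputForZeros" and "OutputForRandom" *)
```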

Step 7: Attach Encoder and Decoder

mean = {0.485, 0.456, 0.406};
variance = {0.052441, 0.050176, 0.050625};
classes = 
  Prepend[Import @ FileNameJoin[{$CommonDir, "imagenet1000.m"}], 
   Entity["Concept", "Other::nzvm6"]];
dec = NetDecoder[{"Class", classes}];

net = NetReplacePart[mobilenet,
   {"Input" -> NetEncoder[{"Image", {224, 224},
       "MeanImage" -> mean, "VarianceImage" -> variance}],
    "Output" -> dec}]

Here the classes variable imports a file that contains all 1000 ImageNet classes stored as Entities and prepends an extra "Other" class (the TensorFlow checkpoint has 1001 outputs), and the encoder applies the ImageNet mean and variance normalization.
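At this point the converted model can be tried end to end on any image; for example, using one of the built-in test images:

```
img = ExampleData[{"TestImage", "Mandrill"}];
net[img, "TopProbabilities" -> 3]
```

The class decoder returns the most likely entities with their probabilities, which is a quick sanity check that the weights were linked correctly.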

Conclusion:

After reading this post, one might conclude that it is easier, better and less time consuming to opt for automated tools for model conversion. Yes, when available, the tools are definitely the way to go (in fact, we have a big drive to develop and use the ONNX converter in the group, although it is not ready and fully tested yet). However, many times there are models in the wild that are harder, and sometimes impossible, to convert using these tools. In those cases, you need to fall back on traditional hand conversion. And although hand conversion is often laborious, time consuming and, for lack of a better word, frustrating, the process teaches us a lot about the various frameworks and their differences and about the details of the models, and it is essential if you wish to pursue research in the field in the future.


10 Replies

Tuseeta-san, thanks for your post about how to convert a TensorFlow model. I use PyTorch as well as Mathematica, so I may post about how to convert a trained PyTorch model later.

That would be wonderful actually! I have converted models from PyTorch myself in the past, but it would be great to see newer models. Please post here on Community (and submit the net to the Neural Net Repository). If you have any questions during the process of conversion, please feel free to reach out. We would love to see more user submissions in the Neural Net Repository.

Tuseeta-san, thanks for your advice. I posted a pose estimation model, OpenPose, in PyTorch. I think there must be a cooler way to do it, so I would appreciate your comments.

Thank you so much for the follow-up, and please submit it to the Neural Net Repository. Also, would you be able to post a notebook, the link to the paper, etc.? Please keep the models coming! We appreciate all the submissions.

Congratulations! This post is now featured in our Staff Picks column, as distinguished by a badge on your profile as a Featured Contributor! Thank you, keep it coming, and consider contributing your work to the Notebook Archive!

Posted 5 months ago

Thanks for sharing this post!!

Posted 1 month ago

Hello Tuseeta

Terrific work! I would like to help out with porting NN models from other frameworks into Mathematica.

I have a vested interest in that for other reasons, but we could share resources.

If you could have techsupport send me a bunch of notebooks for that kind of porting, with more examples of the net functions (the Documentation Center is very terse on NNs), I will be in your debt.

I have a large server with 64 CPU cores, 1 TB of memory, a Mathematica license for each core, and GPU boards, so perhaps we could experiment with retraining or enhancing some of the nets.

I do not mind open-sourcing the code at my end.

Techsupport has my email address in case.

Posted 1 month ago

I spoke to Techsupport, and I think there is a problem with the older 2.x version of Python, so for now there is no need to send me anything.

I will continue upgrading and report back.

Links: http://macappstore.org/zeromq/ and https://reference.wolfram.com/language/tutorial/ConfigurePythonForExternalEvaluate.html

Hello Wayne,

Thank you so much for your interest in converting models to the Wolfram Language and for offering to experiment with retraining etc. Sounds absolutely wonderful!

I would be more than happy to help in any way possible. Yes please work via tech-support, they will reach out to me, and I will make sure I respond to them as quickly as possible.

Thanks, Tuseeta

Posted 1 month ago

You are so good!

This is the case number

[CASE:4376400]

I am speaking to engineering about the Python 3 issues. There is a step missing in the installation that upsets Mathematica.

