NetTrain fails with MXNetError: Check failed: assign(&dattr, vec.at(i)) on CNN with pooling

Posted 4 months ago
POSTED BY: Kirill Vasin

I have tried to train your internal block and I'm getting a different error (on Linux):

(* 1D convolution: kernel size 7, padding 3, so the sequence length is preserved *)
conv[channelsIn_, channelsOut_, length_ : 1024] := 
  ConvolutionLayer[channelsOut, {7}, "Input" -> {channelsIn, length}, 
   PaddingSize -> 3];

(* batch normalization for a {channels, length} input *)
batchnorm[channelsIn_, length_ : 1024] := 
  BatchNormalizationLayer["Input" -> {channelsIn, length}];

(* ReLU activation *)
relu[channelsIn_, length_ : 1024] := 
  ElementwiseLayer["ReLU", "Input" -> {channelsIn, length}];

(* residual block: adds the input to a Linear -> Ramp branch of itself *)
residual[channelsIn_, length_ : 1024] := 
  NetGraph[{LinearLayer[{channelsIn, length}], 
    ElementwiseLayer[Ramp, "Input" -> {channelsIn, length}], 
    ThreadingLayer[Plus, "Output" -> {channelsIn, length}, 
     InputPorts -> 2]}, {1 -> 2, {NetPort["Input"], 2} -> 3}, 
   "Input" -> {channelsIn, length}];

net = NetFlatten@NetChain[
   {
    conv[8, 16, 512],
    batchnorm[16, 512],
    relu[16, 512],
    residual[16, 512],
    conv[16, 16, 512],
    batchnorm[16, 512],
    relu[16, 512]
    },
   "Input" -> {8, 512},
   "Output" -> {16, 512}
   ]
NetTrain[net, 
 RandomReal[1, {100, 8, 512}] -> RandomReal[1, {100, 16, 512}]]

The error is:

MXNetError: Check failed: !is_view:

Regardless of the specific error, these are MXNet bugs, and unfortunately we have to live with them until we migrate the framework to a different backend. The timeframe for that is long and hard to predict, maybe around 1-2 years.

As a workaround, I could train the net normally either by removing the batchnorms or by setting WorkingPrecision -> "Real64", although speed suffers greatly with the latter. Since you are getting a different error, I'm not sure which of these will work for you: I expect WorkingPrecision -> "Real64" to work, but I'm less sure about removing the batchnorms.
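
A minimal sketch of the two suggested workarounds, assuming the net, conv, relu, and residual definitions from the reproduction code above (the batchnorm-free chain is an illustrative reconstruction, not code from the thread):

data = RandomReal[1, {100, 8, 512}] -> RandomReal[1, {100, 16, 512}];

(* workaround 1: 64-bit training; avoids the bug but is much slower *)
NetTrain[net, data, WorkingPrecision -> "Real64"]

(* workaround 2: the same chain rebuilt without BatchNormalizationLayers *)
netNoBN = NetFlatten@NetChain[
   {
    conv[8, 16, 512],
    relu[16, 512],
    residual[16, 512],
    conv[16, 16, 512],
    relu[16, 512]
    },
   "Input" -> {8, 512},
   "Output" -> {16, 512}
   ];
NetTrain[netNoBN, data]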

Thank you, @Matteo Salvarezza!

Your workaround worked well. :) Just one question: are there any plans yet for which backend will be used?

POSTED BY: Kirill Vasin

Most likely PyTorch, TensorFlow, or both.

Thank you for the reply. Great!

PS: So it would probably be Torch (C++)? Or could it be that a Python interpreter is going to be integrated into the bundle?

POSTED BY: Kirill Vasin

No Python: we hook up the C++ libraries directly. We already have a working prototype internally; I presented it at this year's Wolfram Tech Conference two weeks ago.

After poking at this problem more, I've found that it might be related to how I encode Nx1024 into Nx512 blocks. The failure is always connected with those blocks:

    "internalBlock11" -> NetChain[
      {
        conv[8, 16, 512], 
        batchnorm[16, 512], 
        relu[16, 512], 
        residual[16, 512], (* <--- HERE *)
        conv[16, 16, 512], 
        batchnorm[16, 512], 
        relu[16, 512]
      }, 
      "Input" -> {8, 512}, 
      "Output" -> {16, 512}
    ],
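
A hypothetical way to check whether the residual block alone triggers the failure, assuming the residual helper defined earlier (an illustration, not code from the thread):

(* hypothetical isolation test: train only the residual block *)
block = residual[16, 512];
NetTrain[block, 
 RandomReal[1, {100, 16, 512}] -> RandomReal[1, {100, 16, 512}]]

If this reproduces the MXNetError, the input-plus-branch addition itself is the trigger; if not, the interaction with the surrounding convolution and batchnorm layers matters.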
POSTED BY: Kirill Vasin