
Deconvolution approaches for 3D data

Posted 3 years ago

Attention to all Wolfram Community Neural Net experts. Please help me understand this important part of implementing a 3D neural net.

I've recently been rewriting a previous GAN of mine in the WL to see if I can get better performance. I was surprised to find that the standard DeconvolutionLayer does not have the same dimensional coverage as ConvolutionLayer: there's no option on DeconvolutionLayer to enter an {h, w, d}-sized kernel. My use case is in the architecture/engineering field, but this seems like an even larger oversight for others doing 3D medical image analysis (which is becoming more commonplace). Be that as it may, I'm now searching for a good workaround, something similar in functionality to TensorFlow 2 Keras' Conv3DTranspose.

There seems to be very little written online about this particular corner of Mathematica. There is a 2018 post by Martijn Froeling on the Wolfram Community forums which I've been using as a reference. Instead of upscaling through a DeconvolutionLayer that holds the trainable parameters, it does quite a lot of juggling with ReshapeLayer and ResizeLayer which I don't understand.

I've tried my best to research all the pieces in the documentation, and parsed the code as best I could to understand its flow, but larger questions remain. Why does this lead to that? Why is this placed after that? There's a gap in my knowledge that I would like to fill before implementing a variation of it in my project.

Below I've started annotating the code as a place to start, but there must be others who can add more detail. Until there's an actual implementation of a 3D convolution transpose in the WL, more information on this approach is important for the community.

(* Direct implementation of a 2D upsampling layer using DeconvolutionLayer *)
DeconvLayer2D[n_, {dimInx_, dimIny_}] := 
Block[{sc = 2}, 
    NetChain[{DeconvolutionLayer[n, {sc, sc}, "Stride" -> {sc, sc}, 
    "Input" -> {sc n, dimInx, dimIny}]}]
 ]

(* Unknown why the resize layer is structured with Scaled x2. \
Is the ConvolutionLayer where the training parameters are kept? *)
ResizeLayer2D[n_, {dimInx_, dimIny_}] := 
Block[{sc = 2}, 
    NetChain[{ResizeLayer[{Scaled[sc], Scaled[sc]}, 
     "Input" -> {sc n, dimInx, dimIny}], ConvolutionLayer[n, 1]}]
]

(* I suspect that since 3D transpose convolution is not available, \
one must transpose the array twice to cover all {x, y, z} axes. Not \
sure where the training parameters are kept in this NetChain either; \
probably the ConvolutionLayer at the end? *)
ResizeLayer3D[n_, {dimInx_, dimIny_, dimInz_}] :=
Block[{sc = 2},
    NetChain[{
    FlattenLayer[1, "Input" -> {n sc, dimInx, dimIny, dimInz}],
    ResizeLayer[{Scaled[sc], Scaled[sc]}],
    ReshapeLayer[{n sc, dimInx, sc dimIny, sc dimInz}],
    TransposeLayer[2 <-> 3],
    FlattenLayer[1],
    ResizeLayer[{Scaled[sc], Scaled[1]}],
    ReshapeLayer[{n sc, sc dimIny, sc dimInx, sc dimInz}],
    TransposeLayer[2 <-> 3],
    ConvolutionLayer[n, 1]}
    ]
]

{DeconvLayer2D[16, {2, 4}], ResizeLayer2D[16, {2, 4}], ResizeLayer3D[16, {2, 4, 6}]}
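The definitions above can be sanity-checked by initializing the 3D net and pushing a random array through it (a quick check I added; the shapes follow mechanically from the layer definitions):

```mathematica
(* Sanity check (assumes ResizeLayer3D from above has been evaluated):
   the net should double every spatial dimension and output n channels *)
net = NetInitialize@ResizeLayer3D[16, {2, 4, 6}];
Dimensions[net[RandomReal[1, {32, 2, 4, 6}]]]
(* {16, 4, 8, 12} -- input channels n*sc = 32, output channels n = 16 *)
```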

Implementation

Let's say I want to do 3D convolutions over a random array of noise. How would I go about implementing the above? It goes beyond anything in my previous neural nets in WL or Python.

noise = RandomChoice[{0.98, 0.02} -> {0, 1}, {25, 25, 25}];
Image3D[noise]
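One possible answer (my sketch, not the original poster's or Martijn's code): wrap the noise volume as a 1-channel 4D array and run it through a small chain that pools down and then upsamples back with ResizeLayer3D. All layer sizes here are illustrative, and I use a 24^3 volume instead of 25^3 so the dimensions halve evenly:

```mathematica
(* Minimal sketch (assumes ResizeLayer3D from above; sizes are illustrative) *)
noise = RandomChoice[{0.98, 0.02} -> {0, 1}, {24, 24, 24}];
net = NetInitialize@NetChain[{
     ReshapeLayer[{1, 24, 24, 24}],                        (* add a channel dimension *)
     ConvolutionLayer[32, {3, 3, 3}, "PaddingSize" -> 1],  (* 3D convolution does exist *)
     PoolingLayer[{2, 2, 2}, "Stride" -> 2],               (* -> {32, 12, 12, 12} *)
     ResizeLayer3D[16, {12, 12, 12}],                      (* -> {16, 24, 24, 24} *)
     ConvolutionLayer[1, 1]},                              (* -> {1, 24, 24, 24} *)
    "Input" -> {24, 24, 24}];
Dimensions[net[noise]]  (* {1, 24, 24, 24} *)
```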

Feature Request

Finally, I draw the community's attention to the need for symmetry between ConvolutionLayer and DeconvolutionLayer, similar to TensorFlow 2. For a framework to bill itself as industry-leading and easy to use, yet require the above tangle of code instead of a readable and predictable function, is contradictory. I hope this functionality is considered for inclusion in future updates to WL neural nets. It really could be lifesaving.

(Additional information is available on the original & unanswered Mathematica Stack Exchange question.)

POSTED BY: Blair Birdsell
3 Replies

Thanks for the update. I was wondering if the original poster was still around years later (the problem certainly still is). With the extra information I improved the annotated code; after your description it makes more sense now. What seems less reasonable is the way WL neural nets handle this use case. I hadn't realized that ResizeLayer shares DeconvolutionLayer's drawback of being unable to work on 4D arrays. Excellent job figuring all this out. Your new version looks like a kind of progressive-growing GAN where the steps are mixed as it's upsampled.

ResizeLayer3D[n_, {dimInx_, dimIny_, dimInz_}] :=
 Block[{sc = 2},
  NetChain[{
    (* Flattens the first two levels into one, because the next layer \
cannot handle 4D arrays *)
    FlattenLayer[1, "Input" -> {n sc, dimInx, dimIny, dimInz}],
    (* Doubles the size of the last two dimensions *)
    ResizeLayer[{Scaled[sc], Scaled[sc]}],
    (* Reshapes the array back to its original order, with the last \
two dimensions scaled up by the previous layer *)
    ReshapeLayer[{n sc, dimInx, sc dimIny, sc dimInz}],
    (* Transposes the 2nd and 3rd dimensions so that the previously \
unscaled dimension can be acted on *)
    TransposeLayer[2 <-> 3],
    (* Again flattens the array a level so that the next layer can \
act on it *)
    FlattenLayer[1],
    (* Scales only the dimension that hasn't been resized yet *)
    ResizeLayer[{Scaled[sc], Scaled[1]}],
    (* Reshapes back to the original structure; the array has now been \
scaled up *)
    ReshapeLayer[{n sc, sc dimIny, sc dimInx, sc dimInz}],
    (* Importantly, transposes the dimensions back to their original \
order, since it was changed above *)
    TransposeLayer[2 <-> 3],
    (* Now a convolution can be applied to the upsampled data *)
      ConvolutionLayer[n, 1]}
   ]
  ]
POSTED BY: Blair Birdsell

Hi, indeed the code I posted is an ugly workaround.

The juggling with reshaping and resizing is done because of another limitation of ResizeLayer. For 3D data, the convolution layer takes a 4D array (the features plus the three data dimensions). However, ResizeLayer cannot resize 4D arrays, only 3D arrays, and it only resizes the last two dimensions of the 3D array. Therefore I first flatten the array to 3D and resize the last two dimensions. Next, I reshape back to a 4D array and move the dimension that has not been resized yet into one of the last two positions. Then I repeat the same trick: reshape to 3D, resize the remaining dimension, transpose everything back to the original positions, and reshape back to a 4D array.
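For concreteness, here is how the dimensions evolve through the chain for the example call `ResizeLayer3D[16, {2, 4, 6}]`, i.e. n = 16, sc = 2, input {32, 2, 4, 6} (my trace of the posted code, not part of the original reply):

```mathematica
(* Dimension trace for ResizeLayer3D[16, {2, 4, 6}], input {n sc, x, y, z} = {32, 2, 4, 6} *)
(* FlattenLayer[1]                      {32, 2, 4, 6}  -> {64, 4, 6}      merge channels with x *)
(* ResizeLayer[{Scaled[2], Scaled[2]}]  {64, 4, 6}     -> {64, 8, 12}     y and z doubled *)
(* ReshapeLayer[{32, 2, 8, 12}]         {64, 8, 12}    -> {32, 2, 8, 12}  unmerge; x still unscaled *)
(* TransposeLayer[2 <-> 3]              {32, 2, 8, 12} -> {32, 8, 2, 12}  move x into the last two slots *)
(* FlattenLayer[1]                      {32, 8, 2, 12} -> {256, 2, 12}    merge channels with y *)
(* ResizeLayer[{Scaled[2], Scaled[1]}]  {256, 2, 12}   -> {256, 4, 12}    only x doubled *)
(* ReshapeLayer[{32, 8, 4, 12}]         {256, 4, 12}   -> {32, 8, 4, 12}  unmerge *)
(* TransposeLayer[2 <-> 3]              {32, 8, 4, 12} -> {32, 4, 8, 12}  restore the axis order *)
(* ConvolutionLayer[16, 1]              {32, 4, 8, 12} -> {16, 4, 8, 12}  1^3 kernel, trainable *)
```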

Then finally the convolution can be applied.

So basically I split the deconvolution into two parts, upscaling and then convolution, whereas it should be done simultaneously by introducing a stride in the convolution layer.

So the function is a workaround for the non-existent N-dimensional versions of DeconvolutionLayer and ResizeLayer. I don't think all the extra juggling of dimensions makes the function very efficient, but it does the trick for me.

Regarding your specific questions:

  1. The scale in the ResizeLayer is an attempt to mimic the stride in the convolution layer.

  2. Indeed, the convolution layer stores the trainable parameters. Instead of using a stride, I use the ResizeLayer and then a convolution with stride 1. Although I don't think the actual effect is identical, both approaches give similar results when training a 2D network.

  3. Yes, all the trainable parameters are in the convolution layer. This is easily seen if you remove the convolution layer from the function.
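One way to verify this (my sketch, assuming the ResizeLayer3D definition above): extract the arrays of the final layer, the only layer in the chain that holds any. For `ConvolutionLayer[16, 1]` acting on 32 input channels, the weights form a {16, 32, 1, 1, 1} array:

```mathematica
(* Sketch: the trailing ConvolutionLayer (layer 9) holds the only learnable arrays *)
net = NetInitialize@ResizeLayer3D[16, {2, 4, 6}];
Dimensions[NetExtract[net, {9, "Weights"}]]  (* {16, 32, 1, 1, 1} *)
```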


Hope this clarifies the function, Best Martijn

POSTED BY: Martijn Froeling

As an extra addition: I actually don't use this function anymore in my implementation of UNET. For the encoding layers I use PoolingLayer, which has no trainable parameters, so for symmetry in the deconvolution part of the network I also use only the upscaling, without the convolution.

The convolution is done later, after concatenating with the information passed through the skip connections.
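Such a decoder step might look like the following sketch (not the actual UNET code from the reply; `Upsample3D` is ResizeLayer3D stripped of its trailing ConvolutionLayer, and the sizes and port names are illustrative):

```mathematica
(* Parameter-free 3D upscaling: ResizeLayer3D without the trailing convolution *)
Upsample3D[n_, {x_, y_, z_}] := NetChain[{
    FlattenLayer[1, "Input" -> {n, x, y, z}],
    ResizeLayer[{Scaled[2], Scaled[2]}],
    ReshapeLayer[{n, x, 2 y, 2 z}],
    TransposeLayer[2 <-> 3],
    FlattenLayer[1],
    ResizeLayer[{Scaled[2], Scaled[1]}],
    ReshapeLayer[{n, 2 y, 2 x, 2 z}],
    TransposeLayer[2 <-> 3]}]

(* One illustrative decoder step: upsample the deep features, concatenate
   with the skip connection, then convolve the combined channels *)
decoderStep = NetGraph[{
    Upsample3D[32, {4, 4, 4}],                            (* {32,4,4,4} -> {32,8,8,8} *)
    CatenateLayer[],                                      (* join along the channel axis *)
    ConvolutionLayer[32, {3, 3, 3}, "PaddingSize" -> 1]},
   {NetPort["Deep"] -> 1, {1, NetPort["Skip"]} -> 2 -> 3}]
```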


POSTED BY: Martijn Froeling