Deconvolution approaches for 3D data

Posted 5 years ago
POSTED BY: Blair Birdsell
3 Replies

Thanks for the update. I was wondering whether the original poster was still around years later (the problem certainly still is). With the extra information I improved the annotated code; after your description it makes more sense now. What seems less reasonable is the way WL neural nets handle this use case. I hadn't realized that ResizeLayer shares DeconvolutionLayer's drawback of being unable to work on 4D arrays. Excellent job figuring all this out. Your new version looks like a type of Progressive Growing GAN, where the steps are mixed as the data is upsampled.

ResizeLayer3D[n_, {dimInx_, dimIny_, dimInz_}] :=
 Block[{sc = 2},
  NetChain[{
    (* Flatten the first two levels, because ResizeLayer only accepts 3D arrays *)
    FlattenLayer[1, "Input" -> {n sc, dimInx, dimIny, dimInz}],
    (* Double the size of the last two dimensions *)
    ResizeLayer[{Scaled[sc], Scaled[sc]}],
    (* Reshape back to a 4D array, now with the last two dimensions scaled up *)
    ReshapeLayer[{n sc, dimInx, sc dimIny, sc dimInz}],
    (* Transpose the 2nd and 3rd dimensions so the still-unscaled dimension can be resized *)
    TransposeLayer[2 <-> 3],
    (* Again flatten one level so the next layer can act on the array *)
    FlattenLayer[1],
    (* Scale only the dimension that has not been resized yet *)
    ResizeLayer[{Scaled[sc], Scaled[1]}],
    (* Reshape back to a 4D array; all spatial dimensions are now scaled up *)
    ReshapeLayer[{n sc, sc dimIny, sc dimInx, sc dimInz}],
    (* Importantly, transpose the dimensions back to their original order *)
    TransposeLayer[2 <-> 3],
    (* Now a convolution can be applied to the upsampled data *)
    ConvolutionLayer[n, 1]}
   ]
  ]
POSTED BY: Blair Birdsell

Hi, indeed the code I posted is an ugly workaround.

The juggling with reshaping and resizing is needed because of another limitation of ResizeLayer. For 3D data, the convolution layer takes 4D input (the feature channels plus the three data dimensions). However, ResizeLayer cannot resize 4D arrays, only 3D arrays, and even then it only resizes the last two dimensions. Therefore I first flatten the array to 3D and resize the last two dimensions. Next, I reshape back to a 4D array and transpose the dimension that has not been resized yet into one of the last two positions. Then I repeat the same trick: reshape to 3D, resize the last remaining dimension, transpose everything back to its original position, and reshape to a 4D array.
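The dimension juggling described above can be traced step by step outside of WL. Here is a NumPy sketch of the same chain of operations, with hypothetical sizes (n = 3, dims 4, 5, 6 — not from the post) and nearest-neighbour `np.repeat` standing in for ResizeLayer's interpolation:

```python
import numpy as np

# Hypothetical sizes; sc = 2 is the scale factor, as in the posted function.
n, dx, dy, dz, sc = 3, 4, 5, 6, 2
x = np.random.rand(n * sc, dx, dy, dz)       # 4D input: channels + 3 spatial dims

# FlattenLayer[1]: merge the first two levels -> a 3D array
a = x.reshape(n * sc * dx, dy, dz)

# ResizeLayer[{Scaled[2], Scaled[2]}]: upscale the last two dims
# (nearest-neighbour repeat stands in for the layer's interpolation)
a = a.repeat(sc, axis=1).repeat(sc, axis=2)  # -> (n*sc*dx, 2*dy, 2*dz)

# ReshapeLayer: back to 4D with the last two dims scaled
a = a.reshape(n * sc, dx, sc * dy, sc * dz)

# TransposeLayer[2 <-> 3]: move the still-unscaled dim into resizable position
a = a.transpose(0, 2, 1, 3)                  # -> (n*sc, 2*dy, dx, 2*dz)

# FlattenLayer[1] + ResizeLayer[{Scaled[2], Scaled[1]}]: scale only dx
a = a.reshape(n * sc * sc * dy, dx, sc * dz).repeat(sc, axis=1)

# ReshapeLayer + TransposeLayer[2 <-> 3]: restore the original dimension order
a = a.reshape(n * sc, sc * dy, sc * dx, sc * dz).transpose(0, 2, 1, 3)

print(a.shape)   # (6, 8, 10, 12): every spatial dimension doubled
```

The trace confirms that each spatial dimension ends up doubled while the channel dimension is untouched until the final convolution.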

Then finally the convolution can be applied.

So basically I split the deconvolution into two parts, upscaling and then convolution, whereas it should be done simultaneously by introducing a stride in the convolution layer.

So the function is a workaround for the non-existent N-dimensional versions of DeconvolutionLayer and ResizeLayer. I don't think all the extra juggling of dimensions makes the function very efficient, but it does the trick for me.
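The shape bookkeeping behind that split can be checked in one dimension: a transposed convolution with stride s, kernel size k and no padding maps length L to (L − 1)·s + k, and when k equals s (as in a typical 2× deconvolution) that is exactly s·L, the length reached by upsampling by s and then applying a stride-1 "same" convolution. A NumPy sketch with hypothetical sizes, where `np.repeat` stands in for the resize:

```python
import numpy as np

# 1D check: transposed conv (stride s, kernel k = s, no padding) vs. upsample + conv.
L, k, s = 5, 2, 2
tconv_len = (L - 1) * s + k              # output length of the strided deconvolution

x = np.arange(L, dtype=float)
up = x.repeat(s)                          # nearest-neighbour upsampling to length s*L
w = np.array([0.5, 0.5])                  # a stride-1 convolution kernel
y = np.convolve(up, w, mode="same")       # "same" keeps the upsampled length

print(tconv_len, len(y))                  # both are 10 when k == s
```

Only the output sizes match here; as noted below, the learned effect of the two formulations is not necessarily identical.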

Regarding your specific questions:

  1. The scale in the ResizeLayer is an attempt to mimic the stride in the convolution layer.

  2. Indeed, the convolution layer stores the trainable parameters. Instead of using a stride, I use the ResizeLayer followed by a convolution with stride 1. Although I don't think the actual effect is identical, both approaches give similar results when training a 2D network.

  3. Yes, all the trainable parameters are in the convolution layer. This can easily be seen if you remove the convolution layer from the function.
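The parameter count of that single trainable layer is easy to check by hand. A small sketch with hypothetical sizes (n = 3, sc = 2, not from the original post), assuming WL's default of one bias per output channel:

```python
# Parameter count of ConvolutionLayer[n, 1] acting on n*sc input channels in 3D.
# The 1x1x1 kernel means one weight per (output channel, input channel) pair.
n, sc = 3, 2
weights = n * (n * sc) * 1 * 1 * 1   # output channels x input channels x kernel volume
biases = n                           # one bias per output channel
total = weights + biases
print(total)                         # 21 for these sizes
```

Every other layer in the chain (flatten, resize, reshape, transpose) contributes zero parameters.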


Hope this clarifies the function, Best Martijn

POSTED BY: Martijn Froeling

As an extra addition: I actually don't use the function anymore in my implementation of UNET. For the encoding layers I use PoolingLayer, which does not have any trainable parameters, so for symmetry of the network I also use only the upscaling, without convolution, in the deconvolution part.

The convolution is done later, after concatenating with the information passed through the skip connections.
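That decoder step — parameter-free upscaling, then concatenation with the skip tensor, then convolution — can be sketched in NumPy. The channel counts below are hypothetical, and `np.repeat` again stands in for the resize:

```python
import numpy as np

def upsample3d(x, s=2):
    # Nearest-neighbour upscaling of the three spatial axes of a (c, x, y, z) array.
    return x.repeat(s, axis=1).repeat(s, axis=2).repeat(s, axis=3)

# Hypothetical UNET decoder step; sizes are illustrative, not from the post.
decoded = np.random.rand(16, 4, 4, 4)        # activations from the deeper level
skip = np.random.rand(8, 8, 8, 8)            # activations from the skip connection

up = upsample3d(decoded)                     # (16, 8, 8, 8) -- no trainable parameters
merged = np.concatenate([up, skip], axis=0)  # (24, 8, 8, 8)
# A convolution would now act on `merged`; only there do trainable parameters live.
print(merged.shape)
```

Keeping the upscaling parameter-free mirrors the pooling in the encoder, so all learning happens in the convolutions applied after the concatenation.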


POSTED BY: Martijn Froeling