
UNET: CatenateLayer to connect nodes within a NetChain?

EDIT: Looking closely at my network, I have a similar problem providing inputs to the CropLayer function. The outputs of the pooling layer (node 30) and of node 87 (activation) should both feed into CropLayer. The question is how to feed the output of one layer to other layers in the network.

I am trying to implement UNET in Mathematica: https://lmb.informatik.uni-freiburg.de/people/ronneber/u-net

I have the following code so far for generating the net partially:

(* encoder *)

encoder = NetEncoder[{"Image", "ImageSize" -> {168, 168}, "ColorSpace" -> "Grayscale"}];

(* decoder *)

decoder = NetDecoder[{"Image", "ColorSpace" -> Automatic}];

(* convolution module *)

Options[convolutionModule] = {"batchNorm" -> True, "downpool" -> False,
"uppool" -> False, "activationType" -> Ramp, "convolution" -> True};

convolutionModule[net_, kernelsize_, padsize_, stride_: {1, 1}, OptionsPattern[]] := 
 With[{upPool = OptionValue["uppool"], activationType = OptionValue["activationType"], 
   convolution = OptionValue["convolution"], batchNorm = OptionValue["batchNorm"],
   downpool = OptionValue["downpool"]},
  Block[{nnet = net},
   (* optional upsampling stage: deconvolution, batch norm, activation *)
   If[upPool,
    nnet = NetAppend[nnet, DeconvolutionLayer[1, {2, 2}, "PaddingSize" -> {0, 0}, 
       "Stride" -> {2, 2}]];
    nnet = NetAppend[nnet, BatchNormalizationLayer[]];
    If[activationType === Ramp,
     nnet = NetAppend[nnet, ElementwiseLayer[activationType]]
     ];
    ];
   (* convolution, batch norm, activation *)
   If[convolution,
    nnet = NetAppend[nnet, ConvolutionLayer[1, kernelsize, "Stride" -> stride,
       "PaddingSize" -> padsize]]
    ];
   If[batchNorm,
    nnet = NetAppend[nnet, BatchNormalizationLayer[]]
    ];
   If[activationType === Ramp,
    nnet = NetAppend[nnet, ElementwiseLayer[activationType]]
    ];
   (* optional 2 x 2 max pooling *)
   If[downpool,
    nnet = NetAppend[nnet, PoolingLayer[{2, 2}, "Function" -> Max, "Stride" -> {2, 2}]]
    ];
   nnet]
  ]

(* Crop Layer *)

CropLayer[netlayer_] := With[{p = NetExtract[netlayer, "Output"]},
PartLayer[{First@p, 1 ;; p[[2]], 1 ;; Last@p}] ]; 

(* partial UNET *)

UNET[] := 
 Block[{nm, pool1, pool2, pool3, pool4, pool5, kernelsize = {3, 3}, 
   padsize = {1, 1}, stride = {1, 1}},
  nm = NetChain@
    Join[{ConvolutionLayer[1, {3, 3}, "Input" -> encoder]},
     {BatchNormalizationLayer[], ElementwiseLayer[Ramp],
      PoolingLayer[{2, 2}, "Function" -> Max, "Stride" -> {2, 2}]}];
  pool1 = nm[[-1]];
  nm = convolutionModule[nm, kernelsize, padsize, stride, "downpool" -> True];
  pool2 = nm[[-1]];
  nm = convolutionModule[nm, kernelsize, padsize, stride, "downpool" -> True];
  pool3 = nm[[-1]];
  nm = convolutionModule[nm, kernelsize, padsize, stride, "downpool" -> True];
  pool4 = nm[[-1]];
  nm = NetAppend[nm, DropoutLayer[]];
  nm = convolutionModule[nm, kernelsize, padsize, stride, "downpool" -> True];
  pool5 = nm[[-1]];
  nm = convolutionModule[nm, kernelsize, padsize, stride, "uppool" -> True];
  nm = convolutionModule[nm, kernelsize, padsize + 1, stride, "uppool" -> True];
  nm = NetAppend[nm, CropLayer@pool3];
  (* return the partial net *)
  nm]

With NetInformation I can generate the net plot below:

NetInformation[(nm = UNET[]), "MXNetNodeGraphPlot"]

[Image: MXNet node graph plot of the partial UNET]

My problem: how do I catenate the output from the pooling layer, i.e. node 30, with the output from node 91?

I tried using NetGraph with CatenateLayer but could not find a way to connect node 30 within the NetChain with the second input of CatenateLayer.

POSTED BY: Ali Hashmi
1 month ago

By the looks of it, the output of your pooling layer is 1 x 20 x 20 while the output of your network is 20 x 20, so you'll first have to decide how to match these dimensions. You can use ReshapeLayer to match the dimensions of one to the other or you can use AppendLayer instead.
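To make the dimension matching concrete, here is a minimal sketch, assuming the 20 x 20 and 1 x 20 x 20 shapes quoted above. The variable name `reshape` and the random input are only placeholders for illustration:

```wolfram
(* Add a leading channel dimension so a 20 x 20 array matches 1 x 20 x 20 *)
reshape = ReshapeLayer[{1, 20, 20}, "Input" -> {20, 20}];

(* Apply the layer to placeholder data and inspect the resulting shape *)
Dimensions[reshape[RandomReal[1, {20, 20}]]]
```

Once both arrays have the same rank and spatial size, they can be passed to a CatenateLayer.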

Now as for routing the output of node 30 directly towards the output: as far as I know, this isn't possible with NetChain, which only constructs linear nets. Instead, you should use NetGraph. As a simple example of how to catenate information from deeper inside the net with the output, consider the net below, where the output of the first linear layer is catenated with the output of the last linear layer:

NetGraph[
 {
  LinearLayer[10],
  Ramp,
  LinearLayer[5],
  CatenateLayer[]
  },
 {
  NetPort["Input"] -> 1 -> 2 -> 3,
  {1, 3} -> 4 -> NetPort["Output"]
  }
 ]

I noticed that your code is based on the idea of gradually adding layers to the network rather than constructing it in one go, and I think you may want to reconsider that strategy here. It's probably easier to rewrite convolutionModule as a function that simply returns a discrete block (e.g., a NetChain) rather than having it append the results to an existing net. Several of these blocks can then be put together in one single NetGraph.
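A rough sketch of that block-based approach might look like the following. The helper name `convBlock`, the layer names, the channel counts and the 1 x 168 x 168 input shape are all assumptions chosen for illustration; this is not the full UNET, only one skip connection:

```wolfram
(* Hypothetical helper: returns a self-contained block instead of appending to a net *)
convBlock[channels_Integer, kernelsize_: {3, 3}, padsize_: {1, 1}] :=
 NetChain[{
   ConvolutionLayer[channels, kernelsize, "PaddingSize" -> padsize],
   BatchNormalizationLayer[],
   ElementwiseLayer[Ramp]
   }]

(* Blocks are named in an association and wired together with edges *)
skipNet = NetGraph[
  <|
   "down" -> convBlock[4],
   "pool" -> PoolingLayer[{2, 2}, "Stride" -> {2, 2}],
   "bottom" -> convBlock[8],
   "up" -> DeconvolutionLayer[4, {2, 2}, "Stride" -> {2, 2}],
   "cat" -> CatenateLayer[],
   "out" -> convBlock[1]
   |>,
  {
   (* main path: convolve, pool, convolve, upsample *)
   "down" -> "pool" -> "bottom" -> "up",
   (* skip connection: catenate the pre-pooling output with the upsampled output *)
   {"down", "up"} -> "cat" -> "out"
   },
  "Input" -> {1, 168, 168}
  ]
```

Because padding keeps the spatial size fixed in each block and the deconvolution undoes the 2 x 2 pooling, both inputs to "cat" have matching spatial dimensions here, so no cropping is needed in this simplified case.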

POSTED BY: Sjoerd Smit
26 days ago

Hi Sjoerd,

Thank you very much for the reply. I am trying to re-implement UNET as you suggested using NetGraph rather than incrementally building the network. In retrospect, I do think my previous approach was not the right way to go about implementing the net.

Btw, I am stuck on a particular step. Do you know if there is a custom layer that takes the tensor outputs from two layers as distinct inputs and crops the first tensor to the size of the second? I checked PartLayer, but it seems that you need to specify the cropping dimensions in advance. Do you have any idea how that can be implemented?

POSTED BY: Ali Hashmi
25 days ago

Hi Ali,

As far as I know, it's not possible to have a layer that accepts two tensors and resizes one to the size of the other. Of course there might be hacks that I'm not aware of, but as a rule I think that this is against the nature of how neural networks work. The idea of NNs is that you know the dimensions of your operations so you can efficiently handle all transformations. Sequences with varying lengths are the most significant exception, but even those are limited by dimension restrictions you have to specify up front.

So the only way I can see this working is by knowing the dimensions of your input and output beforehand, at which point PartLayer should do the job. Or you use any of the other resizing layers, like ReshapeLayer, ResizeLayer and PaddingLayer, though all of these impose up-front restrictions on how the dimensions of your tensors change as they flow through the network.
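As an illustration of cropping with dimensions fixed up front, here is a small sketch. The 1 x 24 x 24 input, the layer names and the crop window are assumptions made up for this example. Note that PartLayer follows Part semantics, so a plain integer in the specification drops that dimension, while a span such as 1 ;; 1 keeps it:

```wolfram
cropNet = NetGraph[
  <|
   (* unpadded 3 x 3 convolution: {1, 24, 24} -> {2, 22, 22} *)
   "conv" -> ConvolutionLayer[2, {3, 3}],
   (* centre crop of the input to 22 x 22; the span 1 ;; 1 keeps the channel dimension *)
   "crop" -> PartLayer[{1 ;; 1, 2 ;; 23, 2 ;; 23}],
   (* catenate along the channel dimension *)
   "cat" -> CatenateLayer[]
   |>,
  {
   NetPort["Input"] -> "conv",
   NetPort["Input"] -> "crop",
   {"crop", "conv"} -> "cat"
   },
  "Input" -> {1, 24, 24}
  ]
```

The crop window has to be worked out by hand from the known layer sizes, which is exactly the up-front restriction described above.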

If you have a concrete minimal example of what you need to achieve, I can try to be more specific.

POSTED BY: Sjoerd Smit
25 days ago

Dear Sjoerd,

I have re-implemented the architecture in the spirit of what you said. I have made a new post on the community as well as on StackExchange:

https://mathematica.stackexchange.com/questions/172481/how-can-image-segmentation-from-unet-be-improved

http://community.wolfram.com/groups/-/m/t/1332109

Kindly see if you are able to help me improve the performance of the net. Thanks in advance!

POSTED BY: Ali Hashmi
23 days ago

Glad to hear I could be of service. I'll take a look at your work, but my expertise is not so much on image analysis so I may not be able to provide much useful commentary on the subject of performance improvement.

POSTED BY: Sjoerd Smit
23 days ago
