
How to avoid RMSProp and contrastive loss problems, and how to implement maxout and LSUV?

Posted 6 years ago

Greetings! I have been experimenting with the MNIST dataset and ran into the following problems. Any help with any of them is highly appreciated, and thanks a lot to everyone who takes the time to read my post.

1) I know that Mathematica supports RMSProp optimization, but when I try to use this option I get the following error:

NetTrain::arrdiv: Training was aborted because one or more trainable parameters of the net diverged. To avoid this, ensure that the training data has been normalized to have zero mean and unit variance. You can also try specifying a lower "InitialLearningRate" to Method; the value used for this training session was 0.001`. Alternatively, you can use the "GradientClipping" option to Method to bound the magnitude of gradients during training.
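
Following the suggestions in the error message itself, I tried lowering "InitialLearningRate" and adding "GradientClipping", roughly like this (trainData stands in for my actual training set, and the particular values are just ones I tried):

NetTrain[leNet, trainData,
 Method -> {"RMSProp", "InitialLearningRate" -> 0.0001,
   "GradientClipping" -> 1}]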

That did not help: I have tried several sources but failed to solve this divergence problem. Could someone also please tell me how to normalize the training data to have zero mean and unit variance in Mathematica?
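
This is the rough approach I had in mind for the normalization, assuming the images are loaded as a list of 28x28 grayscale Image objects (trainImages is just a placeholder name); is this the right way to do it?

(* convert to numeric arrays, then standardize over all pixel values *)
data = ImageData /@ trainImages;
mu = Mean[Flatten[data]];
sigma = StandardDeviation[Flatten[data]];
normalized = (data - mu)/sigma;  (* zero mean, unit variance *)

I also noticed that the "Image" NetEncoder seems to accept "MeanImage" and "VarianceImage" options, which might let the net do this internally instead.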

2) Another problem I have been getting is when I try to use another built-in Mathematica option, ContrastiveLossLayer. My network is:

{width, height} = {28, 28};  (* MNIST image size *)
activation = Ramp;           (* elementwise activation used below *)
class = 10;                  (* shorthand for LinearLayer[10], one unit per digit *)

netEncoder = 
  NetEncoder[{"Image", {width, height}, ColorSpace -> "Grayscale"}];
netDecoder = 
  NetDecoder[{"Class", {"0", "1", "2", "3", "4", "5", "6", "7", "8", 
     "9"}}];
conv1 = ConvolutionLayer[20, {5, 5}, "PaddingSize" -> 0, "Stride" -> 1];
conv2 = ConvolutionLayer[50, {5, 5}, "PaddingSize" -> 0, "Stride" -> 1];
block1 = {BatchNormalizationLayer[], conv1, activation, 
   PoolingLayer[2, 2], conv2, activation, PoolingLayer[2, 2], 
   FlattenLayer[], 500, activation, class, SoftmaxLayer[]};

leNet = NetChain[block1, "Output" -> netDecoder, "Input" -> netEncoder]

I am getting the following errors:

NetTrain::invploss2: Provided loss layer ContrastiveLossLayer[2.5,...], which expects a number, is incompatible with "Output" port, which produces a class.
First::normal: Nonatomic expression expected at position 1 in First[$Failed].
Rest::normal: Nonatomic expression expected at position 1 in Rest[$Failed].

I understand that it has something to do with the NetDecoder options I am using, but I am unable to figure out how to use contrastive loss while still telling the net about the different classes present in the data.
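
From the invploss2 message, my understanding is that ContrastiveLossLayer compares a number (a distance between a pair of embeddings) against a similar/dissimilar target, so it cannot sit on top of a "Class" output; it seems meant for a Siamese-style setup. Is something like the following NetGraph the intended usage? Everything here is my guess: embed, the distance computation, the port wiring, and the convention that "Target" is 1 for a similar pair and 0 for a dissimilar one.

(* feature extractor: like leNet but ending in a plain vector, no softmax/decoder *)
embed = NetChain[{ConvolutionLayer[20, {5, 5}], Ramp, PoolingLayer[2, 2],
    FlattenLayer[], 64}, "Input" -> netEncoder];
shared = NetInsertSharedArrays[embed];  (* so both branches share weights *)
siamese = NetGraph[
  {shared, shared,
   ThreadingLayer[Subtract], ElementwiseLayer[#^2 &],
   AggregationLayer[Total, 1], ElementwiseLayer[Sqrt],  (* Euclidean distance *)
   ContrastiveLossLayer[2.5]},
  {NetPort["Input1"] -> 1, NetPort["Input2"] -> 2,
   {1, 2} -> 3 -> 4 -> 5 -> 6 -> NetPort[7, "Input"],
   NetPort["Target"] -> NetPort[7, "Target"]}]

Training would then presumably take pairs, something like NetTrain[siamese, <|"Input1" -> imgs1, "Input2" -> imgs2, "Target" -> flags|>], rather than image -> class examples, but I am not sure how the ten digit classes fit into this picture.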

3) I would like to implement the maxout activation function for my network; my rough attempt is below, and I would like some help getting the definition right.
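
As far as I understand it, maxout takes an elementwise maximum over several separate linear transformations of the input. Here is my sketch of a two-piece maxout block as a NetGraph (maxout is my own name; I do not know if this is the idiomatic way to build it):

maxout[n_] := NetGraph[
  {LinearLayer[n], LinearLayer[n], ThreadingLayer[Max]},
  {NetPort["Input"] -> 1, NetPort["Input"] -> 2, {1, 2} -> 3}]

The plan would be to replace 500, activation in block1 with maxout[500], since maxout provides its own nonlinearity, but I would appreciate corrections.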

4) I would also like to implement LSUV (layer-sequential unit-variance) initialization; my unfinished attempt is below, and I request some help in order to implement it successfully.
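
My understanding of LSUV (Mishkin and Matas, "All you need is a good init") is: start from orthonormal weights, then walk through the layers in order and rescale each layer's weights so that its output over a sample batch has roughly unit variance. This is how far I got; lsuv is my own name, the single rescaling pass (the paper iterates until the variance is within a tolerance) is a simplification, and I am not sure I am using NetTake, NetExtract and NetReplacePart correctly:

lsuv[net0_, batch_, layerIndices_List] := Module[{net = net0, out, sd},
  Do[
   out = NetTake[net, i][batch];  (* forward pass through layers 1..i *)
   sd = StandardDeviation[N@Flatten[out]];
   net = NetReplacePart[net, 
     {i, "Weights"} -> Normal[NetExtract[net, {i, "Weights"}]]/sd],
   {i, layerIndices}];
  net]

(* intended usage on leNet: orthogonal pre-init, then rescale the
   convolution and linear layers (indices 2, 5, 9, 11 in block1) *)
(* lsuv[NetInitialize[leNet, Method -> "Orthogonal"], sampleBatch, {2, 5, 9, 11}] *)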

Thanks

POSTED BY: Ashish Sharma