Changing value of a NetArray or a NetArrayLayer during training?

Ethan H.

Posted 5 years ago

I am trying to implement a variant of the Proximal Policy Optimization (PPO) reinforcement learning algorithm in Mathematica. To train the network, I need to change the coefficient, beta, of one of the loss terms dynamically after each batch: sometimes beta should be doubled and sometimes halved. Is there ANY way to do that in Mathematica?

The only approach that comes to mind is this: when the TrainingProgressFunction is called after each batch, I get hold of #TrainingNet, or the network that is being trained, and manually set the NetArray associated with beta to whatever value I want for the next round. Unfortunately, functions like NetExtract, NetTake, and NetReplacePart all create new copies of the net, so they are no help here. I need to update the very net that is being trained, without copying it. To keep beta from being trained, the LearningRateMultipliers of the NetArrayLayer must be set to None.

Any information or guidance is very much appreciated.
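For concreteness, the doubling/halving rule mentioned above might look like the sketch below. The post does not state when beta should be doubled or halved. The thresholds used here (a factor of 1.5 around a target KL value) are an assumption, borrowed from the adaptive KL penalty variant in the original PPO paper (Schulman et al., 2017). The names `updateBeta`, `kl`, and `klTarget` are hypothetical and do not come from the post.

```mathematica
(* Hypothetical helper, not part of the original post. *)
(* Assumed rule: compare the measured KL divergence with a target value *)
(* and halve or double beta when it falls outside a factor of 1.5. *)
updateBeta[beta_?NumericQ, kl_?NumericQ, klTarget_?NumericQ] :=
  Which[
    kl < klTarget/1.5, beta/2,  (* policy changed too little: relax the penalty *)
    kl > 1.5 klTarget, 2 beta,  (* policy changed too much: strengthen the penalty *)
    True, beta                  (* within tolerance: leave beta unchanged *)
  ]

updateBeta[0.01, 0.05, 0.01]   (* KL above 1.5*target, so beta doubles: 0.02 *)
updateBeta[0.01, 0.001, 0.01]  (* KL below target/1.5, so beta halves: 0.005 *)
updateBeta[0.01, 0.01, 0.01]   (* KL on target, so beta stays at 0.01 *)
```

This only computes the next value of beta. How to get that value into the net while NetTrain is running is exactly the open question of this post.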

POSTED BY: Ethan H.

2 Replies

Ethan H.

Posted 5 years ago

Sorry if my post was unclear, or if it seemed that I was bluntly asking for help. Perhaps copying the code here will make my already confusing question even more confusing. I have spent a lot of time searching online and reading through the Mathematica documentation pages. I was hoping that someone here could either tell me that the feature I am looking for does not exist in Mathematica or give me a pointer.

Some reinforcement learning methods, such as TRPO and PPO, minimize (or maximize) an entropy or Kullback-Leibler divergence term; see, for example, equations 2b and 2c in this recent paper: Hsu, Chloe Ching-Yun, Celestine Mendler-Dünner, and Moritz Hardt. "Revisiting Design Choices in Proximal Policy Optimization." arXiv preprint arXiv:2009.10897 (2020). I want to be able to dynamically update the scaling factor of a specific loss term during training.

Below is how I train my network. Note that the losses for different parts of the network are weighted separately using Scaled, which lets me scale individual losses by different factors. For example, a factor of -1 on clipLoss maximizes that term, and a factor of 1.0 on valueFunctionLoss minimizes the value-function loss. So if I need a different scaling factor for the KL divergence loss of the network, klForwardLoss, say 0.01, I use Scaled[0.01]. But what if I want to change beta during training, say start at 0.01 and then slowly raise it to 0.1? Is this possible?

    resultNet = NetTrain[
      net,
      ppoSampler[#Net, #BatchSize] &,
      All,
      LossFunction -> {
        "clipLoss" -> Scaled[-1.0],
        "valueFunctionLoss" -> Scaled[1.0],
        "klForwardLoss" -> Scaled[beta]  (* (1) <------- BETA: can it be updated here? *)
      },
      Method -> "RMSProp",
      BatchSize -> 32,
      MaxTrainingRounds -> 20000,
      LearningRate -> 0.00025,
      TrainingUpdateSchedule -> {"policy", "value"},
      WorkingPrecision -> "Real64",
      TrainingProgressFunction -> Function[
        (* (2) Can beta be updated here? *)
        (* access to the network is provided through #Net *)
      ]
    ]
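Written out, the loss specification above amounts to the weighted sum below. This assumes NetTrain combines the scaled loss ports additively, and it ignores the alternation introduced by TrainingUpdateSchedule. The symbols are shorthand introduced here for the three output ports clipLoss, valueFunctionLoss, and klForwardLoss; they are not notation from the post or the cited paper.

```latex
\mathcal{L}_{\text{total}}
  = -1.0\,\mathcal{L}_{\text{clip}}
  + 1.0\,\mathcal{L}_{\text{value}}
  + \beta\,\mathcal{L}_{\text{KL}}
```

The question is whether the coefficient \beta, fixed at 0.01 when NetTrain is called, can be moved toward 0.1 while training is in progress.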

POSTED BY: Ethan H.

EDITORIAL BOARD

EDITORIAL BOARD, WOLFRAM

Posted 5 years ago

Welcome to Wolfram Community! Please make sure you know the rules: https://wolfr.am/READ-1ST Please provide example code so it is clear exactly what you are looking for.

POSTED BY: EDITORIAL BOARD
