Changing value of a NetArray or a NetArrayLayer during training?

Posted 2 months ago

I am trying to code a variant of the Proximal Policy Optimization (PPO) algorithm (reinforcement learning) in Mathematica. To train the network, I need to change the coefficient, beta, of one of the loss terms dynamically after each batch: sometimes the value of beta should be doubled and sometimes halved.
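For concreteness, the doubling/halving rule can be sketched as a small pure function. This is only an illustration: the name `updateBeta`, the arguments `kl` and `klTarget`, and the factor 1.5 are assumptions borrowed from the adaptive-KL variant in the original PPO paper (Schulman et al., 2017), and the variant described here may use different thresholds.

```mathematica
(* Hypothetical helper, not part of any Wolfram API.          *)
(* kl       : KL divergence measured on the latest batch      *)
(* klTarget : desired KL divergence (assumed hyperparameter)  *)
updateBeta[beta_, kl_, klTarget_] := Which[
  kl < klTarget/1.5, beta/2,   (* policy moved too little: relax the penalty *)
  kl > 1.5 klTarget, 2 beta,   (* policy moved too much: tighten the penalty *)
  True, beta                   (* otherwise leave beta unchanged *)
]
```

For example, `updateBeta[0.01, 0.05, 0.01]` doubles beta to 0.02, because the measured KL is more than 1.5 times the target.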

Is there ANY way to do that in Mathematica?

The only way(?) that comes to mind is to use the TrainingProgressFunction, which is called after each batch: get hold of #TrainingNet, or whichever network is being trained, and then manually set the NetArray associated with beta to whatever value I want for the next round.

Unfortunately, functions like NetExtract, NetTake, and NetReplacePart all create new copies of the net, so they are no help here. I need to update the very net that is being trained, without copying it. To keep beta from being trained, the LearningRateMultipliers for the NetArrayLayer would have to be set to None.

Any information or guidance is very much appreciated.

2 Replies

Welcome to Wolfram Community!
Please make sure you know the rules: https://wolfr.am/READ-1ST
Please provide example code so it is clear exactly what you are looking for.

Posted 2 months ago

Sorry if my post was unclear or if it appeared that I am bluntly asking for help. Perhaps copying the code here is going to make my already confusing question even more confusing. I have spent a ton of time searching online and reading through the Mathematica documentation pages. I was hoping that someone here could either tell me that the feature I am looking for does not exist in Mathematica or give me a pointer.

Some reinforcement learning methods, such as TRPO or PPO, use minimization (maximization) of some entropy or Kullback-Leibler divergence. See, for example, equations 2b and 2c in this recent paper:

Hsu, Chloe Ching-Yun, Celestine Mendler-Dünner, and Moritz Hardt. "Revisiting Design Choices in Proximal Policy Optimization." arXiv preprint arXiv:2009.10897 (2020).

I want to be able to dynamically update the scaling factor of a specific loss term during training. Below is how I train my network. Note that the loss functions for different parts of the network are weighted separately using Scaled, which lets me scale individual losses by different factors. For example, a factor of -1 for clipLoss maximizes that term, and a factor of 1.0 for valueFunctionLoss minimizes the value-function loss.

So if I need a different scaling factor for the network's KL divergence loss, klForwardLoss, say 0.01, I use Scaled[0.01].

But what if I want to update beta during training, say starting at 0.01 and then slowly increasing it to 0.1? Is this possible?

resultNet = NetTrain[
  net,
  ppoSampler[#Net, #BatchSize] &,
  All,
  LossFunction -> {
    "clipLoss" -> Scaled[-1.0],
    "valueFunctionLoss" -> Scaled[1.0],
    "klForwardLoss" -> Scaled[beta]  (* (1) <------- BETA: can it be updated here? *)
  },
  Method -> "RMSProp",
  BatchSize -> 32,
  MaxTrainingRounds -> 20000,
  LearningRate -> 0.00025,
  TrainingUpdateSchedule -> {"policy", "value"},
  WorkingPrecision -> "Real64",
  TrainingProgressFunction -> Function[
    (* (2) Can beta be updated here? *)
    (* access to the network is provided through #Net *)
  ]
]
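
One possible workaround, sketched here under assumptions rather than offered as a confirmed answer, is to avoid changing beta inside a single NetTrain call and instead train in short chunks, passing a new `Scaled[beta]` to each call. The sketch reuses `net` and `ppoSampler` exactly as in the call above; `betaStart`, `betaEnd`, `chunks`, and `roundsPerChunk` are illustrative names and values that do not come from the post.

```mathematica
(* Sketch only: `net` and `ppoSampler` are the definitions used in the post.   *)
(* betaStart, betaEnd, chunks and roundsPerChunk are illustrative assumptions. *)
betaStart = 0.01; betaEnd = 0.1;
chunks = 20; roundsPerChunk = 1000;  (* 20 * 1000 = 20000 rounds in total *)

current = net;
Do[
  (* geometric interpolation: betaStart in the first chunk, betaEnd in the last *)
  beta = betaStart*(betaEnd/betaStart)^((k - 1)/(chunks - 1));
  (* without the third argument All, NetTrain returns the trained net itself *)
  current = NetTrain[
    current,
    ppoSampler[#Net, #BatchSize] &,
    LossFunction -> {
      "clipLoss" -> Scaled[-1.0],
      "valueFunctionLoss" -> Scaled[1.0],
      "klForwardLoss" -> Scaled[beta]},
    Method -> "RMSProp",
    BatchSize -> 32,
    MaxTrainingRounds -> roundsPerChunk,
    LearningRate -> 0.00025,
    TrainingUpdateSchedule -> {"policy", "value"},
    WorkingPrecision -> "Real64"],
  {k, chunks}];
resultNet = current;
```

The trade-off is that, as far as I know, each NetTrain call starts with a fresh optimizer, so the RMSProp running averages are reset at every chunk boundary, and there is some setup overhead per call. Updating beta after every batch this way would mean one NetTrain call per batch, which is likely too slow to be practical. Unlike the original call, `resultNet` here is the trained net rather than the full results object returned with All.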