How is the loss computed from the batch and round?

Posted 3 years ago

Hello,

I am just trying to reproduce the loss computed while training a very simple network. In this case the loss is just the standard L2 loss function. I have also attached the complete notebook.
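By "standard L2 loss" I mean the mean squared error; as far as I understand, this is the MeanSquaredLossLayer that NetTrain attaches automatically when the output is real-valued. A quick check of what that layer computes (the numbers here are just illustrative):

MeanSquaredLossLayer[][<|"Input" -> {0.1}, "Target" -> {0.2}|>]
(* 0.01, i.e. the mean of the squared differences *)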

Start with some data to learn a sin function:

trainingData = Table[x -> Sin[x], {x, 0, 2 Pi, 2 Pi/100.0}];

Next, create a very simple network:

chain = NetChain[{10, Tanh, 1}];

Normally, I would train it just like this:

result = NetTrain[chain, trainingData, All]
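With the All argument, NetTrain returns a NetTrainResultsObject, and the quantities I use below can be queried from it, for example (these should all be among its documented properties, as far as I can tell):

result["Properties"]      (* lists the available properties *)
result["RoundLoss"]       (* loss reported for the last round *)
result["BatchLossList"]   (* loss of every batch seen during training *)
result["TrainedNet"]      (* the final trained network *)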

But to reproduce the loss, I need to know which batch data was used at each iteration. So let's also save that:

lastBatchIn = {};
lastBatchOut = {};
lastBatchLossList = None;

appendBatch = Block[{},
    (* Save the last two batches of data *)
    AppendTo[lastBatchIn, Normal[#BatchData["Input"]]];
    AppendTo[lastBatchOut, Normal[#BatchData["Output"]]];
    If[Length[lastBatchIn] > 2,
     lastBatchIn = lastBatchIn[[-2 ;;]];
     lastBatchOut = lastBatchOut[[-2 ;;]];
     ];
    (* Save the losses for these last two batches *)
    lastBatchLossList = #BatchLossList[[-2 ;;]];
    ] &;

result = NetTrain[chain, trainingData, All, 
  TrainingProgressFunction -> appendBatch]

Here I have created a function that saves #BatchData and #BatchLossList for the last two batches. The factor of 2 comes from the fact that, as far as I can see, training uses 2 batches per round.
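As a quick sanity check on that factor (assuming the batch size resolves to NetTrain's usual default of 64): with 101 training examples, each round consists of Ceiling[101/64] = 2 batches.

Length[trainingData]              (* 101 training examples *)
Ceiling[Length[trainingData]/64]  (* 2 batches per round, assuming a batch size of 64 *)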

Now to the question:

  • The loss of the last round is reported as follows:

    Print["Last round loss reported: ", result["RoundLoss"]]
    (* Last round loss reported: 6.05732*10^-6 *)
    
  • I can reproduce this from the stored loss list for each batch. Since every round contains two batches, I average the two:

    Print["Mean round loss recalculated: ", Mean[lastBatchLossList], 
      " from loss of the last 2 batches: ", lastBatchLossList];
    (* Mean round loss recalculated: 6.05732*10^-6 from loss of the last 2 batches: {4.61266*10^-6,7.50198*10^-6} *)
    
  • Great! Now I want to recompute it by feeding the actual batch data into the network:

    trained = result["TrainedNet"];
    mb1 = Mean[(trained[lastBatchIn[[1]]] - lastBatchOut[[1]])^2];
    mb2 = Mean[(trained[lastBatchIn[[2]]] - lastBatchOut[[2]])^2];
    Print["Mean round loss recomputed: ", 0.5*(mb1 + mb2), 
      " from last 2 batches: ", mb1, " ", mb2];
    (* Mean round loss recomputed: 5.77591*10^-6 from last 2 batches: 5.35384*10^-6 6.19799*10^-6 *)
    

The values are close, but definitely not the same, and I want to figure out how to get exactly the same result. How can I reproduce the reported loss exactly with the trained network?

Thanks!

POSTED BY: Oliver Ernst

The thing is that the batch losses during training were NOT computed with the final trained net, but with a partially trained network, which is different for every training batch.

You can get the partially trained network using the association key #Net in the function passed to TrainingProgressFunction.

To emphasize: the net is updated after every training batch.
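As a sketch of the bookkeeping (untested, with illustrative names like appendBatchAndNet and lastNets; whether #Net is captured before or after the weight update for a given batch is worth checking), you could extend the progress function above to also store a snapshot of the net for each batch, and then recompute each batch loss with the net that was actually in use:

lastBatchIn = {};
lastBatchOut = {};
lastNets = {};
lastBatchLossList = None;

appendBatchAndNet = Block[{},
    (* Save the batch data, the batch losses, and a snapshot of the
       partially trained net, keeping only the last two of each *)
    AppendTo[lastBatchIn, Normal[#BatchData["Input"]]];
    AppendTo[lastBatchOut, Normal[#BatchData["Output"]]];
    AppendTo[lastNets, #Net];
    If[Length[lastBatchIn] > 2,
     lastBatchIn = lastBatchIn[[-2 ;;]];
     lastBatchOut = lastBatchOut[[-2 ;;]];
     lastNets = lastNets[[-2 ;;]];
     ];
    lastBatchLossList = #BatchLossList[[-2 ;;]];
    ] &;

result = NetTrain[chain, trainingData, All,
  TrainingProgressFunction -> appendBatchAndNet];

(* Recompute each batch loss with the net as it was for that batch *)
mb1 = Mean[(lastNets[[1]][lastBatchIn[[1]]] - lastBatchOut[[1]])^2];
mb2 = Mean[(lastNets[[2]][lastBatchIn[[2]]] - lastBatchOut[[2]])^2];
Print["Mean round loss recomputed with per-batch nets: ", 0.5*(mb1 + mb2)];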
