How is the loss computed from the batch and round?

Posted 4 years ago

Hello,

I am trying to reproduce the loss computed while training a very simple network. In this case the loss is just the standard L2 loss function. The complete notebook is attached.

Start with some data for learning a sine function:

trainingData = Table[x -> Sin[x], {x, 0, 2 Pi, 2 Pi/100.0}];

Next, create a very simple network:

chain = NetChain[{10, Tanh, 1}];

Normally, I would train it just like this:

result = NetTrain[chain, trainingData, All]

But to reproduce the loss, I need to know which batch data was used at each iteration. So let's also save that:

lastBatchIn = {};
lastBatchOut = {};
lastBatchLossList = None;

appendBatch = Block[{},
    (* Save the data of the last two batches *)

    AppendTo[lastBatchIn, Normal[#BatchData["Input"]]];
    AppendTo[lastBatchOut, Normal[#BatchData["Output"]]];
    If[Length[lastBatchIn] > 2,
     lastBatchIn = lastBatchIn[[-2 ;;]];
     lastBatchOut = lastBatchOut[[-2 ;;]]
     ];

    (* Save the losses of the last two batches; Take avoids an
       out-of-range error at the first callback, when
       #BatchLossList has only one entry *)

    lastBatchLossList = Take[#BatchLossList, -Min[2, Length[#BatchLossList]]]
    ] &;

result = NetTrain[chain, trainingData, All, 
  TrainingProgressFunction -> appendBatch]

Here I have created a function that saves the #BatchData and the #BatchLossList for the last two batches. The factor 2 comes from the fact that I can see two batches are used in each round of training.

Now to the question:

  • The loss of the last round is reported as this:

    Print["Last round loss reported: ", result["RoundLoss"]]
    (* Last round loss reported: 6.05732*10^-6 *)
    
  • I can reproduce this from the stored loss list for each batch. Since every round contains two batches, I average the two:

    Print["Mean round loss recalculated: ", Mean[lastBatchLossList], 
      " from loss of the last 2 batches: ", lastBatchLossList];
    (* Mean round loss recalculated: 6.05732*10^-6 from loss of the last 2 batches: {4.61266*10^-6,7.50198*10^-6} *)
    
  • Great! Now I want to recompute it by feeding the actual batch data into the network:

    trained = result["TrainedNet"];
    mb1 = Mean[(trained[lastBatchIn[[1]]] - lastBatchOut[[1]])^2];
    mb2 = Mean[(trained[lastBatchIn[[2]]] - lastBatchOut[[2]])^2];
    Print["Mean round loss recomputed: ", 0.5*(mb1 + mb2), 
      " from last 2 batches: ", mb1, " ", mb2];
    (* Mean round loss recomputed: 5.77591*10^-6 from last 2 batches: 5.35384*10^-6 6.19799*10^-6 *)
    

It's definitely not the same. The values are close, but I want to get exactly the same result. How can I reproduce the loss exactly with the trained network?

Thanks!

POSTED BY: Oliver Ernst

The thing is that the batch losses during training were NOT computed with the final trained net, but with a partially trained network, which is different for every training batch.

You can get the partially trained network using the association key #Net in the function passed to TrainingProgressFunction.

To emphasize: the net is updated after every training batch.
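A minimal sketch of this idea, adapting the code from the question (the names records and saveNetAndBatch are mine, and whether #Net at callback time is the net before or after the current batch's update should be checked against the documentation; if it is the post-update net, a batch's loss should match a forward pass with the net saved at the previous callback):

records = {};
saveNetAndBatch = AppendTo[records,
    <|"Net" -> #Net,
      "In" -> Normal[#BatchData["Input"]],
      "Out" -> Normal[#BatchData["Output"]],
      "Loss" -> Last[#BatchLossList]|>] &;

result = NetTrain[chain, trainingData, All,
  TrainingProgressFunction -> saveNetAndBatch];

(* Compare the recorded loss of the last batch with a forward pass.
   If #Net is the post-update net, the matching net is the one
   recorded at the previous callback: *)
lastRec = records[[-1]];
prevNet = records[[-2, "Net"]];
Mean[(prevNet[lastRec["In"]] - lastRec["Out"])^2]
lastRec["Loss"]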
