Message Boards


How is the loss computed from the batch and round?

Posted 6 days ago


I am trying to reproduce the loss computed while training a very simple network. In this case the loss is just the standard L2 loss function. I have also attached the complete notebook.

Start with some data to learn a sine function:

trainingData = Table[x -> Sin[x], {x, 0, 2 Pi, 2 Pi/100.0}];

Next, create a very simple network:

chain = NetChain[{10, Tanh, 1}];

Normally, I would train it just like this:

result = NetTrain[chain, trainingData, All]

But to reproduce the loss, I need to know which batch data was used at each iteration. So let's also save that:

lastBatchIn = {};
lastBatchOut = {};
lastBatchLossList = None;

appendBatch = (
    (* Save the batch data from this update *)
    AppendTo[lastBatchIn, Normal[#BatchData["Input"]]];
    AppendTo[lastBatchOut, Normal[#BatchData["Output"]]];
    If[Length[lastBatchIn] > 2,
     (* Keep only the last two batches *)
     lastBatchIn = lastBatchIn[[-2 ;;]];
     lastBatchOut = lastBatchOut[[-2 ;;]];
     (* Save the last two losses for these batches *)
     lastBatchLossList = #BatchLossList[[-2 ;;]]
    ];
    ) &;

result = NetTrain[chain, trainingData, All, 
  TrainingProgressFunction -> appendBatch]

Here I have created a function that saves the #BatchData and the #BatchLossList for the last two batches. The factor of 2 comes from observing that training uses 2 batches per round.
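As a sanity check on this bookkeeping, here is a minimal Python sketch (Python only for illustration; the buffer logic mirrors `appendBatch`, and the two final loss values are the ones reported further down in the post — the earlier values are made up):

```python
# Keep only the two most recent batch losses, then average them into a round
# loss, assuming two batches per round as observed in the post.
last_two = []

for loss in [1.0e-5, 8.0e-6, 4.61266e-6, 7.50198e-6]:  # per-batch losses
    last_two.append(loss)
    if len(last_two) > 2:
        last_two = last_two[-2:]  # trim the buffer, like lastBatchIn[[-2 ;;]]

round_loss = sum(last_two) / len(last_two)  # mean over the round's 2 batches
print(round_loss)  # matches the reported round loss of 6.05732*10^-6
```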

Now to the question:

  • The loss of the last round is reported as this:

    Print["Last round loss reported: ", result["RoundLoss"]]
    (* Last round loss reported: 6.05732*10^-6 *)
  • I can reproduce this from the stored loss list for each batch. Since every round contains two batches, I average the two:

    Print["Mean round loss recalculated: ", Mean[lastBatchLossList], 
      " from loss of the last 2 batches: ", lastBatchLossList];
    (* Mean round loss recalculated: 6.05732*10^-6 from loss of the last 2 batches: {4.61266*10^-6,7.50198*10^-6} *)
  • Great! Now I want to recompute it by feeding the actual batch data into the network:

    trained = result["TrainedNet"];
    mb1 = Mean[(trained[lastBatchIn[[1]]] - lastBatchOut[[1]])^2];
    mb2 = Mean[(trained[lastBatchIn[[2]]] - lastBatchOut[[2]])^2];
    Print["Mean round loss recomputed: ", 0.5*(mb1 + mb2), 
      " from last 2 batches: ", mb1, " ", mb2];
    (* Mean round loss recomputed: 5.77591*10^-6 from last 2 batches: 5.35384*10^-6 6.19799*10^-6 *)

It is definitely not the same. It is close, but I want to figure out how to obtain exactly the same value. How can I reproduce the loss exactly with the trained network?
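For comparison, one plausible source of such a discrepancy (an assumption on my part, not something established above) is that each batch loss is recorded with the weights as they were when that batch was processed, whereas the trained net carries the final weights after further updates. A minimal numpy sketch of a single SGD step on a one-parameter least-squares model shows how the two numbers can differ:

```python
import numpy as np

# Toy model: predict y = w * x with L2 loss. The "recorded" loss uses the
# weights at the time the batch arrives; the "recomputed" loss re-evaluates
# the same batch with the weights after one more gradient update.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
y = 2.0 * x                                # targets from the true w = 2

w = 1.5                                    # weights when the batch arrives
loss_recorded = np.mean((w * x - y) ** 2)  # analogous to #BatchLossList

grad = np.mean(2 * (w * x - y) * x)        # dL/dw of the L2 loss
w_final = w - 0.1 * grad                   # one SGD step -> "trained" weights

loss_recomputed = np.mean((w_final * x - y) ** 2)  # analogous to trained[...]
print(loss_recorded, loss_recomputed)      # close, but not identical
```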

