How is the loss computed from the batch and round?

Posted 3 years ago

Hello,

I am just trying to reproduce the loss computed while training a very simple network. In this case the loss is just the standard L2 loss function. I have also attached the complete notebook.
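By "standard L2 loss" I mean the mean squared error; as far as I understand, this is the MeanSquaredLossLayer that NetTrain attaches automatically when the output is real-valued. A quick check of what that layer computes (the numbers here are just illustrative):

MeanSquaredLossLayer[][<|"Input" -> {0.1}, "Target" -> {0.2}|>]
(* 0.01, i.e. the mean of the squared differences *)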

Start with some data to learn a sin function:

trainingData = Table[x -> Sin[x], {x, 0, 2 Pi, 2 Pi/100.0}];

Next, create a very simple network:

chain = NetChain[{10, Tanh, 1}];

Normally, I would train it just like this:

result = NetTrain[chain, trainingData, All]
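With the All argument, NetTrain returns a NetTrainResultsObject, and the quantities I use below can be queried from it, for example (these should all be among its documented properties, as far as I can tell):

result["Properties"]      (* lists the available properties *)
result["RoundLoss"]       (* loss reported for the last round *)
result["BatchLossList"]   (* loss of every batch seen during training *)
result["TrainedNet"]      (* the final trained network *)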

But to reproduce the loss, I need to know which batch data was used at each iteration. So let's also save that:

lastBatchIn = {};
lastBatchOut = {};
lastBatchLossList = None;

appendBatch = Block[{},
    (* Save the last two batches of data *)
    AppendTo[lastBatchIn, Normal[#BatchData["Input"]]];
    AppendTo[lastBatchOut, Normal[#BatchData["Output"]]];
    If[Length[lastBatchIn] > 2,
     lastBatchIn = lastBatchIn[[-2 ;;]];
     lastBatchOut = lastBatchOut[[-2 ;;]];
     ];
    (* Save the losses for these last two batches *)
    lastBatchLossList = #BatchLossList[[-2 ;;]];
    ] &;

result = NetTrain[chain, trainingData, All, 
  TrainingProgressFunction -> appendBatch]

Here I have created a function that saves #BatchData and #BatchLossList for the last two batches. The factor of 2 comes from the fact that, as far as I can see, training uses 2 batches per round.
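As a quick sanity check on that factor (assuming the batch size resolves to NetTrain's usual default of 64): with 101 training examples, each round consists of Ceiling[101/64] = 2 batches.

Length[trainingData]              (* 101 training examples *)
Ceiling[Length[trainingData]/64]  (* 2 batches per round, assuming a batch size of 64 *)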

Now to the question:

  • The loss of the last round is reported as follows:

    Print["Last round loss reported: ", result["RoundLoss"]]
    (* Last round loss reported: 6.05732*10^-6 *)
    
  • I can reproduce this from the stored loss list for each batch. Since every round contains two batches, I average the two:

    Print["Mean round loss recalculated: ", Mean[lastBatchLossList], 
      " from loss of the last 2 batches: ", lastBatchLossList];
    (* Mean round loss recalculated: 6.05732*10^-6 from loss of the last 2 batches: {4.61266*10^-6,7.50198*10^-6} *)
    
  • Great! Now I want to recompute it by feeding the actual batch data into the network:

    trained = result["TrainedNet"];
    mb1 = Mean[(trained[lastBatchIn[[1]]] - lastBatchOut[[1]])^2];
    mb2 = Mean[(trained[lastBatchIn[[2]]] - lastBatchOut[[2]])^2];
    Print["Mean round loss recomputed: ", 0.5*(mb1 + mb2), 
      " from last 2 batches: ", mb1, " ", mb2];
    (* Mean round loss recomputed: 5.77591*10^-6 from last 2 batches: 5.35384*10^-6 6.19799*10^-6 *)
    

The values are close, but definitely not the same, and I want to figure out how to get exactly the same result. How can I reproduce the reported loss exactly with the trained network?

Thanks!

POSTED BY: Oliver Ernst

The thing is that the batch losses during training were NOT computed with the final trained net, but with a partially trained network, which is different for every training batch.

You can get the partially trained network using the association key #Net in the function passed to TrainingProgressFunction.

To emphasize: the net is updated after every training batch.
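As a sketch of the bookkeeping (untested, with illustrative names like appendBatchAndNet and lastNets; whether #Net is captured before or after the weight update for a given batch is worth checking), you could extend the progress function above to also store a snapshot of the net for each batch, and then recompute each batch loss with the net that was actually in use:

lastBatchIn = {};
lastBatchOut = {};
lastNets = {};
lastBatchLossList = None;

appendBatchAndNet = Block[{},
    (* Save the batch data, the batch losses, and a snapshot of the
       partially trained net, keeping only the last two of each *)
    AppendTo[lastBatchIn, Normal[#BatchData["Input"]]];
    AppendTo[lastBatchOut, Normal[#BatchData["Output"]]];
    AppendTo[lastNets, #Net];
    If[Length[lastBatchIn] > 2,
     lastBatchIn = lastBatchIn[[-2 ;;]];
     lastBatchOut = lastBatchOut[[-2 ;;]];
     lastNets = lastNets[[-2 ;;]];
     ];
    lastBatchLossList = #BatchLossList[[-2 ;;]];
    ] &;

result = NetTrain[chain, trainingData, All,
  TrainingProgressFunction -> appendBatchAndNet];

(* Recompute each batch loss with the net as it was for that batch *)
mb1 = Mean[(lastNets[[1]][lastBatchIn[[1]]] - lastBatchOut[[1]])^2];
mb2 = Mean[(lastNets[[2]][lastBatchIn[[2]]] - lastBatchOut[[2]])^2];
Print["Mean round loss recomputed with per-batch nets: ", 0.5*(mb1 + mb2)];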
