NetTrain exceeds size of dataset when using a generator function

Posted 2 years ago

Hi wiesenekker, I'm not an expert, but look at this part:

trainfeatures[[(#Round - 1)*24*#BatchSize + 1 ;; #Round*24*#BatchSize]]

At round 100 with #BatchSize = 10, the slice starts at trainfeatures[[99*24*10 + 1]] = trainfeatures[[23761]], while Length[trainfeatures] is only 2400. So indexing by #Round runs past the end of the dataset; I think that is the error. Bye
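A quick check of the arithmetic makes the overflow concrete (an illustrative snippet, assuming round 100 is reached with a batch size of 10 and 24 samples per pack, as in the post):

```mathematica
(* at round 100 the generator's slice would start far past
   the end of a 2400-element dataset *)
With[{round = 100, batchSize = 10, npack = 24},
 (round - 1)*npack*batchSize + 1]
(* 23761, but Length[trainfeatures] is 2400 *)
```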

POSTED BY: Mauro Bertani

The following generator function randomly selects a batch rather than relying on the 'Round' number:

batchsize = 10;
nbatches = Length[trainfeatures]/(npack*batchsize);
genTrain =
 Function[
  (* pick a random batch instead of indexing by #Round *)
  With[{ibatch = RandomInteger[{1, nbatches}]},
   <|"Input" ->
      Map[Flatten,
       IntegerDigits[
        ArrayReshape[
         Normal[trainfeatures[[(ibatch - 1)*24*#BatchSize + 1 ;;
             ibatch*24*#BatchSize]]], {#BatchSize, 24}], 2, 8]],
    "Output" ->
     trainlabels[[(ibatch - 1)*#BatchSize + 1 ;;
        ibatch*#BatchSize]]|>]]
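To sanity-check the generator outside NetTrain, you can call it by hand, since NetTrain supplies it with an association containing keys such as "BatchSize" and "Round" (a hypothetical manual call; the expected dimensions assume the 24-element, 8-bit encoding above):

```mathematica
(* manual call; during training NetTrain supplies this association itself *)
batch = genTrain[<|"BatchSize" -> batchsize, "Round" -> 1|>];
Dimensions[batch["Input"]]  (* should be {batchsize, 24*8} after Flatten *)
Length[batch["Output"]]     (* should be batchsize *)
```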

With this generator function the error is gone.

Regards, GW

Also, 'RoundLength' turns out to be 'the number of samples that is expected to be seen during a Round (epoch)', so it should be equal to the number of training samples, 'ntrain' in this case:

trained = 
 NetTrain[net, {genTrain, "RoundLength" -> ntrain}, All, 
  BatchSize -> batchsize, TargetDevice -> "GPU"]

With 'ntrain' set to 100000000 and 'batchsize' to 65536 you get 1525 batches per round. The training progress panel shows 'round 1/10, batch x/1525', as expected.
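The batch count follows from integer division of the round length by the batch size (illustrative arithmetic, not part of the original post):

```mathematica
(* 100000000 samples per round at a batch size of 65536 *)
Quotient[100000000, 65536]  (* 1525 *)
```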

Regards, GW
