I'm working on training a neural network on an image dataset. There are 14k images, each of size 3×150×150. I have built a generator function following the approach in the "Training on Large Datasets" tutorial for out-of-core training. According to that reference:
The first approach is for users to write a generator function f that, when evaluated, can load a single batch of data from an external source such as a disk or database. NetTrain[net,f,…] calls f at each training batch iteration, thus only keeping a single batch of training data in memory.
I use the generator function from that documentation:
genTrain = Function[RandomSample[trainingData, #BatchSize]]
Its output is as expected.
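For example, evaluating it by hand with the kind of association NetTrain passes to the generator (which, to my understanding, includes at least the "BatchSize" key; trainingData here is my in-memory list of input -> label rules) returns a batch-sized random sample:

genTrain[<|"BatchSize" -> 2|>]
(* a list of 2 randomly chosen input -> label rules from trainingData *)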
However, the memory consumption of NetTrain keeps growing with every round. I use a small "RoundLength" here to avoid exhausting memory:
trainedNet = NetTrain[lenet, {genTrain, "RoundLength" -> 280},
TargetDevice -> "GPU", MaxTrainingRounds -> 50, BatchSize -> 64,
WorkingPrecision -> "Mixed", PerformanceGoal -> "TrainingMemory"]
Here is the memory usage as reported by the task manager: it keeps increasing until the end of training.
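To watch this from inside the kernel as well, one could log MemoryInUse[] once per round via TrainingProgressFunction (a rough sketch; note that MemoryInUse[] measures only the Wolfram kernel, so it will not necessarily match the task-manager numbers, which also include allocations made outside the kernel):

NetTrain[lenet, {genTrain, "RoundLength" -> 280},
 TrainingProgressFunction -> {Print["round ", #Round, ": ", MemoryInUse[]] &,
   "Interval" -> Quantity[1, "Rounds"]},
 TargetDevice -> "GPU", MaxTrainingRounds -> 50, BatchSize -> 64]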
I've tried setting $HistoryLength = 0, and also localizing the generator as

Module[{x = Function[RandomSample[trainingData, #BatchSize]]}, x]

but neither changed anything.
My questions are:
Am I understanding out-of-core training correctly? NetTrain should load a single batch of data and free it after use, shouldn't it? Or is there a bug?
What can I do to decrease memory consumption, for example, freeing the memory after every round or batch?
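For concreteness, the kind of per-round cleanup I had in mind looks like this (just a sketch: I'm assuming TrainingProgressFunction is the right hook here, and I don't know whether ClearSystemCache[] actually releases the memory that is growing):

trainedNet = NetTrain[lenet, {genTrain, "RoundLength" -> 280},
 TrainingProgressFunction -> {ClearSystemCache[] &,
   "Interval" -> Quantity[1, "Rounds"]},
 TargetDevice -> "GPU", MaxTrainingRounds -> 50, BatchSize -> 64,
 WorkingPrecision -> "Mixed", PerformanceGoal -> "TrainingMemory"]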