
Train large deep learning NN in true batch mode?


When I am training a DNN (Deep Neural Network) a typical command is:

  NetTrain[net, (* "net" stands for the network being trained *)
   TrainSet, {"TrainedNet", "LossEvolutionPlot", 
    "RMSWeightEvolutionPlot", "RMSGradientEvolutionPlot", 
    "TotalTrainingTime", "MeanBatchesPerSecond", 
    "MeanInputsPerSecond", "BatchLossList", "RoundLossList",  
    "ValidationLossList"}, ValidationSet -> Scaled[0.2], 
   Method -> {"SGD", "Momentum" -> 0.95}, 
   TrainingProgressReporting -> "Print", MaxTrainingRounds -> 5, 
   BatchSize -> 256];

This works fine for smaller training sets, but it eventually fails (even on my 32GB iMac) once the training sets get truly large (>100K images).

How can I use NetTrain[ ] so that it does not require the full training set (and validation set) to be loaded as an in-memory object (for example, TrainSet)?

Ideally I want to have these image files in folders, where the folder name gives the "tag". NetTrain[ ] would then pull the files it needs from these folders during training, in a way that does not destroy computer performance.

Is this a DIY project?

Any help on this critical issue is appreciated.

POSTED BY: Bryan Minor
4 months ago

Ideally I want to have these image files in folders

If you are dealing with image files, then there is a nice solution: instead of passing the images themselves, pass the file name of each image (and try to use JPG: this is the fastest out-of-core format). So: NetTrain[{File[...] -> "Class1", ...}, ...]
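
For the folder-per-class layout described in the question, the File[...] -> class list could be built roughly as follows. This is only a sketch under assumptions: the root path and the name "net" are placeholders, the class label is taken from the name of the folder containing each file, and the network is assumed to have an "Image" NetEncoder on its input and a "Class" NetDecoder on its output so that it can accept File[...] inputs.

  (* Hypothetical root folder with one subfolder per class, e.g. root/cat/001.jpg *)
  root = "/path/to/images";

  (* Collect JPEG files in root and in subfolders up to two levels down *)
  files = FileNames["*.jpg", root, 2];

  (* Pair each file with the name of its containing folder; only paths are held in memory *)
  data = File[#] -> FileNameSplit[#][[-2]] & /@ files;

  (* "net" is a placeholder for a network with an image encoder attached *)
  trained = NetTrain[net, data, ValidationSet -> Scaled[0.2], 
    MaxTrainingRounds -> 5, BatchSize -> 256];

Because data holds only file paths and labels, its memory footprint stays small even with more than 100K images; the images themselves are read from disk as batches are assembled.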

4 months ago
