I'm working on an image classification machine learning project and have a somewhat large (~800 MiB) training data set. The data set is composed of individual PNG image files organized in directories. Upon trying to import them all
loadedData = ParallelMap[Import, dataFiles]
It takes an extremely long time (10+ minutes), and ends up using 21 GiB of memory. Obviously, something seems to be wrong. Is there a more efficient or more correct way to be loading all these images?