Deal with memory problems when using Block or Module?

Posted 8 years ago

Hi everyone,

I often have to write code that processes the text in many files and then exports the results for each as a file. It usually looks something like this:

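myProcess = Block[{text, results, filePath},
  Do[
    text = Import[thisFile];            (* read one input file *)
    results = stuffDoneToText[text];    (* process its text *)
    Export[filePath, results, "CSV"],   (* write the results; filePath is set elsewhere *)
    {thisFile, theseFiles}]]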

The files are small, but very often the procedure slows to a crawl. It might be after eight files or after 128. When I get fed up and abort, I get the spinning beachball on my Mac telling me that it cannot handle Mathematica's demand for memory. I don't understand why. I figure that the text and results variables are overwritten with each iteration of the Do loop, so where is the data accumulating to consume memory? Does it have to do with my use of Block? I have had the same problem with Module. I'm wondering whether I should avoid both and stick with global variables instead.

Any insights would be much appreciated,

Gregory

POSTED BY: Gregory Lypny
4 Replies

Try to set:

 $HistoryLength = 2;

By default it is set to Infinity, meaning the kernel keeps every Out[...]. With the above setting it keeps only the last two outputs. I work with multi-GB arrays, and if I don't set this, after a few manipulations of the big matrix I have several copies in my Out[] and Mathematica consumes all the memory...
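To check whether the output history is really what is eating the memory, one option is to compare kernel memory before and after a run. This is only a minimal sketch: the names `before` and `after` are placeholders, and the loop itself is elided.

```wolfram
$HistoryLength = 2;       (* keep only the last two In/Out entries *)

before = MemoryInUse[];   (* bytes currently used by the kernel *)

(* ... run the file-processing loop here ... *)

after = MemoryInUse[];
Print["Kernel memory growth (bytes): ", after - before]
```

If the growth stays large even with a short history, the memory is probably being held somewhere other than Out[].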

POSTED BY: Sander Huisman
Posted 8 years ago

Hi Sander,

Thanks for the tip! I put the statement at the beginning of my notebook. It did help, although not as much as I expected. If the average time to process a file by itself is 10 seconds, it takes 10 minutes or more per file when I am doing it inside of a Do loop wrapped in Block. I think that Mathematica must be gobbling up memory in some other way.

Regards,

Gregory

POSTED BY: Gregory Lypny
Posted 8 years ago

Hi again Sander,

I added

Module[{}, Unprotect[In, Out]; Clear[In, Out]; Protect[In, Out]; ClearSystemCache[]; ];

which I found on the StackOverflow website. It speeds things up. It is not ideal, but in conjunction with your tip about $HistoryLength, it is better than before.

My procedure makes repeated use of string patterns, and I have read that string pattern functions can become arbitrarily slow if used repeatedly as part of a procedure. Do you have any tips on keeping string pattern matching from slowing down?

Regards,

Gregory

POSTED BY: Gregory Lypny

Perhaps you can dig a little deeper into your stuffDoneToText function and store the timings for each subfunction, to narrow down what the culprit is. Without seeing the entire program it will be hard for any of us to guess ;)

Clearing In/Out should have much the same effect as setting $HistoryLength...
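As a rough sketch of what per-subfunction timing could look like: `stepA`, `stepB`, and `timedStuff` below are hypothetical stand-ins, since the real contents of stuffDoneToText are not shown in this thread.

```wolfram
(* Hypothetical sub-steps; replace with the real pieces of stuffDoneToText *)
stepA[text_] := StringSplit[text, "\n"];
stepB[lines_] := StringCases[lines, DigitCharacter ..];

(* Time each step separately and report memory use after every file *)
timedStuff[text_] := Module[{tA, tB, a, b},
  {tA, a} = AbsoluteTiming[stepA[text]];
  {tB, b} = AbsoluteTiming[stepB[a]];
  Print["stepA: ", tA, " s, stepB: ", tB, " s, memory: ", MemoryInUse[]];
  b]
```

Calling timedStuff[text] in place of stuffDoneToText[text] inside the Do loop should show which step gets slower from file to file, and whether memory grows alongside it.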

POSTED BY: Sander Huisman