Hi everyone,
I've written a function that returns a list, and I need to loop it over a big sample. If I loop through all of my sample data, the result will be an array of about 10 million rows and about 30 columns. I know from past experience that a procedure like this slows down as the array grows, so instead I will run it on subsamples of the data, creating many smaller output arrays, saving each to a file in turn, and clearing them as I go. What I am grappling with now is how to append those smaller arrays to a single file and, ideally, merge them into one big array. Here's what I came up with, using three arrays as pretend output. My approach is clunky, and my question is whether there is a better way.
Here are the output arrays, all of which are 4 x 3. To make them easy to tell apart, m1 contains real numbers, m2 strings, and m3 integers.
SeedRandom[666]
m1 = RandomReal[1, {4, 3}]
m2 = RandomChoice[CharacterRange["A", "Z"], {4, 3}]
m3 = RandomInteger[100, {4, 3}]
m = {m1, m2, m3};
Create a file to save them. dumpPath is just the file path to my desktop.
tmpFile = FileNameJoin[{dumpPath, "tmp.nb"}]
CreateFile[tmpFile];
Append the arrays to the file using PutAppend.
Scan[PutAppend[#, tmpFile] &, m]
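For the real run, this step would sit inside the loop over subsamples. A rough sketch of that loop follows; myFunction, sample, the chunk size of 4, and the file name chunks.m are all hypothetical stand-ins, and UpTo assumes a reasonably recent version of Mathematica.

```mathematica
(* Hypothetical stand-ins for the real function and sample *)
myFunction[x_] := {x, x^2, x^3};
sample = Range[10];

(* Assumed scratch location; start from a clean file *)
chunkFile = FileNameJoin[{$TemporaryDirectory, "chunks.m"}];
If[FileExistsQ[chunkFile], DeleteFile[chunkFile]];

Do[
  out = myFunction /@ chunk;   (* small output array for this subsample *)
  PutAppend[out, chunkFile];   (* append it to the file as one expression *)
  Clear[out],                  (* release it before the next subsample *)
  {chunk, Partition[sample, UpTo[4]]}
];

(* One array per subsample comes back *)
Dimensions /@ ReadList[chunkFile]
(* {{4, 3}, {4, 3}, {2, 3}} *)
```

Only one small array is held in memory at a time, which is the point of working in subsamples.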
Of course, PutAppend only appends the arrays to the file as separate expressions; it does not merge or join them into one array. That would be nice, but I don't know how to do it, so after the procedure is done I have to read the file back into Mathematica and join the arrays myself. I do the reading with ReadList.
readData = ReadList[tmpFile]
This gives me a list containing the three arrays. (It will be a huge list when I run it on my real sample.)
Finally, the three arrays can be merged into one using Flatten and then saved to another notebook file (step not shown).
Flatten[readData, 1]
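One variation that might avoid the separate joining step, offered only as a sketch: write each row as its own expression, so that ReadList returns a list of rows, which is already the joined array. The file name rows.m and the use of $TemporaryDirectory are assumptions for illustration, and the stream is opened once with OpenAppend rather than reopened for every row.

```mathematica
(* Same kind of pretend output as above *)
m = {RandomReal[1, {4, 3}],
     RandomChoice[CharacterRange["A", "Z"], {4, 3}],
     RandomInteger[100, {4, 3}]};

(* Assumed scratch location; start from a clean file *)
rowFile = FileNameJoin[{$TemporaryDirectory, "rows.m"}];
If[FileExistsQ[rowFile], DeleteFile[rowFile]];

(* Open the file once and write every row as a separate expression *)
stream = OpenAppend[rowFile];
Scan[Write[stream, #] &, m, {2}];   (* level 2 of m = the individual rows *)
Close[stream];

(* ReadList now returns the rows as a single array *)
joined = ReadList[rowFile];
Dimensions[joined]
(* {12, 3} *)
```

In the real run, the Scan would be applied to each subsample's output array (at level 1) inside the loop, with the stream closed only at the end. Whether reading 10 million rows back with ReadList is fast enough is something I have not measured.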
I'm interested in other approaches, especially one that would append the output to the file and join it (build a single array) as it goes. Any tips would be much appreciated.
Regards,
Greg