
Is there a best way to append data to a file?

Posted 7 years ago
POSTED BY: Gregory Lypny
6 Replies
Posted 7 years ago

Hi Neil,

Thanks for the tips. I do, in fact, avoid Do and instead use Map or Table. The big array I refer to is the result of many computations from a correspondingly big dataset. Each row of the array will be the output of one computation; in that way the array is being built up, and my experience has been that repeated evaluation of a function can cause a process to slow to a crawl. I do things like $HistoryLength = 2 to mitigate that.

Greg

POSTED BY: Gregory Lypny
Posted 7 years ago

The Write command works well, a good alternative to PutAppend.

Thanks, John.

Greg

POSTED BY: Gregory Lypny
Anonymous User
Posted 7 years ago

Write[channel, expr1, expr2, …] writes the expressions expr1, expr2, … in sequence, followed by a newline, to the specified output channel.

The output channel used by Write can be a single file or pipe, or a list of them, each specified as a string "name", as File["name"], or as an OutputStream object.


I may not have understood the initial question, but Write appends on each invocation, is efficient, and can write expressions to a file or stream.

If that's so, PutAppend is not the only option.
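
A minimal sketch of the Write approach, assuming a scratch file under $TemporaryDirectory (the file name and the variable names are illustrative, not from this thread). One caveat: when Write is given a file name that is not yet open, it opens the file with OpenWrite, which starts the file afresh. Opening the stream explicitly with OpenAppend keeps whatever the file already holds.

```mathematica
(* Hypothetical scratch file; any writable path would do *)
file = FileNameJoin[{$TemporaryDirectory, "write-demo.m"}];
If[FileExistsQ[file], DeleteFile[file]];   (* start clean for the demo *)

(* Open once in append mode, then write one row per call *)
stream = OpenAppend[file];
Scan[Write[stream, #] &, RandomReal[1, {4, 3}]];
Close[stream];

(* Each row was written as its own expression, so ReadList returns a list of rows *)
rows = ReadList[file];
Dimensions[rows]   (* {4, 3} *)
```

Keeping the stream open between writes avoids reopening the file for every row, which is probably where most of the saving over repeated PutAppend calls comes from.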

POSTED BY: Anonymous User

Greg,

You should be able to handle 10 million x 30 arrays in Mathematica without breaking them up. I tried:

In[8]:= Timing[bigarray = RandomReal[1, {10000000, 30}];]

Out[8]= {2.75571, Null}

It took only 2.7 seconds to generate a random array of that size, and Mathematica was fine with it. There are several things you can do to optimize this:

1. Do not print the arrays; end the expression with a semicolon so they are not displayed.
2. Stay away from Do and other looping constructs. Use list functions such as Table, Map, etc.
3. Write the data (whether or not you break it up) to a binary file such as .mat, or use the binary read and write functions (BinaryWrite, BinaryReadList).
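
A rough sketch of the binary-file suggestion, assuming the data are machine reals with a known column count of 30 (the file name, chunk sizes, and variable names are made up for illustration). Each chunk is flattened and appended as raw "Real64" values, and the row structure is restored on reading with Partition.

```mathematica
(* Hypothetical scratch file for the demo *)
file = FileNameJoin[{$TemporaryDirectory, "bigarray-demo.bin"}];
If[FileExistsQ[file], DeleteFile[file]];

(* Two small chunks standing in for pieces of the big array *)
chunk1 = RandomReal[1, {5, 30}];
chunk2 = RandomReal[1, {5, 30}];

(* Append each chunk as raw 64-bit reals *)
stream = OpenAppend[file, BinaryFormat -> True];
BinaryWrite[stream, Flatten[chunk1], "Real64"];
BinaryWrite[stream, Flatten[chunk2], "Real64"];
Close[stream];

(* Read everything back and restore the 30-column layout *)
data = Partition[BinaryReadList[file, "Real64"], 30];
data == Join[chunk1, chunk2]   (* True *)
```

The binary file stores no shape information, so the number of columns has to be known when reading. This sketch also only handles numeric data; mixed numeric and string rows would need a text format such as the one Put and Write produce.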

Note that if you want to run your functions over subsets of bigarray, you can do that with Part and still keep the array as one big array for writing to your file later. For example:

bigarray[[1 ;; 5]] = RandomReal[10, {5, 30}]

This replaces the first 5 rows (a 5×30 block) with new numbers ranging from 0 to 10. It is done in place, so you still have one big array but can process it in "chunks".

I hope this helps.

Regards,

Neil

POSTED BY: Neil Singer
Posted 7 years ago

Hi Alexey,

Thanks for your suggestion, but I must be doing something wrong. I replaced {…} with Sequence and changed the extension of the export file to .m, as in your code, but ReadList[tmpFile] still returns a list of lists. It seems that Sequence has eliminated the need for Map.

In any case, I should have been clearer in my example. I won't have all of the matrices (m1, m2, and m3) available at once for appending to a file. They will be appended and then deleted or cleared in turn as soon as they are produced, in order to conserve memory. So what I'd like to accomplish is to flatten the contents of the file every time one of the matrices is appended: flatten as I go.

Greg

POSTED BY: Gregory Lypny
Posted 7 years ago

Hi Gregory,

A minor change to the code gives the desired result:

SeedRandom[666]
m1 = RandomReal[1, {4, 3}]
m2 = RandomChoice[CharacterRange["A", "Z"], {4, 3}]
m3 = RandomInteger[100, {4, 3}]
m = Sequence[m1, m2, m3];

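(* dumpPath is not defined in this post; it is assumed to hold the directory used in the original question *)
tmpFile = FileNameJoin[{dumpPath, "tmp.m"}]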

PutAppend[m, tmpFile] 

readData = ReadList[tmpFile]

Now readData contains the already flattened array. Note that the correct extensions for files containing Mathematica expressions intended for loading with Get, ReadList or Import are .m and .wl; the .nb extension is for Mathematica notebooks.
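
If the goal is one flat list of rows built up a matrix at a time (12×3 for three 4×3 matrices like m1, m2 and m3), one possible approach is to append every row as its own expression, so the file never contains a nested matrix. The sketch below assumes a scratch file under $TemporaryDirectory, and appendRows is a hypothetical helper name, not something from this thread.

```mathematica
(* Hypothetical scratch file; $TemporaryDirectory is assumed here *)
flatFile = FileNameJoin[{$TemporaryDirectory, "flat-demo.m"}];
If[FileExistsQ[flatFile], DeleteFile[flatFile]];

(* Hypothetical helper: Sequence @@ matrix splices the rows in as separate
   arguments, so PutAppend writes each row as its own expression *)
appendRows[matrix_?MatrixQ, file_String] := PutAppend[Sequence @@ matrix, file];

(* Append each matrix as soon as it is produced; nothing needs to be kept in memory *)
SeedRandom[666];
appendRows[RandomReal[1, {4, 3}], flatFile];
appendRows[RandomChoice[CharacterRange["A", "Z"], {4, 3}], flatFile];
appendRows[RandomInteger[100, {4, 3}], flatFile];

(* ReadList returns one expression per row, so the result is already flat *)
Dimensions[ReadList[flatFile]]   (* {12, 3} *)
```

Because each call only ever sees one matrix, this fits the case where the matrices are cleared in turn after being written.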

POSTED BY: Alexey Popkov