[Solved] How to efficiently write 300M integers to a file in a timely fashion?

Posted 1 year ago

Hi, I'm running v13.1 on an Intel + Windows system with a reasonably fast SSD. My current application generates an ordered List of 250M to 300M integers, which I would like to store for later use. The values are all in the range 0 to 30. In total I will have 30 to 100 of these files. At present, it takes 45 minutes of elapsed time to Export one of these lists to a .tsv file, which ends up being about 800 MB in size. Read time is acceptable at ~12 seconds. What alternative Export format(s) might provide better storage performance?

POSTED BY: Richard Frost
4 Replies

Richard, you could use NumericArray to reduce storage and then export the list with DumpSave.

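(* Values in 0..30 fit in one unsigned byte each *)
byteArray = 
 NumericArray[RandomInteger[{0, 30}, 300 * 10^6], "UnsignedInteger8"];

(* Memory used: roughly one byte per element *)
ByteCount[byteArray]

(* AbsoluteTiming measures elapsed time, including disk I/O; Timing reports CPU time only *)
AbsoluteTiming[DumpSave["test.mx", byteArray]]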

On my machine, the 300M integers are written to the HDD almost instantly, and the .mx file is less than 300 MB in size (file size was corrected). See Richard's table.
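To read the data back later, something along these lines should work. This is only a sketch: it uses a smaller stand-in array, and the symbol name `data` and the temporary file path are assumptions for illustration. DumpSave stores the definition of the symbol, so Get restores it under the same name.

```mathematica
(* Small stand-in for the real data; the name and path are hypothetical *)
data = NumericArray[RandomInteger[{0, 30}, 10^6], "UnsignedInteger8"];
file = FileNameJoin[{$TemporaryDirectory, "test.mx"}];
DumpSave[file, data];

(* Keep a copy for comparison, then clear the symbol to mimic a fresh session *)
original = data;
Clear[data];

(* Get restores the definition of data from the .mx file *)
Get[file];

data === original       (* True if the round trip preserved the array *)
list = Normal[data];    (* convert to an ordinary list of integers if needed *)
```

Normal is only needed if later code expects a plain List; many functions accept the NumericArray directly.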

POSTED BY: Michael Helmle

If you only need to access this from Mathematica, then saving to *.mx with DumpSave is an option.
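If the files also have to be readable by other programs, one possible alternative (not something tested in this thread) is to write the raw bytes with BinaryWrite. The sketch below assumes every value fits in one unsigned byte, which holds for the range 0 to 30; the file name is hypothetical.

```mathematica
(* Assumption: all values fit in one unsigned byte (0..255) *)
values = RandomInteger[{0, 30}, 10^6];
file = FileNameJoin[{$TemporaryDirectory, "values.bin"}];  (* hypothetical name *)

(* Write one byte per value, with no header or delimiters *)
stream = OpenWrite[file, BinaryFormat -> True];
BinaryWrite[stream, values, "UnsignedInteger8"];
Close[stream];

(* Read the bytes back as a list of integers *)
restored = BinaryReadList[file, "UnsignedInteger8"];
restored === values
```

The resulting file holds exactly one byte per integer, so any language that can read a byte stream can consume it.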

POSTED BY: Martijn Froeling

Much better!

Export to .mx.gz wrote a 133 MB file in 1 minute.
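For reference, a compressed export of this kind might look like the sketch below. It uses a small stand-in array, and wrapping the data in a NumericArray and the file path are assumptions, not necessarily what was run here. Export infers the MX format and gzip compression from the file extension.

```mathematica
(* Small stand-in for the real data; the path is hypothetical *)
data = NumericArray[RandomInteger[{0, 30}, 10^6], "UnsignedInteger8"];
file = FileNameJoin[{$TemporaryDirectory, "test.mx.gz"}];

(* Format and compression are taken from the .mx.gz extension *)
AbsoluteTiming[Export[file, data]]
FileByteCount[file]    (* size on disk in bytes *)

(* Import returns the stored expression directly *)
restored = Import[file];
restored === data
```

Unlike DumpSave, this stores the expression itself rather than a symbol definition, so the result of Import can be assigned to any name.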

Thank you.

POSTED BY: Richard Frost