
What is the most efficient export/import format for persisting data?

Posted 7 years ago

I exported a List from a Mathematica 11.1.1.0 notebook to .wdx format. Since this format is native to Mathematica and intended for data, I had expected it to be the most performant option, with fast export and import times and exact preservation of numeric precision. Unfortunately, reading the .wdx file back in took over an hour for a moderately sized List. The List holds a rectangular structure with 24,000 rows and 11,000 columns.

Is .wdx really the best / fastest format for exporting data in a Mathematica variable and then loading it back into a new session?

8 Replies

Interesting question! I was not aware of such a huge difference in I/O speed. But here I just want to (and must) cite the documentation:

• MX files cannot be exchanged between different operating systems or versions of the Wolfram System.

• [WDX] Stores arbitrary Wolfram Language expressions in a serialized, platform-independent form.

So, at least for me the .mx format would never ever be an option if data are meant to be persistent.

POSTED BY: Henrik Schachner

I remember hearing that since a certain version it is cross-compatible, though I have never tested that…

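Another option is to compress the expression and write it to a binary file:

(* open the target file as a binary output stream *)
str = OpenWrite[file, BinaryFormat -> True];
(* Compress returns a plain string, so it can be written as a null-terminated string *)
BinaryWrite[str, Compress[myexpression], "TerminatedString"];
Close[str];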
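
Reading the data back is the reverse operation. The following is only a minimal sketch of the full round trip, not a tested recipe for large data: the file name, the sample matrix and the variable names strOut, strIn and restored are assumptions made for illustration.

```wolfram
(* hypothetical file location and small sample data, for illustration only *)
file = FileNameJoin[{$TemporaryDirectory, "myexpression.bin"}];
myexpression = RandomReal[{-10, 10}, {100, 50}];

(* write the compressed expression as a null-terminated string *)
strOut = OpenWrite[file, BinaryFormat -> True];
BinaryWrite[strOut, Compress[myexpression], "TerminatedString"];
Close[strOut];

(* read the string back and uncompress it into an expression *)
strIn = OpenRead[file, BinaryFormat -> True];
restored = Uncompress[BinaryRead[strIn, "TerminatedString"]];
Close[strIn];

(* check whether the round trip reproduced the original expression *)
restored === myexpression
```

For a matrix as large as the one in the question, it would be worth timing this on a smaller sample first, since Compress and Uncompress have a cost of their own.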
POSTED BY: Sander Huisman

But it is well known that the documentation is incorrect. It just works cross-platform. The only thing which does not work is loading a .mx file produced in a newer version of Mathematica into an older one.

POSTED BY: Rolf Mertig

Maybe it is also an endianness issue? Since most of us use little-endian machines, the problem doesn't show up a lot. Just a thought…

POSTED BY: Sander Huisman

Thanks for the note @Sander Huisman. In my problem case with .wdx, the export was from a GNU/Linux x86_64 machine (Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz) and the import was on a GNU/Linux x86_64 machine (Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz), so both sides would have been using little-endian format. In addition, the version of Mathematica was the same on both systems.
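
For anyone who wants to rule out a platform mismatch in the same way, the relevant facts can be read from within a session. This is a small sketch; the name info is just an illustrative variable, and the snippet is meant to be evaluated on both the exporting and the importing machine and compared by eye.

```wolfram
(* collect the platform facts that matter for binary file compatibility *)
info = <|
   "ByteOrdering" -> $ByteOrdering, (* -1 means little-endian, +1 means big-endian *)
   "SystemID" -> $SystemID,         (* e.g. the operating system and architecture *)
   "Version" -> $Version            (* full version string of the running kernel *)
   |>
```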

WDX is designed to be cross-platform, backwards compatible and so on. MX supposedly isn't, though there is evidence that it is cross-platform in newer versions of Mathematica…

POSTED BY: Sander Huisman

The most efficient format is .mx:

test = RandomReal[{-10, 10}, {24000, 11000}];
AbsoluteTiming[Export["big.mx", test];]

That takes about 15 seconds (depending also on your hard drive speed, I guess). Loading it back in with

test=Import["big.mx"];

is pretty fast, too: less than a second. I never ever use .wdx, always .mx, which is also the fastest way known to me to collect larger results from parallel subkernels back to the master kernel ...
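
To compare the two formats on your own machine, something along these lines could be used. It is a rough sketch on a deliberately small matrix so that the .wdx case finishes quickly; the helper timeRoundTrip, the temporary directory and the matrix size are all assumptions made for illustration, and the timings will depend on hardware and version.

```wolfram
(* small test matrix; scale up gradually once the timings look reasonable *)
data = RandomReal[{-10, 10}, {1000, 500}];

(* fresh temporary directory for the test files *)
dir = CreateDirectory[];

(* hypothetical helper: export, re-import and compare for one file extension *)
timeRoundTrip[ext_String] := Module[{file, tExport, tImport, back},
  file = FileNameJoin[{dir, "test." <> ext}];
  tExport = First[AbsoluteTiming[Export[file, data]]];
  {tImport, back} = AbsoluteTiming[Import[file]];
  <|"format" -> ext, "export" -> tExport, "import" -> tImport,
   "identical" -> (back === data)|>
  ];

results = timeRoundTrip /@ {"mx", "wdx"}
```

The "identical" entry reports whether the re-imported data matches the original exactly, which addresses the precision concern from the original question alongside the speed comparison.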

POSTED BY: Rolf Mertig
Posted 3 years ago

Hi, may I ask a question? When I export data to .mx after a parallel evaluation, the export is very slow, but exporting the same amount of data after a single-kernel calculation is very fast. Because the calculation itself is very slow, I have to run it in parallel, but then I run into trouble when exporting. Could you give me some advice? Thanks!

POSTED BY: Tony Coliderse