What is the best way to save a Dataset?

I generated a large Dataset from several files. What would be the best way to save (or export) the new Dataset for future use?

8 Replies
Posted 17 hours ago

The best (indeed essential) way also includes saving the metadata. I assume that you tried DumpSave to get the .mx format and that Get did not work to re-import the data.

POSTED BY: Jim Baldwin

Thank you, Jim. I am testing with DumpSave, but I think it has the same problem as Export to MX. Maybe the best option is to convert the Dataset to a List and then save it as a list. I will continue testing.
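
A minimal sketch of the round trip discussed above, assuming the Dataset is a flat list of associations. The names `ds`, `rawData`, `file`, and `restored`, as well as the sample rows and the file location, are hypothetical. `Normal` strips the Dataset wrapper, `DumpSave` writes the definition of the symbol, and `Get` restores that symbol in a later session.

```wolfram
(* Hypothetical small Dataset standing in for the large one *)
ds = Dataset[{
    <|"id" -> 1, "name" -> "a"|>,
    <|"id" -> 2, "name" -> "b"|>}];

(* Strip the Dataset wrapper; the result is a plain list of associations *)
rawData = Normal[ds];

(* Hypothetical file location *)
file = FileNameJoin[{$TemporaryDirectory, "rawData.mx"}];

(* DumpSave stores the definition of the symbol rawData in binary MX form *)
DumpSave[file, rawData];

(* Later: Get reads the file and restores the symbol rawData *)
Clear[rawData];
Get[file];

(* Rebuild the Dataset from the restored list *)
restored = Dataset[rawData];
```

Note that `Get` on a `DumpSave` file redefines `rawData` rather than returning the data as a value. MX files are also generally tied to the system type and version that wrote them, so this is probably best treated as a local cache rather than an archive format.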

Posted 21 hours ago

Have a look at the function Iconize. It will label and persistently save its argument data in the notebook where it's executed. And the icon label (with its underlying data) can be copied to other notebooks.

POSTED BY: Hans Milton

Thank you Hans. I have used Iconize with smaller sets of data, but for very large ones (more than 13 million records) it doesn't seem to perform very well.

Posted 1 day ago

Do you want it in a format that other applications can use? Or do you only need to import back into Mathematica? Or do you just want to avoid re-generating the Dataset again (avoid being dependent on those original files)?

POSTED BY: Eric Rimbey

Hi Eric. I just want to save them so I can import them back into Mathematica. I already tried the *.MX format. Exporting worked (though it took a long time), but it didn't work when I tried to import the data back into Mathematica. One Dataset has about 13 million records.

Hi Ricardo,

Have you tried re-expressing the Dataset object using Tabular, new in WL 14.2? Tabular is usually much more efficient in storage, because it uses special ways to encode the different types of columns, say dates or strings. Even if your Dataset object is deeper than two levels, it can be expressed as a Tabular object. The important thing is that it is not very ragged.

Other than that, it may be more efficient to save the Dataset object in some compressed format, like MX or WDX.
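
A rough sketch of both suggestions, under stated assumptions: the Tabular part needs Wolfram Language 14.2 or later, the Dataset is assumed to be a flat, non-ragged list of associations, and the names `ds`, `tab`, `file`, and `ds2`, the sample rows, and the file location are all hypothetical. Whether Tabular is actually smaller for a given Dataset is something to measure rather than assume, hence the `ByteCount` comparison.

```wolfram
(* Hypothetical small Dataset with a numeric column and a date column *)
ds = Dataset[{
    <|"id" -> 1, "date" -> DateObject[{2024, 1, 1}]|>,
    <|"id" -> 2, "date" -> DateObject[{2024, 1, 2}]|>}];

(* Requires 14.2+: build a Tabular from the underlying list of associations *)
tab = Tabular[Normal[ds]];

(* Compare the in-memory sizes of the two representations *)
{ByteCount[ds], ByteCount[tab]}

(* Alternative for earlier versions: export the Dataset itself to WDX *)
file = FileNameJoin[{$TemporaryDirectory, "dataset.wdx"}];
Export[file, ds];

(* Import returns the stored expression *)
ds2 = Import[file];
```

Replacing the `.wdx` extension with `.mx` selects the MX format instead. It is probably worth timing both on a slice of the real data before committing to either for 13 million records.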

Hi José Martín,
Nice to hear from you!
Thank you for your quick response. Unfortunately, I still have version 13.3. I will try the other options you mention.
