
HDF5 import partial dataset

Posted 9 years ago

Hi, I am trying to import a dataset from an HDF5 file, but the dataset is ~3 GB in size, so I cannot just open it directly. I need to import part of the dataset, reduce that part to a single number (the sum of all its elements), keep that number, move on to the next part, and so on.

It seems that I cannot open just part of a dataset, even if I use: Import["file.h5", {"HDF5", "Datasets", 1, 1, 1}]

This returns the first part of the first part of the first dataset, but it still seems to open the entire dataset, which causes the operation to crash.

My thought was to do something like this, where nParts is the number of parts:

data = Table[Total@Import["file.h5", {"HDF5", "Datasets", 1, 1, n}], {n, nParts}]

But I am stuck at getting Import[] to work. I get this error:

Unable to communicate with closed link LinkObject["C:\Program Files\Wolfram Research\Mathematica\10.0\SystemFiles\Converters\Binaries\Windows-x86-64\HDF5.exe", 7137, 4]

I can open smaller datasets and parts of the smaller datasets with no problem.

Any thoughts on how to access just a part of a large dataset so it doesn't crash the import process?

POSTED BY: Greg Drozd
Posted 9 years ago

For large HDF datasets it may be better to export the SD (scientific dataset) that you want using another tool, then import the raw binary data into Mathematica.

I can suggest the following tool for exporting SD blocks from HDF4 and HDF5 files: 'gdal_translate' is one of the GDAL utilities and works great for this. http://www.gdal.org/frmt_hdf5.html

I am currently working on some code to step through the various dimensions of binary remote sensing files. I am trying to avoid loading the complete file in favor of only loading a single channel, frame, DN value, or spectral profile.
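As a rough illustration of loading a single channel without touching the rest of the file, here is a minimal sketch. The layout is purely an assumption for the example: a headerless band-sequential raw file of unsigned 16-bit samples, with hypothetical dimensions and a hypothetical helper readBand. It uses SetStreamPosition[] to jump to the byte offset of the band and BinaryReadList[] to read just that band.

```mathematica
(* Assumed layout, for illustration only: headerless band-sequential raw file,
   rows x cols x bands, unsigned 16-bit samples. *)
{rows, cols, bands} = {2, 3, 4};
bytesPerSample = 2;

(* Write a small demo file so the sketch runs on its own:
   every pixel of band k holds the value k. *)
file = FileNameJoin[{$TemporaryDirectory, "bsq_demo.raw"}];
out = OpenWrite[file, BinaryFormat -> True];
BinaryWrite[out, Flatten@Table[k, {k, bands}, {rows*cols}], "UnsignedInteger16"];
Close[out];

(* Read only band k: jump to its byte offset, then read rows*cols samples. *)
readBand[path_String, k_Integer] :=
 Module[{in, values},
  in = OpenRead[path, BinaryFormat -> True];
  SetStreamPosition[in, (k - 1)*rows*cols*bytesPerSample];
  values = BinaryReadList[in, "UnsignedInteger16", rows*cols];
  Close[in];
  Partition[values, cols]]

readBand[file, 3]
```

Only rows*cols samples are ever held in memory, regardless of how many bands the file contains. A real file would need its actual sample type, byte order, and any header offset accounted for.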

The lower level file I/O routines are your friend there.

Functions of interest are: OpenRead[], Skip[], Read[], and Close[].
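For the original question (reduce each part to its sum, then move on), a chunked read over an exported raw file might look like the sketch below. It assumes the data has already been exported as a flat, headerless array of 64-bit reals; the file name, the helper chunkedTotals, and the chunk size are all hypothetical. It uses BinaryReadList[] on a stream opened with BinaryFormat -> True rather than Read[], since the data is binary.

```mathematica
(* Demo file, written only so the sketch is self-contained:
   a flat array of the Real64 values 1., 2., ..., 10. *)
rawFile = FileNameJoin[{$TemporaryDirectory, "chunk_demo.raw"}];
out = OpenWrite[rawFile, BinaryFormat -> True];
BinaryWrite[out, N@Range[10], "Real64"];
Close[out];

(* Read chunkSize values at a time and keep only the sum of each chunk.
   BinaryReadList returns {} once the end of the file is reached. *)
chunkedTotals[path_String, type_String, chunkSize_Integer?Positive] :=
 Module[{in, chunk, totals = {}},
  in = OpenRead[path, BinaryFormat -> True];
  While[(chunk = BinaryReadList[in, type, chunkSize]) =!= {},
   AppendTo[totals, Total[chunk]]];
  Close[in];
  totals]

chunkedTotals[rawFile, "Real64", 4]
```

The stream keeps its position between calls, so each BinaryReadList[] picks up where the previous one stopped and no more than chunkSize values are in memory at once.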

I did battle with Import[] for a while until it became obvious that I was loading way too much data.

I hope that helps.

POSTED BY: Peter Willis