Frequently Asked Questions
- How can I use HDF5DotNet to read an HDF5 dataset or attribute of a fixed- or variable-length string type?
- How can I use HDF5DotNet to read and de-reference HDF5 region references?
Both questions are related in that it is easy to overestimate what H5D.read can do.
It is really just a thinly veiled version of the unmanaged H5Dread function and
does not do any type marshalling.
The code snippets are written in IronPython, and you shouldn't have much
difficulty translating them into your preferred .NET language.
The fixed- and variable-length HDF5 string datatypes have no direct counterpart in the .NET framework.
The .NET System.String
type is based on sequences of 2-byte System.Char
Unicode characters. (The C++/CLI counterpart is wchar_t, not char!) We have to use slightly
different methods for reading HDF5 fixed-length and variable-length strings into .NET strings.
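To make the size mismatch concrete, here is a small sketch in plain Python 3 (not IronPython, and not HDF5DotNet code; it only uses Python's built-in codecs, and the sample text is made up). It compares the one-byte-per-character ASCII layout with the two-byte-per-character UTF-16 layout that System.String is based on:

```python
# Illustration only: plain Python, no HDF5 or .NET involved.
text = "HDF5"  # made-up sample value

ascii_bytes = text.encode("ascii")      # one byte per character
utf16_bytes = text.encode("utf-16-le")  # two bytes per character, like System.Char

print(len(ascii_bytes))  # 4
print(len(utf16_bytes))  # 8

# Getting from stored bytes back to text always needs an explicit decoding step.
assert ascii_bytes.decode("ascii") == text
```

The same bytes cannot simply be reinterpreted as a .NET string, which is why the sections below go through an explicit conversion.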
Fixed-length HDF5 strings
This case is pretty straightforward. Since we know the length of the HDF5 strings in advance, we can read them
into properly sized System.Byte arrays and then use the GetString() method
of System.Text.ASCIIEncoding or System.Text.UTF8Encoding
to convert the System.Byte arrays into .NET strings.
(Determine the character set with H5T.get_cset!) Here's an example of how to read a fixed-length ASCII-encoded
string attribute:
...
attr = H5A.open(dset, 'string attribute')
dtype = H5A.getType(attr)
size = H5T.getSize(dtype)
# this use of H5T.create is new in HDF5 1.8.8 and HDF5DotNet 1.8.8
mtype = H5T.create(H5T.CreateClass.STRING, size)
buffer = System.Array.CreateInstance(System.Byte, size)
H5A.read(attr, mtype, H5Array[System.Byte](buffer))
enc = System.Text.ASCIIEncoding()
print 'String attribute value: %s' % enc.GetString(buffer)
H5T.close(mtype)
H5T.close(dtype)
H5A.close(attr)
...
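The decoding step can also be sketched without any HDF5 machinery. The following plain Python 3 example (not HDF5DotNet code) splits a flat byte buffer holding several fixed-length strings into fields and decodes each one. The helper name decode_fixed is hypothetical, the charset argument merely stands in for whatever a character-set query such as H5T.get_cset reports, and the sketch assumes NUL-terminated or NUL-padded strings (space-padded strings would need different trimming):

```python
def decode_fixed(buffer, size, charset="ascii"):
    """Split a flat byte buffer into `size`-byte fields and decode each one.

    Hypothetical helper for illustration only; `charset` is either
    "ascii" or "utf-8", mirroring the two encodings mentioned above.
    """
    if size <= 0 or len(buffer) % size != 0:
        raise ValueError("buffer length must be a positive multiple of size")
    strings = []
    for start in range(0, len(buffer), size):
        field = buffer[start:start + size]
        # Cut at the first NUL so that padding does not end up in the string.
        field = field.split(b"\x00", 1)[0]
        strings.append(field.decode(charset))
    return strings


raw = b"one\x00\x00two\x00\x00three"  # three made-up 5-byte fields
print(decode_fixed(raw, 5))
```

Trimming at the first NUL is a design choice of this sketch: .NET's GetString() decodes every byte it is given, so padding bytes in the buffer would otherwise show up as characters in the resulting string.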
Variable-length HDF5 strings
With variable-length HDF5 strings, matters are a little more subtle. If your gut instinct tells you 'pointer',
you are not too far from the solution. As a VB.NET or IronPython user you may ask, 'What's a pointer, Walter?'
Although some .NET languages support pointers, they are not part of the
Common Language Specification (CLS),
and, whenever possible, we are trying to provide a solution that plays nicely with all .NET languages.
A CLS-compliant construct which mimics pointers in .NET languages that otherwise don't support them is
System.IntPtr.
Instead of allocating an array of System.Byte we allocate an array of System.IntPtr "pointers" and
then use PtrToStringAnsi(IntPtr)
or PtrToStringUni(IntPtr) in
System.Runtime.InteropServices.Marshal
to convert them into .NET strings.
import clr
clr.AddReferenceToFile('HDF5DotNet.dll')
import HDF5DotNet
from HDF5DotNet import *
import System
from System import Array, IntPtr
status = H5.Open()
print '\nWelcome to HDF5 ', H5.Version.Major, '.', H5.Version.Minor, '.', H5.Version.Release, ' !\n'
file = H5F.open('h5ex_t_vlstringatt.h5', H5F.OpenMode.ACC_RDONLY)
dset = H5D.open(file, '/DS1')
attr = H5A.open(dset, 'A1')
atype = H5A.getType(attr)
print 'Attribute type class is %s' % H5T.getClass(atype)
print 'Variable length string? %s' % H5T.isVariableString(atype)
aspace = H5A.getSpace(attr)
print 'Dims = %i' % H5S.getSimpleExtentNDims(aspace)
npoints = H5S.getSimpleExtentNPoints(aspace)
print 'Count = %i' % npoints
# a size of -1 corresponds to H5T_VARIABLE, i.e. a variable-length string type
mtype = H5T.create(H5T.CreateClass.STRING, -1)
buffer = Array.CreateInstance(IntPtr, npoints)
H5A.read(attr, mtype, H5Array[IntPtr](buffer))
print 'Value:\n'
for i in range(0,npoints):
    print System.Runtime.InteropServices.Marshal.PtrToStringAnsi(buffer[i])
H5T.close(mtype)
H5S.close(aspace)
H5T.close(atype)
H5A.close(attr)
H5D.close(dset)
H5F.close(file)
print '\nShutting down HDF5 library\n'
status = H5.Close()
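The idea of "an array of pointers, each converted to a string" can be tried out without HDF5 or .NET. The sketch below uses Python 3's ctypes module as a stand-in: an array of c_void_p plays the role of the System.IntPtr array, and ctypes.string_at plays the role of PtrToStringAnsi. The string values and the helper name pointer_to_string are made up for illustration, and in the real case it is the HDF5 library, not our code, that allocates the strings.

```python
import ctypes

# Made-up values standing in for strings allocated by a native library.
values = [b"alpha", b"beta gamma", b""]

# Keep the C buffers alive; each one holds a NUL-terminated string.
c_buffers = [ctypes.create_string_buffer(v) for v in values]

# The "read buffer": one pointer-sized slot per string, like an array of IntPtr.
pointers = (ctypes.c_void_p * len(c_buffers))(
    *[ctypes.addressof(b) for b in c_buffers]
)


def pointer_to_string(address, encoding="ascii"):
    """Follow a raw address to a NUL-terminated string and decode it."""
    return ctypes.string_at(address).decode(encoding)


strings = [pointer_to_string(p) for p in pointers]
print(strings)
```

Note that the buffer itself only ever contains addresses; the characters live elsewhere in memory, which is why the buffer can have a fixed element size even though the strings differ in length.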
The H5R.dereference
method takes three arguments: a handle to an HDF5 object in the file that contains the referenced object,
an indicator of whether we are dealing with an object or a region reference, and an object that implements
the H5R.IReference
interface. For region references, we use
RegionReference
objects. As with strings, H5D.read cannot read references directly into RegionReference
objects, so we have to read them into an intermediate System.Byte array first.
Below is an example of reading and dereferencing an HDF5 region reference.
...
dsetRR = H5D.open(h5file, 'REGION_REFERENCES')
dtypeRR = H5T.copy(H5T.H5Type.STD_REF_DSETREG)
dspaceRR = H5D.getSpace(dsetRR)
npoints = H5S.getSimpleExtentNPoints(dspaceRR)
# Read the dataset into a properly sized System.Byte array
nBytes = RegionReference.SizeInBytes
dataRR = System.Array.CreateInstance(System.Byte, npoints*nBytes)
H5D.read(dsetRR, dtypeRR, H5Array[System.Byte](dataRR))
# Create an array of RegionReference objects using the RegionReference(Byte[]) constructor
rr = System.Array.CreateInstance(RegionReference, npoints)
for i in range(rr.Length):
    # use a fresh buffer for each reference so that no two entries can share storage
    a = System.Array.CreateInstance(System.Byte, nBytes)
    System.Array.Copy(dataRR, i*nBytes, a, 0, nBytes)
    rr[i] = RegionReference(a)
# examine the first region reference
dset = H5R.dereference(dsetRR, H5R.ReferenceType.DATASET_REGION, rr[0])
name = H5R.getName(dsetRR, H5R.ReferenceType.DATASET_REGION, rr[0])
dspace = H5R.getRegion(dsetRR, rr[0])
dtype = H5T.copy(H5T.H5Type.NATIVE_INT)
# assume we are reading a 2x9 subset
shape = System.Array[System.Int64]((2,9))
data = System.Array.CreateInstance(System.Int32, shape)
H5D.read(dset, dtype, H5S_ALL, dspace, H5P_DEFAULT, H5Array[System.Int32](data))
...
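The step of cutting one flat byte buffer into per-reference chunks is independent of HDF5 and can be sketched in plain Python 3. The helper name split_records is hypothetical, and the record size of 12 is only a placeholder for whatever RegionReference.SizeInBytes reports; the byte values are made up.

```python
def split_records(data, record_size):
    """Return one independent bytes object per fixed-size record in `data`."""
    if record_size <= 0 or len(data) % record_size != 0:
        raise ValueError("data length must be a positive multiple of record_size")
    # bytes(...) copies each slice, so the records do not alias `data`.
    return [
        bytes(data[start:start + record_size])
        for start in range(0, len(data), record_size)
    ]


RECORD_SIZE = 12             # placeholder for RegionReference.SizeInBytes
flat = bytearray(range(24))  # made-up buffer holding two records
records = split_records(flat, RECORD_SIZE)
print(len(records))
```

Copying each chunk into its own buffer keeps the records independent of the read buffer and of each other, so later changes to one cannot affect another.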