h5py: Dataset slice reference - AttributeError: module 'h5py' has no attribute 'ref_dtype' - documentation outdated?
To assist reproducing bugs, please include the following:
- Operating System: Ubuntu 18.04
- Python version: 3.7.3
- Where Python was acquired: Anaconda, conda install h5py (conda-forge)
- h5py version: 2.9.0
- HDF5 version: 1.10.4
- The full traceback/stack trace shown (if it appears)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-16-3e6d0b14c796> in <module>
9
10 # ref slice of both dataset
---> 11 ds_ref = h5_store.create_dataset("ref", (100,), dtype=h5py.ref_dtype)
12 ds_ref[:50] = ds1[:50]
AttributeError: module 'h5py' has no attribute 'ref_dtype'
I have just started exploring how to link two dataset slices through references, so I am following this page: http://docs.h5py.org/en/stable/refs.html
To get the above error, I ran the following code (in a JupyterHub notebook):
import h5py
import numpy as np

with h5py.File("sliceref.h5", 'w') as h5_store:
    # create 2 datasets
    ds1 = h5_store.create_dataset('wav1', (100,))
    ds1[...] = np.arange(100)
    print(ds1[:])
    ds2 = h5_store.create_dataset('wav2', (100,))
    ds2[...] = np.arange(100, 200, 1)
    print(ds2[:])
    # reference a slice of both datasets
    # (the next line raises the AttributeError on h5py 2.9.0)
    ds_ref = h5_store.create_dataset("ref", (100,), dtype=h5py.ref_dtype)
    ds_ref[:50] = ds1.regionref[:50]
Seeing print xxx statements in the documentation makes me assume that it was written for Python 2.7 and has just not been updated?
What would be the correct way of doing the following, i.e. combining the first half of two arrays through references, without duplicating data?
    ds_ref[:50] = ds1.regionref[:50]
    ds_ref[50:] = ds2.regionref[:50]
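One way the reference dataset might be written is sketched below. It is an illustration under assumptions, not the documented recipe: it uses h5py.regionref_dtype (available from h5py 2.10.0) with a fallback to h5py.special_dtype for 2.9.0, it writes to a scratch directory, and it gives the reference dataset shape (2,) instead of (100,), because one region reference already describes a whole slice.

```python
import os
import tempfile

import h5py
import numpy as np

# h5py.regionref_dtype exists from 2.10.0 on; older releases such as 2.9.0
# spell the same dtype with special_dtype().
regionref_dtype = getattr(h5py, "regionref_dtype", None)
if regionref_dtype is None:
    regionref_dtype = h5py.special_dtype(ref=h5py.RegionReference)

# Assumption: a scratch location, so the sketch does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "sliceref.h5")

with h5py.File(path, "w") as h5_store:
    ds1 = h5_store.create_dataset("wav1", data=np.arange(100, dtype="f4"))
    ds2 = h5_store.create_dataset("wav2", data=np.arange(100, 200, dtype="f4"))

    # One region reference covers a whole selection, so two slots are enough.
    ds_ref = h5_store.create_dataset("ref", (2,), dtype=regionref_dtype)
    ds_ref[0] = ds1.regionref[:50]
    ds_ref[1] = ds2.regionref[:50]

with h5py.File(path, "r") as h5_store:
    refs = h5_store["ref"]
    # Indexing a dataset with a region reference reads only the referenced part.
    first = h5_store["wav1"][refs[0]]
    second = h5_store["wav2"][refs[1]]
    combined = np.concatenate([first, second])
```

Only the two references are stored in "ref"; the concatenated array is assembled in memory at read time.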
About this issue
- State: closed
- Created 5 years ago
- Comments: 16 (8 by maintainers)
Yes, you can store it in the file. You assemble the VirtualLayout, then pass it to f.create_virtual_dataset() to add it to the file. Have a look at the documentation about virtual datasets: http://docs.h5py.org/en/stable/vds.html

I had to look this up (I haven’t used references before). A region reference is both a reference to the dataset and to a selection from that dataset, so you only need to store one thing.
To use it with h5py, you need to use it in two lookups: once to get the dataset, once to get the data from it:
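A minimal sketch of the two lookups follows. The file path and dataset name are hypothetical, and this is an illustration of the idea, not the maintainer's original snippet.

```python
import os
import tempfile

import h5py
import numpy as np

# Assumption: a scratch file with one dataset named "wav1".
path = os.path.join(tempfile.mkdtemp(), "regionref_demo.h5")

with h5py.File(path, "w") as f:
    f.create_dataset("wav1", data=np.arange(100, dtype="f4"))

    # The region reference records both the dataset and the selection [:50].
    ref = f["wav1"].regionref[:50]

    # Lookup 1: index the file with the reference to get the dataset back.
    target = f[ref]
    target_name = target.name

    # Lookup 2: index that dataset with the same reference to get the data.
    first_half = target[ref]
```

The two steps can also be chained as f[ref][ref], which is handy when only the reference is at hand and the dataset name is not known.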
I think you’re mixing up Reference and RegionReference. When you create the dataset to store references, use h5py.regionref_dtype instead of h5py.ref_dtype.

I’ve updated my h5py to 2.10.0 with conda install -c conda-forge h5py (Anaconda channel is still on 2.9.0).
Before updating the documentation, it would be nice to also include how to use the references.
Reference usage
First running the above code, and then this:
Expectation:
Reality:
Question
How do I use references to get the output of my expectation? Or should I use a different approach for that?
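The virtual dataset route mentioned in the replies could look roughly like the sketch below. It assumes an HDF5 build with virtual dataset support (1.10 or later), keeps the sources and the virtual dataset in the same scratch file, and uses "combined" as a hypothetical name for the result.

```python
import os
import tempfile

import h5py
import numpy as np

# Assumption: a scratch file that holds both sources and the virtual dataset.
path = os.path.join(tempfile.mkdtemp(), "vds_demo.h5")

with h5py.File(path, "w", libver="latest") as f:
    ds1 = f.create_dataset("wav1", data=np.arange(100, dtype="f4"))
    ds2 = f.create_dataset("wav2", data=np.arange(100, 200, dtype="f4"))

    # The layout describes the shape and dtype of the combined view.
    layout = h5py.VirtualLayout(shape=(100,), dtype="f4")

    # Map the first half of each source into one half of the layout.
    layout[:50] = h5py.VirtualSource(ds1)[:50]
    layout[50:] = h5py.VirtualSource(ds2)[:50]

    # Only the mapping is stored; the values stay in wav1 and wav2.
    f.create_virtual_dataset("combined", layout, fillvalue=-1)

with h5py.File(path, "r") as f:
    # Reads like an ordinary dataset.
    combined = f["combined"][:]
```

Compared with region references, the result behaves like a normal dataset on read, at the cost of requiring virtual dataset support in the HDF5 library.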