h5py: Accessing Dataset Across Chunks Slower than Without Chunking
System Info:
- Operating System: Windows 7
- Python version: 2.7.12
- Anaconda: 4.1.1 64 bit
- h5py version: 2.6.0
- HDF5 version: 1.8.15
I have a problem dealing with a very large dataset of shape frames x rows x cols. I am chunking the dataset in chunks of (n_frames, 64, 64). This significantly improves read time when reading a single chunk, but reading multiple chunks, i.e. a block of shape (n_frames, m*64, m*64), is dramatically slower than without chunking. This seems like a bug to me.
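To make "reading multiple chunks" concrete, here is a small sketch of the arithmetic for the selection used in this report. The helper name `chunks_touched` is made up for illustration, and the 8-byte element size assumes float64 data, which the report does not state.

```python
def chunks_touched(selection, chunk_shape):
    """Count the chunks that intersect a rectangular selection.

    selection is a list of (start, stop) pairs, one per axis, stop exclusive.
    """
    count = 1
    for (start, stop), size in zip(selection, chunk_shape):
        first = start // size       # index of the first chunk hit on this axis
        last = (stop - 1) // size   # index of the last chunk hit on this axis
        count *= last - first + 1
    return count


# The slice [:, 0:256, 0:256] on a (500, 512, 512) dataset
selection = [(0, 500), (0, 256), (0, 256)]
chunk_shape = (500, 64, 64)

n_chunks = chunks_touched(selection, chunk_shape)
bytes_per_chunk = 500 * 64 * 64 * 8  # assumes float64 (8 bytes per element)

print("%d chunks, %d bytes each" % (n_chunks, bytes_per_chunk))
```

The selection lines up exactly with a 4 x 4 block of whole chunks, so no chunk is only partially read.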
For example, in the following script, reading the chunked data is slower than reading the non-chunked data:
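import time

import h5py
import numpy as np

# Random float64 cube of shape (frames, rows, cols), roughly 1 GB in memory
data = np.random.random((500, 512, 512))

# Write the same data twice: once chunked, once with the default contiguous layout
with h5py.File('datacube_chunked.h5', 'w') as fid:
    fid.create_dataset('cube', data=data, chunks=(500, 64, 64))
with h5py.File('datacube.h5', 'w') as fid:
    fid.create_dataset('cube', data=data)

# Time reading a (500, 256, 256) block from the non-chunked file
start_time = time.time()
with h5py.File('datacube.h5', 'r') as fid:
    a = fid["cube"][:, 0:256, 0:256]
print(time.time() - start_time)

# Time reading the same block from the chunked file (spans 4 x 4 chunks)
start_time = time.time()
with h5py.File('datacube_chunked.h5', 'r') as fid:
    b = fid["cube"][:, 0:256, 0:256]
print(time.time() - start_time)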
If I manually read the (500, 64, 64) sections into b, then reading the chunked data is faster, but I would have thought that h5py does this under the hood already.
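The manual workaround described above could look roughly like the sketch below. It is not the exact code used in this report: the function name `read_region_by_chunks` and its parameters are made up for illustration, and it only assumes that the dataset has `shape`, `dtype` and NumPy-style slicing, which holds for both an h5py dataset and a plain NumPy array.

```python
import numpy as np


def read_region_by_chunks(dset, row_range, col_range, chunk=(64, 64)):
    """Read dset[:, r0:r1, c0:c1] one chunk-aligned tile at a time."""
    r0, r1 = row_range
    c0, c1 = col_range
    out = np.empty((dset.shape[0], r1 - r0, c1 - c0), dtype=dset.dtype)

    r = r0
    while r < r1:
        # End of the current chunk row, clipped to the requested region
        r_end = min((r // chunk[0] + 1) * chunk[0], r1)
        c = c0
        while c < c1:
            # End of the current chunk column, clipped to the requested region
            c_end = min((c // chunk[1] + 1) * chunk[1], c1)
            # Each read stays inside a single (n_frames, 64, 64) chunk
            out[:, r - r0:r_end - r0, c - c0:c_end - c0] = dset[:, r:r_end, c:c_end]
            c = c_end
        r = r_end
    return out
```

With the files from the script above, calling `read_region_by_chunks(fid["cube"], (0, 256), (0, 256))` inside the `with h5py.File('datacube_chunked.h5', 'r') as fid:` block would stand in for the direct slice `fid["cube"][:, 0:256, 0:256]`.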
About this issue
- State: open
- Created 6 years ago
- Comments: 18 (10 by maintainers)
Thanks, now that you’ve reproduced it in C, I think you should definitely ask HDF group about this. If doing the right thing in C code is slower than doing it with a Python for loop, something is definitely wrong.
@takluyver Thanks for your suggestions. I adapted your benchmark script (https://gist.github.com/takluyver/0480a74881d84678f48b92c021129cd6) but it still shows the same time difference.
Also, the selection type doesn’t matter: both SimpleSelection and FancySelection show similar performance, e.g. the following code is slow too:
To test the C API, I modified an example from https://support.hdfgroup.org/ftp/HDF5/examples/src-html/c.html (h5_chunk_read.c) for the files created from the python script above. Here is the code, hope I didn’t miss anything.
Modifying STRIDE and/or NSLICES, compiling it with gcc/8.3.0 and hdf5/1.10.5, and timing single runs, I surprisingly got the following typical numbers:
The results correlate with those from h5py, so it seems to be an issue in HDF5 itself.