h5netcdf: Files written by h5netcdf cannot be edited by netcdf4-python
What happened:
Files written by h5netcdf cannot be edited by netcdf4-python. It would be good if they could be. I believe the following patch is needed. Would you like me to make the Pull Request?
What you expected to happen:
MCVE Code Sample
import xarray as xr
import numpy as np
dataset = xr.DataArray(
    data=np.zeros((3, 3)),
    name="my_data"
)
engine = "h5netcdf"
# engine = "netcdf4"
dataset.to_netcdf("test.nc", engine=engine, format="NETCDF4")
import netCDF4
nc = netCDF4.Dataset("test.nc", mode="a")
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
/tmp/ipykernel_367089/2679834096.py in <module>
1 import netCDF4
----> 2 nc = netCDF4.Dataset("test.nc", mode="a")
src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.__init__()
src/netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()
OSError: [Errno -103] NetCDF: Can't write file: b'test.nc'
Expected Output
That it works.
Version
Output of print(h5py.version.info, f"\nh5netcdf {h5netcdf.__version__}")
Summary of the h5py configuration
h5py 3.6.0
HDF5 1.12.1
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:41:03) [GCC 9.4.0]
sys.platform linux
sys.maxsize 9223372036854775807
numpy 1.19.5
cython (built with) 0.29.24
numpy (built against) 1.19.5
HDF5 (built against) 1.12.1
h5netcdf 0.12.0
Suggested patch
diff --git a/h5netcdf/core.py b/h5netcdf/core.py
index b68e3f5..5994928 100644
--- a/h5netcdf/core.py
+++ b/h5netcdf/core.py
@@ -462,7 +462,7 @@ class Group(Mapping):
     def _create_child_group(self, name):
         if name in self:
             raise ValueError("unable to create group %r (name already exists)" % name)
-        self._h5group.create_group(name)
+        self._h5group.create_group(name, track_order=True)
         self._groups[name] = self._group_cls(self, name)
         return self._groups[name]
@@ -474,7 +474,7 @@ class Group(Mapping):
     def create_group(self, name):
         if name.startswith("/"):
-            return self._root.create_group(name[1:])
+            return self._root.create_group(name[1:], track_order=True)
         keys = name.split("/")
         group = self
         for k in keys[:-1]:
@@ -789,15 +789,15 @@ class File(Group):
                             "opening urls: {}".format(path)
                         )
                     try:
-                        with h5pyd.File(path, "r") as f:  # noqa
+                        with h5pyd.File(path, "r", track_order=True) as f:  # noqa
                             pass
                         self._preexisting_file = True
                     except IOError:
                         self._preexisting_file = False
-                    self._h5file = h5pyd.File(path, mode, **kwargs)
+                    self._h5file = h5pyd.File(path, mode, track_order=True, **kwargs)
                 else:
                     self._preexisting_file = os.path.exists(path) and mode != "w"
-                    self._h5file = h5py.File(path, mode, **kwargs)
+                    self._h5file = h5py.File(path, mode, track_order=True, **kwargs)
             else:  # file-like object
                 if version.parse(h5py.__version__) < version.parse("2.9.0"):
                     raise TypeError(
@@ -806,7 +806,7 @@ class File(Group):
                     )
                 else:
                     self._preexisting_file = mode in {"r", "r+", "a"}
-                    self._h5file = h5py.File(path, mode, **kwargs)
+                    self._h5file = h5py.File(path, mode, track_order=True, **kwargs)
         except Exception:
             self._closed = True
             raise
About this issue
- State: closed
- Created 2 years ago
- Comments: 24 (13 by maintainers)
To summarize the issue you pointed to:
netCDF4 wants the track_order parameter of the h5py constructors to be True. This requirement might be relaxed in a more recent version of the standard, but that seems a little far away for now.
Let me know if you think anything is missing from https://github.com/h5netcdf/h5netcdf/pull/129
Happy to hold off on any discussion there while we wait for shoyer to chime in here.
@hmaarrfk A first test writing with track_order=False and reading with track_order=True gives no problems. It looks like h5netcdf is somehow track_order agnostic.
It would be good to parameterize it, with track_order=True as the default. I'm not sure we need a deprecation cycle if this has no effect. The only effect I can think of is that some objects are repr'd in a different order. But if this new ordering is in line with Python dict, I would just give it a go. I'd like to hear @shoyer's suggestions on how to proceed here, though.
Sure, adding tests that check different read/write combinations would be good.
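A rough sketch of what such a read/write combination check could look like is given here. It uses plain h5py rather than h5netcdf, on the assumption that h5netcdf forwards the flag to h5py.File as in the suggested patch. The roundtrip helper and the file name are hypothetical, and a real test in the h5netcdf suite would more likely be a parametrized pytest case.

```python
import itertools
import os
import tempfile

import h5py
import numpy as np


def roundtrip(write_order, read_order):
    """Write with one track_order setting, read with another, return what was read.

    Hypothetical helper for illustration only.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "roundtrip.h5")
        with h5py.File(path, "w", track_order=write_order) as f:
            f.create_dataset("my_data", data=np.zeros((3, 3)))
            f.attrs["title"] = "example"
        # track_order only matters when objects are created, so it should be
        # harmless when passed for a read-only open.
        with h5py.File(path, "r", track_order=read_order) as f:
            return f["my_data"][...], f.attrs["title"]


# Exercise all four write/read combinations and collect a small summary of each.
results = {}
for write_order, read_order in itertools.product([False, True], repeat=2):
    data, title = roundtrip(write_order, read_order)
    results[(write_order, read_order)] = (data.shape, float(data.sum()), title)

print(results)
```

If all four entries are identical, the data and attributes survive every combination, which would match the observation that reading appears to be track_order agnostic.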