xarray: open_mfdataset parallel=True failing with netcdf4 >= 1.6.1
What happened?
When using the parallel=True key, open_mfdataset fails with NetCDF: Unknown file format. Running the same command again (with try+except), or with parallel=False, executes as expected.
works:
xr.open_mfdataset(dirpath +'\\*.nc', parallel=False)
works:
try:
    xr.open_mfdataset(dirpath +'\\*.nc', parallel=True)
except:
    xr.open_mfdataset(dirpath +'\\*.nc', parallel=True)
fails:
xr.open_mfdataset(dirpath +'\\*.nc', parallel=True)
[Errno -51] NetCDF: Unknown file format
all with engine='netcdf4'
Any help is highly appreciated, as I’m a bit lost on how to investigate this further.
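For anyone who needs a stopgap, the try+except pattern above can be wrapped in a small helper. This is only a sketch of the retry workaround, not a fix for the underlying problem. The names call_with_retry and open_parallel_with_retry are made up for illustration, and the helper assumes the failure surfaces as an OSError, which matches the [Errno -51] message shown above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T], retries: int = 1, delay: float = 0.0) -> T:
    """Call func(); if it raises OSError, try again up to `retries` more times."""
    if retries < 0:
        raise ValueError("retries must be >= 0")
    for attempt in range(retries + 1):
        try:
            return func()
        except OSError:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == retries:
                raise
            time.sleep(delay)
    raise AssertionError("unreachable")


def open_parallel_with_retry(pattern: str, retries: int = 1):
    """Hypothetical wrapper: retry open_mfdataset(parallel=True) on OSError."""
    # Imported lazily so call_with_retry itself has no third-party dependency.
    import xarray as xr

    return call_with_retry(
        lambda: xr.open_mfdataset(pattern, parallel=True, engine="netcdf4"),
        retries=retries,
    )
```

Usage would look like ds = open_parallel_with_retry(dirpath + '\\*.nc'). Catching only OSError, rather than using a bare except, keeps unrelated errors such as a bad path pattern visible.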
About this issue
- State: open
- Created 2 years ago
- Comments: 23 (14 by maintainers)
Commits related to this issue
- tempoarily remove iris from ci, trying to reproduce #7079 — committed to jhamman/xarray by jhamman a year ago
- Attempt to reproduce #7079 in CI (#7488) * tempoarily remove iris from ci, trying to reproduce #7079 * add parallel=True test when using dask cluster * lint * add local scheduler test * p... — committed to pydata/xarray by jhamman 9 months ago
@jhamman Sorry for my delay — I started this the other day and got waylaid. I’ll try to get back to it today or tomorrow.
Use LocalCluster! 😉
This fix will restrict you to serial compute.
You can also parallelize across processes using something like
or
LocalCluster(threads_per_worker=1, ...)
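To make the LocalCluster suggestion concrete, here is a minimal sketch, assuming dask.distributed is installed. The function names and the default of 4 workers are illustrative only and not taken from the thread. The idea is one thread per worker process, so that file opens are spread across processes instead of threads.

```python
def single_threaded_cluster_kwargs(n_workers: int = 4) -> dict:
    """Keyword arguments for a process-based cluster with one thread per worker."""
    return {"n_workers": n_workers, "threads_per_worker": 1, "processes": True}


def open_on_local_cluster(pattern: str, n_workers: int = 4):
    """Hypothetical helper: open files with parallel=True on a LocalCluster."""
    # Imported lazily so the kwargs helper works without dask or xarray installed.
    from dask.distributed import Client, LocalCluster
    import xarray as xr

    cluster = LocalCluster(**single_threaded_cluster_kwargs(n_workers))
    client = Client(cluster)  # becomes the default scheduler for dask work
    ds = xr.open_mfdataset(pattern, parallel=True, engine="netcdf4")
    # Return the client too, so the caller can call client.close() when done.
    return ds, client
```

On Windows, which the paths in the original report suggest, a process-based cluster should be started from inside an if __name__ == "__main__": block, since worker processes are spawned rather than forked.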