xarray: open_mfdataset() significantly slower on 0.9.1 vs. 0.8.2
I noticed a large speed discrepancy between xarray 0.8.2 and 0.9.1 when calling open_mfdataset() on a dataset of about 1.2 GB, split across 3 files, with netcdf4 as the engine. The 0.8.2 test was run first, so any disk caching would have favored 0.9.1; the slowdown is therefore probably not a caching artifact.
Test
import time

import xarray as xr

# Time how long it takes to open every NetCDF file in the current directory.
start_time = time.time()
ds0 = xr.open_mfdataset('./*.nc')
print("--- %s seconds ---" % (time.time() - start_time))
Result
xarray==0.8.2, dask==0.11.1, netcdf4==1.2.4
--- 0.736030101776 seconds ---
xarray==0.9.1, dask==0.13.0, netcdf4==1.2.4
--- 52.2800869942 seconds ---
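A single wall-clock measurement can still be skewed by caching or other activity on the machine, so repeating the call a few times in each environment may give a more trustworthy comparison. The following is only a sketch of that idea: `time_calls` and `time_open_mfdataset` are hypothetical helper names invented for illustration, not part of xarray, and the `./*.nc` pattern is simply the one used in the test above.

```python
import time


def time_calls(func, repeats=3):
    """Call func() `repeats` times and return the elapsed seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return timings


def time_open_mfdataset(pattern="./*.nc", repeats=3):
    """Hypothetical helper: time repeated xr.open_mfdataset() calls on `pattern`."""
    # Imported lazily so time_calls() stays usable without xarray installed.
    import xarray as xr

    return time_calls(lambda: xr.open_mfdataset(pattern), repeats)
```

Running `time_open_mfdataset()` under each xarray version and comparing the lists should show whether the first call is an outlier (which would suggest caching) or whether every call is slow.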
About this issue
- State: closed
- Created 7 years ago
- Comments: 17 (11 by maintainers)
Looks like it has been resolved! I tested with the latest pre-release, v0.10.0rc2, on the dataset linked by najascutellatus above: https://marine.rutgers.edu/~michaesm/netcdf/data/
xarray==0.10.0rc2-1-g8267fdb dask==0.15.4
xarray==0.9.1 dask==0.13.0
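When comparing environments like the two above, it can help to print the installed versions from the same interpreter that runs the timing, so the numbers and the versions cannot drift apart. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); `installed_versions` is a hypothetical helper name, and the default package list is an assumption based on the versions reported in this issue.

```python
from importlib import metadata


def installed_versions(packages=("xarray", "dask", "netCDF4")):
    """Return {package name: installed version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions
```

Calling `print(installed_versions())` right before the timing code records which versions produced a given result.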