cfgrib: opening with xarray open_mfdataset and parallel=True fails unless the files were previously opened with parallel=False
Minimal repro:
import xarray as xr
ds = xr.open_mfdataset('gfs.0p25.201511*00.f0*.grib2', engine='cfgrib', combine='nested', concat_dim=['step'], parallel=True, chunks=24, backend_kwargs={'filter_by_keys': {'typeOfLevel': 'surface'}, 'indexpath': ''})
Expected result: returns an xarray dataset.
Actual result:
ECCODES ERROR : grib_handle_create: cannot create handle, no definitions found
ecCodes assertion failed: `h' in /home/conda/feedstock_root/build_artifacts/eccodes_1570714279314/work/src/grib_query.c:529
Note: if you have previously opened the files with parallel=False in the same session/kernel, the above will pass, so the repro needs to be run in a new session. This was executed on a local dask cluster.
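The observation above (a serial open earlier in the same session lets the parallel open pass) can be sketched as a small wrapper. This is only an illustration of the reported behaviour, not a fix: the helper name `open_with_serial_warmup` is hypothetical, the opener is passed in so the sketch does not depend on any particular library version, and it is an assumption that the warm-up must cover the same set of files as the parallel open.

```python
from typing import Any, Callable, Iterable


def open_with_serial_warmup(
    paths: Iterable[str],
    open_fn: Callable[..., Any],
    **kwargs: Any,
) -> Any:
    """Open the files once with parallel=False, then again with parallel=True.

    Hypothetical helper: `open_fn` is expected to accept the same arguments
    as xarray.open_mfdataset (a list of paths plus keyword arguments).
    """
    paths = list(paths)  # keep the caller's order; it matters for combine='nested'
    if not paths:
        raise ValueError("no input files given")

    # Serial pass, mirroring the "previously opened with parallel=False" step.
    warmup = open_fn(paths, parallel=False, **kwargs)
    close = getattr(warmup, "close", None)
    if callable(close):
        close()  # release the warm-up handle before reopening

    # Parallel pass, which the report says succeeds after a serial open.
    return open_fn(paths, parallel=True, **kwargs)


# Possible usage with the repro above (requires xarray and cfgrib):
#   import glob
#   import xarray as xr
#   files = sorted(glob.glob('gfs.0p25.201511*00.f0*.grib2'))
#   ds = open_with_serial_warmup(
#       files, xr.open_mfdataset, engine='cfgrib',
#       combine='nested', concat_dim=['step'], chunks=24,
#       backend_kwargs={'filter_by_keys': {'typeOfLevel': 'surface'}, 'indexpath': ''})
```

The serial pass costs the time of a full parallel=False open, so this only helps when the dataset is reopened several times in one session.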
About this issue
- State: open
- Created 5 years ago
- Reactions: 1
- Comments: 23 (4 by maintainers)
Commits related to this issue
- Move open_mfdataset() parallel param value to config Move setting of the `parallel` parameter for `xarray.open_mfdataset ()` to the extraction configuration YAML file, with its default value set to `... — committed to UBC-MOAD/Reshapr by douglatornell 7 months ago
- Move open_mfdataset() parallel param value to config (#106) Move setting of the `parallel` parameter for `xarray.open_mfdataset ()` to the extraction configuration YAML file, with its default value ... — committed to UBC-MOAD/Reshapr by douglatornell 7 months ago
I confirm this bug report with a different dataset and different error messages.
With `parallel=False`, `open_mfdataset` always works.
Restarting the kernel and running with `parallel=True` always crashes Python inside ecCodes, but it returns a few different error messages. I observed at least two.
It looks like a locking/threading problem. @shahramn, do you have any hints?
Hi @guidocioni, I’m not sure if you’re on macOS or Linux, but we’ve managed to update the conda version. Could you do the following:
conda search eccodes -c conda-forge
and if you see a 2.18.0 version with _1 at the end, install that version please. It takes the conda servers a little while to update their indexes, but it has appeared now, at least on macOS.
Sounds like you’re on the right path. A few years ago, when `cfgrib` was still a baby 😃, I was getting an error while trying to read compressed grib files, because the recipe for `eccodes` on `conda` did not include the compression library due to a license issue. So in the end the problem was on the `eccodes` side on `conda`.
Awesome. Thanks for looking into it. Not all heroes wear capes 😃
On 08.10.2020 15:01, shahramn wrote:
Looks like the conda recipe does NOT enable the thread safety flags. I will look into this.
The ecCodes library has to be built with thread safety enabled. See https://confluence.ecmwf.int/display/UDOC/Is+ecCodes+thread-safe+-+ecCodes+FAQ
I can confirm this is still here on `xarray 0.16.1` and `cfgrib 0.9.8.4`. For now I’m using `parallel = False`, but it takes about 3 times longer than with `parallel = True`. The problem is that when opening the files for the first time with `parallel = True`, `eccodes` throws an error to `cfgrib`, which is then unable to write the `idx` files. The error which you then see in `python` is due to empty `idx` files.
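If empty `idx` files are indeed what a failed first parallel open leaves behind, a quick way to check is to list the zero-byte index files next to the GRIB files. The sketch below uses only the standard library; the function name `find_empty_index_files` is hypothetical, and the `*.idx` pattern assumes the index files sit in the same directory as the data and end in `.idx`, which should be adjusted if a custom `indexpath` is used.

```python
from pathlib import Path
from typing import List, Union


def find_empty_index_files(
    directory: Union[str, Path], pattern: str = "*.idx"
) -> List[Path]:
    """Return zero-byte files in `directory` that match `pattern`, sorted by path."""
    return sorted(
        p
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # List the candidates; removing them is left to the reader.
    for path in find_empty_index_files("."):
        print(path)
```

The sketch only reports the files, so nothing is deleted by accident. Whether removing them before reopening is enough to recover is not confirmed in this thread.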