xarray: Problems with distributed and opendap netCDF endpoint
Code Sample
I am trying to load a dataset from an opendap endpoint using xarray, netCDF4, and distributed. I am having a problem only with non-local distributed schedulers (KubeCluster specifically). This could plausibly be an xarray, dask, or pangeo issue, but I have decided to post it here.
import xarray as xr
import dask
from dask.distributed import Client
from dask_kubernetes import KubeCluster

# create dataset from Unidata's test opendap endpoint, chunked in time
url = 'http://remotetest.unidata.ucar.edu/thredds/dodsC/testdods/coads_climatology.nc'
ds = xr.open_dataset(url, decode_times=False, chunks={'TIME': 1})

# all of these work
with dask.config.set(scheduler='synchronous'):
    ds.SST.compute()

with dask.config.set(scheduler='processes'):
    ds.SST.compute()

with dask.config.set(scheduler='threads'):
    ds.SST.compute()

# this works too
local_client = Client()
with dask.config.set(get=local_client):
    ds.SST.compute()

# but this does not
cluster = KubeCluster(n_workers=2)
kube_client = Client(cluster)
with dask.config.set(get=kube_client):
    ds.SST.compute()
In the worker logs, I see errors like the following:
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 5, 0, 0)
distributed.worker - INFO - Dependent not found: open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf 0 . Asking scheduler
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 3, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 0, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 1, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 7, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 6, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 2, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 9, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 8, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 11, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 10, 0, 0)
distributed.worker - INFO - Can't find dependencies for key ('open_dataset-4a0403564ad0e45788e42887b9bc0997SST-9fd3e5906a2a54cb28f48a7f2d46e4bf', 4, 0, 0)
distributed.worker - WARNING - Compute Failed
Function:  getter
args:      (ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyOuterIndexedArray(array=_ElementwiseFunctionArray(LazilyOuterIndexedArray(array=<xarray.backends.netCDF4_.NetCDF4ArrayWrapper object at 0x7f45d6fcbb38>, key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))), func=functools.partial(<function _apply_mask at 0x7f45d70507b8>, encoded_fill_values={-1e+34}, decoded_fill_value=nan, dtype=dtype('float32')), dtype=dtype('float32')), key=BasicIndexer((slice(None, None, None), slice(None, None, None), slice(None, None, None)))))), (slice(3, 4, None), slice(0, 90, None), slice(0, 180, None)))
kwargs:    {}
Exception: RuntimeError('NetCDF: Not a valid ID',)
Ultimately, the error comes from the underlying netCDF-C library: RuntimeError('NetCDF: Not a valid ID',). This message is what netCDF-C raises when a read goes through a dataset ID that is not valid in the current process, so it seems like something to do with serialization of the netCDF store. The worker images have a netCDF version (and all other package versions) identical to the client environment, which makes a version mismatch unlikely. I am at a loss for how to debug further.
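As a minimal, xarray-free illustration of the serialization hypothesis: objects wrapping OS-level handles (such as an open netCDF dataset ID) generally cannot be shipped to another process as-is, which is why backends have to re-open the file on each worker rather than pickle a live handle. This is a hypothetical sketch using a plain Python file object, not the actual xarray code path:

```python
import os
import pickle
import tempfile

# Create a small file and open a handle to it, standing in for an
# open netCDF dataset (a stand-in only, not xarray's backend object).
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as f:
    f.write('data')

handle = open(path)
try:
    # Live OS-level handles are not picklable; this raises TypeError.
    pickle.dumps(handle)
    survived = True
except TypeError:
    survived = False
finally:
    handle.close()

print(survived)  # → False
```

If an object in the task graph instead serializes silently but carries a stale netCDF ID after deserialization, reading through it in another process would produce exactly the 'Not a valid ID' failure seen on the workers.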
Output of xr.show_versions()
xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.111+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.8
pandas: 0.23.2
numpy: 1.15.1
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: None
h5py: None
Nio: None
zarr: 2.2.0
bottleneck: None
cyordereddict: None
dask: 0.18.2
distributed: 1.22.1
matplotlib: 2.2.3
cartopy: None
seaborn: None
setuptools: 39.2.0
pip: 18.0
conda: 4.5.4
pytest: 3.8.0
IPython: 6.4.0
sphinx: None
kube_client.get_versions(check=True)
{'scheduler': {'host': (('python', '3.6.3.final.0'),
('python-bits', 64),
('OS', 'Linux'),
('OS-release', '4.4.111+'),
('machine', 'x86_64'),
('processor', 'x86_64'),
('byteorder', 'little'),
('LC_ALL', 'en_US.UTF-8'),
('LANG', 'en_US.UTF-8'),
('LOCALE', 'en_US.UTF-8')),
'packages': {'required': (('dask', '0.18.2'),
('distributed', '1.22.1'),
('msgpack', '0.5.6'),
('cloudpickle', '0.5.5'),
('tornado', '5.0.2'),
('toolz', '0.9.0')),
'optional': (('numpy', '1.15.1'),
('pandas', '0.23.2'),
('bokeh', '0.12.16'),
('lz4', '1.1.0'),
('blosc', '1.5.1'))}},
'workers': {'tcp://10.20.8.4:36940': {'host': (('python', '3.6.3.final.0'),
('python-bits', 64),
('OS', 'Linux'),
('OS-release', '4.4.111+'),
('machine', 'x86_64'),
('processor', 'x86_64'),
('byteorder', 'little'),
('LC_ALL', 'en_US.UTF-8'),
('LANG', 'en_US.UTF-8'),
('LOCALE', 'en_US.UTF-8')),
'packages': {'required': (('dask', '0.18.2'),
('distributed', '1.22.1'),
('msgpack', '0.5.6'),
('cloudpickle', '0.5.5'),
('tornado', '5.0.2'),
('toolz', '0.9.0')),
'optional': (('numpy', '1.15.1'),
('pandas', '0.23.2'),
('bokeh', '0.12.16'),
('lz4', '1.1.0'),
('blosc', '1.5.1'))}},
'tcp://10.21.177.254:42939': {'host': (('python', '3.6.3.final.0'),
('python-bits', 64),
('OS', 'Linux'),
('OS-release', '4.4.111+'),
('machine', 'x86_64'),
('processor', 'x86_64'),
('byteorder', 'little'),
('LC_ALL', 'en_US.UTF-8'),
('LANG', 'en_US.UTF-8'),
('LOCALE', 'en_US.UTF-8')),
'packages': {'required': (('dask', '0.18.2'),
('distributed', '1.22.1'),
('msgpack', '0.5.6'),
('cloudpickle', '0.5.5'),
('tornado', '5.0.2'),
('toolz', '0.9.0')),
'optional': (('numpy', '1.15.1'),
('pandas', '0.23.2'),
('bokeh', '0.12.16'),
('lz4', '1.1.0'),
('blosc', '1.5.1'))}}},
'client': {'host': [('python', '3.6.3.final.0'),
('python-bits', 64),
('OS', 'Linux'),
('OS-release', '4.4.111+'),
('machine', 'x86_64'),
('processor', 'x86_64'),
('byteorder', 'little'),
('LC_ALL', 'en_US.UTF-8'),
('LANG', 'en_US.UTF-8'),
('LOCALE', 'en_US.UTF-8')],
'packages': {'required': [('dask', '0.18.2'),
('distributed', '1.22.1'),
('msgpack', '0.5.6'),
('cloudpickle', '0.5.5'),
('tornado', '5.0.2'),
('toolz', '0.9.0')],
'optional': [('numpy', '1.15.1'),
('pandas', '0.23.2'),
('bokeh', '0.12.16'),
('lz4', '1.1.0'),
('blosc', '1.5.1')]}}}
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 26 (18 by maintainers)
Oh, I didn’t even catch that the original was on defaults.