dask-cuda: Import failure in worker process
This came up as I was playing around with https://github.com/rapidsai/dask-cuda/blob/branch-0.16/dask_cuda/benchmarks/local_cudf_merge.py. I started seeing this import failure:
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/core.py", line 151, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 335, in deserialize
    return loads(header, frames)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 71, in pickle_loads
    return pickle.loads(x, buffers=buffers)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 73, in loads
    return pickle.loads(x, buffers=buffers)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/series.py", line 16, in <module>
    from cudf import _lib as libcudf
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/__init__.py", line 11, in <module>
    from cudf import core, datasets, testing
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/__init__.py", line 4, in <module>
    from cudf.core.dataframe import DataFrame, from_pandas, merge
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/dataframe.py", line 36, in <module>
    from cudf.core.series import Series
ImportError: cannot import name 'Series' from partially initialized module 'cudf.core.series' (most likely due to a circular import) (/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/series.py)
distributed.utils - ERROR - cannot import name 'Series' from partially initialized module 'cudf.core.series' (most likely due to a circular import) (/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/series.py)
I have simplified the reproducer down to https://gist.github.com/manopapad/aad22af7ec2206c7f078b71dc042ed56.
I suspect this has to do with doing imports inside a function that constructs dask graphs (e.g. here https://github.com/rapidsai/dask-cuda/blob/branch-0.16/dask_cuda/benchmarks/local_cudf_merge.py#L26-L32), then shipping those graphs to worker processes that haven’t executed those imports. Just a hunch though, someone familiar with the internals of dask would know better.
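To make the hunch concrete: the traceback shows the import being triggered from inside `pickle.loads`, which is what happens whenever a process unpickles an object whose defining module it has not imported yet. The sketch below shows that behaviour with the standard library only. `fractions.Fraction` is an arbitrary stand-in for a cudf object and a plain subprocess stands in for a dask worker; neither is taken from the actual reproducer.

```python
import pickle
import subprocess
import sys
from fractions import Fraction

# Serialize an object in the "client" process, where its module is already imported.
payload = pickle.dumps(Fraction(1, 3))

# Code for the "worker": a fresh interpreter that never imports `fractions` itself.
child_code = """
import pickle, sys
data = sys.stdin.buffer.read()
before = "fractions" in sys.modules   # module not loaded yet
value = pickle.loads(data)            # unpickling imports it implicitly
after = "fractions" in sys.modules
print(before, after, value)
"""

proc = subprocess.run(
    [sys.executable, "-c", child_code],
    input=payload,
    capture_output=True,
    check=True,
)
child_output = proc.stdout.decode().strip()
print(child_output)
```

The point is only that the first import of the module happens as a side effect of deserialization in the receiving process, not at a place the script author chose. Whether that alone explains the cudf failure is not established here.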
About this issue
- State: open
- Created 4 years ago
- Comments: 15 (13 by maintainers)
I think this is due to a very particular use of `import cudf` and `persist`, and it occurs just the same with `protocol="tcp"`, so we can completely remove UCX complexity from this issue to simplify things.
Doing `from cudf import Series`, or even just `import cudf`, at the top of the script fixes the crash. With just `import cupy` at the top, the crash remains.
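One way to see why the point of entry into a package can decide whether this kind of `ImportError` appears is a toy pair of mutually dependent modules. This is a minimal sketch with hypothetical module names (`toy_series`, `toy_frame`), loosely modelled on the `series`/`dataframe` pair in the traceback. It reproduces only the general shape of the error message, not the cudf failure itself, and it is an assumption that a similar ordering effect is involved there.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical modules: toy_series needs toy_frame, and toy_frame needs
# the Series class that toy_series defines only after its own import line.
SERIES_SRC = """
import toy_frame          # runs before Series exists

class Series:
    pass
"""
FRAME_SRC = """
from toy_series import Series

class DataFrame:
    pass
"""


def try_import(first: str) -> str:
    """Import `first` with both toy modules unloaded; return 'ok' or the error text."""
    for name in ("toy_series", "toy_frame"):
        sys.modules.pop(name, None)
    try:
        importlib.import_module(first)
        return "ok"
    except ImportError as exc:
        return str(exc)


# Write the two modules to a temporary directory and make it importable.
tmp = tempfile.mkdtemp()
Path(tmp, "toy_series.py").write_text(SERIES_SRC)
Path(tmp, "toy_frame.py").write_text(FRAME_SRC)
sys.path.insert(0, tmp)
importlib.invalidate_caches()

result_series_first = try_import("toy_series")  # fails: Series not defined yet
result_frame_first = try_import("toy_frame")    # succeeds: Series defined in time
print(result_series_first)
print(result_frame_first)
```

Entering through `toy_series` fails because `toy_frame` asks for `Series` while `toy_series` is still only partially initialized. Entering through `toy_frame` lets `toy_series` finish first, so the same two files import cleanly. An explicit import at the top of a script pins down which path is taken before any deserialization happens.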