dask-cuda: Import failure in worker process

This came up as I was playing around with https://github.com/rapidsai/dask-cuda/blob/branch-0.16/dask_cuda/benchmarks/local_cudf_merge.py. I started seeing this import failure:

distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/core.py", line 151, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 335, in deserialize
    return loads(header, frames)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/serialize.py", line 71, in pickle_loads
    return pickle.loads(x, buffers=buffers)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/distributed/protocol/pickle.py", line 73, in loads
    return pickle.loads(x, buffers=buffers)
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/series.py", line 16, in <module>
    from cudf import _lib as libcudf
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/__init__.py", line 11, in <module>
    from cudf import core, datasets, testing
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/__init__.py", line 4, in <module>
    from cudf.core.dataframe import DataFrame, from_pandas, merge
  File "/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/dataframe.py", line 36, in <module>
    from cudf.core.series import Series
ImportError: cannot import name 'Series' from partially initialized module 'cudf.core.series' (most likely due to a circular import) (/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/series.py)
distributed.utils - ERROR - cannot import name 'Series' from partially initialized module 'cudf.core.series' (most likely due to a circular import) (/gpfs/fs1/mpapadakis/miniconda3/envs/legate3/lib/python3.8/site-packages/cudf/core/series.py)

I have simplified the reproducer down to https://gist.github.com/manopapad/aad22af7ec2206c7f078b71dc042ed56.

I suspect this has to do with doing imports inside a function that constructs dask graphs (e.g. here https://github.com/rapidsai/dask-cuda/blob/branch-0.16/dask_cuda/benchmarks/local_cudf_merge.py#L26-L32), then shipping those graphs to worker processes that haven’t executed those imports. This is just a hunch, though; someone familiar with the internals of dask would know better.
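To illustrate the part of the traceback where pickle.loads ends up running an import, here is a minimal sketch using only the standard library. It is not the cudf/dask reproducer: fractions.Fraction stands in for cudf.Series, and the helper unpickle_without_module is a hypothetical name that simulates a process which has not imported the module yet by temporarily hiding it from sys.modules. It only shows that unpickling an object imports its defining module on the receiving side; it does not reproduce the circular-import error itself.

```python
import pickle
import sys
from fractions import Fraction  # stand-in for cudf.Series (assumption, stdlib only)


def unpickle_without_module(payload, module_name):
    """Unpickle `payload` as if `module_name` had never been imported here.

    Hypothetical helper: hides the module from sys.modules for the duration
    of pickle.loads, then restores the original entry.
    """
    saved = sys.modules.pop(module_name, None)
    try:
        value = pickle.loads(payload)
        # pickle looks up the class by module name, which triggers an import
        imported_by_pickle = module_name in sys.modules
    finally:
        if saved is not None:
            sys.modules[module_name] = saved
    return value, imported_by_pickle


# "Client" side: the object is serialized where the module is already imported.
payload = pickle.dumps(Fraction(1, 3))

# "Worker" side: deserialization is what performs the import.
value, imported_by_pickle = unpickle_without_module(payload, "fractions")
print(imported_by_pickle, value)
```

If the module is already imported in the receiving process, the same lookup is just a sys.modules hit and no module-level code runs during deserialization.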

About this issue

  • State: open
  • Created 4 years ago
  • Comments: 15 (13 by maintainers)

Most upvoted comments

I think this is caused by a very particular combination of import cudf and persist. It occurs just the same with protocol="tcp", so we can take UCX out of this issue entirely to simplify things.

Doing from cudf import Series, or even just import cudf, at the top of the script fixes the crash. With just import cupy at the top, the crash remains.