dask-cuda: [BUG] Sync Client constructor hangs when connecting to an async LocalCUDACluster

Describe the bug

Connecting a Client(asynchronous=False) to a LocalCUDACluster(asynchronous=True) hangs. This means an async LocalCUDACluster cannot be used with RAPIDS libraries that expect a sync client, e.g., blazingsql (cc @felipeblazing @kkraus14).

Demo:

import asyncio, cudf, dask_cudf
from dask_cuda import LocalCUDACluster
from dask.distributed import Client

async def main(): 
    async with await LocalCUDACluster(asynchronous=True, dashboard_address=None) as cluster_async:
        print('making sync client..') ### last message to get printed
        with Client(address=cluster_async, asynchronous=False) as client_sync:
            print('exiting client..')
        print('exiting cluster..')
    print('cleaned up.')

asyncio.run(main())

=>

making sync client..
-------------------------
tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOLoop object at 0x7f72a2bccf90>>, <Task finished coro=<Worker.heartbeat() done, defined at /conda/envs/rapids/lib/python3.7/site-packages/distributed/worker.py:929> exception=OSError('Timed out during handshake while connecting to tcp://127.0.0.1:33281 after 10 s')>)
Traceback (most recent call last):
  File "/conda/envs/rapids/lib/python3.7/site-packages/distributed/comm/core.py", line 319, in connect
    handshake = await asyncio.wait_for(comm.read(), time_left())
  File "/conda/envs/rapids/lib/python3.7/asyncio/tasks.py", line 449, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

Steps/Code to reproduce bug

See above

Expected behavior

The code should terminate without exceptions.

Environment overview (please complete the following information)

ubuntu w/ 10.2 -> docker ubuntu 18 -> conda rapids=17

Additional context

This is specific to LocalCUDACluster. When the cluster is started separately (dask-scheduler / dask-cuda-worker, then connecting by address), initial testing suggests that mixing sync and async clients works, at least in separate processes.
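One possible explanation, offered as an assumption and not confirmed in this report: with asynchronous=True the scheduler shares the caller's event loop, so a blocking Client constructor called from a coroutine stalls the very loop that would have to answer its handshake. The sketch below reproduces that pattern using only the standard library, with no dask involved. The names handle, blocking_handshake and demo are hypothetical, and the echo server is just a stand-in for a scheduler.

```python
import asyncio
import socket


async def handle(reader, writer):
    # Minimal "handshake": echo one line back to the client.
    try:
        data = await reader.readline()
        writer.write(data)
        await writer.drain()
    except ConnectionError:
        pass  # the client may already have given up and disconnected
    finally:
        writer.close()


def blocking_handshake(port, timeout=0.5):
    # Stand-in for a synchronous client constructor (hypothetical helper):
    # it blocks the calling thread until a reply arrives or the timeout expires.
    with socket.create_connection(("127.0.0.1", port), timeout=timeout) as sock:
        sock.sendall(b"hello\n")
        try:
            return sock.recv(1024)
        except socket.timeout:
            return None


async def demo():
    # The server lives on this coroutine's event loop.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        # Called directly from the coroutine: the loop is blocked, so the
        # handler cannot run and the handshake times out.
        blocked = blocking_handshake(port)
        # Offloaded to a worker thread: the loop stays free to answer.
        loop = asyncio.get_running_loop()
        offloaded = await loop.run_in_executor(None, blocking_handshake, port)
    return blocked, offloaded


results = asyncio.run(demo())
print("called on the loop thread:", results[0])
print("called from a worker thread:", results[1])
```

If this is indeed the cause, it would also be consistent with the observation above that a separately started scheduler works, since a separate process has its own event loop that the sync client cannot block.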

About this issue

  • State: open
  • Created 3 years ago
  • Comments: 15 (3 by maintainers)

Most upvoted comments

The context manager statement is written incorrectly:

async with await LocalCUDACluster

It should be:

async with LocalCUDACluster