dask-cuda: Starting the cluster with memory_limit=None causes failures on the latest nightly
Minimal Repro:

```python
from dask_cuda import LocalCUDACluster
from dask.distributed import Client


def test_func():
    return "abc"


if __name__ == "__main__":
    cluster = LocalCUDACluster(memory_limit=None)
    client = Client(cluster)
    test_val = client.submit(test_func)
    print(test_val.result())
```
Trace:

```
Traceback (most recent call last):
  File "test_bug.py", line 14, in <module>
    print(test_val.result())
  File "/raid/vjawa/conda_install/conda_env/envs/cudf_march_30/lib/python3.7/site-packages/distributed/client.py", line 220, in result
    raise exc.with_traceback(tb)
  File "/raid/vjawa/conda_install/conda_env/envs/cudf_march_30/lib/python3.7/site-packages/dask_cuda/device_host_file.py", line 139, in __setitem__
    self.host_buffer[key] = value
  File "/raid/vjawa/conda_install/conda_env/envs/cudf_march_30/lib/python3.7/site-packages/zict/buffer.py", line 84, in __setitem__
    if self.weight(key, value) <= self.n:
TypeError: '<=' not supported between instances of 'int' and 'NoneType'
```
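The failing comparison can be reproduced in isolation. The sketch below is a hypothetical stand-in for `zict.Buffer.__setitem__` (not the actual zict code): when `memory_limit=None` propagates down as the buffer's capacity, the weight check ends up comparing an `int` against `None`, which Python 3 rejects.

```python
def buffer_setitem(weight, n):
    """Hypothetical stand-in for the check in zict.Buffer.__setitem__:
    mirrors `if self.weight(key, value) <= self.n:` where `n` is the
    buffer capacity derived from memory_limit."""
    return weight <= n


# With a numeric capacity the check works fine:
buffer_setitem(42, 1_000_000)  # -> False (value fits under the limit)

# With memory_limit=None, the capacity is None and the comparison fails:
try:
    buffer_setitem(42, None)
except TypeError as e:
    print(e)  # '<=' not supported between instances of 'int' and 'NoneType'
```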
Env:

```
dask-cuda 0.14.0a200330 py37_35 rapidsai-nightly
```
Workaround:

Setting `memory_limit='auto'` works:

```python
cluster = LocalCUDACluster(memory_limit='auto')
```
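Until a fix lands, callers can normalize the argument themselves before constructing the cluster. `normalize_memory_limit` below is a hypothetical helper (not part of dask-cuda) that maps `None` to the `'auto'` default, which the current nightly handles correctly:

```python
def normalize_memory_limit(limit):
    """Hypothetical guard: map None to the 'auto' default so the
    host-memory buffer always gets a usable capacity."""
    return "auto" if limit is None else limit


print(normalize_memory_limit(None))    # auto
print(normalize_memory_limit("16GB"))  # 16GB
```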
Other details:

This used to work on an earlier nightly:

```
dask-cuda 0.13.0b200329 py37_86 rapidsai-nightly
```
CC: @ayushdg , who triaged this.
About this issue
- State: closed
- Created 4 years ago
- Comments: 17 (17 by maintainers)
Thanks for the clarification. Agreed that a discussion for #270 around device_limits would also be useful. To clarify my earlier point about auto: I am referring to the default (auto) for host memory.