distributed: Unable to start a Client after update to latest dask
@mrocklin suggested filing an issue here.
In a restarted notebook I run:
import distributed
client = distributed.Client()
And get hundreds of these errors:
tornado.application - ERROR - Exception in Future <tornado.concurrent.Future object at 0x7fc162990be0> after timeout
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 910, in error_callback
future.result()
File "/opt/conda/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1063, in run
yielded = self.gen.throw(*exc_info)
File "/opt/conda/lib/python3.6/site-packages/distributed/nanny.py", line 300, in start
yield self._wait_until_running()
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
value = future.result()
File "/opt/conda/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1069, in run
yielded = self.gen.send(value)
File "/opt/conda/lib/python3.6/site-packages/distributed/nanny.py", line 386, in _wait_until_running
raise ValueError("Worker not started")
ValueError: Worker not started
tornado.application - ERROR - Exception in Future <tornado.concurrent.Future object at 0x7fc1629b82b0> after timeout
Traceback (most recent call last):
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 910, in error_callback
future.result()
File "/opt/conda/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1063, in run
yielded = self.gen.throw(*exc_info)
File "/opt/conda/lib/python3.6/site-packages/distributed/nanny.py", line 300, in start
yield self._wait_until_running()
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1055, in run
value = future.result()
File "/opt/conda/lib/python3.6/site-packages/tornado/concurrent.py", line 238, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/opt/conda/lib/python3.6/site-packages/tornado/gen.py", line 1069, in run
yielded = self.gen.send(value)
File "/opt/conda/lib/python3.6/site-packages/distributed/nanny.py", line 386, in _wait_until_running
raise ValueError("Worker not started")
ValueError: Worker not started
I can reproduce this with:
docker run -it --rm quantumtinkerer/jupyter-research:latest bash
# create a new env or just use the current one where `distributed` is already installed
conda create --yes -n dask python=3.6 dask distributed
source activate dask
python
import distributed
c = distributed.Client()
The Docker image is based on jupyter/docker-stacks/base-notebook.
The weird thing is that it only happens on our server, where we run JupyterHub. When I try it on a different machine, there doesn't seem to be an issue.
Any idea on how I can debug this?
About this issue
- State: closed
- Created 7 years ago
- Comments: 18 (11 by maintainers)
My first guess would be some networking issue. You might try the following:
Client(processes=False)
which will avoid networking issues entirely.
@basnijholt thank you for access to your system. I tried updating to master with
!pip install git+https://github.com/dask/distributed.git --upgrade
and things seem to work now:
I should have a bit of time to look at this starting tomorrow.
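Since a networking issue was the first guess in this thread, one quick thing to check on the affected machine is whether plain TCP over loopback works at all. The following is only a sketch using the Python standard library; it is not part of distributed, the function name `loopback_tcp_works` is made up for illustration, and passing this check does not rule out other causes of "Worker not started".

```python
import socket


def loopback_tcp_works(host="127.0.0.1", timeout=2.0):
    """Return True if a TCP connection on `host` can be opened and used.

    Hypothetical helper for illustration; not a distributed API.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
            server.settimeout(timeout)
            # Port 0 asks the OS to pick any free port.
            server.bind((host, 0))
            server.listen(1)
            port = server.getsockname()[1]
            # The connect succeeds once the kernel queues it in the backlog,
            # so a single thread can connect first and accept afterwards.
            with socket.create_connection((host, port), timeout=timeout) as client:
                conn, _ = server.accept()
                with conn:
                    conn.settimeout(timeout)
                    client.sendall(b"ping")
                    return conn.recv(4) == b"ping"
    except OSError:
        # Covers bind/connect failures and timeouts alike.
        return False


if __name__ == "__main__":
    print("loopback TCP works:", loopback_tcp_works())
```

If this reports False on the JupyterHub server but True on a machine where the Client starts fine, that would point toward the host's network setup rather than dask itself. If it reports True on both, the `Client(processes=False)` suggestion above is still worth trying.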