dask: Dask client not running on Colab
The Dask client fails to start on Colab. My code is:
from dask.distributed import Client, progress
client = Client()
client
and the error is:
/usr/local/lib/python3.6/dist-packages/distributed/bokeh/core.py:57: UserWarning:
Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the diagnostics dashboard on a random port instead.
warnings.warn('\n' + msg)
tornado.application - ERROR - Multiple exceptions in yield list
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 828, in callback
result_list.append(f.result())
File "/usr/local/lib/python3.6/dist-packages/tornado/concurrent.py", line 238, in result
raise_exc_info(self._exc_info)
File "<string>", line 4, in raise_exc_info
File "/usr/local/lib/python3.6/dist-packages/tornado/gen.py", line 1069, in run
yielded = self.gen.send(value)
File "/usr/local/lib/python3.6/dist-packages/distributed/deploy/local.py", line 229, in _start_worker
raise gen.TimeoutError("Worker failed to start")
tornado.gen.TimeoutError: Worker failed to start
---------------------------------------------------------------------------
TimeoutError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tornado/gen.py in callback(f)
827 try:
--> 828 result_list.append(f.result())
829 except Exception as e:
33 frames
TimeoutError: Worker failed to start
During handling of the above exception, another exception occurred:
TimeoutError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/distributed/deploy/local.py in _start_worker(self, death_timeout, **kwargs)
227 if w.status == 'closed' and self.scheduler.status == 'running':
228 self.workers.remove(w)
--> 229 raise gen.TimeoutError("Worker failed to start")
230
231 raise gen.Return(w)
TimeoutError: Worker failed to start
To reproduce the error, please see the following Colab gist:
About this issue
- State: closed
- Created 5 years ago
- Reactions: 1
- Comments: 26 (13 by maintainers)
@cornhundred I suspect the dashboard is inaccessible because printing the client info in Colab gives you a localhost address, which points at the Colab VM rather than your machine and so is not reachable from your browser:
I’ve dug around for ways to expose it, and the simplest I’ve seen is using ngrok like so:
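(The original snippet was not captured in this dump; below is a hedged sketch of that kind of setup using the pyngrok wrapper. The package name and the `ngrok.connect` / `public_url` calls are assumptions about a typical ngrok workflow, not the exact code from this comment, and current ngrok versions additionally require a configured auth token.)

```python
# Hedged sketch, not the original comment's code: expose the Dask
# dashboard (default port 8787) through an ngrok tunnel.
# Assumes `pip install pyngrok` has been run and an ngrok auth token
# is configured; this needs network access to ngrok's service.
from pyngrok import ngrok

tunnel = ngrok.connect(8787)   # tunnel the dashboard port
print(tunnel.public_url)       # open this URL to reach the dashboard
```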
That should print a link which you can click and get the Dask Status Dashboard for your Colab instance.
The above with a LocalCluster(processes=False) worked for me just a few moments ago.
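For reference, a minimal sketch of that threads-only setup (assuming `dask` and `distributed` are installed; the cluster sizing is left at its defaults):

```python
from dask.distributed import Client, LocalCluster

# processes=False runs workers as threads inside this interpreter
# instead of forked worker processes, which Colab appears to block
# (hence the "Worker failed to start" timeout in the traceback above).
cluster = LocalCluster(processes=False)
client = Client(cluster)

# There should be at least one thread-based worker registered.
n_workers = len(client.scheduler_info()["workers"])
print(n_workers)

client.close()
cluster.close()
```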
Any update on this? I’m getting the same error messages on Colab.
Okay, it seems
!pip install -U dask[complete]
can fix the problem.

Hi @jakirkham, yes, the link worked when I ran the notebook off my cellphone’s WiFi hotspot.
I wonder if this solution will work on Kaggle? see https://github.com/dask/dask/issues/4844#issuecomment-496017917
Glad to hear that worked.
Could you please raise another Colab issue about the Dashboard being inaccessible?
Hi @jakirkham and @mrocklin, yes, adding the argument
processes=False
worked - see gist https://colab.research.google.com/gist/cornhundred/1fa599c4cb09cded2a598cc02ca3947b/trying_dask_on_colab_no_processes.ipynb

However, I have to read the documentation more to understand the consequences of not using processes. Also, the Dashboard link is not working.
My guess is that they don’t allow users to start processes. You might try the following for now:
Though there are performance implications, particularly for GIL-bound computations: thread-based workers share one GIL, so pure-Python tasks will not run in parallel.
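Concretely, that suggestion amounts to something like the following sketch (assuming `dask` and `distributed` are installed; the array computation is just an illustration, not code from the thread):

```python
from dask.distributed import Client
import dask.array as da

# processes=False keeps all workers as threads in this process,
# sidestepping the blocked worker-process startup. The trade-off is a
# shared GIL: pure-Python tasks serialize, though NumPy-style work that
# releases the GIL can still run in parallel.
client = Client(processes=False)

x = da.random.random((1000, 1000), chunks=(250, 250))
mean = x.mean().compute()   # executed on the thread-based workers
print(round(mean, 2))       # close to 0.5 for uniform random data

client.close()
```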