distributed: Can't start worker on macOS
On current master (distributed-2.9.1+13.g7d2ed43c, with tornado==6.0.3), I get the following error on macOS but not in Docker:
# already running dask scheduler
(.venv) ➜ model git:(nyc) ✗ PYTHONPATH=. dask-worker 'localhost:8786' --nthreads 1 --memory-limit 6GB --local-directory /tmp/ --nprocs 1
distributed.nanny - INFO - Start Nanny at: 'tcp://127.0.0.1:55985'
objc[39479]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called.
objc[39479]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
distributed.nanny - INFO - Worker process 39479 was killed by signal 6
tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOMainLoop object at 0x118220490>>, <Task finished coro=<Nanny._on_exit() done, defined at /Users/brett/model/.venv/lib/python3.7/site-packages/distributed/nanny.py:396> exception=TypeError('addresses should be strings or tuples, got None')>)
Traceback (most recent call last):
File "/Users/brett/model/.venv/lib/python3.7/site-packages/tornado/ioloop.py", line 743, in _run_callback
ret = callback()
File "/Users/brett/model/.venv/lib/python3.7/site-packages/tornado/ioloop.py", line 767, in _discard_future_result
future.result()
File "/Users/brett/model/.venv/lib/python3.7/site-packages/distributed/nanny.py", line 399, in _on_exit
await self.scheduler.unregister(address=self.worker_address)
File "/Users/brett/model/.venv/lib/python3.7/site-packages/distributed/core.py", line 757, in send_recv_from_rpc
result = await send_recv(comm=comm, op=key, **kwargs)
File "/Users/brett/model/.venv/lib/python3.7/site-packages/distributed/core.py", line 556, in send_recv
raise exc.with_traceback(tb)
File "/Users/brett/model/.venv/lib/python3.7/site-packages/distributed/core.py", line 408, in handle_comm
result = handler(comm, **msg)
File "/Users/brett/model/.venv/lib/python3.7/site-packages/distributed/scheduler.py", line 2122, in remove_worker
address = self.coerce_address(address)
File "/Users/brett/model/.venv/lib/python3.7/site-packages/distributed/scheduler.py", line 4831, in coerce_address
raise TypeError("addresses should be strings or tuples, got %r" % (addr,))
TypeError: addresses should be strings or tuples, got None
distributed.nanny - INFO - Closing Nanny at 'tcp://127.0.0.1:55985'
distributed.dask_worker - INFO - End worker
No issues in the latest release (2.9.1). Possibly related to #3356 but I didn’t see any similar error messages so I figured I’d keep it separate for now.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 30 (28 by maintainers)
I am now able to reproduce. I am on Mojave; I used brew to build a venv and pip-installed the same packages as @jrbourbeau. I am also getting a lovely pop-up with the following details:
Application Specific Information:
objc[74025]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called.
crashed on child side of fork pre-exec

Thread 0 Crashed:
0 libsystem_kernel.dylib 0x00007fff77fee016 __abort_with_payload + 10
1 libsystem_kernel.dylib 0x00007fff77fe95db abort_with_payload_wrapper_internal + 82
2 libsystem_kernel.dylib 0x00007fff77fe9589 abort_with_reason + 22
3 libobjc.A.dylib 0x00007fff766cf8dd _objc_fatalv(unsigned long long, unsigned long long, char const*, __va_list_tag*) + 108
4 libobjc.A.dylib 0x00007fff766cf78f _objc_fatal(char const*, …) + 135
5 libobjc.A.dylib 0x00007fff766d060f performForkChildInitialize(objc_class*, objc_class*) + 341
6 libobjc.A.dylib 0x00007fff766d162f initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt<false>&, bool) + 187
7 libobjc.A.dylib 0x00007fff766c0690 lookUpImpOrForward + 228
8 libobjc.A.dylib 0x00007fff766c0114 _objc_msgSend_uncached + 68
9 libobjc.A.dylib 0x00007fff766c369b +[NSObject new] + 86
10 com.apple.Foundation 0x00007fff4e18f478 -[NSThread init] + 61
11 com.apple.Foundation 0x00007fff4e18f3e2 ____NSThreads_block_invoke + 64
12 libdispatch.dylib 0x00007fff77e4e63d _dispatch_client_callout + 8
13 libdispatch.dylib 0x00007fff77e4fd4b _dispatch_once_callout + 20
14 com.apple.Foundation 0x00007fff4e18f39d _NSThreadGet0 + 325
15 com.apple.Foundation 0x00007fff4e18ec31 _NSInitializePlatform + 407
16 libobjc.A.dylib 0x00007fff766c1d51 call_load_methods + 233
17 libobjc.A.dylib 0x00007fff766bf405 load_images + 117
18 dyld 0x000000011bdcb46a dyld::notifySingle(dyld_image_states, ImageLoader const*,
--no-nanny also resolves the issue. Additionally, using spawn for the multiprocessing-method in ~/.config/dask/distributed.yaml also resolves the problem.
Digging a little more into brew and multiprocessing, it appears that this has been an issue in the past: https://bugs.python.org/issue33725
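For anyone who wants to see what the spawn start method changes without dask in the picture, here is a minimal stdlib-only sketch. It is not what distributed does internally; the helper names (child_pid, run_in_spawned_child) are made up for illustration. The idea is that spawn launches a fresh interpreter rather than fork()-ing the parent, so the child never has to touch Objective-C state inherited across a fork, which is what the crash report above is complaining about.

```python
import multiprocessing as mp
import os


def child_pid(queue):
    # Runs in the child process; reports the child's PID back to the parent.
    queue.put(os.getpid())


def run_in_spawned_child():
    # Hypothetical helper: "spawn" starts a brand-new interpreter instead of
    # fork()-ing the parent, so no parent state is inherited by the child.
    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    proc = ctx.Process(target=child_pid, args=(queue,))
    proc.start()
    pid = queue.get(timeout=30)
    proc.join(timeout=30)
    return pid, proc.exitcode


if __name__ == "__main__":
    # The guard matters with spawn: the child re-imports this module.
    pid, exitcode = run_in_spawned_child()
    print("child", pid, "exited with", exitcode)
```

Run it as a script (not pasted into a REPL), since spawn needs to re-import the module that defines child_pid. Swapping "spawn" for "fork" in get_context is a quick way to compare behaviour on an affected machine, though whether this tiny example triggers the same objc abort will depend on what has been loaded in the parent.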