distributed: Integer overflow in TCP comm
What happened:
I am receiving integer overflow errors in some large-scale computations. I haven't had the chance to construct an MCVE yet, but I will try to.
cc @jakirkham I believe you reworked this section recently. Maybe you can spot what's going wrong right away.
What you expected to happen:
Minimal Complete Verifiable Example:
N/A
Anything else we need to know?:
Traceback (most recent call last):
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/distributed/core.py", line 554, in handle_stream
    msgs = await comm.read()
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/distributed/comm/tcp.py", line 199, in read
    n = await stream.read_into(frames)
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 475, in read_into
    self._try_inline_read()
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 842, in _try_inline_read
    pos = self._read_to_buffer_loop()
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 755, in _read_to_buffer_loop
    if self._read_to_buffer() == 0:
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 1592, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
  File "/usr/lib/python3.6/ssl.py", line 1009, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.6/ssl.py", line 871, in read
    return self._sslobj.read(len, buffer)
  File "/usr/lib/python3.6/ssl.py", line 631, in read
    v = self._sslobj.read(len, buffer)
OverflowError: signed integer is greater than maximum (distributed.core)
Environment:
- Dask version: https://github.com/dask/dask/commit/91cee5b7944e47959f475eaebc42dc75be14bc21
- tornado version: 6.1
- Python version: py3.6
- Operating System: Deb9
- Install method (conda, pip, source): pip
About this issue
- State: closed
- Created 3 years ago
- Comments: 27 (23 by maintainers)
Commits related to this issue
- Cap maximum shard size at the size of an integer Supercedes https://github.com/dask/distributed/pull/5134 Copying over the summary of that PR Works around the OpenSSL 1.0.2 bug demonstrated in issu... — committed to mrocklin/distributed by mrocklin 3 years ago
- Cap maximum shard size at the size of an integer (#5141) Supercedes https://github.com/dask/distributed/pull/5134 Copying over the summary of that PR Works around the OpenSSL 1.0.2 bug demonstr... — committed to dask/distributed by mrocklin 3 years ago
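The idea named in the commit titles above, capping each shard at the size of an integer, can be sketched roughly as follows. This is not the code from the linked commits. It is a minimal illustration that assumes the limit is the largest signed 32-bit value, which is consistent with the "signed integer is greater than maximum" error in the traceback. The names `MAX_SHARD_SIZE` and `iter_shards` are hypothetical and are not part of distributed or tornado.

```python
# Hypothetical cap: the largest value that fits in a signed 32-bit C int.
# Assumption: this is the limit behind the OverflowError in the traceback.
MAX_SHARD_SIZE = 2**31 - 1


def iter_shards(buf, max_size=MAX_SHARD_SIZE):
    """Yield zero-copy slices of ``buf``, each at most ``max_size`` bytes long.

    ``buf`` can be any contiguous bytes-like object (bytes, bytearray,
    memoryview). Slicing a memoryview does not copy the underlying data.
    """
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    view = memoryview(buf).cast("B")  # flat view of the raw bytes
    for start in range(0, len(view), max_size):
        yield view[start:start + max_size]


if __name__ == "__main__":
    # Small sizes for demonstration; real buffers would exceed 2 GiB.
    data = bytearray(10)
    print([len(shard) for shard in iter_shards(data, max_size=4)])
```

Each shard could then be passed to a call such as `socket.recv_into(shard, len(shard))`, so that no single read or write asks the SSL layer for more bytes than a signed integer can represent.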
We are cutting a release tomorrow. These fixes should be in there ( https://github.com/dask/community/issues/173 )
Yeah, let's close for now and re-open if needed.