distributed: Integer overflow in TCP comm

What happened:

I am receiving integer overflow errors in some large-scale computations. I haven’t had the chance to construct an MCVE yet, but will try to.

cc @jakirkham I believe you reworked this section recently. Maybe you can spot what’s going wrong right away.

What you expected to happen:

Minimal Complete Verifiable Example:

N/A

Anything else we need to know?:

Traceback (most recent call last):
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/distributed/core.py", line 554, in handle_stream
    msgs = await comm.read()
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/distributed/comm/tcp.py", line 199, in read
    n = await stream.read_into(frames)
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 475, in read_into
    self._try_inline_read()
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 842, in _try_inline_read
    pos = self._read_to_buffer_loop()
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 755, in _read_to_buffer_loop
    if self._read_to_buffer() == 0:
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 867, in _read_to_buffer
    bytes_read = self.read_from_fd(buf)
  File "/mnt/mesos/sandbox/venv/lib/python3.6/site-packages/tornado/iostream.py", line 1592, in read_from_fd
    return self.socket.recv_into(buf, len(buf))
  File "/usr/lib/python3.6/ssl.py", line 1009, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/lib/python3.6/ssl.py", line 871, in read
    return self._sslobj.read(len, buffer)
  File "/usr/lib/python3.6/ssl.py", line 631, in read
    v = self._sslobj.read(len, buffer)
OverflowError: signed integer is greater than maximum (distributed.core)
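
Reading the last frames, tornado calls self.socket.recv_into(buf, len(buf)) with the full buffer length, and the ssl layer then rejects that length. The error text suggests the length no longer fits in a signed 32-bit C int, i.e. the buffer is larger than 2**31 - 1 bytes, though that is an inference from the traceback rather than something confirmed here. A minimal sketch of one way to stay under such a limit is to cap each read and loop until the buffer is full. The names recv_into_chunked and MAX_CHUNK are hypothetical, and this is not necessarily how distributed or tornado addressed the problem:

```python
import socket

# Largest value that fits in a signed 32-bit C int. Assumed (not confirmed)
# to be the limit behind the OverflowError in the traceback above.
MAX_CHUNK = 2**31 - 1


def recv_into_chunked(sock, buf, max_chunk=MAX_CHUNK):
    """Fill ``buf`` from ``sock`` using reads of at most ``max_chunk`` bytes.

    ``sock`` only needs a ``recv_into(buffer, nbytes)`` method, as provided
    by ``socket.socket`` and ``ssl.SSLSocket``. Returns the number of bytes
    read, which is less than ``len(buf)`` only if the peer closed early.
    """
    view = memoryview(buf).cast("B")  # byte-wise view, no copy
    total = 0
    while total < len(view):
        nbytes = min(len(view) - total, max_chunk)
        n = sock.recv_into(view[total:total + nbytes], nbytes)
        if n == 0:  # connection closed before the buffer was filled
            break
        total += n
    return total


# Small demonstration over a local socket pair, with an artificially low cap
# so that several reads are needed.
a, b = socket.socketpair()
payload = bytes(range(256)) * 4
a.sendall(payload)
a.close()

out = bytearray(len(payload))
assert recv_into_chunked(b, out, max_chunk=100) == len(payload)
assert bytes(out) == payload
b.close()
```

The cap is a parameter so the looping logic can be exercised with small buffers, without allocating more than 2 GiB to reproduce the failure.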

Environment:

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 27 (23 by maintainers)

Most upvoted comments

We are cutting a release tomorrow. These fixes should be in there (https://github.com/dask/community/issues/173).

Yeah, let’s close for now and re-open if needed.