visdom: Visdom hanging on connect

I’m having trouble connecting to my visdom server in some experiments. This only happens with some of my larger experiments, and even then not consistently: it occurs after long periods of everything working fine.

What happens is that visdom hangs and I cannot plot. The values being sent to the visdom server are not large, usually nothing more than a dozen 2x2 matrices. If I do a keyboard interrupt, I get:

Traceback (most recent call last):
  File "scripts/deep_infomax.py", line 68, in <module>
    run(controller)
  File "/home/devonh/.local/lib/python3.7/site-packages/cortex/main.py", line 48, in run
    train.main_loop(model, **exp.ARGS['train'])
  File "/home/devonh/.local/lib/python3.7/site-packages/cortex/_lib/train.py", line 415, in main_loop
    plot(epoch, init=(epoch == first_epoch), viz_test_only=viz_test_only)
  File "/home/devonh/.local/lib/python3.7/site-packages/cortex/_lib/viz.py", line 276, in plot
    update=update
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 335, in wrapped_f
    return f(*args, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 1367, in line
    update=update, name=name)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 335, in wrapped_f
    return f(*args, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 1292, in scatter
    return self._send(data_to_send, endpoint=endpoint)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 548, in _send
    data=json.dumps(msg),
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/sessions.py", line 581, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/devonh/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/home/devonh/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "/home/devonh/anaconda3/lib/python3.7/http/client.py", line 1321, in getresponse
    response.begin()
  File "/home/devonh/anaconda3/lib/python3.7/http/client.py", line 296, in begin
    version, status, reason = self._read_status()
  File "/home/devonh/anaconda3/lib/python3.7/http/client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/devonh/anaconda3/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
KeyboardInterrupt

This is after a few-minute hang.

Is there any way to fix this? I am using visdom version 0.1.8.8.

About this issue

  • State: open
  • Created 5 years ago
  • Comments: 19 (9 by maintainers)

Most upvoted comments

@rdevon With that confirmation, I suspect it has to do with our use of a requests session. A real fix would be to explicitly set a shorter timeout on the _send call so that it doesn’t hang for an unacceptable amount of time, catch the timeout specifically, and then refresh the visdom object’s _session object and retry on failure. Thanks for your work investigating so far.
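A minimal sketch of the approach described in this comment, written against the requests library directly rather than visdom's internals. The ResilientSender class, its parameters, and the session_factory argument are hypothetical names invented for illustration; this is not visdom's actual _send implementation, and the timeout and retry values are assumptions to tune.

```python
import json

import requests


class ResilientSender:
    """Hypothetical helper (not part of visdom): POST with an explicit
    timeout and rebuild the requests session when a request stalls."""

    def __init__(self, url, timeout=5.0, retries=2,
                 session_factory=requests.Session):
        self.url = url
        self.timeout = timeout          # seconds; assumed value, tune as needed
        self.retries = retries          # extra attempts after the first one
        self._session_factory = session_factory
        self._session = session_factory()

    def send(self, msg):
        last_error = None
        for _ in range(self.retries + 1):
            try:
                # Without timeout=..., requests waits indefinitely for a reply.
                response = self._session.post(
                    self.url, data=json.dumps(msg), timeout=self.timeout
                )
                return response.text
            except (requests.exceptions.Timeout,
                    requests.exceptions.ConnectionError) as err:
                last_error = err
                # Drop the possibly stale pooled connection and start fresh.
                self._session.close()
                self._session = self._session_factory()
        # Every attempt failed: surface the last error to the caller.
        raise last_error
```

Usage would look something like ResilientSender("http://<server>:<port>/<endpoint>").send(msg), where the URL is a placeholder for wherever the visdom server is listening. Raising after the final attempt, rather than silently returning, lets the training loop decide whether to skip the plot or abort.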