visdom: Visdom hanging on connect
I’m having trouble connecting to my visdom server in some experiments. It only happens with some of my larger experiments, and even then not consistently: it occurs after long periods of everything working fine.
What happens is that visdom hangs and I cannot plot. The values being sent to the visdom server are not large, usually nothing more than a dozen 2x2 matrices. If I do a keyboard interrupt, I get:
Traceback (most recent call last):
  File "scripts/deep_infomax.py", line 68, in <module>
    run(controller)
  File "/home/devonh/.local/lib/python3.7/site-packages/cortex/main.py", line 48, in run
    train.main_loop(model, **exp.ARGS['train'])
  File "/home/devonh/.local/lib/python3.7/site-packages/cortex/_lib/train.py", line 415, in main_loop
    plot(epoch, init=(epoch == first_epoch), viz_test_only=viz_test_only)
  File "/home/devonh/.local/lib/python3.7/site-packages/cortex/_lib/viz.py", line 276, in plot
    update=update
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 335, in wrapped_f
    return f(*args, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 1367, in line
    update=update, name=name)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 335, in wrapped_f
    return f(*args, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 1292, in scatter
    return self._send(data_to_send, endpoint=endpoint)
  File "/home/devonh/.local/lib/python3.7/site-packages/visdom/__init__.py", line 548, in _send
    data=json.dumps(msg),
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/sessions.py", line 581, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/home/devonh/.local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/home/devonh/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/home/devonh/.local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "/home/devonh/anaconda3/lib/python3.7/http/client.py", line 1321, in getresponse
    response.begin()
  File "/home/devonh/anaconda3/lib/python3.7/http/client.py", line 296, in begin
    version, status, reason = self._read_status()
  File "/home/devonh/anaconda3/lib/python3.7/http/client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/home/devonh/anaconda3/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
KeyboardInterrupt
This is after a few-minute hang.
Any way to fix this? I am using version 0.1.8.8
About this issue
- State: open
- Created 5 years ago
- Comments: 19 (9 by maintainers)
@rdevon With that confirmation, I suspect it has to do with our use of a request session. A real fix would be to catch the timeout specifically (first by explicitly setting a shorter timeout on the _send call so that it doesn’t hang for an unacceptable amount of time), then attempt to refresh the visdom object’s _session object and retry on a failure. Thanks for your work investigating so far.
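The approach described above (a short explicit timeout, then a fresh session and a retry) could be sketched roughly as follows. This is only an illustration, not visdom's actual code: `RetryingSender`, `make_session`, `retry_on`, and `max_retries` are hypothetical names, and the session is assumed to expose a requests-style `post(url, data=..., timeout=...)` method. The default `retry_on` covers only standard-library exceptions; with requests one would presumably pass `(requests.exceptions.Timeout, requests.exceptions.ConnectionError)` instead.

```python
import json
import socket


class RetryingSender:
    """Hypothetical helper (not part of visdom): POST with a timeout,
    rebuilding the session and retrying when the connection stalls."""

    def __init__(self, make_session, timeout=5.0, max_retries=2,
                 retry_on=(socket.timeout, ConnectionError)):
        # make_session: zero-argument callable returning a session-like object
        # with post(url, data=..., timeout=...), e.g. requests.Session.
        self._make_session = make_session
        self._session = make_session()
        self._timeout = timeout
        self._max_retries = max_retries
        self._retry_on = retry_on

    def send(self, url, msg):
        last_error = None
        for _ in range(self._max_retries + 1):
            try:
                # The explicit timeout keeps a stalled socket from blocking
                # for minutes, as in the traceback above.
                return self._session.post(
                    url, data=json.dumps(msg), timeout=self._timeout
                )
            except self._retry_on as err:
                last_error = err
                # Discard the possibly stale session and start from a new one.
                close = getattr(self._session, "close", None)
                if close is not None:
                    close()
                self._session = self._make_session()
        # Every attempt failed: surface the last error to the caller.
        raise last_error
```

With requests installed, usage might look like `RetryingSender(requests.Session, retry_on=(requests.exceptions.Timeout, requests.exceptions.ConnectionError)).send(url, msg)`, where `url` is a placeholder for the server endpoint. Re-raising after the last attempt lets the training loop decide whether to skip plotting or abort.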