notebook: Websocket ping timeout

I followed these instructions to set up a Jupyter Notebook server on an Amazon EC2 instance. Everything works, except when I run a block of code that takes longer than 2 or 3 minutes to execute. While the kernel is busy running this code (I can see it executing thanks to a simple progress bar), it stops all of a sudden and displays a websocket ping timeout error. These are the messages I receive:

[I 17:22:19.083 NotebookApp] Serving notebooks from local directory: /home/ubuntu/Notebooks
[I 17:22:19.084 NotebookApp] 0 active kernels
[I 17:22:19.084 NotebookApp] The IPython Notebook is running at: https://[all ip addresses on your system]:8888/
[I 17:22:19.084 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 17:22:32.645 NotebookApp] 302 GET / (174.47.174.222) 0.72ms
[I 17:22:32.735 NotebookApp] 302 GET /tree (174.47.174.222) 0.93ms
[I 17:22:37.245 NotebookApp] 302 POST /login?next=%2Ftree (174.47.174.222) 0.92ms
[I 17:22:42.437 NotebookApp] Kernel started: 7b436e11-118d-4c28-9777-ec63baec0b5f
[W 17:24:13.097 NotebookApp] WebSocket ping timeout after 90000 ms.
[E 01:12:29.968 NotebookApp] Uncaught exception GET /api/kernels/7ca196a9-e64b-40dd-bd12-d8bc1a323686/channels?session_id=E2631BE0F605403986ED7D8387A07E99 (174.47.174.222)
    HTTPServerRequest(protocol='https', host='ec2-52-9-221-109.us-west-1.compute.amazonaws.com:8888', method='GET', uri='/api/kernels/7ca196a9-e64b-40dd-bd12-d8bc1a323686/channels?session_id=E2631BE0F605403986ED7D8387A07E99', version='HTTP/1.1', remote_ip='174.47.174.222', headers={'Origin': 'https://ec2-52-9-221-109.us-west-1.compute.amazonaws.com:8888', 'Upgrade': 'Websocket', 'Sec-Websocket-Version': '13', 'Connection': 'Upgrade', 'Sec-Websocket-Key': 'KMt575Kx/659v0lUZdkytA==', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; Touch; LCTE; rv:11.0) like Gecko', 'Host': 'ec2-52-9-221-109.us-west-1.compute.amazonaws.com:8888', 'Cookie': 'username-ec2-52-9-221-109-us-west-1-compute-amazonaws-com-8888="2|1:0|10:1463773579|62:username-ec2-52-9-221-109-us-west-1-compute-amazonaws-com-8888|48:ZThmOTlkNWItZGQ1Yy00YjlmLWExNGEtMmEyYzJkODNiMjU2|f323ab003cf23bfc7dd105e73f48ee970b4b26807e5ddaaafe221f9123d1ea65"', 'Cache-Control': 'no-cache'})
    Traceback (most recent call last):
      File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tornado/web.py", line 1401, in _stack_context_handle_exception
        raise_exc_info((type, value, traceback))
      File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tornado/stack_context.py", line 314, in wrapped
        ret = fn(*args, **kwargs)
      File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 184, in <lambda>
        self.on_recv(lambda msg: callback(self, msg), copy=copy)
      File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/notebook/base/zmqhandlers.py", line 188, in _on_zmq_reply
        self.write_message(msg, binary=isinstance(msg, bytes))
      File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/tornado/websocket.py", line 215, in write_message
        raise WebSocketClosedError()
    WebSocketClosedError
[I 17:24:42.542 NotebookApp] Saving file at /Untitled.ipynb

When accessing the server in Chrome or Internet Explorer I get the same messages.

Additionally (although this is very strange), when code is executing on the server my local laptop’s CPU utilization goes up about 50%.

Any thoughts on this websocket timeout?

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 1
  • Comments: 28 (4 by maintainers)

Most upvoted comments

I’m having the same problem with running a Keras model over ssh, and it’s super annoying. Could we change the timeout for cells as in here: http://nbconvert.readthedocs.io/en/stable/execute_api.html ?

You can also send the output to a file like so:

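import sys
original_stdout = sys.stdout  # in a notebook this is the kernel's stream, not sys.__stdout__
with open('keras_output.txt', 'w') as log_file:
    sys.stdout = log_file  # send the Keras progress output to the file
    try:
        history = model.fit(X, y_cat, batch_size=128, nb_epoch=200, verbose=1)
    finally:
        sys.stdout = original_stdout  # restore output; the file closes on exit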

That worked for me. http://stackoverflow.com/questions/4675728/redirect-stdout-to-a-file-in-python

Or you could set the verbosity option to 0 (verbose=0 in the fit call).

Any update on this issue? I am getting it in a JupyterLab pod on Kubernetes (k8s).

A lot of people run Jupyter with Keras, and this bug is over a year old now. Is there any way of getting this fixed? It is a big pain running Keras models with no progress bars. I note it says “needs info”, but I am not sure what that means.

Websocket pinging is used because certain proxies close a websocket if there are no messages over it for 60 seconds: we send a ping message every 30 seconds, and the browser sends a pong back. This is part of the websocket protocol, so the browser should do it automatically. If we don’t get the pong back after 90 seconds, we assume that the connection is lost and kill it.
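To make the timing described above concrete, here is a toy model of the keepalive bookkeeping. It is only a sketch and not the notebook server's actual code: the names `first_timeout`, `PING_INTERVAL_S` and `PING_TIMEOUT_S` are made up for illustration, and the only values taken from this thread are the 30-second ping interval and the 90-second timeout.

```python
PING_INTERVAL_S = 30  # the server pings this often (value from the comment above)
PING_TIMEOUT_S = 90   # the connection is dropped after this long without a pong


def first_timeout(pong_times, duration):
    """Return the second at which the server would close the connection,
    or None if the browser answered often enough for the whole duration.

    pong_times: seconds (since connect) at which a pong arrived.
    This is a simplified, hypothetical model of the behaviour described.
    """
    pongs = sorted(pong_times)
    next_pong = 0
    last_pong = 0  # treat the moment of connecting as the first sign of life
    for now in range(duration + 1):
        # Record every pong that has arrived by this second.
        while next_pong < len(pongs) and pongs[next_pong] <= now:
            last_pong = pongs[next_pong]
            next_pong += 1
        # No pong for the whole timeout window: assume the connection is lost.
        if now - last_pong >= PING_TIMEOUT_S:
            return now
    return None


# A browser that answers every ping is never disconnected.
healthy = first_timeout(range(PING_INTERVAL_S, 301, PING_INTERVAL_S), 300)
# A browser that stops answering after 60 s is dropped 90 s later.
stalled = first_timeout([30, 60], 300)
print(healthy, stalled)
```

In this model the kernel being busy plays no role at all. Only a browser that stops answering pings triggers the timeout, which is why the high CPU usage on the client looks suspicious.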

I can’t think why code executing would affect that, but it’s suspicious that it causes high CPU usage on the client. What progress bar library is the code using? Can you disable the progress bar and see if the behaviour still occurs?

I do use tqdm. But it is an extra step to specify tqdm progress bars on every call to fit functions; and another step to save the notebook with the bars. Does not feel like a proper solution.

Is it hard to fix? (In reply to Nate George, who wrote:)

Could also use keras-tqdm maybe https://github.com/bstriner/keras-tqdm


Try disabling Windows Firewall for your private network (if you are running the notebook on localhost:8888). That solved it for me.

This WebSocket ping timeout also occurs frequently if you output a considerable number of logging messages to the cell in a long-running job (not with Keras).
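If heavy cell output is the trigger, one possible workaround (untested against this particular bug) is to send log records to a file instead of the cell. A minimal sketch using only the standard `logging` module follows; the helper `make_file_logger`, the logger name and the log path are hypothetical choices for illustration.

```python
import logging
import os
import tempfile


def make_file_logger(name, path):
    """Return a logger that writes to `path` and nothing else."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Stop records from reaching root handlers, which may print to the cell.
    logger.propagate = False
    handler = logging.FileHandler(path, mode="w")  # start with an empty file
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Hypothetical location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "long_job.log")
job_logger = make_file_logger("long_job", log_path)

for step in range(1000):
    # Stand-in for the real work of a long-running job.
    job_logger.info("finished step %d", step)
```

The progress can then be followed from a terminal (for example with `tail -f` on the log file) without any of it travelling over the websocket.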

This bug is not just an issue with fit but also with downloading applications. In that case the progress bar cannot be turned off, e.g.:

from keras.applications.vgg16 import VGG16

@Jeffalltogether what version of tornado do you have (might start with pip list or conda list to be safe)? It’s odd that this timeout is raising an error in .get(), which should have returned before the timer started. That suggests that something isn’t waiting the way we expect it to - perhaps because your tornado is too old, or perhaps it’s a new version that changed something out from under us.
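As an alternative to `pip list` or `conda list`, the versions can also be read from inside the environment that runs the server. This sketch assumes Python 3.8 or newer, where `importlib.metadata` is available (the traceback in this thread comes from Python 2.7, where `pip list` is the way to go). The helper `installed_version` and the choice of packages to check are illustrative.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages that appear in the traceback in this thread.
for name in ("tornado", "notebook", "pyzmq"):
    print(name, installed_version(name) or "not installed")
```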

Is it possible to override this timeout when the notebook is in the “(busy)” state?

Not easily, and I’m pretty sure it’s the wrong fix, in any case. The fact that the kernel is executing something shouldn’t stop the browser from responding to websocket pings.

I have seen a number of other people with a similar issue.

That does look like the same issue that you’re seeing, and @minrk is the person most likely to be able to work it out.