tornado: is there a way to gracefully exit a Tornado application?

There are many gists that show how to gracefully exit a Tornado application, like this:

stop_httpserver(http_server)
def try_stop_ioloop():
    io_loop = IOLoop.instance()
    if io_loop._callbacks or io_loop._timeouts:
        io_loop.add_timeout(time.time()+1, try_stop_ioloop)
    else:
        io_loop.stop()

try_stop_ioloop()

Is it safe to test io_loop._timeouts or io_loop._callbacks? Or can _timeouts be ignored?

About this issue

  • State: open
  • Created 8 years ago
  • Reactions: 3
  • Comments: 19 (9 by maintainers)

Most upvoted comments

You definitely shouldn’t be looking at IOLoop._callbacks or IOLoop._timeouts. In addition to being private variables, there’s a good chance that they will never be empty, for reasons that are unimportant (if you use tornado_curl_httpclient there is always an active timeout). Instead of asking the IOLoop for all activity, you need to decide what activity you care about and exit the server when all of that activity is done.

Or you can do what I do and just stop the IOLoop 5 seconds after the stop is requested. This will be enough for any regular request to finish, and if it’s taking longer than 5 seconds you probably want to let it fail anyway. It’s not worth making a more precise measurement to stop in less than 5 seconds.
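The fixed-delay approach described above could be sketched roughly as follows. This is a minimal illustration, not code from the thread: the helper names `begin_delayed_stop` and `install_signal_handlers` are hypothetical, and the functions assume they are handed a Tornado `IOLoop` and `HTTPServer` (only `HTTPServer.stop()`, `IOLoop.call_later()`, `IOLoop.stop()` and `IOLoop.add_callback_from_signal()` are used).

```python
import signal


def begin_delayed_stop(io_loop, http_server, delay=5.0):
    """Stop accepting new connections now, stop the loop `delay` seconds later.

    `io_loop` is assumed to be a tornado.ioloop.IOLoop and `http_server` a
    tornado.httpserver.HTTPServer (hypothetical helper, not a Tornado API).
    """
    http_server.stop()                       # listening sockets are closed
    io_loop.call_later(delay, io_loop.stop)  # in-flight requests get `delay` seconds


def install_signal_handlers(io_loop, http_server, delay=5.0):
    """Route SIGTERM/SIGINT to begin_delayed_stop on the IOLoop thread."""
    def on_signal(signum, frame):
        # Signal handlers should do as little as possible, so the real work
        # is handed over to the IOLoop.
        io_loop.add_callback_from_signal(
            begin_delayed_stop, io_loop, http_server, delay)

    signal.signal(signal.SIGTERM, on_signal)
    signal.signal(signal.SIGINT, on_signal)


# Possible wiring (assumes `http_server` was created and is listening):
#   io_loop = tornado.ioloop.IOLoop.current()
#   install_signal_handlers(io_loop, http_server)
#   io_loop.start()
```

The delay is a blunt upper bound rather than a measurement: a request that needs longer than `delay` seconds is cut off when the loop stops.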

This seems like a question that is asked frequently enough that an example solution should be included in the user’s guide section of the docs. I would consider adding something to either the Running and deploying section or the Structure section. Ben, which do you think makes the most sense?

Yeah, something like @ploxiln’s approach is what I’d do. I don’t think there have been any noteworthy changes between Tornado 4 and 5 here. It all depends on what “gracefully” means for your application. (One missing piece in the snippet above is that you probably want to signal to your load balancer somehow to stop the incoming traffic).

Here’s how I do it (currently with tornado-4.5.3 but I expect it will work the same with tornado-5.1):

async def shutdown():
    periodic_task.stop()
    http_server.stop()
    for client in ws_clients.values():
        client['handler'].close()
    await gen.sleep(1)
    ioloop.IOLoop.current().stop()

def exit_handler(sig, frame):
    ioloop.IOLoop.instance().add_callback_from_signal(shutdown)

...
if __name__ == '__main__':
    signal.signal(signal.SIGTERM, exit_handler)
    signal.signal(signal.SIGINT,  exit_handler)
    ...

(instead of just gen.sleep(1) I actually have a global active-request count and a tornado.locks.Event() that is set after the last request has finished, but that’s more complicated and more code …)
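The active-request count with an event, as mentioned in the parenthetical above, might look something like this sketch. It is not the commenter's code: `RequestTracker` and its method names are hypothetical, and `asyncio.Event` from the standard library is swapped in for `tornado.locks.Event` so the block runs without Tornado installed (this assumes Tornado 5 or later, where the IOLoop runs on asyncio).

```python
import asyncio


class RequestTracker:
    """Counts in-flight requests and signals when the count returns to zero.

    Hypothetical helper. Create it while the event loop is running so the
    asyncio.Event is bound to the right loop on older Python versions.
    """

    def __init__(self):
        self._active = 0
        self._idle = asyncio.Event()
        self._idle.set()            # nothing in flight yet

    def started(self):
        self._active += 1
        self._idle.clear()

    def finished(self):
        self._active -= 1
        if self._active == 0:
            self._idle.set()        # last request is done

    async def wait_idle(self, timeout):
        """Return True if all requests finished within `timeout` seconds."""
        try:
            await asyncio.wait_for(self._idle.wait(), timeout)
            return True
        except asyncio.TimeoutError:
            return False
```

One plausible wiring is to call `started()` from a handler's `prepare()` and `finished()` from its `on_finish()`, then replace the fixed `gen.sleep(1)` in `shutdown()` with `await tracker.wait_idle(timeout)`. Whether those hooks cover every case, such as long-lived WebSocket connections, depends on the application.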

You are right that there are plenty of caveats to how to really do a graceful shutdown depending on the deployment strategy. But for the guide, maybe it would suffice to show a simple example case (such as what you mentioned where you stop the IOLoop 5 seconds later to give requests time to finish) while explaining that this is not the only way, but that it at least works for simple setups. I’ll take a stab at this in the next few days and maybe we can try to iterate through a good solution via a pull request.

We managed to shut down Tornado gracefully by implementing the following steps:

  1. intercept the SIGINT/SIGTERM signals
  2. manually stop the HTTPServer right away (no more requests are accepted)
  3. manually stop the IOLoop once all pending requests have terminated

See https://github.com/svaponi/tornado-graceful-shutdown/blob/main/server.py
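The three steps listed above can be sketched as a single coroutine. This is an illustrative sketch and not the code from the linked repository: `drain_and_stop`, `pending`, `grace` and `poll` are hypothetical names, and `asyncio.sleep` assumes Tornado 5 or later (on Tornado 4, `gen.sleep` would be the equivalent).

```python
import asyncio
import time


async def drain_and_stop(http_server, io_loop, pending, grace=10.0, poll=0.1):
    """Stop accepting connections, wait for pending work, then stop the loop.

    `pending` is any zero-argument callable returning the number of requests
    still in flight (how that is counted is up to the application).
    Returns True if everything finished within `grace` seconds.
    """
    http_server.stop()                            # step 2: refuse new connections
    deadline = time.monotonic() + grace
    while pending() > 0 and time.monotonic() < deadline:
        await asyncio.sleep(poll)                 # let in-flight handlers run
    drained = pending() == 0
    io_loop.stop()                                # step 3: stop the IOLoop
    return drained


# Step 1 (hypothetical wiring, following the signal-handler pattern used
# earlier in this thread):
#   def on_signal(signum, frame):
#       io_loop.add_callback_from_signal(
#           lambda: drain_and_stop(http_server, io_loop, lambda: active_count))
#   signal.signal(signal.SIGTERM, on_signal)
```

The `grace` deadline keeps a stuck request from blocking shutdown indefinitely. Polling is used here only for brevity.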

I came here today to ask the exact same question. After much reading on SO, I came up with this example:

import logging
import signal
import time
import tornado.httpserver
import tornado.ioloop
import tornado.web
import tornado.options
from tornado import gen

logger = logging.getLogger()


signal_received = False


class BlockingHandler(tornado.web.RequestHandler):

    def get(self):
        logger.debug("Starting sleep")
        time.sleep(5)
        logger.debug("Sleep done")
        self.finish('block')


class AsyncHandler(tornado.web.RequestHandler):

    @gen.coroutine
    def get(self):
        logger.debug("Starting range 5")
        for i in range(5):
            logger.debug(i)
            yield gen.sleep(1)
        logger.debug("Range done")
        self.finish('async')


def start_server():
    urls = [
        (r'/', BlockingHandler),
        (r'/async', AsyncHandler),
    ]
    application = tornado.web.Application(urls)
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
    ioloop = tornado.ioloop.IOLoop.instance()

    def register_signal(sig, frame):
        global signal_received
        logger.info("%s received, stopping server" % sig)
        http_server.stop()  # no more requests are accepted
        signal_received = True

    def stop_on_signal():
        global signal_received
        if signal_received and not ioloop._callbacks:
            ioloop.stop()
            logger.info("IOLoop stopped")

    tornado.ioloop.PeriodicCallback(stop_on_signal, 1000).start()
    signal.signal(signal.SIGTERM, register_signal)
    logging.info("Starting server")
    ioloop.start()


if __name__ == '__main__':
    tornado.options.parse_command_line()
    start_server()

It has two problems:

  1. BlockingHandler works as expected (it finishes the response and returns it to the client), but an exception is raised:
[E 160808 14:46:12 ioloop:633] Exception in callback None
    Traceback (most recent call last):
      File "/Users/margus/src/git/tornado-sigterm/venv/lib/python3.5/site-packages/tornado/ioloop.py", line 887, in start
        handler_func(fd_obj, events)
      File "/Users/margus/src/git/tornado-sigterm/venv/lib/python3.5/site-packages/tornado/stack_context.py", line 275, in null_wrapper
        return fn(*args, **kwargs)
      File "/Users/margus/src/git/tornado-sigterm/venv/lib/python3.5/site-packages/tornado/netutil.py", line 260, in accept_handler
        connection, address = sock.accept()
      File "/usr/local/Cellar/python3/3.5.2/Frameworks/Python.framework/Versions/3.5/lib/python3.5/socket.py", line 195, in accept
        fd, addr = self._accept()
    OSError: [Errno 9] Bad file descriptor
[I 160808 14:46:12 application:58] IOLoop stopped

… which I think means that the tornado.access logger is trying to write after the IOLoop has stopped and all the sockets it could write to are gone. I'm not sure whether this is a feature or a bug in the logging.

  2. With AsyncHandler, the request is killed and the client receives a Remote Disconnected error.

It would be very nice to have an official, recommended way of approaching this. I’ll now go and test https://gist.github.com/nicky-zs/6304878 in the real world.