evennia: [BUG] Fatal error when using reload command
Describe the bug
Attempting to use the @reload command from in-game often produces a traceback of the webserver internal port being in-use that prevents the server from restarting, leaving the game dead.
20-01-18 17:21:03-05 [..] Server disconnected from the portal.
20-01-18 17:21:03-05 Main loop terminated.
20-01-18 17:21:03-05 Server Shut Down.
20-01-18 17:21:03-05 Loading /home/tehom/arx/evennia/evennia/server/server.py...
20-01-18 17:21:05-05 Loaded.
20-01-18 17:21:05-05 twistd 19.10.0 (/home/tehom/arx/arxvenv/bin/python 3.8.1) starting up.
20-01-18 17:21:05-05 reactor class: twisted.internet.epollreactor.EPollReactor.
20-01-18 17:21:05-05 Traceback (most recent call last):
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/internet/tcp.py", line 1361, in startListening
20-01-18 17:21:05-05 skt.bind(addr)
20-01-18 17:21:05-05 OSError: [Errno 98] Address already in use
20-01-18 17:21:05-05
20-01-18 17:21:05-05 During handling of the above exception, another exception occurred:
20-01-18 17:21:05-05
20-01-18 17:21:05-05 Traceback (most recent call last):
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/bin/twistd", line 11, in <module>
20-01-18 17:21:05-05 load_entry_point('Twisted==19.10.0', 'console_scripts', 'twistd')()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/twistd.py", line 31, in run
20-01-18 17:21:05-05 app.run(runApp, ServerOptions)
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/app.py", line 674, in run
20-01-18 17:21:05-05 runApp(config)
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/twistd.py", line 25, in runApp
20-01-18 17:21:05-05 runner.run()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/app.py", line 385, in run
20-01-18 17:21:05-05 self.postApplication()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/_twistd_unix.py", line 254, in postApplication
20-01-18 17:21:05-05 self.startApplication(self.application)
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/scripts/_twistd_unix.py", line 442, in startApplication
20-01-18 17:21:05-05 service.IService(application).privilegedStartService()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/service.py", line 283, in privilegedStartService
20-01-18 17:21:05-05 service.privilegedStartService()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/service.py", line 283, in privilegedStartService
20-01-18 17:21:05-05 service.privilegedStartService()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/internet.py", line 114, in privilegedStartService
20-01-18 17:21:05-05 self._port = self._getPort()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/application/internet.py", line 141, in _getPort
20-01-18 17:21:05-05 return getattr(_maybeGlobalReactor(self.reactor),
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/internet/posixbase.py", line 495, in listenTCP
20-01-18 17:21:05-05 p.startListening()
20-01-18 17:21:05-05 File "/home/tehom/arx/arxvenv/lib/python3.8/site-packages/twisted/internet/tcp.py", line 1363, in startListening
20-01-18 17:21:05-05 raise CannotListenError(self.interface, self.port, le)
20-01-18 17:21:05-05 twisted.internet.error.CannotListenError: Couldn't listen on 127.0.0.1:5001: [Errno 98] Address already in use.
To Reproduce
Steps to reproduce the behavior:
- Use
@reload - Sometimes the above happens randomly
- See error
Expected behavior
The game to restart normally. Using evennia reload from shell works just fine.
Environment, Evennia version, OS etc
Evennia version: 0.9.0 (rev 9649295) OS: posix Python: 3.8.1 Twisted: 19.10.0 Django: 2.2.9
Additional context
WEBSERVER_PORTAL_INTERNAL is set to 5001. So far that’s the only ‘port in use’ issue we’ve come across. Tentatively I want to say this is happening about half the time during our peak hours, but I’m not hammering reloads to test it.
About this issue
- Original URL
- State: open
- Created 4 years ago
- Comments: 20 (18 by maintainers)
I suggest altering the reload process so that it uses os.exec() on the server side to re-launch itself after shutting down the reactor loop. This would ensure the Python interpreter reloads all modules, and, as long as all file handles are closed before doing so, is perfectly fine and will not leak resources. The portal can just keep trying to connect to the server.
That might be a good way to go about it. I did have it fail to restart on the first re-try once, when I did a reload, then attempted to restart it from the console 6 seconds later. So apparently sometimes the port can stay occupied after shutdown is initiated for at least 6 seconds.