redis-py: 3.1.0 causing intermittent Connection closed by server error
Version: redis-py: 3.1.0 redis: 3.2.4 django-redis: 4.10.0
Platform: Python 2.7 on Alpine-Linux inside Docker
Description: After upgrading from redis-py 3.0.1 to 3.1.0, our service became very unstable when talking to the existing Redis server. It generates around 30 ‘Connection closed by server.’ errors in 10 minutes while the server is under ~20 QPS. The error is intermittent and I have not been able to pin down what exactly triggers it. I tried restarting the Redis server and rebuilding our Docker images without any cache; neither helped.
After rolling back to redis-py==3.0.1, all errors are gone.
I understand that I don’t really provide enough information to fix the problem, but I hope to at least highlight this problem and others might provide more.
Errors
File "lib/last_seen/models.py" in user_seen
96. seen = cache.get(cache_key)
File "/usr/lib/python2.7/site-packages/django_redis/cache.py" in _decorator
39. raise e.parent
Exception Type: ConnectionError at /helper/listing/list_591.5917864/
Exception Value: Error while reading from socket: (u'Connection closed by server.',)
About this issue
- State: closed
- Created 5 years ago
- Reactions: 15
- Comments: 35 (11 by maintainers)
Great, glad things are going well. I’m going to add an EPollSelector today or over the weekend, write a few more tests and then get this merged to master.
Thanks for helping test this stuff!
I’m getting this error as well, also with retry_on_timeout. In my case the server has timeout 300 set in redis.conf. I can get this to happen consistently by setting a really low value, timeout 1.
I believe what’s happening is the connection is timed out by the server, but isn’t being removed from the client’s connection pool. A subsequent request that attempts to use that connection triggers ConnectionError: Error while reading from socket: ('Connection closed by server.',)
Previously, version 3.0.1 would retry and succeed, presumably with another working connection in the pool. In 3.1.0 it fails with an exception. This results in a 500 error with django-redis.
Issue #306 seems like it could be involved here. If a connection that has timed out on the server is not removed from the pool until it is tried again and fails, we’d get this behavior. The retry_on_timeout behavior in 3.0.1 mitigated this.
Thanks so much for helping test this. I’ve just released 3.2.0 with these changes.
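For anyone who needs a stopgap in application code, the stale-connection scenario described above (a pooled connection that the server has already timed out) can be handled by retrying the call once. The following is only a minimal sketch: StaleConnectionError and FlakyCache are hypothetical stand-ins so the example runs without a Redis server, and call_with_retry is an illustrative helper, not part of redis-py or django-redis. With the real library you would presumably pass redis.exceptions.ConnectionError as retry_on.

```python
class StaleConnectionError(Exception):
    """Hypothetical stand-in for redis.exceptions.ConnectionError."""


def call_with_retry(func, retry_on, attempts=2):
    """Call func(); if it raises retry_on, try again, up to `attempts` calls in total."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except retry_on as exc:
            # Remember the failure and fall through to the next attempt.
            last_error = exc
    # Every attempt failed: surface the most recent error to the caller.
    raise last_error


class FlakyCache:
    """Hypothetical cache whose first get() hits a connection the server already closed."""

    def __init__(self):
        self.calls = 0

    def get(self, key):
        self.calls += 1
        if self.calls == 1:
            raise StaleConnectionError("Connection closed by server.")
        return "value-for-" + key


cache = FlakyCache()
result = call_with_retry(lambda: cache.get("last_seen"), retry_on=StaleConnectionError)
```

This only masks the symptom. It assumes the retried command is safe to repeat, which holds for reads such as a cache get but not necessarily for writes.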
The error message you’re seeing is defined here: https://github.com/andymccurdy/redis-py/blob/master/redis/connection.py#L59 and raised in 5 or so cases throughout the connection module.
What I suspect is happening is that you have connections that sit idle for long enough and are killed by either the Redis server or a network device. Eventually the connection pool provides that connection to you and you attempt to issue a command which fails because the connection is closed. In previous versions, the command would be silently retried because of the faulty retry_on_timeout logic.
You could try to use the socket_keepalive and socket_keepalive_options parameters to keep the connections healthy. They are passed directly to the Python socket object here: https://github.com/andymccurdy/redis-py/blob/master/redis/connection.py#L529. I apologize there isn’t better documentation on these settings.
I’m in the process now of attempting to improve the Connection and ConnectionPool classes. Making sure the ConnectionPool only returns healthy connections is one of my goals although that will take a little bit of time.
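Since the keepalive settings mentioned above are not well documented, here is a rough sketch of how they might be assembled. The helper name keepalive_kwargs and the numeric values are assumptions chosen for illustration. The TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT constants come from Python's socket module and are platform dependent, so the sketch skips any that are missing.

```python
import socket


def keepalive_kwargs(idle=60, interval=10, count=3):
    """Build keyword arguments that enable TCP keepalive on a redis-py client.

    idle:     seconds of inactivity before the first keepalive probe
    interval: seconds between probes
    count:    unanswered probes before the OS drops the connection
    """
    options = {}
    wanted = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in wanted:
        # These constants exist on Linux; other platforms may lack some of them.
        const = getattr(socket, name, None)
        if const is not None:
            options[const] = value
    return {"socket_keepalive": True, "socket_keepalive_options": options}


kwargs = keepalive_kwargs()

# Usage (needs redis-py installed and a reachable server; host and port are placeholders):
#   import redis
#   client = redis.Redis(host="localhost", port=6379, **kwargs)
```

The idle value should probably be shorter than whatever is closing the connections, for example the server's timeout setting in redis.conf or a firewall's idle cutoff. Whether keepalive probes are enough to keep a given network device from dropping the connection depends on that device.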
File "test_tbmask.py", line 44, in mget
    res = rc.mget(keys)
File "/root/.jumbo/lib/python3.6/site-packages/redis/client.py", line 1397, in mget
    return self.execute_command('MGET', *args, **options)
File "/root/.jumbo/lib/python3.6/site-packages/redis/client.py", line 839, in execute_command
    return self.parse_response(conn, command_name, **options)
File "/root/.jumbo/lib/python3.6/site-packages/redis/client.py", line 853, in parse_response
    response = connection.read_response()
File "/root/.jumbo/lib/python3.6/site-packages/redis/connection.py", line 700, in read_response
    response = self._parser.read_response()
File "/root/.jumbo/lib/python3.6/site-packages/redis/connection.py", line 457, in read_response
    self.read_from_socket()
File "/root/.jumbo/lib/python3.6/site-packages/redis/connection.py", line 416, in read_from_socket
    raise ConnectionError(SERVER_CLOSED_CONNECTION_ERROR)
redis.exceptions.ConnectionError: Connection closed by server.
Can anyone help me? Why am I getting this error? Environment: CentOS 6.3, Python 3.6, redis-py 3.3.0.
I think you are seeing the side effects of the retry_on_timeout fix in this commit: b940d073de4c13f8dfb08728965c6ac7c183c935
The previous (<=3.0.1) condition check would retry the command on both TimeoutError and ConnectionError exceptions when the retry_on_timeout option was False. Obviously that conditional was wrong on multiple levels.
When you tested 3.1.0 with retry_on_timeout=True, I suspect that the one error you encountered was a legitimate ConnectionError, which was raised properly (ConnectionErrors are now never retried).
I went through all the changes between https://github.com/andymccurdy/redis-py/compare/3.0.1...3.1.0. The retry_on_timeout fix was the only thing that could be impacting this.
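To make the post-fix rule concrete, here is a small sketch of the retry decision as described above for 3.1.0: a command is retried only for a timeout, and only when retry_on_timeout is enabled. It is not the library's actual code. should_retry, FakeTimeoutError and FakeConnectionError are hypothetical names standing in for the real logic and for redis.exceptions.TimeoutError and redis.exceptions.ConnectionError.

```python
class FakeTimeoutError(Exception):
    """Hypothetical stand-in for redis.exceptions.TimeoutError."""


class FakeConnectionError(Exception):
    """Hypothetical stand-in for redis.exceptions.ConnectionError."""


def should_retry(exc, retry_on_timeout):
    """Return True only for a timeout when the retry_on_timeout option is enabled."""
    return bool(retry_on_timeout) and isinstance(exc, FakeTimeoutError)


# Connection errors are never retried, whatever the option is set to.
closed = FakeConnectionError("Connection closed by server.")
retry_closed_with_option = should_retry(closed, retry_on_timeout=True)
retry_closed_without_option = should_retry(closed, retry_on_timeout=False)

# Timeouts are retried only when the option is on.
timed_out = FakeTimeoutError("Timeout reading from socket")
retry_timeout_with_option = should_retry(timed_out, retry_on_timeout=True)
retry_timeout_without_option = should_retry(timed_out, retry_on_timeout=False)
```

Under this rule, a connection closed by the server while idle surfaces as an exception to the caller instead of being silently retried, which matches the errors reported after the upgrade.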