redis-py: 4.6.0 - ConnectionError max number of clients reached
Version: 4.6.0
Platform: Python 3.10 / Docker or Linux
Server: Redis 7.0.11
Description:
We have had an application using Redis in an async context for months now. With the upgrade of redis-py from 4.5.x to 4.6.0, our app on the staging environment started failing with `ConnectionError: max number of clients reached`.
I first ran a review of the code to add the `await redis_conn.close()` statements that were missing, but the issue still occurs.
It seems we reach the 10,000 `maxclients` setting in roughly one hour, whereas we never did before. Unfortunately, I don't track this metric on this environment.
On staging last week, I worked around it by setting a client timeout to get past the one-hour issue (`config set timeout 60`).
Doing it again a few minutes ago, I could drop the connection count from 8k to 160, and it then stays stable around 160.
On the production environment (Python 3.9, redis-py 4.5.5), we have had 25 connected clients on average over the last 6 months, with activity similar to our staging environment. And there is no timeout configured, so it's disabled (the default configuration).
On staging, after reverting to redis-py 4.5.5 with the timeout disabled (i.e. set to 0), the count stabilizes around 20 connections after a few minutes, whereas it climbed to 1000+ in the same amount of time with 4.6.0.
My code used to be basically:

```python
from redis.asyncio import Redis

from app import settings


async def get_redis_connection():
    redis = Redis.from_url(settings.REDIS_URL)
    return redis


async def list_queued_job_ids():
    redis_conn = await get_redis_connection()
    ...  # do something with redis data and related code
    await redis_conn.close()
    return something
```
So I don't know if it's a regression in 4.6.0, maybe related to the abstract class changes, or something to add/change in my code, but I didn't find any documentation describing the changes to be made.
Let me know if you need further details.
About this issue
- State: closed
- Created a year ago
- Comments: 37 (18 by maintainers)

Commits related to this issue
- Bump redis from 4.6.0 to 5.0.0 (#227), committed to aeecleclair/Hyperion by dependabot[bot] 10 months ago
- chore(deps): bump redis from 4.5.4 to 5.0.0 in /backend (#724), committed to Substra/substra-backend by dependabot[bot] 10 months ago
- chore(deps): bump redis from 4.6.0 to 5.0.0 in /metrics-exporter (#723), committed to Substra/substra-backend by dependabot[bot] 10 months ago
The reason was that the async version was doing all kinds of nasty stuff, including creating a new event loop, in order to perform a disconnect.

`__del__` is a synchronous method, whereas `disconnect()` is async. The reasoning was also that if there is a `__del__` call, there is no need to disconnect cleanly (i.e. a socket disconnect will happen when the socket object gets collected). What I failed to take into account was that in the case of connection pooling, this is not a disconnect but a return to the pool: connections need to return to the pool, not disconnect. We can do this more cleanly. I will create a PR which achieves this without having to invoke all the async stuff.
As several related bugs were fixed with this issue, I suggest that the next one starts a new issue 😄
thanks @kristjanvalur for all your fixes
aaand, it got merged 😃
Right, @Harry-Lees, I'd rewrite that as:
or
This ensures that the connection is closed even if there is an error while using it.
@kristjanvalur It seems I found the last remaining culprits where the redis connection was not correctly closed.
Thanks for your detailed explanation about context managers. It helped me fix a few more this afternoon.
On my local machine it seems quite stable so far. I will run it all night long to see if it changes, but it looks good this time 😄
I haven't tried 5.0 in our production application, since I had issues with 4.6 which capped our Redis instance connections, but I also saw this issue on 4.6 with a FastAPI application. Our pattern was essentially:
which was then called in endpoints using FastAPI’s dependency injection.
Interestingly, in async Python the canonical method to use is `aclose()`. I'll probably create a PR to add those, so that stdlib context managers and other tools play well with redis. The proper pattern, with the small example you gave me, is:
Or for a worker:
Basically, make sure that whenever you have created a Redis connection pool (implicit inside the Redis object), it is appropriately closed afterwards.
If you have a more complex pattern, such as a separately managed ConnectionPool (which is wise), then you similarly need to close that when you have stopped using it, using `pool.disconnect()`, curiously. It should have an `aclose()` method too, for consistency.

In async Python, doing any sort of IO-related cleanup during garbage collection is frowned upon. Instead, it is the responsibility of the programmer to ensure that resources are freed, typically using a context manager pattern as above. This is because `__del__` handlers can be called at arbitrary places and are often very limited in what they can do. Synchronous Python is more forgiving, in that blocking IO operations can generally be done anywhere.
The standard library generally does not do any cleanup from `__del__`. Synchronous redis-py did put safety belts in the sync code. The async library also had safety belts in place, but they were nasty: they would create a new event loop and do things that generally should not be done in `__del__` methods.

As it's a rainy day, I just tested it and it works as expected.
Then, run the sample code below:
The two versions of the `redis` variable definition work as expected, with the same behaviour. So from my point of view, all the issues identified in this issue are fixed.
Thanks for your support, fixes and follow-up @kristjanvalur @woutdenolf @dvora-h !
So, no, I fixed this use case in the pr.
Interesting, I can have a look.
update: The difference in behaviour is that when `Redis.from_url()` is called, `Redis.auto_close_connection_pool == False`, whereas it is `True` otherwise. I think this is a bug, since a private connection pool is indeed created and should be closed. When it is `False`, the pool is simply garbage collected, along with the connections in it. For some reason, however, without an explicit disconnect, the connection stays open. I need to investigate how that happens.
When I check out `v4.5.5`, the connection does have a `__del__`. It was removed in https://github.com/redis/redis-py/pull/2755
@kristjanvalur Can you provide some insight into why `__del__` was removed? The use-case above needs it.

Nice, I can reproduce it. I compared the behavior with the synchronous version and I noticed this:
If you make a sync equivalent of the code in https://github.com/redis/redis-py/issues/2831#issuecomment-1641925032, the clients are disconnected when the connections are garbage collected:
There is no `__del__` for the async version, so `disconnect` never gets called. This means all connections you make stay connected.

Could you create a full example that reproduces the issue?
You can hit `maxclients` like this, but that's to be expected: