PyrateLimiter: Threadpool issues with not being able to start a new thread

https://github.com/vutran1710/PyrateLimiter/issues/156 Following up from our previous discussion here, I upgraded the package version and set the number of threadpool processes to 2 (in a 2-core container), but we started seeing the error below once traffic increased on our server. So I was trying to understand: is the threadpool process limit not only a resource constraint but also a concurrent request/unique bucket constraint? Does it make sense for us to set this to a higher number to match our concurrent request estimates?

Also, I was curious about when the threads are created. Are they created for each unique bucket, and more importantly, when are they released? We don't want this number dictating traffic to our service.

    if self.rate_limiter is None:
      File "/app/sample/utils/rate_limiter.py", line 44, in rate_limiter
        return build_rate_limiter(self._rate_limiter_rate())
      File "/app/sample/pyratelimiter/limiters/redis_rate_limiter.py", line 76, in build_rate_limiter
        return RedisRateLimiter(
      File "/app/sample/pyratelimiter/limiters/redis_rate_limiter.py", line 39, in __init__
        argument=bucket_factory, thread_pool=ThreadPool(processes=number_of_leak_threads)
      File "/usr/local/lib/python3.10/multiprocessing/pool.py", line 930, in __init__
        Pool.__init__(self, processes, initializer, initargs)
      File "/usr/local/lib/python3.10/multiprocessing/pool.py", line 215, in __init__
        self._repopulate_pool()
      File "/usr/local/lib/python3.10/multiprocessing/pool.py", line 306, in _repopulate_pool
        return self._repopulate_pool_static(self._ctx, self.Process,
      File "/usr/local/lib/python3.10/multiprocessing/pool.py", line 329, in _repopulate_pool_static
        w.start()
      File "/usr/local/lib/python3.10/multiprocessing/dummy/__init__.py", line 51, in start
        threading.Thread.start(self)
      File "/usr/local/lib/python3.10/threading.py", line 935, in start
        _start_new_thread(self._bootstrap, ())
    RuntimeError: can't start new thread
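For what it's worth, the traceback ends inside `ThreadPool.__init__`: the stdlib pool starts its worker threads eagerly when the pool is constructed, not lazily on first use, which is why "can't start new thread" can fire at construction time. A minimal stdlib sketch illustrating this (the exact extra-thread count is a CPython implementation detail, so the check is only a lower bound):

```python
import threading
from multiprocessing.pool import ThreadPool

before = threading.active_count()
pool = ThreadPool(processes=2)  # workers (plus internal handler threads) start here
after = threading.active_count()

# At least the two worker threads exist as soon as __init__ returns.
print(after - before >= 2)  # True

pool.close()
pool.join()
```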

For more context, this is how we implement our custom bucket factory's get() function:

    def get(self, _item: RateItem) -> RedisBucket:
        bucket = RedisBucket([self.rate], self.redis_db, _item.name, self.script_hash)
        self.schedule_leak(bucket, self.base_clock)
        return bucket
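One observation (my own assumption, not something the maintainer stated): since this get() builds a fresh RedisBucket and calls schedule_leak on every request, each call can schedule another leak task. A hypothetical caching-factory sketch, memoizing buckets by item name so per-bucket setup runs once per unique name; `CachingBucketFactory` and `make_bucket` are illustrative names, not PyrateLimiter API:

```python
from threading import Lock

class CachingBucketFactory:
    """Illustrative sketch: memoize buckets by name so that per-bucket
    setup (e.g. scheduling a leak) happens exactly once per unique name."""

    def __init__(self, make_bucket):
        self._make = make_bucket  # callable: name -> bucket (and schedules its leak)
        self._buckets = {}
        self._lock = Lock()

    def get(self, name):
        with self._lock:
            bucket = self._buckets.get(name)
            if bucket is None:
                bucket = self._make(name)  # create + schedule leak exactly once
                self._buckets[name] = bucket
            return bucket

# Usage: the factory returns the same bucket object for repeated names.
created = []
factory = CachingBucketFactory(lambda name: created.append(name) or object())
a1, a2, b = factory.get("a"), factory.get("a"), factory.get("b")
print(a1 is a2, created)  # True ['a', 'b']
```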

Thanks!

About this issue

  • State: closed
  • Created 4 months ago
  • Comments: 15 (9 by maintainers)

Most upvoted comments

Sounds good, let me test it out. Just confirming: do I still need to call schedule_leak for every new bucket being created, and do I still need to create the threadpool to pass into the Limiter? Is there a changelog doc I can use to see the differences and make the corresponding changes in my code base?

  1. You still do need to schedule the leak

  2. No more threadpool

  3. Remove threadpool if you are using one

  4. Other than that, your existing code would still work

  5. There is a changelog right below the README, but I wasn't too clear on the new update since there is basically no change in the API

Followup 2: Suppose we were to limit usage of an endpoint (i.e., a bucket) per callerID, with multiple different rates depending on the callerID (rates will vary based on payment tiers, essentially), and we have thousands of different callers.

In this case, we would require a bucket to be created for every unique caller, right? So does that mean the number of threads in the threadpool also needs to be as big as the callerID count, essentially assigning one per bucket/caller?
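To sketch the tiered-rate idea in isolation (all names and numbers below are hypothetical, not PyrateLimiter API): a factory can look up a caller's tier lazily and keep at most one rate record per caller actually seen, so state grows with active callers rather than with the total callerID space:

```python
from threading import Lock

# Hypothetical tier table: requests allowed per minute (numbers are made up).
TIER_RATES = {"free": 10, "pro": 100, "enterprise": 1000}

class PerCallerRates:
    """Illustrative sketch: one rate record per unique caller, created lazily."""

    def __init__(self, tier_lookup):
        self._tier_lookup = tier_lookup  # callable: caller_id -> tier name
        self._rates = {}
        self._lock = Lock()

    def rate_for(self, caller_id: str) -> int:
        with self._lock:
            if caller_id not in self._rates:
                tier = self._tier_lookup(caller_id)
                self._rates[caller_id] = TIER_RATES.get(tier, TIER_RATES["free"])
            return self._rates[caller_id]

# Usage with a toy tier lookup: "p-" prefixed callers are "pro", others "free".
rates = PerCallerRates(lambda cid: "pro" if cid.startswith("p-") else "free")
print(rates.rate_for("p-123"))  # 100
print(rates.rate_for("x-999"))  # 10
```

The same lazy-creation idea applies to buckets themselves: a bucket per unique caller is only materialized on that caller's first request, which is why sizing anything up front to the full callerID count shouldn't be necessary.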