fastapi: Memory usage piles up over time and leads to OOM
First check
- I added a very descriptive title to this issue.
- I used the GitHub search to find a similar issue and didn’t find it.
- I searched the FastAPI documentation, with the integrated search.
- I already searched in Google “How to X in FastAPI” and didn’t find any information.
- I already read and followed all the tutorial in the docs and didn’t find an answer.
- I already checked if it is not related to FastAPI but to Pydantic.
- I already checked if it is not related to FastAPI but to Swagger UI.
- I already checked if it is not related to FastAPI but to ReDoc.
- After submitting this, I commit to one of:
- Read open issues with questions until I find 2 issues where I can help someone and add a comment to help there.
- I already hit the “watch” button in this repository to receive notifications and I commit to help at least 2 people that ask questions in the future.
- Implement a Pull Request for a confirmed bug.
Example
Here’s a self-contained, minimal, reproducible, example with my use case:
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"Hello": "World"}
Description
- Open the browser and call the endpoint /.
- It returns a JSON with {"Hello": "World"}.
- But I expected it to return {"Hello": "Sara"}.
Environment
- OS: [e.g. Linux / Windows / macOS]:
- FastAPI Version [e.g. 0.3.0]:
To know the FastAPI version use:
python -c "import fastapi; print(fastapi.__version__)"
- Python version:
To know the Python version use:
python --version
Additional context
Tracemalloc gave insight into the lines that are the top consumers of memory (the top one appears to be the line below, in uvicorn):
/usr/local/lib/python3.6/site-packages/uvicorn/main.py:305: loop.run_until_complete(self.serve(sockets=sockets))
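For anyone who wants to do the same kind of profiling, here is a minimal sketch (not the reporter's code; the /debug/memory route and its limit parameter are made up for illustration) of using tracemalloc to list the top allocating lines from inside a FastAPI app:

import tracemalloc
from fastapi import FastAPI

tracemalloc.start()  # start tracking allocations as early as possible

app = FastAPI()

@app.get("/debug/memory")
def memory_top(limit: int = 10):
    # Take a snapshot and group allocations by source line, largest first.
    snapshot = tracemalloc.take_snapshot()
    top_stats = snapshot.statistics("lineno")[:limit]
    return {"top_allocations": [str(stat) for stat in top_stats]}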
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 25
- Comments: 86 (3 by maintainers)
Links to this issue
Commits related to this issue
- Deal with https://github.com/tiangolo/fastapi/issues/1624 — committed to andreihalici/googlefinance-stocks-info by andreihalici 2 years ago
I'm running into the same issue: memory usage slowly builds over time, running on gunicorn with 4 uvicorn workers.
Many folks are affected by this issue, so definitely something is happening, but it could just as well be that the problem is in the user code and not in fastapi. So I suggest that, to make things easier for the maintainers, anyone affected by this issue share a minimal reproducible example.
Memory issues are tricky, but without a good reproducer it will be impossible for the maintainers to declare whether this is still a problem or not, and if it is, to fix it.
Hello guys, my colleagues and I had a similar issue and we solved it.
After profiling we found out that the coroutines created by uvicorn did not disappear but remained in memory (a health check request, which basically does nothing, could increase memory usage). This phenomenon was only observed in the microservices that were using tiangolo/uvicorn-gunicorn-fastapi:python3.9-slim-2021-10-02 as the base image. After changing the image, memory did not increase anymore. If you are using tiangolo/uvicorn-gunicorn-fastapi as the base docker image, try building from the official Python image instead. [It worked for us.]
I did encounter this issue last week; the root cause looked to be mismatched Pydantic types. For instance, we had an int defined in a response model that was actually a float when returned from our database. We also had an int that was actually a str. Cleaning up the types solved the issue for us. This was a high traffic endpoint of < 100 rps. I'm not sure of the root cause, but I suspect that the errors are caught and recorded in Pydantic somewhere, I suspect here, as FastAPI validates returned responses here.
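To illustrate the kind of mismatch being described, here is a hypothetical sketch (not the commenter's code; Item, its fields, and fake_db_row are made up): the response model declares int while the data source hands back a float and a numeric string, so every response goes through extra coercion/validation work.

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Item(BaseModel):
    price: int     # declared int, but the data source returns a float
    quantity: int  # declared int, but the data source returns a str

def fake_db_row() -> dict:
    # Stands in for a database call; the types do not match the model above.
    return {"price": 19.99, "quantity": "3"}

@app.get("/item", response_model=Item)
def read_item():
    # FastAPI validates the return value against Item; depending on the Pydantic
    # version the mismatched types are coerced or rejected on every request.
    return fake_db_row()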
I see the same thing across all my services using different FastAPI/Uvicorn/Gunicorn and Sentry SDK versions. This particular one is running:
It is receiving short-lived requests that trigger longer but still relatively short background tasks (not like I made myself a Celery out of it). Another thing I can think of is my ignorance around database connections, where we define something like
and then use get_db with a with block or Depends(get_db) (I'm very unsure how to work with a DB and FastAPI, but that's another thing). When I had this service on sync SQLAlchemy I was running into many issues where the connections were not going back to the connection pool, resulting in timeouts when waiting for a db connection (a sketch of the usual yield-based dependency is shown below).
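For reference, here is a sketch of the usual yield-based dependency pattern (assuming synchronous SQLAlchemy; the connection URL and names are illustrative, not the commenter's actual code) that makes sure each session goes back to the pool:

from fastapi import Depends, FastAPI
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session, sessionmaker

# Placeholder connection URL and pool size, for illustration only.
engine = create_engine("postgresql://user:pass@localhost/db", pool_size=5)
SessionLocal = sessionmaker(bind=engine)

app = FastAPI()

def get_db():
    db = SessionLocal()
    try:
        yield db      # the session is injected into the endpoint
    finally:
        db.close()    # always return the connection to the pool

@app.get("/ping-db")
def ping_db(db: Session = Depends(get_db)):
    return {"result": db.execute(text("SELECT 1")).scalar()}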
Using Gunicorn with the uvicorn.workers.UvicornWorker worker class, with workers set to 4. Here's the memory usage (the spike on 10/21 is where I increased a replica count and it immediately hogged a lot of memory):
(green is memory %, red is CPU %)
to put it in traffic context (not that any correlation can be seen):
I wanted to limit requests for gunicorn to refresh the workers, but I'm getting Error while closing socket [Errno 9] Bad file descriptor, which seems related to https://github.com/benoitc/gunicorn/issues/1877. Please tell me if I can help, e.g. by providing more data.
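For context, recycling workers after a fixed number of requests is usually done with gunicorn's --max-requests and --max-requests-jitter options; a sketch of such an invocation (the numbers are illustrative, not taken from this report):

gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --max-requests 1000 --max-requests-jitter 100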
What is the deal with the original issue of not returning {"Hello": "Sara"}? Was the original issue edited so that it now doesn't make sense in this regard?
EDIT: oh, okay, I see. The author included the bug report template without editing it. Sorry for the noise.
I don’t think the arguments used there are enough to discard those.
Thanks for the MRE. 👍
EDIT: I still cannot reproduce it: https://github.com/lorosanu/simple-api/issues/1#issue-1474238426. EDIT2: I can see the increase.
Would this simple-api sample help?
As far as I can tell
Related to #596, that issue already contains a lot of workarounds and information, please follow the updates over there.
+1
uvicorn main:app --host 0.0.0.0 --port 8101 --workers 4
Docker: 2 cores, 2 GB memory, CentOS Linux release 7.8.2003 (Core)
The client calls the function below once per minute, and server memory usage slowly builds over time.
I have implemented a hacky way to restart workers, but I don't think it is a good idea to restart services. Waiting for a solution so that I can remove the restart logic and not have to care about this weird OOM issue.
I'm also having this issue and I've attributed it to uvicorn workers, so I opened this issue: https://github.com/encode/uvicorn/issues/1226
@munjalpatel In [1] you can see that every route that is not async will be executed in a separate thread… The problem is that this used to use a default thread pool, which allows up to min(32, os.cpu_count() + 4) workers [2], so I assume that on some Python versions these workers are not reused or released and you end up with increasing memory. I wrote a little test app to demonstrate that. [3]
The run_in_threadpool implementation from [1] comes from starlette 0.14.2 [4] (fastapi is pinned to that version), but they changed their code to anyio [5]. I only looked briefly into the anyio code, but to me it looks like an update to the new starlette/anyio version could fix that memory problem. But 🤷♂️ whether fastapi will update to the new starlette.
[1] https://github.com/tiangolo/fastapi/blob/master/fastapi/routing.py#L144
[2] https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor
[3] https://github.com/tiangolo/fastapi/issues/596#issuecomment-734880855
[4] https://github.com/encode/starlette/blob/0.14.2/starlette/concurrency.py#L27
[5] https://github.com/encode/starlette/blob/master/starlette/concurrency.py#L27
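A small sketch of the point being made (mine, not the commenter's test app from [3]): a plain def endpoint is dispatched to a thread pool whose default ThreadPoolExecutor size is min(32, os.cpu_count() + 4), while an async def endpoint stays on the event loop.

import os
from fastapi import FastAPI

app = FastAPI()

# Default ThreadPoolExecutor sizing referenced in [2].
DEFAULT_POOL_SIZE = min(32, (os.cpu_count() or 1) + 4)

@app.get("/sync")
def sync_endpoint():
    # Executed via run_in_threadpool (see [1]) in a worker thread; if those
    # threads or their state were not released, memory could grow as described.
    return {"thread_pool_size": DEFAULT_POOL_SIZE}

@app.get("/async")
async def async_endpoint():
    # Executed directly on the event loop, no thread pool involved.
    return {"thread_pool_size": 0}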
Are you able to share a minimal, reproducible, example?
I'm wondering if the run_in_threadpool(field.validate, ...) call here, in combination with the global EXC_TYPE_CACHE used by Pydantic's validation, could be contributing.
Could you try removing the custom HTTP exception handling for now, and then re-run the load testing?
Anyway, I did see that the graph looks a bit less steep (the orange region) after you applied the workaround. Maybe some other things caused the leak too?
It looks like #596 is due to using def instead of async def? I've seen this with async def.
Can confirm I am experiencing the same issue. Using Python 3.10 + FastAPI + Hypercorn[uvloop] with 2 workers. The FastAPI project is brand new, so there isn't any tech debt that could possibly be the cause: no models, schemas, or anything fancy being done here.
The Docker container starts at around 105.8 MiB of RAM usage when fresh.
After running a Locust swarm (40 Users) all hitting an endpoint that returns data ranging from 200KB to 8MB - the RAM usage of the Docker container grows (and sometimes shrinks, but mostly grows) until I get an OOM exception. The endpoint retrieves data from the Neo4J database and closes the driver connection cleanly each time.
I had some success making the function async def even though there was nothing to await on. But it seems that FastAPI is still holding onto some memory somewhere… caching?
I'm curious why this topic isn't more popular; surely everyone would be experiencing this. Perhaps we notice it because our endpoints return enough data for the increase in usage to be visible, whereas the typical user would most times only return a few KB at a time.
Additional details: Docker CMD
@Xcompanygames Consider using ONNX instead of TF, as it's usually faster and more reliable.
I'm having a memory leak, but I think it is because the inference data stays in memory / gets duplicated at some point. I'll update later if the issue turns out not to be related to the inference process.
Update: I wasn't closing the ONNX inference session correctly. The memory accumulation is almost unnoticeable now!
I am also facing a similar issue. All my api endpoints are defined with async.
One thing I have observed though. When I comment out my background tasks (which mainly consist of Database update queries) a consistent increase in RAM was not observed even after load testing with Locust at 300RPS.
If it helps the database I am using is Postgres.
Versions: Python 3.8.10, FastAPI 0.63.0, Uvicorn 0.13.4
In my case:
Finally, I was able to track down the memory leak by simply commenting out my code block by block. It turns out it was due to this snippet at the top of main.py:
By converting that async function to a normal function, the memory stopped leaking (I used Locust to spawn 4000 users uploading a 20-second audio file):
Python Version: 3.8.9 FastAPI Version: 0.67.0 Environment: Linux 5.12.7 x86_64
We are able to consistently produce a memory leak by using a synchronous Depends:
This is resolved by switching both the endpoint and the dependency to async def. This took us a while to hunt down. At first we also thought it only occurred on EC2, but that's because we were disabling our authentication routines for local testing, which is where the issue was located. For those struggling here: check your depends, if you've got them.
@ycd it's 3.8.5, running on a DigitalOcean droplet. It's on gunicorn with Uvicorn workers.
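Since the snippet itself is not included above, here is a hypothetical sketch of the pattern described (check_auth and the route names are made up, not the commenter's actual code): a synchronous dependency on a synchronous endpoint, and the all-async variant the commenter switched to.

from fastapi import Depends, FastAPI

app = FastAPI()

def check_auth_sync() -> str:
    # Synchronous dependency: runs in the thread pool; the commenter reported
    # the leak with this kind of setup.
    return "user"

@app.get("/before")
def endpoint_before(user: str = Depends(check_auth_sync)):
    return {"user": user}

async def check_auth_async() -> str:
    # Async dependency: runs on the event loop; switching both the dependency
    # and the endpoint to async def resolved the leak for the commenter.
    return "user"

@app.get("/after")
async def endpoint_after(user: str = Depends(check_auth_async)):
    return {"user": user}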
Same issue here. Did anybody find a good workaround in the meantime?
Ran into the same error.
def -> memory leak; async def -> no memory leak. Thanks @curtiscook
Ah. I'll try and do some profiling. Unfortunately my time is pretty scarce these days with the number of different projects I'm working on, but fingers crossed.
@curtiscook The max-requests option restarts the worker completely; we need to configure the workers so that one is always running while another restarts. That solved the memory issue, but I ran into another one: now I sometimes get multiple requests to workers with the same data, and each worker creates a new entry in the database.