fastapi: FastAPI and Uvicorn running synchronously and very slowly

I’m new to FastAPI and I’m testing file uploads and asynchronous requests. However, when I perform several requests with clients in parallel or in serial, FastAPI processes each upload in a queue (synchronously) and very slowly. I’m running the API with uvicorn and with gunicorn, each with 1 worker. With both setups the time spent was the same.

My client sends 4 files of approximately 20 MB each, in parallel (or in serial), to the FastAPI endpoint; however, the endpoint stores the files one at a time and very slowly.

I made the same upload against an aiohttp endpoint and the files were stored in approximately 0.6 seconds with the client making requests in parallel (multiprocessing) and 0.8 seconds with the client making requests in serial (on average). When I made these uploads against FastAPI, the files were stored in approximately 13 seconds with the client making parallel requests and 15 seconds with the client making serial requests (on average).

I would like to know if I’m doing anything wrong.

Server Code


# app.py

from fastapi import FastAPI, File, UploadFile
import random
import aiofiles
import os

app = FastAPI()

STORAGE_PATH = 'storage'
os.makedirs(STORAGE_PATH, exist_ok=True)  # make sure the target directory exists

@app.post("/")
async def read_root(file: UploadFile = File('teste')):
    # prefix with a random number to avoid name collisions between uploads
    fpath = os.path.join(
        STORAGE_PATH, f'{random.randint(0, 5000)}_{file.filename}'
    )
    async with aiofiles.open(fpath, 'wb') as f:
        content = await file.read()  # reads the whole upload into memory at once
        await f.write(content)

    return {"Test": "Test"}


# uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1
# gunicorn -w=1 -k=uvicorn.workers.UvicornWorker --bind=0.0.0.0:8000 app:app

Client Code

import requests
from datetime import datetime
from multiprocessing import Pool

FILES = ['f1.txt', 'f2.txt', 'f3.txt', 'f4.txt']

def request(fname):
    # open inside a context manager so the file handle is closed after the POST
    with open(fname, 'rb') as fh:
        requests.post("http://localhost:8000/", files={'file': fh})


def req_mp():
    start = datetime.now() 
    pool = Pool(4) 
    pool.map(request, FILES) 
    print(datetime.now() - start)


def req_serial():
    start = datetime.now()  
    for fn in FILES:    
        request(fn)
    print(datetime.now() - start)

About this issue

  • Original URL
  • State: closed
  • Created 4 years ago
  • Comments: 20 (7 by maintainers)

Most upvoted comments

@igor-rodrigues1 sanic always reads files entirely into memory before passing control to your handler, whereas FastAPI (via Starlette) uses a tempfile.SpooledTemporaryFile and thus rolls data over onto disk once it grows beyond 1 MB. If you overwrite the default spooling threshold for UploadFile like

from starlette.datastructures import UploadFile as StarletteUploadFile

# keep the SpooledTemporaryFile in-memory
StarletteUploadFile.spool_max_size = 0

… I think you may see performance somewhat similar to that of sanic & uvicorn, as it should eliminate the trip through file I/O

Found streaming_form_data, which does multipart/form-data parsing much faster, as it uses Cython. In his blog post the author wrote that byte-wise parsing in pure Python is slow, so that’s why multipart is slow in the current implementation of Starlette.

Created a gist with an example.
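For reference, here is a hedged sketch of wiring streaming_form_data into a FastAPI route (this is not the linked gist; the field name 'file' and the output path are assumptions for illustration):

from fastapi import FastAPI, Request
from streaming_form_data import StreamingFormDataParser
from streaming_form_data.targets import FileTarget

app = FastAPI()

@app.post("/upload")
async def upload(request: Request):
    # the Cython-based parser consumes the multipart body incrementally
    # and writes the registered part straight to disk
    parser = StreamingFormDataParser(headers=request.headers)
    parser.register('file', FileTarget('storage/upload.bin'))  # illustrative path
    async for chunk in request.stream():
        parser.data_received(chunk)
    return {'status': 'ok'}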

Had similar performance issues when doing file uploads with FastAPI. It looks to me like the multipart parsing causes the performance impact.

import os

from fastapi import FastAPI
from starlette.requests import Request
from starlette.datastructures import UploadFile as StarletteUploadFile
from starlette.formparsers import MultiPartParser

app = FastAPI()

target = "./test"
# will cause tempfile.SpooledTemporaryFile (StarletteUploadFile) to be written to the same place as target
os.environ["TMPDIR"] = os.getcwd()

# avg speed 275 MB/s
@app.post("/uploadfile")
async def uploadfile(request: Request):
    file = StarletteUploadFile(filename=target)
    async for chunk in request.stream():
        await file.write(chunk)
    await file.close()

# avg speed 350 MB/s
@app.post("/plain")
async def plain(request: Request):
    # the context manager closes the file, no explicit close() needed
    with open(target, "wb") as file:
        async for chunk in request.stream():
            file.write(chunk)

# avg speed 68 MB/s
@app.post("/multipart")
async def multipart(request: Request):
    m = MultiPartParser(request.headers, request.stream())
    await m.parse()

StarletteUploadFile is still a bit slower than writing to a plain file, but that difference might be caused by other things running on my computer. I’m aware that the full request body is written to the file, not just the file I’m uploading; it was just to check whether multipart processing causes the performance impact. I also tested just feeding the stream to the parser, without writing it to a StarletteUploadFile afterwards, which shows the same result.

TL;DR: I think file I/O is not the main issue here.

IIRC, all aiofiles does is run the blocking file operations in a threadpool to avoid blocking the event loop, so there wouldn’t be much difference between using aiofiles to manipulate a file from within an async def read_root() route and using the standard Python file I/O functions from within a def read_root() (which FastAPI would automatically run inside a threadpool). From what I’m reading, aiofile (no ‘s’ at the end) uses POSIX asynchronous file I/O calls directly, so it might be worth trying that.
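As a minimal sketch of that equivalence (the route path and names below are illustrative, not from this thread), the same upload written as a plain def route that FastAPI runs in a threadpool would look like:

from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/sync")
def save_upload(file: UploadFile = File(...)):
    # in a plain def route FastAPI runs the whole handler in a threadpool,
    # so blocking I/O on the underlying SpooledTemporaryFile is fine here
    with open(f"storage/{file.filename}", "wb") as out:
        out.write(file.file.read())  # sync interface: file.file, not await file.read()
    return {"saved": file.filename}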

I remember reading in a few places that POSIX asynchronous I/O operations have their share of issues, and it looks like this might be the reason why asyncio doesn’t support them directly (this section looks to have been written in 2015 or before, so it might not reflect the latest state of things). I’m not seeing any mention of IOCP for async file I/O on Windows or io_uring for Linux, so that might be ripe for improvements in the near future. Right now, though, aiofile seems to be your best option.

TL;DR: Async file I/O is hard, and it’s one use case where support tends to be disappointing in a lot of places, unfortunately.
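For what it’s worth, here is a minimal sketch of what a handler body using aiofile might look like, assuming its async_open helper (the chunk size and names are illustrative):

from aiofile import async_open

async def save(upload, fpath):
    # async_open performs the writes through aiofile's asynchronous I/O layer
    async with async_open(fpath, 'wb') as f:
        while True:
            chunk = await upload.read(4096)
            if not chunk:
                break
            await f.write(chunk)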

Thanks for the answer, @chris-allnutt. I don’t know exactly how to do “async read(bytes)”. I made other tests, with other algorithms, and the results were also slow. I changed the algorithm to write the file asynchronously (chunk by chunk) and the result was the same.

New test 1

with open(fpath, 'wb') as f:
    while True:
        content = await file.read(4096)
        if not content:
            break
        f.write(content)

# uvicorn - time spent: 8 seconds - client serial (req_serial)
# uvicorn - time spent: 6 seconds - client parallel (req_mp)
# gunicorn - time spent: 6 seconds - client parallel (req_mp)
# gunicorn - time spent: 6 seconds - client serial (req_serial)

New test 2

async with aiofiles.open(fpath, 'wb') as f:
    while True:
        chunk = await file.read(4096)
        if not chunk:
            break
        await f.write(chunk)

# uvicorn - time spent: 15 seconds - client serial (req_serial)
# uvicorn - time spent: 12 seconds - client parallel (req_mp)
# gunicorn - time spent: 13 seconds - client serial (req_serial)
# gunicorn - time spent: 12 seconds - client parallel (req_mp)

This same algorithm was used in my tests with aiohttp, and there the read and write were in fact asynchronous, with times between 0.4 and 0.8 seconds (approximately) to save the files. I made the same tests with Django, with all operations synchronous, and all files were written in between 1.2 and 1.8 seconds.

Another thing I noticed: when I send the requests, there is a delay before the requests are processed by the Python code.

I’ve seen many benchmarks where FastAPI and Uvicorn have great results. But I can’t verify the performance the tool is supposed to offer in the tests I’m doing.

I’d like to use FastAPI in my next projects, but unfortunately these basic tests don’t show very good results.

aiofiles itself is very slow; I would try re-running this using open and write directly, which will block but will probably be faster.

file.read() is also going to read the entire file and then put it into memory, which I believe under the covers may be a blocking operation despite being labeled async.

I would try the following, which may not work; alternatively, you can use async read(bytes) to choose the rate at which you read, and then write it.

    async with aiofiles.open(fpath, 'wb') as f:
        for line in file:
            await f.write(line)

If you really want non-blocking uploads you’ll need to send a binary stream and read it directly from the body; that’s how I’m handling multi-GB uploads in my own system.
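A hedged sketch of the client side of that approach (the endpoint path matches the /plain handler shown earlier; the filename is illustrative): POST the raw bytes as the request body instead of a multipart form, letting requests stream the file from disk:

import requests

with open('f1.txt', 'rb') as f:
    # passing a file object as data makes requests stream the body
    # rather than loading it into memory, and no multipart parsing is needed
    requests.post("http://localhost:8000/plain", data=f)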

@igor-rodrigues1 no, I was suggesting something like

from fastapi import File, UploadFile
import starlette.datastructures

# keep the SpooledTemporaryFile entirely in memory
starlette.datastructures.UploadFile.spool_max_size = 0


# app, os, random and STORAGE_PATH as defined in the original app.py
@app.post("/")
async def read_root(file: UploadFile = File('teste')):
    fpath = os.path.join(
        STORAGE_PATH, f'{random.randint(0, 5000)}_{file.filename}'
    )
    with open(fpath, 'wb') as f:
        content = await file.read()
        f.write(content)

hopefully that’s clearer now

The first snippet is incorrect for the reason you noted (Starlette’s UploadFile can’t be used as a validation type hint), whereas the second works exactly the same as before: FastAPI’s UploadFile is used solely for type hints, while Starlette’s UploadFile is still the one actually used in starlette.formparsers.MultiPartParser.parse:

                    if b"filename" in options:
                        filename = _user_safe_decode(options[b"filename"], charset)
                        file = UploadFile(
                            filename=filename,
                            content_type=content_type.decode("latin-1"),
                        )

… so we need to adjust spool_max_size in starlette.datastructures.UploadFile, not fastapi.datastructures.UploadFile