fastapi: Very poor performance does not align with marketing
I wanted to check the temperature of this project, so I ran a quick, very simple benchmark with wrk and the default example:
```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    return {"Hello": "World"}
```
Everything default with wrk, regular Ubuntu Linux, Python 3.8.2, latest FastAPI as of now.
Uvicorn with logging disabled (obviously), as per the README:
```sh
python3 -m uvicorn fast:app --log-level critical
```
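(For reference, wrk with no flags defaults to 2 threads, 10 connections, and a 10-second run, so an all-defaults invocation looks like:)

```sh
wrk http://localhost:8000/
```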
I get very poor performance, way worse than Node.js and really, really far from Golang:
```
Running 10s test @ http://localhost:8000
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.83ms  365.59us   3.90ms   75.34%
    Req/Sec     2.74k   116.21     2.98k    65.00%
  54447 requests in 10.00s, 7.37MB read
Requests/sec:   5442.89
Transfer/sec:    754.78KB
```
This machine can do 400k req/sec on one single thread using other software, so 5k is not at all fast. Even Node.js does 20-30k on this machine, so this does not align at all with the README:
> The key features are:
>
> Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette and Pydantic). One of the fastest Python frameworks available.
Where do you post benchmarks? How did you come to that conclusion? I cannot see that you have posted any benchmarks at all.
Please fix the marketing; it is not at all true.
There seem to be two intertwined discussions here that I think we can address separately.
The NodeJS and Go comparison
There is definitely contention around the phrase “on par with NodeJS and Go” in the documentation. I believe the purpose of that phrase was to be encouraging, so that people will try out the framework for their purpose instead of just assuming “it’s Python, it’ll be too slow”. However, clearly the phrase can also spawn anger and be off-putting, which would be the opposite of what we’re trying to achieve here.
I believe that if the comparison is causing bad feelings toward FastAPI, it should simply be removed. We can claim FastAPI is fast without specifically calling out other languages (which almost always leads to defensiveness). Obviously this is up to @tiangolo and we’ll need his input here when he gets to this issue.
FastAPI’s Performance
If you ask “is it fast” about anything, there will be evidence both for and against. I think the point of linking to TechEmpower instead of listing numbers directly is so that people can explore on their own and see if FastAPI makes sense for their workloads. However, we may be able to do a better job of guiding people about what is “fast” about FastAPI.
For the numbers I’m about to share, I’m using TechEmpower’s “Round 19”, looking only at the “micro” classification (which is what FastAPI falls under) for “go”, “javascript”, “typescript”, and “python”. I don’t use Go or NodeJS in production, so I’m picking some popular frameworks which appear under this micro category to compare: “ExpressJS” (javascript), “NestJS” (typescript), and “Gin” (golang). I don’t know how their feature sets compare to FastAPI.
Plain Text
I believe this is what most of the comparisons above me are using. FastAPI is much slower than nest/express, which are much slower than Gin. Exactly what people are saying above. If your primary workload is serving text, go with Go.
Data Updates
Requests must fetch data from a database, update, and commit it back, then serialize and return the result to the caller. Here FastAPI is much faster than NestJS/Express which are much faster than Gin.
Fortunes
This test uses an ORM and HTML templating. Here all the frameworks are very close to each other; in order from fastest to slowest: Gin, NestJS, FastAPI, Express.
Multiple Queries
This is just fetching multiple rows from the database and serializing the results. Here, FastAPI slightly edges out Gin. Express and NestJS are much slower in this test.
Single query
Single row is fetched and serialized. Gin is much faster than the rest which are, in order, FastAPI, NestJS, and Express.
JSON serialization
No database activity, just serializing some JSON. Gin blows away the competition. Express, then Nest, then FastAPI follow.
So the general theme of all the tests combined seems to be that if you’re working with large amounts of data from the database, FastAPI is the fastest of the bunch. The less database activity (i.e. the less I/O-bound the workload), the further FastAPI falls and the further Gin rises. The real takeaway here is that the answer to “is it fast” is always “it depends”. However, we can probably do a better job of pointing out FastAPI’s database strengths in the sections talking about speed.
@alexhultman If you are not happy about the different DB choices of TechEmpower, you can probably raise an issue there (e.g. https://github.com/TechEmpower/FrameworkBenchmarks/issues/2845 - that repo is open to contributions), or pick another comprehensive benchmark you prefer, which we can all benefit from when choosing a framework.
Also, please be reminded that so far everyone replying to you in this thread is a community member only; we are not maintainers of fastapi. If you want to know who wrote that claim, please use git blame. Please be kind to people who are trying to have a discussion here.
Okay, so taking the source you gave me (entirely disregarding my own test), I can read the following:
Which is in very stark contrast with the README:
https://www.collinsdictionary.com/dictionary/english/on-a-par-with
2.2% is not “on par with” 12%. It’s like comparing wine with light beer - they are entirely disjoint, you cannot possibly claim light beer gets you as hammered as wine?
And the golang thing… jeeeez!
Here is a short lesson in critical thinking:
But yes, I guess we should attribute this victory to FastAPI. Because the fact it used PostgreSQL in a test that clearly favors PostgreSQL has nothing at all to do with the outcome. Nothing at all 🎶 😉
And the fact FastAPI scores last in every single test that does not involve the variability of database selection, that is just random coincidence. 🎵 🎹
I must agree with @alexhultman that the performance claims are misleading… I learned it the hard way too. Performance is not really what it claims to be.
Taking another example, the one just serving a chunk of text:
To boldly state as the first feature “Fast: Very high performance, on par with NodeJS and Go” is, well… I guess I don’t have to say it. It leads to disappointment down the road when you discover the truth.
Probably it would be better to just keep “Among the fastest Python frameworks available” and emphasize the other good features.
@alexhultman Your point might be valid, but I think you might be oversimplifying your tests here. Benchmarks are a tricky thing, but it’s important to know what it is that you are comparing.
FastAPI is a web application framework that provides quite a bit more than just an application server.
So if you are comparing FastAPI, say, to NodeJS, then the test should be done against a web application framework such as NestJS or similar.
Same thing with Golang: the comparison should be against Revel or something like it.
In @tiangolo’s documentation on benchmarks you can read:
https://fastapi.tiangolo.com/benchmarks/
I believe that when the developers say:
They mean a full application on Golang or NodeJS (on some framework) vs a full application on FastAPI.
I don’t know how you did the benchmarks, but from TechEmpower benchmarks, this is the result.
In real-world scenarios like Data Updates, 1-20 Queries, etc., FastAPI is much faster.
OK, I went a bit further than a “hello world” application, and the conclusions here are correct. Performance can’t even be compared with Node.js or .NET; it’s much lower, much slower. But to be honest, I think it’s a Python problem, not the framework itself.
I think this issue has gone as deep as it goes already, nothing can be said that hasn’t already been. Alright, thank you and have a nice day.
For the caching: if you benchmark with a cache solution enabled, then you are testing your cache strategy rather than what fastapi can handle.
I just ran an all-defaults comparison of FastAPI vs ExpressJS:
I love the syntax and ease of use of FastAPI, but it’s disappointing to see misleading claims about its speed. 367kb/s is NOT “on par” with 1620kb/s; that’s roughly 4.4x the throughput of "Fast"Api.
but it is about twice as fast as Flask:
To be honest, it’s disappointing that neither @tiangolo nor anyone from the fastapi team is commenting on this. They still have “on par with NodeJS and Golang” on the homepage of their website. If there is a use case they’re aware of where fastapi is on par with Golang and NodeJS, they should share it, since that would be very useful information. It’s sad, because fastapi is a really fun framework to work with and it has a lot of merits, but the potential dishonesty in their marketing is disheartening.
FastAPI has a single maintainer, and there were comments on this issue from active users (at the time they posted, at least). If you consider those active users as the team, then we have replied.
In any case, @tiangolo will see this at some point.
I acknowledge that this thread is closed; however, I wanted to add some extra information to aid in this comparison. An important detail I think a lot of these benchmarks miss is proper configuration of your libraries when using FastAPI. So, without trying to sound too overly opinionated, here are a couple of things that I hope provide a better comparison…
Using the `--workers N` flag with FastAPI while not using concurrency with Express/Node is NOT apples-to-apples; the simplest approach is to just compare with a single process each. With `uvicorn[standard]`, which leverages `uvloop` and `httptools` (comparable to the node.js HTTP parser written in C), JSON performance of FastAPI is on par with Express/Node for an equal number of processes. Okay, onto the benchmarks… All of this was run on an M1 Max MBP.
Python
Install the libraries
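A minimal sketch of the install, assuming pip and the `uvicorn[standard]` extra mentioned above:

```sh
pip install fastapi "uvicorn[standard]"
```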
Run the app
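A minimal stand-in for the benchmarked app and its run command (the module name `main` and the payload are my assumptions):

```python
# main.py: minimal JSON endpoint used as the benchmark target (a stand-in sketch)
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    return {"Hello": "World"}

# Run with a single process: uvicorn main:app --log-level critical
```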
Express/NodeJS
Install the libraries
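A sketch of the Node-side install, assuming npm, plain `express`, and the `lodash` dependency mentioned below:

```sh
npm install express lodash
# for TypeScript, add the type packages too:
npm install -D typescript @types/express @types/lodash
```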
Note that I used TypeScript, so you will want to add the `@types` as well if you feel like it. I used `lodash` for the math because its array functions outperform the native implementations in my testing.

Run the app
Results
JSON
Python
Express/NodeJS
Math
When performing mathematical operations we have a few options… With Python it is popular to use `numpy` for fast array math. All results are in req/sec for a given array length calculated. The differentiation between sync/async endpoints for Python is certainly interesting and challenges the notion that using the sync endpoints with CPU-bound code helps performance. This is especially the case when using `numpy`. Noticeably, the pure Python approach has horrendous performance and should generally be avoided, but we already knew that. As for the `numpy` vs node comparison, they are very close at the small/medium array sizes, and then `numpy` takes the lead, to the tune of 3x as many req/sec. (A sketch of the two endpoint styles follows.)
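A minimal sketch of the sync vs. async numpy endpoints being compared (route names and the array length are my assumptions):

```python
import numpy as np
from fastapi import FastAPI

app = FastAPI()
N = 1_000  # array length for the math test (assumed)

@app.get("/math-sync")
def math_sync():
    # Plain def: FastAPI runs this in a threadpool.
    return {"result": float(np.arange(N, dtype=np.float64).sum())}

@app.get("/math-async")
async def math_async():
    # async def: runs directly on the event loop (and blocks it while computing).
    return {"result": float(np.arange(N, dtype=np.float64).sum())}
```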
Database

Didn’t do anything here, but I have seen that the performance of `asyncpg` vs node’s `pg` library tends to favor `asyncpg`; I haven’t run this test yet.

Just a different point of view on the performance-between-languages “issue”:
Especially for web development, I prefer to have a prototype up and running with all the business (or “fun”) logic implemented in code that is a delight to read, and then try scaling (docker & kubernetes), improving bottlenecks (celery & rabbitmq), or even implementing microservices in other languages with as minimal a codebase as possible, rather than spending months in some other base language whose syntax is over-verbose, whose libraries are more often than not unmaintained or malware-injected, whose tooling is immature or unpromising, and whose communities are confused, unhelpful, or disoriented by “performance”, “super-secure”, “we-are-the-future” complexes.
At the end of the day, a couple of containers with Python (any framework) at the backend & Node (any framework) for the client will do the trick for performance as well.
P.S. But yeah, the official claim is overblown & over-simplistic. A shame for such a noteworthy & well-composed framework.
I did exactly what @alexhultman did: created a “hello world” application in both fastapi and expressjs, using all defaults. I didn’t optimize anything, then ran `wrk` commands as shown in my comment. I also ran a bare uvicorn server (with a hello-world app; a sketch follows below), and already it’s slower than expressjs.
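A minimal raw-ASGI hello world of the kind a bare uvicorn run would serve (the module name `bare` is my assumption; run with `uvicorn bare:app`):

```python
# bare.py: framework-free ASGI app used as the bare-uvicorn baseline (a sketch)
async def app(scope, receive, send):
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"application/json")],
    })
    await send({"type": "http.response.body", "body": b'{"Hello": "World"}'})
```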
I saw people pointing out database tests. This is important because you will use DBs etc, but it isn’t testing the WebFramework itself! It is comparing node.js pg vs python asyncpg or node.js json vs python json/orjson/ujson etc.
FastAPI is not a server, so you can only measure the overhead over raw ASGI and measure the ASGI Server performance.
For measuring overhead, TechEmpower plaintext is actually a cool tool. So let’s do it!
First, let’s see how some frameworks compare in general on common servers (uvicorn and meinheld).
So in raw throughput gunicorn+uvicorn is not doing great, and FastAPI using it will not be able to catch up with Go frameworks like Gin or Fiber, or even with the express and fastify node.js frameworks. Only socketify.py with PyPy passes the throughput of Fiber in this case. So I made an ASGI server (still in development) and a WSGI server using socketify (aka uWS C++) as the base.
So testing ASGI servers and WebFrameworks, Emmett and Falcon using uvicorn are faster than FastAPI, and faster than node.js express (remember, node.js express is really slow).
Using socketify ASGI, FastAPI is able to catch up with Emmett and Falcon, but if Emmett and Falcon use socketify ASGI too, they will be faster, of course. And still not at the same level of performance as Fastify in node.js, for example.
The raw ASGI test of socketify with CPython is faster than Fastify, but the overhead of WebFrameworks just puts the performance down, and even raw-asgi in CPython is not able to catch up with the Gin golang framework (which is not fast by golang standards).
Using the same server as the base, you can see that Falcon has less overhead than Emmett, and Emmett has less overhead than FastAPI.
Socketify.py is not optimized for CPython yet but is optimized for PyPy, and you can see that using PyPy, FastAPI, Emmett and Falcon pass Gin, but come not even close to Fiber’s performance! That’s because the ASGI server itself is slower than Fiber!
So the claim:
is just nonsense, because FastAPI is not a server. The Uvicorn server is slower than the node.js express, fastify, gin, and fiber SERVERS! The Socketify.py ASGI server is faster than the express, fastify, and gin servers but slower than the Fiber server.
The question must always be: using X server with FastAPI overhead on top, can it be on par with NodeJS and Go? The answer is maybe, because uvicorn is slow compared to them, and socketify ASGI is on par with express, fastify, or gin, but it is not on par with fasthttp, fiber, or even with socketify.py itself without using ASGI!
ASGI just has a lot of overhead! And on top of it, FastAPI has a lot of overhead too!
You can claim that Falcon/Emmett is faster than FastAPI, and you can also claim that socketify.py is faster than some node.js servers and golang servers, but you cannot claim that FastAPI is faster than node.js or Go, because that is comparing apples to oranges.
Can FastAPI claim to be fast? Yes! But it needs to be compared with WebFrameworks, not servers! Can FastAPI claim to be the fastest Python WebFramework in all scenarios? No, because other WebFrameworks like Emmett and Falcon have less overhead.
ASGI WebFrameworks in general need to reduce overhead; most of the overhead is dynamic allocations and the asyncio event loop itself, and it can be mitigated a LOT by using factories to reuse Tasks/Futures and Request/Response objects instead of GC’ing them every time, and by using PyPy to be able to use stack allocations.
Raw throughput is very important for any big application, because most production code should use in-memory caching for responses. That’s why with socketify.py you can use a sync route to check your in-memory cache and use `res.run_async(coro)` to go async only if you need to fetch the data itself. And that is why I will develop a caching tool that does not even touch Python or the Python GIL and just sends the response from the C++ world when cached.

You can claim:
It’s not “on par”, because it’s faster with PyPy or slower with CPython.
If you want to see some WSGI numbers to compare overhead, here is a chart:
WSGI has less overhead because it’s not using asyncio and is not doing a lot of unnecessary dynamic allocations. In fact, Socketify.py WSGI does more copying and work than ASGI, because it uses the native ASGI API and converts the headers before calling the app itself; most of the slowdown is asyncio event loop overhead.
Django is just slow, so slow that even with socketify it can’t be faster than express, and only when PyPy optimizes and reduces the overhead of Django can it be faster than express.
Falcon and Flask can be faster than Gin with PyPy too, but take a closer look: Falcon + WSGI has really little overhead over raw-wsgi, while Falcon + ASGI has a way bigger overhead over raw-asgi.
Of course, it is not faster than fasthttp or fiber in golang, but at least you can say that it is faster than SOME golang frameworks.
And remember, PyPy is not only for compute-heavy workloads; it is also GREAT at removing unnecessary overhead.
- socketify.py project: https://github.com/cirospaciari/socketify.py
- how to run ASGI/WSGI with socketify: https://docs.socketify.dev/cli.html
- TFB tools and benchmarks: https://github.com/TechEmpower/FrameworkBenchmarks
@introom if using `async`, the router will use `await func()`; else the router will use `await run_in_threadpool(func)`.
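A simplified sketch of that dispatch logic (illustrative only, not the literal Starlette/FastAPI source):

```python
import asyncio
from starlette.concurrency import run_in_threadpool

async def dispatch(endpoint, *args):
    if asyncio.iscoroutinefunction(endpoint):
        # async def endpoints are awaited directly on the event loop
        return await endpoint(*args)
    # plain def endpoints take an extra hop through a threadpool
    return await run_in_threadpool(endpoint, *args)
```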
I guess the mystery is solved (kind of)! FastAPI is even faster than NodeJS (even with a single worker). You just need to make the method `async`. Though I still don’t understand why that’s the case when there’s no IO?

Here is the `ab` benchmark, run against the version without the `async` keyword and then the version with `async` (a sketch of both appears below).

Without `async`, Result: Requests per second: 2596.19 [#/sec] (mean)

With `async`, Result: Requests per second: 4902.09 [#/sec] (mean)

Express, Result: Requests per second: 4545.43 [#/sec] (mean)

All the above tests are done with a single worker. As a side note, just by adding `--workers N` without changing any code, FastAPI provides significantly higher performance (in my case 8 workers gave me 7698.08 [#/sec]!). This is unbelievable! Did I miss anything?
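A minimal sketch of the two variants being compared, benchmarked with something like `ab -n 10000 -c 10 http://localhost:8000/` (the exact `ab` flags here are my assumption):

```python
from fastapi import FastAPI

app = FastAPI()

# Version without the async keyword: FastAPI awaits run_in_threadpool(read_root),
# paying a threadpool hop on every request.
@app.get("/")
def read_root():
    return {"Hello": "World"}

# Version with async: awaited directly on the event loop, no threadpool hop.
# Swap this in for the handler above when re-running the benchmark:
# @app.get("/")
# async def read_root():
#     return {"Hello": "World"}
```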
This comment explains it: https://github.com/tiangolo/fastapi/issues/1664#issuecomment-653580642
I really don’t understand why people keep arguing here…
@andreixk please check the benchmarks that I sent; if you think they are inaccurate, please open an issue in TechEmpower’s GitHub repository.
By the way, nice article: https://www.travisluong.com/fastapi-vs-fastify-vs-spring-boot-vs-gin-benchmark/. When you use performant async libraries and correctly configure fastapi to take advantage of the cores, it seems on par with node/express, which makes sense.
@mjsampson saying their marketing is dishonest is a bit harsh. Marketing tends to emphasize one’s strengths and benefits to encourage patronage of whatever one is offering. Based on a subset of the independent benchmarks, to which a link is provided, their claim of performance advantages over certain language runtimes isn’t misplaced. So for you to impugn the marketing statement, and by extension the character of the author(s), is in my opinion rather disheartening.
How are you running fastapi to ensure that your benchmark is valid?