azure-sdk-for-python: [Storage]: Severe performance degradation observed using v12.0 *.aio BlobClient
Team…on a current project we are seeing drastic performance degradation when attempting to use the *.aio libraries from v12.0 of azure-storage-blob. It’s quite possible we are using the library incorrectly, but we have not been able to find any documentation stating otherwise. The issue outlined below may in fact be a bug.
System Under Test
- Repo: https://github.com/mjyeaney/CustomVisionSlidingWindowAPI
- Linux App Service Plan
- B2 App Service Plan (200 ACU, 3.5 GB RAM, A-series equiv), instance count = 1
- Running Gunicorn -> Uvicorn (ASGI server) -> Quart (async REST fx)
- Gunicorn worker count: CPU core count + 1
- Using azure-storage-blob v12.0, *.aio methods
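For reference, the Gunicorn → Uvicorn → Quart stack above is typically wired up through a Gunicorn config file. A hypothetical `gunicorn.conf.py` sketch follows; the file name, port, and bind address are assumptions, while the worker count matches the "CPU core count + 1" rule stated above:

```python
# gunicorn.conf.py -- hypothetical sketch of the serving stack under test:
# Gunicorn process manager running Uvicorn (ASGI) workers for the Quart app.
import multiprocessing

# "CPU core count + 1", as described in the setup above
workers = multiprocessing.cpu_count() + 1

# Run each worker as an ASGI worker (class name per the uvicorn docs)
worker_class = "uvicorn.workers.UvicornWorker"

bind = "0.0.0.0:8000"  # assumed port
```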
Load Testing Environment
- Located in same datacenter
- Ds2v3 (2 CPU, 8GB RAM)
- Tests run using the Apache Bench (`ab`) utility
Tests
There are 2 endpoints that we’re testing to ensure we don’t have any unexpected regressions. The first is a simple JSON serialization endpoint using async/await coroutines, and does NOT use the Azure Storage SDK in any code path.
Running `ab` against this endpoint with `n=1000, c=50`, it handles ~450 - 550 req/sec (sample output below), showing that even on limited dev/test hardware this is performing reasonably well:
```
Concurrency Level:      50
Time taken for tests:   1.842 seconds
Complete requests:      1000
Failed requests:        0
Total transferred:      330000 bytes
HTML transferred:       38000 bytes
Requests per second:    542.83 [#/sec] (mean)
Time per request:       92.110 [ms] (mean)
Time per request:       1.842 [ms] (mean, across all concurrent requests)
Transfer rate:          174.94 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1    2   0.5      2       8
Processing:     4   89  57.0     80     318
Waiting:        4   89  57.0     80     318
Total:          5   91  57.1     82     321

Percentage of the requests served within a certain time (ms)
  50%     82
  66%    103
  75%    120
  80%    130
  90%    159
  95%    186
  98%    245
  99%    319
 100%    321 (longest request)
```
The second endpoint is simply downloading a VERY small JSON blob (~76 bytes), using the following code:
```python
...
blob_client = BlobClient.from_connection_string(self.connection_string, "jobs", operation_id)
data_stream = await blob_client.download_blob()
raw_value = await data_stream.readall()
...
```
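One pattern worth checking: the snippet above constructs a new `BlobClient` on every request. Whether or not that is the actual bottleneck here, per-request client setup that performs any synchronous work will serialize an otherwise-async server, because the setup runs on the event loop thread. The sketch below uses a hypothetical `FakeClient` (no Azure SDK involved) whose constructor blocks for 20 ms to illustrate the effect:

```python
import asyncio
import time

# FakeClient is a hypothetical stand-in for an expensive client object.
# Its __init__ does 20 ms of *synchronous* work, simulating the kind of
# session/pool setup a real client might perform on construction.
class FakeClient:
    def __init__(self):
        time.sleep(0.02)  # blocking setup: runs on the event loop thread

    async def download(self):
        await asyncio.sleep(0.01)  # the actual I/O awaits properly
        return b'{"status": "ok"}'

async def new_client_per_request(n):
    # Anti-pattern: build a fresh client inside every request handler.
    async def one():
        return await FakeClient().download()
    start = time.perf_counter()
    await asyncio.gather(*(one() for _ in range(n)))
    return time.perf_counter() - start

async def shared_client(n):
    # Pattern: construct the client once and reuse it for every request.
    client = FakeClient()
    start = time.perf_counter()
    await asyncio.gather(*(client.download() for _ in range(n)))
    return time.perf_counter() - start

per_request = asyncio.run(new_client_per_request(25))
shared = asyncio.run(shared_client(25))
# The blocking constructors serialize: ~25 x 20 ms vs ~10 ms overall.
print(per_request > shared)  # → True
```

With the real SDK the analogous change would be to create the aio client once at process startup and share it across requests, rather than calling `from_connection_string` per request.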
Using the same `ab` parameters (`n=1000, c=50`), we see a DRASTIC bottleneck, with throughput reduced to around 20 - 25 req/sec:
```
Concurrency Level:      50
Time taken for tests:   43.861 seconds
Complete requests:      1000
Failed requests:        0
Non-2xx responses:      1000
Total transferred:      431000 bytes
HTML transferred:       126000 bytes
Requests per second:    22.80 [#/sec] (mean)
Time per request:       2193.062 [ms] (mean)
Time per request:       43.861 [ms] (mean, across all concurrent requests)
Transfer rate:          9.60 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        1     2    0.4      2      8
Processing:   193  2171 1116.8   1969   5131
Waiting:      193  2171 1116.8   1969   5130
Total:        194  2173 1116.8   1970   5133

Percentage of the requests served within a certain time (ms)
  50%   1970
  66%   2414
  75%   2964
  80%   3193
  90%   3767
  95%   4524
  98%   4788
  99%   5130
 100%   5133 (longest request)
```
This does not seem to match at all what the underlying storage account is reporting via metrics (max latency around 20ms, allowing for 50ms just to be safe). Additionally, total transactions reported is nearly exactly the number of test requests (~1000), indicating that general-purpose storage account throttling is not occurring.
Given the very small size of these blobs, and the fact that the underlying pipeline transport should be using the aiohttp provider, it feels like something is either overwhelming the event loop or still making a blocking call unintentionally.
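If a blocking call is the suspect, asyncio's built-in debug mode can help confirm it: the loop times every callback and logs a warning when one runs longer than `loop.slow_callback_duration` (100 ms by default). A minimal stdlib-only sketch, where the 200 ms `time.sleep` is a stand-in for whatever synchronous call might be hiding in the request path:

```python
import asyncio
import logging
import time

# Capture asyncio's own debug-mode warnings so we can inspect which
# callback blocked the loop. Everything here is standard library.
messages = []

class Capture(logging.Handler):
    def emit(self, record):
        messages.append(record.getMessage())

logging.getLogger("asyncio").addHandler(Capture())

async def handler():
    time.sleep(0.2)  # blocking call that starves every other coroutine

# debug=True makes the loop time each callback and warn when one exceeds
# slow_callback_duration (100 ms by default), e.g.
#   "Executing <Task ...> took 0.200 seconds"
asyncio.run(handler(), debug=True)

print(any("took" in m for m in messages))  # → True
```

In this app, enabling debug mode (or lowering the threshold) while replaying a few requests would show whether the storage call path is spending its time blocked on the loop thread rather than awaiting I/O.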
Would appreciate any insights that can be offered. While there will of course be overhead in calling storage (we don’t expect the same performance as the first test), there is seemingly quite a bit of unaccounted-for time being spent somewhere. We definitely do not want to resort to implementing these calls by hand via Tornado etc. if we don’t have to.
Thanks!
About this issue
- Original URL
- State: closed
- Created 5 years ago
- Comments: 18 (9 by maintainers)
Perfect! I’ll test it tomorrow and post results here.
Thanks!
@mjyeaney We have just released the latest version of azure-core. Can you upgrade it using `pip install azure-core==1.1.0` and see if the results improve? Thanks.

Local dev environment: Python 3.6.8, pip 9.0.1
Dockerfile:

```dockerfile
FROM python:3.6
```