aiobotocore: HeadObject calls are slower than the regular boto when done synchronously
Describe the bug
When making S3 API calls through aiobotocore one at a time, they are extremely slow compared to boto itself. In our project we use s3fs, which is a nice wrapper around aiobotocore, but the problem is that even a single API call made with aiobotocore takes 2-3x longer than the same call made with boto. I know that wrapping the calls and making them async adds some overhead, but I suspect this is a bug, since a single ordinary operation shouldn't take that much time.
Here is a demo snippet to test it:
    import time
    import asyncio

    import aiobotocore
    from boto3.session import Session


    def get_kwargs(f_no):
        return {
            'Bucket': 'some-bucket',
            'Key': f'something/mini/file_{f_no}'
        }


    def sync_boto(times):
        session = Session()
        s3 = session.client("s3")
        start = time.perf_counter()
        for f_no in range(times):
            head_object = s3.head_object(**get_kwargs(f_no))
        end = time.perf_counter()
        return end - start


    async def async_boto(times, one_by_one=False):
        session = aiobotocore.AioSession()
        async with session.create_client("s3") as s3:
            start = time.perf_counter()
            # Creating the coroutines does not start them; work begins on await.
            coros = [s3.head_object(**get_kwargs(f_no)) for f_no in range(times)]
            if one_by_one:
                # Await each request in turn, mirroring the sync loop above.
                for coro in coros:
                    await coro
            else:
                # Run all requests concurrently.
                await asyncio.gather(*coros)
            end = time.perf_counter()
        return end - start


    print('10 sync head_object calls with boto: ', sync_boto(10))
    print('10 sync head_object calls with aiobotocore: ', asyncio.run(async_boto(10, one_by_one=True)))
    print('10 async head_object calls with aiobotocore: ', asyncio.run(async_boto(10, one_by_one=False)))
And here are the results I get (my connection is not that great, so allow some room for variance):
10 sync head_object calls with boto: 5.419796205000239
10 sync head_object calls with aiobotocore: 36.99419660199965
10 async head_object calls with aiobotocore: 3.782599554000626
I know that running the calls concurrently works quite nicely, but there are still use cases for running them one after another. Any ideas why the timings differ so much?
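As a network-free baseline for the comparison above, here is a minimal sketch that keeps the same one-by-one versus gather structure but swaps the S3 call for a hypothetical fake_request coroutine. The 0.05 s delay is an assumed round-trip time, not a measurement. Awaiting sequentially should cost roughly times × delay, so anything far beyond that in the real numbers would have to come from somewhere other than the act of awaiting in a loop.

```python
import asyncio
import time


# Hypothetical stand-in for a network call such as head_object.
# The default delay is an assumed latency, not a measured one.
async def fake_request(delay=0.05):
    await asyncio.sleep(delay)
    return delay


async def run(times, one_by_one):
    start = time.perf_counter()
    coros = [fake_request() for _ in range(times)]
    if one_by_one:
        # Each request finishes before the next one starts.
        for coro in coros:
            await coro
    else:
        # All requests wait on their "network" delay at the same time.
        await asyncio.gather(*coros)
    return time.perf_counter() - start


sequential = asyncio.run(run(10, one_by_one=True))
concurrent = asyncio.run(run(10, one_by_one=False))
print(f"sequential: {sequential:.3f}s, concurrent: {concurrent:.3f}s")
```

If the sequential figure here stays close to ten times the delay, the event loop itself is not the source of the extra time seen with head_object.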
Checklist
- I have reproduced in environment where pip check passes without errors
- I have provided pip freeze results
- I have provided sample code or detailed way to reproduce
- I have tried the same code in botocore to ensure this is an aiobotocore specific issue
- I have tried similar code in aiohttp to ensure this is an aiobotocore specific issue
- I have checked the latest and older versions of aiobotocore/aiohttp/python to see if this is a regression / injection
pip freeze results
aiobotocore==1.2.1
Environment:
- Python Version: 3.8
- OS name and version: Ubuntu 20.04
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 1
- Comments: 18
Commits related to this issue
- add some comments — committed to fsspec/s3fs by isidentical 3 years ago
There is definitely an improvement (I ran each operation 3 times), and the best-to-best comparison shows a reduction from 36 seconds to 30 seconds. That is still far too slow (roughly 600%) compared to the sync one.
Here are the results (only calls async_boto(10, one_by_one=True)), profiled with yappi: https://gist.github.com/isidentical/316ea24961fe7bfead5a66eb5b3a8596
Huh, get_object seems to run at about the same speed as the boto implementation.
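To compare operations such as head_object and get_object call by call, rather than by total time only, a small timing helper may be useful. This is a sketch under stated assumptions: time_each and fake_call are hypothetical names introduced here, and fake_call merely sleeps in place of a real request.

```python
import asyncio
import time


async def time_each(make_call, times):
    """Await make_call(i) one at a time and return each call's duration."""
    durations = []
    for i in range(times):
        start = time.perf_counter()
        await make_call(i)
        durations.append(time.perf_counter() - start)
    return durations


# Hypothetical stand-in for a real client call; it only sleeps.
async def fake_call(i):
    await asyncio.sleep(0.01)


durations = asyncio.run(time_each(fake_call, 5))
print([f"{d:.3f}" for d in durations])
```

With a real client, the callable could be something like lambda i: s3.head_object(**get_kwargs(i)), used inside the async with block from the demo snippet. A first call that is much slower than the rest would point to one-time setup cost, while uniformly slow calls would point to per-request overhead.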