aiobotocore: HeadObject calls are slower than the regular boto when done synchronously

Describe the bug

When performing S3 API calls through aiobotocore one at a time, they are extremely slow compared to boto itself. In our project we use s3fs, which is a nice wrapper around aiobotocore, but the problem is that even a single API call with aiobotocore takes 2-3x longer than the same call with boto. I know there is some overhead from wrapping the calls and making them async, but I suspect this is a bug, since a single regular operation shouldn't take that much time.

Here is a demo snippet to reproduce it:

import time
import asyncio
import aiobotocore
from boto3.session import Session

def get_kwargs(f_no):
    return {
        'Bucket': 'some-bucket',
        'Key': f'something/mini/file_{f_no}'
    }

def sync_boto(times):
    session = Session()
    s3 = session.client("s3")

    start = time.perf_counter()
    for f_no in range(times):
        head_object = s3.head_object(**get_kwargs(f_no))
    end = time.perf_counter()
    return end - start

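async def async_boto(times, one_by_one=False):
    session = aiobotocore.AioSession()
    async with session.create_client("s3") as s3:
        start = time.perf_counter()
        # Calling head_object() only creates the coroutines; nothing is sent yet.
        coros = [s3.head_object(**get_kwargs(f_no)) for f_no in range(times)]

        if one_by_one:
            # Sequential: each request starts only after the previous one finishes.
            for coro in coros:
                await coro
        else:
            # Concurrent: all requests are in flight at the same time.
            await asyncio.gather(*coros)

        end = time.perf_counter()
        return end - start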

print('10 sync head_object calls with boto: ', sync_boto(10))
print('10 sync head_object calls with aiobotocore: ', asyncio.run(async_boto(10, one_by_one=True)))
print('10 async head_object calls with aiobotocore: ', asyncio.run(async_boto(10, one_by_one=False)))

And here are the results I get (my connection is not great, so allow some room for variance):

10 sync head_object calls with boto:  5.419796205000239
10 sync head_object calls with aiobotocore:  36.99419660199965
10 async head_object calls with aiobotocore:  3.782599554000626

I know that running the calls concurrently works quite nicely, but the use case for running them sequentially still exists. Any ideas about why there is such a large difference in timing?
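One way to narrow this down is to time each awaited call on its own rather than the whole loop, which shows whether every call is uniformly slow or only some of them stall. The following is a minimal sketch, not part of the original report: `time_each` is a hypothetical helper, and `asyncio.sleep` stands in for the real `s3.head_object(...)` call.

```python
import asyncio
import time


async def time_each(call_factories):
    """Await each call in turn and return the per-call durations in seconds."""
    durations = []
    for make_call in call_factories:
        start = time.perf_counter()
        # The coroutine is both created and awaited inside the timed region.
        await make_call()
        durations.append(time.perf_counter() - start)
    return durations


async def main():
    # Hypothetical stand-in calls. With a real client this could be:
    # [lambda n=n: s3.head_object(**get_kwargs(n)) for n in range(10)]
    factories = [lambda: asyncio.sleep(0.01) for _ in range(5)]
    durations = await time_each(factories)
    for index, seconds in enumerate(durations):
        print(f"call {index}: {seconds:.4f}s")
    return durations


if __name__ == "__main__":
    asyncio.run(main())
```

Passing zero-argument factories instead of ready-made coroutines keeps coroutine creation inside the measured interval, so the per-call numbers are comparable to the per-call cost of the sync loop.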

Checklist

  • I have reproduced in environment where pip check passes without errors
  • I have provided pip freeze results
  • I have provided sample code or detailed way to reproduce
  • I have tried the same code in botocore to ensure this is an aiobotocore specific issue
  • I have tried similar code in aiohttp to ensure this is an aiobotocore specific issue
  • I have checked the latest and older versions of aiobotocore/aiohttp/python to see if this is a regression / injection

pip freeze results

aiobotocore==1.2.1

Environment:

  • Python Version: 3.8
  • OS name and version: ubuntu 20.04

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 18

Most upvoted comments

There is definitely an improvement. I ran each operation 3 times, and the best-to-best comparison shows the sequential aiobotocore time dropping from about 36 seconds to about 30 seconds. That is still far too slow, though: nearly 6x the time of the sync version.

 $ python t.py
10 sync head_object calls with boto:  5.458850416000132
10 sync head_object calls with aioboto:  30.85212171800049
10 async head_object calls with aioboto:  3.713436236999769
(.venv38) (Python 3.8.5+) [  2:41ÖS ]  [ isidentical@desktop:~ ]
 $ pip freeze | grep aiohttp
aiohttp==3.7.4.post0

Here are the profiling results, collected with yappi while running only async_boto(10, one_by_one=True): https://gist.github.com/isidentical/316ea24961fe7bfead5a66eb5b3a8596
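For anyone who wants to collect a similar profile without installing yappi, here is a rough stdlib-only sketch that uses cProfile instead (a swapped-in tool, not what produced the gist above). `sequential_calls` and `profile_coroutine` are hypothetical names, and `asyncio.sleep` is a placeholder for the real `head_object` calls. cProfile attributes time per function call, so time spent waiting on the network shows up under the event loop's selector rather than under the coroutine that is waiting.

```python
import asyncio
import cProfile
import io
import pstats


async def sequential_calls(n):
    # Hypothetical stand-in for the one_by_one=True loop; replace the sleep
    # with `await s3.head_object(...)` to profile the real calls.
    for _ in range(n):
        await asyncio.sleep(0.01)


def profile_coroutine(coro_factory, top=15):
    """Run a coroutine under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        asyncio.run(coro_factory())
    finally:
        profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the slowest call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_coroutine(lambda: sequential_calls(10)))
```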

Huh, get_object seems to run at roughly the same speed as the boto implementation.

10 sync get_object calls with boto:  11.472657151000021
10 sync get_object calls with aioboto:  12.949970024999857
10 async get_object calls with aioboto:  1.9521049269997093
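To put the three runs in this thread side by side, the slowdown of the sequential aiobotocore loop relative to the sync boto loop can be computed directly from the reported timings. The numbers below are copied from the outputs above; the `slowdown` helper and the run labels are only illustrative.

```python
# (sync boto seconds, sequential aiobotocore seconds), copied from this thread.
runs = {
    "head_object, first run": (5.419796205000239, 36.99419660199965),
    "head_object, aiohttp==3.7.4.post0": (5.458850416000132, 30.85212171800049),
    "get_object": (11.472657151000021, 12.949970024999857),
}


def slowdown(sync_seconds, sequential_async_seconds):
    """Return how many times longer the sequential async loop took."""
    return sequential_async_seconds / sync_seconds


ratios = {name: slowdown(*timings) for name, timings in runs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

This works out to roughly 6.8x and 5.7x for the two head_object runs but only about 1.1x for get_object, which suggests the overhead is specific to HEAD requests rather than to the async wrapper in general.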