pydantic: Very slow (FastAPI) application startup after using V2 due to `_core_utils.py:walk`

Initial Checks

  • I confirm that I’m using Pydantic V2

Description

Our FastAPI application's startup time degraded from about 5s to over 20s after upgrading to V2. This makes, for example, our test runs very slow (we haven't yet measured how long it takes to get servers up and running in prod). The issue seems to be related to the walking of core schemas, as the cProfile results below show (taken from our app init).

The issue seems to come from the poor performance of initializing TypeAdapters in FastAPI. It might also be related to how FastAPI uses the new TypeAdapter interface, but there might also be a problem in the walking of core schemas itself. What do you think? Full cProfile print output: cprofile.txt
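
For anyone wanting to collect a comparable profile of their own startup, here is a minimal sketch using only the standard library. The `init_app` function is a hypothetical stand-in for whatever triggers your application init (for example, importing the module that creates the FastAPI app); it is not taken from the app in this issue.

```python
import cProfile
import io
import pstats


def init_app() -> None:
    """Hypothetical stand-in for application startup work."""
    sum(i * i for i in range(100_000))


def profile_startup(top: int = 15) -> str:
    """Profile init_app() and return the top entries sorted by tottime."""
    profiler = cProfile.Profile()
    profiler.enable()
    init_app()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "tottime" sorts by time spent inside each function itself,
    # excluding time spent in the functions it calls.
    stats.sort_stats("tottime").print_stats(top)
    return buffer.getvalue()


report = profile_startup()
print(report)
```

Sorting by `tottime` matches the column order of the tables in this issue; sorting by `cumtime` instead can make it easier to see which top-level call (such as a constructor) accounts for the overall cost.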

See latest profiles here: https://github.com/pydantic/pydantic/issues/6768#issuecomment-1975276590

Below are outdated profiles, see link above.

(SLOW) Pydantic v2.0.3 & FastAPI v0.100.0
ncalls            tottime  percall cumtime   percall filename:lineno(function)
3854141/395376    3.575    0.000   18.577    0.000   xxx/python3.11/site-packages/pydantic/_internal/_core_utils.py:193(_walk)
3850208/395376    3.395    0.000   20.191    0.000   xxx/python3.11/site-packages/pydantic/_internal/_core_utils.py:190(walk)
3003272           2.298    0.000    2.644    0.000   xxx/python3.11/site-packages/pydantic/_internal/_core_utils.py:106(get_ref)
9812              1.773    0.000   31.855    0.003   xxx/python3.11/site-packages/pydantic/type_adapter.py:142(__init__)
1352590/179311    1.712    0.000    6.842    0.000   xxx/python3.11/site-packages/pydantic/_internal/_core_utils.py:434(flatten_refs)
309655/256580     1.587    0.000   12.204    0.000   xxx/python3.11/site-packages/pydantic/_internal/_core_utils.py:309(handle_model_fields_schema)
1145758/38773     1.536    0.000    6.745    0.000   xxx/python3.11/site-packages/pydantic/_internal/_core_utils.py:411(collect_refs)
5766050           1.451    0.000    1.451    0.000   {method 'copy' of 'dict' objects}
3229655/682916    1.382    0.000   16.172    0.000   xxx/python3.11/site-packages/pydantic/_internal/_core_utils.py:200(_handle_other_schemas)
9718277           1.131    0.000    1.131    0.000   {method 'get' of 'dict' objects}
83003             0.831    0.000    3.959    0.000   xxx/python3.11/site-packages/pydantic/_internal/_generate_schema.py:1288(_get_prepare_pydantic_annotations_for_known_type)


(FAST) Pydantic v1.10.10 & FastAPI v0.100.0
ncalls            tottime  percall  cumtime  percall filename:lineno(function)
871294/78525      2.790    0.000    3.005    0.000   {built-in method _abc._abc_subclasscheck}
545241/8813       0.775    0.000    1.942    0.000   xxx/python3.11/copy.py:128(deepcopy)
10618             0.490    0.000    1.593    0.000   xxx/python3.11/site-packages/fastapi/utils.py:63(create_response_field)
19486/917         0.377    0.000    1.906    0.002   xxx/python3.11/copy.py:227(_deepcopy_dict)
3202/2127         0.321    0.000    3.182    0.001   xxx/python3.11/site-packages/pydantic/generics.py:75(__class_getitem__)
40290/28472       0.313    0.000    1.657    0.000   xxx/python3.11/inspect.py:2428(_signature_from_callable)
28472             0.285    0.000    0.979    0.000   xxx/python3.11/inspect.py:2333(_signature_from_function)
871294/78525      0.240    0.000    3.033    0.000   <frozen abc>:121(__subclasscheck__)
1422963/1412127   0.231    0.000    0.244    0.000   {built-in method builtins.isinstance}
52781             0.213    0.000    0.265    0.000   xxx/python3.11/inspect.py:2972(__init__)
14998             0.205    0.000    0.206    0.000   {method '__reduce_ex__' of 'object' objects}


(FAST) Pydantic v1.10.10 & FastAPI v0.99.1
ncalls            tottime  percall  cumtime  percall filename:lineno(function)
858955/76183      1.997    0.000    2.190    0.000   {built-in method _abc._abc_subclasscheck}
538585/2157       0.825    0.000    1.788    0.001   xxx/python3.11/copy.py:128(deepcopy)
23230             0.509    0.000    0.521    0.000   xxx/python3.11/site-packages/fastapi/dependencies/models.py:16(__init__)
52781             0.425    0.000    0.476    0.000   xxx/python3.11/inspect.py:2972(__init__)
10618             0.419    0.000    1.480    0.000   xxx/python3.11/site-packages/fastapi/utils.py:75(create_response_field)
19486/917         0.367    0.000    1.776    0.002   xxx/python3.11/copy.py:227(_deepcopy_dict)
3202/2127         0.312    0.000    3.025    0.001   xxx/python3.11/site-packages/pydantic/generics.py:75(__class_getitem__)
40290/28472       0.279    0.000    1.778    0.000   xxx/python3.11/inspect.py:2428(_signature_from_callable)
28472             0.265    0.000    1.148    0.000   xxx/python3.11/inspect.py:2333(_signature_from_function)
858955/76183      0.219    0.000    2.216    0.000   <frozen abc>:121(__subclasscheck__)
1360643/1349807   0.212    0.000    0.224    0.000   {built-in method builtins.isinstance}

Example Code

# This TypeAdapter usage is very slow with a large number of API models (e.g. API models that are generic over inner data models)
# https://github.com/tiangolo/fastapi/blob/f7e3559bd5997f831fb9b02bef9c767a50facbc3/fastapi/_compat.py#L101
self._type_adapter: TypeAdapter[Any] = TypeAdapter(
    Annotated[self.field_info.annotation, self.field_info]
)

Python, Pydantic & OS Version

pydantic version: 2.0.3
        pydantic-core version: 2.3.0 release build profile
                 install path: /Users/markussintonen/Library/Caches/pypoetry/virtualenvs/userdata-IiJ3qKwC-py3.11/lib/python3.11/site-packages/pydantic
               python version: 3.11.4 (main, Jul 17 2023, 14:44:40) [Clang 14.0.3 (clang-1403.0.22.14.1)]
                     platform: macOS-13.4.1-x86_64-i386-64bit
     optional deps. installed: ['typing-extensions']

Selected Assignee: @samuelcolvin

About this issue

  • State: open
  • Created a year ago
  • Reactions: 8
  • Comments: 43 (21 by maintainers)

Most upvoted comments

We’ve actually fully resolved this now. On the benchmark from this issue, startup time has gone from about 11s on my laptop down to 2.5s, so back to v1 levels of performance. There were no compromises / observable changes made (aside from possibly some changes to the generated CoreSchemas), just simplifying code and adding some caches.

It can be even faster if you set the PYDANTIC_SKIP_VALIDATING_CORE_SCHEMAS env var being introduced in https://github.com/pydantic/pydantic/pull/7565. Most schemas are valid and do not need validation, and if you are starting up your FastAPI app in tests then you’re already validating it there, it doesn’t need to be re-validated multiple times. So you could set that env var by default to 'true' (in conftest.py, in your deployment env vars) and then override it to 'false' in a single test responsible for validating your schemas. That way even your tests will run faster.
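
A minimal sketch of the setup described above, assuming the variable name and the 'true'/'false' values given in this comment. The helper `set_skip_core_schema_validation` is hypothetical and only for illustration. The comment does not say when pydantic reads the variable, so the sketch sets it before the application is imported to be safe.

```python
# conftest.py (sketch)
import os

# Name taken from the comment above; accepted values assumed to be 'true'/'false'.
ENV_VAR = "PYDANTIC_SKIP_VALIDATING_CORE_SCHEMAS"

# Default to skipping core schema validation for the test run.
# setdefault keeps any value already provided by the environment (e.g. CI).
os.environ.setdefault(ENV_VAR, "true")


def set_skip_core_schema_validation(skip: bool) -> None:
    """Hypothetical helper: switch the flag on or off for the current process."""
    os.environ[ENV_VAR] = "true" if skip else "false"
```

The single test responsible for validating schemas would call `set_skip_core_schema_validation(False)` before building the app and restore the previous value afterwards. With pytest, `monkeypatch.setenv` does the same thing with automatic cleanup.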

@MarkusSintonen I’d appreciate it if you could test the next release (or main) and let us know if we can close this issue.

Humm, sorry about that, we’ll fix.

I don’t think it needs to be!

While I doubt it will be “easy” to improve this, I do think there’s likely to be a lot of potential for improvement because performance of model/TypeAdapter instantiation hasn’t really been a concern up to this point (though clearly we want to improve your situation).

@MarkusSintonen I’ve pushed a branch to pydantic called lazy-type-adapter which makes type adapters instantiate the core schemas and validators/serializers lazily. Is there any chance you could try installing pydantic from that branch and see if it improves your app startup time? I don’t think that branch is currently viable for merging since it won’t (yet) trigger build-time errors in all cases where it should, BUT, I think the extent to which that branch impacts startup performance will significantly help me understand where there is potential for improvements.

Note also that that branch may not actually improve total startup time for accessing all endpoints, I just think it may delay per-model/endpoint initialization until the first request that hits that endpoint, which I’m thinking might result in a significant improvement to startup time (i.e., time it takes to get a single response from an arbitrary endpoint). I’d be interested to understand if that is not a sufficient improvement for you (i.e., you need to keep total initialization time low even if you can get a response from some endpoint very fast), but either way it will be useful to understand the impact.
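
The trade-off described here can be illustrated with a generic lazy-initialization sketch. This is not the code on the lazy-type-adapter branch; `LazyAdapter` and its `build` callable are hypothetical names, and the built-in `int` stands in for an expensive-to-build validator.

```python
from functools import cached_property
from typing import Any, Callable


class LazyAdapter:
    """Defers an expensive build step until the first validation call."""

    def __init__(self, build: Callable[[], Callable[[Any], Any]]) -> None:
        self._build = build
        self.build_count = 0  # only here to show when the build happens

    @cached_property
    def _validator(self) -> Callable[[Any], Any]:
        # Runs once, on first access; the result is cached on the instance.
        self.build_count += 1
        return self._build()

    def validate(self, value: Any) -> Any:
        return self._validator(value)


# Constructing the adapter is cheap: nothing has been built yet.
adapter = LazyAdapter(lambda: int)
assert adapter.build_count == 0

# The first call pays the build cost; later calls reuse the cached validator.
adapter.validate("42")
adapter.validate("7")
assert adapter.build_count == 1
```

With this pattern the total work is unchanged and only moves from construction time to first use, which is why it may help time-to-first-response without reducing the total initialization time across all endpoints. It also shows why build-time errors can be missed: a failing `build` would not surface until the first `validate` call.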

I think Adrian and I also have some ideas about how we can do better caching in the schema generation process that should reduce the amount of work being done by the TypeAdapter; that’s not quite ready, but improving this is a priority for us.

Thanks so much for reporting, we’re looking into this.