pydantic: Recursive model with discriminated union has exponential time and space complexity for validation

Initial Checks

  • I confirm that I’m using Pydantic V2

Description

When validating the recursive model given in the example, Pydantic has exponential time and space complexity, meaning that validation time and memory usage may grow indefinitely and rapidly depending on model depth, even when using a discriminator field.

Given the presence of a discriminator field, this should not be the case.

A model like this can potentially be abused in a denial-of-service attack, if the data passed to the validator can be controlled by the user.

Example Code

from __future__ import annotations
from typing import Literal, Annotated
from pydantic import Field, TypeAdapter, BaseModel

class NestedState(BaseModel):
    state_type: Literal["nested"]
    substate: AnyState

# If this type is left out, the model behaves normally again
class LoopState(BaseModel):
    state_type: Literal["loop"]
    substate: AnyState

class LeafState(BaseModel):
    state_type: Literal["leaf"]

AnyState = Annotated[NestedState | LoopState | LeafState, Field(..., discriminator="state_type")]

def build_nested_state(n):
    if n <= 0:
        return {"state_type": "leaf"}
    else:
        return {"state_type": "loop", "substate": {"state_type": "nested", "substate": build_nested_state(n-1)}}
        
adapter = TypeAdapter(AnyState)

# the next statement takes around 0.8s
adapter.validate_python(build_nested_state(9))

# the next statement takes around 3.5s
adapter.validate_python(build_nested_state(10))

# the next statement takes around 12.5s
adapter.validate_python(build_nested_state(11))

Python, Pydantic & OS Version

pydantic version: 2.6.0
        pydantic-core version: 2.16.1
          pydantic-core build: profile=release pgo=true
                 install path: /<redacted>/venv-3.11/lib/python3.11/site-packages/pydantic
               python version: 3.11.7 (main, Dec  4 2023, 18:10:11) [Clang 15.0.0 (clang-1500.1.0.2.5)]
                     platform: macOS-14.3-arm64-arm-64bit
             related packages: fastapi-0.105.0 typing_extensions-4.8.0 pydantic-settings-2.1.0
                       commit: unknown

About this issue

  • Original URL
  • State: closed
  • Created 5 months ago
  • Reactions: 1
  • Comments: 20 (12 by maintainers)

Most upvoted comments

Hey all 😄,

After some extended debugging, we found a fix, and will release 2.6.3 today with the changes. Please ping me if you’re experiencing any other issues. Happy coding! 🚀

Hi all,

Given the significance of this bug, we’re working on a fix now, and will plan to do a 2.6.3 release with said fix, once we have it! We’ll keep this thread updated as we progress.

If anyone else needs a workaround until this fixed I’m currently using a Discriminator with a custom function and annotating each type with Tag. The built schema always has a tagged-union in this case:

def get_state_type(v: Any) -> Hashable:
    if isinstance(v, dict):
        return v.get("state_type")
    return getattr(v, "state_type", None)

AnyState = Annotated[
    Annotated[NestedState, Tag("nested")]
    | Annotated[LoopState, Tag("loop")]
    | Annotated[LeafState, Tag("leaf")],
    Discriminator(get_state_type),
]

@sydney-runkle unfortunately still reproducible with 2.6.1, as David already noticed.

Afraid I’m reproducing this on main 😢