Skip to content

Add rebuild() method for TypeAdapter and simplify defer_build patterns #10537

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 33 commits into from
Nov 13, 2024

Conversation

sydney-runkle
Copy link
Contributor

@sydney-runkle sydney-runkle commented Oct 1, 2024

This started as a simplification for the defer_build logic, but also touches a few other things in type_adapter.py.

We are removing the experimental_ flag associated with defer_build for TypeAdapter in v2.10, so I think we should also go ahead and simplify this logic now so that we can avoid breaking changes in future releases.

Closes #10632

Main changes:

  • Remove complexity of frame_depth wrappers
  • Add rebuild() function to type adapter instances to mimick the model_rebuild() pattern on BaseModels
  • Revert core_schema, validator, and serializer back to simple attributes rather than properties - this is more consistent with that of BaseModels - said properties silently fail to Mock... structure son schema build if annotations cannot be evaluated.
  • Add rebuild logic in json schema generation (like that for models)
  • Make defer_build and model_config properties, not functions

Note, there's a nice little perf boost here :)

@github-actions github-actions bot added the relnotes-fix Used for bugfixes. label Oct 1, 2024
Copy link

cloudflare-workers-and-pages bot commented Oct 1, 2024

Deploying pydantic-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: b14206b
Status: ✅  Deploy successful!
Preview URL: https://b431536a.pydantic-docs.pages.dev
Branch Preview URL: https://ta-defer-build-simplificatio.pydantic-docs.pages.dev

View logs

Copy link

codspeed-hq bot commented Oct 1, 2024

CodSpeed Performance Report

Merging #10537 will improve performances by 17.17%

Comparing ta-defer-build-simplification (b14206b) with main (9783bc8)

Summary

⚡ 1 improvements
✅ 43 untouched benchmarks

Benchmarks breakdown

Benchmark main ta-defer-build-simplification Change
test_efficiency_with_highly_nested_examples 919.5 µs 784.8 µs +17.17%

@sydney-runkle sydney-runkle added relnotes-change Used for changes to existing functionality which don't have a better categorization. and removed relnotes-fix Used for bugfixes. labels Oct 2, 2024
@sydney-runkle sydney-runkle changed the title WIP: refactor type adapter namespace management Add rebuild() method for TypeAdapter and simplify defer_build patterns Oct 2, 2024
@Viicos
Copy link
Member

Viicos commented Oct 2, 2024

My best guess regarding why the deferred parent namespace fetching was implemented was to support the following:

from pydantic import TypeAdapter

def func():
    ta = TypeAdapter("Forward", config={"defer_build": True})

    Forward = int

    ta.core_schema  # triggers the core schema build, and `frame.f_locals` is only fetched at this point, so it includes `Forward`

Which presumably won't work anymore with this PR. I'll not that this pattern feels "dangerous", as one can do:

def func():
    ta = TypeAdapter("Forward", config={"defer_build": True})

    Forward = int

    return ta

ta = func()
ta.core_schema  # PydanticUndefinedAnnotation: name 'Forward' is not defined, because locals were gc'ed.

# Things could also be worse, if you defined `Forward = str` before calling `ta.core_schema` here

So I think we need to be careful here before making the change. People might be relying on this behavior?

@sydney-runkle
Copy link
Contributor Author

My best guess regarding why the deferred parent namespace fetching was implemented was to support the following:

I'd argue that we shouldnt' support rebuilds on attribute accesses for core_schema, validator, and serializer, as we don't do that for __pydantic_core_schema__, etc.

Indeed, this is an issue, but it's one that we run into with BaseModel as well, which is why we cache the parent namespace 😢, so maybe we should be doing that for consistency?

@sydney-runkle
Copy link
Contributor Author

I'll note, this works:

from pydantic import BaseModel, TypeAdapter, ConfigDict

Int = int

def tester_model() -> None:
    class Model(BaseModel):
        i: Int

        model_config = ConfigDict(defer_build=True)

    print(Model.__pydantic_core_schema__)
    # mock core schema
    Model.model_rebuild()
    print(Model.__pydantic_core_schema__)
    # works, displays a core schema

tester_model()

def tester_adapter() -> None:
    ta = TypeAdapter(Int, config=ConfigDict(defer_build=True))

    print(ta.core_schema)
    # mock core schema
    ta.rebuild()
    print(ta.core_schema)
    # works, displays a core schema
  
tester_adapter()

@sydney-runkle
Copy link
Contributor Author

Admittedly, I don't think this should undergo serious review until we merge #10530

@sydney-runkle
Copy link
Contributor Author

Closing based on #10632.

Will resume in v2.11

@sydney-runkle sydney-runkle added the relnotes-performance Used for performance improvements. label Nov 10, 2024
type_repr: Name of the type used in the adapter, used in error messages
"""
undefined_type_error_message = (
f'`TypeAdapter[{type_repr}]` is not fully defined; you should define `{type_repr}` and all referenced types,'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TypeAdapter[...] form seems a bit weird, especially because type_repr seems to be str(type) when called from TypeAdapter. Probably fine as is, just wanted to note that this may lead to weird string representations (maybe _display.display_as_type could be used, although it is costly to compute).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could probably rework the APIs a bit to make it possible to defer the cost of computing the error message if it's a concern. Like, make MockCoreSchema accept a callable for the error message rather than just a string

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested the same in other comment about making the error message forming lazy. So it only happens on errors

@sydney-runkle
Copy link
Contributor Author

sydney-runkle commented Nov 12, 2024

EDIT: reducing this to closed within a note now that this has been resolved

Details

So I'll note, this works:

from typing import Generic, Optional, TypeVar

from typing_extensions import NamedTuple as TypingExtensionsNamedTuple

from pydantic import TypeAdapter

T = TypeVar('T')


class MyNamedTuple(TypingExtensionsNamedTuple, Generic[T]):
    x: T
    y: Optional['MyNamedTuple[T]']


ta = TypeAdapter(MyNamedTuple[int])
print(ta.validate_python((1, None)))
#> MyNamedTuple(x=1, y=None)

So does this (with the same imports as above)

T = TypeVar('T')

def test_in_function () -> None:
    class MyNamedTuple(TypingExtensionsNamedTuple, Generic[T]):
        x: T
        y: Optional['MyNamedTuple[T]']

    ta = TypeAdapter(MyNamedTuple[int])
    print(ta.validate_python((1, None)))

test_in_function()
#> MyNamedTuple(x=1, y=None)

But alas, this, does not. Debugging now.

def test_in_function () -> None:
    T = TypeVar('T')
    
    class MyNamedTuple(TypingExtensionsNamedTuple, Generic[T]):
        x: T
        y: Optional['MyNamedTuple[T]']

    ta = TypeAdapter(MyNamedTuple[int])
    print(ta.validate_python((1, None)))

test_in_function()

Comment on lines +119 to +121
_parent_depth: Depth at which to search the for the parent namespace used for schema building.
We also use this as a reference level to find the global namespace (_parent_depth - 1).
Defaults to 2 because we expect to reference the frame that called the `TypeAdapter` constructor.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth considering - should we default to 1, then add 1 when we call into parent_frame_namespace? That seems more intuitive to me from a user perspective. cc @Viicos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I think this could be a valuable change, it is breaking, as folks are depending on the current behavior. This is _ prefixed though, so we sort of have range to do this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could make more sense but would be inconsistent with model_rebuild which suffers from the same issue :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alrighty, let's leave as is :(

@sydney-runkle
Copy link
Contributor Author

cc @MarkusSintonen, if you're available, this should be ready for another review :)

Comment on lines +271 to +285
schema_generator = _generate_schema.GenerateSchema(config_wrapper, ns_resolver=ns_resolver)

try:
core_schema = schema_generator.generate_schema(self._type)
except PydanticUndefinedAnnotation:
if raise_errors:
raise
_mock_val_ser.set_type_adapter_mocks(self, str(self._type))
return False

try:
self.core_schema = schema_generator.clean_schema(core_schema)
except schema_generator.CollectedInvalid:
_mock_val_ser.set_type_adapter_mocks(self, str(self._type))
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have seen the same try-catch-if-raise_error-mocking logic repeated in many places. Wondering could it be extracted to some common handling... Maybe later...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's consolidate later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe with 341a778, I can create an issue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sydney-runkle
Copy link
Contributor Author

sydney-runkle commented Nov 13, 2024

Another idea I had, which doesn't fit in scope here. See: 341a778

Details

diff --git a/pydantic/_internal/_mock_val_ser.py b/pydantic/_internal/_mock_val_ser.py
index ddf0a6ac7..9bd8be7d4 100644
--- a/pydantic/_internal/_mock_val_ser.py
+++ b/pydantic/_internal/_mock_val_ser.py
@@ -1,6 +1,7 @@
 from __future__ import annotations
 
-from typing import TYPE_CHECKING, Any, Callable, Generic, Iterator, Mapping, TypeVar, Union
+from functools import partial
+from typing import TYPE_CHECKING, Any, Callable, Generic, Iterator, Mapping, TypeAlias, TypeVar, Union
 
 from pydantic_core import CoreSchema, SchemaSerializer, SchemaValidator
 from typing_extensions import Literal
@@ -13,9 +14,8 @@ if TYPE_CHECKING:
     from ..main import BaseModel
     from ..type_adapter import TypeAdapter
 
-
-ValSer = TypeVar('ValSer', bound=Union[SchemaValidator, PluggableSchemaValidator, SchemaSerializer])
-T = TypeVar('T')
+ValidatorOrSerializer: TypeAlias = Union[SchemaValidator, PluggableSchemaValidator, SchemaSerializer]
+ValSer = TypeVar('ValSer', bound=ValidatorOrSerializer)
 
 
 class MockCoreSchema(Mapping[str, Any]):
@@ -109,44 +109,80 @@ class MockValSer(Generic[ValSer]):
         return None
 
 
-def set_type_adapter_mocks(adapter: TypeAdapter, type_repr: str) -> None:
-    """Set `core_schema`, `validator` and `serializer` to mock core types on a type adapter instance.
+MockContainer = TypeVar('MockContainer', bound=type[BaseModel] | TypeAdapter | type[PydanticDataclass])
+RebuildReturnType = TypeVar('RebuildReturnType', bound=CoreSchema | ValidatorOrSerializer)
 
-    Args:
-        adapter: The type adapter instance to set the mocks on
-        type_repr: Name of the type used in the adapter, used in error messages
-    """
-    undefined_type_error_message = (
-        f'`TypeAdapter[{type_repr}]` is not fully defined; you should define `{type_repr}` and all referenced types,'
-        f' then call `.rebuild()` on the instance.'
-    )
 
-    def attempt_rebuild_fn(attr_fn: Callable[[TypeAdapter], T]) -> Callable[[], T | None]:
-        def handler() -> T | None:
-            if adapter.rebuild(_parent_namespace_depth=5) is not False:
-                return attr_fn(adapter)
+class MockFactory(Generic[MockContainer]):
+    """Factory for creating `MockCoreSchema`, `MockValSer` and `MockValSer` instances for a given type."""
+
+    __slots__ = '_obj', '_error_message', '_rebuild'
+
+    def __init__(
+        self,
+        obj: MockContainer,
+        error_message: str,
+        rebuild: Callable[[MockContainer], Callable[..., bool | None]],
+    ) -> None:
+        self._obj = obj
+        self._error_message = error_message
+        self._rebuild = rebuild
+
+    def _attempt_rebuild_fn(
+        self, attr_fn: Callable[[MockContainer], RebuildReturnType | None]
+    ) -> Callable[[], RebuildReturnType | None]:
+        def handler() -> RebuildReturnType | None:
+            if self._rebuild(self._obj)(_parent_namespace_depth=5) is not False:
+                return attr_fn(self._obj)
             else:
                 return None
 
         return handler
 
-    adapter.core_schema = MockCoreSchema(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        attempt_rebuild=attempt_rebuild_fn(lambda ta: ta.core_schema),
-    )
-    adapter.validator = MockValSer(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        val_or_ser='validator',
-        attempt_rebuild=attempt_rebuild_fn(lambda ta: ta.validator),
-    )
-    adapter.serializer = MockValSer(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        val_or_ser='serializer',
-        attempt_rebuild=attempt_rebuild_fn(lambda ta: ta.serializer),
+    def mock_core_schema(self, attr_fn: Callable[[MockContainer], CoreSchema | None]) -> MockCoreSchema:
+        return MockCoreSchema(
+            self._error_message,
+            code='class-not-fully-defined',
+            attempt_rebuild=self._attempt_rebuild_fn(attr_fn),
+        )
+
+    def mock_schema_validator(
+        self, attr_fn: Callable[[MockContainer], SchemaValidator | PluggableSchemaValidator | None]
+    ) -> MockValSer:
+        return MockValSer(
+            self._error_message,
+            code='class-not-fully-defined',
+            val_or_ser='validator',
+            attempt_rebuild=self._attempt_rebuild_fn(attr_fn),
+        )
+
+    def mock_schema_serializer(self, attr_fn: Callable[[MockContainer], SchemaSerializer | None]) -> MockValSer:
+        return MockValSer(
+            self._error_message,
+            code='class-not-fully-defined',
+            val_or_ser='serializer',
+            attempt_rebuild=self._attempt_rebuild_fn(attr_fn),
+        )
+
+
+def set_type_adapter_mocks(adapter: TypeAdapter, type_repr: str) -> None:
+    """Set `core_schema`, `validator` and `serializer` to mock core types on a type adapter instance.
+
+    Args:
+        adapter: The type adapter instance to set the mocks on
+        type_repr: Name of the type used in the adapter, used in error messages
+    """
+    mock_factory = MockFactory[TypeAdapter](
+        obj=adapter,
+        error_message=(
+            f'`TypeAdapter[{type_repr}]` is not fully defined; you should define `{type_repr}` and all referenced types,'
+            f' then call `.rebuild()` on the instance.'
+        ),
+        rebuild=lambda ta: partial(ta.rebuild, raise_errors=False),
     )
+    adapter.core_schema = mock_factory.mock_core_schema(attr_fn=lambda ta: ta.core_schema)  # pyright: ignore[reportAttributeAccessIssue]
+    adapter.validator = mock_factory.mock_schema_validator(attr_fn=lambda ta: ta.validator)  # pyright: ignore[reportAttributeAccessIssue]
+    adapter.serializer = mock_factory.mock_schema_serializer(attr_fn=lambda ta: ta.serializer)  # pyright: ignore[reportAttributeAccessIssue]
 
 
 def set_model_mocks(cls: type[BaseModel], cls_name: str, undefined_name: str = 'all referenced types') -> None:
@@ -157,37 +193,18 @@ def set_model_mocks(cls: type[BaseModel], cls_name: str, undefined_name: str = '
         cls_name: Name of the model class, used in error messages
         undefined_name: Name of the undefined thing, used in error messages
     """
-    undefined_type_error_message = (
-        f'`{cls_name}` is not fully defined; you should define {undefined_name},'
-        f' then call `{cls_name}.model_rebuild()`.'
+    mock_factory = MockFactory[type[BaseModel]](
+        obj=cls,
+        error_message=(
+            f'`{cls_name}` is not fully defined; you should define {undefined_name},'
+            f' then call `{cls_name}.model_rebuild()`.'
+        ),
+        rebuild=lambda c: partial(c.model_rebuild, raise_errors=False),
     )
 
-    def attempt_rebuild_fn(attr_fn: Callable[[type[BaseModel]], T]) -> Callable[[], T | None]:
-        def handler() -> T | None:
-            if cls.model_rebuild(raise_errors=False, _parent_namespace_depth=5) is not False:
-                return attr_fn(cls)
-            else:
-                return None
-
-        return handler
-
-    cls.__pydantic_core_schema__ = MockCoreSchema(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        attempt_rebuild=attempt_rebuild_fn(lambda c: c.__pydantic_core_schema__),
-    )
-    cls.__pydantic_validator__ = MockValSer(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        val_or_ser='validator',
-        attempt_rebuild=attempt_rebuild_fn(lambda c: c.__pydantic_validator__),
-    )
-    cls.__pydantic_serializer__ = MockValSer(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        val_or_ser='serializer',
-        attempt_rebuild=attempt_rebuild_fn(lambda c: c.__pydantic_serializer__),
-    )
+    cls.__pydantic_core_schema__ = mock_factory.mock_core_schema(attr_fn=lambda c: c.__pydantic_core_schema__)  # pyright: ignore[reportAttributeAccessIssue]
+    cls.__pydantic_validator__ = mock_factory.mock_schema_validator(attr_fn=lambda c: c.__pydantic_validator__)  # pyright: ignore[reportAttributeAccessIssue]
+    cls.__pydantic_serializer__ = mock_factory.mock_schema_serializer(attr_fn=lambda c: c.__pydantic_serializer__)  # pyright: ignore[reportAttributeAccessIssue]
 
 
 def set_dataclass_mocks(
@@ -202,34 +219,15 @@ def set_dataclass_mocks(
     """
     from ..dataclasses import rebuild_dataclass
 
-    undefined_type_error_message = (
-        f'`{cls_name}` is not fully defined; you should define {undefined_name},'
-        f' then call `pydantic.dataclasses.rebuild_dataclass({cls_name})`.'
+    mock_factory = MockFactory[type[PydanticDataclass]](
+        obj=cls,
+        error_message=(
+            f'`{cls_name}` is not fully defined; you should define {undefined_name},'
+            f' then call `pydantic.dataclasses.rebuild_dataclass({cls_name})`.'
+        ),
+        rebuild=lambda c: partial(rebuild_dataclass, cls=c, raise_errors=False),
     )
 
-    def attempt_rebuild_fn(attr_fn: Callable[[type[PydanticDataclass]], T]) -> Callable[[], T | None]:
-        def handler() -> T | None:
-            if rebuild_dataclass(cls, raise_errors=False, _parent_namespace_depth=5) is not False:
-                return attr_fn(cls)
-            else:
-                return None
-
-        return handler
-
-    cls.__pydantic_core_schema__ = MockCoreSchema(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        attempt_rebuild=attempt_rebuild_fn(lambda c: c.__pydantic_core_schema__),
-    )
-    cls.__pydantic_validator__ = MockValSer(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        val_or_ser='validator',
-        attempt_rebuild=attempt_rebuild_fn(lambda c: c.__pydantic_validator__),
-    )
-    cls.__pydantic_serializer__ = MockValSer(  # pyright: ignore[reportAttributeAccessIssue]
-        undefined_type_error_message,
-        code='class-not-fully-defined',
-        val_or_ser='serializer',
-        attempt_rebuild=attempt_rebuild_fn(lambda c: c.__pydantic_serializer__),
-    )
+    cls.__pydantic_core_schema__ = mock_factory.mock_core_schema(attr_fn=lambda c: c.__pydantic_core_schema__)  # pyright: ignore[reportAttributeAccessIssue]
+    cls.__pydantic_validator__ = mock_factory.mock_schema_validator(attr_fn=lambda c: c.__pydantic_validator__)  # pyright: ignore[reportAttributeAccessIssue]
+    cls.__pydantic_serializer__ = mock_factory.mock_schema_serializer(attr_fn=lambda c: c.__pydantic_serializer__)  # pyright: ignore[reportAttributeAccessIssue]

@Viicos, you might appreciate this, seems better from a type checking perspective perhaps? Or just more consolidated generally.

Copy link
Member

@Viicos Viicos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will still be able to tweak things before the final release if this break existing code with the next beta

@sydney-runkle sydney-runkle enabled auto-merge (squash) November 13, 2024 17:13
@sydney-runkle sydney-runkle merged commit c2647ab into main Nov 13, 2024
51 checks passed
@sydney-runkle sydney-runkle deleted the ta-defer-build-simplification branch November 13, 2024 17:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes-change Used for changes to existing functionality which don't have a better categorization. relnotes-performance Used for performance improvements.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Namespace management cleanup for TypeAdapter
4 participants