fix(parsing): prevent Pydantic schema validator leak in parse_response #3235
Open
xodn348 wants to merge 1 commit into
…Pydantic schema leaks

When parse_response constructs ParsedResponseOutputText[TextFormatT], ParsedResponseOutputMessage[TextFormatT], and ParsedResponse[TextFormatT] with an unresolved free TypeVar, Pydantic v2 calls model_rebuild on every invocation and never caches the result because the TypeVar cannot be resolved. Each call therefore allocates fresh SchemaValidator and SchemaSerializer objects (heavy Rust structs) that accumulate without bound in long-running servers.

Use the unparameterised base classes instead. All three guard their Generic-annotated fields behind `if TYPE_CHECKING:` so the type argument has no runtime effect on ParsedResponseOutputMessage and ParsedResponse; ParsedResponseOutputText stores the actual parsed value via the dict passed to construct_type_unchecked, so the schema type of the `parsed` field (Optional[Any] vs Optional[TextFormatT]) does not matter at runtime. Cast the results to preserve static type information.

Adds two regression tests:
- correctness: parsed attribute contains the expected Pydantic model
- no-leak: SchemaValidator count does not grow after the first call

Fixes openai#3084
nomiveritas
approved these changes
May 13, 2026
Summary
- Replace `ParsedResponseOutputText[TextFormatT]`, `ParsedResponseOutputMessage[TextFormatT]`, and `ParsedResponse[TextFormatT]` with their unparameterised forms in `openai/lib/_parsing/_responses.py`.
- Add `cast(...)` wrappers to preserve the static type signatures for callers.
- Add regression tests, including a check that `SchemaValidator` objects do not grow after the initial warm-up call.

Issue
Closes #3084
`parse_response` constructs `ParsedResponse[TextFormatT]` (and the two inner types) using a free `TypeVar` as the type argument. Pydantic v2's `model_rebuild` cannot resolve a free TypeVar, so it returns `False` and never populates `MockCoreSchema._built_memo`. This causes a fresh `SchemaValidator` and `SchemaSerializer` (heavy Rust objects) to be allocated on every single `responses.parse()` call. In a long-running server the process RSS grows linearly with request count.
Root cause:
Fix — use the unparameterised class so Pydantic builds and caches the schema once:
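A minimal, stdlib-only sketch of the pattern described here, assuming a hypothetical `Box` class as a stand-in for the SDK's Pydantic models (it is not the PR's diff): the generic field annotation lives under `if TYPE_CHECKING:`, the object is built from the unparameterised class, and `typing.cast` restores the static type without doing anything at runtime.

```python
from typing import TYPE_CHECKING, Generic, Optional, TypeVar, cast

T = TypeVar("T")


class Box(Generic[T]):
    """Hypothetical stand-in for a generic response model (not an SDK class)."""

    if TYPE_CHECKING:
        # Only static type checkers see this annotation; it is skipped at runtime.
        parsed: Optional[T]

    def __init__(self, parsed: Optional[T] = None) -> None:
        self.parsed = parsed


def build(value: T) -> Box[T]:
    # Construct the unparameterised class, so nothing at runtime depends on T.
    raw = Box(parsed=value)
    # typing.cast returns its argument unchanged; it only informs type checkers.
    return cast(Box[T], raw)


payload = {"answer": 42}
result = build(payload)
print(type(result) is Box)       # True
print(result.parsed is payload)  # True
```

The point of the sketch is that the value callers read from `parsed` comes from what was passed in, not from the type argument, so dropping the parameter changes nothing observable at runtime.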
All three parameterised Generic fields are either guarded by `if TYPE_CHECKING:` (runtime-inert) or set explicitly from the dict passed to `construct_type_unchecked`, so the unparameterised form is runtime-equivalent.

Local verification
Schema-leak verification (warm-up + 50 calls, delta must be 0):
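One way such a check can be structured is sketched below, using only the standard library. The helper names (`count_instances`, `instance_growth`) and the `FakeValidator` class are hypothetical stand-ins; in the real check the counted class would presumably be Pydantic's `SchemaValidator` and the call under test `parse_response`, which is an assumption about the PR's test rather than a copy of it.

```python
import gc
from typing import Callable, Dict, List


def count_instances(cls: type) -> int:
    """Count live, GC-tracked objects whose exact type is cls."""
    gc.collect()
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


def instance_growth(fn: Callable[[], object], cls: type, calls: int = 50) -> int:
    """Run one warm-up call, then `calls` more, and return the instance delta."""
    fn()  # warm-up: one-time allocations are allowed here
    before = count_instances(cls)
    for _ in range(calls):
        fn()
    return count_instances(cls) - before


class FakeValidator:
    """Hypothetical stand-in for a heavy, schema-bound object."""


_leaked: List[FakeValidator] = []
_cache: Dict[str, FakeValidator] = {}


def leaky_call() -> None:
    # Simulates the bug: a new object is allocated and retained on every call.
    _leaked.append(FakeValidator())


def cached_call() -> None:
    # Simulates the fix: the object is built once and reused afterwards.
    if "schema" not in _cache:
        _cache["schema"] = FakeValidator()


print(instance_growth(leaky_call, FakeValidator))   # 50
print(instance_growth(cached_call, FakeValidator))  # 0
```

Counting by exact type after `gc.collect()` keeps the measurement independent of process RSS, which is noisy; a delta of 0 after warm-up is the condition stated above.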
Risk
- The `parsed` field value is already embedded in the dict passed to `construct_type_unchecked`.
- `cast` preserves all existing type signatures; no downstream callers need updating.
- Pydantic v1 and v2 both use `construct_type_unchecked` the same way; the caching fix only matters for Pydantic v2's `model_rebuild` path, and the unparameterised form is valid for both.