feat: add static schema generation with legacy fallback (0.17 gated) #2788
michaeldwan merged 12 commits into main
Pull request overview
This PR refactors Cog’s static schema generation to use a recursive SchemaType algebraic data type for outputs, adds cross-file BaseModel resolution when outputs are imported from local modules, and improves type-resolution error messages.
Changes:
- Replace the legacy flat output type representation with a recursive SchemaType plus JSONSchema() generation.
- Update the Python tree-sitter parser and generator plumbing to support cross-file model resolution via a sourceDir parameter.
- Add/expand unit and integration tests for nested output types, cross-file imports, and new error messages.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| pkg/schema/types.go | Removes legacy OutputType and switches PredictorInfo.Output to SchemaType. |
| pkg/schema/schema_type.go | Introduces SchemaType, recursive JSON Schema generation, and ResolveSchemaType. |
| pkg/schema/python/parser.go | Adds sourceDir support and cross-file BaseModel discovery via resolveExternalModels. |
| pkg/schema/python/parser_test.go | Updates parser API usage and adds extensive tests for recursion + cross-file resolution. |
| pkg/schema/openapi.go | Uses SchemaType.JSONSchema() for the output component. |
| pkg/schema/openapi_test.go | Updates OpenAPI tests to construct outputs with SchemaType constructors. |
| pkg/schema/generator.go | Extends parser function signature to accept sourceDir and threads it through generation. |
| pkg/schema/generator_test.go | Updates generator tests for new parser signature and SchemaType output construction. |
| pkg/schema/errors.go | Adds ErrUnresolvableType and new actionable error helpers for unresolved outputs. |
| pkg/image/build.go | Minor formatting-only change in static schema generation path. |
| integration-tests/tests/multi_file_schema.txtar | New integration test validating multi-file output type resolution end-to-end. |
| architecture/02-schema.md | Documentation overhaul describing the new static parser pipeline and SchemaType system. |
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.
Haven't reviewed this in detail, but I can confirm this fixes resnet in cog-examples!
c52fc0c to 0045ae9
michaeldwan left a comment
reviewed, fixes applied yesterday
The tree-sitter schema parser rejected unparameterized dict and list
return types with 'unsupported type'. This broke the resnet example
and any predictor returning -> dict or -> list.
- dict/Dict map to TypeAny (JSON Schema: {"type": "object"})
- list/List map to OutputList with TypeAny (JSON Schema: {"type": "array", "items": {"type": "object"}})
This matches the old Python schema gen behavior exactly:
dict -> Dict[str, Any] -> {"type": "object"}
list -> List[Any] -> {"type": "array", "items": {"type": "object"}}
…nsive tests

Replace the flat OutputType system with a recursive SchemaType algebraic data type that supports arbitrary nesting (dict[str, list[dict[str, int]]], etc.).

Key changes:
- SchemaType ADT with 7 kinds: Primitive, Any, Array, Dict, Object, Iterator, ConcatIterator
- ResolveSchemaType: recursive resolver replacing ResolveOutputType
- Cross-file model resolution: imports from local .py files are found on disk, parsed with tree-sitter, and BaseModel subclasses extracted automatically
- Handles all local import permutations: relative, dotted, subpackage, aliased
- Clear error messages for unresolvable types (includes import source and guidance)
- Remove legacy OutputType, OutputKind, ObjectField, ResolveOutputType
- Thread sourceDir through Parser -> ParsePredictor for filesystem access
- Rewrite architecture/02-schema.md for the static Go parser

Tests: 93 unit tests (12 recursive nesting, 5 unresolvable errors, 3 pydantic compat, 11 cross-file resolution, 1 end-to-end schema gen) + 1 integration test
…ursive model fields

- Propagate errors from dict value type resolution instead of silently falling back to opaque SchemaAny (dict[str, Tensor] now errors)
- Extract UnwrapOptional helper used by ResolveFieldType, resolveUnionSchemaType, and resolveFieldSchemaType (3 callsites)
- resolveModelToSchemaType now uses ResolveSchemaType via resolveFieldSchemaType, supporting dict/nested types inside BaseModel fields (previously limited to primitives, Optional[T], List[T])
- Fix stale comments: Optional rejected not nullable, Iterator allows nested types
- Fix garbled unicode in architecture docs, fix SchemaAny table entry
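As an illustration of the shared helper, an Optional-unwrapping function could look roughly like this. The TypeAnnotation shape and the signature are assumptions; only the helper's name comes from the commit message.

```go
package main

import "fmt"

// TypeAnnotation is a hypothetical parsed annotation: a name plus type arguments.
type TypeAnnotation struct {
	Name string
	Args []TypeAnnotation
}

// UnwrapOptional reports whether ann is Optional[T] or a two-member Union
// containing None, and returns T in that case. Otherwise it returns ann
// unchanged and false.
func UnwrapOptional(ann TypeAnnotation) (TypeAnnotation, bool) {
	if ann.Name == "Optional" && len(ann.Args) == 1 {
		return ann.Args[0], true
	}
	if ann.Name == "Union" && len(ann.Args) == 2 {
		for i, a := range ann.Args {
			if a.Name == "None" && len(a.Args) == 0 {
				return ann.Args[1-i], true
			}
		}
	}
	return ann, false
}

func main() {
	opt := TypeAnnotation{Name: "Optional", Args: []TypeAnnotation{{Name: "str"}}}
	inner, nullable := UnwrapOptional(opt)
	fmt.Println(inner.Name, nullable)
}
```

Keeping this in one place means every caller agrees on what counts as optional.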
- Fix nullable incorrectly set on non-required fields instead of field.Type.Nullable (debug: bool = False was appearing nullable)
- Remove dead CogArrayType/CogArrayDisplay struct fields (hardcoded in coreSchema, never read from struct)
- Distinguish os.ErrNotExist from permission errors in cross-file resolution; warn on parse failures instead of silently ignoring
- Fix bare dict/list JSON Schema examples in architecture docs
- Add regression tests: defaulted non-Optional field not nullable, Optional field nullable in JSON Schema, dict[str, Tensor] errors instead of silently producing SchemaAny (both top-level and inside BaseModel fields)
Four fuzz targets exercising the schema pipeline:
- FuzzResolveSchemaType: arbitrary TypeAnnotation trees through the recursive resolver
- FuzzJSONSchema: random SchemaType trees through JSON Schema rendering
- FuzzParsePredictor: arbitrary bytes as Python source through the tree-sitter parser
- FuzzParseTypeAnnotation: arbitrary return type strings in predict signatures

Includes:
- mise task 'test:fuzz' (FUZZTIME=30s per target by default)
- CI job 'fuzz-go' running 30s per target on Go changes
- Byte encoder/decoder for deterministic TypeAnnotation and SchemaType tree construction from fuzz corpus
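One way to build trees deterministically from fuzz bytes is a small recursive decoder with a depth bound. The encoding below (byte modulo 3 selects leaf, list, or dict) and the Node type are invented for this sketch and are not the PR's actual scheme.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a TypeAnnotation or SchemaType tree.
type Node struct {
	Name string
	Args []Node
}

var leaves = []string{"str", "int", "float", "bool"}

const maxDepth = 4

// decode consumes bytes from data and builds a tree, returning the unread
// remainder. The same input always yields the same tree; exhausted input or
// excessive depth yields a "str" leaf so decoding always terminates.
func decode(data []byte, depth int) (Node, []byte) {
	if len(data) == 0 || depth >= maxDepth {
		return Node{Name: "str"}, data
	}
	b, rest := data[0], data[1:]
	switch b % 3 {
	case 0:
		return Node{Name: leaves[int(b/3)%len(leaves)]}, rest
	case 1:
		child, tail := decode(rest, depth+1)
		return Node{Name: "list", Args: []Node{child}}, tail
	default:
		child, tail := decode(rest, depth+1)
		return Node{Name: "dict", Args: []Node{{Name: "str"}, child}}, tail
	}
}

// String renders the tree in Python annotation syntax.
func (n Node) String() string {
	if len(n.Args) == 0 {
		return n.Name
	}
	s := n.Name + "["
	for i, a := range n.Args {
		if i > 0 {
			s += ", "
		}
		s += a.String()
	}
	return s + "]"
}

func main() {
	tree, _ := decode([]byte{2, 1, 3}, 0)
	fmt.Println(tree)
}
```

Inside a fuzz target, a decoder like this would typically be called from the function passed to f.Fuzz with the raw []byte input, so every corpus entry maps to exactly one tree.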
* fix: make static schema generation opt-in via COG_STATIC_SCHEMA env var

The static Go tree-sitter schema generator was the default for SDK >= 0.17.0, which risks breaking builds when the parser encounters types it cannot resolve.

- Gate static schema gen behind COG_STATIC_SCHEMA=1 (or "true") env var
- Legacy runtime schema generation (boot container + python introspection) remains the default
- When opted in, gracefully fall back to legacy on ErrUnresolvableType instead of hard-failing the build
- Add unit tests for canUseStaticSchemaGen (12 table-driven cases)
- Add integration test for the static->legacy fallback path
- Update existing static/multi-file integration tests to set the env var

* docs: update schema architecture doc for opt-in static generation model

Reflect that static schema generation is opt-in via COG_STATIC_SCHEMA=1, with legacy runtime path as the default and automatic fallback.

* fix: restore legacy runtime schema generation modules for fallback path

Restore the Python modules needed by the legacy runtime schema generation path (python -m cog.command.openapi_schema). These were deleted in 61eedf3 but are needed as the fallback when the static Go parser encounters types it cannot resolve.

Restored modules: _adt, _inspector, _schemas, coder, config, errors, mode, suppress_output, command/__init__, command/openapi_schema.

config.py is trimmed to only what openapi_schema.py needs (removed get_predictor_types which depended on deleted get_predict/get_train).

* fix: simplify schema gen gating - let coglet handle missing schema for train/serve

Remove the skipLabels override that forced static schema gen for cog train/serve paths. Now useStatic is purely opt-in via COG_STATIC_SCHEMA=1 for all commands.

For train/serve without the env var, no schema is generated at build time. Coglet gracefully handles this (warns and accepts all input). These are local-only images that don't need strict schema validation.

Also improves the static_schema_fallback integration test to use a realistic mypackage/__init__.py scenario instead of a plain class.

* fix: restore static gen for train/serve paths, handle pydantic v2 in legacy inspector

Two IT failures:

1. training_setup: cog train needs schema for -i flag parsing. The CLI fetches it from coglet's /openapi.json, which returns 503 when no schema file exists. Re-enable static gen for skipLabels paths (same as main) since there's no post-build legacy fallback for these.
2. pydantic2_output: the legacy runtime inspector (_inspector.py) didn't handle pydantic v2 BaseModel as output types, only cog.BaseModel. Add conditional pydantic.BaseModel check with model_fields iteration.

* fix: suppress pyright reportMissingImports for optional pydantic import

* fix: flatten nested error handling to early returns, case-insensitive env var check
…ands

Previously, cog train/predict/serve (skipLabels=true) always used static schema generation, bypassing the COG_STATIC_SCHEMA env var check. Now all commands require COG_STATIC_SCHEMA=1 to enable static schema generation.
…s path)

The skipLabels optimization (8b1c141) skipped the entire post-build phase including legacy schema generation. This broke cog train/predict/serve which need the schema for -i flag parsing and input validation.

Move legacy schema gen above the skipLabels early return and add a minimal second Docker build that bundles only the schema file (no labels, pip freeze, or git info).

Restore the sourceDir parameter on GenerateOpenAPISchema so ExcludeSource builds can volume-mount the project directory for Python introspection.

Re-enable the train_basic and training_setup integration tests that were temporarily skipped.
1e78590 to 575454e
Summary
This PR lands the new static schema generation engine (recursive schema types + cross-file model resolution), but gates rollout for 0.17.0 so we can ship safely and enable by default in 0.18.0.

Rollout behavior
- cog build: legacy runtime schema generation remains the default in 0.17.0; static generation is opt-in via COG_STATIC_SCHEMA=1
- cog train / cog predict / cog serve: static generation likewise requires COG_STATIC_SCHEMA=1; otherwise the legacy runtime path (python -m cog.command.openapi_schema) is used (… skipLabels performance optimization)

What's included
- Recursive SchemaType ADT for nested dict/list output shapes
- Cross-file BaseModel resolution (imports, aliases, subpackages)
- … skipLabels paths so cog train/predict/serve work without COG_STATIC_SCHEMA=1
- The skipLabels optimization (8b1c141) inadvertently skipped schema generation, not just label-adding
- Legacy schema generation moved above the skipLabels early return
- GenerateOpenAPISchema restored sourceDir param to volume-mount project dir for ExcludeSource builds

Why this rollout
- … 0.17.0 release without changing default cog build behavior
- … 0.18.0

Follow-up for 0.18.0