feat: add static schema generation with legacy fallback (0.17 gated) #2788

Merged
michaeldwan merged 12 commits into main from fix/dict-list-output-schema on Mar 6, 2026

@tempusfrangit
Member

@tempusfrangit tempusfrangit commented Feb 28, 2026

Summary

This PR lands the new static schema generation engine (recursive schema types + cross-file model resolution), but gates rollout for 0.17.0 so we can ship safely and enable by default in 0.18.0.

Rollout behavior

  • cog build:
    • Uses legacy runtime schema generation by default in 0.17.0
    • Static generation is opt-in via COG_STATIC_SCHEMA=1
    • Falls back to legacy runtime generation on unresolvable static types
  • cog train/cog predict/cog serve:
    • Static generation is opt-in via COG_STATIC_SCHEMA=1
    • Without it, the legacy runtime schema generation runs (boots the built image and runs python -m cog.command.openapi_schema)
    • The schema is bundled into the image via a minimal second Docker build (no labels, pip freeze, or git info — preserving the skipLabels performance optimization)

What's included

  • Recursive SchemaType ADT for nested dict/list output shapes
  • Cross-file BaseModel resolution (imports, aliases, subpackages)
  • Expanded tests (unit + fuzz + integration) for nested schemas and multi-file parsing
  • Cleanup/fixes from review rounds (error handling, parser behavior, debug logging removal)
  • Fix: restore legacy schema gen for skipLabels paths so cog train/predict/serve work without COG_STATIC_SCHEMA=1
    • The skipLabels optimization (8b1c141) inadvertently skipped schema generation, not just label-adding
    • Legacy schema gen now runs before the skipLabels early return
    • GenerateOpenAPISchema restored sourceDir param to volume-mount project dir for ExcludeSource builds

Why this rollout

  • Unblocks 0.17.0 release without changing default cog build behavior
  • Lets us validate static generation in real projects before making it default in 0.18.0

Follow-up for 0.18.0

  • Remove/relax rollout gates after validation window
  • Promote static generation to default where safe
  • Keep legacy fallback for edge cases as needed

@tempusfrangit tempusfrangit requested a review from a team as a code owner February 28, 2026 00:40
@tempusfrangit tempusfrangit requested a review from bfirsh February 28, 2026 00:41
@tempusfrangit tempusfrangit enabled auto-merge (squash) February 28, 2026 00:41
@tempusfrangit tempusfrangit added this to the 0.17.0 Release milestone Feb 28, 2026
@mfainberg-cf mfainberg-cf changed the title fix: support dict and bare list as prediction output types in schema gen feat: recursive schema type system with cross-file model resolution Feb 28, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors Cog’s static schema generation to use a recursive SchemaType algebraic data type for outputs, adds cross-file BaseModel resolution when outputs are imported from local modules, and improves type-resolution error messages.

Changes:

  • Replace the legacy flat output type representation with a recursive SchemaType plus JSONSchema() generation.
  • Update the Python tree-sitter parser and generator plumbing to support cross-file model resolution via a sourceDir parameter.
  • Add/expand unit and integration tests for nested output types, cross-file imports, and new error messages.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| pkg/schema/types.go | Removes legacy OutputType and switches PredictorInfo.Output to SchemaType. |
| pkg/schema/schema_type.go | Introduces SchemaType, recursive JSON Schema generation, and ResolveSchemaType. |
| pkg/schema/python/parser.go | Adds sourceDir support and cross-file BaseModel discovery via resolveExternalModels. |
| pkg/schema/python/parser_test.go | Updates parser API usage and adds extensive tests for recursion + cross-file resolution. |
| pkg/schema/openapi.go | Uses SchemaType.JSONSchema() for the output component. |
| pkg/schema/openapi_test.go | Updates OpenAPI tests to construct outputs with SchemaType constructors. |
| pkg/schema/generator.go | Extends parser function signature to accept sourceDir and threads it through generation. |
| pkg/schema/generator_test.go | Updates generator tests for new parser signature and SchemaType output construction. |
| pkg/schema/errors.go | Adds ErrUnresolvableType and new actionable error helpers for unresolved outputs. |
| pkg/image/build.go | Minor formatting-only change in static schema generation path. |
| integration-tests/tests/multi_file_schema.txtar | New integration test validating multi-file output type resolution end-to-end. |
| architecture/02-schema.md | Documentation overhaul describing the new static parser pipeline and SchemaType system. |

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.


@bfirsh
Member

bfirsh commented Mar 2, 2026

Haven't reviewed this in detail, but I can confirm this fixes resnet in cog-examples!

@tempusfrangit tempusfrangit disabled auto-merge March 2, 2026 18:28
@tempusfrangit tempusfrangit enabled auto-merge (rebase) March 2, 2026 18:28
@tempusfrangit tempusfrangit disabled auto-merge March 3, 2026 12:16
@markphelps markphelps force-pushed the fix/dict-list-output-schema branch from c52fc0c to 0045ae9 Compare March 3, 2026 21:54
michaeldwan previously approved these changes Mar 5, 2026
Member

@michaeldwan michaeldwan left a comment

reviewed, fixes applied yesterday

@michaeldwan michaeldwan modified the milestones: 0.17.0 Release, 0.18.0 Mar 5, 2026
@michaeldwan michaeldwan changed the title feat: recursive schema type system with cross-file model resolution feat: add static schema generation with legacy fallback (0.17 gated) Mar 5, 2026
@markphelps markphelps modified the milestones: 0.18.0, 0.17.0 Release Mar 5, 2026
The tree-sitter schema parser rejected unparameterized dict and list
return types with 'unsupported type'. This broke the resnet example
and any predictor returning -> dict or -> list.

- dict/Dict map to TypeAny (JSON Schema: {"type": "object"})
- list/List map to OutputList with TypeAny (JSON Schema: {"type": "array", "items": {"type": "object"}})

This matches the old Python schema gen behavior exactly:
  dict -> Dict[str, Any] -> {"type": "object"}
  list -> List[Any] -> {"type": "array", "items": {"type": "object"}}
…nsive tests

Replace the flat OutputType system with a recursive SchemaType algebraic data
type that supports arbitrary nesting (dict[str, list[dict[str, int]]], etc.).
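
To make the idea of a recursive schema type concrete, here is a small Go sketch covering four kinds (primitive, any, array, dict). It is an illustration under stated assumptions, not the PR's pkg/schema/schema_type.go: the field names and constructors are hypothetical, and rendering a typed dict with `additionalProperties` is an assumption based on common JSON Schema practice. The `{"type": "object"}` rendering for the opaque case follows the bare `dict` mapping described in the earlier commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Kind tags which variant of the recursive type a value holds.
type Kind int

const (
	KindPrimitive Kind = iota
	KindAny
	KindArray
	KindDict
)

// SchemaType is a hypothetical, reduced version of a recursive output type.
type SchemaType struct {
	Kind      Kind
	Primitive string      // JSON Schema type name, used by KindPrimitive
	Elem      *SchemaType // item type (array) or value type (dict)
}

func Primitive(name string) SchemaType { return SchemaType{Kind: KindPrimitive, Primitive: name} }
func Any() SchemaType                  { return SchemaType{Kind: KindAny} }
func Array(elem SchemaType) SchemaType { return SchemaType{Kind: KindArray, Elem: &elem} }
func Dict(value SchemaType) SchemaType { return SchemaType{Kind: KindDict, Elem: &value} }

// JSONSchema renders the type by recursing into element types.
func (s SchemaType) JSONSchema() map[string]any {
	switch s.Kind {
	case KindPrimitive:
		return map[string]any{"type": s.Primitive}
	case KindArray:
		return map[string]any{"type": "array", "items": s.Elem.JSONSchema()}
	case KindDict:
		// Assumption: typed dict values are expressed via additionalProperties.
		return map[string]any{"type": "object", "additionalProperties": s.Elem.JSONSchema()}
	default:
		// Opaque value, matching the bare dict rendering.
		return map[string]any{"type": "object"}
	}
}

func main() {
	// dict[str, list[dict[str, int]]]
	nested := Dict(Array(Dict(Primitive("integer"))))
	// A bare list, modeled as an array of opaque values.
	bareList := Array(Any())
	for _, t := range []SchemaType{nested, bareList} {
		out, err := json.Marshal(t.JSONSchema())
		if err != nil {
			panic(err)
		}
		fmt.Println(string(out))
	}
}
```

Because each constructor takes another `SchemaType`, nesting depth is limited only by the annotation being resolved, which is what a flat representation could not express.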

Key changes:
- SchemaType ADT with 7 kinds: Primitive, Any, Array, Dict, Object, Iterator, ConcatIterator
- ResolveSchemaType: recursive resolver replacing ResolveOutputType
- Cross-file model resolution: imports from local .py files are found on disk,
  parsed with tree-sitter, and BaseModel subclasses extracted automatically
- Handles all local import permutations: relative, dotted, subpackage, aliased
- Clear error messages for unresolvable types (includes import source and guidance)
- Remove legacy OutputType, OutputKind, ObjectField, ResolveOutputType
- Thread sourceDir through Parser -> ParsePredictor for filesystem access
- Rewrite architecture/02-schema.md for the static Go parser

Tests: 93 unit tests (12 recursive nesting, 5 unresolvable errors, 3 pydantic
compat, 11 cross-file resolution, 1 end-to-end schema gen) + 1 integration test
…ursive model fields

- Propagate errors from dict value type resolution instead of silently
  falling back to opaque SchemaAny (dict[str, Tensor] now errors)
- Extract UnwrapOptional helper used by ResolveFieldType,
  resolveUnionSchemaType, and resolveFieldSchemaType (3 callsites)
- resolveModelToSchemaType now uses ResolveSchemaType via
  resolveFieldSchemaType, supporting dict/nested types inside BaseModel
  fields (previously limited to primitives, Optional[T], List[T])
- Fix stale comments: Optional rejected not nullable, Iterator allows
  nested types
- Fix garbled unicode in architecture docs, fix SchemaAny table entry
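
A helper like the `UnwrapOptional` mentioned above could look roughly like this in Go. Only the helper's name comes from the commit message; the `TypeAnnotation` struct, the signature, and the handling of `Union[T, None]` are assumptions made for this sketch.

```go
package main

import "fmt"

// TypeAnnotation is a hypothetical parsed form of a Python type annotation,
// e.g. Optional[str] is {Name: "Optional", Args: [{Name: "str"}]}.
type TypeAnnotation struct {
	Name string
	Args []TypeAnnotation
}

// UnwrapOptional returns the inner type and true when t is Optional[T] or a
// two-member Union in which one member is None. Otherwise it returns t
// unchanged and false.
func UnwrapOptional(t TypeAnnotation) (TypeAnnotation, bool) {
	switch {
	case t.Name == "Optional" && len(t.Args) == 1:
		return t.Args[0], true
	case t.Name == "Union" && len(t.Args) == 2:
		for i, arg := range t.Args {
			if arg.Name == "None" {
				return t.Args[1-i], true
			}
		}
	}
	return t, false
}

func main() {
	optional := TypeAnnotation{Name: "Optional", Args: []TypeAnnotation{{Name: "str"}}}
	union := TypeAnnotation{Name: "Union", Args: []TypeAnnotation{{Name: "None"}, {Name: "int"}}}
	plain := TypeAnnotation{Name: "float"}
	for _, t := range []TypeAnnotation{optional, union, plain} {
		inner, ok := UnwrapOptional(t)
		fmt.Println(t.Name, "->", inner.Name, ok)
	}
}
```

Centralizing the check means every call site agrees on what counts as optional, which is the point of extracting it from three places.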
tempusfrangit and others added 8 commits March 5, 2026 11:53
- Fix nullable incorrectly set on non-required fields instead of
  field.Type.Nullable (debug: bool = False was appearing nullable)
- Remove dead CogArrayType/CogArrayDisplay struct fields (hardcoded
  in coreSchema, never read from struct)
- Distinguish os.ErrNotExist from permission errors in cross-file
  resolution; warn on parse failures instead of silently ignoring
- Fix bare dict/list JSON Schema examples in architecture docs
- Add regression tests: defaulted non-Optional field not nullable,
  Optional field nullable in JSON Schema, dict[str, Tensor] errors
  instead of silently producing SchemaAny (both top-level and inside
  BaseModel fields)
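
The distinction between a missing file and other read failures can be sketched with `errors.Is` and `os.ErrNotExist`. The function `readLocalModule` below is hypothetical and only illustrates the pattern: a missing candidate file is an expected outcome during import resolution, whereas a permission or I/O error is worth surfacing.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// readLocalModule reads a candidate module file. A missing file is reported
// as found == false with a nil error; any other failure is returned so the
// caller can warn about it instead of ignoring it.
func readLocalModule(path string) (src []byte, found bool, err error) {
	src, err = os.ReadFile(path)
	switch {
	case err == nil:
		return src, true, nil
	case errors.Is(err, os.ErrNotExist):
		return nil, false, nil
	default:
		return nil, false, fmt.Errorf("reading %s: %w", path, err)
	}
}

func main() {
	// A path that does not exist: treated as "not a local module".
	_, found, err := readLocalModule("no/such/dir/models.py")
	fmt.Println("missing file:", found, err)

	// A directory is not readable as a file, so this yields a real error.
	_, found, err = readLocalModule(".")
	fmt.Println("directory:", found, err)
}
```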
Four fuzz targets exercising the schema pipeline:
- FuzzResolveSchemaType: arbitrary TypeAnnotation trees through the
  recursive resolver
- FuzzJSONSchema: random SchemaType trees through JSON Schema rendering
- FuzzParsePredictor: arbitrary bytes as Python source through the
  tree-sitter parser
- FuzzParseTypeAnnotation: arbitrary return type strings in predict
  signatures

Includes:
- mise task 'test:fuzz' (FUZZTIME=30s per target by default)
- CI job 'fuzz-go' running 30s per target on Go changes
- Byte encoder/decoder for deterministic TypeAnnotation and SchemaType
  tree construction from fuzz corpus
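
Go's native fuzzing hands a target raw bytes, so a deterministic decoder is one way to turn those bytes into structured trees. The sketch below is an assumption about how such a decoder could work and is not the encoder/decoder from this PR: `Node`, `decode`, the four kinds, and the depth limit are all hypothetical. A fuzz target would call `decode` inside `f.Fuzz(func(t *testing.T, data []byte) { ... })` and pass the resulting tree to the code under test.

```go
package main

import "fmt"

// maxDepth bounds recursion so adversarial input cannot build huge trees.
const maxDepth = 4

// Node is a hypothetical, simplified type tree.
type Node struct {
	Kind string
	Elem *Node // element type for "array" and "dict", nil for leaves
}

// String renders the tree in a compact form such as array<dict<string>>.
func (n *Node) String() string {
	if n.Elem == nil {
		return n.Kind
	}
	return n.Kind + "<" + n.Elem.String() + ">"
}

// decode consumes bytes from data and returns a tree plus the unread bytes.
// The same input always yields the same tree, which keeps crashes found by
// the fuzzer reproducible.
func decode(data []byte, depth int) (*Node, []byte) {
	if len(data) == 0 || depth >= maxDepth {
		return &Node{Kind: "any"}, data
	}
	tag, tail := data[0], data[1:]
	switch tag % 4 {
	case 0:
		return &Node{Kind: "string"}, tail
	case 1:
		return &Node{Kind: "integer"}, tail
	case 2:
		elem, rest := decode(tail, depth+1)
		return &Node{Kind: "array", Elem: elem}, rest
	default:
		elem, rest := decode(tail, depth+1)
		return &Node{Kind: "dict", Elem: elem}, rest
	}
}

func main() {
	for _, input := range [][]byte{{2, 3, 0}, {}, {2, 2, 2, 2, 2, 2}} {
		tree, rest := decode(input, 0)
		fmt.Println(input, "->", tree, "unread:", len(rest))
	}
}
```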
* fix: make static schema generation opt-in via COG_STATIC_SCHEMA env var

The static Go tree-sitter schema generator was the default for SDK >= 0.17.0,
which risks breaking builds when the parser encounters types it cannot resolve.

- Gate static schema gen behind COG_STATIC_SCHEMA=1 (or "true") env var
- Legacy runtime schema generation (boot container + python introspection)
  remains the default
- When opted in, gracefully fall back to legacy on ErrUnresolvableType
  instead of hard-failing the build
- Add unit tests for canUseStaticSchemaGen (12 table-driven cases)
- Add integration test for the static->legacy fallback path
- Update existing static/multi-file integration tests to set the env var
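
The env var check described here might be sketched as below. `staticSchemaOptIn` is a hypothetical name, not the PR's `canUseStaticSchemaGen`, whose signature and any further conditions are not shown on this page. The sketch accepts `1` and `true`, compares `true` case-insensitively (as a later commit in this PR describes), and assumes every other value leaves static generation disabled.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// staticSchemaOptIn reports whether a COG_STATIC_SCHEMA value opts in to
// static schema generation. Only "1" and "true" (any casing) count.
func staticSchemaOptIn(value string) bool {
	return value == "1" || strings.EqualFold(value, "true")
}

func main() {
	fmt.Println("from environment:", staticSchemaOptIn(os.Getenv("COG_STATIC_SCHEMA")))

	// Table-driven cases in the style the commit message mentions.
	cases := []struct {
		value string
		want  bool
	}{
		{"1", true},
		{"true", true},
		{"TRUE", true},
		{"", false},
		{"0", false},
		{"yes", false},
	}
	for _, c := range cases {
		got := staticSchemaOptIn(c.value)
		fmt.Printf("%q -> %v (want %v)\n", c.value, got, c.want)
	}
}
```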

* docs: update schema architecture doc for opt-in static generation model

Reflect that static schema generation is opt-in via COG_STATIC_SCHEMA=1,
with legacy runtime path as the default and automatic fallback.

* fix: restore legacy runtime schema generation modules for fallback path

Restore the Python modules needed by the legacy runtime schema generation
path (python -m cog.command.openapi_schema). These were deleted in 61eedf3
but are needed as the fallback when the static Go parser encounters types
it cannot resolve.

Restored modules: _adt, _inspector, _schemas, coder, config, errors, mode,
suppress_output, command/__init__, command/openapi_schema.

config.py is trimmed to only what openapi_schema.py needs (removed
get_predictor_types which depended on deleted get_predict/get_train).

* fix: simplify schema gen gating — let coglet handle missing schema for train/serve

Remove the skipLabels override that forced static schema gen for
cog train/serve paths. Now useStatic is purely opt-in via
COG_STATIC_SCHEMA=1 for all commands.

For train/serve without the env var, no schema is generated at
build time. Coglet gracefully handles this (warns and accepts all
input). These are local-only images that don't need strict schema
validation.

Also improves the static_schema_fallback integration test to use
a realistic mypackage/__init__.py scenario instead of a plain class.

* fix: restore static gen for train/serve paths, handle pydantic v2 in legacy inspector

Two IT failures:

1. training_setup: cog train needs schema for -i flag parsing. The CLI
   fetches it from coglet's /openapi.json, which returns 503 when no
   schema file exists. Re-enable static gen for skipLabels paths (same
   as main) since there's no post-build legacy fallback for these.

2. pydantic2_output: the legacy runtime inspector (_inspector.py) didn't
   handle pydantic v2 BaseModel as output types — only cog.BaseModel.
   Add conditional pydantic.BaseModel check with model_fields iteration.

* fix: suppress pyright reportMissingImports for optional pydantic import

* fix: flatten nested error handling to early returns, case-insensitive env var check
…ands

Previously, cog train/predict/serve (skipLabels=true) always used static
schema generation, bypassing the COG_STATIC_SCHEMA env var check. Now all
commands require COG_STATIC_SCHEMA=1 to enable static schema generation.
…s path)

The skipLabels optimization (8b1c141) skipped the entire post-build phase
including legacy schema generation. This broke cog train/predict/serve which
need the schema for -i flag parsing and input validation.

Move legacy schema gen above the skipLabels early return and add a minimal
second Docker build that bundles only the schema file (no labels, pip freeze,
or git info). Restore the sourceDir parameter on GenerateOpenAPISchema so
ExcludeSource builds can volume-mount the project directory for Python
introspection.

Re-enable the train_basic and training_setup integration tests that were
temporarily skipped.
@michaeldwan michaeldwan force-pushed the fix/dict-list-output-schema branch from 1e78590 to 575454e Compare March 5, 2026 17:53
Contributor

@markphelps markphelps left a comment

SHIP ITTTTT

@michaeldwan michaeldwan merged commit 63cf1d8 into main Mar 6, 2026
38 checks passed

5 participants