Add structured output format with assembly line mappings #14
mattgodbolt-molty wants to merge 1 commit into main from
Conversation
Adds an optional 'format' field to requests: 'markdown' (default,
backward compatible) or 'structured' (JSON with assembly line ranges).
Structured format returns:
- summary: one-sentence overview
- sections: array of {title, asmStartLine, asmEndLine, content}
- keyInsight: the most important takeaway
Each section maps to specific 0-indexed assembly lines, enabling
frontends to highlight relevant assembly as the user reads each
section.
Uses Anthropic's structured output API (output_config with
json_schema) for guaranteed valid JSON. Tested with Sonnet 4.6:
line references are accurate across simple, complex, optimised,
and unoptimised examples.
Backward compatible: existing clients see no change. The
structuredExplanation field is null when format is 'markdown'.
🤖 Generated by LLM (Claude, via OpenClaw)
Pull request overview
This PR adds structured output support to the Claude Explain API, enabling assembly line mappings for the Compiler Explorer frontend. The API now accepts an optional format field that can be either "markdown" (default, backward compatible) or "structured" (JSON with assembly line references).
Changes:
- Added `ExplanationFormat` enum and `format` field to `ExplainRequest`, with `"markdown"` as default
- Introduced `StructuredExplanation` and `ExplanationSection` models with assembly line mapping fields (`asmStartLine`, `asmEndLine`)
- Modified explain logic to use Anthropic's structured output API (`output_config` with JSON schema) for structured format requests
- Added comprehensive test coverage for both structured and markdown formats
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| app/explain_api.py | Defines new data models (ExplanationFormat, StructuredExplanation, ExplanationSection) and adds format field to ExplainRequest and structuredExplanation to ExplainResponse |
| app/explain.py | Implements format-specific logic including output_config setup, system prompt modification, and response parsing for structured vs markdown formats |
| app/test_explain.py | Adds TestStructuredOutput class with 4 tests covering structured response format, output_config usage, and backward compatibility verification |
```python
if cache_provider is not None:
    cached_response = await get_cached_response(body, prompt, cache_provider)
```
The cache key generation doesn't account for the format field, which means structured and markdown format requests with identical inputs will share the same cache key. This could return cached responses in the wrong format.
For example, if a markdown response is cached first, a subsequent structured format request with the same inputs will receive the cached markdown text, which will then fail JSON validation at line 135 when calling StructuredExplanation.model_validate_json().
The format field must be included in the cache key calculation in cache.py's generate_cache_key function. Since the format affects the actual API parameters (system prompt via line 116-117, messages via line 115, and output_config via lines 120-125), these differences should be reflected in the cache key.
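A minimal sketch of the fix, assuming a `generate_cache_key(body, prompt)` helper along these lines lives in cache.py (the helper's shape and the hashing scheme here are illustrative, not the project's actual implementation):

```python
import hashlib
import json


def generate_cache_key(body: dict, prompt: str) -> str:
    """Build a cache key that distinguishes output formats.

    Including the requested format ensures a cached markdown response is
    never served to a structured-format request with the same inputs.
    """
    key_material = {
        "prompt": prompt,
        # Default mirrors the API's default so pre-existing cache entries
        # behave like explicit markdown requests.
        "format": body.get("format", "markdown"),
    }
    digest = hashlib.sha256(json.dumps(key_material, sort_keys=True).encode())
    return digest.hexdigest()
```

With this, `{"format": "structured"}` and an omitted `format` hash to different keys, while an explicit `"markdown"` matches the default.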
```python
structured: StructuredExplanation | None = None

if use_structured:
    structured = StructuredExplanation.model_validate_json(raw_text)
```
The JSON validation here can raise a ValidationError if the API returns JSON that doesn't match the StructuredExplanation schema. While Anthropic's structured output with output_config should guarantee schema-compliant JSON, it would be more robust to handle potential validation errors explicitly and return a meaningful error response rather than letting the exception propagate to FastAPI's default error handler.
Consider wrapping this in a try-except block to catch pydantic.ValidationError and return an appropriate error response with status='error'.
```python
class ExplanationSection(BaseModel):
    """A section of a structured explanation, mapped to assembly lines."""

    model_config = {"json_schema_extra": {"additionalProperties": False}}
```
The model_config setting may not correctly add additionalProperties: false to the generated JSON schema. In Pydantic v2, the json_schema_extra dict is merged with the generated schema, but the placement here might not achieve the desired effect.
To properly disallow additional properties in the JSON schema sent to Anthropic, you should verify that model_json_schema() actually includes "additionalProperties": false at the schema root level. Consider using Pydantic's ConfigDict with extra='forbid' instead, or verify the generated schema matches expectations:
```python
model_config = ConfigDict(extra='forbid')
```

This ensures both runtime validation and JSON schema generation correctly reject additional properties.
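A quick way to verify the claim, using a trimmed stand-in model (fields are illustrative): in Pydantic v2, `extra="forbid"` both rejects unknown fields at runtime and emits `"additionalProperties": false` in the generated schema.

```python
from pydantic import BaseModel, ConfigDict


class ExplanationSection(BaseModel):
    """Illustrative stand-in for the PR's model (fields trimmed)."""

    # extra="forbid" rejects unknown fields at validation time and also
    # adds "additionalProperties": false to model_json_schema().
    model_config = ConfigDict(extra="forbid")

    title: str
    content: str


schema = ExplanationSection.model_json_schema()
```

Checking `schema["additionalProperties"] is False` confirms the constraint actually reaches the JSON schema handed to Anthropic.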
```python
class StructuredExplanation(BaseModel):
    """Structured explanation with assembly line mappings."""

    model_config = {"json_schema_extra": {"additionalProperties": False}}
```
Same as above - the model_config setting may not correctly add additionalProperties: false to the generated JSON schema. Consider using ConfigDict(extra='forbid') instead to ensure both runtime validation and JSON schema generation correctly reject additional properties.
```python
api_kwargs["max_tokens"] = max(prompt_data["max_tokens"], 2048)
api_kwargs["output_config"] = {
    "format": {
        "type": "json_schema",
        "schema": StructuredExplanation.model_json_schema(),
    }
}
```
The max_tokens is increased to a minimum of 2048 for structured output to account for JSON overhead. While this is reasonable, it means structured format requests may consume more tokens (and cost more) than markdown requests with the same prompt configuration. Consider documenting this behavior in the API documentation or PR description so users are aware of the potential cost difference.
Suggested change:

```python
# Ensure enough tokens for JSON overhead in structured output. This may increase
# token usage and cost compared to markdown responses with the same prompt.
if prompt_data["max_tokens"] < 2048:
    LOGGER.info(
        "Structured output: increasing max_tokens from %s to 2048 to account for JSON "
        "overhead. This may increase token usage and cost versus markdown output.",
        prompt_data["max_tokens"],
    )
api_kwargs["max_tokens"] = max(prompt_data["max_tokens"], 2048)
api_kwargs["output_config"] = {
    "format": {
        "type": "json_schema",
        "schema": StructuredExplanation.model_json_schema(),
    }
}
```
```python
asmStartLine: int = Field(..., description="0-indexed start line in the assembly listing")
asmEndLine: int = Field(..., description="0-indexed end line (inclusive) in the assembly listing")
```
The asmStartLine and asmEndLine fields don't include validation to ensure they're within the bounds of the actual assembly array. While Anthropic's structured output should be accurate, there's no guarantee these indices won't exceed the assembly array length.
Consider adding a validator to check bounds, or at minimum document that API consumers (like the CE frontend) should validate these indices before attempting to use them for highlighting. You could add a Pydantic field validator that checks if end >= start at minimum.
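A sketch of such a validator (field names mirror the PR's model; the upper-bound check against the real assembly length still has to happen in the consumer, since the model never sees the array):

```python
from pydantic import BaseModel, Field, model_validator


class ExplanationSection(BaseModel):
    title: str
    content: str
    # ge=0 rejects negative indices at validation time.
    asmStartLine: int = Field(..., ge=0, description="0-indexed start line")
    asmEndLine: int = Field(..., ge=0, description="0-indexed end line (inclusive)")

    @model_validator(mode="after")
    def check_line_order(self) -> "ExplanationSection":
        # Guarantee a well-formed (start <= end) range; bounds against the
        # actual assembly array length are the frontend's responsibility.
        if self.asmEndLine < self.asmStartLine:
            raise ValueError("asmEndLine must be >= asmStartLine")
        return self
```

A section with `asmStartLine=5, asmEndLine=2` then fails validation instead of silently producing an inverted highlight range.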
```python
model_config = {"json_schema_extra": {"additionalProperties": False}}

summary: str = Field(..., description="One-sentence overview of what the compiler did")
sections: list[ExplanationSection] = Field(..., description="Explanation sections mapped to assembly lines")
```
The sections field doesn't have a minimum length constraint, meaning an empty list would be valid. For most assembly explanations, having at least one section seems reasonable. Consider adding a validator to ensure at least one section exists:
```python
sections: list[ExplanationSection] = Field(..., min_length=1, description="Explanation sections mapped to assembly lines")
```

This would make the schema more robust and prevent degenerate cases where the LLM returns no sections.
Suggested change:

```python
sections: list[ExplanationSection] = Field(
    ..., min_length=1, description="Explanation sections mapped to assembly lines"
)
```
Summary
Adds an optional `format` field to the explain API: `"markdown"` (default, fully backward compatible) or `"structured"` (JSON with assembly line mappings).

Structured format response

```json
{
  "structuredExplanation": {
    "summary": "GCC -O2 compiles square() into three instructions",
    "sections": [
      {
        "title": "Multiply the input",
        "asmStartLine": 1,
        "asmEndLine": 1,
        "content": "`imul edi, edi` multiplies the register by itself..."
      }
    ],
    "keyInsight": "No stack frame needed — everything stays in registers"
  }
}
```

What this enables in CE
Implementation
- `format` field on `ExplainRequest` (`"markdown"` default, `"structured"` opt-in)
- Uses Anthropic's structured output API (`output_config` with `json_schema`) for guaranteed valid JSON
- Structured requests set `output_config` and add a line indexing hint to the system prompt
- New `StructuredExplanation` and `ExplanationSection` models

Testing
Unit tests: 95 pass (91 existing + 4 new covering both paths)
- Structured format returns `structuredExplanation`, not `explanation`
- Structured format sends `output_config` to the API
- Markdown format is unchanged (no `output_config`, returns `explanation`)

Live testing: tested on 3 examples (square -O2, fibonacci -O2, unoptimised add):
Backward compatibility
Fully backward compatible:
- Default `format` is `"markdown"` — existing clients see identical responses
- `structuredExplanation` field is `null` for markdown requests
- No change to the `/options` endpoint
The CE frontend would need to:
- Send `format: "structured"` in requests
- Drive assembly highlighting through the `eventHub`

(I'm Molty, an AI assistant acting on behalf of @mattgodbolt)