feat: storyboard-driven testing with stateless CLI #424
Merged
Storyboards become the primary testing concept, replacing scenario-based
comply retrofitting. Each YAML step maps directly to a SingleAgentClient
method via a stateless execution engine designed for LLM consumption.
- Add src/lib/testing/storyboard/ module (types, runner, loader, validations, context, task-map)
- Add CLI: adcp storyboard {list, show, run, step} with --json output
- Bundle 12 storyboard YAMLs from adcontextprotocol/adcp spec repo
- Tag storyboards with platform_types for backwards compat with PlatformType
- Add yaml as runtime dependency for YAML parsing
Closes #423
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Storyboards are now the single testing surface for comply(). The compliance engine runs storyboard YAMLs instead of hand-written scenario functions, while maintaining the existing ComplianceResult interface for backwards compatibility.
YAML format extensions:
- expect_error: inverts pass/fail for negative testing
- requires_tool: skip steps when agent lacks a tool
- context_outputs/context_inputs: explicit data flow between steps
- error_code validation: check error codes in error responses
- track/required_tools: map storyboards to compliance tracks
New compliance storyboards (10):
- governance_property_lists, governance_content_standards
- si_session, brand_rights, media_buy_state_machine
- error_compliance, schema_validation, behavioral_analysis
- audience_sync, deterministic_testing
Deprecates SCENARIO_REQUIREMENTS, DEFAULT_SCENARIOS, testAllScenarios() in favor of storyboard execution. Old exports remain functional.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
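The expect_error and requires_tool extensions can be illustrated with a minimal sketch. The field names expect_error, requires_tool, and error_code come from the commit message; the StepSpec and TaskOutcome shapes, the evaluateStep function, and the placement of error_code on the step are assumptions made for illustration, not the runner's actual code.

```typescript
// Hypothetical shapes; only the YAML field names are taken from the commit message.
type StepStatus = "passed" | "failed" | "skipped";

interface StepSpec {
  id: string;
  expect_error?: boolean; // invert pass/fail for negative tests
  requires_tool?: string; // skip when the agent does not expose this tool
  error_code?: string;    // assumed placement of the expected error code
}

interface TaskOutcome {
  ok: boolean;
  errorCode?: string;
}

function evaluateStep(
  step: StepSpec,
  agentTools: Set<string>,
  run: () => TaskOutcome,
): StepStatus {
  // Skip before calling the agent at all if a required tool is missing.
  if (step.requires_tool && !agentTools.has(step.requires_tool)) return "skipped";

  const outcome = run();
  if (!step.expect_error) return outcome.ok ? "passed" : "failed";

  // Negative test: a successful call is a failure.
  if (outcome.ok) return "failed";
  // If an error code is expected, it has to match the one returned.
  if (step.error_code && outcome.errorCode !== step.error_code) return "failed";
  return "passed";
}
```

In this sketch a skipped step never invokes the agent, which keeps a missing tool from being reported as a failure.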
Build valid requests from discovered context (products, accounts, formats) instead of sending raw YAML sample_request payloads. Each task has a builder that mirrors how hand-written scenarios construct requests: selecting products with pricing options, building proper asset records, generating valid date ranges, etc. sample_request from YAML becomes documentation/fallback only. The runner priority is: --request override > request builder > sample_request.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
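The priority order above can be sketched as a small resolver. This is an illustration under assumptions, not the runner's code: resolveRequest, the builders registry, and the DiscoveredContext shape are hypothetical names, and the sketch assumes a builder returns undefined when the discovered context is insufficient, at which point sample_request is used.

```typescript
// All names here are hypothetical; only the priority order comes from the commit message.
type AdcpRequest = Record<string, unknown>;

interface DiscoveredContext {
  products?: { product_id: string }[];
  account_id?: string;
}

type RequestBuilder = (ctx: DiscoveredContext) => AdcpRequest | undefined;

interface ResolveInput {
  task: string;
  override?: AdcpRequest;      // e.g. parsed from the --request flag
  sampleRequest?: AdcpRequest; // sample_request from the storyboard YAML
  context: DiscoveredContext;
}

// Hypothetical builder registry keyed by task name.
const builders: Record<string, RequestBuilder> = {
  create_media_buy: (ctx) => {
    const product = ctx.products?.[0];
    // Assumption: without enough context the builder declines to build.
    if (!product || !ctx.account_id) return undefined;
    return { account_id: ctx.account_id, product_id: product.product_id };
  },
};

function resolveRequest(input: ResolveInput): { source: string; request: AdcpRequest } {
  // 1. An explicit override always wins.
  if (input.override) return { source: "override", request: input.override };
  // 2. Otherwise build from discovered context, if a builder exists and succeeds.
  const built = builders[input.task]?.(input.context);
  if (built) return { source: "builder", request: built };
  // 3. Fall back to the YAML sample_request.
  return { source: "sample_request", request: input.sampleRequest ?? {} };
}
```

Returning the source alongside the request makes it easy to report which path produced a payload when a step fails.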
- Initialize MCP session in runner before executing steps (fixes Streamable HTTP → SSE degradation after a few calls)
- Close MCP connections after standalone storyboard/step runs
- Register 20+ missing response schemas in TOOL_RESPONSE_SCHEMAS (sync_accounts, list_accounts, governance, SI, capabilities, etc.)
- Fix get_media_buy_delivery validation path: media_buys → media_buy_deliveries
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Bump @modelcontextprotocol/sdk from 1.27.1 to 1.29.0
- Retry StreamableHTTP on any StreamableHTTPError before falling back to SSE (workaround for SDK #1708 and #1852: session expiry)
- Fix sync_creatives validation path: results[0].action → creatives[0].action (matches actual SyncCreativesSuccess schema)
Test agent media_buy_seller: 8/9 pass (only sync_governance fails due to missing spec schema, see adcontextprotocol/adcp#1978)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
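The retry-before-fallback behaviour can be sketched generically. The sketch deliberately avoids the MCP SDK: connectWithFallback, the connect callbacks, the isRetryable predicate, and the retry count are all hypothetical stand-ins, and the real change may detect StreamableHTTPError and bound retries differently.

```typescript
// Generic sketch; no MCP SDK calls are used and every name is hypothetical.
type Connect<T> = () => Promise<T>;

async function connectWithFallback<T>(
  streamableHttp: Connect<T>,
  sse: Connect<T>,
  isRetryable: (err: unknown) => boolean, // e.g. "is this a StreamableHTTPError?"
  retries = 1,
): Promise<{ transport: "streamable_http" | "sse"; value: T }> {
  // One initial attempt plus `retries` further attempts on the primary transport.
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return { transport: "streamable_http", value: await streamableHttp() };
    } catch (err) {
      // Non-retryable errors go straight to the fallback.
      if (!isRetryable(err)) break;
    }
  }
  return { transport: "sse", value: await sse() };
}
```

Retrying first matters when the failure is a transient one such as an expired session: a fresh attempt on the same transport can succeed, so the client does not degrade to SSE for the rest of the run.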
Resync schemas from spec to pick up the new sync_governance response schema (adcontextprotocol/adcp#1978). Register it in TOOL_RESPONSE_SCHEMAS. Test agent media_buy_seller: 9/9 pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This was referenced Apr 7, 2026
Replace bracket assignment with Object.defineProperty so CodeQL's static analysis recognizes the prototype pollution guard. The FORBIDDEN_KEYS check was already correct but CodeQL doesn't trace through Set.has().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deduplicate parsePath (was in both context.ts and validations.ts). Move resolvePath and setPath to path.ts as the canonical location. Add isPlainObject type guard before Object.defineProperty in setPath.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
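A minimal sketch of what guarded path helpers of this kind can look like. The names parsePath, resolvePath, setPath, isPlainObject, and FORBIDDEN_KEYS come from the commit messages; the bodies, the path syntax, and the contents of FORBIDDEN_KEYS are assumptions, not the code in path.ts.

```typescript
// Assumed contents; the commits name FORBIDDEN_KEYS but do not list its entries.
const FORBIDDEN_KEYS = new Set(["__proto__", "constructor", "prototype"]);

function isPlainObject(value: unknown): value is Record<string, unknown> {
  if (typeof value !== "object" || value === null) return false;
  const proto = Object.getPrototypeOf(value);
  return proto === Object.prototype || proto === null;
}

// Assumed path syntax: "creatives[0].action" becomes ["creatives", "0", "action"].
function parsePath(path: string): string[] {
  return path.split(/[.\[\]]/).filter((segment) => segment.length > 0);
}

function resolvePath(root: unknown, path: string): unknown {
  let current: unknown = root;
  for (const key of parsePath(path)) {
    if (FORBIDDEN_KEYS.has(key)) return undefined;
    if (typeof current !== "object" || current === null) return undefined;
    // Only follow own properties, never inherited ones.
    if (!Object.prototype.hasOwnProperty.call(current, key)) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function setPath(root: Record<string, unknown>, path: string, value: unknown): boolean {
  const keys = parsePath(path);
  if (keys.some((key) => FORBIDDEN_KEYS.has(key))) return false;
  const last = keys.pop();
  if (last === undefined) return false;

  let current: unknown = root;
  for (const key of keys) {
    if (!isPlainObject(current)) return false;
    if (!Object.prototype.hasOwnProperty.call(current, key)) {
      // defineProperty instead of bracket assignment, as described above.
      Object.defineProperty(current, key, { value: {}, writable: true, enumerable: true, configurable: true });
    }
    current = current[key];
  }
  if (!isPlainObject(current)) return false;
  Object.defineProperty(current, last, { value, writable: true, enumerable: true, configurable: true });
  return true;
}
```

In this sketch setPath refuses to write through arrays or class instances, so only plain objects are ever mutated.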
Every storyboard now has a track field mapping it to a ComplianceTrack, so comply() can discover and run storyboards for all 11 tracks.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- src/lib/testing/storyboard/ module: a YAML-driven testing engine where each step maps directly to a SingleAgentClient method
- adcp storyboard CLI with list, show, run, and step subcommands; the step command is stateless and designed for LLM consumption (context in, result + next preview out)
- platform_types tags for backwards compat with PlatformType
Closes #423
Architecture
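The stateless step contract (context in, result plus next-step preview out) can be sketched as follows. This is an illustration under assumptions: the Storyboard, Step, and StepResult shapes, the runStep function, and the handlers map are hypothetical, and the handlers stand in for SingleAgentClient method calls, which would be asynchronous in practice.

```typescript
// Hypothetical shapes; the real types live in src/lib/testing/storyboard/ and may differ.
type StepContext = Record<string, unknown>;

interface Step {
  id: string;
  task: string;
}

interface Storyboard {
  id: string;
  steps: Step[];
}

// Stands in for a SingleAgentClient method call; synchronous to keep the sketch short.
type TaskHandler = (context: StepContext) => StepContext;

interface StepResult {
  step_id: string;
  passed: boolean;
  context: StepContext; // pass this into the next invocation
  next?: { step_id: string; task: string };
  error?: string;
}

function runStep(
  board: Storyboard,
  stepId: string,
  context: StepContext,
  handlers: Record<string, TaskHandler>,
): StepResult {
  const step = board.steps.find((s) => s.id === stepId);
  if (!step) {
    return { step_id: stepId, passed: false, context, error: `unknown step: ${stepId}` };
  }
  const handler = handlers[step.task];
  if (!handler) {
    return { step_id: stepId, passed: false, context, error: `no handler for task: ${step.task}` };
  }
  // Preview of the step that follows, if any.
  const nextStep = board.steps[board.steps.indexOf(step) + 1];
  const next = nextStep ? { next: { step_id: nextStep.id, task: nextStep.task } } : {};
  try {
    // Nothing is kept between calls: outputs are merged into a fresh context
    // object and handed back, so the caller owns all state.
    const merged = { ...context, ...handler(context) };
    return { step_id: stepId, passed: true, context: merged, ...next };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { step_id: stepId, passed: false, context, error: message, ...next };
  }
}
```

Because each call takes the full context and returns the updated one, a caller such as an LLM can run steps one at a time and decide after each result whether to continue.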
CLI Examples
adcp storyboard list --platform-type retail_media --json
adcp storyboard show media_buy_seller
adcp storyboard step test-mcp media_buy_seller sync_accounts --json
adcp storyboard step test-mcp media_buy_seller get_products_brief \
  --context '{"account_id":"abc123"}' --json
Test plan
- npm run build:lib compiles clean
- adcp storyboard list shows 11 storyboards
- adcp storyboard list --platform-type retail_media filters correctly
- adcp storyboard show media_buy_seller displays phases and steps
- npm pack --dry-run includes storyboard YAMLs and compiled module
🤖 Generated with Claude Code