feat(compile): autorewrite front matter via detection-based codemods#476
feat(compile): autorewrite front matter via detection-based codemods#476jamesadevine wants to merge 8 commits intomainfrom
Conversation
Introduces a Django/Rails-style schema-version field on FrontMatter and a registered migration framework so breaking grammar changes auto-migrate existing user sources during compile rather than hard-failing. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Solid design, well-tested — a couple of small issues worth fixing before merge. Findings🐛 Bugs / Logic Issues
|
Address code review finding: the runner did `1 + registry.len() as u32` and `m.to_version == current + 1` without overflow checks. With realistic registries this is unreachable, but rust panics on overflow in debug mode and wraps in release. Switch to checked_add so we surface a clear error either way; preserves existing behavior on all realistic inputs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Strong design with a few correctness issues worth addressing before merge. Findings🐛 Bugs / Logic Issues
|
Four findings from the Rust PR Reviewer bot: 1. check_pipeline's bail message had a redundant "error:" prefix that produced "Error: error: ..." once anyhow's top-level handler added its own "Error:" wrapper. Drop the prefix and reformat the hint onto an indented continuation line. 2. compile_pipeline_inner did a redundant serde_yaml::from_value round-trip just to satisfy ParsedSource.front_matter, even though perform_source_rewrite_if_needed never reads that field. Refactor the helper to take the four primitive fields it actually uses (mapping, body_raw, source_sha256, migrations) instead of the full struct. reconstruct_source likewise takes individual fields now. 3. Misleading comment in tests/migration_tests.rs claimed a "thread-local counter" was used; only a nanosecond timestamp was. Replaced rand_suffix() with timestamp + AtomicU64 monotonic seq so parallel tests scheduled in the same nanosecond always get distinct dirs. 4. Leading whitespace before the opening `---` was silently stripped on migration rewrite. parse_markdown_detailed already tolerates it on read; capture it as a new ParsedSource.leading_whitespace field and emit it from reconstruct_source so byte-faithful preservation extends to whitespace prefixes (BOM-stripped editor blank lines, etc.). All 1369 tests still pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
0545068 to
72e97c7
Compare
🔍 Rust PR ReviewSummary: Solid architecture and excellent test coverage — one real bug in Findings🐛 Bugs / Logic Issues
|
Four findings from a fresh PR Reviewer pass: 1. `rename_key` left the mapping partially mutated when `ConflictPolicy::Error` fired: `take_key` removed `old` before the conflict check ran, so the bail observed a half-mutated state. Refactor to be transactional — check both keys + policy first, then mutate. Document the invariant in the doc comment and extend the existing test to assert both keys are intact on error. 2. `set_schema_version` appended the field to the bottom of the mapping (default `Mapping::insert` behavior), so rewritten sources showed `schema-version` at the end of the front-matter block, diverging from the canonical template in `docs/front-matter.md`. Add a `hoist_schema_version` step inside `reconstruct_source` so the serialized output places `schema-version` first while preserving the relative order of the user's other keys. 3. Doc comment in `tests/migration_tests.rs` referenced the wrong path (`migrations/integration_test.rs`); the actual file is `migration_integration_test.rs` directly under `compile/`. 4. The `#[allow(dead_code)]` on `FrontMatter::schema_version` was noted as a code smell. Add an explicit explanatory comment that the field exists solely to satisfy `deny_unknown_fields` and that the authoritative version lives in the untyped mapping (set by the migration runner before typed deserialization). All 1370 tests pass. New test: `reconstruct_source_hoists_schema_version_to_top`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Well-designed, production-ready migration framework — architecture is sound, error handling is thorough, and the test pyramid (unit → white-box integration → black-box CLI) gives strong confidence. A few observations worth noting. Findings
|
…codemods Aligns with gh-aw's codemod model. The previous design (versioned chain + `schema-version` field on user front matter) leaked an internal compiler concern into user files. gh-aw — the project we explicitly say we're inspired by in AGENTS.md — uses 41 codemods with no version stamping; users never see a `schema-version` field. This refactor brings ado-aw to the same model while keeping our ergonomic compile-time autorewrite UX (gh-aw uses a separate `gh aw fix --write` command, we don't). Changes: - Drop FrontMatter.schema_version field, default_schema_version fn, and the SCHEMA_VERSION_KEY const. - Rename src/compile/migrations/ to src/compile/codemods/. Rename src/compile/migration_integration_test.rs to src/compile/codemod_integration_test.rs. Rename tests/migration_tests.rs to tests/codemod_tests.rs. Rename docs/migrations.md to docs/codemods.md (rewritten from scratch). - Replace Migration / MigrationContext / MigrationReport / AppliedMigration / MIGRATIONS with Codemod / CodemodContext / CodemodReport / AppliedCodemod / CODEMODS. - Codemod.apply signature changes from `fn(&mut Mapping, &Ctx) -> Result<()>` to `fn(&mut Mapping, &Ctx) -> Result<bool>` so each codemod reports whether it actually modified the mapping. - Simplify the runner: flat per-codemod loop, no version arithmetic, no contiguity checks, no overflow guards, no future-version rejection. - Drop hoist_schema_version from reconstruct_source — no schema-version field to hoist. - Update parse_markdown_detailed, compile_pipeline_inner, perform_source_rewrite_if_needed, check_pipeline, run_execute, and compile_all_pipelines to use the codemod types and reword their messages. - Helpers (take_key, insert_no_overwrite, rename_key, ConflictPolicy) are unchanged — they're as useful for codemods as they were for migrations. rename_key already returned Result<bool>, matching the new codemod contract. - Drop ~13 schema-version-specific tests (registry contiguity, future version, non-integer, etc.). Add 7 codemod-runner tests (empty registry, all-no-op, mixed, error context, idempotency). Black-box CLI tests reduced from 10 to 3 (deleted tests asserted on schema-version-specific behavior that no longer exists). PR #476 has not merged; this lands as a follow-up commit on the same branch so users never see the schema-version concept ship. All 1349 tests pass. PR title and body to be updated separately. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Looks good overall — the framework design is sound, error handling is thorough, and test coverage is comprehensive. Two concrete issues worth addressing before merge. Findings🐛 Bugs / Logic Issues
|
Two findings + two suggestions from the latest Rust PR Reviewer pass: 1. Naming clash in `perform_source_rewrite_if_needed`: the parameter `codemods: &codemods::CodemodReport` shadowed the sibling `codemods` module. Functionally fine (the module isn't used inside the function body) but a future caller adding a `codemods::something()` reference would hit a confusing type error. Renamed to `report`. 2. `docs/codemods.md` claimed "Front-matter key order is preserved" without qualification. Renamed keys move to the END of the front-matter block because `Mapping::insert` appends new keys when absent. Updated the doc to call this out, and added the same note to the stderr warning emitted when codemods rewrite the source. 3. Marked the two registry-health tests (`registry_ids_are_unique`, `codemod_filenames_match_registry_count`) as vacuously true while CODEMODS is empty, so future reviewers don't mistake the green test for a meaningful invariant today. 4. Added a comment on the `pub use helpers::*` re-export and on the helper-level `#[allow(dead_code)]`s explaining they're defensive against the empty-registry warning surface and should be stripped once the first real codemod lands. Plus, while in the docs, added an "unsanitized input" invariant to the codemod author contract: the runner sanitizes the typed `FrontMatter` AFTER codemods run, so the raw `Mapping` codemods receive is whatever the user wrote. Codemods should treat values as opaque rather than parse/interpolate them. All 1349 tests pass. No behavior changes outside the docs and the warning text. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Looks good overall — the framework is well-designed and well-tested. Two issues worth addressing before merge. Findings🐛 Bugs / Logic Issues
|
Two findings + three suggestions from the latest Rust PR Reviewer pass:
1. Stage-3 typed-deserialize errors after codemods reused the same
"Failed to parse YAML front matter" context as the Stage-1 raw
parse, so a codemod that produced an invalid shape was
indistinguishable from a user typo. Switch to with_context that
surfaces the applied codemod ids when codemods ran:
"Failed to parse YAML front matter after applying codemods (id1, id2);
a codemod likely produced an invalid shape".
2. rename_key with ConflictPolicy::PreferNew returns Ok(true) when
it drops a stale `old` key even though `new` is unchanged. That
triggers a source-rewrite warning saying "we migrated" when
really we "cleaned up a remnant". The behavior is correct (the
mapping did mutate); the doc was misleading. Rewrote the
rename_key doc comment to enumerate exactly when Ok(true) fires
and call out the PreferNew/cleanup case.
3. Added `// TODO(codemods): remove when the first real codemod is
registered.` markers next to every #[allow(dead_code)] /
#[allow(unused_imports)] in src/compile/codemods/{mod,helpers}.rs
so search-based cleanup is trivial when the registry stops being
empty.
4. tests/codemod_tests.rs leaked temp dirs on assertion failure
because `let _ = fs::remove_dir_all(&dir)` only ran on the happy
path. Refactored to return tempfile::TempDir from the helpers
and rely on RAII cleanup, matching the pattern already used by
src/compile/codemod_integration_test.rs.
Skipped: the empty-mapping concern in reconstruct_source (a
mapping reduced to empty wouldn't deserialize as FrontMatter
anyway because of required name/description fields, so the
existing typed-deserialize error path catches it; #1 makes that
error informative).
All 1349 tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Solid, well-structured framework — one real bug to fix before merge, plus a few minor observations. Findings🐛 Bugs / Logic Issues
|
Two fixes from the latest Rust PR Reviewer pass:
1. atomic_write created the tempfile in std::env::temp_dir() when
the destination path had no directory component (e.g. a bare
"agent.md" run from cwd). On Linux when /tmp is a tmpfs mount,
the subsequent persist() rename fails with EXDEV because rename
cannot cross filesystems. Fix: when path.parent() returns
None or Some("") (the bare-filename case), fall back to "." so
the tempfile lives on the same filesystem as the destination.
Adds a pure-logic regression test for the parent-dir derivation
that doesn't mutate process cwd (would race with parallel
tests).
2. reconstruct_source's `format!("---\n{}---{}", ...)` template
relies on serde_yaml::to_string ending the output with `\n`.
This holds in practice but is an undocumented invariant. Add a
defensive `ensure!` so a future serde_yaml change surfaces a
loud, actionable error rather than silently producing malformed
YAML.
Skipped: the redundant `'static` annotation on `CODEMODS` (cosmetic,
reviewer marked as minor).
All 1350 tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
🔍 Rust PR ReviewSummary: Looks good — well-engineered framework with sound design, comprehensive tests, and one minor suggestion for the helper API docs. Findings✅ What Looks Good
|
Summary
Every breaking change to the front-matter grammar (e.g.
engine: claude-opus-4.5→engine: { id: copilot, model: ... },read-only-service-connection→permissions:,upload-build-artifact→upload-build-attachment) currently produces a hard YAML deserialization error for users on older sources. They have to upgrade by hand before the compiler will even read their file.This PR introduces a codemod framework — flat, append-only, detection-based transformations that rewrite deprecated front-matter shapes to current ones. The model mirrors gh-aw's
pkg/cli/fix_codemods.go; the project we explicitly say we're inspired by inAGENTS.md.The compiler runs the registry in memory during
compile, then — only on a fully successful compile — atomically rewrites the source.mdand emits a clear stderr warning. Failed compiles leave the source byte-identical. There is no schema-version field on user front matter; users never see the framework's bookkeeping.The framework ships with an empty registry; the historical breaking changes can land in dedicated follow-up PRs. See
docs/codemods.mdfor a worked example (engine_id_split).What's in
Codemoddata model:{ id, summary, introduced_in, apply: fn(&mut Mapping, &Ctx) -> Result<bool> }.applyreturnstruewhen it modified the mapping,falsefor no-op (mapping doesn't match),Errfor hard failure.CODEMODS: &[&'static Codemod]static registry insrc/compile/codemods/mod.rs. Append-only; order matters when codemods compose.CodemodContextis#[non_exhaustive]for forward-compat.take_key,insert_no_overwrite,rename_key(ConflictPolicy). Default policy is error (no silent overwrite).rename_keyis transactional — mapping unchanged on the error path.parse_markdown_detailed): untypedMapping→ run codemods → typedFrontMatter. Rejects non-mapping top-level YAML. Body preserved byte-for-byte. Leading whitespace before the opening---preserved..lock.ymlwrites.checkcommand bails with<path> contains deprecated front-matter shapes that need codemod fixes. hint: runado-aw compileto apply the codemods in place.when codemods would fire. Rationale: in-pipeline integrity check pins to its own ado-aw version, so failing locally only impacts a developer running newer ado-aw against an older source — exactly when we want to fail loudly.execute(Stage 3) runs codemods in memory only vialog::warn!; never rewrites source.compile_all_pipelinessummary line includes; N source file(s) rewritten by codemods.when any rewrite occurred.Why codemods (not version-stamped migrations)?
Earlier in the branch's history we shipped a Django/Rails-style versioned migration chain with a
schema-version: <u32>field onFrontMatter. After landing it through review, we noticed gh-aw — our explicit reference — uses codemods with no version stamping. We aligned to keep:gh aw fix --writefor our users).Trade-offs
serde_yamlround-trip drops them). Body markdown is preserved byte-for-byte. Documented prominently in the warning anddocs/codemods.md.serde_yaml's default formatting; quote/scalar styles may be normalized. The skip-rewrite condition (new_content != original) avoids gratuitous diffs when no semantic change occurred.docs/codemods.md).Test plan
cargo buildclean.cargo testruns 1349 tests, all passing across four layers:src/compile/codemods/{mod.rs,helpers.rs}): registry id uniqueness, filename-prefix sanity, runner empty/all-no-op/mixed/error-context/idempotency paths, helper conflict policies, transactionalrename_key.src/compile/{types.rs,common.rs}):parse_markdown_detailedbyte preservation across trailing-newline / no-trailing-newline / blank-lines variants, leading whitespace tolerance, non-mapping top-level rejection, source SHA-256 stability,atomic_writesymlink replacement and Unix mode preservation.src/compile/codemod_integration_test.rs): exercise the rewrite path end-to-end using a stubCodemodregistry injected via crate-privatecompile_pipeline_with_registry/parse_markdown_detailed_with_registryhooks. Covers stale-source rewrite + body byte-preservation, no-op skip when no codemods apply, no-op skip when codemod returnsfalse, lost-update guard fires on concurrent edits.tests/codemod_tests.rs): healthy compile of a current source (verifies no rewrite, source byte-identical), fullcompile→checkround-trip, non-mapping top-level YAML rejection.Manual smoke test:
No new clippy warnings beyond the 4 pre-existing ones on
main.