TransformerLensOrg · jlarson4 · Jun 3, 2026 · Jun 2, 2026
diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md
@@ -0,0 +1,53 @@
+---
+description: Guided workflow for adding a new architecture adapter to TransformerBridge.
+argument-hint: <hf_repo>
+---
+
+Adding TransformerBridge support for HF model `$ARGUMENTS`. If empty, ask the user for the HF repo path first.
+
+Each step names the doc to read **when you reach that step**; don't load them all up front.
+
+1. **Check registry state and decide whether to verify.**
+
+   State:
+   - Architecture supported? Check `SUPPORTED_ARCHITECTURES` in [`architecture_adapter_factory.py`](../../transformer_lens/factories/architecture_adapter_factory.py).
+   - Model in registry? Check [`supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json); note `status` (0=unverified, 1=verified, 2=skipped, 3=failed).
+
+   Branch:
+
+   - **Supported AND `status==1`** → already verified. Ask the user the symptom (bug-report path, not add-support). Stop.
+   - **Supported, `status != 1`** → proceed to **Confirm before verification**. If `status==3`, read existing `note` for the prior failure mode.
+   - **Supported, not in registry** → add an entry per [§Adding the HF repo to the registry](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry) with `status: 0` and null scores, then proceed.
+   - **Not supported** → skip to step 2.
+
+   ### Confirm before verification
+
+   Always ask the user first, even for small models:
+
+   1. Dry-run to project cost:
+      ```bash
+      set -a; source .env; set +a
+      uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" --dry-run
+      ```
+   2. Show: model ID, architecture class, estimated parameters, projected memory (GB), HF_TOKEN needed?, runtime (30 s–2 min sub-1B, 2–15 min 1B–7B, 15+ min 7B+/multimodal), what verification does (Phases 1–4; updates `supported_models.json` on success).
+   3. Ask: "Run verification on this machine? (Y/N)"
+
+   **Confirm** → `/verify-model $ARGUMENTS`. On pass, done. On fail, see [debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md) (per-sibling adapter bug).
+
+   **Reject** → `gh issue create --template verify-model.md` (fill from dry-run output). No `gh`? <https://github.com/TransformerLensOrg/TransformerLens/issues/new?template=verify-model.md>. Stop.
+
+2. **Analyze the HF model.** Read `config.json` and source — identify embedding, attention, MLP, normalization, output-head layouts. Read [§Config-attr propagation](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation) and decide which non-standard attrs (`final_logit_softcapping`, `sliding_window`, etc.) need surfacing on `self.cfg`.
+
+3. **Pick a starting adapter.** See [§Starter-adapter table](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#starter-adapter-table). Copy into [`supported_architectures/`](../../transformer_lens/model_bridge/supported_architectures/) as `<arch>.py`. **Tokenizer-policy flags are per-model** — see [§Tokenizer policy](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy).
+
+4. **Fill `self.component_mapping`.** Bridge-native hook names. Reference: [§Minimal contract](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#minimal-contract), [§Common gotchas](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#common-gotchas).
+
+5. **Register in all four sites** per [§Registration steps](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#registration-steps). Then run the invariant test: `uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory`.
+
+6. **Add the HF repo entry** to [`data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json) per [§Adding the HF repo to the registry](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry). Ask the user about adding canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[<HFArchClass>]`.
+
+7. **Verify** end-to-end: `/verify-model $ARGUMENTS`. Read both `status` AND per-phase scores. `STATUS_VERIFIED` means hard gates passed (see [§Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds)) — but P4's 50% bar is intentionally lenient. P4 well below 100% on a small parity-test model + `status==1` → suspect missing `preprocess_weights` fold or wrong `default_prepend_bos`; investigate before step 8.
+
+8. **Write tests** per [§Required tests](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#required-tests) (unit + integration). Copy the closest sibling.
+
+9. **`/task-complete`** — comment cleanup, `/format`, standard test tiers, loop until clean.
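The four branches in step 1 of `add-model-support.md` amount to a small decision function. The sketch below is illustrative only and is not code from the repo: `next_action`, `Status`, and `Action` are hypothetical names, and the status codes are the ones listed in step 1 (0=unverified, 1=verified, 2=skipped, 3=failed). It assumes the caller has already checked `SUPPORTED_ARCHITECTURES` and read the registry entry's `status`, passing `None` when the model has no entry.

```python
from enum import Enum, IntEnum
from typing import Optional


class Status(IntEnum):
    """Registry status codes as listed in step 1."""
    UNVERIFIED = 0
    VERIFIED = 1
    SKIPPED = 2
    FAILED = 3


class Action(Enum):
    """Hypothetical labels for the four branches of step 1."""
    ASK_FOR_SYMPTOM = "already verified; ask the user for the symptom and stop"
    CONFIRM_VERIFICATION = "confirm with the user, then verify"
    ADD_ENTRY_THEN_CONFIRM = "add a status-0 registry entry, then confirm and verify"
    WRITE_ADAPTER = "architecture unsupported; continue at step 2"


def next_action(arch_supported: bool, status: Optional[int]) -> Action:
    """Pick the step-1 branch; `status` is None when the model is not in the registry."""
    if not arch_supported:
        return Action.WRITE_ADAPTER
    if status is None:
        return Action.ADD_ENTRY_THEN_CONFIRM
    if status == Status.VERIFIED:
        return Action.ASK_FOR_SYMPTOM
    # Unverified, skipped, or failed: all go through the confirmation prompt.
    return Action.CONFIRM_VERIFICATION
```

For example, `next_action(True, Status.FAILED)` selects the confirmation branch, which is the case where the workflow also asks the agent to read the existing `note` first.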
diff --git a/.claude/commands/build-docs.md b/.claude/commands/build-docs.md
@@ -0,0 +1,16 @@
+---
+description: Source .env then build the Sphinx docs.
+---
+
+Build the documentation locally:
+
+```bash
+set -a; source .env; set +a
+uv run build-docs
+```
+
+Sourcing `.env` is required so `HF_TOKEN` is available — some doctests and notebook embeddings load gated models. Output goes to [docs/build/](../../docs/build/).
+
+For an interactive live-reloading preview instead, run `uv run docs-hot-reload`.
+
+Docs follow Google docstring style with reST extensions; see [docs/source/content/contributing.md](../../docs/source/content/contributing.md) for the style guide.
diff --git a/.claude/commands/format.md b/.claude/commands/format.md
@@ -0,0 +1,14 @@
+---
+description: Type-check then format the working tree.
+---
+
+Run mypy first, then format. Mypy fixes (`isinstance`, `cast`, signatures) can introduce format drift — running format after means a single pass.
+
+```bash
+uv run mypy .
+make format
+```
+
+`uv run mypy .` uses the config in [pyproject.toml](../../pyproject.toml). `make format` runs `pycln --all` (unused imports), `isort`, and `black` (line length 100).
+
+If mypy reports errors, fix the underlying typing issue — never add `# type: ignore`. Prefer `isinstance` / `typing.cast` ([AGENTS.md §10](../../AGENTS.md#10-hard-rules)).
diff --git a/.claude/commands/task-complete.md b/.claude/commands/task-complete.md
@@ -0,0 +1,46 @@
+---
+description: End-of-task gate. Clean up new comments, type-check, format, and run the standard test tiers (unit + docstring + acceptance + integration), fixing issues along the way.
+---
+
+Run the end-of-task gate. Do not declare the task complete until every step below passes cleanly.
+
+### 1. Clean up new comments
+
+Review every comment and docstring **added or modified during this task** against the rules in [AGENTS.md §10](../../AGENTS.md#10-hard-rules):
+
+- Comments should be terse one-liners; docstrings are one-line where possible.
+- Inline comments explain WHY, not WHAT — delete any that just restate the code.
+- Multi-paragraph explanations belong in PR descriptions or design docs, not source.
+- Remove any references to plan files, audit IDs, finding IDs, or "see plan section X" — those rot as the codebase evolves and belong only in the PR description.
+
+Use `git diff` against the merge-base to scope the review to genuinely new comments — do NOT rewrite unrelated comments elsewhere in the file.
+
+### 2. Type-check, then format
+
+Run mypy **before** format. Mypy fixes (`isinstance`, `typing.cast`, signature changes) can introduce format drift — running format after mypy means a single format pass.
+
+```bash
+uv run mypy .
+make format
+```
+
+If mypy reports new errors, fix the underlying typing issue. Do not add `# type: ignore`.
+
+### 3. Run the standard test tiers
+
+```bash
+set -a; source .env; set +a
+make test-pr
+```
+
+`make test-pr` runs unit + docstring + acceptance + integration — the tiers that gate PR review for almost every change. Notebook and benchmark suites are intentionally skipped (slow, gated models, CI runs them separately). If your change specifically touched a notebook or a benchmark, also run that file directly (`pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/<notebook>.ipynb` or `make benchmark-test`).
+
+Investigate every failure. Do not dismiss any failure as "pre-existing" or "unrelated" — fix the underlying issue, even if it predates this task (see [AGENTS.md §10](../../AGENTS.md#10-hard-rules)). Do not add platform skips or `xfail` markers to dodge a failing test.
+
+### 4. Re-loop on failure
+
+If any step surfaces issues, fix them and restart from step 1 — fixes can reintroduce comment, format, type, or test drift.
+
+### 5. Report
+
+Report the actual final command output, not a summary. Reviewers re-run tests; agent self-reports are not evidence ([AGENTS.md §10](../../AGENTS.md#10-hard-rules)).
diff --git a/.claude/commands/test-all.md b/.claude/commands/test-all.md
@@ -0,0 +1,16 @@
+---
+description: Run the full test suite (unit + integration + acceptance + benchmark + docstring + notebook). Slow.
+---
+
+Run every test tier in TransformerLens via the top-level `make test` target:
+
+```bash
+make test
+```
+
+This is slow — it runs unit, integration, acceptance, benchmark, docstring, and notebook tests sequentially. It hits HuggingFace Hub and loads multiple models. Before running, confirm:
+
+1. `.env` is sourced so `HF_TOKEN` is set (`set -a; source .env; set +a`).
+2. No other heavy GPU/MPS jobs are running on this machine — model verification cannot run concurrently (see [AGENTS.md §10](../../AGENTS.md#10-hard-rules)).
+
+Report the actual command output. Investigate any failures rather than dismissing them.
diff --git a/.claude/commands/test-unit.md b/.claude/commands/test-unit.md
@@ -0,0 +1,11 @@
+---
+description: Run the unit test suite.
+---
+
+Run the TransformerLens unit tests:
+
+```bash
+make unit-test
+```
+
+If any test fails, investigate the failure rather than dismissing it as "pre-existing" or unrelated — see [AGENTS.md §10](../../AGENTS.md#10-hard-rules). Report the actual command output, not a summary.
diff --git a/.claude/commands/typecheck.md b/.claude/commands/typecheck.md
@@ -0,0 +1,11 @@
+---
+description: Run mypy across the project.
+---
+
+Run the type checker:
+
+```bash
+uv run mypy .
+```
+
+Config lives in `[tool.mypy]` of [pyproject.toml](../../pyproject.toml). If mypy reports errors, fix the underlying typing issue — do not add `# type: ignore`. Prefer `isinstance` assertions or `typing.cast` for narrowing.
diff --git a/.claude/commands/verify-model.md b/.claude/commands/verify-model.md
@@ -0,0 +1,69 @@
+---
+description: Run verify_models.py against a single model (non-parallel). Always dry-run first.
+argument-hint: <model_name_or_hf_repo>
+---
+
+Verify model `$ARGUMENTS`. If empty, ask for an HF repo path (e.g. `gpt2`, `meta-llama/Llama-2-7b-hf`) or registry alias.
+
+## Always dry-run first
+
+Verification loads the full model and runs Phases 1–4. Expect 30 s to 30 min, and enough free memory to hold the model. **Never invoke the real run blindly.**
+
+```bash
+set -a; source .env; set +a
+uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" --dry-run
+```
+
+Capture: estimated parameter count, projected memory (GB), HF_TOKEN requirement, architecture class.
+
+| Model | Action |
+|---|---|
+| Cached small (`gpt2`, `attn-only-*`, `tiny-stories-1M`, `distilgpt2`, …) | Proceed; report the dry-run in your response so the user can intervene |
+| ≥1B params, gated, or anything else | Present dry-run, ask before running |
+
+## Run the verification
+
+```bash
+set -a; source .env; set +a
+uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS"
+```
+
+## Optional flags
+
+Full reference: [tools/model_registry/AGENTS.md §Flag reference](../../transformer_lens/tools/model_registry/AGENTS.md#flag-reference).
+
+- `--device cpu|cuda|mps` — override device selection
+- `--dtype float32|bfloat16` — override dtype
+- `--max-memory <gb>` — skip if param estimate exceeds; e.g. `16` on a 24 GB GPU leaves headroom for activations
+- `--phases 1 2 3` — restrict (P4 is slowest; restrict when debugging P1 forward parity)
+- `--dry-run` — see above; always first
+- `--no-hf-reference` / `--no-ht-reference` — skip HF / HT comparison (faster, lower confidence)
+- `--reverify` — re-test `status==1`
+- `--retry-failed` — re-test `status==3` (read existing `note` first)
+
+Batch flags (`--architectures`, `--per-arch`, `--limit`, `--resume`) don't apply to `--model <repo>` — use [§Canonical invocations](../../transformer_lens/tools/model_registry/AGENTS.md#canonical-invocations).
+
+## Interpreting the output
+
+Phase thresholds (`_MIN_PHASE_SCORES` in `verify_models.py`); every phase is a hard gate except Phase 4:
+
+| Phase | Min score | Required tests | Below = |
+|---|---|---|---|
+| 1 | 100% | — | `STATUS_FAILED` |
+| 2 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` |
+| 3 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` |
+| 4 | 50% | — | **Non-gating** — adds `"low text quality"` to `note`; never fails. |
+| 7 | 75% | `multimodal_forward` | `STATUS_FAILED`. NULL = fail. |
+| 8 | 75% | `audio_forward` | `STATUS_FAILED`. NULL = fail. |
+
+`STATUS_VERIFIED` means hard gates passed. `note` carries quality flags or failure details.
+
+**Adapter-author caveat:** P4's 50% bar is intentionally lenient (coherence, not correctness). P4 well below 100% on a small parity-test model can indicate a real bug the system doesn't gate on — most often a missing [`preprocess_weights` fold](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#when-to-override-preprocess_weights) or wrong [`default_prepend_bos`](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy). Investigate even on VERIFIED.
+
+Full reference: [§Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds).
+
+## Hard rules
+
+**Use `verify_models`, never `main_benchmark`** — only `verify_models` writes `data/supported_models.json` ([tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md)).
+
+One model at a time — concurrent loads OOM. Report actual per-phase scores; investigate failures per [AGENTS.md §10](../../AGENTS.md#10-hard-rules).
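The gating rules in the `verify-model.md` threshold table can be read as the following sketch. It is an assumption-laden illustration, not the logic in `verify_models.py`: the dict layout, `evaluate`, `NON_GATING`, and `NULL_FAILS` are hypothetical names, only the percentages and status strings come from the table, and the "Required tests" column is deliberately left out. It also assumes a missing score is ignored for phases other than 7 and 8, which the table does not state.

```python
from typing import Dict, List, Optional, Tuple

# Thresholds copied from the table; this layout is hypothetical and is not the
# real `_MIN_PHASE_SCORES` structure.
MIN_SCORE = {1: 100.0, 2: 75.0, 3: 75.0, 4: 50.0, 7: 75.0, 8: 75.0}
NON_GATING = {4}     # below the bar only adds a note
NULL_FAILS = {7, 8}  # a missing score counts as a failure


def evaluate(scores: Dict[int, Optional[float]]) -> Tuple[str, List[str]]:
    """Return (status, notes) for the phases present in `scores` (percentages)."""
    notes: List[str] = []
    failed = False
    for phase, score in sorted(scores.items()):
        if score is None:
            # Assumption: only phases 7 and 8 treat a missing score as a failure.
            if phase in NULL_FAILS:
                failed = True
                notes.append(f"P{phase}: no score")
            continue
        if score >= MIN_SCORE[phase]:
            continue
        if phase in NON_GATING:
            notes.append("low text quality")
        else:
            failed = True
            notes.append(f"P{phase}: {score:.0f}% < {MIN_SCORE[phase]:.0f}%")
    return ("STATUS_FAILED" if failed else "STATUS_VERIFIED"), notes
```

Under these assumptions, a run with P1 to P3 passing and P4 at 40% still comes back as `STATUS_VERIFIED` with a `"low text quality"` note, which is the situation the adapter-author caveat asks you to investigate.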
diff --git a/.claude/settings.json b/.claude/settings.json
@@ -0,0 +1,6 @@
+{
+  "permissions": {
+    "allow": [],
+    "deny": []
+  }
+}
diff --git a/.cursor/rules/transformerlens.mdc b/.cursor/rules/transformerlens.mdc
@@ -0,0 +1,21 @@
+---
+description: TransformerLens project conventions for Cursor agents.
+alwaysApply: true
+---
+
+Read `AGENTS.md` at the repo root before doing any work. It is the single source of truth for project conventions, quickstart commands, repo layout, hook-naming rules, the HookedTransformer ↔ TransformerBridge mirroring rule, PR conventions, and hard rules.
+
+Sub-folder `AGENTS.md` files apply when you're working in those directories — read them too:
+
+- `tests/AGENTS.md` — tier placement, conftest hierarchy, MPS rules
+- `transformer_lens/model_bridge/supported_architectures/AGENTS.md` — adapter contract, starter-adapter table, 4-place registration
+- `transformer_lens/tools/model_registry/AGENTS.md` — `verify_models` workflow, the `main_benchmark` trap
+
+Quick reminders that override common defaults:
+
+- Use `uv`, not `pip` or `poetry`. Install with `uv sync`; run commands with `uv run …` or `make` targets.
+- This repo has two parallel systems (`HookedTransformer` legacy and `TransformerBridge` v3). Changes to HookedTransformer that have equivalents in TransformerBridge must be mirrored to TransformerBridge.
+- Base PRs against `dev`, not `main`. Never name a branch `main` or `dev`.
+- No pre-commit hook is installed. Run `make format` and `uv run mypy .` manually before push.
+- Source `.env` (e.g. `set -a; source .env; set +a`) before any HuggingFace-Hub-hitting command.
+- Never add `# type: ignore`, never dismiss failing tests as "pre-existing", never add platform skips to dodge CI, never claim drift is "fp noise" without empirical evidence.
diff --git a/.env.example b/.env.example
@@ -0,0 +1,20 @@
+# TransformerLens environment variables.
+# Copy to .env (gitignored) and fill in. Then source before HF-Hub-hitting commands:
+#   set -a; source .env; set +a
+
+# HuggingFace access token. Required for gated models: Llama, Mistral, Gemma,
+# gated Qwen variants, etc. Create at https://huggingface.co/settings/tokens
+HF_TOKEN=
+
+# Optional: retry HuggingFace Hub 429s. Only matters for ad-hoc non-pytest scripts
+# (docs build, demo notebooks, one-off `python -m ...` invocations). The pytest
+# suite already enables retries unconditionally via tests/conftest.py's
+# _enable_hf_retry_for_tests fixture, and CI sets this env var in checks.yml.
+# TRANSFORMERLENS_HF_RETRY=1
+
+# Optional: allow MPS device on macOS (off by default to avoid hard-to-debug
+# divergence from CUDA / CPU). CI's mps-checks job sets this.
+# TRANSFORMERLENS_ALLOW_MPS=1
+
+# Optional: silence the tokenizers parallelism warning when running under uv.
+# TOKENIZERS_PARALLELISM=false
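Because `.env.example` ships with an empty `HF_TOKEN=`, copying it to `.env` and sourcing it without filling it in leaves the variable set but empty. A small preflight check can catch that before a gated-model command runs. This is a sketch under that assumption; `missing_vars` is a hypothetical helper and is not part of TransformerLens.

```python
import os
from typing import List, Mapping, Sequence


def missing_vars(
    required: Sequence[str], environ: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the names in `required` that are unset or empty in `environ`."""
    # An empty string counts as missing, which covers an unfilled `HF_TOKEN=` line.
    return [name for name in required if not environ.get(name)]
```

A caller might run `missing_vars(["HF_TOKEN"])` after sourcing `.env` and stop early if the returned list is non-empty.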
diff --git a/.github/ISSUE_TEMPLATE/verify-model.md b/.github/ISSUE_TEMPLATE/verify-model.md
@@ -0,0 +1,51 @@
+---
+name: Verify model support
+about: Request that a maintainer run verify_models on a model the requester lacks the hardware or access to verify
+title: "[Verify Model] org/model-id"
+labels: verification-request
+
+---
+
+<!--
+File this issue when you've followed the /add-model-support workflow but couldn't run verification yourself (insufficient memory / no GPU / gated model without HF_TOKEN access). A maintainer with appropriate hardware will pick it up.
+
+The architecture adapter must already exist for this template to fit. If it doesn't, file a feature request / proposal instead.
+-->
+
+## Model
+
+- HF repo: `https://huggingface.co/REPLACE_WITH_MODEL_ID`
+- Architecture class (from `config.architectures[0]`): `REPLACE_WITH_HF_ARCH_CLASS`
+- Estimated parameters: `REPLACE_WITH_PARAM_COUNT`
+- Projected memory: `REPLACE_WITH_GB` GB (from `verify_models --dry-run`)
+- Gated repo: `yes / no` (HF_TOKEN required: `yes / no`)
+
+## Registry state
+
+- Architecture adapter exists: `yes` (file: `transformer_lens/model_bridge/supported_architectures/REPLACE_WITH_ADAPTER.py`)
+- Currently in `data/supported_models.json`:
+  - [ ] No — entry added in the PR linked from this issue
+  - [ ] Yes, status: `REPLACE_WITH_STATUS` (0=unverified, 2=skipped, 3=failed)
+
+## Motivation
+
+<!-- What are you trying to do? Any symptom you've observed on this model that motivated the request? -->
+
+
+## How to run
+
+```bash
+set -a; source .env; set +a
+uv run python -m transformer_lens.tools.model_registry.verify_models --model REPLACE_WITH_MODEL_ID
+```
+
+For the full workflow, see [Creating Architecture Adapters in contributing.md](../../docs/source/content/contributing.md#creating-architecture-adapters).
+
+## Result
+
+<!-- A maintainer running the verification fills this in:
+- Phases passed / failed
+- Final status written to supported_models.json
+- Any notes added to the registry entry
+- Link to the PR that closes this issue
+-->
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -0,0 +1,26 @@
+# GitHub Copilot Instructions
+
+**Read [AGENTS.md](../AGENTS.md) at the repo root for the full set of project conventions, quickstart commands, repo layout, and hard rules — start with its TL;DR section.** This file inlines the highest-friction defaults Copilot most often gets wrong.
+
+## Top rules to remember
+
+1. **Use `uv`, not `pip` or `poetry`.** `uv sync` to install; `uv run <cmd>` or a `make` target to run anything.
+2. **Mirror `HookedTransformer` → `TransformerBridge`** in the same PR when behaviour exists in both. The HT registry [`transformer_lens/supported_models.py`](../transformer_lens/supported_models.py) is HT-only — Bridge-only models go in the Bridge registry under [`transformer_lens/tools/model_registry/`](../transformer_lens/tools/model_registry/).
+3. **Base PRs against `dev`**, not `main`. PRs to `main` are maintainer-only.
+
+## Common commands
+
+```bash
+uv sync                  # install
+make unit-test           # fast tests
+make format              # pycln + isort + black
+uv run mypy .            # type check
+uv run docs-hot-reload   # live docs preview
+```
+
+## Copilot-specific anti-patterns
+
+- Don't add `# type: ignore`. Prefer `isinstance` / `typing.cast`.
+- Don't dismiss failing tests as "pre-existing" — investigate every failure.
+
+The full set of hard rules (numerics, parallel benchmarks, plan-file references, etc.) lives in [AGENTS.md §10](../AGENTS.md#10-hard-rules).
diff --git a/.gitignore b/.gitignore
@@ -24,6 +24,8 @@ docs/source/generated
 .venv
 .env
 .adapter-workspace
-.claude
+.claude/*
+!.claude/settings.json
+!.claude/commands/
 .adapter-progress.json
 transformer_lens/tools/model_registry/data/verification_checkpoint.json