From 6b3d0587f67c6ce3aff1c782d62440634833b0f6 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 10:43:45 -0500 Subject: [PATCH 01/11] Initial Agentic Workflow support setup --- .claude/commands/add-model-support.md | 24 ++++ .claude/commands/build-docs.md | 16 +++ .claude/commands/format.md | 14 ++ .claude/commands/task-complete.md | 47 ++++++ .claude/commands/test-all.md | 16 +++ .claude/commands/test-unit.md | 11 ++ .claude/commands/typecheck.md | 11 ++ .claude/commands/verify-model.md | 17 +++ .claude/settings.json | 6 + .cursor/rules/transformerlens.mdc | 17 +++ .env.example | 18 +++ .github/copilot-instructions.md | 32 +++++ .gitignore | 4 +- AGENTS.md | 200 ++++++++++++++++++++++++++ CLAUDE.md | 23 +++ README.md | 8 +- 16 files changed, 459 insertions(+), 5 deletions(-) create mode 100644 .claude/commands/add-model-support.md create mode 100644 .claude/commands/build-docs.md create mode 100644 .claude/commands/format.md create mode 100644 .claude/commands/task-complete.md create mode 100644 .claude/commands/test-all.md create mode 100644 .claude/commands/test-unit.md create mode 100644 .claude/commands/typecheck.md create mode 100644 .claude/commands/verify-model.md create mode 100644 .claude/settings.json create mode 100644 .cursor/rules/transformerlens.mdc create mode 100644 .env.example create mode 100644 .github/copilot-instructions.md create mode 100644 AGENTS.md create mode 100644 CLAUDE.md diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md new file mode 100644 index 000000000..411bee911 --- /dev/null +++ b/.claude/commands/add-model-support.md @@ -0,0 +1,24 @@ +--- +description: Guided workflow for adding a new architecture adapter to TransformerBridge. +argument-hint: +--- + +You are adding TransformerBridge support for the HuggingFace model `$ARGUMENTS`. If `$ARGUMENTS` is empty, ask the user for the HF repo path before continuing. + +Read [docs/source/content/adapter_development/adapter-creation-guide.md](../../docs/source/content/adapter_development/adapter-creation-guide.md) and [docs/source/content/adapter_development/hf-model-analysis-guide.md](../../docs/source/content/adapter_development/hf-model-analysis-guide.md) first — these are the authoritative step-by-step references. + +Execute this checklist, stopping at each step until it is genuinely done: + +1. Confirm the model is not already supported by an existing architecture adapter. Short circuit and direct the user to the adapter if it already exists. +2. **Analyze the HF model**: read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output head layouts. Note any quirks (rotary embeddings, GQA/MQA, RMSNorm vs LayerNorm, biases, attention sinks, MoE routing). +3. **Pick a starting adapter** from [transformer_lens/model_bridge/supported_architectures/](../../transformer_lens/model_bridge/supported_architectures/) closest to the target architecture. Copy it (or [docs/source/_static/adapter-template.py](../../docs/source/_static/adapter-template.py)) into a new file named after the architecture. +4. **Fill in component mappings** so each HF module path resolves to a canonical TransformerLens Bridge name. Hook names should be Bridge-native (`blocks.{i}.hook_out` etc.), not HookedTransformer aliases. +5. **Register the adapter** in BOTH: + - [transformer_lens/model_bridge/supported_architectures/__init__.py](../../transformer_lens/model_bridge/supported_architectures/__init__.py) + - [transformer_lens/factories/architecture_adapter_factory.py](../../transformer_lens/factories/architecture_adapter_factory.py) +6. **Add the HF repo path to the Bridge registry** under [transformer_lens/tools/model_registry/](../../transformer_lens/tools/model_registry/). Do NOT add it to [transformer_lens/supported_models.py](../../transformer_lens/supported_models.py) — that file is HookedTransformer-only. +7. **Verify** end-to-end: run `/verify-model $ARGUMENTS` (one model only — do not parallelize). +8. **Write an integration test** under [tests/integration/](../../tests/integration/) that asserts logit parity with HuggingFace. Use fp32 + eager attention to match `boot_transformers`' load configuration. Gate probes for optional structural features (`resid_mid` etc.) rather than assuming they exist. +9. **Format and typecheck**: run `/format`. Fix any new mypy errors — do not add `# type: ignore`. + +When all steps pass cleanly, run `/task-complete` to confirm the change holds up under the full local check. diff --git a/.claude/commands/build-docs.md b/.claude/commands/build-docs.md new file mode 100644 index 000000000..ffa90f979 --- /dev/null +++ b/.claude/commands/build-docs.md @@ -0,0 +1,16 @@ +--- +description: Source .env then build the Sphinx docs. +--- + +Build the documentation locally: + +``` +set -a; source .env; set +a +uv run build-docs +``` + +Sourcing `.env` is required so `HF_TOKEN` is available — some doctests and notebook embeddings load gated models. Output goes to [docs/build/](../../docs/build/). + +For an interactive live-reloading preview instead, run `uv run docs-hot-reload`. + +Docs follow Google docstring style with reST extensions; see [docs/source/content/contributing.md](../../docs/source/content/contributing.md) for the style guide. diff --git a/.claude/commands/format.md b/.claude/commands/format.md new file mode 100644 index 000000000..21be5a3a9 --- /dev/null +++ b/.claude/commands/format.md @@ -0,0 +1,14 @@ +--- +description: Format the codebase (pycln + isort + black) and run mypy. +--- + +Format and type-check the working tree: + +``` +make format +uv run mypy . +``` + +`make format` runs `pycln --all` (unused imports), `isort` (import sorting), and `black` (line length 100). `uv run mypy .` runs the type checker with the config in [pyproject.toml](../../pyproject.toml). + +If mypy reports new errors, fix them — do not add `# type: ignore`. Prefer `isinstance` assertions or `typing.cast` for narrowing. See [AGENTS.md §10](../../AGENTS.md#10-hard-rules). diff --git a/.claude/commands/task-complete.md b/.claude/commands/task-complete.md new file mode 100644 index 000000000..8ba37e722 --- /dev/null +++ b/.claude/commands/task-complete.md @@ -0,0 +1,47 @@ +--- +description: End-of-task gate. Clean up new comments, format, type-check, and run the standard test tiers (unit + docstring + acceptance + integration) — fixing issues along the way. +--- + +Run the end-of-task gate. Do not declare the task complete until every step below passes cleanly. + +### 1. Clean up new comments + +Review every comment and docstring **added or modified during this task** against the rules in [AGENTS.md §10](../../AGENTS.md#10-hard-rules): + +- Comments should be terse one-liners; docstrings are one-line where possible. +- Inline comments explain WHY, not WHAT — delete any that just restate the code. +- Multi-paragraph explanations belong in PR descriptions or design docs, not source. +- Remove any references to plan files, audit IDs, finding IDs, or "see plan section X" — those rot as the codebase evolves and belong only in the PR description. + +Use `git diff` against the merge-base to scope the review to genuinely new comments — do NOT rewrite unrelated comments elsewhere in the file. + +### 2. Format + +``` +make format +uv run mypy . +``` + +If mypy reports new errors, fix the underlying typing issue. Do not add `# type: ignore` — prefer `isinstance` assertions or `typing.cast`. + +### 3. Run the standard test tiers + +``` +set -a; source .env; set +a +make unit-test +make docstring-test +make acceptance-test +make integration-test +``` + +These are the tiers that gate PR review for almost every change. Notebook and benchmark suites are intentionally skipped — they are slow, hit gated models, and are not required to land typical PRs (CI runs them separately). If your change specifically touches a notebook or a benchmark, run that file directly (`pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/.ipynb` or `make benchmark-test`) in addition. + +Investigate every failure. Do not dismiss any failure as "pre-existing" or "unrelated" — fix the underlying issue, even if it predates this task (see [AGENTS.md §10](../../AGENTS.md#10-hard-rules)). Do not add platform skips or `xfail` markers to dodge a failing test. + +### 4. Re-loop on failure + +If any of steps 1–3 surface issues, fix them, then **restart from step 1** — fixes can introduce new comments, new format/type drift, or new test failures. Repeat until all three steps pass without changes. + +### 5. Report + +Report the actual final command output, not a summary. Reviewers re-run tests; agent self-reports are not evidence ([AGENTS.md §10](../../AGENTS.md#10-hard-rules)). diff --git a/.claude/commands/test-all.md b/.claude/commands/test-all.md new file mode 100644 index 000000000..583eb08a7 --- /dev/null +++ b/.claude/commands/test-all.md @@ -0,0 +1,16 @@ +--- +description: Run the full test suite (unit + integration + acceptance + benchmark + docstring + notebook). Slow. +--- + +Run every test tier in TransformerLens via the top-level `make test` target: + +``` +make test +``` + +This is slow — it runs unit, integration, acceptance, benchmark, docstring, and notebook tests sequentially. It hits HuggingFace Hub and loads multiple models. Before running, confirm: + +1. `.env` is sourced so `HF_TOKEN` is set (`set -a; source .env; set +a`). +2. No other heavy GPU/MPS jobs are running on this machine — model verification cannot run concurrently (see [AGENTS.md §10](../../AGENTS.md#10-hard-rules)). + +Report the actual command output. Investigate any failures rather than dismissing them. diff --git a/.claude/commands/test-unit.md b/.claude/commands/test-unit.md new file mode 100644 index 000000000..11274ff81 --- /dev/null +++ b/.claude/commands/test-unit.md @@ -0,0 +1,11 @@ +--- +description: Run the unit test suite. +--- + +Run the TransformerLens unit tests: + +``` +make unit-test +``` + +If any test fails, investigate the failure rather than dismissing it as "pre-existing" or unrelated — see [AGENTS.md §10](../../AGENTS.md#10-hard-rules). Report the actual command output, not a summary. diff --git a/.claude/commands/typecheck.md b/.claude/commands/typecheck.md new file mode 100644 index 000000000..68dd34921 --- /dev/null +++ b/.claude/commands/typecheck.md @@ -0,0 +1,11 @@ +--- +description: Run mypy across the project. +--- + +Run the type checker: + +``` +uv run mypy . +``` + +Config lives in `[tool.mypy]` of [pyproject.toml](../../pyproject.toml). If mypy reports errors, fix the underlying typing issue — do not add `# type: ignore`. Prefer `isinstance` assertions or `typing.cast` for narrowing. diff --git a/.claude/commands/verify-model.md b/.claude/commands/verify-model.md new file mode 100644 index 000000000..5e0cd0fae --- /dev/null +++ b/.claude/commands/verify-model.md @@ -0,0 +1,17 @@ +--- +description: Run verify_models.py against a single model (non-parallel). +argument-hint: +--- + +Run model verification for `$ARGUMENTS`: + +``` +set -a; source .env; set +a +uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" +``` + +If `$ARGUMENTS` is empty, ask the user which model to verify (an HF repo path like `gpt2` or `meta-llama/Llama-2-7b-hf`, or a registry alias) before running. + +**Critical**: run verification for **one model at a time**. Never parallelize across models — a single CUDA/MPS device does not have memory for concurrent loads. See [AGENTS.md §10](../../AGENTS.md#10-hard-rules). + +Report the actual benchmark output. If any phase fails or shows unexpected drift, investigate before declaring done. Do not claim drift is "fp noise" without empirical evidence. diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 000000000..992d4abaf --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,6 @@ +{ + "permissions": { + "allow": [], + "deny": [] + } +} diff --git a/.cursor/rules/transformerlens.mdc b/.cursor/rules/transformerlens.mdc new file mode 100644 index 000000000..11fad36a4 --- /dev/null +++ b/.cursor/rules/transformerlens.mdc @@ -0,0 +1,17 @@ +--- +description: TransformerLens project conventions for Cursor agents. +alwaysApply: true +--- + +Read `AGENTS.md` at the repo root before doing any work. It is the single source of truth for project conventions, quickstart commands, repo layout, hook-naming rules, the HookedTransformer ↔ TransformerBridge mirroring rule, PR conventions, and hard rules. + +Quick reminders that override common defaults: + +- Use `uv`, not `pip` or `poetry`. Install with `uv sync`; run commands with `uv run …` or `make` targets. +- This repo has two parallel systems (`HookedTransformer` legacy and `TransformerBridge` v3). Changes to HookedTransformer that have equivalents in TransformerBridge must be mirrored to TransformerBridge. +- Base PRs against `dev`, not `main`. Never name a branch `main` or `dev`. +- No pre-commit hook is installed. Run `make format` and `uv run mypy .` manually before push. +- Source `.env` (e.g. `set -a; source .env; set +a`) before any HuggingFace-Hub-hitting command. +- Never add `# type: ignore`, never dismiss failing tests as "pre-existing", never add platform skips to dodge CI, never claim drift is "fp noise" without empirical evidence. + +See `AGENTS.md` for the full context behind each of these rules. diff --git a/.env.example b/.env.example new file mode 100644 index 000000000..518673219 --- /dev/null +++ b/.env.example @@ -0,0 +1,18 @@ +# TransformerLens environment variables. +# Copy to .env (gitignored) and fill in. Then source before HF-Hub-hitting commands: +# set -a; source .env; set +a + +# HuggingFace access token. Required for gated models: Llama, Mistral, Gemma, +# gated Qwen variants, etc. Create at https://huggingface.co/settings/tokens +HF_TOKEN= + +# Optional: retry HuggingFace Hub 429s in non-pytest invocations (CI sets this +# automatically in checks.yml). Pytest enables retries via tests/conftest.py. +# TRANSFORMERLENS_HF_RETRY=1 + +# Optional: allow MPS device on macOS (off by default to avoid hard-to-debug +# divergence from CUDA / CPU). CI's mps-checks job sets this. +# TRANSFORMERLENS_ALLOW_MPS=1 + +# Optional: silence the tokenizers parallelism warning when running under uv. +# TOKENIZERS_PARALLELISM=false diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md new file mode 100644 index 000000000..b79dc0841 --- /dev/null +++ b/.github/copilot-instructions.md @@ -0,0 +1,32 @@ +# GitHub Copilot Instructions + +**Read [AGENTS.md](../AGENTS.md) at the repo root for the full set of project conventions, quickstart commands, repo layout, and hard rules.** This file mirrors the top-of-mind essentials so Copilot Chat surfaces them inline. + +## Top rules to remember + +1. **Use `uv`, not `pip` or `poetry`.** Install with `uv sync`. Run commands with `uv run …` or via `make` targets. +2. **Two parallel systems**: `HookedTransformer` (legacy, in maintenance) and `TransformerBridge` (v3, default for new work). If you change behavior on HookedTransformer that has an equivalent in TransformerBridge, mirror the change to TransformerBridge in the same PR. New TransformerBridge features do not need to be mirrored in HookedTransformer. The HT model registry [transformer_lens/supported_models.py](../transformer_lens/supported_models.py) is **HT-only** — Bridge-only models go in the Bridge registry under [transformer_lens/tools/model_registry/](../transformer_lens/tools/model_registry/). +3. **Base PRs against `dev`**, not `main`. PRs targeting `main` are done by Maintainers only – Request permission before basing off `main`. +4. **No pre-commit hook is installed.** Run `make format` and `uv run mypy .` manually before pushing — CI will fail otherwise. +5. **Source `.env` before any HF-Hub command** (`set -a; source .env; set +a`). `HF_TOKEN` is required for gated models (Llama, Mistral, Gemma, gated Qwen variants). + +## Common commands + +```bash +uv sync # install +make unit-test # fast tests +make format # pycln + isort + black +uv run mypy . # type check +uv run docs-hot-reload # live docs preview +``` + +## Hard "do not"s + +- Don't dismiss failing tests as "pre-existing" — investigate every failure. +- Don't add `# type: ignore`. Prefer `isinstance` / `cast`. +- Don't run model verification or benchmarks in parallel (device OOM). +- Don't claim drift is "fp noise" without empirical evidence. +- Don't add skips / `xfail` to make failing CI green. +- Don't reference plan-file details in source comments or docstrings. + +See [AGENTS.md](../AGENTS.md) for the full set with rationale. diff --git a/.gitignore b/.gitignore index 5b2752d66..570bd085c 100644 --- a/.gitignore +++ b/.gitignore @@ -24,6 +24,8 @@ docs/source/generated .venv .env .adapter-workspace -.claude +.claude/* +!.claude/settings.json +!.claude/commands/ .adapter-progress.json transformer_lens/tools/model_registry/data/verification_checkpoint.json diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 000000000..2c5acee59 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,200 @@ +# AGENTS.md + +Guidance for AI coding agents contributing to **TransformerLens**. + +This file is the single source of truth. Vendor-specific files ([CLAUDE.md](CLAUDE.md), [.github/copilot-instructions.md](.github/copilot-instructions.md), [.cursor/rules/transformerlens.mdc](.cursor/rules/transformerlens.mdc)) all defer here. + +--- + +## 1. What this repo is + +**TransformerLens** is a library for mechanistic interpretability of generative language models. It loads 9,000+ open-source models across 50+ architecture families (GPT-2, Llama, Gemma, Mistral, Qwen, T5, BERT, Mamba, etc. — see [transformer_lens/tools/model_registry/data/supported_models.json](transformer_lens/tools/model_registry/data/supported_models.json) for the full inventory) and exposes their internal activations through a hook system so researchers can cache, edit, and ablate intermediate state. + +Upstream dependency: HuggingFace `transformers`. Most models are wrappers over HF implementations. + +## 2. Two systems live in this repo + +The library is in the middle of a v2 → v3 transition. **Both systems are still present** and changes to HookedTransformer often need to be mirrored to TransformerBridge. + +### `TransformerBridge` (v3, recommended for new work) + +- Lives in [transformer_lens/model_bridge/](transformer_lens/model_bridge/). +- Architecture-adapter pattern: each HF architecture has an adapter file in [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) that maps HF module paths to canonical TransformerLens names. +- Preserves raw HF weights by default (logits match HF). Call `bridge.enable_compatibility_mode()` after booting for HookedTransformer-equivalent numerics. +- Hook names are architecture-native (e.g. `blocks.{i}.hook_out`), with aliases provided for HT-style names where applicable. + +### `HookedTransformer` (legacy, maintenance mode) + +- Lives in [transformer_lens/HookedTransformer.py](transformer_lens/HookedTransformer.py) and [transformer_lens/components/](transformer_lens/components/). +- Folds LayerNorm and centers weights by default — useful for circuit analysis but means weights and logits **do not** match HF. +- Model registry is [transformer_lens/supported_models.py](transformer_lens/supported_models.py) — **HT-only**. Bridge-only models do not belong here; they live in the Bridge registry data file under [transformer_lens/tools/model_registry/](transformer_lens/tools/model_registry/). +- Deprecated as of TransformerLens 3.0; will be removed in the next major version. Still fully functional via the compatibility layer. + +### Mirroring rule + +If you change behavior in `HookedTransformer` that has a counterpart in `TransformerBridge`, update both in the same PR. They are parallel implementations of the same surface; drift between them is a recurring source of bugs. Conversely, `supported_models.py` is HT-only — do not add Bridge-only models there. + +## 3. Quickstart + +```bash +# Install (this repo uses uv, NOT pip and NOT poetry) +uv sync +source .venv/bin/activate + +# Source HF token (required for gated models: Llama, Mistral, Gemma, gated Qwen, etc.) +set -a; source .env; set +a + +# Run tests +make unit-test # fast, no model loads — run this on every change +make integration-test # cross-component, hits HF Hub +make acceptance-test # end-to-end +make notebook-test # demos/*.ipynb (slow; CI parallelizes) +make docstring-test # doctest + doctest-plus +make test # all of the above (long) + +# Format + typecheck (no pre-commit hook is installed — run manually before push) +make format # pycln + isort + black +make check-format # CI-equivalent check +uv run mypy . + +# Docs +uv run docs-hot-reload # live preview +uv run build-docs # build to docs/build/ +``` + +Python: **>=3.10, <4.0**. CI tests 3.10, 3.11, 3.12. Format/type/docstring checks run on 3.12. + +## 4. Repo map + +| Path | What's there | +|---|---| +| [transformer_lens/](transformer_lens/) | Core package | +| [transformer_lens/HookedTransformer.py](transformer_lens/HookedTransformer.py) | Legacy `HookedTransformer` API | +| [transformer_lens/HookedEncoder.py](transformer_lens/HookedEncoder.py), [HookedEncoderDecoder.py](transformer_lens/HookedEncoderDecoder.py), [HookedAudioEncoder.py](transformer_lens/HookedAudioEncoder.py) | Encoder-only / seq2seq / audio variants | +| [transformer_lens/model_bridge/](transformer_lens/model_bridge/) | `TransformerBridge` system | +| [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) | One adapter file per HF architecture | +| [transformer_lens/model_bridge/generalized_components/](transformer_lens/model_bridge/generalized_components/) | Bridge-side reusable components | +| [transformer_lens/components/](transformer_lens/components/) | HT-side components (attention, MLP, LN, embed) | +| [transformer_lens/factories/](transformer_lens/factories/) | `architecture_adapter_factory.py`, `mlp_factory.py`, `activation_function_factory.py` | +| [transformer_lens/config/](transformer_lens/config/) | `HookedTransformerConfig` and `TransformerBridgeConfig` | +| [transformer_lens/utilities/](transformer_lens/utilities/) | Device management, weight processing, HF utilities | +| [transformer_lens/hook_points.py](transformer_lens/hook_points.py) | `HookPoint` class and `LensHandle` | +| [transformer_lens/supported_models.py](transformer_lens/supported_models.py) | **HT-only** registry (`OFFICIAL_MODEL_NAMES`, `MODEL_ALIASES`) | +| [transformer_lens/tools/model_registry/](transformer_lens/tools/model_registry/) | Bridge-side registry + `verify_models.py` benchmark suite | +| [transformer_lens/patching.py](transformer_lens/patching.py), [evals.py](transformer_lens/evals.py) | Activation patching, IOI, ROME, etc. | +| [tests/unit/](tests/unit/), [tests/integration/](tests/integration/), [tests/acceptance/](tests/acceptance/), [tests/benchmarks/](tests/benchmarks/), [tests/mps/](tests/mps/) | Test tiers | +| [demos/](demos/) | Jupyter notebooks; a subset runs in CI under `nbval` with sanitization from [demos/doc_sanitize.cfg](demos/doc_sanitize.cfg) | +| [docs/source/content/](docs/source/content/) | Sphinx markdown sources | +| [docs/source/content/adapter_development/](docs/source/content/adapter_development/) | Adapter-authoring guides — read these before adding a new architecture | +| [makefile](makefile) | Canonical test/format/docs targets | +| [pyproject.toml](pyproject.toml) | Deps, pytest / mypy / format / build config. | +| [.github/workflows/checks.yml](.github/workflows/checks.yml) | CI gates | +| [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md) | PR template + base-branch rules | + +## 5. Hook naming — HT vs Bridge + +This trips up agents and humans alike. + +- **HT canonical names**: uniform across architectures. `hook_embed`, `blocks.{i}.hook_resid_pre`, `blocks.{i}.attn.hook_q`, `blocks.{i}.hook_resid_post`, etc. +- **Bridge-native names**: architecture-shaped. E.g. `blocks.{i}.hook_out`, `blocks.{i}.attn.q.hook_out`. Aliases are registered via `build_alias_to_canonical_map()` in [transformer_lens/model_bridge/bridge.py](transformer_lens/model_bridge/bridge.py) so HT names continue to resolve. + +When writing new code, prefer Bridge-native names. + +When raw-HF-forward drivers compare against `boot_transformers`, match its load configuration (fp32, eager attention). Probe for structural features like `resid_mid` rather than assuming all architectures expose them. + +## 6. Adding a model + +TransformerBridge adapters are written per architecture family, not per individual model. Adding `gpt2` registers all GPT-2 variants; you generally won't add a new file for a sibling checkpoint of an already-supported architecture. + +Start with [docs/source/content/adapter_development/](docs/source/content/adapter_development/) — there are step-by-step guides covering the full workflow. Summary: + +1. Read the HF model's `config.json` and source to identify components (embed, attention, MLP, normalization, output head). +2. Copy [docs/source/_static/adapter-template.py](docs/source/_static/adapter-template.py) into [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) as `.py`. +3. Fill in component mappings. +4. Register in both [transformer_lens/model_bridge/supported_architectures/__init__.py](transformer_lens/model_bridge/supported_architectures/__init__.py) and [transformer_lens/factories/architecture_adapter_factory.py](transformer_lens/factories/architecture_adapter_factory.py). +5. Add the HF repo path to the Bridge registry (model registry data file) and run `verify_models.py` against the new architecture. +6. Add an integration test under [tests/integration/](tests/integration/) that asserts logit parity with HF. + +The existing adapters under [supported_architectures/](transformer_lens/model_bridge/supported_architectures/) are the best references for tricky cases. + +## 7. Prioritization + +When picking solutions to any problem, prioritize by **research impact**, not implementation ease. A correct, broadly-applicable feature is worth more than a one-off shortcut. + +## 8. PR conventions + +- **Base branch**: + - `dev` — default for all PRs (new features, refactors, docs, most bug fixes). + - `main` — **only** for bug fixes against the currently-released version. PRs to main made at Maintainer's discretion. +- **Branch names**: do NOT name your branch `main` or `dev` — these conflict with the canonical branches when maintainers periodically refresh PRs against upstream. Include the word `docs` in the branch name if your PR is primarily a docs change (this triggers the docs build job). +- **New branches must track their own remote**, not the source branch's remote. When you `git push -u`, push to the new branch's namespace, not the branch you forked from. +- **PR template**: [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md). No conventional-commits enforcement. +- **No pre-commit hook is installed.** Run `make format` and `uv run mypy .` manually before pushing; CI will fail otherwise. + +## 9. CI gates a PR must pass + +From [.github/workflows/checks.yml](.github/workflows/checks.yml): + +| Job | What it runs | +|---|---| +| `compatibility-checks` | `make unit-test` + `make acceptance-test` + `uv build` across Python 3.10 / 3.11 / 3.12 | +| `mps-checks` | macOS unit + integration + smoke tests with MPS (PRs to `main` only) | +| `format-check` | `make check-format` | +| `type-check` | `uv run mypy .` | +| `docstring-test` | `make docstring-test` | +| `coverage-test` | Full test suite with coverage report (uploaded as `test-coverage` artifact) | +| `notebook-checks` | Subset of `demos/*.ipynb` validated via `nbval` against [demos/doc_sanitize.cfg](demos/doc_sanitize.cfg). Notebooks that require `HF_TOKEN` are skipped when the secret is absent. | +| `build-docs` | Sphinx build (only on push to `main`/`dev` or branches containing `docs`) | +| `deploy-docs` | Deploys to GitHub Pages (only on push to `main`) | + +In-progress PR runs are cancelled when a new commit lands; tag/release runs are not. + +## 10. Hard rules + +These are the explicit "do / do not" rules. They are informed by maintainer experience. + +**On test failures:** + +- Never dismiss a failing test as "pre-existing" or "unrelated" — investigate every failure and fix the underlying issue. +- Never fabricate test counts or claim tests passed without running them. Reviewers re-run tests; agent self-reports are not trusted as evidence. +- Never add platform skips, `@pytest.mark.skip`, or `xfail` to make a failing CI green. Debug the actual bug. +- If new tests surface pre-existing bugs, fix them. Don't punt to a future PR or `xfail` marker. + +**On code quality:** + +- Never add `# type: ignore`. Prefer `isinstance` assertions or `typing.cast` for narrowing. +- Comments: terse, one-line docstrings; inline comments should explain WHY, not WHAT. No multi-paragraph explanations. +- Never reference plan-file details (audit IDs, finding IDs, "see plan section X") in source comments or docstrings. Plan artifacts belong in PR descriptions, not the codebase. + +**On numerical work:** + +- Never claim observed drift is "fp noise" without empirical evidence. Real bugs and accumulated rounding error look identical at noise scale. +- Never run model verification or benchmarks in parallel, unless explicitly requested to — a single CUDA/MPS device does not have memory for concurrent loads. Serialize them. + +**On environment and tooling:** + +- Use `uv`, never `pip` or `poetry`. Run commands via `uv run ` or the appropriate `make` target. +- Source `.env` before any HF-Hub-hitting command (docs build, notebook runs, `boot_transformers` against gated models). The `HF_TOKEN` is required for Llama, Mistral, Gemma, and gated Qwen variants. +- No pre-commit hook is installed; run `make format` + `uv run mypy .` manually before push. + +## 11. "Done" checklist + +Before declaring a task complete: + +| Task type | Must do | +|---|---| +| Bug fix | Reproduce in a test, fix, confirm test passes, `make format`, `uv run mypy .` | +| New adapter | Adapter file + factory registration + Bridge registry entry + integration test asserting HF logit parity + `verify_models.py` run for the new architecture | +| Docs change | `uv run build-docs` succeeds; branch name contains `docs` so the docs job triggers in CI | +| Notebook change | `pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/.ipynb` passes locally | +| Anything else | `make unit-test` + `make format` + `uv run mypy .` clean before push | + +If you're using Claude Code, the `/task-complete` slash command in [.claude/commands/](.claude/commands/) automates the last row. + +## 12. Pointers for further reading + +- [docs/source/content/contributing.md](docs/source/content/contributing.md) — the human-facing contributor guide +- [docs/source/content/migrating_to_v3.md](docs/source/content/migrating_to_v3.md) — HT → Bridge migration recipes +- [docs/source/content/getting_started.md](docs/source/content/getting_started.md) — full env-vars list and first-time setup +- [docs/source/content/adapter_development/](docs/source/content/adapter_development/) — adapter authoring deep dive +- [demos/Main_Demo.ipynb](demos/Main_Demo.ipynb), [demos/Bridge_Evals_Demo.ipynb](demos/Bridge_Evals_Demo.ipynb) — start here to see the library in use diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 000000000..5f3ce09dd --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,23 @@ +# CLAUDE.md + +**Read [AGENTS.md](AGENTS.md) first.** It is the single source of truth for project conventions, quickstart commands, repo layout, hard rules, and PR conventions. Everything below is Claude-Code-specific tooling on top of it. + +## Claude Code conveniences in this repo + +- **Slash commands** — see [.claude/commands/](.claude/commands/) for canonical workflows: + - `/test-unit` — `make unit-test` + - `/test-all` — `make test` (slow; warns before running) + - `/format` — `make format` then `uv run mypy .` + - `/typecheck` — `uv run mypy .` + - `/build-docs` — sources `.env`, runs `uv run build-docs` + - `/verify-model ` — guided single-model run of `verify_models.py` + - `/add-model-support ` — checklist-driven new-adapter workflow + - `/task-complete` — end-of-task gate: clean up new comments per the comment rules in AGENTS.md, then `/format`, then run `unit-test` + `docstring-test` + `acceptance-test` + `integration-test`, fixing any issues until all three steps pass cleanly. Skips notebook + benchmark tiers (run those directly if your change touched them). +- **Project settings** — [.claude/settings.json](.claude/settings.json) is the checked-in scaffold for project-wide Claude Code settings (hooks, env). It intentionally ships with no permission allowlist; configure your own permissions in `.claude/settings.local.json` (gitignored). +- **User-specific settings** — `.claude/settings.local.json`, `.claude/agents/`, `.claude/worktrees/` are all gitignored. Configure them per-user. + +## Pointers when working with Claude Code in this repo + +- The hard rules in [AGENTS.md §10](AGENTS.md#10-hard-rules) reflect rules the maintainer has repeatedly enforced. Treat them as load-bearing. +- The mirroring rule in [AGENTS.md §2](AGENTS.md#2-two-systems-live-in-this-repo) — `HookedTransformer` to `TransformerBridge` — is the single most common source of "you forgot to mirror this to TransformerBridge" PR feedback. +- For tasks spanning more than three steps, use TodoWrite to track progress and finish with `/task-complete` before declaring done. diff --git a/README.md b/README.md index 28c47cecb..d909863c6 100644 --- a/README.md +++ b/README.md @@ -20,9 +20,9 @@ interpretability](https://distill.pub/2020/circuits/zoom-in/) of GPT-2 Style lan goal of mechanistic interpretability is to take a trained model and reverse engineer the algorithms the model learned during training from its weights. -TransformerLens lets you load in 50+ different open source language models, and exposes the internal -activations of the model to you. You can cache any internal activation in the model, and add in -functions to edit, remove or replace these activations as the model runs. +TransformerLens lets you load in 9,000+ open source language models across 50+ architecture families, +and exposes the internal activations of the model to you. You can cache any internal activation in the +model, and add in functions to edit, remove or replace these activations as the model runs. ## Quick Start @@ -52,7 +52,7 @@ logits, activations = bridge.run_with_cache("Hello World") > Gated models (Llama, Mistral, Gemma, ...) require `HF_TOKEN` in your environment. See [Environment Variables](https://TransformerLensOrg.github.io/TransformerLens/content/getting_started.html#environment-variables) for the full list. -`TransformerBridge` is the recommended 3.0 path and supports 50+ architectures. By default it preserves raw HuggingFace weights – logits and activations match HF, *not* legacy `HookedTransformer` (which folds LayerNorm and centers weights by default). Call `bridge.enable_compatibility_mode()` after booting for HookedTransformer-equivalent numerics. The legacy `HookedTransformer.from_pretrained` API is still available but deprecated — see the [Migrating to TransformerLens 3](https://TransformerLensOrg.github.io/TransformerLens/content/migrating_to_v3.html) guide. +`TransformerBridge` is the recommended 3.0 path and supports 9,000+ models across 50+ architecture families (see [`supported_models.json`](transformer_lens/tools/model_registry/data/supported_models.json) for the full inventory). By default it preserves raw HuggingFace weights – logits and activations match HF, *not* legacy `HookedTransformer` (which folds LayerNorm and centers weights by default). Call `bridge.enable_compatibility_mode()` after booting for HookedTransformer-equivalent numerics. The legacy `HookedTransformer.from_pretrained` API is still available but deprecated — see the [Migrating to TransformerLens 3](https://TransformerLensOrg.github.io/TransformerLens/content/migrating_to_v3.html) guide. ## Key Tutorials From a3c6c4f46aaab7f1e822d4a44a95d6b82c447925 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 11:02:40 -0500 Subject: [PATCH 02/11] initial setup of sub-directory files --- AGENTS.md | 58 ++++--- tests/AGENTS.md | 87 ++++++++++ .../supported_architectures/AGENTS.md | 153 ++++++++++++++++++ .../tools/model_registry/AGENTS.md | 92 +++++++++++ 4 files changed, 364 insertions(+), 26 deletions(-) create mode 100644 tests/AGENTS.md create mode 100644 transformer_lens/model_bridge/supported_architectures/AGENTS.md create mode 100644 transformer_lens/tools/model_registry/AGENTS.md diff --git a/AGENTS.md b/AGENTS.md index 2c5acee59..5a297172c 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -2,19 +2,30 @@ Guidance for AI coding agents contributing to **TransformerLens**. -This file is the single source of truth. Vendor-specific files ([CLAUDE.md](CLAUDE.md), [.github/copilot-instructions.md](.github/copilot-instructions.md), [.cursor/rules/transformerlens.mdc](.cursor/rules/transformerlens.mdc)) all defer here. +This file is the single source of truth. Vendor-specific files ([CLAUDE.md](CLAUDE.md), [.github/copilot-instructions.md](.github/copilot-instructions.md), [.cursor/rules/transformerlens.mdc](.cursor/rules/transformerlens.mdc)) and sub-folder `AGENTS.md` files all defer here. + +## TL;DR — read before doing anything + +1. **Use `uv`**, not `pip` or `poetry`. Install with `uv sync`. +2. **Source `.env`** (`set -a; source .env; set +a`) before any HuggingFace-Hub-hitting command. Gated models need `HF_TOKEN`. +3. **Base PRs on `dev`**, not `main`. Never name a branch `main` or `dev`. +4. **Mirror HookedTransformer → TransformerBridge** when changing behaviour that exists in both ([§2](#2-two-systems-live-in-this-repo)). +5. **Run `make format` + `uv run mypy .` before push** — no pre-commit hook is installed. +6. **Never add `# type: ignore`** — use `isinstance` / `typing.cast` ([§10](#10-hard-rules)). +7. **Never dismiss a failing test as "pre-existing"** — investigate every failure ([§10](#10-hard-rules)). + +Sub-folder rules: [tests/AGENTS.md](tests/AGENTS.md) · [supported_architectures/AGENTS.md](transformer_lens/model_bridge/supported_architectures/AGENTS.md) · [tools/model_registry/AGENTS.md](transformer_lens/tools/model_registry/AGENTS.md). --- ## 1. What this repo is -**TransformerLens** is a library for mechanistic interpretability of generative language models. It loads 9,000+ open-source models across 50+ architecture families (GPT-2, Llama, Gemma, Mistral, Qwen, T5, BERT, Mamba, etc. — see [transformer_lens/tools/model_registry/data/supported_models.json](transformer_lens/tools/model_registry/data/supported_models.json) for the full inventory) and exposes their internal activations through a hook system so researchers can cache, edit, and ablate intermediate state. - -Upstream dependency: HuggingFace `transformers`. Most models are wrappers over HF implementations. +**TransformerLens** — mechanistic-interpretability library. Loads 9,000+ models across 50+ architecture families (see [supported_models.json](transformer_lens/tools/model_registry/data/supported_models.json)) and exposes internal activations through a hook system for caching, editing, and ablating intermediate state. Built on HuggingFace `transformers`. ## 2. Two systems live in this repo -The library is in the middle of a v2 → v3 transition. **Both systems are still present** and changes to HookedTransformer often need to be mirrored to TransformerBridge. +- **`TransformerBridge` (v3)** — default for new work. Lives in [transformer_lens/model_bridge/](transformer_lens/model_bridge/). +- **`HookedTransformer`** — legacy, maintenance mode, deprecated in 3.0. ### `TransformerBridge` (v3, recommended for new work) @@ -37,22 +48,22 @@ If you change behavior in `HookedTransformer` that has a counterpart in `Transfo ## 3. Quickstart ```bash -# Install (this repo uses uv, NOT pip and NOT poetry) +# Install — uv only (not pip, not poetry) uv sync source .venv/bin/activate -# Source HF token (required for gated models: Llama, Mistral, Gemma, gated Qwen, etc.) +# Source HF token before any HF-Hub-hitting command set -a; source .env; set +a -# Run tests -make unit-test # fast, no model loads — run this on every change -make integration-test # cross-component, hits HF Hub +# Tests +make unit-test # fast, no model loads +make integration-test # cross-component make acceptance-test # end-to-end -make notebook-test # demos/*.ipynb (slow; CI parallelizes) make docstring-test # doctest + doctest-plus -make test # all of the above (long) +make notebook-test # slow; subset run in CI +make test # all of the above -# Format + typecheck (no pre-commit hook is installed — run manually before push) +# Format + typecheck — no pre-commit hook; run manually before push make format # pycln + isort + black make check-format # CI-equivalent check uv run mypy . @@ -93,8 +104,6 @@ Python: **>=3.10, <4.0**. CI tests 3.10, 3.11, 3.12. Format/type/docstring check ## 5. Hook naming — HT vs Bridge -This trips up agents and humans alike. - - **HT canonical names**: uniform across architectures. `hook_embed`, `blocks.{i}.hook_resid_pre`, `blocks.{i}.attn.hook_q`, `blocks.{i}.hook_resid_post`, etc. - **Bridge-native names**: architecture-shaped. E.g. `blocks.{i}.hook_out`, `blocks.{i}.attn.q.hook_out`. Aliases are registered via `build_alias_to_canonical_map()` in [transformer_lens/model_bridge/bridge.py](transformer_lens/model_bridge/bridge.py) so HT names continue to resolve. @@ -135,19 +144,19 @@ When picking solutions to any problem, prioritize by **research impact**, not im From [.github/workflows/checks.yml](.github/workflows/checks.yml): -| Job | What it runs | +| Job | Runs | |---|---| -| `compatibility-checks` | `make unit-test` + `make acceptance-test` + `uv build` across Python 3.10 / 3.11 / 3.12 | -| `mps-checks` | macOS unit + integration + smoke tests with MPS (PRs to `main` only) | +| `compatibility-checks` | `make unit-test` + `make acceptance-test` + `uv build` × py 3.10 / 3.11 / 3.12 | +| `mps-checks` | macOS MPS unit + integration + smoke (PRs to `main` only) | | `format-check` | `make check-format` | | `type-check` | `uv run mypy .` | | `docstring-test` | `make docstring-test` | -| `coverage-test` | Full test suite with coverage report (uploaded as `test-coverage` artifact) | -| `notebook-checks` | Subset of `demos/*.ipynb` validated via `nbval` against [demos/doc_sanitize.cfg](demos/doc_sanitize.cfg). Notebooks that require `HF_TOKEN` are skipped when the secret is absent. | -| `build-docs` | Sphinx build (only on push to `main`/`dev` or branches containing `docs`) | -| `deploy-docs` | Deploys to GitHub Pages (only on push to `main`) | +| `coverage-test` | Full suite + coverage artifact | +| `notebook-checks` | `nbval` over subset of `demos/*.ipynb`; `HF_TOKEN`-gated notebooks skip when secret absent | +| `build-docs` | Sphinx build (push to `main`/`dev` or branch containing `docs`) | +| `deploy-docs` | GitHub Pages (push to `main` only) | -In-progress PR runs are cancelled when a new commit lands; tag/release runs are not. +In-progress PR runs cancel on new commit; tag/release runs do not. ## 10. Hard rules @@ -193,8 +202,5 @@ If you're using Claude Code, the `/task-complete` slash command in [.claude/comm ## 12. Pointers for further reading -- [docs/source/content/contributing.md](docs/source/content/contributing.md) — the human-facing contributor guide - [docs/source/content/migrating_to_v3.md](docs/source/content/migrating_to_v3.md) — HT → Bridge migration recipes -- [docs/source/content/getting_started.md](docs/source/content/getting_started.md) — full env-vars list and first-time setup - [docs/source/content/adapter_development/](docs/source/content/adapter_development/) — adapter authoring deep dive -- [demos/Main_Demo.ipynb](demos/Main_Demo.ipynb), [demos/Bridge_Evals_Demo.ipynb](demos/Bridge_Evals_Demo.ipynb) — start here to see the library in use diff --git a/tests/AGENTS.md b/tests/AGENTS.md new file mode 100644 index 000000000..14e25cd4d --- /dev/null +++ b/tests/AGENTS.md @@ -0,0 +1,87 @@ +# Tests — AGENTS.md + +Read [the root AGENTS.md](../AGENTS.md) for project-wide rules. This file covers conventions specific to `tests/`. + +--- + +## TL;DR + +- **Tier placement matters.** Use [the tier table](#test-tiers) below — a unit test loading a model belongs in `integration/`, not `unit/`. +- **No mocking of model loads or the HF Hub.** Tests load real models. The session-scoped fixtures amortize the cost. +- **Use cached models for fast tests** (`gpt2`, `attn-only-{1,2,3,4}l`, `tiny-stories-1M`). Anything else gets `@pytest.mark.slow`. +- **MPS is a carve-out, not a green light.** `TRANSFORMERLENS_ALLOW_MPS=1` is required; only [`tests/mps/`](mps/) runs there. +- **Hard rules in [AGENTS.md §10](../AGENTS.md#10-hard-rules) apply**: no `xfail`/`skipif` to dodge CI, no dismissing failing tests as "pre-existing," no platform skips outside MPS. + +--- + +## Test tiers + +| Tier | Path | Loads models? | Hits HF Hub? | Scope | Example | +|---|---|---|---|---|---| +| `unit` | [`tests/unit/`](unit/) | None / synthetic | No | Function or single module | [`tests/unit/test_key_value_cache_entry.py`](unit/test_key_value_cache_entry.py) | +| `integration` | [`tests/integration/`](integration/) | 1–2 cached models, module-scoped | Yes | Cross-component | [`tests/integration/test_generation_compatibility.py`](integration/test_generation_compatibility.py) | +| `acceptance` | [`tests/acceptance/`](acceptance/) | Full models (`gpt2`, `bloom-560m`), session-scoped | Yes | End-to-end behaviour | [`tests/acceptance/conftest.py`](acceptance/conftest.py) | +| `benchmarks` | [`tests/benchmarks/`](benchmarks/) | Varies; performance focus | Yes | Throughput / memory | [`tests/benchmarks/test_boot_memory.py`](benchmarks/test_boot_memory.py) | +| `mps` | [`tests/mps/`](mps/) | TinyStories-1M, fp32 only | Yes | macOS-MPS smoke only | [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) | + +**Rule of thumb:** if your test calls `HookedTransformer.from_pretrained(...)` or `TransformerBridge.boot_transformers(...)`, it is NOT a unit test. + +--- + +## Conftest hierarchy + +| Path | Provides | +|---|---| +| [`tests/conftest.py`](conftest.py) | `cleanup_memory` (function autouse) and `cleanup_class_memory` — CUDA/MPS cache + GC; `_enable_hf_retry_for_tests` (session autouse) — wraps HF `from_pretrained` with 429 retry; seeded RNG (numpy/torch/Python @ 42); `gpt2_tokenizer` + `gpt2_hooked_processed` (session); `temp_dir` | +| [`tests/acceptance/conftest.py`](acceptance/conftest.py) | `gpt2_model`, `bloom_560m_hooked`, `bloom_560m_hf_model`, `bloom_560m_hf_tokenizer` (all session) | +| [`tests/acceptance/model_bridge/conftest.py`](acceptance/model_bridge/conftest.py) | Bridge variants of gpt2 with/without compat mode | +| [`tests/integration/model_bridge/conftest.py`](integration/model_bridge/conftest.py) | distilgpt2 + gpt2 Bridge variants × {compat, no-compat, no-processing} | + +**Notes:** + +- All `transformer_lens` imports inside conftest fixtures live in fixture bodies, not at module top — jaxtyping's `pytest_configure` hook must install before the package is first imported. +- The session-scoped model fixtures (`gpt2_hooked_processed`, `gpt2_bridge`, …) are read-only — mutating them leaks across the entire test session. + +--- + +## Cached-model allowlist + +The CI cache in [`.github/workflows/checks.yml`](../.github/workflows/checks.yml) covers: `gpt2`, `gpt2-xl`, `distilgpt2`, `pythia-70m`, `gpt-neo-125M`, `gemma-2-2b-it`, `bloom-560m`, `Qwen2-0.5B`, `bert-base-cased`, `NeelNanda/Attn_Only*`, `roneneldan/TinyStories-1M*`, `NeelNanda/SoLU*`, `redwood_attn_2l`, `tiny-random-llama-2`, `DialoGPT-medium`. + +Of those, prefer **`attn-only-{1,2,3,4}l`** and **`tiny-stories-1M`** for fast tests — `gpt2` is "quite slow" on CI's CPU runners ([contributing.md](../docs/source/content/contributing.md)). Use `gpt2` when you specifically need GPT-2 numerics. + +Anything outside this set → `@pytest.mark.slow`. + +--- + +## The `slow` marker + +Defined in [`pyproject.toml`](../pyproject.toml) as `slow: marks tests as slow (deselect with '-m "not slow"')`. Add it when the test: + +- loads a non-cached model +- iterates exhaustively over many parameter combinations +- takes more than ~5 seconds per invocation + +Deselect with `pytest -m "not slow"`. Note: the default `make` targets do NOT filter out `slow`; the marker is for ad-hoc local runs. + +--- + +## MPS rules + +- The macOS CI job [`mps-checks`](../.github/workflows/checks.yml) sets `TRANSFORMERLENS_ALLOW_MPS=1` and runs `tests/unit`, `tests/integration`, and `tests/mps` on `macos-latest`. +- `get_device()` returns `"cpu"` unless `TRANSFORMERLENS_ALLOW_MPS=1` is set — the default protects against silent MPS divergence. +- The long `--ignore=` list in the workflow is for tests known to diverge on MPS (MoE, optimizer compatibility, KV-cache layout). It is **not** a license to add new skips — it documents existing limitations. +- [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) is the template: float32 only (MPS lacks bfloat16), TinyStories-1M only (50 MB fits the runner), `torch.mps.empty_cache()` + `gc.collect()` between tests. +- Top of the module: `pytestmark = pytest.mark.skipif(not torch.backends.mps.is_available(), reason="MPS not available")` — required on any MPS-only test. + +--- + +## Hard "don'ts" + +Repo-wide rules in [AGENTS.md §10](../AGENTS.md#10-hard-rules) apply. Specifics for this directory: + +- **No mocking model loads.** Use the session-scoped fixtures; they're cheap enough. +- **No mocking the HF Hub.** Tests hit the real hub with `enable_hf_retry()` handling 429s. +- **No platform `skipif` other than MPS.** No `skipif(sys.platform == 'win32')` or `skipif(not torch.cuda.is_available())` to dodge CI. +- **No `xfail` to dodge a failing test.** Fix the underlying bug — even if it predates your PR. +- **No copying acceptance-tier tests as templates for unit tests.** Their model fixtures will time out or OOM in the unit tier. diff --git a/transformer_lens/model_bridge/supported_architectures/AGENTS.md b/transformer_lens/model_bridge/supported_architectures/AGENTS.md new file mode 100644 index 000000000..560bb11e8 --- /dev/null +++ b/transformer_lens/model_bridge/supported_architectures/AGENTS.md @@ -0,0 +1,153 @@ +# Supported Architectures — AGENTS.md + +Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file covers conventions specific to writing TransformerBridge architecture adapters. + +For the **verification** workflow after writing an adapter, see [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md). Do not duplicate verification content here. + +--- + +## TL;DR + +- **One file per architecture family**, not per model. `llama.py` covers Llama 1 / 2 / 3 / 3.1 / 3.2; do NOT add `llama7b.py` or `llama2.py`. Family splits happen only when internal structure changes (`gemma1.py` / `gemma2.py` / `gemma3.py`, `qwen.py` / `qwen2.py` / `qwen3.py`, `mistral.py` / `mixtral.py`). +- **Two-place registration is mandatory.** Re-export in [`__init__.py`](__init__.py) AND add to `SUPPORTED_ARCHITECTURES` in [`../../factories/architecture_adapter_factory.py`](../../factories/architecture_adapter_factory.py). Missing either causes silent runtime failure at boot. +- **`self.component_mapping: ComponentMapping`** is the load-bearing contract. Maps canonical TransformerLens paths → Bridge components wrapping HF module paths. +- **Hook names are Bridge-native** (`blocks.{i}.hook_out`, `blocks.{i}.attn.q.hook_out`). Don't invent HT-style aliases here — those are registered separately in [`bridge.py`](../bridge.py). +- **Copy from the closest existing adapter** before writing from scratch. See the starter table below. + +--- + +## Minimal contract + +Every adapter inherits from `ArchitectureAdapter` ([`../architecture_adapter.py`](../architecture_adapter.py)) and declares in `__init__`: + +```python +class MyArchitectureAdapter(ArchitectureAdapter): + def __init__(self, cfg: Any) -> None: + super().__init__(cfg) + + # Required config flags — set on self.cfg, not just default_config + self.cfg.normalization_type = "RMS" # "RMS" or "LN" + self.cfg.positional_embedding_type = "rotary" # "rotary" | "standard" | "relative_positional_bias" + self.cfg.final_rms = True + self.cfg.gated_mlp = True + self.cfg.attn_only = False + + # Architecture-conditional flags + if hasattr(cfg, "n_key_value_heads") and cfg.n_key_value_heads is not None: + self.cfg.n_key_value_heads = cfg.n_key_value_heads # GQA / MQA + + # Weight reshape rules where Bridge layout differs from HF + self.weight_processing_conversions = { + **self._qkvo_weight_conversions(), + } + + # The load-bearing mapping + self.component_mapping = { + "embed": EmbeddingBridge(name="model.embed_tokens"), + "blocks": BlockBridge( + name="model.layers", + submodules={ + "ln1": RMSNormalizationBridge(name="input_layernorm", config=self.cfg), + "attn": PositionEmbeddingsAttentionBridge(...), + "ln2": RMSNormalizationBridge(name="post_attention_layernorm", config=self.cfg), + "mlp": GatedMLPBridge(...), + }, + ), + "ln_final": RMSNormalizationBridge(name="model.norm", config=self.cfg), + "unembed": UnembeddingBridge(name="lm_head"), + } +``` + +Optional overrides on the class: + +- `default_cfg` — class-level defaults merged into the runtime config +- `preprocess_weights()` — pre-load weight transforms (e.g. Gemma embedding scaling) +- `applicable_phases` — subset of `[1, 2, 3, 4, 7, 8]` (default `[1, 2, 3, 4]`); see [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md) for what each phase tests +- `supports_generation` — set `False` for encoder-only models (BERT, HuBERT) + +--- + +## Starter-adapter table + +| Target shape | Start from | Why | +|---|---|---| +| Decoder-only causal LM, RMSNorm + RoPE | [`llama.py`](llama.py) | Canonical modern shape; already handles GQA via `n_key_value_heads` | +| Decoder-only causal LM, LayerNorm + combined QKV | [`gpt2.py`](gpt2.py) | Older GPT-style; demonstrates QKV split/rearrange | +| GQA / MQA | [`llama.py`](llama.py) | Pass `n_kv_heads=` to `_qkvo_weight_conversions()` | +| RMSNorm with offset | [`gemma1.py`](gemma1.py) | Uses `rmsnorm_uses_offset = True` + `ArithmeticTensorConversion` | +| Mixture-of-experts | [`mixtral.py`](mixtral.py) | Uses `MoEBridge` with batched experts | +| Vision-language | [`llava.py`](llava.py) | Dual encoder + projection pathways | +| Encoder-decoder (T5-style) | [`t5.py`](t5.py) | Sets `supports_generation = False`, separate encoder/decoder block lists | +| State-space model | [`mamba.py`](mamba.py), [`mamba2.py`](mamba2.py) | Off the transformer path entirely | +| Encoder-only with CTC head | [`bert.py`](bert.py), [`hubert.py`](hubert.py) | `supports_generation = False` | + +--- + +## Registration steps + +After writing `myarch.py` with `MyArchitectureAdapter`: + +1. **Re-export in [`__init__.py`](__init__.py):** + ```python + from transformer_lens.model_bridge.supported_architectures.myarch import ( + MyArchitectureAdapter, + ) + ``` +2. **Register in [`../../factories/architecture_adapter_factory.py`](../../factories/architecture_adapter_factory.py)** — add to the imports block AND to `SUPPORTED_ARCHITECTURES`: + ```python + from transformer_lens.model_bridge.supported_architectures import ( + ..., + MyArchitectureAdapter, + ) + + SUPPORTED_ARCHITECTURES = { + ..., + "MyArchForCausalLM": MyArchitectureAdapter, # HF class name from config.architectures[] + } + ``` + The key must match `config.architectures[0]` from the HF model's `config.json` exactly. +3. **Add to the registry constants** in [`../../tools/model_registry/__init__.py`](../../tools/model_registry/__init__.py): + - `HF_SUPPORTED_ARCHITECTURES` set + - `CANONICAL_AUTHORS_BY_ARCH` map (foundation-trained orgs for the scraper's download-threshold bypass) + +Missing step 1 → import error at boot. Missing step 2 → `boot_transformers` raises "unsupported architecture." Missing step 3 → scraper misses canonical models. + +--- + +## Common gotchas + +| HF convention | Handled in | How | +|---|---|---| +| RoPE (rotary positional embeddings) | `llama.py`, `mistral.py`, `qwen2.py`+ | `RotaryEmbeddingBridge(name="model.rotary_emb")` + `cfg.positional_embedding_type = "rotary"` | +| GQA / MQA (`n_key_value_heads < n_heads`) | `llama.py`, `mistral.py`, `falcon.py`, `cohere.py` | Set `cfg.n_key_value_heads`; pass `n_kv_heads=` to `_qkvo_weight_conversions()` | +| RMSNorm with offset | `gemma1.py`, `gemma2.py`, `gemma3.py` | `cfg.rmsnorm_uses_offset = True` + `ArithmeticTensorConversion(ADDITION, 1.0)` | +| Custom RMSNorm eps attribute | `llama.py` | `cfg.eps_attr = "variance_epsilon"` (Llama uses this instead of `eps`) | +| Standard LayerNorm | `gpt2.py`, `bloom.py` | `cfg.normalization_type = "LN"` | +| Gated MLP (`gate_proj`, `up_proj`, `down_proj`) | `llama.py`, `mistral.py`, `gemma1.py`, `qwen2.py`+ | `GatedMLPBridge` with submodules `{gate, in, out}` | +| Combined QKV (`c_attn`) | `gpt2.py`, `bloom.py` | `QKVSplitRearrangeConversion` to split + rearrange | +| Split Q/K/V (standard) | `llama.py`, `mistral.py`, most modern | `self._qkvo_weight_conversions()` helper | +| MoE routing | `mixtral.py`, `deepseek_v3.py`, `qwen3_moe.py`, `granite_moe.py` | `MoEBridge` with `gate` + batched expert submodules | +| Missing biases (RMSNorm has no `b`; Llama has no attn/MLP biases) | `llama.py` (documented in docstring) | Weight processing handles `None` via `ProcessWeights._safe_get_tensor()` | +| KV cache layout | All (implicit) | Adapter delegates; HF module manages internally | + +--- + +## Verification + +After registering the adapter, validate end-to-end via the model registry — see [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md). The minimum check is: + +```bash +set -a; source .env; set +a +python -m transformer_lens.tools.model_registry.verify_models --architectures MyArchForCausalLM --per-arch 1 +``` + +And an integration test under [`tests/integration/`](../../../tests/integration/) asserting logit parity with HuggingFace at fp32 + eager attention. + +--- + +## Hard "don'ts" + +- **No single-model files.** One file per architecture family. +- **No `# type: ignore` on `ComponentMapping` types.** Use `isinstance` / `typing.cast` if the type system disagrees. See [AGENTS.md §10](../../../AGENTS.md#10-hard-rules). +- **No skipping the factory registration.** The adapter file existing is not enough — without the entry in `SUPPORTED_ARCHITECTURES`, `boot_transformers` cannot find it and fails at runtime. +- **No HT-style hook aliases inside adapters.** Aliases live in [`../bridge.py`](../bridge.py) via `build_alias_to_canonical_map()`; adapters declare Bridge-native names only. diff --git a/transformer_lens/tools/model_registry/AGENTS.md b/transformer_lens/tools/model_registry/AGENTS.md new file mode 100644 index 000000000..39d00c911 --- /dev/null +++ b/transformer_lens/tools/model_registry/AGENTS.md @@ -0,0 +1,92 @@ +# Model Registry — AGENTS.md + +Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file covers only conventions specific to `transformer_lens/tools/model_registry/`. + +--- + +## TL;DR + +> **Use `verify_models`, not `main_benchmark`.** Only `python -m transformer_lens.tools.model_registry.verify_models …` updates `data/supported_models.json`. `python -m transformer_lens.benchmarks.main_benchmark …` runs the same benchmark math but defaults to **not** writing the registry (needs `--update-registry`, and even with that flag misses Phase 7 / 8 scores and the resume checkpoint). **If you ran `main_benchmark`, the registry is stale — re-run via `verify_models`.** + +- **The only function that writes `supported_models.json` is `update_model_status()` in `registry_io.py`.** Anything that doesn't route through it leaves the registry stale. +- **Never edit `supported_models.json` by hand.** Always go through `update_model_status()`. +- **Never run `verify_models` in parallel.** A single CUDA/MPS device cannot hold concurrent loads (see [AGENTS.md §10](../../../AGENTS.md#10-hard-rules)). +- **HF token required.** `verify_models` calls `AutoConfig.from_pretrained(..., token=get_hf_token())` for parameter estimation; gated models fail without `HF_TOKEN`. Source `.env` first: `set -a; source .env; set +a`. + +--- + +## Canonical invocations + +| Goal | Command | +|---|---| +| Verify one specific model + update registry | `python -m transformer_lens.tools.model_registry.verify_models --model ` | +| Verify N models per architecture family | `python -m transformer_lens.tools.model_registry.verify_models --architectures --per-arch ` | +| Verify N models across all architectures | `python -m transformer_lens.tools.model_registry.verify_models --per-arch ` | +| Resume after Ctrl-C / crash | Re-run the same command with `--resume` (reads `data/verification_checkpoint.json`) | +| Re-verify already-verified models for an arch | `--reverify --architectures ` | +| See what would run without doing it | Add `--dry-run` | +| Restrict to specific phases | `--phases 1 2 3` | +| Override device / dtype / memory cap | `--device cuda --dtype float32 --max-memory 16` | + +`HFClassName` matches the strings in `HF_SUPPORTED_ARCHITECTURES` (see `__init__.py`) — e.g. `LlamaForCausalLM`, `GPT2LMHeadModel`, `Olmo2ForCausalLM`. + +--- + +## File roles + +| File | Role | +|---|---| +| `verify_models.py` | **Canonical CLI** for batch verification + registry updates | +| `registry_io.py` | I/O for `supported_models.json`; `update_model_status()` is the only writer | +| `verification.py` | `VerificationRecord` / `VerificationHistory` dataclasses (audit-trail schema) | +| `validate.py` | JSON-schema validation for registry files | +| `api.py` | Read-only programmatic access (`is_model_supported`, `get_architecture_models`, …) | +| `schemas.py` | Dataclasses for model entries, scan info, architecture stats | +| `exceptions.py` | Custom exception types | +| `alias_drift.py` | Detects when legacy `MODEL_ALIASES` and the registry have diverged | +| `discover_architectures.py` | Lightweight HF scan to enumerate architecture classes | +| `hf_scraper.py` | Full HF Hub scan; builds initial supported/unsupported model lists | +| `relevancy.py` | Filters models by download count, foundation-org provenance | +| `generate_report.py` | Renders human-readable status summaries | + +`__init__.py` exports the canonical `HF_SUPPORTED_ARCHITECTURES` set and `CANONICAL_AUTHORS_BY_ARCH` map; agents adding a new HF architecture must update both. + +--- + +## `data/verification_checkpoint.json` (gitignored) + +Resume state for long-running `verify_models` runs. Lists tested / verified / failed / skipped model IDs plus start timestamp. Behaviour: + +- On Ctrl-C, the SIGINT handler in `verify_models.py` finishes the current model, persists the checkpoint, and exits cleanly. +- Re-running with `--resume` reads it and skips already-tested models. +- Deleted automatically on a successful full run. +- Missing or corrupt → fresh run from scratch (safe, just slow). + +Agents should never edit it manually; let `verify_models` manage it. + +--- + +## Phase reference + +`verify_models` runs the model through phases and writes per-phase scores back into the registry entry. Phases (some don't apply to every architecture — see `applicable_phases` on the adapter): + +| Phase | Checks | +|---|---| +| 1 | Core forward correctness vs HuggingFace logits | +| 2 | Hook firing + gradient flow | +| 3 | Weight processing (compatibility mode, fold/centre) | +| 4 | Text-generation quality | +| 7 | Multimodal (vision/text alignment) — only Llava / Gemma3-multimodal | +| 8 | Audio — only Hubert | + +A model entry's `status` is `1` (verified) only after all applicable phases pass. + +--- + +## Hard "don'ts" + +- **No `main_benchmark` for registry updates.** It misses Phase 7 / 8, doesn't manage the checkpoint, and silently skips the registry write unless `--update-registry` is passed. +- **No parallel `verify_models` runs.** Device OOM. See [AGENTS.md §10](../../../AGENTS.md#10-hard-rules). +- **No manual edits to `data/supported_models.json`.** Always `update_model_status()`. +- **No deleting `data/verification_checkpoint.json` mid-run.** Let the SIGINT handler clean up. +- **No skipping `.env` sourcing.** Gated-model arch verification (Llama / Mistral / Gemma / gated Qwen) needs `HF_TOKEN`. From 60e30aa4711516878ece065ddaf1deaa0a8a2780 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 14:58:05 -0500 Subject: [PATCH 03/11] Optimizing rools --- .claude/commands/add-model-support.md | 9 +++++---- .claude/commands/task-complete.md | 2 +- .claude/commands/verify-model.md | 4 ++-- .cursor/rules/transformerlens.mdc | 2 -- .github/copilot-instructions.md | 20 +++++++------------- CLAUDE.md | 1 - 6 files changed, 15 insertions(+), 23 deletions(-) diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md index 411bee911..e1edd1634 100644 --- a/.claude/commands/add-model-support.md +++ b/.claude/commands/add-model-support.md @@ -5,7 +5,10 @@ argument-hint: You are adding TransformerBridge support for the HuggingFace model `$ARGUMENTS`. If `$ARGUMENTS` is empty, ask the user for the HF repo path before continuing. -Read [docs/source/content/adapter_development/adapter-creation-guide.md](../../docs/source/content/adapter_development/adapter-creation-guide.md) and [docs/source/content/adapter_development/hf-model-analysis-guide.md](../../docs/source/content/adapter_development/hf-model-analysis-guide.md) first — these are the authoritative step-by-step references. +Read these first: + +- [transformer_lens/model_bridge/supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md) — adapter contract, starter-adapter table, registration steps, common gotchas +- [docs/source/content/adapter_development/adapter-creation-guide.md](../../docs/source/content/adapter_development/adapter-creation-guide.md) and [docs/source/content/adapter_development/hf-model-analysis-guide.md](../../docs/source/content/adapter_development/hf-model-analysis-guide.md) — authoritative step-by-step references Execute this checklist, stopping at each step until it is genuinely done: @@ -19,6 +22,4 @@ Execute this checklist, stopping at each step until it is genuinely done: 6. **Add the HF repo path to the Bridge registry** under [transformer_lens/tools/model_registry/](../../transformer_lens/tools/model_registry/). Do NOT add it to [transformer_lens/supported_models.py](../../transformer_lens/supported_models.py) — that file is HookedTransformer-only. 7. **Verify** end-to-end: run `/verify-model $ARGUMENTS` (one model only — do not parallelize). 8. **Write an integration test** under [tests/integration/](../../tests/integration/) that asserts logit parity with HuggingFace. Use fp32 + eager attention to match `boot_transformers`' load configuration. Gate probes for optional structural features (`resid_mid` etc.) rather than assuming they exist. -9. **Format and typecheck**: run `/format`. Fix any new mypy errors — do not add `# type: ignore`. - -When all steps pass cleanly, run `/task-complete` to confirm the change holds up under the full local check. +9. **Run `/task-complete`** — handles comment cleanup, `/format`, and the standard test tiers (unit + docstring + acceptance + integration), looping until clean. diff --git a/.claude/commands/task-complete.md b/.claude/commands/task-complete.md index 8ba37e722..6145371dc 100644 --- a/.claude/commands/task-complete.md +++ b/.claude/commands/task-complete.md @@ -40,7 +40,7 @@ Investigate every failure. Do not dismiss any failure as "pre-existing" or "unre ### 4. Re-loop on failure -If any of steps 1–3 surface issues, fix them, then **restart from step 1** — fixes can introduce new comments, new format/type drift, or new test failures. Repeat until all three steps pass without changes. +If any step surfaces issues, fix them and restart from step 1 — fixes can reintroduce comment, format, type, or test drift. ### 5. Report diff --git a/.claude/commands/verify-model.md b/.claude/commands/verify-model.md index 5e0cd0fae..8d8280d40 100644 --- a/.claude/commands/verify-model.md +++ b/.claude/commands/verify-model.md @@ -12,6 +12,6 @@ uv run python -m transformer_lens.tools.model_registry.verify_models --model "$A If `$ARGUMENTS` is empty, ask the user which model to verify (an HF repo path like `gpt2` or `meta-llama/Llama-2-7b-hf`, or a registry alias) before running. -**Critical**: run verification for **one model at a time**. Never parallelize across models — a single CUDA/MPS device does not have memory for concurrent loads. See [AGENTS.md §10](../../AGENTS.md#10-hard-rules). +**Use `verify_models`, never `main_benchmark`** — only `verify_models` writes `data/supported_models.json`. See [transformer_lens/tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md) for the canonical-invocations table, file roles, and the resume/checkpoint mechanism. -Report the actual benchmark output. If any phase fails or shows unexpected drift, investigate before declaring done. Do not claim drift is "fp noise" without empirical evidence. +Run one model at a time — concurrent loads OOM a single device. Report the actual benchmark output and investigate any failure or drift per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). diff --git a/.cursor/rules/transformerlens.mdc b/.cursor/rules/transformerlens.mdc index 11fad36a4..4fd7d6ae4 100644 --- a/.cursor/rules/transformerlens.mdc +++ b/.cursor/rules/transformerlens.mdc @@ -13,5 +13,3 @@ Quick reminders that override common defaults: - No pre-commit hook is installed. Run `make format` and `uv run mypy .` manually before push. - Source `.env` (e.g. `set -a; source .env; set +a`) before any HuggingFace-Hub-hitting command. - Never add `# type: ignore`, never dismiss failing tests as "pre-existing", never add platform skips to dodge CI, never claim drift is "fp noise" without empirical evidence. - -See `AGENTS.md` for the full context behind each of these rules. diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index b79dc0841..bcb46e53e 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -1,14 +1,12 @@ # GitHub Copilot Instructions -**Read [AGENTS.md](../AGENTS.md) at the repo root for the full set of project conventions, quickstart commands, repo layout, and hard rules.** This file mirrors the top-of-mind essentials so Copilot Chat surfaces them inline. +**Read [AGENTS.md](../AGENTS.md) at the repo root for the full set of project conventions, quickstart commands, repo layout, and hard rules — start with its TL;DR section.** This file inlines the highest-friction defaults Copilot most often gets wrong. ## Top rules to remember -1. **Use `uv`, not `pip` or `poetry`.** Install with `uv sync`. Run commands with `uv run …` or via `make` targets. -2. **Two parallel systems**: `HookedTransformer` (legacy, in maintenance) and `TransformerBridge` (v3, default for new work). If you change behavior on HookedTransformer that has an equivalent in TransformerBridge, mirror the change to TransformerBridge in the same PR. New TransformerBridge features do not need to be mirrored in HookedTransformer. The HT model registry [transformer_lens/supported_models.py](../transformer_lens/supported_models.py) is **HT-only** — Bridge-only models go in the Bridge registry under [transformer_lens/tools/model_registry/](../transformer_lens/tools/model_registry/). -3. **Base PRs against `dev`**, not `main`. PRs targeting `main` are done by Maintainers only – Request permission before basing off `main`. -4. **No pre-commit hook is installed.** Run `make format` and `uv run mypy .` manually before pushing — CI will fail otherwise. -5. **Source `.env` before any HF-Hub command** (`set -a; source .env; set +a`). `HF_TOKEN` is required for gated models (Llama, Mistral, Gemma, gated Qwen variants). +1. **Use `uv`, not `pip` or `poetry`.** `uv sync` to install; `uv run ` or a `make` target to run anything. +2. **Mirror `HookedTransformer` → `TransformerBridge`** in the same PR when behaviour exists in both. The HT registry [`transformer_lens/supported_models.py`](../transformer_lens/supported_models.py) is HT-only — Bridge-only models go in the Bridge registry under [`transformer_lens/tools/model_registry/`](../transformer_lens/tools/model_registry/). +3. **Base PRs against `dev`**, not `main`. PRs to `main` are maintainer-only. ## Common commands @@ -20,13 +18,9 @@ uv run mypy . # type check uv run docs-hot-reload # live docs preview ``` -## Hard "do not"s +## Copilot-specific anti-patterns +- Don't add `# type: ignore`. Prefer `isinstance` / `typing.cast`. - Don't dismiss failing tests as "pre-existing" — investigate every failure. -- Don't add `# type: ignore`. Prefer `isinstance` / `cast`. -- Don't run model verification or benchmarks in parallel (device OOM). -- Don't claim drift is "fp noise" without empirical evidence. -- Don't add skips / `xfail` to make failing CI green. -- Don't reference plan-file details in source comments or docstrings. -See [AGENTS.md](../AGENTS.md) for the full set with rationale. +The full set of hard rules (numerics, parallel benchmarks, plan-file references, etc.) lives in [AGENTS.md §10](../AGENTS.md#10-hard-rules). diff --git a/CLAUDE.md b/CLAUDE.md index 5f3ce09dd..ee4eb882d 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -20,4 +20,3 @@ - The hard rules in [AGENTS.md §10](AGENTS.md#10-hard-rules) reflect rules the maintainer has repeatedly enforced. Treat them as load-bearing. - The mirroring rule in [AGENTS.md §2](AGENTS.md#2-two-systems-live-in-this-repo) — `HookedTransformer` to `TransformerBridge` — is the single most common source of "you forgot to mirror this to TransformerBridge" PR feedback. -- For tasks spanning more than three steps, use TodoWrite to track progress and finish with `/task-complete` before declaring done. From f95ced8d1ed7403c98dd30da183b86bbc6aabd11 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 15:13:25 -0500 Subject: [PATCH 04/11] Adjustments to model support based on an initial end-to-end test --- .claude/commands/add-model-support.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md index e1edd1634..f43041603 100644 --- a/.claude/commands/add-model-support.md +++ b/.claude/commands/add-model-support.md @@ -13,13 +13,15 @@ Read these first: Execute this checklist, stopping at each step until it is genuinely done: 1. Confirm the model is not already supported by an existing architecture adapter. Short circuit and direct the user to the adapter if it already exists. -2. **Analyze the HF model**: read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output head layouts. Note any quirks (rotary embeddings, GQA/MQA, RMSNorm vs LayerNorm, biases, attention sinks, MoE routing). -3. **Pick a starting adapter** from [transformer_lens/model_bridge/supported_architectures/](../../transformer_lens/model_bridge/supported_architectures/) closest to the target architecture. Copy it (or [docs/source/_static/adapter-template.py](../../docs/source/_static/adapter-template.py)) into a new file named after the architecture. +2. **Analyze the HF model**: read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output head layouts. Note any quirks (rotary embeddings, GQA/MQA, RMSNorm vs LayerNorm, biases, attention sinks, MoE routing). Then list every non-standard HF config attribute that the base adapter machinery doesn't already map — e.g. `final_logit_softcapping` / `attn_logit_softcapping`, `query_pre_attn_scalar`, `sliding_window`, `layer_types`, custom `eps_attr` names — and decide for each whether it needs to be surfaced on `self.cfg`. HF-raw attrs are invisible to TL-side consumers unless explicitly propagated. +3. **Pick a starting adapter** from [transformer_lens/model_bridge/supported_architectures/](../../transformer_lens/model_bridge/supported_architectures/) closest to the target architecture. Copy it (or [docs/source/_static/adapter-template.py](../../docs/source/_static/adapter-template.py)) into a new file named after the architecture. **Do NOT inherit `default_prepend_bos` from the starter** — that flag reflects the target's tokenizer policy, not its architecture, and same-family models often differ. Check the target's tokenizer config or chat-template behavior before setting it; if unsure, leave it unset and let the framework default apply. 4. **Fill in component mappings** so each HF module path resolves to a canonical TransformerLens Bridge name. Hook names should be Bridge-native (`blocks.{i}.hook_out` etc.), not HookedTransformer aliases. -5. **Register the adapter** in BOTH: - - [transformer_lens/model_bridge/supported_architectures/__init__.py](../../transformer_lens/model_bridge/supported_architectures/__init__.py) - - [transformer_lens/factories/architecture_adapter_factory.py](../../transformer_lens/factories/architecture_adapter_factory.py) -6. **Add the HF repo path to the Bridge registry** under [transformer_lens/tools/model_registry/](../../transformer_lens/tools/model_registry/). Do NOT add it to [transformer_lens/supported_models.py](../../transformer_lens/supported_models.py) — that file is HookedTransformer-only. +5. **Register the adapter** in ALL of: + - [transformer_lens/model_bridge/supported_architectures/__init__.py](../../transformer_lens/model_bridge/supported_architectures/__init__.py) — import + `__all__` entry + - [transformer_lens/factories/architecture_adapter_factory.py](../../transformer_lens/factories/architecture_adapter_factory.py) — import + `SUPPORTED_ARCHITECTURES` entry (key must match `config.architectures[0]` exactly) + - [transformer_lens/tools/model_registry/\_\_init\_\_.py](../../transformer_lens/tools/model_registry/__init__.py) — `HF_SUPPORTED_ARCHITECTURES` set entry + `CANONICAL_AUTHORS_BY_ARCH` map entry + - [transformer_lens/tools/model_registry/generate_report.py](../../transformer_lens/tools/model_registry/generate_report.py) — one-line entry in `ARCHITECTURE_DESCRIPTIONS` so the generated docs table covers the new architecture +6. **Add the HF repo path** to [transformer_lens/tools/model_registry/data/supported_models.json](../../transformer_lens/tools/model_registry/data/supported_models.json) so `/verify-model` can resolve it. Do NOT add it to [transformer_lens/supported_models.py](../../transformer_lens/supported_models.py) — that file is HookedTransformer-only. After adding the user-provided repo, ask whether to also add the canonical sibling variants under the same org (the values from `CANONICAL_AUTHORS_BY_ARCH[]`) — e.g. when adding `google/gemma-2-2b`, prompt about `google/gemma-2-2b-it`, `google/gemma-2-9b`, `google/gemma-2-9b-it`, `google/gemma-2-27b`, `google/gemma-2-27b-it`. The HF scraper picks these up eventually, but explicit entries unblock `/verify-model` against the siblings now. 7. **Verify** end-to-end: run `/verify-model $ARGUMENTS` (one model only — do not parallelize). 8. **Write an integration test** under [tests/integration/](../../tests/integration/) that asserts logit parity with HuggingFace. Use fp32 + eager attention to match `boot_transformers`' load configuration. Gate probes for optional structural features (`resid_mid` etc.) rather than assuming they exist. 9. **Run `/task-complete`** — handles comment cleanup, `/format`, and the standard test tiers (unit + docstring + acceptance + integration), looping until clean. From 10d44e9283d83f8b75b58ec2d48b690a89269de8 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 17:09:24 -0500 Subject: [PATCH 05/11] Review changes --- .claude/commands/add-model-support.md | 89 +++++++-- .claude/commands/task-complete.md | 7 +- .claude/commands/verify-model.md | 15 +- .cursor/rules/transformerlens.mdc | 6 + .env.example | 6 +- AGENTS.md | 85 +++++---- CLAUDE.md | 6 +- docs/source/content/compatibility_mode.md | 94 +++++++++ .../content/debugging_numerical_divergence.md | 111 +++++++++++ makefile | 6 + tests/AGENTS.md | 19 +- tests/QUARANTINES.md | 124 ++++++++++++ .../supported_architectures/AGENTS.md | 180 ++++++++++++++++-- .../tools/model_registry/AGENTS.md | 60 +++++- 14 files changed, 721 insertions(+), 87 deletions(-) create mode 100644 docs/source/content/compatibility_mode.md create mode 100644 docs/source/content/debugging_numerical_divergence.md create mode 100644 tests/QUARANTINES.md diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md index f43041603..7eff3199d 100644 --- a/.claude/commands/add-model-support.md +++ b/.claude/commands/add-model-support.md @@ -5,23 +5,84 @@ argument-hint: You are adding TransformerBridge support for the HuggingFace model `$ARGUMENTS`. If `$ARGUMENTS` is empty, ask the user for the HF repo path before continuing. -Read these first: +**Read first:** -- [transformer_lens/model_bridge/supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md) — adapter contract, starter-adapter table, registration steps, common gotchas -- [docs/source/content/adapter_development/adapter-creation-guide.md](../../docs/source/content/adapter_development/adapter-creation-guide.md) and [docs/source/content/adapter_development/hf-model-analysis-guide.md](../../docs/source/content/adapter_development/hf-model-analysis-guide.md) — authoritative step-by-step references +- [transformer_lens/model_bridge/supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md) — the adapter contract, starter-adapter table, **4-place registration steps**, config-attr propagation rules, the `default_prepend_bos` trap, common gotchas, and how to add the model entry to `supported_models.json`. +- [transformer_lens/tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md) — the verification workflow and the `verify_models` vs `main_benchmark` trap. +- [docs/source/content/adapter_development/adapter-creation-guide.md](../../docs/source/content/adapter_development/adapter-creation-guide.md) and [docs/source/content/adapter_development/hf-model-analysis-guide.md](../../docs/source/content/adapter_development/hf-model-analysis-guide.md) — authoritative step-by-step references. Execute this checklist, stopping at each step until it is genuinely done: -1. Confirm the model is not already supported by an existing architecture adapter. Short circuit and direct the user to the adapter if it already exists. -2. **Analyze the HF model**: read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output head layouts. Note any quirks (rotary embeddings, GQA/MQA, RMSNorm vs LayerNorm, biases, attention sinks, MoE routing). Then list every non-standard HF config attribute that the base adapter machinery doesn't already map — e.g. `final_logit_softcapping` / `attn_logit_softcapping`, `query_pre_attn_scalar`, `sliding_window`, `layer_types`, custom `eps_attr` names — and decide for each whether it needs to be surfaced on `self.cfg`. HF-raw attrs are invisible to TL-side consumers unless explicitly propagated. -3. **Pick a starting adapter** from [transformer_lens/model_bridge/supported_architectures/](../../transformer_lens/model_bridge/supported_architectures/) closest to the target architecture. Copy it (or [docs/source/_static/adapter-template.py](../../docs/source/_static/adapter-template.py)) into a new file named after the architecture. **Do NOT inherit `default_prepend_bos` from the starter** — that flag reflects the target's tokenizer policy, not its architecture, and same-family models often differ. Check the target's tokenizer config or chat-template behavior before setting it; if unsure, leave it unset and let the framework default apply. -4. **Fill in component mappings** so each HF module path resolves to a canonical TransformerLens Bridge name. Hook names should be Bridge-native (`blocks.{i}.hook_out` etc.), not HookedTransformer aliases. -5. **Register the adapter** in ALL of: - - [transformer_lens/model_bridge/supported_architectures/__init__.py](../../transformer_lens/model_bridge/supported_architectures/__init__.py) — import + `__all__` entry - - [transformer_lens/factories/architecture_adapter_factory.py](../../transformer_lens/factories/architecture_adapter_factory.py) — import + `SUPPORTED_ARCHITECTURES` entry (key must match `config.architectures[0]` exactly) - - [transformer_lens/tools/model_registry/\_\_init\_\_.py](../../transformer_lens/tools/model_registry/__init__.py) — `HF_SUPPORTED_ARCHITECTURES` set entry + `CANONICAL_AUTHORS_BY_ARCH` map entry - - [transformer_lens/tools/model_registry/generate_report.py](../../transformer_lens/tools/model_registry/generate_report.py) — one-line entry in `ARCHITECTURE_DESCRIPTIONS` so the generated docs table covers the new architecture -6. **Add the HF repo path** to [transformer_lens/tools/model_registry/data/supported_models.json](../../transformer_lens/tools/model_registry/data/supported_models.json) so `/verify-model` can resolve it. Do NOT add it to [transformer_lens/supported_models.py](../../transformer_lens/supported_models.py) — that file is HookedTransformer-only. After adding the user-provided repo, ask whether to also add the canonical sibling variants under the same org (the values from `CANONICAL_AUTHORS_BY_ARCH[]`) — e.g. when adding `google/gemma-2-2b`, prompt about `google/gemma-2-2b-it`, `google/gemma-2-9b`, `google/gemma-2-9b-it`, `google/gemma-2-27b`, `google/gemma-2-27b-it`. The HF scraper picks these up eventually, but explicit entries unblock `/verify-model` against the siblings now. +1. **Check registry state and decide whether to verify.** + + Determine the current state: + - **Architecture supported?** Check `SUPPORTED_ARCHITECTURES` in [`transformer_lens/factories/architecture_adapter_factory.py`](../../transformer_lens/factories/architecture_adapter_factory.py). + - **Model in registry?** Check whether `$ARGUMENTS` appears in [`transformer_lens/tools/model_registry/data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json), and if so what `status` it has (0=unverified, 1=verified, 2=skipped, 3=failed). + + Then branch: + + - **Architecture supported AND model already in registry with `status == 1` (verified)** → the model is already verified end-to-end. Ask the user what symptom they're seeing — this is most likely a bug-report path, not an add-support path. Stop the workflow here. + - **Architecture supported, model in registry with `status != 1`** (0 unverified, 2 skipped, 3 failed) → proceed to **Confirm before verification** below. If `status == 3`, read the existing `note` field first to see what failure mode you may be about to re-trigger. + - **Architecture supported, model NOT in registry** → add a registry entry per [supported_architectures/AGENTS.md §"Adding the HF repo to the registry"](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry) with `status: 0` and null phase scores, then proceed to **Confirm before verification**. + - **Architecture NOT supported** → skip to step 2 below (the full adapter-authoring workflow). + + ### Confirm before verification + + **Always** ask the user before running verification — even for small models. Verification is non-trivial and resource-bound. To give them an informed decision: + + 1. **Get a resource projection** without loading the model: + ``` + set -a; source .env; set +a + uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" --dry-run + ``` + Capture the projected memory and parameter-count estimate from the dry-run output. + 2. **Present the user with the cost of verification**: + - **Model**: `` (HF repo: `https://huggingface.co/`) + - **Architecture**: `` (from `config.architectures[0]`) + - **Estimated parameters**: `` (from the dry-run) + - **Projected memory**: ` GB` (from the dry-run) — they need at least this much free on the target device + - **Expected runtime**: typically 30 s – 2 min for sub-1B models, 2–15 min for 1B–7B, 15+ min for 7B+ or multimodal + - **What verification does**: loads the model, runs Phases 1–4 (forward correctness vs HF, hook firing + gradients, weight processing, generation quality), and on success updates `supported_models.json` via `update_model_status()` + - **Token requirement**: whether `HF_TOKEN` is needed (gated repos: Llama, Mistral, Gemma, gated Qwen variants) + 3. **Ask explicitly**: "Do you want me to run verification on this machine? (Y/N)" + + **If the user confirms** → run `/verify-model $ARGUMENTS`. On pass, the registry now reflects verified status and you're done — no new code needed. On fail, the failure mode tells you whether to fix the existing adapter (per-sibling regression — see [debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md)) or escalate. + + **If the user rejects** → direct them to file a tracking issue so a maintainer with appropriate hardware can run the verification: + ``` + gh issue create \ + --title "Verify model support: " \ + --body "$(cat <<'EOF' + ## Model + - Repo: https://huggingface.co/ + - Architecture: + - Estimated parameters: + - Projected memory: GB + - Gated: yes / no (HF_TOKEN required: yes / no) + + ## Registry state + - Architecture adapter exists: yes + - In supported_models.json: yes (status: ) / no — entry added in this issue's PR + + ## Motivation + + + ## Next step + Run on appropriate hardware and report results: + \`\`\` + set -a; source .env; set +a + uv run python -m transformer_lens.tools.model_registry.verify_models --model + \`\`\` + EOF + )" + ``` + If `gh` isn't installed, link them to `https://github.com/TransformerLensOrg/TransformerLens/issues/new` with the same body content. Then stop the workflow — the registry-entry add (if you made one above) can either land in its own small PR or be folded into the maintainer's verification PR; either is fine. + +2. **Analyze the HF model.** Read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output-head layouts. Run through the [Config-attr propagation table in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation) for non-standard attributes (`final_logit_softcapping`, `sliding_window`, etc.) and decide which need to surface on `self.cfg`. +3. **Pick a starting adapter** using the [starter-adapter table](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#starter-adapter-table). Copy it into [`transformer_lens/model_bridge/supported_architectures/`](../../transformer_lens/model_bridge/supported_architectures/) as `.py`. **Watch tokenizer-policy flags** — see [Tokenizer policy in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy) (`default_prepend_bos`, padding side, EOS handling, chat-template wiring are all per-model, not per-architecture). +4. **Fill in `self.component_mapping`** so each HF module path resolves to a canonical Bridge name. Hook names are Bridge-native. +5. **Register in all four sites** per the [registration steps in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#registration-steps). +6. **Add the HF repo entry** to [`data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json) per the [Adding the HF repo to the registry section](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry). Then ask the user whether to also add the canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]`. 7. **Verify** end-to-end: run `/verify-model $ARGUMENTS` (one model only — do not parallelize). 8. **Write an integration test** under [tests/integration/](../../tests/integration/) that asserts logit parity with HuggingFace. Use fp32 + eager attention to match `boot_transformers`' load configuration. Gate probes for optional structural features (`resid_mid` etc.) rather than assuming they exist. -9. **Run `/task-complete`** — handles comment cleanup, `/format`, and the standard test tiers (unit + docstring + acceptance + integration), looping until clean. +9. **Run `/task-complete`** — handles comment cleanup, `/format`, and the standard test tiers, looping until clean. diff --git a/.claude/commands/task-complete.md b/.claude/commands/task-complete.md index 6145371dc..9e36a5289 100644 --- a/.claude/commands/task-complete.md +++ b/.claude/commands/task-complete.md @@ -28,13 +28,10 @@ If mypy reports new errors, fix the underlying typing issue. Do not add `# type: ``` set -a; source .env; set +a -make unit-test -make docstring-test -make acceptance-test -make integration-test +make test-pr ``` -These are the tiers that gate PR review for almost every change. Notebook and benchmark suites are intentionally skipped — they are slow, hit gated models, and are not required to land typical PRs (CI runs them separately). If your change specifically touches a notebook or a benchmark, run that file directly (`pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/.ipynb` or `make benchmark-test`) in addition. +`make test-pr` runs unit + docstring + acceptance + integration — the tiers that gate PR review for almost every change. Notebook and benchmark suites are intentionally skipped (slow, gated models, CI runs them separately). If your change specifically touched a notebook or a benchmark, also run that file directly (`pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/.ipynb` or `make benchmark-test`). Investigate every failure. Do not dismiss any failure as "pre-existing" or "unrelated" — fix the underlying issue, even if it predates this task (see [AGENTS.md §10](../../AGENTS.md#10-hard-rules)). Do not add platform skips or `xfail` markers to dodge a failing test. diff --git a/.claude/commands/verify-model.md b/.claude/commands/verify-model.md index 8d8280d40..1198e2bb9 100644 --- a/.claude/commands/verify-model.md +++ b/.claude/commands/verify-model.md @@ -12,6 +12,19 @@ uv run python -m transformer_lens.tools.model_registry.verify_models --model "$A If `$ARGUMENTS` is empty, ask the user which model to verify (an HF repo path like `gpt2` or `meta-llama/Llama-2-7b-hf`, or a registry alias) before running. +**Optional flags for a single-model run** (full reference: [tools/model_registry/AGENTS.md §Flag reference](../../transformer_lens/tools/model_registry/AGENTS.md#flag-reference)): + +- `--device cpu|cuda|mps` — override automatic device selection +- `--dtype float32|bfloat16` — override automatic dtype selection +- `--max-memory ` — skip if parameter estimate exceeds; on a 24 GB GPU, `--max-memory 16` leaves headroom for activations +- `--phases 1 2 3` — restrict to a subset (Phase 4 generation is the slowest; restrict when debugging Phase 1 forward parity) +- `--dry-run` — project memory + parameter count without loading the model. Use this before any expensive run, and as the data source for the confirmation prompt in [`/add-model-support`](add-model-support.md). +- `--no-hf-reference` / `--no-ht-reference` — skip the HF / HT comparison passes (faster, lower confidence) +- `--reverify` — re-test a model already at `status == 1` +- `--retry-failed` — re-test a model at `status == 3` (read its existing `note` first to see the prior failure mode) + +Batch flags (`--architectures`, `--per-arch`, `--limit`, `--resume`) are not relevant for `--model ` — use the batch invocation in [tools/model_registry/AGENTS.md §Canonical invocations](../../transformer_lens/tools/model_registry/AGENTS.md#canonical-invocations) if you need them. + **Use `verify_models`, never `main_benchmark`** — only `verify_models` writes `data/supported_models.json`. See [transformer_lens/tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md) for the canonical-invocations table, file roles, and the resume/checkpoint mechanism. -Run one model at a time — concurrent loads OOM a single device. Report the actual benchmark output and investigate any failure or drift per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). +Run one model at a time — concurrent loads OOM a single device. Report the actual benchmark output and investigate any failure or drift per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). For phase failures or logit-parity drift, the bisection workflow lives in [docs/source/content/debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md). diff --git a/.cursor/rules/transformerlens.mdc b/.cursor/rules/transformerlens.mdc index 4fd7d6ae4..845d076da 100644 --- a/.cursor/rules/transformerlens.mdc +++ b/.cursor/rules/transformerlens.mdc @@ -5,6 +5,12 @@ alwaysApply: true Read `AGENTS.md` at the repo root before doing any work. It is the single source of truth for project conventions, quickstart commands, repo layout, hook-naming rules, the HookedTransformer ↔ TransformerBridge mirroring rule, PR conventions, and hard rules. +Sub-folder `AGENTS.md` files apply when you're working in those directories — read them too: + +- `tests/AGENTS.md` — tier placement, conftest hierarchy, MPS rules +- `transformer_lens/model_bridge/supported_architectures/AGENTS.md` — adapter contract, starter-adapter table, 4-place registration +- `transformer_lens/tools/model_registry/AGENTS.md` — `verify_models` workflow, the `main_benchmark` trap + Quick reminders that override common defaults: - Use `uv`, not `pip` or `poetry`. Install with `uv sync`; run commands with `uv run …` or `make` targets. diff --git a/.env.example b/.env.example index 518673219..19a8b9a2e 100644 --- a/.env.example +++ b/.env.example @@ -6,8 +6,10 @@ # gated Qwen variants, etc. Create at https://huggingface.co/settings/tokens HF_TOKEN= -# Optional: retry HuggingFace Hub 429s in non-pytest invocations (CI sets this -# automatically in checks.yml). Pytest enables retries via tests/conftest.py. +# Optional: retry HuggingFace Hub 429s. Only matters for ad-hoc non-pytest scripts +# (docs build, demo notebooks, one-off `python -m ...` invocations). The pytest +# suite already enables retries unconditionally via tests/conftest.py's +# _enable_hf_retry_for_tests fixture, and CI sets this env var in checks.yml. # TRANSFORMERLENS_HF_RETRY=1 # Optional: allow MPS device on macOS (off by default to avoid hard-to-debug diff --git a/AGENTS.md b/AGENTS.md index 5a297172c..cf9ad43af 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -24,34 +24,30 @@ Sub-folder rules: [tests/AGENTS.md](tests/AGENTS.md) · [supported_architectures ## 2. Two systems live in this repo -- **`TransformerBridge` (v3)** — default for new work. Lives in [transformer_lens/model_bridge/](transformer_lens/model_bridge/). -- **`HookedTransformer`** — legacy, maintenance mode, deprecated in 3.0. +| System | Status | Lives in | Numerics | Registry | +|---|---|---|---|---| +| **`TransformerBridge`** | v3 — default for new work | [transformer_lens/model_bridge/](transformer_lens/model_bridge/) | Raw HF weights by default; `bridge.enable_compatibility_mode()` for HT-equivalent | [transformer_lens/tools/model_registry/data/supported_models.json](transformer_lens/tools/model_registry/data/supported_models.json) | +| **`HookedTransformer`** | Legacy, maintenance mode, deprecated in 3.0 | [transformer_lens/HookedTransformer.py](transformer_lens/HookedTransformer.py) + [transformer_lens/components/](transformer_lens/components/) | Folds LayerNorm + centres weights → does NOT match HF | [transformer_lens/supported_models.py](transformer_lens/supported_models.py) (**HT-only**) | -### `TransformerBridge` (v3, recommended for new work) - -- Lives in [transformer_lens/model_bridge/](transformer_lens/model_bridge/). -- Architecture-adapter pattern: each HF architecture has an adapter file in [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) that maps HF module paths to canonical TransformerLens names. -- Preserves raw HF weights by default (logits match HF). Call `bridge.enable_compatibility_mode()` after booting for HookedTransformer-equivalent numerics. -- Hook names are architecture-native (e.g. `blocks.{i}.hook_out`), with aliases provided for HT-style names where applicable. - -### `HookedTransformer` (legacy, maintenance mode) - -- Lives in [transformer_lens/HookedTransformer.py](transformer_lens/HookedTransformer.py) and [transformer_lens/components/](transformer_lens/components/). -- Folds LayerNorm and centers weights by default — useful for circuit analysis but means weights and logits **do not** match HF. -- Model registry is [transformer_lens/supported_models.py](transformer_lens/supported_models.py) — **HT-only**. Bridge-only models do not belong here; they live in the Bridge registry data file under [transformer_lens/tools/model_registry/](transformer_lens/tools/model_registry/). -- Deprecated as of TransformerLens 3.0; will be removed in the next major version. Still fully functional via the compatibility layer. +Bridge architecture-adapter pattern: each HF architecture has one file in [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) mapping HF module paths to canonical TransformerLens names. Bridge hook names are architecture-native (e.g. `blocks.{i}.hook_out`); HT-style aliases are registered separately in [transformer_lens/model_bridge/bridge.py](transformer_lens/model_bridge/bridge.py). ### Mirroring rule -If you change behavior in `HookedTransformer` that has a counterpart in `TransformerBridge`, update both in the same PR. They are parallel implementations of the same surface; drift between them is a recurring source of bugs. Conversely, `supported_models.py` is HT-only — do not add Bridge-only models there. +If you change behaviour in `HookedTransformer` that has a counterpart in `TransformerBridge`, update both in the same PR. Parallel implementations; drift between them is a recurring source of bugs. The HT-only [supported_models.py](transformer_lens/supported_models.py) is NOT the place to add Bridge-only models — those go in the Bridge registry data file. ## 3. Quickstart ```bash +# Bootstrap uv if missing +curl -LsSf https://astral.sh/uv/install.sh | sh + # Install — uv only (not pip, not poetry) uv sync source .venv/bin/activate +# First-time only: create .env from the template, then fill in HF_TOKEN +cp .env.example .env + # Source HF token before any HF-Hub-hitting command set -a; source .env; set +a @@ -61,7 +57,8 @@ make integration-test # cross-component make acceptance-test # end-to-end make docstring-test # doctest + doctest-plus make notebook-test # slow; subset run in CI -make test # all of the above +make test-pr # unit + docstring + acceptance + integration (PR-review surface) +make test # everything (long; includes benchmarks + notebooks) # Format + typecheck — no pre-commit hook; run manually before push make format # pycln + isort + black @@ -113,18 +110,7 @@ When raw-HF-forward drivers compare against `boot_transformers`, match its load ## 6. Adding a model -TransformerBridge adapters are written per architecture family, not per individual model. Adding `gpt2` registers all GPT-2 variants; you generally won't add a new file for a sibling checkpoint of an already-supported architecture. - -Start with [docs/source/content/adapter_development/](docs/source/content/adapter_development/) — there are step-by-step guides covering the full workflow. Summary: - -1. Read the HF model's `config.json` and source to identify components (embed, attention, MLP, normalization, output head). -2. Copy [docs/source/_static/adapter-template.py](docs/source/_static/adapter-template.py) into [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) as `.py`. -3. Fill in component mappings. -4. Register in both [transformer_lens/model_bridge/supported_architectures/__init__.py](transformer_lens/model_bridge/supported_architectures/__init__.py) and [transformer_lens/factories/architecture_adapter_factory.py](transformer_lens/factories/architecture_adapter_factory.py). -5. Add the HF repo path to the Bridge registry (model registry data file) and run `verify_models.py` against the new architecture. -6. Add an integration test under [tests/integration/](tests/integration/) that asserts logit parity with HF. - -The existing adapters under [supported_architectures/](transformer_lens/model_bridge/supported_architectures/) are the best references for tricky cases. +Adapters are written **per architecture family**, not per individual model — adding `gpt2` registers all GPT-2 variants. Full workflow (starter-adapter table, 4-place registration, common gotchas, anti-patterns): **[supported_architectures/AGENTS.md](transformer_lens/model_bridge/supported_architectures/AGENTS.md)**. Verification flow: **[tools/model_registry/AGENTS.md](transformer_lens/tools/model_registry/AGENTS.md)**. Claude Code users: invoke `/add-model-support `. ## 7. Prioritization @@ -139,6 +125,7 @@ When picking solutions to any problem, prioritize by **research impact**, not im - **New branches must track their own remote**, not the source branch's remote. When you `git push -u`, push to the new branch's namespace, not the branch you forked from. - **PR template**: [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md). No conventional-commits enforcement. - **No pre-commit hook is installed.** Run `make format` and `uv run mypy .` manually before pushing; CI will fail otherwise. +- **Changelog**: no per-PR changelog file. User-facing changes go in the PR description and are rolled up into the next release essay in [docs/source/content/news/](docs/source/content/news/) (`release-2.0.md`, `release-3.0.md`, … — one file per major version). If your PR introduces a breaking change or a notable user-facing feature, note it explicitly in the PR description so the next release essay picks it up. ## 9. CI gates a PR must pass @@ -193,14 +180,48 @@ Before declaring a task complete: | Task type | Must do | |---|---| | Bug fix | Reproduce in a test, fix, confirm test passes, `make format`, `uv run mypy .` | -| New adapter | Adapter file + factory registration + Bridge registry entry + integration test asserting HF logit parity + `verify_models.py` run for the new architecture | +| New adapter | Adapter file + 4-place registration + integration test asserting HF logit parity + `verify_models` run (see [supported_architectures/AGENTS.md](transformer_lens/model_bridge/supported_architectures/AGENTS.md)) | | Docs change | `uv run build-docs` succeeds; branch name contains `docs` so the docs job triggers in CI | | Notebook change | `pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/.ipynb` passes locally | -| Anything else | `make unit-test` + `make format` + `uv run mypy .` clean before push | +| Anything else | `make format` + `uv run mypy .` + `make test-pr` clean before push | -If you're using Claude Code, the `/task-complete` slash command in [.claude/commands/](.claude/commands/) automates the last row. +If you're using Claude Code, the `/task-complete` slash command in [.claude/commands/](.claude/commands/) automates the last row (`make test-pr` = unit + docstring + acceptance + integration). ## 12. Pointers for further reading - [docs/source/content/migrating_to_v3.md](docs/source/content/migrating_to_v3.md) — HT → Bridge migration recipes - [docs/source/content/adapter_development/](docs/source/content/adapter_development/) — adapter authoring deep dive +- [docs/source/content/compatibility_mode.md](docs/source/content/compatibility_mode.md) — when to call `bridge.enable_compatibility_mode()`, what each flag does, four-quadrant test matrix +- [docs/source/content/debugging_numerical_divergence.md](docs/source/content/debugging_numerical_divergence.md) — bisection workflow for HT-vs-Bridge / Bridge-vs-HF logit drift +- [tests/QUARANTINES.md](tests/QUARANTINES.md) — inventory of every `skip` / `xfail` and when each can be un-skipped + +## 13. Local-only conventions + +Two gitignored paths exist for ephemeral work; use them so your work-in-progress doesn't show up in `git status`: + +- **`transformer_lens/scratch.py`** — sibling-of-package scratch file for one-off bisection scripts and ad-hoc imports. Already in `.gitignore`; never commit. +- **`.adapter-workspace/`** — directory for adapter WIP (notes, config dumps, repro scripts) while you're iterating. Already in `.gitignore`. + +Both are first-class conventions, not personal preferences — the gitignore entries are checked in. New contributors should know about them so they don't reinvent the convention or accidentally commit experimental code. + +## 14. Upstream dependency pins + +Load-bearing pins live in [pyproject.toml](pyproject.toml): + +| Pin | Where | Why it matters | +|---|---|---| +| `transformers>=5.4.0` | `[project] dependencies` | The Bridge adapter contract is written against HF module layouts; every minor HF release can break adapter component-mappings. Bumping is a real test pass. | +| `torch>=2.6` | `[project] dependencies` | Hook system relies on PyTorch's forward / backward hook semantics; major torch bumps occasionally change ordering. | +| `accelerate>=0.23.0` | `[project] dependencies` | Required for Llama-family loading. | +| `numpy>=1.24` / `>=1.26` | `[project] dependencies` (python-version-conditional) | Doctest float formatting can drift across NumPy versions. | +| `isort==5.8.0` | `[dependency-groups] dev` (exact) | Format check pins to exactly this version; a bump flips the formatting of every file. | + +**Bumping upstream pins:** + +1. Bump the pin in `pyproject.toml`, refresh `uv.lock` (`uv lock`). +2. Run `make test-pr` locally; expect adapter / hook tests to surface real breakages. +3. For each break, fix the adapter or component — do not pin around the regression. +4. Run `uv run python -m transformer_lens.tools.model_registry.verify_models --architectures ` to catch numerical regressions the unit tests miss. +5. Land in a focused PR — pin bumps are reviewed separately from feature work because the blast radius is wide. + +Specific to HF: when a `transformers` minor bump introduces a new architecture or renames an existing one, check the impact on `_HF_PASSTHROUGH_ATTRS` in [`transformer_lens/model_bridge/sources/_bridge_builder.py`](transformer_lens/model_bridge/sources/_bridge_builder.py) and `SUPPORTED_ARCHITECTURES` in [`transformer_lens/factories/architecture_adapter_factory.py`](transformer_lens/factories/architecture_adapter_factory.py). diff --git a/CLAUDE.md b/CLAUDE.md index ee4eb882d..baaa88700 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -12,11 +12,15 @@ - `/build-docs` — sources `.env`, runs `uv run build-docs` - `/verify-model ` — guided single-model run of `verify_models.py` - `/add-model-support ` — checklist-driven new-adapter workflow - - `/task-complete` — end-of-task gate: clean up new comments per the comment rules in AGENTS.md, then `/format`, then run `unit-test` + `docstring-test` + `acceptance-test` + `integration-test`, fixing any issues until all three steps pass cleanly. Skips notebook + benchmark tiers (run those directly if your change touched them). + - `/task-complete` — end-of-task gate: clean up new comments per the comment rules in AGENTS.md, then `/format`, then `make test-pr` (unit + docstring + acceptance + integration), looping until clean. Skips notebook + benchmark tiers (run those directly if your change touched them). - **Project settings** — [.claude/settings.json](.claude/settings.json) is the checked-in scaffold for project-wide Claude Code settings (hooks, env). It intentionally ships with no permission allowlist; configure your own permissions in `.claude/settings.local.json` (gitignored). +- **First-time permission setup** — the empty checked-in allowlist means every command (`uv sync`, `make unit-test`, `git status`, `gh pr view`, etc.) prompts on first use. Two ways to reduce the noise: (1) run the built-in `/fewer-permission-prompts` skill, which scans your recent transcripts and adds the read-only commands you actually use to `.claude/settings.local.json`; or (2) approve commands with "Don't ask again" the first time each is used. Both routes write to the gitignored local settings file — nothing committed. - **User-specific settings** — `.claude/settings.local.json`, `.claude/agents/`, `.claude/worktrees/` are all gitignored. Configure them per-user. ## Pointers when working with Claude Code in this repo - The hard rules in [AGENTS.md §10](AGENTS.md#10-hard-rules) reflect rules the maintainer has repeatedly enforced. Treat them as load-bearing. - The mirroring rule in [AGENTS.md §2](AGENTS.md#2-two-systems-live-in-this-repo) — `HookedTransformer` to `TransformerBridge` — is the single most common source of "you forgot to mirror this to TransformerBridge" PR feedback. +- When investigating a failing test, **always check [tests/QUARANTINES.md](tests/QUARANTINES.md) first** — known quarantines have documented reasons and "un-skip when…" lines; spending an hour debugging a known macOS-arm64 KV-cache skip is the most common time-sink. +- For Bridge-vs-HF logit drift, use [docs/source/content/debugging_numerical_divergence.md](docs/source/content/debugging_numerical_divergence.md) — bisection table by symptom and the four-quadrant compatibility-mode matrix. +- `bridge.enable_compatibility_mode()` is name-dropped in many places but specified in [docs/source/content/compatibility_mode.md](docs/source/content/compatibility_mode.md) — read it before deciding whether a new test needs it. diff --git a/docs/source/content/compatibility_mode.md b/docs/source/content/compatibility_mode.md new file mode 100644 index 000000000..8ca304ea6 --- /dev/null +++ b/docs/source/content/compatibility_mode.md @@ -0,0 +1,94 @@ +# TransformerBridge Compatibility Mode + +`TransformerBridge.boot_transformers(...)` returns a bridge whose **default numerics match HuggingFace** — raw weights, no folding, no centering. Calling `bridge.enable_compatibility_mode()` afterwards puts the bridge into **HookedTransformer-equivalent numerics** — weights folded, centered, and the legacy hook aliases registered. + +Most research code that was written against `HookedTransformer.from_pretrained(...)` assumes compatibility mode. Most new code that needs HF-faithful logits does not. + +> Source: [`transformer_lens/model_bridge/bridge.py:enable_compatibility_mode`](../../../transformer_lens/model_bridge/bridge.py). + +--- + +## When to enable it + +| Use case | Compatibility mode? | Why | +|---|---|---| +| Logit lens / direct logit attribution | **Yes** | These analyses reason in the post-fold-LN coordinate system; raw HF weights produce different (wrong) attributions. | +| Residual-stream norm analysis | **Yes** | Centered weights give the residual a meaningful zero. | +| Circuit analysis using HT-style hook names (`blocks.{i}.attn.hook_q`, `hook_resid_pre`, etc.) | **Yes** | Legacy aliases register only after compat mode. | +| Logit parity against HuggingFace | **No** | Folding changes weights; logits will not match HF. | +| Generation / inference vs HF baseline | **No** | Same reason. | +| Verifying a new adapter's forward pass | **No (initially)** | Use `enable_compatibility_mode(no_processing=True)` to get hook aliases without weight processing — isolates forward-pass bugs from weight-processing bugs. | + +## What each flag does + +```python +bridge.enable_compatibility_mode( + disable_warnings: bool = False, + no_processing: bool = False, + fold_ln: bool = True, + center_writing_weights: bool = True, + center_unembed: bool = True, + fold_value_biases: bool = True, + refactor_factored_attn_matrices: bool = False, +) +``` + +| Flag | Default | Effect | +|---|---|---| +| `no_processing` | `False` | If `True`, **overrides all other processing flags to False** — registers the legacy hook aliases only, leaves weights raw. The "I want HT hook names but HF numerics" mode. | +| `fold_ln` | `True` | Folds LayerNorm scale + bias into the subsequent linear weights so the LayerNorm modules become pure normalization. Changes weights; mathematically equivalent. | +| `center_writing_weights` | `True` | Subtracts the mean from each "writing" weight (`W_out` in attention, MLP-down). Makes residual contributions sum to zero per layer, which makes residual-stream norms interpretable. | +| `center_unembed` | `True` | Subtracts the mean from the unembedding matrix. Logits become mean-zero — affects logit-lens output but not argmax. | +| `fold_value_biases` | `True` | Folds attention value biases into the output bias. Same numerics, fewer parameters. | +| `refactor_factored_attn_matrices` | `False` | Refactors `W_Q @ W_K.T` and `W_V @ W_O` for analysis. Off by default because it's slow and only matters for specific factored-matrix research. | +| `disable_warnings` | `False` | Suppresses warnings emitted by legacy component aliases when accessed. | + +After processing, the bridge **also**: + +- Re-initializes the hook registry. +- Calls `_setup_hook_compatibility()` on every component (installs HT-style hook conversions like reshaping `hook_z` from `[batch, seq, d_model]` to `[batch, seq, n_heads, d_head]`). +- Registers HT-style hook aliases recursively across blocks. + +`compatibility_mode` is then `True` on the bridge and on every component, so subsequent operations behave as if the bridge were loaded by `HookedTransformer.from_pretrained()`. + +## Hook semantic parity + +After `enable_compatibility_mode()`, these HT hook names fire on the **pre-norm residual** (matching HookedTransformer semantics): + +- `blocks.{i}.attn.hook_q_input`, `hook_k_input`, `hook_v_input` +- `blocks.{i}.hook_attn_in` +- `blocks.{i}.hook_mlp_in` (gated on `cfg.use_hook_mlp_in`; toggle via `bridge.set_use_hook_mlp_in(True)`) + +**Carve-outs** ([issue #1317](https://github.com/TransformerLensOrg/TransformerLens/issues/1317)): + +- **Post-norm architectures** (OLMo 2, BERT-style) read the **post-attention residual** instead, because the norm semantically lives elsewhere in the block. +- **MLA blocks** (DeepSeek V2 / V3 / R1) do **not** expose the split-qkv aliases — MLA's compressed K/V doesn't have a clean split. + +An adapter author for a new post-norm or MLA-style architecture must handle these carve-outs in `setup_hook_compatibility`. The Gemma1/Gemma2 adapters are exemplars of when **not** to override `setup_hook_compatibility` — `GemmaTextScaledWordEmbedding` already scales internally, so any added `hook_conversion` would double-scale `embed.hook_out`. + +## The four-quadrant test matrix + +The integration conftest at [`tests/integration/model_bridge/conftest.py`](../../../tests/integration/model_bridge/conftest.py) provides four bridge variants for every test model: + +| Variant | `compatibility_mode` | `no_processing` | Tests… | +|---|---|---|---| +| `gpt2_bridge` | off | n/a | HF-faithful numerics | +| `gpt2_bridge_compat` | on | `False` | HT-equivalent numerics | +| `gpt2_bridge_compat_no_processing` | on | `True` | Hook aliases without weight processing — used to bisect numerical bugs | +| (HT side) `gpt2_hooked_processed`, `gpt2_hooked_unprocessed` | n/a | n/a | Reference HookedTransformer with/without weight processing | + +New integration tests should use the variant that matches the property they're testing. Tests of HF parity → `gpt2_bridge`. Tests of HT-API behaviour → `gpt2_bridge_compat`. Tests of hook semantics regardless of weights → `gpt2_bridge_compat_no_processing`. + +## Cost + +`enable_compatibility_mode()` mutates the bridge's weights in-place. It is: + +- **One-shot**: calling it twice on the same bridge is safe but pointless; the second call re-folds already-folded weights into the new (already-folded) LayerNorm modules — which is a no-op for `fold_ln` semantically, but it does re-run the centering subtractions. Don't. +- **Not reversible** from within the bridge: if you need raw weights again, re-boot the bridge. +- **Idempotent in `_setup_hook_compatibility`**: that method can be called multiple times safely; only `process_weights` mutates weights. + +## See also + +- [supported_architectures/AGENTS.md](../../../transformer_lens/model_bridge/supported_architectures/AGENTS.md) — adapter contract; the `setup_hook_compatibility` override hook is documented there. +- [debugging_numerical_divergence.md](debugging_numerical_divergence.md) — uses `no_processing=True` as a key bisection tool. +- [migrating_to_v3.md](migrating_to_v3.md) — when porting HT code, you almost always want `enable_compatibility_mode()`. diff --git a/docs/source/content/debugging_numerical_divergence.md b/docs/source/content/debugging_numerical_divergence.md new file mode 100644 index 000000000..34b6cb10b --- /dev/null +++ b/docs/source/content/debugging_numerical_divergence.md @@ -0,0 +1,111 @@ +# Debugging Numerical Divergence + +When a Bridge adapter's integration test fails by `~1e-3` (or any larger delta) against the HuggingFace reference, the failure mode is almost always one of a small set of recurring bugs. This page walks the bisection workflow. + +> **Hard rule (also in [AGENTS.md §10](../../../AGENTS.md#10-hard-rules)):** never call observed drift "fp noise" without empirical evidence. Real bugs and accumulated rounding error look identical at noise scale until you measure. + +--- + +## 0. Setup checklist before debugging + +- **fp32, eager attention** on both sides. `bridge.boot_transformers(..., dtype=torch.float32, attn_implementation="eager")` and the HF `AutoModelForCausalLM.from_pretrained(..., torch_dtype=torch.float32, attn_implementation="eager")`. Mismatched dtype or `sdpa` / `flash_attention_2` masks bugs. +- **`enable_compatibility_mode(no_processing=True)`** for the first pass. This isolates forward-pass bugs from weight-processing bugs. See [compatibility_mode.md](compatibility_mode.md) for the four-quadrant matrix. +- **Single-token input**, then short input (5–10 tokens), then longer. Most adapter bugs surface on the single-token case; widen only after that's clean. +- **Same seed / disable dropout**. Adapters generally don't use dropout, but a stray `nn.Dropout(p=0.1)` in a generalized component will silently de-correlate runs. + +## 1. Bisect by component + +Compare intermediate activations end-to-end. The Bridge exposes them via hooks; HF exposes them via `output_hidden_states=True` and `output_attentions=True`. Walk forward through the stack and find the **first** layer where Bridge ≠ HF: + +| Stage | Bridge hook | HF output | +|---|---|---| +| Embedding | `embed.hook_out` | `outputs.hidden_states[0]` | +| Block i pre-attn-norm | `blocks.{i}.ln1.hook_out` | (HF doesn't expose; compute from `hidden_states[i]`) | +| Block i Q / K / V | `blocks.{i}.attn.q.hook_out`, `.k.hook_out`, `.v.hook_out` | (HF doesn't expose; instrument `model.layers[i].self_attn` directly) | +| Block i attention output | `blocks.{i}.attn.hook_out` | `outputs.attentions[i]` (pattern), then hidden_state delta | +| Block i MLP output | `blocks.{i}.mlp.hook_out` | (HF doesn't expose; hidden_state delta) | +| Block i residual out | `blocks.{i}.hook_resid_post` | `outputs.hidden_states[i+1]` | +| Final norm | `ln_final.hook_out` | (HF inlines into lm_head) | +| Logits | `unembed.hook_out` | `outputs.logits` | + +The first hop where they disagree localizes the bug. + +## 2. Common root causes, in order of frequency + +| Symptom | Likely cause | Where to look | +|---|---|---| +| Logits off everywhere but Q/K/V close | RoPE base / scaling mismatch | Adapter's `RotaryEmbeddingBridge` setup; check `cfg.rotary_base`, `cfg.rope_scaling` | +| Attention output drifts; Q / K / V match | Wrong `n_key_value_heads`, wrong head reshape | `_qkvo_weight_conversions(n_kv_heads=...)`; GQA-aware split | +| First-layer outputs off; embeddings off | Embedding scaling missing (Gemma, T5) | `preprocess_weights()` override; `cfg.scale_embeddings` | +| Off by a constant scale in residual | Final-RMS-norm offset missing | `cfg.rmsnorm_uses_offset = True` + `ArithmeticTensorConversion(ADDITION, 1.0)` | +| Logits flat / saturated at extremes | Missing logit softcap | `cfg.output_logits_soft_cap` from HF's `final_logit_softcapping` | +| Attention pattern collapses to argmax | Missing attention-score softcap | `cfg.attn_scores_soft_cap` from HF's `attn_logit_softcapping` | +| Off by `eps` magnitudes in norm | Wrong RMSNorm eps attribute name | `cfg.eps_attr` (Llama uses `"variance_epsilon"`, most others use `"eps"`) | +| First MLP off; gate matches | Forgot gated-MLP wiring | `GatedMLPBridge` with `{gate, in, out}` submodules — not `MLPBridge` | +| Bias-related drift | Adapter assumes biases that don't exist (Llama / RMSNorm) | `ProcessWeights._safe_get_tensor` handles `None`; check the weight-processing conversions are bias-aware | +| Drift only in compatibility mode | Hook semantic carve-out missing for post-norm or MLA | See [compatibility_mode.md](compatibility_mode.md) §"Hook semantic parity" | + +## 3. Isolating weight-processing bugs + +If `no_processing=True` matches HF but `enable_compatibility_mode()` (default) drifts: + +- The bug is in weight processing, not the forward pass. +- Bisect by toggling individual flags: `fold_ln`, `center_writing_weights`, `center_unembed`, `fold_value_biases`. The first one that introduces drift is the culprit. +- See [compatibility_mode.md §"What each flag does"](compatibility_mode.md#what-each-flag-does). + +## 4. Comparing against `boot_transformers` for the same model + +If Bridge ≠ HF on a model that already passes `verify_models`, your adapter likely diverges from the canonical Bridge load configuration. Quick sanity check: + +```python +import torch +from transformer_lens.model_bridge.bridge import TransformerBridge +from transformers import AutoModelForCausalLM + +ref = TransformerBridge.boot_transformers(model_name, device="cpu", dtype=torch.float32) +hf = AutoModelForCausalLM.from_pretrained( + model_name, torch_dtype=torch.float32, attn_implementation="eager" +) + +ids = torch.tensor([[hf.config.bos_token_id or 1]]) +ref_logits = ref(ids) +hf_logits = hf(ids).logits + +print((ref_logits - hf_logits).abs().max()) # should be < 1e-4 in fp32 +``` + +If `boot_transformers` itself disagrees with HF on the same model, the issue is upstream of your adapter (probably a `_HF_PASSTHROUGH_ATTRS` gap in [`_bridge_builder.py`](../../../transformer_lens/model_bridge/sources/_bridge_builder.py) or a missing config-attr propagation — see [supported_architectures/AGENTS.md §Config-attr propagation](../../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation)). + +## 5. Bisecting `verify_models` phase failures + +`verify_models` reports phase-by-phase. Map the failing phase to the bisection focus: + +| Phase | What failed | Start here | +|---|---|---| +| 1 | Forward correctness vs HF | Steps 1–4 above; this is the standard parity workflow | +| 2 | Hook firing / gradient flow | The hook isn't registered, or it's firing on a tensor that's been replaced (in-place op). Grep adapter for in-place ops on hookable tensors. | +| 3 | Weight processing | Run with `no_processing=True` to isolate. Then bisect compat-mode flags per §3 above. | +| 4 | Text-generation quality | Usually tokenizer policy (BOS, padding, EOS) — see [supported_architectures/AGENTS.md §Tokenizer policy](../../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy). Less often, a generation-loop divergence; rerun with `--no-ht-reference` to skip HT comparison. | +| 7 | Multimodal alignment | Vision encoder output drift or projection mismatch. Llava / Gemma3-multimodal only. | +| 8 | Audio | HuBERT only; check CTC head and audio-feature alignment. | + +## 6. What "fp noise" actually looks like + +Empirically, in this codebase: + +- **fp32, eager attention, single forward**: HF vs Bridge max-abs diff is typically `< 5e-5`. Anything ≥ `1e-4` is suspicious. +- **bf16, eager**: `< 1e-2` is the noise floor. +- **fp32, sdpa**: `< 5e-4` due to sdpa's internal reductions. Use eager for parity tests. + +If you suspect noise, the cheap proof is to run **fp64**: `dtype=torch.float64` on both sides. If the diff stays the same magnitude, it's a bug. If it drops by ~8 orders of magnitude, it was noise. This is what [AGENTS.md §10](../../../AGENTS.md#10-hard-rules)'s rule is asking for. + +## 7. Tooling + +- `make integration-test PYTEST_ADDOPTS="-k -s"` — focused run with stdout. +- `transformer_lens/scratch.py` (gitignored) — drop one-off bisection scripts here without polluting `git status`. +- `.adapter-workspace/` (gitignored) — sibling directory for WIP adapter notes / repros. +- `bridge.run_with_cache(ids)` — returns `(logits, cache)`; `cache["blocks.{i}.hook_resid_post"]` is the easiest path to per-layer diffs. + +--- + +If you exhaust this guide and still can't localize the bug, the failure pattern is worth adding to §2 above so the next agent doesn't repeat the bisection. diff --git a/makefile b/makefile index de8c99738..2b5834e28 100644 --- a/makefile +++ b/makefile @@ -62,6 +62,12 @@ notebook-test: $(RUN) pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/Attribution_Patching_Demo.ipynb $(RERUN_ARGS) $(RUN) pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/Grokking_Demo.ipynb $(RERUN_ARGS) +test-pr: + $(MAKE) unit-test + $(MAKE) docstring-test + $(MAKE) acceptance-test + $(MAKE) integration-test + test: $(MAKE) unit-test $(MAKE) integration-test diff --git a/tests/AGENTS.md b/tests/AGENTS.md index 14e25cd4d..c129dd788 100644 --- a/tests/AGENTS.md +++ b/tests/AGENTS.md @@ -11,20 +11,23 @@ Read [the root AGENTS.md](../AGENTS.md) for project-wide rules. This file covers - **Use cached models for fast tests** (`gpt2`, `attn-only-{1,2,3,4}l`, `tiny-stories-1M`). Anything else gets `@pytest.mark.slow`. - **MPS is a carve-out, not a green light.** `TRANSFORMERLENS_ALLOW_MPS=1` is required; only [`tests/mps/`](mps/) runs there. - **Hard rules in [AGENTS.md §10](../AGENTS.md#10-hard-rules) apply**: no `xfail`/`skipif` to dodge CI, no dismissing failing tests as "pre-existing," no platform skips outside MPS. +- **Before debugging a failing test, check [QUARANTINES.md](QUARANTINES.md)** — if the failure matches a documented quarantine, it's known. If not, treat it as a real bug. --- ## Test tiers -| Tier | Path | Loads models? | Hits HF Hub? | Scope | Example | -|---|---|---|---|---|---| -| `unit` | [`tests/unit/`](unit/) | None / synthetic | No | Function or single module | [`tests/unit/test_key_value_cache_entry.py`](unit/test_key_value_cache_entry.py) | -| `integration` | [`tests/integration/`](integration/) | 1–2 cached models, module-scoped | Yes | Cross-component | [`tests/integration/test_generation_compatibility.py`](integration/test_generation_compatibility.py) | -| `acceptance` | [`tests/acceptance/`](acceptance/) | Full models (`gpt2`, `bloom-560m`), session-scoped | Yes | End-to-end behaviour | [`tests/acceptance/conftest.py`](acceptance/conftest.py) | -| `benchmarks` | [`tests/benchmarks/`](benchmarks/) | Varies; performance focus | Yes | Throughput / memory | [`tests/benchmarks/test_boot_memory.py`](benchmarks/test_boot_memory.py) | -| `mps` | [`tests/mps/`](mps/) | TinyStories-1M, fp32 only | Yes | macOS-MPS smoke only | [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) | +| Tier | Path | Run | Loads models? | Hits HF Hub? | Scope | Example | +|---|---|---|---|---|---|---| +| `unit` | [`tests/unit/`](unit/) | `make unit-test` | None / synthetic (rare exceptions) | No | Function or single module | [`tests/unit/test_key_value_cache_entry.py`](unit/test_key_value_cache_entry.py) | +| `integration` | [`tests/integration/`](integration/) | `make integration-test` | 1–2 cached models, module-scoped | Yes | Cross-component | [`tests/integration/test_generation_compatibility.py`](integration/test_generation_compatibility.py) | +| `acceptance` | [`tests/acceptance/`](acceptance/) | `make acceptance-test` | Full models (`gpt2`, `bloom-560m`), session-scoped | Yes | End-to-end behaviour | [`tests/acceptance/conftest.py`](acceptance/conftest.py) | +| `benchmarks` | [`tests/benchmarks/`](benchmarks/) | `make benchmark-test` | Varies; performance focus | Yes | Throughput / memory | [`tests/benchmarks/test_boot_memory.py`](benchmarks/test_boot_memory.py) | +| `mps` | [`tests/mps/`](mps/) | `pytest tests/mps -v` (needs `TRANSFORMERLENS_ALLOW_MPS=1`) | TinyStories-1M, fp32 only | Yes | macOS-MPS smoke only | [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) | -**Rule of thumb:** if your test calls `HookedTransformer.from_pretrained(...)` or `TransformerBridge.boot_transformers(...)`, it is NOT a unit test. +Common combinations: `make test-pr` (unit + docstring + acceptance + integration — the PR-review surface), `make test` (everything including benchmarks + notebooks). + +**Rule of thumb:** new tests that load a model should land in `integration/` by default. The `unit/` tier has a few legitimate model-loading exceptions (e.g. `test_bridge_vs_hooked_transformer_*.py` compares numerics across architectures, which is conceptually unit-scoped) — match that pattern only when the test really is testing isolated behaviour that happens to need a model. --- diff --git a/tests/QUARANTINES.md b/tests/QUARANTINES.md new file mode 100644 index 000000000..1d8212bd4 --- /dev/null +++ b/tests/QUARANTINES.md @@ -0,0 +1,124 @@ +# Test Quarantines + +Inventory of every `@pytest.mark.skip`, `skipif`, and `xfail` in [`tests/`](.). Use this to triage failures: a test on this list with the matching reason is **not your bug** and shouldn't be debugged blindly. A test that fails but **isn't** on this list is something to investigate. + +The repo-wide rule (see [AGENTS.md §10](../AGENTS.md#10-hard-rules)): **never add `xfail` / `skipif` to dodge a failing CI**. Every quarantine below has a category and a "When to un-skip" line. If you add a new skip, add it here too. + +--- + +## Permanent — optional dependency + +These tests require an optional install group and are skipped on the default install. Will never enable in CI without infrastructure changes. + +| Path | Marker | Trigger | +|---|---|---| +| [`tests/unit/test_lit.py`](unit/test_lit.py) (×18) | `skipif(not LIT_AVAILABLE)` | `pip install lit-nlp` (the `lit` optional-dep group) | +| [`tests/unit/components/test_attention.py`](unit/components/test_attention.py) (line 48) | `skipif(not is_bitsandbytes_available())` | `uv sync --group quantization` | +| [`tests/unit/test_weight_processing.py`](unit/test_weight_processing.py) (line 477) | `skipif(not is_bitsandbytes_available())` | Same as above | +| [`tests/unit/factories/test_mlp_factory.py`](unit/factories/test_mlp_factory.py) (line 40) | `skipif(not is_bitsandbytes_available())` | Same as above | + +**Un-skip:** never (these are correct behaviour). To run locally, install the optional group. + +--- + +## Permanent — hardware requirement + +| Path | Marker | Required hardware | +|---|---|---| +| [`tests/unit/test_next_sentence_prediction.py`](unit/test_next_sentence_prediction.py) (line 131) | `skipif(not torch.cuda.is_available())` | Any CUDA device | +| [`tests/unit/model_bridge/compatibility/test_next_sentence_prediction.py`](unit/model_bridge/compatibility/test_next_sentence_prediction.py) (line 95) | `skipif(not torch.cuda.is_available())` | Any CUDA device | +| [`tests/unit/components/test_attention.py`](unit/components/test_attention.py) (line 83) | `skipif(not torch.cuda.is_available())` (`reason="CUDA required for half/bfloat16 tests"`) | Any CUDA device | +| [`tests/acceptance/test_hooked_encoder.py`](acceptance/test_hooked_encoder.py) (line 227) | `skipif(not torch.cuda.is_available())` | Any CUDA device | +| [`tests/acceptance/test_hooked_encoder_decoder.py`](acceptance/test_hooked_encoder_decoder.py) (line 421) | `skipif(not torch.cuda.is_available())` | Any CUDA device | +| [`tests/acceptance/test_multi_gpu.py`](acceptance/test_multi_gpu.py) (line 91, 105) | `skipif(torch.cuda.device_count() < 2)` | 2+ CUDA devices | +| [`tests/acceptance/test_multi_gpu.py`](acceptance/test_multi_gpu.py) (line 22) | `skipif(torch.cuda.device_count() < 4)` | 4+ CUDA devices | +| [`tests/acceptance/model_bridge/test_multi_gpu_bridge.py`](acceptance/model_bridge/test_multi_gpu_bridge.py) (line 257) | `skipif(torch.cuda.device_count() < 2)` | 2+ CUDA devices | +| [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) (module-level `pytestmark`) | `skipif(not torch.backends.mps.is_available())` | Apple Silicon | + +**Un-skip:** never. CI provides each hardware tier via its own dedicated job (CUDA via compatibility-checks → in practice CPU-only; MPS via `mps-checks`; multi-GPU is local-only). See [tests/AGENTS.md §MPS rules](AGENTS.md#mps-rules) and the carve-out list in [`.github/workflows/checks.yml`](../.github/workflows/checks.yml). + +--- + +## Intentional — CI cost / network budget + +These tests gate on `os.getenv("CI")` to avoid expensive HF Hub fetches or model loads that don't fit the runner. + +| Path | Marker | +|---|---| +| [`tests/unit/model_bridge/supported_architectures/test_gemma2_adapter.py`](unit/model_bridge/supported_architectures/test_gemma2_adapter.py) (line 49) | `skipif(CI, reason="Network/disk fetch of tiny Gemma2 — skip in CI")` | +| [`tests/integration/model_bridge/test_bridge_integration.py`](integration/model_bridge/test_bridge_integration.py) (line 801) | `skipif(CI, reason="Skip Gemma2 test in CI to avoid timeout")` | +| [`tests/acceptance/model_bridge/compatibility/test_hook_completeness.py`](acceptance/model_bridge/compatibility/test_hook_completeness.py) (line 156) | `skipif(CI, reason="Gemma2 is too large for CI")` | + +**Un-skip:** when local. These should run cleanly on a dev machine with `HF_TOKEN` sourced. + +--- + +## Intentional — manual verification only + +Skipped in pytest because they need real (large) weights, run as part of `verify_models`. + +| Path | Marker | +|---|---| +| [`tests/integration/model_bridge/test_qwen3_moe_bridge.py`](integration/model_bridge/test_qwen3_moe_bridge.py) (lines 155, 166) | `skip(reason="Requires real weights — run manually during verification")` | + +**Un-skip:** run via `/verify-model Qwen/Qwen3-MoE-...` and the canonical [`verify_models`](../transformer_lens/tools/model_registry/AGENTS.md) workflow. + +--- + +## Upstream / platform bug — wait for fix + +| Path | Marker | Underlying issue | +|---|---|---| +| [`tests/unit/model_bridge/test_bridge_generate_no_tokenizer.py`](unit/model_bridge/test_bridge_generate_no_tokenizer.py) (lines 30, 128) | `skipif(_MACOS_ARM64, reason="Upstream macOS-arm64 KV-cache NaN; see linked issue.")` | Upstream PyTorch / HF KV-cache NaN on M-series Macs | + +**Un-skip:** when the upstream issue resolves. Don't bypass — the test will produce NaN logits. + +--- + +## ⚠️ Technical debt — actively quarantined whole-file + +These are **entire test modules** skipped via module-level `pytestmark`. They represent significant coverage gaps. Treat them as priority work to re-enable, not as permanent quarantine. + +| Path | Reason | +|---|---| +| [`tests/acceptance/test_hooked_transformer.py`](acceptance/test_hooked_transformer.py) (line 19) | `"Temporarily skipped due to CI test pollution issues"` | +| [`tests/acceptance/test_hooked_encoder.py`](acceptance/test_hooked_encoder.py) (line 13) | `"Temporarily skipped due to CI test pollution issues"` | +| [`tests/acceptance/test_hooked_encoder_decoder.py`](acceptance/test_hooked_encoder_decoder.py) (line 10) | `"Temporarily skipped due to CI test pollution issues"` | + +**Un-skip:** root-cause the "test pollution issues" — likely a fixture-scope or import-ordering bug that leaks state across tests. Until then, these major acceptance tiers are dark. + +--- + +## Technical debt — individual + +Tests quarantined individually pending fixes. Each should have a tracking issue or a clear "fix me" note in the surrounding code. + +| Path | Marker | What it covers | +|---|---|---| +| [`tests/unit/factored_matrix/test_constructor.py`](unit/factored_matrix/test_constructor.py) (line 54) | `skip(...)` | FactoredMatrix constructor edge case | +| [`tests/unit/model_bridge/test_architecture_adapter.py`](unit/model_bridge/test_architecture_adapter.py) (line 436) | `skip(...)` | Adapter behaviour | +| [`tests/unit/model_bridge/test_bridge_vs_hooked_transformer_patching.py`](unit/model_bridge/test_bridge_vs_hooked_transformer_patching.py) (lines 138, 142) | `skipif(...) / xfail(...)` | Bridge↔HT patching parity | +| [`tests/unit/model_bridge/test_hook_alias_resolution.py`](unit/model_bridge/test_hook_alias_resolution.py) (line 89) | `xfail(strict=True, reason=...)` per-architecture | Hook-alias gaps | +| [`tests/unit/model_bridge/supported_architectures/test_qwen3_5_adapter.py`](unit/model_bridge/supported_architectures/test_qwen3_5_adapter.py) (lines 574, 609, 660, 680, 771) | `skipif(...)` × 5 | Specific Qwen3.5 adapter behaviours | +| [`tests/unit/model_bridge/supported_architectures/test_qwen3_next_adapter.py`](unit/model_bridge/supported_architectures/test_qwen3_next_adapter.py) (line 531) | `skipif(...)` | Qwen3-Next adapter behaviour | +| [`tests/integration/test_weight_processing_integration.py`](integration/test_weight_processing_integration.py) (line 238) | `skip(...)` | Weight-processing edge case | +| [`tests/integration/test_tensor_extraction_consistency.py`](integration/test_tensor_extraction_consistency.py) (line 33) | `skip(...)` | Tensor extraction consistency | +| [`tests/integration/test_tokenization_methods.py`](integration/test_tokenization_methods.py) (line 53) | `skipif(...)` | Tokenization method coverage | +| [`tests/integration/test_hooked_encoder_properties.py`](integration/test_hooked_encoder_properties.py) (line 71) | `xfail(...)` | HookedEncoder properties | +| [`tests/acceptance/model_bridge/compatibility/test_backward_hooks.py`](acceptance/model_bridge/compatibility/test_backward_hooks.py) (line 11) | `skip(...)` | Backward-hook compatibility | +| [`tests/acceptance/test_hooked_transformer.py`](acceptance/test_hooked_transformer.py) (lines 551, 560) | `skipif(...)` × 2 (inside the module-level skip) | Additional `from_pretrained_no_processing` cases | + +**Un-skip:** debug the underlying issue and remove the marker. Each removed marker should land in a focused PR with a regression test. + +--- + +## Adding a new quarantine + +If you're tempted to add a `skip` / `skipif` / `xfail`, **read [AGENTS.md §10](../AGENTS.md#10-hard-rules) first**. The default answer is "fix the bug instead." + +If a quarantine really is the right call: + +1. Pick the right marker — `skipif(condition)` for environment gates, `skip(reason=)` for known-bad code paths, `xfail(strict=True, reason=)` when you expect failure and want CI to alert you if it starts passing. +2. Use a `reason=` string descriptive enough to look up later — not `"flaky"`, not `"broken"`. +3. **Add a row to the right section above**, with the path, marker, and an "un-skip when…" line. +4. If you're skipping a whole module via `pytestmark = pytest.mark.skip(...)`, flag it in the "actively quarantined whole-file" section so it gets visibility. diff --git a/transformer_lens/model_bridge/supported_architectures/AGENTS.md b/transformer_lens/model_bridge/supported_architectures/AGENTS.md index 560bb11e8..d1282a4f4 100644 --- a/transformer_lens/model_bridge/supported_architectures/AGENTS.md +++ b/transformer_lens/model_bridge/supported_architectures/AGENTS.md @@ -9,10 +9,12 @@ For the **verification** workflow after writing an adapter, see [tools/model_reg ## TL;DR - **One file per architecture family**, not per model. `llama.py` covers Llama 1 / 2 / 3 / 3.1 / 3.2; do NOT add `llama7b.py` or `llama2.py`. Family splits happen only when internal structure changes (`gemma1.py` / `gemma2.py` / `gemma3.py`, `qwen.py` / `qwen2.py` / `qwen3.py`, `mistral.py` / `mixtral.py`). -- **Two-place registration is mandatory.** Re-export in [`__init__.py`](__init__.py) AND add to `SUPPORTED_ARCHITECTURES` in [`../../factories/architecture_adapter_factory.py`](../../factories/architecture_adapter_factory.py). Missing either causes silent runtime failure at boot. +- **Four-place registration is mandatory.** See [Registration steps](#registration-steps) — missing any one causes silent runtime failure or stale generated docs. - **`self.component_mapping: ComponentMapping`** is the load-bearing contract. Maps canonical TransformerLens paths → Bridge components wrapping HF module paths. - **Hook names are Bridge-native** (`blocks.{i}.hook_out`, `blocks.{i}.attn.q.hook_out`). Don't invent HT-style aliases here — those are registered separately in [`bridge.py`](../bridge.py). -- **Copy from the closest existing adapter** before writing from scratch. See the starter table below. +- **Copy from the closest existing adapter** before writing from scratch. See the [starter table](#starter-adapter-table). +- **Surface non-standard HF config attrs explicitly** — they are invisible to TL-side consumers unless propagated to `self.cfg`. See [Config-attr propagation](#config-attr-propagation). +- **Tokenizer policy is per-model, not per-architecture.** Never inherit `default_prepend_bos`, `default_padding_side`, EOS handling, or chat-template wiring from the starter adapter without checking the target's `tokenizer_config.json`. See [Tokenizer policy](#tokenizer-policy). --- @@ -83,17 +85,62 @@ Optional overrides on the class: --- +## Config-attr propagation + +HF raw config attributes are invisible to TL-side consumers unless explicitly propagated to `self.cfg`. After analysing the HF model, walk the `config.json` and list every non-standard attribute. For each, decide whether the base adapter machinery already handles it; if not, mirror it onto `self.cfg`. + +Common attributes that need explicit propagation: + +| HF attribute | Surface as | Used by | +|---|---|---| +| `final_logit_softcapping` | `self.cfg.final_logit_softcapping` | Gemma2/3 — final-layer logit clip | +| `attn_logit_softcapping` | `self.cfg.attn_logit_softcapping` | Gemma2/3 — attention-score clip | +| `query_pre_attn_scalar` | `self.cfg.query_pre_attn_scalar` | Gemma2/3 — query scaling override | +| `sliding_window` | `self.cfg.sliding_window` | Mistral, Qwen2, Gemma2 — local-attention layers | +| `layer_types` | `self.cfg.layer_types` | Hybrid models with per-layer attention type lists | +| Non-standard RMSNorm eps key | `self.cfg.eps_attr = ""` | Llama uses `"variance_epsilon"` instead of `"eps"` | + +Rule of thumb: if the model card or HF source mentions a numerical knob, assume it needs to land on `self.cfg`. + +--- + +## Tokenizer policy + +Tokenizer behaviour is **per-model**, not per-architecture. Sibling models in the same family routinely differ — the chat-instruct variant may prepend BOS where the base does not, the multilingual fork may pad on the left where the original pads on the right, the new-generation tokenizer may have a different EOS. **Do not copy these values from your starter adapter blindly; check the target's [tokenizer_config.json] and chat template first.** + +Tokenizer-policy fields the adapter / load path is responsible for: + +| Flag | Meaning | Default behaviour | What goes wrong if wrong | +|---|---|---|---| +| `default_prepend_bos` | Prepend BOS to every input by default | Framework default unless set | Off-by-one in logit positions; generation starting from wrong context | +| `default_padding_side` | `"left"` or `"right"` | HF tokenizer default | Generation produces garbage for batched inputs of unequal length | +| EOS handling in generation | Stop on `tokenizer.eos_token_id` (single or list) | Single token from `cfg.eos_token_id` | Generation runs past the natural stop or stops prematurely | +| Chat-template wiring | Whether the bridge auto-applies `tokenizer.apply_chat_template` | Off — user calls it explicitly | Instruct-tuned models produce base-model-style continuations | +| Tokenizer class mismatch (HF reports `Tokenizer`, model card uses `Tokenizer`) | n/a | Whatever HF resolves | Subtle BPE/SentencePiece divergence; rare but real for forks | + +**How to verify the right values:** + +1. Read [`tokenizer_config.json`] in the HF repo. Look at `add_bos_token`, `padding_side`, `bos_token`, `eos_token`, `chat_template`. +2. Compare to the closest sibling already in the registry. If they differ, treat that as a deliberate choice and propagate it. +3. When unsure, **leave the framework default** — explicit wrongness is worse than implicit defaulting, because the wrong explicit value hides under "I configured it" assumptions. + +`default_prepend_bos` is the most common trap (instruct/base divergence within Llama, Mistral, Gemma families), but the same logic applies to every flag above. + +--- + ## Registration steps -After writing `myarch.py` with `MyArchitectureAdapter`: +After writing `myarch.py` with `MyArchitectureAdapter`, register in **all four** sites. Missing any one breaks something: -1. **Re-export in [`__init__.py`](__init__.py):** +1. **[`__init__.py`](__init__.py)** — import + add to `__all__`: ```python from transformer_lens.model_bridge.supported_architectures.myarch import ( MyArchitectureAdapter, ) ``` -2. **Register in [`../../factories/architecture_adapter_factory.py`](../../factories/architecture_adapter_factory.py)** — add to the imports block AND to `SUPPORTED_ARCHITECTURES`: + Missing → import error at boot. + +2. **[`../../factories/architecture_adapter_factory.py`](../../factories/architecture_adapter_factory.py)** — import + `SUPPORTED_ARCHITECTURES` entry: ```python from transformer_lens.model_bridge.supported_architectures import ( ..., @@ -102,15 +149,24 @@ After writing `myarch.py` with `MyArchitectureAdapter`: SUPPORTED_ARCHITECTURES = { ..., - "MyArchForCausalLM": MyArchitectureAdapter, # HF class name from config.architectures[] + "MyArchForCausalLM": MyArchitectureAdapter, # key must match config.architectures[0] exactly } ``` - The key must match `config.architectures[0]` from the HF model's `config.json` exactly. -3. **Add to the registry constants** in [`../../tools/model_registry/__init__.py`](../../tools/model_registry/__init__.py): - - `HF_SUPPORTED_ARCHITECTURES` set - - `CANONICAL_AUTHORS_BY_ARCH` map (foundation-trained orgs for the scraper's download-threshold bypass) + Missing → `boot_transformers` raises "unsupported architecture." + +3. **[`../../tools/model_registry/__init__.py`](../../tools/model_registry/__init__.py)** — two updates in this file: + - Add `"MyArchForCausalLM"` to `HF_SUPPORTED_ARCHITECTURES` + - Add `"MyArchForCausalLM": ["foundation-org-1", "foundation-org-2"]` to `CANONICAL_AUTHORS_BY_ARCH` + + Missing → HF scraper misses canonical models for this architecture (download-threshold bypass). + +4. **[`../../tools/model_registry/generate_report.py`](../../tools/model_registry/generate_report.py)** — one-line entry in `ARCHITECTURE_DESCRIPTIONS`: + ```python + "MyArchForCausalLM": "Short human-readable description.", + ``` + Missing → generated docs table omits the new architecture. -Missing step 1 → import error at boot. Missing step 2 → `boot_transformers` raises "unsupported architecture." Missing step 3 → scraper misses canonical models. +**Verify the four-place wiring**: [`tests/unit/tools/test_model_registry.py`](../../../tests/unit/tools/test_model_registry.py) has a `TestRegistrySyncedWithFactory` class with four bidirectional invariants over `SUPPORTED_ARCHITECTURES`, `HF_SUPPORTED_ARCHITECTURES`, and `CANONICAL_AUTHORS_BY_ARCH`. Run `uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory` after registering — the failure message tells you which set is missing your new key. (There's an `INTENTIONAL_EXCLUDES` carve-out in the same class for internal-only architectures and HF-emits-different-casing aliases; almost no new adapters need to add themselves there.) --- @@ -132,16 +188,105 @@ Missing step 1 → import error at boot. Missing step 2 → `boot_transformers` --- -## Verification +## Adding the HF repo to the registry + +After registration, add the model ID to [`../../tools/model_registry/data/supported_models.json`](../../tools/model_registry/data/supported_models.json) so `verify_models` can resolve it. The entry shape is: + +```json +{ + "architecture_id": "MyArchForCausalLM", + "model_id": "org/repo-name", + "status": 0, + "verified_date": null, + "metadata": null, + "note": null, + "phase1_score": null, + "phase2_score": null, + "phase3_score": null, + "phase4_score": null, + "phase7_score": null, + "phase8_score": null +} +``` + +**Hand-edits are allowed only for adding new model-ID entries.** The `status`, `verified_date`, `note`, and `phaseN_score` fields are written exclusively by `update_model_status()` — never set them manually. See [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md). -After registering the adapter, validate end-to-end via the model registry — see [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md). The minimum check is: +When adding the user-provided model, prompt the user whether to also add the canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]` — e.g. when adding `google/gemma-2-2b`, ask about `google/gemma-2-2b-it`, `google/gemma-2-9b`, `google/gemma-2-9b-it`, `google/gemma-2-27b`, `google/gemma-2-27b-it`. The HF scraper picks these up eventually, but explicit entries unblock `verify_models` against the siblings now. + +Do NOT add the model to [`transformer_lens/supported_models.py`](../../supported_models.py) — that file is HookedTransformer-only. + +--- + +## Model source paths: `boot_transformers` vs `boot_native` + +TransformerBridge has **two parallel load paths**, both routed through the same adapter system: + +| Source | Entry | Use case | +|---|---|---| +| `boot_transformers` | `TransformerBridge.boot_transformers(model_name, ...)` | Default: wraps a HuggingFace model. Adapter maps HF module paths to canonical names. Every adapter in this directory supports this path. | +| `boot_native` | `TransformerBridge.boot_native(cfg, init_mode=...)` | TL-native transformer built from scratch, no HF dependency. Used by [`Realtime_Training_Telemetry_Demo.ipynb`](../../../demos/Realtime_Training_Telemetry_Demo.ipynb) and research that needs cfg-driven small models. | + +Native models are built by [`transformer_lens/model_bridge/sources/native/model.py`](../sources/native/model.py) and routed through [`native.py`](native.py) — the adapter whose `component_mapping` adapts to `cfg` (rotary drops `pos_embed`, RMS norm picks `RMSNormalizationBridge`, gated MLP picks `GatedMLPBridge`, `attn_only` drops MLP). Init policies live in [`sources/native/init.py`](../sources/native/init.py): `gpt2` (default, Normal with 1/√(2·n_layers) residual scaling), `xavier_uniform` / `xavier_normal`, `kaiming_uniform` / `kaiming_normal`. Determinism uses a scoped `torch.Generator` so seeded init does not perturb the caller's global RNG. + +Cfg-driven Native features: `normalization_type` (LN / RMS / RMSPre), `final_rms`, `gated_mlp`, `attn_only`, `n_key_value_heads` (GQA), `attn_scores_soft_cap`, `output_logits_soft_cap`, `positional_embedding_type` (standard / rotary), `rotary_dim` / `rotary_base` / `rope_scaling` (linear PI, dynamic/NTK, llama3 by-parts). + +**Sister source paths** under [`sources/`](../sources/) — `transformers/`, `vllm/`, `inspect/` — are different backends; their existence is mostly an internal detail for new adapter authors. Almost all new work goes through `transformers/` (HF backend) and only writes a `supported_architectures/.py` file. + +When **does a new adapter need to support both paths?** Almost never — if you're adding support for an HF-released architecture, you're writing a `boot_transformers` adapter and you're done. `boot_native` is for research-grade from-scratch models and uses the single `native.py` adapter generically. Touch `native.py` only if you're adding a primitive (e.g. a new norm type) that the cfg-driven dispatch doesn't already cover. + +--- + +## Required tests + +Two test layers per architecture, both required for review: + +### 1. Unit adapter test — `tests/unit/model_bridge/supported_architectures/test__adapter.py` + +29 of these exist; they all follow the same shape. Copy from a sibling that matches your architecture's quirks: + +- **Standard decoder LM** template: [`test_gemma1_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_gemma1_adapter.py) or [`test_gpt2_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_gpt2_adapter.py) +- **GQA / modern RMSNorm+RoPE** template: [`test_qwen3_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_qwen3_adapter.py) +- **Multimodal** template: [`test_llava_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_llava_adapter.py), [`test_gemma3_multimodal_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_gemma3_multimodal_adapter.py) + +The minimal shape: + +```python +def _make_cfg(d_model: int = 32) -> TransformerBridgeConfig: + return TransformerBridgeConfig( + d_model=d_model, d_head=d_model // 4, n_layers=2, n_ctx=128, + n_heads=4, d_vocab=256, d_mlp=64, architecture="MyArchForCausalLM", + ) + +@pytest.fixture(scope="module") +def adapter() -> MyArchitectureAdapter: + return MyArchitectureAdapter(_make_cfg()) + +class TestMyArchHookCompatibility: + def test_(self, adapter): ... +``` + +Unit adapter tests don't load weights — they instantiate the adapter from a synthetic cfg and assert structural / hook-compatibility properties (component_mapping shape, whether `setup_hook_compatibility` is overridden, conversion shapes, etc.). They run in the default `make unit-test` tier with no HF Hub access. + +Add a dedicated test for **each architecture quirk** your adapter handles — softcaps, RMSNorm offsets, sliding window, custom eps attr, MoE routing. The Gemma1 test is a great example of a one-quirk-one-test file ("the adapter must NOT override setup_hook_compatibility because the HF embedding scales internally"). + +### 2. Integration parity test — `tests/integration/model_bridge/test__adapter.py` (or `test__bridge.py`) + +Loads a real cached HF model, asserts logit parity vs HuggingFace at fp32 + eager attention. Templates: [`test_deepseek_adapter.py`](../../../tests/integration/model_bridge/test_deepseek_adapter.py), [`test_falcon_adapter.py`](../../../tests/integration/model_bridge/test_falcon_adapter.py), [`test_mamba_adapter.py`](../../../tests/integration/model_bridge/test_mamba_adapter.py). + +If your architecture's reference model is large enough to OOM CI, gate it with `@pytest.mark.skipif(bool(os.getenv("CI")), reason="...")` and add a row to [`tests/QUARANTINES.md`](../../../tests/QUARANTINES.md) under "CI cost / network budget." + +### End-to-end registry verification + +After both tests pass: ```bash set -a; source .env; set +a -python -m transformer_lens.tools.model_registry.verify_models --architectures MyArchForCausalLM --per-arch 1 +uv run python -m transformer_lens.tools.model_registry.verify_models --architectures MyArchForCausalLM --per-arch 1 ``` -And an integration test under [`tests/integration/`](../../../tests/integration/) asserting logit parity with HuggingFace at fp32 + eager attention. +This is what [`/verify-model`](../../../.claude/commands/verify-model.md) wraps. See [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md) for the canonical workflow and the [`verify_models` vs `main_benchmark` trap](../../tools/model_registry/AGENTS.md#tldr). + +If logit parity fails, see [docs/source/content/debugging_numerical_divergence.md](../../../docs/source/content/debugging_numerical_divergence.md) for the bisection workflow. --- @@ -149,5 +294,8 @@ And an integration test under [`tests/integration/`](../../../tests/integration/ - **No single-model files.** One file per architecture family. - **No `# type: ignore` on `ComponentMapping` types.** Use `isinstance` / `typing.cast` if the type system disagrees. See [AGENTS.md §10](../../../AGENTS.md#10-hard-rules). -- **No skipping the factory registration.** The adapter file existing is not enough — without the entry in `SUPPORTED_ARCHITECTURES`, `boot_transformers` cannot find it and fails at runtime. +- **No skipping any of the four registration sites.** Each one breaks something different (import, factory, scraper, generated docs). The invariant test in [`test_model_registry.py`](../../../tests/unit/tools/test_model_registry.py) will catch most of these. - **No HT-style hook aliases inside adapters.** Aliases live in [`../bridge.py`](../bridge.py) via `build_alias_to_canonical_map()`; adapters declare Bridge-native names only. +- **No inheriting tokenizer-policy flags from the starter adapter** (`default_prepend_bos`, padding side, EOS). Tokenizer policy is per-model; see [Tokenizer policy](#tokenizer-policy). +- **No manual edits to existing `supported_models.json` entries' status / phase fields.** Only `update_model_status()` writes those. +- **No skipping the unit adapter test.** Every architecture has one; an adapter PR without one is incomplete. diff --git a/transformer_lens/tools/model_registry/AGENTS.md b/transformer_lens/tools/model_registry/AGENTS.md index 39d00c911..91d1a0b82 100644 --- a/transformer_lens/tools/model_registry/AGENTS.md +++ b/transformer_lens/tools/model_registry/AGENTS.md @@ -6,10 +6,10 @@ Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file ## TL;DR -> **Use `verify_models`, not `main_benchmark`.** Only `python -m transformer_lens.tools.model_registry.verify_models …` updates `data/supported_models.json`. `python -m transformer_lens.benchmarks.main_benchmark …` runs the same benchmark math but defaults to **not** writing the registry (needs `--update-registry`, and even with that flag misses Phase 7 / 8 scores and the resume checkpoint). **If you ran `main_benchmark`, the registry is stale — re-run via `verify_models`.** +> **Use `verify_models`, not `main_benchmark`.** Only `uv run python -m transformer_lens.tools.model_registry.verify_models …` updates `data/supported_models.json`. `uv run python -m transformer_lens.benchmarks.main_benchmark …` runs the same benchmark math but defaults to **not** writing the registry (needs `--update-registry`, and even with that flag misses Phase 7 / 8 scores and the resume checkpoint). **If you ran `main_benchmark`, the registry is stale — re-run via `verify_models`.** -- **The only function that writes `supported_models.json` is `update_model_status()` in `registry_io.py`.** Anything that doesn't route through it leaves the registry stale. -- **Never edit `supported_models.json` by hand.** Always go through `update_model_status()`. +- **`update_model_status()` in `registry_io.py` is the only function that mutates status / phase / note fields on existing entries.** Never set those fields by hand. +- **Adding a brand-new model-ID entry is allowed (and required before `verify_models --model ` can find it).** See [Adding a new model entry](#adding-a-new-model-entry). - **Never run `verify_models` in parallel.** A single CUDA/MPS device cannot hold concurrent loads (see [AGENTS.md §10](../../../AGENTS.md#10-hard-rules)). - **HF token required.** `verify_models` calls `AutoConfig.from_pretrained(..., token=get_hf_token())` for parameter estimation; gated models fail without `HF_TOKEN`. Source `.env` first: `set -a; source .env; set +a`. @@ -19,9 +19,9 @@ Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file | Goal | Command | |---|---| -| Verify one specific model + update registry | `python -m transformer_lens.tools.model_registry.verify_models --model ` | -| Verify N models per architecture family | `python -m transformer_lens.tools.model_registry.verify_models --architectures --per-arch ` | -| Verify N models across all architectures | `python -m transformer_lens.tools.model_registry.verify_models --per-arch ` | +| Verify one specific model + update registry | `uv run python -m transformer_lens.tools.model_registry.verify_models --model ` | +| Verify N models per architecture family | `uv run python -m transformer_lens.tools.model_registry.verify_models --architectures --per-arch ` | +| Verify N models across all architectures | `uv run python -m transformer_lens.tools.model_registry.verify_models --per-arch ` | | Resume after Ctrl-C / crash | Re-run the same command with `--resume` (reads `data/verification_checkpoint.json`) | | Re-verify already-verified models for an arch | `--reverify --architectures ` | | See what would run without doing it | Add `--dry-run` | @@ -30,6 +30,25 @@ Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file `HFClassName` matches the strings in `HF_SUPPORTED_ARCHITECTURES` (see `__init__.py`) — e.g. `LlamaForCausalLM`, `GPT2LMHeadModel`, `Olmo2ForCausalLM`. +### Flag reference + +| Flag | Meaning | +|---|---| +| `--model ` | Verify a single HF repo (must already exist as an entry in `supported_models.json`) | +| `--architectures ` | Restrict to one or more HF architecture classes | +| `--per-arch ` | Verify the top-N unverified models per architecture (default 10) | +| `--limit ` | Cap total models verified across all architectures | +| `--device ` | Override automatic device selection | +| `--dtype ` | Override automatic dtype selection | +| `--max-memory ` | Skip models whose parameter-count estimate exceeds this GB cap (default: tries every model that fits available device memory). Use this to avoid OOM on a small device — e.g. `--max-memory 16` on a 24 GB GPU leaves head-room for activations. | +| `--phases ` | Restrict to specific phases (default `1 2 3 4`; Phase 7/8 are auto-skipped for non-applicable architectures) | +| `--resume` | Read `data/verification_checkpoint.json` and skip models already tested in the in-flight run | +| `--reverify` | Re-test already-verified models (default skips status=1 entries) | +| `--retry-failed` | Re-test status=3 (failed) entries | +| `--dry-run` | Print what would be tested without running | +| `--no-hf-reference` / `--no-ht-reference` | Skip the HF / HT comparison passes (faster, lower confidence) | +| `--quiet` | Suppress per-model logging | + --- ## File roles @@ -47,12 +66,37 @@ Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file | `discover_architectures.py` | Lightweight HF scan to enumerate architecture classes | | `hf_scraper.py` | Full HF Hub scan; builds initial supported/unsupported model lists | | `relevancy.py` | Filters models by download count, foundation-org provenance | -| `generate_report.py` | Renders human-readable status summaries | +| `generate_report.py` | Renders human-readable status summaries; holds `ARCHITECTURE_DESCRIPTIONS` | `__init__.py` exports the canonical `HF_SUPPORTED_ARCHITECTURES` set and `CANONICAL_AUTHORS_BY_ARCH` map; agents adding a new HF architecture must update both. --- +## Adding a new model entry + +To verify a model that isn't already in `data/supported_models.json`, you must add the entry by hand first. This is the **only** allowed hand-edit: + +```json +{ + "architecture_id": "MyArchForCausalLM", + "model_id": "org/repo-name", + "status": 0, + "verified_date": null, + "metadata": null, + "note": null, + "phase1_score": null, + "phase2_score": null, + "phase3_score": null, + "phase4_score": null, + "phase7_score": null, + "phase8_score": null +} +``` + +After adding the entry, `verify_models --model org/repo-name` will populate the status / score / note fields via `update_model_status()`. **Do not** set those fields manually in the JSON; they are owned by the verification pipeline. + +--- + ## `data/verification_checkpoint.json` (gitignored) Resume state for long-running `verify_models` runs. Lists tested / verified / failed / skipped model IDs plus start timestamp. Behaviour: @@ -87,6 +131,6 @@ A model entry's `status` is `1` (verified) only after all applicable phases pass - **No `main_benchmark` for registry updates.** It misses Phase 7 / 8, doesn't manage the checkpoint, and silently skips the registry write unless `--update-registry` is passed. - **No parallel `verify_models` runs.** Device OOM. See [AGENTS.md §10](../../../AGENTS.md#10-hard-rules). -- **No manual edits to `data/supported_models.json`.** Always `update_model_status()`. +- **No manual edits to existing entries' `status` / `verified_date` / `note` / `phaseN_score` fields.** Only `update_model_status()` writes those. (Adding new model-ID entries is allowed — see [Adding a new model entry](#adding-a-new-model-entry).) - **No deleting `data/verification_checkpoint.json` mid-run.** Let the SIGINT handler clean up. - **No skipping `.env` sourcing.** Gated-model arch verification (Llama / Mistral / Gemma / gated Qwen) needs `HF_TOKEN`. From 7dd3b67f60fe0c9555d7d33029f8b82adf62a353 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 17:30:38 -0500 Subject: [PATCH 06/11] Additional review pass --- .claude/commands/add-model-support.md | 81 +++----- .claude/commands/verify-model.md | 32 +++- .github/ISSUE_TEMPLATE/verify-model.md | 51 +++++ AGENTS.md | 79 +++++--- CLAUDE.md | 19 +- docs/source/content/compatibility_mode.md | 6 +- docs/source/content/contributing.md | 144 +++++++++++++- .../content/debugging_numerical_divergence.md | 8 +- docs/source/content/hook_system.md | 177 ++++++++++++++++++ docs/source/index.md | 3 + tests/AGENTS.md | 18 +- 11 files changed, 506 insertions(+), 112 deletions(-) create mode 100644 .github/ISSUE_TEMPLATE/verify-model.md create mode 100644 docs/source/content/hook_system.md diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md index 7eff3199d..1f9bdcd12 100644 --- a/.claude/commands/add-model-support.md +++ b/.claude/commands/add-model-support.md @@ -5,11 +5,7 @@ argument-hint: You are adding TransformerBridge support for the HuggingFace model `$ARGUMENTS`. If `$ARGUMENTS` is empty, ask the user for the HF repo path before continuing. -**Read first:** - -- [transformer_lens/model_bridge/supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md) — the adapter contract, starter-adapter table, **4-place registration steps**, config-attr propagation rules, the `default_prepend_bos` trap, common gotchas, and how to add the model entry to `supported_models.json`. -- [transformer_lens/tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md) — the verification workflow and the `verify_models` vs `main_benchmark` trap. -- [docs/source/content/adapter_development/adapter-creation-guide.md](../../docs/source/content/adapter_development/adapter-creation-guide.md) and [docs/source/content/adapter_development/hf-model-analysis-guide.md](../../docs/source/content/adapter_development/hf-model-analysis-guide.md) — authoritative step-by-step references. +Each step below names the doc to read **only when you reach that step** — do not load all of them up front. Execute this checklist, stopping at each step until it is genuinely done: @@ -28,61 +24,32 @@ Execute this checklist, stopping at each step until it is genuinely done: ### Confirm before verification - **Always** ask the user before running verification — even for small models. Verification is non-trivial and resource-bound. To give them an informed decision: + Always ask the user before running verification, even for small models. To give them an informed decision: - 1. **Get a resource projection** without loading the model: + 1. Run dry-run to project cost: ``` set -a; source .env; set +a uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" --dry-run ``` - Capture the projected memory and parameter-count estimate from the dry-run output. - 2. **Present the user with the cost of verification**: - - **Model**: `` (HF repo: `https://huggingface.co/`) - - **Architecture**: `` (from `config.architectures[0]`) - - **Estimated parameters**: `` (from the dry-run) - - **Projected memory**: ` GB` (from the dry-run) — they need at least this much free on the target device - - **Expected runtime**: typically 30 s – 2 min for sub-1B models, 2–15 min for 1B–7B, 15+ min for 7B+ or multimodal - - **What verification does**: loads the model, runs Phases 1–4 (forward correctness vs HF, hook firing + gradients, weight processing, generation quality), and on success updates `supported_models.json` via `update_model_status()` - - **Token requirement**: whether `HF_TOKEN` is needed (gated repos: Llama, Mistral, Gemma, gated Qwen variants) - 3. **Ask explicitly**: "Do you want me to run verification on this machine? (Y/N)" - - **If the user confirms** → run `/verify-model $ARGUMENTS`. On pass, the registry now reflects verified status and you're done — no new code needed. On fail, the failure mode tells you whether to fix the existing adapter (per-sibling regression — see [debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md)) or escalate. - - **If the user rejects** → direct them to file a tracking issue so a maintainer with appropriate hardware can run the verification: - ``` - gh issue create \ - --title "Verify model support: " \ - --body "$(cat <<'EOF' - ## Model - - Repo: https://huggingface.co/ - - Architecture: - - Estimated parameters: - - Projected memory: GB - - Gated: yes / no (HF_TOKEN required: yes / no) - - ## Registry state - - Architecture adapter exists: yes - - In supported_models.json: yes (status: ) / no — entry added in this issue's PR - - ## Motivation - - - ## Next step - Run on appropriate hardware and report results: - \`\`\` - set -a; source .env; set +a - uv run python -m transformer_lens.tools.model_registry.verify_models --model - \`\`\` - EOF - )" - ``` - If `gh` isn't installed, link them to `https://github.com/TransformerLensOrg/TransformerLens/issues/new` with the same body content. Then stop the workflow — the registry-entry add (if you made one above) can either land in its own small PR or be folded into the maintainer's verification PR; either is fine. - -2. **Analyze the HF model.** Read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output-head layouts. Run through the [Config-attr propagation table in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation) for non-standard attributes (`final_logit_softcapping`, `sliding_window`, etc.) and decide which need to surface on `self.cfg`. -3. **Pick a starting adapter** using the [starter-adapter table](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#starter-adapter-table). Copy it into [`transformer_lens/model_bridge/supported_architectures/`](../../transformer_lens/model_bridge/supported_architectures/) as `.py`. **Watch tokenizer-policy flags** — see [Tokenizer policy in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy) (`default_prepend_bos`, padding side, EOS handling, chat-template wiring are all per-model, not per-architecture). -4. **Fill in `self.component_mapping`** so each HF module path resolves to a canonical Bridge name. Hook names are Bridge-native. -5. **Register in all four sites** per the [registration steps in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#registration-steps). -6. **Add the HF repo entry** to [`data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json) per the [Adding the HF repo to the registry section](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry). Then ask the user whether to also add the canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]`. -7. **Verify** end-to-end: run `/verify-model $ARGUMENTS` (one model only — do not parallelize). -8. **Write an integration test** under [tests/integration/](../../tests/integration/) that asserts logit parity with HuggingFace. Use fp32 + eager attention to match `boot_transformers`' load configuration. Gate probes for optional structural features (`resid_mid` etc.) rather than assuming they exist. + 2. Show the user: model ID, architecture class, estimated parameters, projected memory (GB), whether `HF_TOKEN` is needed, runtime expectation (30 s – 2 min sub-1B, 2–15 min 1B–7B, 15+ min 7B+ or multimodal), and what verification does (Phases 1–4; updates `supported_models.json` on success). + 3. Ask: "Run verification on this machine? (Y/N)" + + **Confirm** → `/verify-model $ARGUMENTS`. On pass, registry updated, done. On fail, see [debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md) (per-sibling adapter bug, not a new adapter). + + **Reject** → `gh issue create --template verify-model.md` (fill from the dry-run output you already collected). If `gh` isn't installed: . Stop the workflow. + +2. **Analyze the HF model.** Read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output-head layouts. Now read [Config-attr propagation in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation) and run through the non-standard attributes table (`final_logit_softcapping`, `sliding_window`, etc.); decide which need to surface on `self.cfg`. + +3. **Pick a starting adapter.** Read the [starter-adapter table in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#starter-adapter-table). Copy the closest match into [`transformer_lens/model_bridge/supported_architectures/`](../../transformer_lens/model_bridge/supported_architectures/) as `.py`. **Watch tokenizer-policy flags** — see [Tokenizer policy in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy) (`default_prepend_bos`, padding side, EOS handling, chat-template wiring are all per-model, not per-architecture). + +4. **Fill in `self.component_mapping`** so each HF module path resolves to a canonical Bridge name. Hook names are Bridge-native. The [Minimal contract](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#minimal-contract) and [Common gotchas](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#common-gotchas) tables in supported_architectures/AGENTS.md are the references for this step. + +5. **Register in all four sites** per the [registration steps in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#registration-steps). After registering, run the invariant test to confirm the four-place wiring: `uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory`. + +6. **Add the HF repo entry** to [`data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json) per [supported_architectures/AGENTS.md §"Adding the HF repo to the registry"](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry). Then ask the user whether to also add the canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]`. + +7. **Verify** end-to-end: run `/verify-model $ARGUMENTS`. (`/verify-model` enforces dry-run-first and one-at-a-time.) + +8. **Write tests.** Read [Required tests in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#required-tests) for the two-layer pattern (unit adapter test + integration parity test) and copy the closest sibling. + 9. **Run `/task-complete`** — handles comment cleanup, `/format`, and the standard test tiers, looping until clean. diff --git a/.claude/commands/verify-model.md b/.claude/commands/verify-model.md index 1198e2bb9..bd728cb1f 100644 --- a/.claude/commands/verify-model.md +++ b/.claude/commands/verify-model.md @@ -1,30 +1,52 @@ --- -description: Run verify_models.py against a single model (non-parallel). +description: Run verify_models.py against a single model (non-parallel). Always dry-run first. argument-hint: --- -Run model verification for `$ARGUMENTS`: +Verify model `$ARGUMENTS`. If `$ARGUMENTS` is empty, ask the user which model to verify (an HF repo path like `gpt2` or `meta-llama/Llama-2-7b-hf`, or a registry alias) before running. + +## Always dry-run first + +Verification loads the full model and runs Phases 1–4 — 30 s to 30 min and needs enough free memory to hold the model. **Never invoke the real run blindly.** + +``` +set -a; source .env; set +a +uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" --dry-run +``` + +Capture: estimated parameter count, projected memory (GB), HF_TOKEN requirement, architecture class. + +Decide whether to confirm with the user: + +| Model | Action | +|---|---| +| Cached small (`gpt2`, `attn-only-*`, `tiny-stories-1M`, `distilgpt2`, etc.) | Proceed directly; report the dry-run in your response so the user can intervene | +| ≥1B params, or gated, or anything else | Present the dry-run to the user and ask before running | + +## Run the verification ``` set -a; source .env; set +a uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" ``` -If `$ARGUMENTS` is empty, ask the user which model to verify (an HF repo path like `gpt2` or `meta-llama/Llama-2-7b-hf`, or a registry alias) before running. +## Optional flags -**Optional flags for a single-model run** (full reference: [tools/model_registry/AGENTS.md §Flag reference](../../transformer_lens/tools/model_registry/AGENTS.md#flag-reference)): +Full reference: [tools/model_registry/AGENTS.md §Flag reference](../../transformer_lens/tools/model_registry/AGENTS.md#flag-reference). - `--device cpu|cuda|mps` — override automatic device selection - `--dtype float32|bfloat16` — override automatic dtype selection - `--max-memory ` — skip if parameter estimate exceeds; on a 24 GB GPU, `--max-memory 16` leaves headroom for activations - `--phases 1 2 3` — restrict to a subset (Phase 4 generation is the slowest; restrict when debugging Phase 1 forward parity) -- `--dry-run` — project memory + parameter count without loading the model. Use this before any expensive run, and as the data source for the confirmation prompt in [`/add-model-support`](add-model-support.md). +- `--dry-run` — project memory + parameter count without loading the model (see above; always run this first) - `--no-hf-reference` / `--no-ht-reference` — skip the HF / HT comparison passes (faster, lower confidence) - `--reverify` — re-test a model already at `status == 1` - `--retry-failed` — re-test a model at `status == 3` (read its existing `note` first to see the prior failure mode) Batch flags (`--architectures`, `--per-arch`, `--limit`, `--resume`) are not relevant for `--model ` — use the batch invocation in [tools/model_registry/AGENTS.md §Canonical invocations](../../transformer_lens/tools/model_registry/AGENTS.md#canonical-invocations) if you need them. +## Hard rules + **Use `verify_models`, never `main_benchmark`** — only `verify_models` writes `data/supported_models.json`. See [transformer_lens/tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md) for the canonical-invocations table, file roles, and the resume/checkpoint mechanism. Run one model at a time — concurrent loads OOM a single device. Report the actual benchmark output and investigate any failure or drift per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). For phase failures or logit-parity drift, the bisection workflow lives in [docs/source/content/debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md). diff --git a/.github/ISSUE_TEMPLATE/verify-model.md b/.github/ISSUE_TEMPLATE/verify-model.md new file mode 100644 index 000000000..3e8292a53 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/verify-model.md @@ -0,0 +1,51 @@ +--- +name: Verify model support +about: Track a request to run verify_models on a specific model that someone without appropriate hardware can't run themselves +title: "[Verify Model] org/model-id" +labels: verification-request + +--- + + + +## Model + +- HF repo: `https://huggingface.co/REPLACE_WITH_MODEL_ID` +- Architecture class (from `config.architectures[0]`): `REPLACE_WITH_HF_ARCH_CLASS` +- Estimated parameters: `REPLACE_WITH_PARAM_COUNT` +- Projected memory: `REPLACE_WITH_GB` GB (from `verify_models --dry-run`) +- Gated repo: `yes / no` (HF_TOKEN required: `yes / no`) + +## Registry state + +- Architecture adapter exists: `yes` (file: `transformer_lens/model_bridge/supported_architectures/REPLACE_WITH_ADAPTER.py`) +- Currently in `data/supported_models.json`: + - [ ] No — entry added in the PR linked from this issue + - [ ] Yes, status: `REPLACE_WITH_STATUS` (0=unverified, 2=skipped, 3=failed) + +## Motivation + + + + +## How to run + +```bash +set -a; source .env; set +a +uv run python -m transformer_lens.tools.model_registry.verify_models --model REPLACE_WITH_MODEL_ID +``` + +For the full workflow, see [Creating Architecture Adapters in contributing.md](../../docs/source/content/contributing.md#creating-architecture-adapters). + +## Result + + diff --git a/AGENTS.md b/AGENTS.md index cf9ad43af..555935c0c 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -4,6 +4,8 @@ Guidance for AI coding agents contributing to **TransformerLens**. This file is the single source of truth. Vendor-specific files ([CLAUDE.md](CLAUDE.md), [.github/copilot-instructions.md](.github/copilot-instructions.md), [.cursor/rules/transformerlens.mdc](.cursor/rules/transformerlens.mdc)) and sub-folder `AGENTS.md` files all defer here. +> **Just want to use the library?** Start with the [README](README.md). **Just want to run the tests?** Skip to [§3 Quickstart](#3-quickstart). **Contributing?** Read on. + ## TL;DR — read before doing anything 1. **Use `uv`**, not `pip` or `poetry`. Install with `uv sync`. @@ -29,6 +31,8 @@ Sub-folder rules: [tests/AGENTS.md](tests/AGENTS.md) · [supported_architectures | **`TransformerBridge`** | v3 — default for new work | [transformer_lens/model_bridge/](transformer_lens/model_bridge/) | Raw HF weights by default; `bridge.enable_compatibility_mode()` for HT-equivalent | [transformer_lens/tools/model_registry/data/supported_models.json](transformer_lens/tools/model_registry/data/supported_models.json) | | **`HookedTransformer`** | Legacy, maintenance mode, deprecated in 3.0 | [transformer_lens/HookedTransformer.py](transformer_lens/HookedTransformer.py) + [transformer_lens/components/](transformer_lens/components/) | Folds LayerNorm + centres weights → does NOT match HF | [transformer_lens/supported_models.py](transformer_lens/supported_models.py) (**HT-only**) | +> ⚠ The **HookedTransformer acceptance test suite is currently quarantined** ([tests/acceptance/test_hooked_transformer.py](tests/acceptance/test_hooked_transformer.py), [test_hooked_encoder.py](tests/acceptance/test_hooked_encoder.py), [test_hooked_encoder_decoder.py](tests/acceptance/test_hooked_encoder_decoder.py)) — see [tests/QUARANTINES.md §⚠️](tests/QUARANTINES.md). HT-touching changes land essentially untested at the acceptance level until the suite is re-enabled; extra manual care is required. + Bridge architecture-adapter pattern: each HF architecture has one file in [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) mapping HF module paths to canonical TransformerLens names. Bridge hook names are architecture-native (e.g. `blocks.{i}.hook_out`); HT-style aliases are registered separately in [transformer_lens/model_bridge/bridge.py](transformer_lens/model_bridge/bridge.py). ### Mirroring rule @@ -72,6 +76,8 @@ uv run build-docs # build to docs/build/ Python: **>=3.10, <4.0**. CI tests 3.10, 3.11, 3.12. Format/type/docstring checks run on 3.12. +**Windows:** use WSL2 — the bash invocations above (`source`, `set -a; source .env; set +a`, heredocs in slash commands) don't translate to PowerShell. + ## 4. Repo map | Path | What's there | @@ -95,7 +101,7 @@ Python: **>=3.10, <4.0**. CI tests 3.10, 3.11, 3.12. Format/type/docstring check | [docs/source/content/](docs/source/content/) | Sphinx markdown sources | | [docs/source/content/adapter_development/](docs/source/content/adapter_development/) | Adapter-authoring guides — read these before adding a new architecture | | [makefile](makefile) | Canonical test/format/docs targets | -| [pyproject.toml](pyproject.toml) | Deps, pytest / mypy / format / build config. | +| [pyproject.toml](pyproject.toml) | Deps, pytest / mypy / format / build config | | [.github/workflows/checks.yml](.github/workflows/checks.yml) | CI gates | | [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md) | PR template + base-branch rules | @@ -118,37 +124,41 @@ When picking solutions to any problem, prioritize by **research impact**, not im ## 8. PR conventions -- **Base branch**: - - `dev` — default for all PRs (new features, refactors, docs, most bug fixes). - - `main` — **only** for bug fixes against the currently-released version. PRs to main made at Maintainer's discretion. -- **Branch names**: do NOT name your branch `main` or `dev` — these conflict with the canonical branches when maintainers periodically refresh PRs against upstream. Include the word `docs` in the branch name if your PR is primarily a docs change (this triggers the docs build job). -- **New branches must track their own remote**, not the source branch's remote. When you `git push -u`, push to the new branch's namespace, not the branch you forked from. -- **PR template**: [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md). No conventional-commits enforcement. -- **No pre-commit hook is installed.** Run `make format` and `uv run mypy .` manually before pushing; CI will fail otherwise. -- **Changelog**: no per-PR changelog file. User-facing changes go in the PR description and are rolled up into the next release essay in [docs/source/content/news/](docs/source/content/news/) (`release-2.0.md`, `release-3.0.md`, … — one file per major version). If your PR introduces a breaking change or a notable user-facing feature, note it explicitly in the PR description so the next release essay picks it up. +**Base branch**: +- `dev` — default for all PRs (new features, refactors, docs, most bug fixes). +- `main` — **only** for bug fixes against the currently-released version. PRs to `main` are made by maintainers; request permission before basing off `main`. + +**Branch naming**: +- Do NOT name your branch `main` or `dev` — these conflict with the canonical branches when maintainers periodically refresh PRs against upstream. +- Include the word `docs` in the branch name if your PR is primarily a docs change (this triggers the docs build job). +- New branches must track their own remote — `git push -u origin ` from your branch, not from `main`/`dev`. + +**Pre-push checks**: no pre-commit hook is installed. Run `make format` and `uv run mypy .` manually before pushing; CI will fail otherwise. + +**PR template**: [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md). No conventional-commits enforcement. + +**Changelog / release notes**: there is no per-PR changelog file. **Note user-facing changes in your PR description and commit messages** — release essays in [docs/source/content/news/](docs/source/content/news/) and the GitHub Releases page are drafted by a maintainer at release time by reviewing those. Breaking changes and notable user-facing features should be called out explicitly so the rollup picks them up. ## 9. CI gates a PR must pass From [.github/workflows/checks.yml](.github/workflows/checks.yml): -| Job | Runs | -|---|---| -| `compatibility-checks` | `make unit-test` + `make acceptance-test` + `uv build` × py 3.10 / 3.11 / 3.12 | -| `mps-checks` | macOS MPS unit + integration + smoke (PRs to `main` only) | -| `format-check` | `make check-format` | -| `type-check` | `uv run mypy .` | -| `docstring-test` | `make docstring-test` | -| `coverage-test` | Full suite + coverage artifact | -| `notebook-checks` | `nbval` over subset of `demos/*.ipynb`; `HF_TOKEN`-gated notebooks skip when secret absent | -| `build-docs` | Sphinx build (push to `main`/`dev` or branch containing `docs`) | -| `deploy-docs` | GitHub Pages (push to `main` only) | +| Job | Runs | When it fails, run locally | +|---|---|---| +| `compatibility-checks` | `make unit-test` + `make acceptance-test` + `uv build` × py 3.10 / 3.11 / 3.12 | `make unit-test` / `make acceptance-test` / `uv build`. Repro Python-specific failures with `uv python install ` then `uv run --python pytest …` | +| `mps-checks` | macOS MPS unit + integration + smoke (PRs to `main` only) | `TRANSFORMERLENS_ALLOW_MPS=1 uv run pytest tests/mps` on a Mac with MPS | +| `format-check` | `make check-format` | `make format` (writes the fix) then commit | +| `type-check` | `uv run mypy .` | `uv run mypy .` — fix the error; never add `# type: ignore` | +| `docstring-test` | `make docstring-test` | `make docstring-test` — failures are doctest mismatches in `transformer_lens/` | +| `coverage-test` | Full suite + coverage artifact | `make test-pr` covers most of this; full reproduction with `make coverage-report-test` | +| `notebook-checks` | `nbval` over subset of `demos/*.ipynb`; `HF_TOKEN`-gated notebooks skip when secret absent | `uv run pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/.ipynb` | +| `build-docs` | Sphinx build (push to `main`/`dev` or branch containing `docs`) | `uv run build-docs` (sources `.env` if needed for HF-gated doctest examples) | +| `deploy-docs` | GitHub Pages (push to `main` only) | Maintainer-only; rarely the contributor's fault — usually a `build-docs` artifact issue | In-progress PR runs cancel on new commit; tag/release runs do not. ## 10. Hard rules -These are the explicit "do / do not" rules. They are informed by maintainer experience. - **On test failures:** - Never dismiss a failing test as "pre-existing" or "unrelated" — investigate every failure and fix the underlying issue. @@ -185,7 +195,7 @@ Before declaring a task complete: | Notebook change | `pytest --nbval-sanitize-with demos/doc_sanitize.cfg demos/.ipynb` passes locally | | Anything else | `make format` + `uv run mypy .` + `make test-pr` clean before push | -If you're using Claude Code, the `/task-complete` slash command in [.claude/commands/](.claude/commands/) automates the last row (`make test-pr` = unit + docstring + acceptance + integration). +Claude Code: `/task-complete` automates the last row. See [§15 Workflow shortcuts](#15-workflow-shortcuts). ## 12. Pointers for further reading @@ -197,12 +207,10 @@ If you're using Claude Code, the `/task-complete` slash command in [.claude/comm ## 13. Local-only conventions -Two gitignored paths exist for ephemeral work; use them so your work-in-progress doesn't show up in `git status`: +Two gitignored paths for ephemeral work (gitignore entries checked in — first-class conventions, not personal): -- **`transformer_lens/scratch.py`** — sibling-of-package scratch file for one-off bisection scripts and ad-hoc imports. Already in `.gitignore`; never commit. -- **`.adapter-workspace/`** — directory for adapter WIP (notes, config dumps, repro scripts) while you're iterating. Already in `.gitignore`. - -Both are first-class conventions, not personal preferences — the gitignore entries are checked in. New contributors should know about them so they don't reinvent the convention or accidentally commit experimental code. +- **`transformer_lens/scratch.py`** — sibling-of-package scratch file for one-off bisection scripts and ad-hoc imports. +- **`.adapter-workspace/`** — directory for adapter WIP (notes, config dumps, repro scripts). ## 14. Upstream dependency pins @@ -225,3 +233,18 @@ Load-bearing pins live in [pyproject.toml](pyproject.toml): 5. Land in a focused PR — pin bumps are reviewed separately from feature work because the blast radius is wide. Specific to HF: when a `transformers` minor bump introduces a new architecture or renames an existing one, check the impact on `_HF_PASSTHROUGH_ATTRS` in [`transformer_lens/model_bridge/sources/_bridge_builder.py`](transformer_lens/model_bridge/sources/_bridge_builder.py) and `SUPPORTED_ARCHITECTURES` in [`transformer_lens/factories/architecture_adapter_factory.py`](transformer_lens/factories/architecture_adapter_factory.py). + +## 15. Workflow shortcuts + +Claude Code users have slash commands in [.claude/commands/](.claude/commands/) that wrap common workflows. Non-Claude agents (Cursor, Codex, Copilot, Aider, etc.) can run the manual equivalents: + +| Claude shortcut | Manual equivalent | Reference | +|---|---|---| +| `/test-unit` | `make unit-test` | [§3 Quickstart](#3-quickstart) | +| `/test-all` | `make test` (long) | [§3 Quickstart](#3-quickstart) | +| `/format` | `make format && uv run mypy .` | [§3 Quickstart](#3-quickstart) | +| `/typecheck` | `uv run mypy .` | [§3 Quickstart](#3-quickstart) | +| `/build-docs` | `set -a; source .env; set +a && uv run build-docs` | [§3 Quickstart](#3-quickstart) | +| `/verify-model ` | `set -a; source .env; set +a` then `uv run python -m transformer_lens.tools.model_registry.verify_models --model --dry-run` first, confirm, then drop `--dry-run` | [tools/model_registry/AGENTS.md](transformer_lens/tools/model_registry/AGENTS.md) | +| `/add-model-support ` | Follow the 4-way branch + adapter-authoring workflow | [supported_architectures/AGENTS.md](transformer_lens/model_bridge/supported_architectures/AGENTS.md) | +| `/task-complete` | `make format && uv run mypy . && make test-pr` (loop until clean) | [§11 Done checklist](#11-done-checklist) | diff --git a/CLAUDE.md b/CLAUDE.md index baaa88700..38b1674a8 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -6,7 +6,7 @@ - **Slash commands** — see [.claude/commands/](.claude/commands/) for canonical workflows: - `/test-unit` — `make unit-test` - - `/test-all` — `make test` (slow; warns before running) + - `/test-all` — `make test` (long; runs benchmarks + notebooks too) - `/format` — `make format` then `uv run mypy .` - `/typecheck` — `uv run mypy .` - `/build-docs` — sources `.env`, runs `uv run build-docs` @@ -19,8 +19,15 @@ ## Pointers when working with Claude Code in this repo -- The hard rules in [AGENTS.md §10](AGENTS.md#10-hard-rules) reflect rules the maintainer has repeatedly enforced. Treat them as load-bearing. -- The mirroring rule in [AGENTS.md §2](AGENTS.md#2-two-systems-live-in-this-repo) — `HookedTransformer` to `TransformerBridge` — is the single most common source of "you forgot to mirror this to TransformerBridge" PR feedback. -- When investigating a failing test, **always check [tests/QUARANTINES.md](tests/QUARANTINES.md) first** — known quarantines have documented reasons and "un-skip when…" lines; spending an hour debugging a known macOS-arm64 KV-cache skip is the most common time-sink. -- For Bridge-vs-HF logit drift, use [docs/source/content/debugging_numerical_divergence.md](docs/source/content/debugging_numerical_divergence.md) — bisection table by symptom and the four-quadrant compatibility-mode matrix. -- `bridge.enable_compatibility_mode()` is name-dropped in many places but specified in [docs/source/content/compatibility_mode.md](docs/source/content/compatibility_mode.md) — read it before deciding whether a new test needs it. +- [AGENTS.md §10](AGENTS.md#10-hard-rules) — hard rules; load-bearing. +- [AGENTS.md §2](AGENTS.md#2-two-systems-live-in-this-repo) — HT → Bridge mirroring rule; single most common source of PR-review pushback. +- [tests/QUARANTINES.md](tests/QUARANTINES.md) — check this before debugging any failing test. Known quarantines have "un-skip when…" lines; the macOS-arm64 KV-cache skip is the most common time-sink. +- [docs/source/content/debugging_numerical_divergence.md](docs/source/content/debugging_numerical_divergence.md) — bisection workflow for Bridge-vs-HF logit drift. +- [docs/source/content/compatibility_mode.md](docs/source/content/compatibility_mode.md) — what `bridge.enable_compatibility_mode()` actually does. Read before deciding whether a new test needs it. + +## Looking for a starter task? + +- [Issues labeled `good first issue`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) +- [Issues labeled `help wanted`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) +- [Issues labeled `verification-request`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3Averification-request) — models awaiting verification on hardware the requester didn't have. Pick one that fits your machine and run `/verify-model `. +- Backfilling per-adapter unit tests under [`tests/unit/model_bridge/supported_architectures/`](tests/unit/model_bridge/supported_architectures/) is high-leverage — recent commits show this is an active push and the pattern is well-established (copy a sibling adapter test). diff --git a/docs/source/content/compatibility_mode.md b/docs/source/content/compatibility_mode.md index 8ca304ea6..8a9c9f87f 100644 --- a/docs/source/content/compatibility_mode.md +++ b/docs/source/content/compatibility_mode.md @@ -89,6 +89,6 @@ New integration tests should use the variant that matches the property they're t ## See also -- [supported_architectures/AGENTS.md](../../../transformer_lens/model_bridge/supported_architectures/AGENTS.md) — adapter contract; the `setup_hook_compatibility` override hook is documented there. -- [debugging_numerical_divergence.md](debugging_numerical_divergence.md) — uses `no_processing=True` as a key bisection tool. -- [migrating_to_v3.md](migrating_to_v3.md) — when porting HT code, you almost always want `enable_compatibility_mode()`. +- [Creating Architecture Adapters in contributing.md](contributing.md#creating-architecture-adapters) — adapter contract and the four-place registration; adapter authors override `setup_hook_compatibility` only when an architecture has the post-norm / MLA carve-outs mentioned above. +- [Debugging Numerical Divergence](debugging_numerical_divergence.md) — uses `no_processing=True` as a key bisection tool. +- [Migrating to TransformerLens 3](migrating_to_v3.md) — when porting HT code, you almost always want `enable_compatibility_mode()`. diff --git a/docs/source/content/contributing.md b/docs/source/content/contributing.md index 87a28f6b0..7bbd7eb48 100644 --- a/docs/source/content/contributing.md +++ b/docs/source/content/contributing.md @@ -2,6 +2,8 @@ ```{warning} `HookedTransformer` is deprecated as of TransformerLens 3.0 and will be removed in the next major version. New code should use [`TransformerBridge`](migrating_to_v3.md) instead. Existing `HookedTransformer` code continues to work through the 3.x branch via a compatibility layer. See the [migration guide](migrating_to_v3.md) for conversion recipes. + +The HookedTransformer **acceptance test suite is currently quarantined** due to a CI test-pollution issue (see `tests/QUARANTINES.md` in the repo). Changes that touch HookedTransformer internals therefore land essentially untested at the acceptance level — extra manual care is required until the suite is re-enabled. ``` ## Setup @@ -22,6 +24,8 @@ As of TransformerLens 3.0, this project uses [UV](https://docs.astral.sh/uv/gett uv sync # activate the virtual environment source .venv/bin/activate +# first-time only: create .env from the template, then fill in HF_TOKEN for gated models +cp .env.example .env ``` Dependency groups are defined in `pyproject.toml` under `[dependency-groups]`. The project sets `default-groups = ["dev", "docs", "jupyter"]`, so `uv sync` installs all three out of the box — you do not need to pass `--group` flags for the standard contributor setup. @@ -33,6 +37,12 @@ You can also add individual groups with `uv sync --group `, or install wit Requires Python 3.10 or higher. +**Windows users:** the bash invocations above don't translate to PowerShell. Use WSL2. + +### Environment variables + +Source `.env` (`set -a; source .env; set +a`) before any HuggingFace-Hub-hitting command. `HF_TOKEN` is required for gated models (Llama, Mistral, Gemma, gated Qwen variants). See `.env.example` in the repo root for the full list. + ## Testing If adding a feature, please add unit tests for it. If you need a model, please use one of the ones @@ -43,11 +53,33 @@ quite slow (as we only have CPU actions) so the smaller models like `attn-only-1 ### Running the tests -- Unit tests only via `make unit-test` -- Acceptance tests only via `make acceptance-test` -- Docstring tests only via `make docstring-test` -- Notebook tests only via `make notebook-test` -- Run all test suites mentioned `make test` +- Standard PR-review surface (unit + docstring + acceptance + integration): `make test-pr` +- Just unit tests (fast feedback): `make unit-test` +- Integration tests: `make integration-test` +- Acceptance tests: `make acceptance-test` +- Docstring tests: `make docstring-test` +- Notebook tests: `make notebook-test` +- All test suites including benchmarks + notebooks (long): `make test` + +### Test tiers + +| Tier | Path | Loads models? | Hits HF Hub? | Scope | +|---|---|---|---|---| +| `unit` | `tests/unit/` | None / synthetic (rare exceptions) | No | Function or single module | +| `integration` | `tests/integration/` | 1–2 cached models, module-scoped | Yes | Cross-component | +| `acceptance` | `tests/acceptance/` | Full models, session-scoped | Yes | End-to-end behaviour | +| `benchmarks` | `tests/benchmarks/` | Varies; performance focus | Yes | Throughput / memory | +| `mps` | `tests/mps/` | TinyStories-1M, fp32 only | Yes | macOS-MPS smoke only | + +Rule of thumb: new tests that load a model should land in `integration/` by default. Tests that need real (large) weights to verify go through the model registry's `verify_models` workflow rather than running in pytest. + +The flaky-retry policy (`--reruns 2 --reruns-delay 5`) wraps every `make` target and exists to absorb HF Hub 429s. The root `tests/conftest.py` also enables `enable_hf_retry()` session-wide. + +### Quarantined tests + +Some tests carry persistent `skip` / `skipif` / `xfail` markers — for optional dependencies (LIT, bitsandbytes), hardware requirements (CUDA, MPS, multi-GPU), CI cost / network budget, or upstream platform bugs. The `tests/QUARANTINES.md` file in the repo inventories every one with an "un-skip when…" line. **Before debugging a failing test, check whether it's a known quarantine.** + +The HookedTransformer acceptance suite (`tests/acceptance/test_hooked_transformer.py`, `test_hooked_encoder.py`, `test_hooked_encoder_decoder.py`) is currently a whole-file quarantine — see the warning at the top of this page. ## Formatting @@ -56,9 +88,65 @@ actions. - Format all files via `make format` - Only check the formatting via `make check-format` +- Type-check via `uv run mypy .` Note that `black` line length is set to 100 in `pyproject.toml` (instead of the default 88). +**No pre-commit hook is installed.** Run `make format` and `uv run mypy .` manually before pushing — CI will fail otherwise. + +## Project conventions + +These rules apply across the codebase and reflect maintainer experience with recurring failure modes. + +### On test failures + +- **Never dismiss a failing test as "pre-existing" or "unrelated"** — investigate every failure and fix the underlying issue. +- **Never add platform skips, `@pytest.mark.skip`, or `xfail` to make a failing CI green.** Debug the actual bug. If a skip is genuinely needed (optional dep, hardware requirement), add an entry to `tests/QUARANTINES.md` with the rationale. +- **If new tests surface pre-existing bugs, fix them.** Don't punt to a future PR. + +### On code quality + +- **Never add `# type: ignore`.** Prefer `isinstance` assertions or `typing.cast` for narrowing. +- **Comments**: terse, one-line docstrings; inline comments should explain WHY, not WHAT. No multi-paragraph explanations. + +### On numerical work + +- **Never claim observed drift is "fp noise" without empirical evidence.** Real bugs and accumulated rounding error look identical at noise scale until you measure. The cheap proof is fp64: if the diff stays the same magnitude in `dtype=torch.float64` on both sides, it's a bug; if it drops by ~8 orders of magnitude, it was noise. +- **Never run model verification or benchmarks in parallel.** A single CUDA/MPS device does not have memory for concurrent loads. + +### On environment and tooling + +- Use `uv`, never `pip` or `poetry`. Run commands via `uv run ` or the appropriate `make` target. +- Source `.env` before any HF-Hub-hitting command. + +## Two systems live in this repo + +The library is mid-transition between two parallel paths: + +| System | Status | Lives in | Numerics | Registry | +|---|---|---|---|---| +| `TransformerBridge` | v3 — default for new work | `transformer_lens/model_bridge/` | Raw HF weights by default; `bridge.enable_compatibility_mode()` for HT-equivalent — see [Compatibility Mode](compatibility_mode.md) | `transformer_lens/tools/model_registry/data/supported_models.json` | +| `HookedTransformer` | Legacy, maintenance mode, deprecated in 3.0 | `transformer_lens/HookedTransformer.py` + `transformer_lens/components/` | Folds LayerNorm + centres weights → does NOT match HF | `transformer_lens/supported_models.py` (**HT-only**) | + +**Mirroring rule**: if you change behaviour in `HookedTransformer` that has a counterpart in `TransformerBridge`, update both in the same PR. They are parallel implementations of the same surface; drift between them is a recurring source of bugs. Conversely, `supported_models.py` is HT-only — Bridge-only models live in the Bridge registry data file under `transformer_lens/tools/model_registry/`. + +## PR conventions + +### Base branch + +- `dev` — default for all PRs (new features, refactors, docs, most bug fixes). +- `main` — **only** for bug fixes against the currently-released version. PRs to `main` are made by maintainers; request permission before basing off `main`. + +### Branch naming + +- Do NOT name your branch `main` or `dev` — these conflict with the canonical branches when maintainers periodically refresh PRs against upstream. +- Include the word `docs` in the branch name if your PR is primarily a docs change (this triggers the docs build job). +- New branches must track their own remote (`git push -u origin ` from your branch, not from `main`/`dev`). + +### Changelog + +There is no per-PR changelog file. **Note user-facing changes in your PR description and commit messages** — release essays in `docs/source/content/news/` and GitHub Releases are drafted by a maintainer by reviewing those at release time. Breaking changes and notable user-facing features should be called out explicitly so the rollup picks them up. + ## Documentation Please make sure to add thorough documentation for any features you add. You should do this directly @@ -170,7 +258,51 @@ Two guides walk through the process: - [Architecture Adapter Creation Guide](adapter_development/adapter-creation-guide.md) — start here. A step-by-step workflow for taking an HF model from unsupported to tested, registered adapter. - [HuggingFace Model Analysis Guide](adapter_development/hf-model-analysis-guide.md) — a reference for reading an HF model's `config.json` and source files to extract the attributes you'll set on `self.cfg`. -Adapters live in `transformer_lens/model_bridge/supported_architectures/.py` and are registered in two places: `supported_architectures/__init__.py` and `factories/architecture_adapter_factory.py`. Both steps are covered in the creation guide. If you want a starter file, copy [adapter-template.py](../_static/adapter-template.py) into `supported_architectures/` and rename it. +Adapters live in `transformer_lens/model_bridge/supported_architectures/.py` and are registered in **four** places: + +1. **`transformer_lens/model_bridge/supported_architectures/__init__.py`** — import the adapter class and add to `__all__`. Missing → import error at boot. +2. **`transformer_lens/factories/architecture_adapter_factory.py`** — import the class and add to the `SUPPORTED_ARCHITECTURES` dict. The key must match `config.architectures[0]` exactly. Missing → `boot_transformers` raises "unsupported architecture." +3. **`transformer_lens/tools/model_registry/__init__.py`** — add to `HF_SUPPORTED_ARCHITECTURES` (the set) and `CANONICAL_AUTHORS_BY_ARCH` (the foundation-orgs map). Missing → HF scraper misses canonical models. +4. **`transformer_lens/tools/model_registry/generate_report.py`** — add a one-line entry to `ARCHITECTURE_DESCRIPTIONS` so the generated docs table covers the new architecture. + +If you want a starter file, copy [adapter-template.py](../_static/adapter-template.py) into `supported_architectures/` and rename it. + +After registering, verify the four-place wiring by running: + +```bash +uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory +``` + +The `TestRegistrySyncedWithFactory` class bidirectionally asserts that `SUPPORTED_ARCHITECTURES`, `HF_SUPPORTED_ARCHITECTURES`, and `CANONICAL_AUTHORS_BY_ARCH` stay in sync — the failure message names exactly which set is missing your new key. + +### Required tests for a new adapter + +Two test layers, both required: + +1. **Unit adapter test** at `tests/unit/model_bridge/supported_architectures/test__adapter.py`. ~29 of these exist; copy the closest sibling. The pattern: a `_make_cfg()` factory, an `adapter` fixture, and one test per architecture-specific quirk. Unit adapter tests instantiate the adapter from a synthetic config and assert structural properties — they don't load weights and don't hit HF Hub. +2. **Integration parity test** at `tests/integration/model_bridge/test__adapter.py`. Loads a real cached HF model and asserts logit parity vs HuggingFace at fp32 + eager attention. + +### Common adapter gotchas + +- **HF raw config attributes are invisible to TL-side consumers unless explicitly propagated to `self.cfg`.** Walk the HF `config.json` and mirror any non-standard knobs (`final_logit_softcapping`, `attn_logit_softcapping`, `query_pre_attn_scalar`, `sliding_window`, `layer_types`, custom `eps_attr` names) onto `self.cfg` so weight processing and forward passes can see them. +- **Tokenizer policy is per-model, not per-architecture.** Sibling models in the same family routinely differ — the chat-instruct variant may prepend BOS where the base does not, padding side can flip, EOS handling can differ. Never copy `default_prepend_bos`, padding side, or EOS handling from your starter adapter without checking the target's `tokenizer_config.json` first. +- **Hook names inside adapters are Bridge-native** (e.g., `blocks.{i}.hook_out`). HookedTransformer-style aliases (e.g., `blocks.{i}.hook_resid_post`) are registered elsewhere — in `transformer_lens/model_bridge/bridge.py` via `build_alias_to_canonical_map()`. Adapters declare canonical names only. +- **No `# type: ignore` on `ComponentMapping` types.** Use `isinstance` or `typing.cast` if the type system disagrees. + +### Verifying a new model + +After your adapter is registered, the model registry's `verify_models` runner exercises it end-to-end: + +```bash +set -a; source .env; set +a +uv run python -m transformer_lens.tools.model_registry.verify_models --model +``` + +`verify_models` runs phases 1–4 (forward correctness vs HF, hook firing + gradients, weight processing, generation quality) and updates `data/supported_models.json` with status and per-phase scores. **Always run `--dry-run` first** to project memory and parameter count without loading the model. Run one model at a time — concurrent loads OOM a single device. + +**Use `verify_models`, never `main_benchmark`** — only `verify_models` writes the registry. `main_benchmark` runs the same math but defaults to skipping the registry write (requires `--update-registry`, and even with that flag misses Phase 7 / 8 scores and the resume checkpoint). + +If verification fails by `~1e-3` or more against the HF reference, the bisection workflow lives at [Debugging Numerical Divergence](debugging_numerical_divergence.md). ```{toctree} :hidden: diff --git a/docs/source/content/debugging_numerical_divergence.md b/docs/source/content/debugging_numerical_divergence.md index 34b6cb10b..8fa43fe3e 100644 --- a/docs/source/content/debugging_numerical_divergence.md +++ b/docs/source/content/debugging_numerical_divergence.md @@ -2,7 +2,7 @@ When a Bridge adapter's integration test fails by `~1e-3` (or any larger delta) against the HuggingFace reference, the failure mode is almost always one of a small set of recurring bugs. This page walks the bisection workflow. -> **Hard rule (also in [AGENTS.md §10](../../../AGENTS.md#10-hard-rules)):** never call observed drift "fp noise" without empirical evidence. Real bugs and accumulated rounding error look identical at noise scale until you measure. +> **Hard rule** (see also the [project conventions in contributing.md](contributing.md#on-numerical-work)): never call observed drift "fp noise" without empirical evidence. Real bugs and accumulated rounding error look identical at noise scale until you measure. --- @@ -74,7 +74,7 @@ hf_logits = hf(ids).logits print((ref_logits - hf_logits).abs().max()) # should be < 1e-4 in fp32 ``` -If `boot_transformers` itself disagrees with HF on the same model, the issue is upstream of your adapter (probably a `_HF_PASSTHROUGH_ATTRS` gap in [`_bridge_builder.py`](../../../transformer_lens/model_bridge/sources/_bridge_builder.py) or a missing config-attr propagation — see [supported_architectures/AGENTS.md §Config-attr propagation](../../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation)). +If `boot_transformers` itself disagrees with HF on the same model, the issue is upstream of your adapter (probably a `_HF_PASSTHROUGH_ATTRS` gap in `transformer_lens/model_bridge/sources/_bridge_builder.py`, or a non-standard HF config attribute that the adapter never propagated onto `self.cfg`). HF raw config attributes are invisible to TL-side consumers unless explicitly mirrored. Common attributes that need propagation: `final_logit_softcapping` (Gemma2/3), `attn_logit_softcapping` (Gemma2/3), `query_pre_attn_scalar` (Gemma2/3), `sliding_window` (Mistral, Qwen2, Gemma2), `layer_types` (hybrid models), and non-standard RMSNorm eps attribute names (Llama uses `variance_epsilon`). ## 5. Bisecting `verify_models` phase failures @@ -85,7 +85,7 @@ If `boot_transformers` itself disagrees with HF on the same model, the issue is | 1 | Forward correctness vs HF | Steps 1–4 above; this is the standard parity workflow | | 2 | Hook firing / gradient flow | The hook isn't registered, or it's firing on a tensor that's been replaced (in-place op). Grep adapter for in-place ops on hookable tensors. | | 3 | Weight processing | Run with `no_processing=True` to isolate. Then bisect compat-mode flags per §3 above. | -| 4 | Text-generation quality | Usually tokenizer policy (BOS, padding, EOS) — see [supported_architectures/AGENTS.md §Tokenizer policy](../../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy). Less often, a generation-loop divergence; rerun with `--no-ht-reference` to skip HT comparison. | +| 4 | Text-generation quality | Usually tokenizer policy: `default_prepend_bos`, padding side, EOS handling, chat-template wiring. Tokenizer behaviour is per-model, not per-architecture — check the target's `tokenizer_config.json`, don't inherit from a sibling. Less often, a generation-loop divergence; rerun with `--no-ht-reference` to skip HT comparison. | | 7 | Multimodal alignment | Vision encoder output drift or projection mismatch. Llava / Gemma3-multimodal only. | | 8 | Audio | HuBERT only; check CTC head and audio-feature alignment. | @@ -97,7 +97,7 @@ Empirically, in this codebase: - **bf16, eager**: `< 1e-2` is the noise floor. - **fp32, sdpa**: `< 5e-4` due to sdpa's internal reductions. Use eager for parity tests. -If you suspect noise, the cheap proof is to run **fp64**: `dtype=torch.float64` on both sides. If the diff stays the same magnitude, it's a bug. If it drops by ~8 orders of magnitude, it was noise. This is what [AGENTS.md §10](../../../AGENTS.md#10-hard-rules)'s rule is asking for. +If you suspect noise, the cheap proof is to run **fp64**: `dtype=torch.float64` on both sides. If the diff stays the same magnitude, it's a bug. If it drops by ~8 orders of magnitude, it was noise. See the [project conventions in contributing.md](contributing.md#on-numerical-work) for the rule behind this. ## 7. Tooling diff --git a/docs/source/content/hook_system.md b/docs/source/content/hook_system.md new file mode 100644 index 000000000..04c7e8b6a --- /dev/null +++ b/docs/source/content/hook_system.md @@ -0,0 +1,177 @@ +# The Hook System + +Hooks are the primary value proposition of TransformerLens. They let you intercept, cache, edit, or ablate intermediate activations as a model runs — without modifying the model code. This page covers the user-facing API; for the implementation, see [`transformer_lens/hook_points.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/hook_points.py). + +--- + +## What a hook is + +A `HookPoint` is an `nn.Identity` module placed at a named position inside a model (e.g., `blocks.0.attn.hook_q`). At runtime it passes its input through unchanged — but a registered hook function can read the tensor (caching), modify it (intervention), or replace it (ablation, patching). + +The hook system is purely PyTorch's `register_forward_hook` / `register_full_backward_hook` underneath, but with two additions that matter: + +1. **Named positions.** Every hook point has a string name like `blocks.{i}.hook_resid_post`, so you can address it without traversing the module tree. +2. **Context-scoped lifecycle.** Hooks added inside a `with model.hooks(...)` context auto-remove when the block exits — no manual cleanup required. + +--- + +## Three ways to use hooks + +### 1. `run_with_cache` — read everything + +The simplest workflow: run a forward pass and get back both the logits and a dict of every cached activation, keyed by hook name. + +```python +logits, cache = model.run_with_cache("Hello, world") +cache["blocks.0.attn.hook_q"] # Q tensor at layer 0 +cache["blocks.5.hook_resid_post"] # residual stream after block 5 +cache["ln_final.hook_normalized"] # post-final-norm activations +``` + +`cache` is an `ActivationCache` — a dict-like with conveniences (`cache.decompose_resid()`, `cache.apply_ln_to_stack(...)`, etc.). See [`transformer_lens/ActivationCache.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/ActivationCache.py). + +For TransformerBridge: + +```python +from transformer_lens.model_bridge import TransformerBridge +bridge = TransformerBridge.boot_transformers("gpt2", device="cpu") +logits, cache = bridge.run_with_cache("Hello, world") +``` + +Pass `names_filter=` to cache only a subset (saves memory): + +```python +logits, cache = model.run_with_cache( + "Hello, world", + names_filter=lambda name: name.endswith("hook_resid_post"), +) +``` + +### 2. `run_with_hooks` — intervene during the forward pass + +Attach temporary hooks for one forward pass, then auto-remove them. + +```python +def zero_out_head_3(activation, hook): + activation[:, :, 3, :] = 0 # ablate attention head 3 + return activation + +logits = model.run_with_hooks( + "Hello, world", + fwd_hooks=[("blocks.5.attn.hook_z", zero_out_head_3)], +) +``` + +Each hook is a `(hook_name, hook_fn)` tuple. The hook function signature is **always**: + +```python +def hook_fn(tensor: torch.Tensor, *, hook: HookPoint) -> Optional[torch.Tensor]: + # Read or modify `tensor`. Return None to leave the activation unchanged, + # or return a tensor of the same shape to replace it. + ... +``` + +`hook` (the `HookPoint` instance) exposes `hook.name` so a single function can dispatch on which hook called it. + +For backward hooks (gradient interventions), use `bwd_hooks=[...]` with the same tuple shape. + +### 3. `add_hook` + `remove_all_hook_fns` — manual lifecycle + +When you need hooks that persist across multiple forward passes (e.g., during training), drop down to the underlying API: + +```python +hook_point = model.blocks[5].attn.hook_z +hook_point.add_hook(my_hook_fn, dir="fwd") # temporary +hook_point.add_hook(my_hook_fn, dir="fwd", is_permanent=True) # survives reset + +# later +model.remove_all_hook_fns() # removes temporary hooks +model.remove_all_hook_fns(including_permanent=True) # also removes permanent +``` + +`add_hook` returns nothing useful; lifecycle is owned by the `HookPoint`. The `is_permanent` flag is the only way to survive a `remove_all_hook_fns()` call. + +--- + +## Hook naming + +Hook names are stable strings you address from outside the model. They differ between HookedTransformer and TransformerBridge: + +| System | Style | Example | +|---|---|---| +| `HookedTransformer` (legacy) | Uniform across architectures | `blocks.5.attn.hook_q`, `blocks.5.hook_resid_post`, `hook_embed` | +| `TransformerBridge` (default) | Architecture-native | `blocks.5.attn.q.hook_out`, `blocks.5.hook_out`, `embed.hook_out` | +| `TransformerBridge` + compatibility mode | Bridge-native **and** HT-style aliases both fire | Same as above, plus `blocks.5.attn.hook_q` etc. | + +For the full canonical-name catalogue, the demo notebooks ([Main Demo](generated/demos/Main_Demo), [Exploratory Analysis Demo](generated/demos/Exploratory_Analysis_Demo)) walk through them with examples. For an architecture diagram showing where each name fires, see [TransformerLens_Diagram.svg](../_static/TransformerLens_Diagram.svg). + +When porting HookedTransformer code to TransformerBridge, calling `bridge.enable_compatibility_mode()` (see [Compatibility Mode](compatibility_mode.md)) registers the HT-style aliases so your existing hook names continue to resolve. + +--- + +## Common patterns + +### Cache one activation, run a single forward pass + +```python +logits, cache = model.run_with_cache("text", names_filter="blocks.5.hook_resid_post") +resid_5 = cache["blocks.5.hook_resid_post"] +``` + +### Zero-ablate a head + +```python +def ablate(z, hook): + z[:, :, head_idx, :] = 0 + return z + +model.run_with_hooks("text", fwd_hooks=[(f"blocks.{layer}.attn.hook_z", ablate)]) +``` + +### Activation patching (swap an activation from one prompt into another) + +```python +_, clean_cache = model.run_with_cache(clean_prompt) +target = clean_cache["blocks.5.hook_resid_post"] + +def patch(resid, hook): + return target # replace corrupted's activation with clean's + +logits = model.run_with_hooks( + corrupted_prompt, + fwd_hooks=[("blocks.5.hook_resid_post", patch)], +) +``` + +[`transformer_lens/patching.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/patching.py) wraps this pattern for systematic sweeps across layers / positions. + +### Gradient intervention via backward hook + +```python +def scale_grad(grad, hook): + return grad * 0.1 + +model.run_with_hooks( + "text", + bwd_hooks=[("blocks.5.hook_resid_post", scale_grad)], +) +``` + +--- + +## Lifecycle gotchas + +- **Temporary hooks added outside `run_with_hooks` / `model.hooks(...)` do NOT auto-clean.** Call `model.remove_all_hook_fns()` or you'll leak hooks across runs. +- **Permanent hooks (`is_permanent=True`) survive `remove_all_hook_fns()`** — use `including_permanent=True` to clear them. +- **Hook functions that return a tensor replace the activation in-flight.** Returning `None` leaves it unchanged. In-place modification (`tensor[…] = …`) + `return tensor` is the common pattern. +- **Backward hooks see `(grad,)` tuples** at the PyTorch level — the wrapper in `hook_points.py` unwraps to a bare tensor for you. Your hook function still receives a bare tensor. +- **Hooks fire in registration order**, with `prepend=True` to register at the front. + +--- + +## See also + +- [Compatibility Mode](compatibility_mode.md) — when to enable HT-style hook aliases on a Bridge model. +- [Migrating to TransformerLens 3](migrating_to_v3.md) — porting HookedTransformer hook patterns to TransformerBridge. +- [Main Demo](generated/demos/Main_Demo) — end-to-end walkthrough using the hook system. +- [`transformer_lens/hook_points.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/hook_points.py), [`transformer_lens/ActivationCache.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/ActivationCache.py), [`transformer_lens/patching.py`](https://github.com/TransformerLensOrg/TransformerLens/blob/main/transformer_lens/patching.py) — source. diff --git a/docs/source/index.md b/docs/source/index.md index de5cc84eb..b10bd3a9d 100644 --- a/docs/source/index.md +++ b/docs/source/index.md @@ -54,6 +54,9 @@ content/migrating_to_v3 content/tutorials content/citation content/contributing +content/hook_system +content/compatibility_mode +content/debugging_numerical_divergence generated/demos/Main_Demo generated/demos/Exploratory_Analysis_Demo content/special_cases diff --git a/tests/AGENTS.md b/tests/AGENTS.md index c129dd788..17779c68a 100644 --- a/tests/AGENTS.md +++ b/tests/AGENTS.md @@ -2,6 +2,8 @@ Read [the root AGENTS.md](../AGENTS.md) for project-wide rules. This file covers conventions specific to `tests/`. +> **Just running tests?** Run `make test-pr` for the standard PR-review surface (unit + docstring + acceptance + integration). `make unit-test` for fast feedback. Full tier breakdown in [Test tiers](#test-tiers) below. + --- ## TL;DR @@ -33,17 +35,27 @@ Common combinations: `make test-pr` (unit + docstring + acceptance + integration ## Conftest hierarchy +[`tests/conftest.py`](conftest.py) — root, provides: + +- `cleanup_memory` (function autouse), `cleanup_class_memory` — CUDA/MPS cache + GC +- `_enable_hf_retry_for_tests` (session autouse) — wraps HF `from_pretrained` with 429 retry +- Seeded RNG (numpy/torch/Python @ 42) +- `gpt2_tokenizer` (session) +- `gpt2_hooked_processed` (session) +- `temp_dir` + +Sub-folder conftests: + | Path | Provides | |---|---| -| [`tests/conftest.py`](conftest.py) | `cleanup_memory` (function autouse) and `cleanup_class_memory` — CUDA/MPS cache + GC; `_enable_hf_retry_for_tests` (session autouse) — wraps HF `from_pretrained` with 429 retry; seeded RNG (numpy/torch/Python @ 42); `gpt2_tokenizer` + `gpt2_hooked_processed` (session); `temp_dir` | | [`tests/acceptance/conftest.py`](acceptance/conftest.py) | `gpt2_model`, `bloom_560m_hooked`, `bloom_560m_hf_model`, `bloom_560m_hf_tokenizer` (all session) | | [`tests/acceptance/model_bridge/conftest.py`](acceptance/model_bridge/conftest.py) | Bridge variants of gpt2 with/without compat mode | | [`tests/integration/model_bridge/conftest.py`](integration/model_bridge/conftest.py) | distilgpt2 + gpt2 Bridge variants × {compat, no-compat, no-processing} | -**Notes:** +Two cross-cutting rules: - All `transformer_lens` imports inside conftest fixtures live in fixture bodies, not at module top — jaxtyping's `pytest_configure` hook must install before the package is first imported. -- The session-scoped model fixtures (`gpt2_hooked_processed`, `gpt2_bridge`, …) are read-only — mutating them leaks across the entire test session. +- Session-scoped model fixtures (`gpt2_hooked_processed`, `gpt2_bridge`, …) are read-only — mutating them leaks across the entire test session. --- From 50eb9ff74c335961ddd1b713b03d976fc6a30ef9 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 18:48:51 -0500 Subject: [PATCH 07/11] second end-to-end test changes --- .claude/commands/add-model-support.md | 2 +- .claude/commands/verify-model.md | 21 ++++- AGENTS.md | 2 +- docs/source/content/contributing.md | 27 ++++++- .../supported_architectures/AGENTS.md | 77 +++++++++++++++++-- .../tools/model_registry/AGENTS.md | 26 ++++++- 6 files changed, 141 insertions(+), 14 deletions(-) diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md index 1f9bdcd12..3a4a9a9ae 100644 --- a/.claude/commands/add-model-support.md +++ b/.claude/commands/add-model-support.md @@ -48,7 +48,7 @@ Execute this checklist, stopping at each step until it is genuinely done: 6. **Add the HF repo entry** to [`data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json) per [supported_architectures/AGENTS.md §"Adding the HF repo to the registry"](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry). Then ask the user whether to also add the canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]`. -7. **Verify** end-to-end: run `/verify-model $ARGUMENTS`. (`/verify-model` enforces dry-run-first and one-at-a-time.) +7. **Verify** end-to-end: run `/verify-model $ARGUMENTS`. (`/verify-model` enforces dry-run-first and one-at-a-time.) Read both `status` and per-phase scores from the output. `STATUS_VERIFIED` means the system's hard gates passed (see [tools/model_registry/AGENTS.md §Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds) for the actual table) — but Phase 4's 50% bar is intentionally lenient. If Phase 4 on a small parity-test model is well below 100% even though `status == 1`, suspect a missing `preprocess_weights` fold or wrong `default_prepend_bos` and investigate before continuing to step 8. 8. **Write tests.** Read [Required tests in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#required-tests) for the two-layer pattern (unit adapter test + integration parity test) and copy the closest sibling. diff --git a/.claude/commands/verify-model.md b/.claude/commands/verify-model.md index bd728cb1f..7ad8a6ddc 100644 --- a/.claude/commands/verify-model.md +++ b/.claude/commands/verify-model.md @@ -45,8 +45,27 @@ Full reference: [tools/model_registry/AGENTS.md §Flag reference](../../transfor Batch flags (`--architectures`, `--per-arch`, `--limit`, `--resume`) are not relevant for `--model ` — use the batch invocation in [tools/model_registry/AGENTS.md §Canonical invocations](../../transformer_lens/tools/model_registry/AGENTS.md#canonical-invocations) if you need them. +## Interpreting the output + +`verify_models` enforces hard thresholds (from `_MIN_PHASE_SCORES` in `verify_models.py`): + +| Phase | Min score | Required tests | Below = | +|---|---|---|---| +| 1 | 100% | — | `STATUS_FAILED` | +| 2 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | +| 3 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | +| 4 | 50% | — | **Non-gating** — only adds `"low text quality"` to the `note`; never causes `STATUS_FAILED`. | +| 7 | 75% | `multimodal_forward` | `STATUS_FAILED`. NULL = fail. | +| 8 | 75% | `audio_forward` | `STATUS_FAILED`. NULL = fail. | + +`STATUS_VERIFIED` means the system's hard contract is met. The `note` field carries quality flags (`"low text quality"`) or failure details. + +**One caveat for adapter authors:** Phase 4's 50% bar is intentionally lenient (it's a coherence metric, not correctness). A small parity-test model that gets Phase 4 well below 100% can still indicate a real adapter bug the system doesn't gate on — most often a missing `preprocess_weights` fold ([supported_architectures/AGENTS.md §When to override preprocess_weights](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#when-to-override-preprocess_weights)) or wrong `default_prepend_bos` ([§Tokenizer policy](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy)). Worth investigating even on VERIFIED runs. + +Full reference: [tools/model_registry/AGENTS.md §Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds). + ## Hard rules **Use `verify_models`, never `main_benchmark`** — only `verify_models` writes `data/supported_models.json`. See [transformer_lens/tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md) for the canonical-invocations table, file roles, and the resume/checkpoint mechanism. -Run one model at a time — concurrent loads OOM a single device. Report the actual benchmark output and investigate any failure or drift per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). For phase failures or logit-parity drift, the bisection workflow lives in [docs/source/content/debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md). +Run one model at a time — concurrent loads OOM a single device. Report the actual benchmark output (including per-phase scores) and investigate any failure or drift per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). diff --git a/AGENTS.md b/AGENTS.md index 555935c0c..fc5c824d8 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -146,7 +146,7 @@ From [.github/workflows/checks.yml](.github/workflows/checks.yml): | Job | Runs | When it fails, run locally | |---|---|---| | `compatibility-checks` | `make unit-test` + `make acceptance-test` + `uv build` × py 3.10 / 3.11 / 3.12 | `make unit-test` / `make acceptance-test` / `uv build`. Repro Python-specific failures with `uv python install ` then `uv run --python pytest …` | -| `mps-checks` | macOS MPS unit + integration + smoke (PRs to `main` only) | `TRANSFORMERLENS_ALLOW_MPS=1 uv run pytest tests/mps` on a Mac with MPS | +| `mps-checks` | macOS MPS unit + integration + smoke (PRs targeting `main` or pushes to `main` only) | `TRANSFORMERLENS_ALLOW_MPS=1 uv run pytest tests/mps` on a Mac with MPS | | `format-check` | `make check-format` | `make format` (writes the fix) then commit | | `type-check` | `uv run mypy .` | `uv run mypy .` — fix the error; never add `# type: ignore` | | `docstring-test` | `make docstring-test` | `make docstring-test` — failures are doctest mismatches in `transformer_lens/` | diff --git a/docs/source/content/contributing.md b/docs/source/content/contributing.md index 7bbd7eb48..55202bbab 100644 --- a/docs/source/content/contributing.md +++ b/docs/source/content/contributing.md @@ -279,13 +279,23 @@ The `TestRegistrySyncedWithFactory` class bidirectionally asserts that `SUPPORTE Two test layers, both required: -1. **Unit adapter test** at `tests/unit/model_bridge/supported_architectures/test__adapter.py`. ~29 of these exist; copy the closest sibling. The pattern: a `_make_cfg()` factory, an `adapter` fixture, and one test per architecture-specific quirk. Unit adapter tests instantiate the adapter from a synthetic config and assert structural properties — they don't load weights and don't hit HF Hub. +1. **Unit adapter test** at `tests/unit/model_bridge/supported_architectures/test__adapter.py`. ~26 of these exist; copy the closest sibling. The pattern: a `_make_cfg()` factory, an `adapter` fixture, and one test per architecture-specific quirk. Unit adapter tests instantiate the adapter from a synthetic config and assert structural properties — they don't load weights and don't hit HF Hub. 2. **Integration parity test** at `tests/integration/model_bridge/test__adapter.py`. Loads a real cached HF model and asserts logit parity vs HuggingFace at fp32 + eager attention. ### Common adapter gotchas - **HF raw config attributes are invisible to TL-side consumers unless explicitly propagated to `self.cfg`.** Walk the HF `config.json` and mirror any non-standard knobs (`final_logit_softcapping`, `attn_logit_softcapping`, `query_pre_attn_scalar`, `sliding_window`, `layer_types`, custom `eps_attr` names) onto `self.cfg` so weight processing and forward passes can see them. -- **Tokenizer policy is per-model, not per-architecture.** Sibling models in the same family routinely differ — the chat-instruct variant may prepend BOS where the base does not, padding side can flip, EOS handling can differ. Never copy `default_prepend_bos`, padding side, or EOS handling from your starter adapter without checking the target's `tokenizer_config.json` first. +- **Some config attrs need both surface-on-cfg AND fold-into-weight** via a `preprocess_weights()` override. The trigger: a numerical operation HF's forward applies natively must also be baked into the raw weights, or `bridge.enable_compatibility_mode()` (which calls `process_weights` on raw weights) produces wrong results. Concrete examples in-tree: Cohere `logit_scale` → `unembed.weight`; Gemma embedding scale (`√d_model`) → `embed.weight`. Skip the fold and Phase 3 / Phase 4 of `verify_models` will silently degrade. +- **Tokenizer policy is per-model, not per-architecture.** Sibling models in the same family routinely differ — the chat-instruct variant may prepend BOS where the base does not, padding side can flip, EOS handling can differ. Never copy `default_prepend_bos`, padding side, or EOS handling from your starter adapter without checking the target's behaviour. **`tokenizer_config.json` lies for some architectures** (Cohere declares `add_bos_token=False` but HF's `__call__` prepends BOS anyway). The reliable check is to invoke the tokenizer: + + ```python + from transformers import AutoTokenizer + t = AutoTokenizer.from_pretrained("") + print(t("hello").input_ids) # what generation actually uses + print(t.bos_token_id) + ``` + + If `t("hello").input_ids[0] == t.bos_token_id`, set `cfg.default_prepend_bos = True`. Otherwise leave the flag unset. - **Hook names inside adapters are Bridge-native** (e.g., `blocks.{i}.hook_out`). HookedTransformer-style aliases (e.g., `blocks.{i}.hook_resid_post`) are registered elsewhere — in `transformer_lens/model_bridge/bridge.py` via `build_alias_to_canonical_map()`. Adapters declare canonical names only. - **No `# type: ignore` on `ComponentMapping` types.** Use `isinstance` or `typing.cast` if the type system disagrees. @@ -302,6 +312,19 @@ uv run python -m transformer_lens.tools.model_registry.verify_models --model `; the rest of the bridge reads them from there): | HF attribute | Surface as | Used by | |---|---|---| @@ -100,7 +100,54 @@ Common attributes that need explicit propagation: | `layer_types` | `self.cfg.layer_types` | Hybrid models with per-layer attention type lists | | Non-standard RMSNorm eps key | `self.cfg.eps_attr = ""` | Llama uses `"variance_epsilon"` instead of `"eps"` | -Rule of thumb: if the model card or HF source mentions a numerical knob, assume it needs to land on `self.cfg`. +**Weight-fold attributes** (need BOTH surface-on-cfg AND fold-into-weight via `preprocess_weights` — see [the next section](#when-to-override-preprocess_weights)): + +| HF attribute | Fold target | Used by | +|---|---|---| +| `logit_scale` | `unembed.weight` (multiply in fp32 then cast back) | Cohere — final logits scaled by `1/16` | +| `embedding_multiplier` / embed-scale flags | `embed.weight` (multiply) | Gemma — embeddings scaled by `√d_model` | +| Tied unembed with extra scale | `unembed.weight` | T5-family tied projection variants | + +Rule of thumb: if the model card or HF source mentions a numerical knob, assume it needs to land on `self.cfg`. If that knob changes weights or final outputs and HF's forward applies it natively, you ALSO need a `preprocess_weights` override or compatibility mode will diverge. + +--- + +## When to override `preprocess_weights` + +The framework default for `preprocess_weights()` is a no-op pass-through. Override it when **a numerical operation that HF's forward applies natively must also be baked into the raw weights** — otherwise `bridge.enable_compatibility_mode()` (which calls `process_weights` on the raw weights, expecting them to already encode all the math) produces wrong results. + +The trigger rule: + +> If a config attr changes the math of the forward pass AND would not be re-applied during compatibility-mode weight processing, fold it into the relevant weight inside `preprocess_weights()`. + +Concrete examples in-tree: + +- **Cohere** — `cfg.logit_scale` (default `0.0625`) must be folded into `unembed.weight`. HF forward multiplies logits by `logit_scale`; compat-mode `process_weights` does not, so without the fold, compat-mode logits are off by `1/16`. +- **Gemma1 / Gemma2 / Gemma3** — embedding scale (`√d_model`) must be folded into `embed.weight`. HF's `GemmaTextScaledWordEmbedding` scales internally on forward; compat-mode reads embeddings raw. + +Skeleton: + +```python +import torch + +def preprocess_weights(self, state_dict: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]: + """Fold into before ProcessWeights runs. + + bridge.py clones unembed.weight before calling this, so the scale does not + leak into the tied embed.weight. + """ + scale: float = getattr(self.cfg, "") # set in __init__ + if scale != 1.0: # no-op when scale is identity + key = "unembed.weight" + if key in state_dict: + orig_dtype = state_dict[key].dtype + state_dict[key] = (state_dict[key].float() * scale).to(orig_dtype) + return state_dict +``` + +**When you do NOT need `preprocess_weights`:** if the numerical operation lives entirely inside an HF submodule that the bridge delegates to (e.g. RoPE inside `CohereRotaryEmbedding`), HF forward applies it on both paths and you inherit parity for free. + +**Verification**: a missing `preprocess_weights` fold most often degrades Phase 4 (text-generation quality), and can show up as Phase 3 (weight processing) below 75% if severe enough to trip the strict gate. P4's 50% bar is intentionally lenient, so a sub-100% P4 on a small parity-test model is worth investigating even when the system reports `STATUS_VERIFIED`. See [tools/model_registry/AGENTS.md §Phase-score thresholds](../../tools/model_registry/AGENTS.md#phase-score-thresholds). --- @@ -120,11 +167,25 @@ Tokenizer-policy fields the adapter / load path is responsible for: **How to verify the right values:** -1. Read [`tokenizer_config.json`] in the HF repo. Look at `add_bos_token`, `padding_side`, `bos_token`, `eos_token`, `chat_template`. -2. Compare to the closest sibling already in the registry. If they differ, treat that as a deliberate choice and propagate it. -3. When unsure, **leave the framework default** — explicit wrongness is worse than implicit defaulting, because the wrong explicit value hides under "I configured it" assumptions. +1. **Run the tokenizer.** `tokenizer_config.json` LIES for some architectures — Cohere is the canonical example: it declares `add_bos_token=False` but HF's `__call__` prepends BOS anyway via `add_special_tokens=True`. The only reliable check is to actually invoke it: + + ```python + from transformers import AutoTokenizer + t = AutoTokenizer.from_pretrained("") + print("encode:", t.encode("hello")) # raw encode + print("__call__:", t("hello").input_ids) # what generation actually uses + print("bos_token_id:", t.bos_token_id) + ``` + + If `t("hello").input_ids[0] == t.bos_token_id`, set `cfg.default_prepend_bos = True`. If the first token is NOT BOS, leave the flag unset. + +2. **Cross-check against `tokenizer_config.json`** for `padding_side`, `bos_token`, `eos_token`, `chat_template`. These tend to be honest; the runtime override mostly affects BOS-prepending. + +3. **Compare to the closest sibling already in the registry.** If they differ, treat that as a deliberate choice and propagate it. + +4. **When unsure, leave the framework default.** Explicit wrongness is worse than implicit defaulting — the wrong explicit value hides under "I configured it" assumptions. -`default_prepend_bos` is the most common trap (instruct/base divergence within Llama, Mistral, Gemma families), but the same logic applies to every flag above. +`default_prepend_bos` is the most common trap (Cohere's tokenizer-config-vs-runtime mismatch, instruct/base divergence within Llama / Mistral / Gemma families). The same recipe applies to verifying every flag above. --- @@ -242,7 +303,7 @@ Two test layers per architecture, both required for review: ### 1. Unit adapter test — `tests/unit/model_bridge/supported_architectures/test__adapter.py` -29 of these exist; they all follow the same shape. Copy from a sibling that matches your architecture's quirks: +26 of these exist; they all follow the same shape. Copy from a sibling that matches your architecture's quirks: - **Standard decoder LM** template: [`test_gemma1_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_gemma1_adapter.py) or [`test_gpt2_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_gpt2_adapter.py) - **GQA / modern RMSNorm+RoPE** template: [`test_qwen3_adapter.py`](../../../tests/unit/model_bridge/supported_architectures/test_qwen3_adapter.py) diff --git a/transformer_lens/tools/model_registry/AGENTS.md b/transformer_lens/tools/model_registry/AGENTS.md index 91d1a0b82..270fe4628 100644 --- a/transformer_lens/tools/model_registry/AGENTS.md +++ b/transformer_lens/tools/model_registry/AGENTS.md @@ -123,7 +123,31 @@ Agents should never edit it manually; let `verify_models` manage it. | 7 | Multimodal (vision/text alignment) — only Llava / Gemma3-multimodal | | 8 | Audio — only Hubert | -A model entry's `status` is `1` (verified) only after all applicable phases pass. +### Phase-score thresholds + +`verify_models` enforces hard pass/fail at the thresholds in `_MIN_PHASE_SCORES` ([`verify_models.py:508`](verify_models.py)). Below threshold OR a required-test failure → `STATUS_FAILED`. The contract: + +| Phase | Min score | Required tests | Effect when below threshold or required tests fail | +|---|---|---|---| +| 1 | **100%** | — | `STATUS_FAILED` | +| 2 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | +| 3 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | +| 4 | 50% | — | **Non-gating.** Below 50% adds `"low text quality"` to the registry `note`; never causes `STATUS_FAILED`. | +| 7 | 75% | `multimodal_forward` | `STATUS_FAILED`. NULL score (processor unavailable) also fails. | +| 8 | 75% | `audio_forward` | `STATUS_FAILED`. NULL score also fails. | + +Phase 4 is intentionally lenient — the source comment ([`verify_models.py:554`](verify_models.py)) calls it *"a quality metric, not a correctness check."* The 50% bar asks "is the generated text coherent at all?" rather than "is this adapter clean?" + +**For adapter authors**: a `STATUS_VERIFIED` entry whose Phase 4 score is well below 100% on a small parity-test model (e.g. `tiny-ForCausalLM`) can still indicate a real adapter bug that the system intentionally doesn't gate on. Cohere's first cold-agent run produced `P4=74.8%, status=1, note="Full verification completed"` — the system was correct (74.8 > 50, P4 doesn't gate), but the score reflected a missing `preprocess_weights` fold for `logit_scale`. Treat sub-100% P4 on tiny test models as a yellow flag worth investigating, even when the system reports VERIFIED. + +**Reading the result**: + +- `status == 1` AND `note == "Full verification completed"` → all hard gates passed AND no quality flag. Good. +- `status == 1` AND `note` mentions `"low text quality"` → P4 < 50%; investigate. +- `status == 1` AND P4 well below 100% on a small model, no quality flag → still potentially a subtle weight-fold or tokenizer bug; investigate manually. +- `status == 3` (FAILED) → `note` carries the specific failure reason and which tests failed; debug from there. + +For Phase-1 / Phase-3 failures, see [supported_architectures/AGENTS.md §When to override preprocess_weights](../../model_bridge/supported_architectures/AGENTS.md#when-to-override-preprocess_weights) and [debugging_numerical_divergence.md](../../../docs/source/content/debugging_numerical_divergence.md). For Phase-4 quality drift, see [supported_architectures/AGENTS.md §Tokenizer policy](../../model_bridge/supported_architectures/AGENTS.md#tokenizer-policy) and the same `preprocess_weights` reference (logit-scale / embedding-scale folds typically degrade P4 without crossing the 50% gate). --- From c30d4677fefb039552891031077ba0370c2e9665 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 19:03:37 -0500 Subject: [PATCH 08/11] brevity pass --- .claude/commands/add-model-support.md | 48 ++++--- .claude/commands/verify-model.md | 42 +++--- AGENTS.md | 83 ++++++------ CLAUDE.md | 49 +++---- docs/source/content/compatibility_mode.md | 6 +- .../content/debugging_numerical_divergence.md | 12 +- docs/source/content/hook_system.md | 8 +- tests/AGENTS.md | 48 ++++--- tests/QUARANTINES.md | 124 ++++++++--------- .../supported_architectures/AGENTS.md | 126 ++++++++---------- .../tools/model_registry/AGENTS.md | 63 ++++----- 11 files changed, 287 insertions(+), 322 deletions(-) diff --git a/.claude/commands/add-model-support.md b/.claude/commands/add-model-support.md index 3a4a9a9ae..8508ff64e 100644 --- a/.claude/commands/add-model-support.md +++ b/.claude/commands/add-model-support.md @@ -3,53 +3,51 @@ description: Guided workflow for adding a new architecture adapter to Transforme argument-hint: --- -You are adding TransformerBridge support for the HuggingFace model `$ARGUMENTS`. If `$ARGUMENTS` is empty, ask the user for the HF repo path before continuing. +Adding TransformerBridge support for HF model `$ARGUMENTS`. If empty, ask the user for the HF repo path first. -Each step below names the doc to read **only when you reach that step** — do not load all of them up front. - -Execute this checklist, stopping at each step until it is genuinely done: +Each step names the doc to read **when you reach that step** — don't load all up front. 1. **Check registry state and decide whether to verify.** - Determine the current state: - - **Architecture supported?** Check `SUPPORTED_ARCHITECTURES` in [`transformer_lens/factories/architecture_adapter_factory.py`](../../transformer_lens/factories/architecture_adapter_factory.py). - - **Model in registry?** Check whether `$ARGUMENTS` appears in [`transformer_lens/tools/model_registry/data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json), and if so what `status` it has (0=unverified, 1=verified, 2=skipped, 3=failed). + State: + - Architecture supported? Check `SUPPORTED_ARCHITECTURES` in [`architecture_adapter_factory.py`](../../transformer_lens/factories/architecture_adapter_factory.py). + - Model in registry? Check [`supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json); note `status` (0=unverified, 1=verified, 2=skipped, 3=failed). - Then branch: + Branch: - - **Architecture supported AND model already in registry with `status == 1` (verified)** → the model is already verified end-to-end. Ask the user what symptom they're seeing — this is most likely a bug-report path, not an add-support path. Stop the workflow here. - - **Architecture supported, model in registry with `status != 1`** (0 unverified, 2 skipped, 3 failed) → proceed to **Confirm before verification** below. If `status == 3`, read the existing `note` field first to see what failure mode you may be about to re-trigger. - - **Architecture supported, model NOT in registry** → add a registry entry per [supported_architectures/AGENTS.md §"Adding the HF repo to the registry"](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry) with `status: 0` and null phase scores, then proceed to **Confirm before verification**. - - **Architecture NOT supported** → skip to step 2 below (the full adapter-authoring workflow). + - **Supported AND `status==1`** → already verified. Ask the user the symptom (bug-report path, not add-support). Stop. + - **Supported, `status != 1`** → proceed to **Confirm before verification**. If `status==3`, read existing `note` for the prior failure mode. + - **Supported, not in registry** → add an entry per [§Adding the HF repo to the registry](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry) with `status: 0` and null scores, then proceed. + - **Not supported** → skip to step 2. ### Confirm before verification - Always ask the user before running verification, even for small models. To give them an informed decision: + Always ask the user first, even for small models: - 1. Run dry-run to project cost: + 1. Dry-run to project cost: ``` set -a; source .env; set +a uv run python -m transformer_lens.tools.model_registry.verify_models --model "$ARGUMENTS" --dry-run ``` - 2. Show the user: model ID, architecture class, estimated parameters, projected memory (GB), whether `HF_TOKEN` is needed, runtime expectation (30 s – 2 min sub-1B, 2–15 min 1B–7B, 15+ min 7B+ or multimodal), and what verification does (Phases 1–4; updates `supported_models.json` on success). + 2. Show: model ID, architecture class, estimated parameters, projected memory (GB), HF_TOKEN needed?, runtime (30 s–2 min sub-1B, 2–15 min 1B–7B, 15+ min 7B+/multimodal), what verification does (Phases 1–4; updates `supported_models.json` on success). 3. Ask: "Run verification on this machine? (Y/N)" - **Confirm** → `/verify-model $ARGUMENTS`. On pass, registry updated, done. On fail, see [debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md) (per-sibling adapter bug, not a new adapter). + **Confirm** → `/verify-model $ARGUMENTS`. On pass, done. On fail, see [debugging_numerical_divergence.md](../../docs/source/content/debugging_numerical_divergence.md) (per-sibling adapter bug). - **Reject** → `gh issue create --template verify-model.md` (fill from the dry-run output you already collected). If `gh` isn't installed: . Stop the workflow. + **Reject** → `gh issue create --template verify-model.md` (fill from dry-run output). No `gh`? . Stop. -2. **Analyze the HF model.** Read its `config.json` and source. Identify embedding, attention, MLP, normalization, and output-head layouts. Now read [Config-attr propagation in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation) and run through the non-standard attributes table (`final_logit_softcapping`, `sliding_window`, etc.); decide which need to surface on `self.cfg`. +2. **Analyze the HF model.** Read `config.json` and source — identify embedding, attention, MLP, normalization, output-head layouts. Read [§Config-attr propagation](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#config-attr-propagation) and decide which non-standard attrs (`final_logit_softcapping`, `sliding_window`, etc.) need surfacing on `self.cfg`. -3. **Pick a starting adapter.** Read the [starter-adapter table in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#starter-adapter-table). Copy the closest match into [`transformer_lens/model_bridge/supported_architectures/`](../../transformer_lens/model_bridge/supported_architectures/) as `.py`. **Watch tokenizer-policy flags** — see [Tokenizer policy in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy) (`default_prepend_bos`, padding side, EOS handling, chat-template wiring are all per-model, not per-architecture). +3. **Pick a starting adapter.** See [§Starter-adapter table](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#starter-adapter-table). Copy into [`supported_architectures/`](../../transformer_lens/model_bridge/supported_architectures/) as `.py`. **Tokenizer-policy flags are per-model** — see [§Tokenizer policy](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy). -4. **Fill in `self.component_mapping`** so each HF module path resolves to a canonical Bridge name. Hook names are Bridge-native. The [Minimal contract](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#minimal-contract) and [Common gotchas](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#common-gotchas) tables in supported_architectures/AGENTS.md are the references for this step. +4. **Fill `self.component_mapping`.** Bridge-native hook names. Reference: [§Minimal contract](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#minimal-contract), [§Common gotchas](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#common-gotchas). -5. **Register in all four sites** per the [registration steps in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#registration-steps). After registering, run the invariant test to confirm the four-place wiring: `uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory`. +5. **Register in all four sites** per [§Registration steps](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#registration-steps). Then run the invariant test: `uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory`. -6. **Add the HF repo entry** to [`data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json) per [supported_architectures/AGENTS.md §"Adding the HF repo to the registry"](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry). Then ask the user whether to also add the canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]`. +6. **Add the HF repo entry** to [`data/supported_models.json`](../../transformer_lens/tools/model_registry/data/supported_models.json) per [§Adding the HF repo to the registry](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#adding-the-hf-repo-to-the-registry). Ask the user about adding canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]`. -7. **Verify** end-to-end: run `/verify-model $ARGUMENTS`. (`/verify-model` enforces dry-run-first and one-at-a-time.) Read both `status` and per-phase scores from the output. `STATUS_VERIFIED` means the system's hard gates passed (see [tools/model_registry/AGENTS.md §Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds) for the actual table) — but Phase 4's 50% bar is intentionally lenient. If Phase 4 on a small parity-test model is well below 100% even though `status == 1`, suspect a missing `preprocess_weights` fold or wrong `default_prepend_bos` and investigate before continuing to step 8. +7. **Verify** end-to-end: `/verify-model $ARGUMENTS`. Read both `status` AND per-phase scores. `STATUS_VERIFIED` means hard gates passed (see [§Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds)) — but P4's 50% bar is intentionally lenient. P4 well below 100% on a small parity-test model + `status==1` → suspect missing `preprocess_weights` fold or wrong `default_prepend_bos`; investigate before step 8. -8. **Write tests.** Read [Required tests in supported_architectures/AGENTS.md](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#required-tests) for the two-layer pattern (unit adapter test + integration parity test) and copy the closest sibling. +8. **Write tests** per [§Required tests](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#required-tests) (unit + integration). Copy the closest sibling. -9. **Run `/task-complete`** — handles comment cleanup, `/format`, and the standard test tiers, looping until clean. +9. **`/task-complete`** — comment cleanup, `/format`, standard test tiers, loop until clean. diff --git a/.claude/commands/verify-model.md b/.claude/commands/verify-model.md index 7ad8a6ddc..8eec7501a 100644 --- a/.claude/commands/verify-model.md +++ b/.claude/commands/verify-model.md @@ -3,11 +3,11 @@ description: Run verify_models.py against a single model (non-parallel). Always argument-hint: --- -Verify model `$ARGUMENTS`. If `$ARGUMENTS` is empty, ask the user which model to verify (an HF repo path like `gpt2` or `meta-llama/Llama-2-7b-hf`, or a registry alias) before running. +Verify model `$ARGUMENTS`. If empty, ask for an HF repo path (e.g. `gpt2`, `meta-llama/Llama-2-7b-hf`) or registry alias. ## Always dry-run first -Verification loads the full model and runs Phases 1–4 — 30 s to 30 min and needs enough free memory to hold the model. **Never invoke the real run blindly.** +Verification loads the full model and runs Phases 1–4 — 30 s to 30 min, needs memory to hold the model. **Never invoke the real run blindly.** ``` set -a; source .env; set +a @@ -16,12 +16,10 @@ uv run python -m transformer_lens.tools.model_registry.verify_models --model "$A Capture: estimated parameter count, projected memory (GB), HF_TOKEN requirement, architecture class. -Decide whether to confirm with the user: - | Model | Action | |---|---| -| Cached small (`gpt2`, `attn-only-*`, `tiny-stories-1M`, `distilgpt2`, etc.) | Proceed directly; report the dry-run in your response so the user can intervene | -| ≥1B params, or gated, or anything else | Present the dry-run to the user and ask before running | +| Cached small (`gpt2`, `attn-only-*`, `tiny-stories-1M`, `distilgpt2`, …) | Proceed; report dry-run in your response so user can intervene | +| ≥1B params, gated, or anything else | Present dry-run, ask before running | ## Run the verification @@ -34,38 +32,38 @@ uv run python -m transformer_lens.tools.model_registry.verify_models --model "$A Full reference: [tools/model_registry/AGENTS.md §Flag reference](../../transformer_lens/tools/model_registry/AGENTS.md#flag-reference). -- `--device cpu|cuda|mps` — override automatic device selection -- `--dtype float32|bfloat16` — override automatic dtype selection -- `--max-memory ` — skip if parameter estimate exceeds; on a 24 GB GPU, `--max-memory 16` leaves headroom for activations -- `--phases 1 2 3` — restrict to a subset (Phase 4 generation is the slowest; restrict when debugging Phase 1 forward parity) -- `--dry-run` — project memory + parameter count without loading the model (see above; always run this first) -- `--no-hf-reference` / `--no-ht-reference` — skip the HF / HT comparison passes (faster, lower confidence) -- `--reverify` — re-test a model already at `status == 1` -- `--retry-failed` — re-test a model at `status == 3` (read its existing `note` first to see the prior failure mode) +- `--device cpu|cuda|mps` — override device selection +- `--dtype float32|bfloat16` — override dtype +- `--max-memory ` — skip if param estimate exceeds; e.g. `16` on a 24 GB GPU leaves headroom for activations +- `--phases 1 2 3` — restrict (P4 is slowest; restrict when debugging P1 forward parity) +- `--dry-run` — see above; always first +- `--no-hf-reference` / `--no-ht-reference` — skip HF / HT comparison (faster, lower confidence) +- `--reverify` — re-test `status==1` +- `--retry-failed` — re-test `status==3` (read existing `note` first) -Batch flags (`--architectures`, `--per-arch`, `--limit`, `--resume`) are not relevant for `--model ` — use the batch invocation in [tools/model_registry/AGENTS.md §Canonical invocations](../../transformer_lens/tools/model_registry/AGENTS.md#canonical-invocations) if you need them. +Batch flags (`--architectures`, `--per-arch`, `--limit`, `--resume`) don't apply to `--model ` — use [§Canonical invocations](../../transformer_lens/tools/model_registry/AGENTS.md#canonical-invocations). ## Interpreting the output -`verify_models` enforces hard thresholds (from `_MIN_PHASE_SCORES` in `verify_models.py`): +Hard thresholds (`_MIN_PHASE_SCORES` in `verify_models.py`): | Phase | Min score | Required tests | Below = | |---|---|---|---| | 1 | 100% | — | `STATUS_FAILED` | | 2 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | | 3 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | -| 4 | 50% | — | **Non-gating** — only adds `"low text quality"` to the `note`; never causes `STATUS_FAILED`. | +| 4 | 50% | — | **Non-gating** — adds `"low text quality"` to `note`; never fails. | | 7 | 75% | `multimodal_forward` | `STATUS_FAILED`. NULL = fail. | | 8 | 75% | `audio_forward` | `STATUS_FAILED`. NULL = fail. | -`STATUS_VERIFIED` means the system's hard contract is met. The `note` field carries quality flags (`"low text quality"`) or failure details. +`STATUS_VERIFIED` means hard gates passed. `note` carries quality flags or failure details. -**One caveat for adapter authors:** Phase 4's 50% bar is intentionally lenient (it's a coherence metric, not correctness). A small parity-test model that gets Phase 4 well below 100% can still indicate a real adapter bug the system doesn't gate on — most often a missing `preprocess_weights` fold ([supported_architectures/AGENTS.md §When to override preprocess_weights](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#when-to-override-preprocess_weights)) or wrong `default_prepend_bos` ([§Tokenizer policy](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy)). Worth investigating even on VERIFIED runs. +**Adapter-author caveat:** P4's 50% bar is intentionally lenient (coherence, not correctness). P4 well below 100% on a small parity-test model can indicate a real bug the system doesn't gate on — most often a missing [`preprocess_weights` fold](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#when-to-override-preprocess_weights) or wrong [`default_prepend_bos`](../../transformer_lens/model_bridge/supported_architectures/AGENTS.md#tokenizer-policy). Investigate even on VERIFIED. -Full reference: [tools/model_registry/AGENTS.md §Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds). +Full reference: [§Phase-score thresholds](../../transformer_lens/tools/model_registry/AGENTS.md#phase-score-thresholds). ## Hard rules -**Use `verify_models`, never `main_benchmark`** — only `verify_models` writes `data/supported_models.json`. See [transformer_lens/tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md) for the canonical-invocations table, file roles, and the resume/checkpoint mechanism. +**Use `verify_models`, never `main_benchmark`** — only `verify_models` writes `data/supported_models.json` ([tools/model_registry/AGENTS.md](../../transformer_lens/tools/model_registry/AGENTS.md)). -Run one model at a time — concurrent loads OOM a single device. Report the actual benchmark output (including per-phase scores) and investigate any failure or drift per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). +One model at a time — concurrent loads OOM. Report actual per-phase scores; investigate failures per [AGENTS.md §10](../../AGENTS.md#10-hard-rules). diff --git a/AGENTS.md b/AGENTS.md index fc5c824d8..10fbef1a2 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -6,15 +6,15 @@ This file is the single source of truth. Vendor-specific files ([CLAUDE.md](CLAU > **Just want to use the library?** Start with the [README](README.md). **Just want to run the tests?** Skip to [§3 Quickstart](#3-quickstart). **Contributing?** Read on. -## TL;DR — read before doing anything +## TL;DR -1. **Use `uv`**, not `pip` or `poetry`. Install with `uv sync`. -2. **Source `.env`** (`set -a; source .env; set +a`) before any HuggingFace-Hub-hitting command. Gated models need `HF_TOKEN`. +1. **Use `uv`**, not `pip` or `poetry` (`uv sync`). +2. **Source `.env`** (`set -a; source .env; set +a`) before any HF-Hub command. 3. **Base PRs on `dev`**, not `main`. Never name a branch `main` or `dev`. -4. **Mirror HookedTransformer → TransformerBridge** when changing behaviour that exists in both ([§2](#2-two-systems-live-in-this-repo)). -5. **Run `make format` + `uv run mypy .` before push** — no pre-commit hook is installed. -6. **Never add `# type: ignore`** — use `isinstance` / `typing.cast` ([§10](#10-hard-rules)). -7. **Never dismiss a failing test as "pre-existing"** — investigate every failure ([§10](#10-hard-rules)). +4. **Mirror HookedTransformer → TransformerBridge** when behaviour exists in both ([§2](#2-two-systems-live-in-this-repo)). +5. **`make format` + `uv run mypy .` before push** — no pre-commit hook. +6. **Never add `# type: ignore`** ([§10](#10-hard-rules)). +7. **Never dismiss a failing test as "pre-existing"** ([§10](#10-hard-rules)). Sub-folder rules: [tests/AGENTS.md](tests/AGENTS.md) · [supported_architectures/AGENTS.md](transformer_lens/model_bridge/supported_architectures/AGENTS.md) · [tools/model_registry/AGENTS.md](transformer_lens/tools/model_registry/AGENTS.md). @@ -31,13 +31,11 @@ Sub-folder rules: [tests/AGENTS.md](tests/AGENTS.md) · [supported_architectures | **`TransformerBridge`** | v3 — default for new work | [transformer_lens/model_bridge/](transformer_lens/model_bridge/) | Raw HF weights by default; `bridge.enable_compatibility_mode()` for HT-equivalent | [transformer_lens/tools/model_registry/data/supported_models.json](transformer_lens/tools/model_registry/data/supported_models.json) | | **`HookedTransformer`** | Legacy, maintenance mode, deprecated in 3.0 | [transformer_lens/HookedTransformer.py](transformer_lens/HookedTransformer.py) + [transformer_lens/components/](transformer_lens/components/) | Folds LayerNorm + centres weights → does NOT match HF | [transformer_lens/supported_models.py](transformer_lens/supported_models.py) (**HT-only**) | -> ⚠ The **HookedTransformer acceptance test suite is currently quarantined** ([tests/acceptance/test_hooked_transformer.py](tests/acceptance/test_hooked_transformer.py), [test_hooked_encoder.py](tests/acceptance/test_hooked_encoder.py), [test_hooked_encoder_decoder.py](tests/acceptance/test_hooked_encoder_decoder.py)) — see [tests/QUARANTINES.md §⚠️](tests/QUARANTINES.md). HT-touching changes land essentially untested at the acceptance level until the suite is re-enabled; extra manual care is required. +> ⚠ The **HookedTransformer acceptance suite is quarantined** ([test_hooked_transformer.py](tests/acceptance/test_hooked_transformer.py), [test_hooked_encoder.py](tests/acceptance/test_hooked_encoder.py), [test_hooked_encoder_decoder.py](tests/acceptance/test_hooked_encoder_decoder.py); see [QUARANTINES.md](tests/QUARANTINES.md)). HT changes land untested at the acceptance level — extra manual care required. -Bridge architecture-adapter pattern: each HF architecture has one file in [transformer_lens/model_bridge/supported_architectures/](transformer_lens/model_bridge/supported_architectures/) mapping HF module paths to canonical TransformerLens names. Bridge hook names are architecture-native (e.g. `blocks.{i}.hook_out`); HT-style aliases are registered separately in [transformer_lens/model_bridge/bridge.py](transformer_lens/model_bridge/bridge.py). +Bridge architecture-adapter pattern: each HF architecture has one file in [supported_architectures/](transformer_lens/model_bridge/supported_architectures/) mapping HF module paths to canonical names. Bridge hooks are architecture-native (e.g. `blocks.{i}.hook_out`); HT-style aliases live in [bridge.py](transformer_lens/model_bridge/bridge.py). -### Mirroring rule - -If you change behaviour in `HookedTransformer` that has a counterpart in `TransformerBridge`, update both in the same PR. Parallel implementations; drift between them is a recurring source of bugs. The HT-only [supported_models.py](transformer_lens/supported_models.py) is NOT the place to add Bridge-only models — those go in the Bridge registry data file. +**Mirroring rule:** if you change `HookedTransformer` behaviour that has a `TransformerBridge` counterpart, update both in the same PR. [supported_models.py](transformer_lens/supported_models.py) is HT-only — Bridge-only models go in the Bridge registry data file. ## 3. Quickstart @@ -107,12 +105,10 @@ Python: **>=3.10, <4.0**. CI tests 3.10, 3.11, 3.12. Format/type/docstring check ## 5. Hook naming — HT vs Bridge -- **HT canonical names**: uniform across architectures. `hook_embed`, `blocks.{i}.hook_resid_pre`, `blocks.{i}.attn.hook_q`, `blocks.{i}.hook_resid_post`, etc. -- **Bridge-native names**: architecture-shaped. E.g. `blocks.{i}.hook_out`, `blocks.{i}.attn.q.hook_out`. Aliases are registered via `build_alias_to_canonical_map()` in [transformer_lens/model_bridge/bridge.py](transformer_lens/model_bridge/bridge.py) so HT names continue to resolve. - -When writing new code, prefer Bridge-native names. +- **HT canonical**: uniform across architectures — `hook_embed`, `blocks.{i}.hook_resid_pre`, `blocks.{i}.attn.hook_q`, `blocks.{i}.hook_resid_post`. +- **Bridge-native**: architecture-shaped — `blocks.{i}.hook_out`, `blocks.{i}.attn.q.hook_out`. HT aliases registered via `build_alias_to_canonical_map()` in [bridge.py](transformer_lens/model_bridge/bridge.py). -When raw-HF-forward drivers compare against `boot_transformers`, match its load configuration (fp32, eager attention). Probe for structural features like `resid_mid` rather than assuming all architectures expose them. +Prefer Bridge-native names in new code. Raw-HF-forward drivers comparing against `boot_transformers` must match its load configuration (fp32, eager attention) and probe for optional features like `resid_mid` rather than assume. ## 6. Adding a model @@ -133,11 +129,11 @@ When picking solutions to any problem, prioritize by **research impact**, not im - Include the word `docs` in the branch name if your PR is primarily a docs change (this triggers the docs build job). - New branches must track their own remote — `git push -u origin ` from your branch, not from `main`/`dev`. -**Pre-push checks**: no pre-commit hook is installed. Run `make format` and `uv run mypy .` manually before pushing; CI will fail otherwise. +**Pre-push checks**: no pre-commit hook — `make format` + `uv run mypy .` manually. **PR template**: [.github/PULL_REQUEST_TEMPLATE.md](.github/PULL_REQUEST_TEMPLATE.md). No conventional-commits enforcement. -**Changelog / release notes**: there is no per-PR changelog file. **Note user-facing changes in your PR description and commit messages** — release essays in [docs/source/content/news/](docs/source/content/news/) and the GitHub Releases page are drafted by a maintainer at release time by reviewing those. Breaking changes and notable user-facing features should be called out explicitly so the rollup picks them up. +**Changelog**: no per-PR file. Note user-facing changes in your PR description and commit messages — release essays in [news/](docs/source/content/news/) and GitHub Releases are drafted by a maintainer from those at release time. ## 9. CI gates a PR must pass @@ -159,29 +155,29 @@ In-progress PR runs cancel on new commit; tag/release runs do not. ## 10. Hard rules -**On test failures:** +**Test failures:** -- Never dismiss a failing test as "pre-existing" or "unrelated" — investigate every failure and fix the underlying issue. -- Never fabricate test counts or claim tests passed without running them. Reviewers re-run tests; agent self-reports are not trusted as evidence. -- Never add platform skips, `@pytest.mark.skip`, or `xfail` to make a failing CI green. Debug the actual bug. -- If new tests surface pre-existing bugs, fix them. Don't punt to a future PR or `xfail` marker. +- Never dismiss a failing test as "pre-existing" — investigate every failure. +- Never fabricate test counts or claim passes without running. Reviewers re-run. +- Never add `@pytest.mark.skip` / `xfail` / platform-skip to make CI green. Debug the bug. +- If new tests surface pre-existing bugs, fix them — don't punt. -**On code quality:** +**Code quality:** -- Never add `# type: ignore`. Prefer `isinstance` assertions or `typing.cast` for narrowing. -- Comments: terse, one-line docstrings; inline comments should explain WHY, not WHAT. No multi-paragraph explanations. -- Never reference plan-file details (audit IDs, finding IDs, "see plan section X") in source comments or docstrings. Plan artifacts belong in PR descriptions, not the codebase. +- Never add `# type: ignore` — use `isinstance` / `typing.cast`. +- Comments: terse one-liners; inline comments explain WHY, not WHAT. +- Never reference plan-file details (audit IDs, "see plan section X") in source. -**On numerical work:** +**Numerical work:** -- Never claim observed drift is "fp noise" without empirical evidence. Real bugs and accumulated rounding error look identical at noise scale. -- Never run model verification or benchmarks in parallel, unless explicitly requested to — a single CUDA/MPS device does not have memory for concurrent loads. Serialize them. +- Never claim drift is "fp noise" without empirical evidence — bugs and accumulated rounding look identical at noise scale. +- Never run model verification or benchmarks in parallel — a single CUDA/MPS device OOMs. -**On environment and tooling:** +**Environment:** -- Use `uv`, never `pip` or `poetry`. Run commands via `uv run ` or the appropriate `make` target. -- Source `.env` before any HF-Hub-hitting command (docs build, notebook runs, `boot_transformers` against gated models). The `HF_TOKEN` is required for Llama, Mistral, Gemma, and gated Qwen variants. -- No pre-commit hook is installed; run `make format` + `uv run mypy .` manually before push. +- `uv` only — never `pip` or `poetry`. Run via `uv run ` or `make`. +- Source `.env` before any HF-Hub-hitting command. `HF_TOKEN` required for Llama, Mistral, Gemma, gated Qwen. +- No pre-commit hook — `make format` + `uv run mypy .` manually before push. ## 11. "Done" checklist @@ -207,10 +203,10 @@ Claude Code: `/task-complete` automates the last row. See [§15 Workflow shortcu ## 13. Local-only conventions -Two gitignored paths for ephemeral work (gitignore entries checked in — first-class conventions, not personal): +Gitignored, first-class (entries checked in): -- **`transformer_lens/scratch.py`** — sibling-of-package scratch file for one-off bisection scripts and ad-hoc imports. -- **`.adapter-workspace/`** — directory for adapter WIP (notes, config dumps, repro scripts). +- **`transformer_lens/scratch.py`** — one-off bisection scripts, ad-hoc imports. +- **`.adapter-workspace/`** — adapter WIP (notes, dumps, repros). ## 14. Upstream dependency pins @@ -226,13 +222,12 @@ Load-bearing pins live in [pyproject.toml](pyproject.toml): **Bumping upstream pins:** -1. Bump the pin in `pyproject.toml`, refresh `uv.lock` (`uv lock`). -2. Run `make test-pr` locally; expect adapter / hook tests to surface real breakages. -3. For each break, fix the adapter or component — do not pin around the regression. -4. Run `uv run python -m transformer_lens.tools.model_registry.verify_models --architectures ` to catch numerical regressions the unit tests miss. -5. Land in a focused PR — pin bumps are reviewed separately from feature work because the blast radius is wide. +1. Bump in `pyproject.toml`, `uv lock`. +2. `make test-pr` — expect real breakages, don't pin around them. +3. `uv run python -m transformer_lens.tools.model_registry.verify_models --architectures ` to catch numerical regressions. +4. Land in a focused PR — pin bumps are reviewed separately (wide blast radius). -Specific to HF: when a `transformers` minor bump introduces a new architecture or renames an existing one, check the impact on `_HF_PASSTHROUGH_ATTRS` in [`transformer_lens/model_bridge/sources/_bridge_builder.py`](transformer_lens/model_bridge/sources/_bridge_builder.py) and `SUPPORTED_ARCHITECTURES` in [`transformer_lens/factories/architecture_adapter_factory.py`](transformer_lens/factories/architecture_adapter_factory.py). +HF specifically: a `transformers` minor bump that adds or renames an architecture impacts `_HF_PASSTHROUGH_ATTRS` in [`_bridge_builder.py`](transformer_lens/model_bridge/sources/_bridge_builder.py) and `SUPPORTED_ARCHITECTURES` in [`architecture_adapter_factory.py`](transformer_lens/factories/architecture_adapter_factory.py). ## 15. Workflow shortcuts diff --git a/CLAUDE.md b/CLAUDE.md index 38b1674a8..f075c399b 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,33 +1,34 @@ # CLAUDE.md -**Read [AGENTS.md](AGENTS.md) first.** It is the single source of truth for project conventions, quickstart commands, repo layout, hard rules, and PR conventions. Everything below is Claude-Code-specific tooling on top of it. +**Read [AGENTS.md](AGENTS.md) first** — single source of truth. Everything below is Claude-Code-specific. -## Claude Code conveniences in this repo +## Slash commands ([.claude/commands/](.claude/commands/)) -- **Slash commands** — see [.claude/commands/](.claude/commands/) for canonical workflows: - - `/test-unit` — `make unit-test` - - `/test-all` — `make test` (long; runs benchmarks + notebooks too) - - `/format` — `make format` then `uv run mypy .` - - `/typecheck` — `uv run mypy .` - - `/build-docs` — sources `.env`, runs `uv run build-docs` - - `/verify-model ` — guided single-model run of `verify_models.py` - - `/add-model-support ` — checklist-driven new-adapter workflow - - `/task-complete` — end-of-task gate: clean up new comments per the comment rules in AGENTS.md, then `/format`, then `make test-pr` (unit + docstring + acceptance + integration), looping until clean. Skips notebook + benchmark tiers (run those directly if your change touched them). -- **Project settings** — [.claude/settings.json](.claude/settings.json) is the checked-in scaffold for project-wide Claude Code settings (hooks, env). It intentionally ships with no permission allowlist; configure your own permissions in `.claude/settings.local.json` (gitignored). -- **First-time permission setup** — the empty checked-in allowlist means every command (`uv sync`, `make unit-test`, `git status`, `gh pr view`, etc.) prompts on first use. Two ways to reduce the noise: (1) run the built-in `/fewer-permission-prompts` skill, which scans your recent transcripts and adds the read-only commands you actually use to `.claude/settings.local.json`; or (2) approve commands with "Don't ask again" the first time each is used. Both routes write to the gitignored local settings file — nothing committed. -- **User-specific settings** — `.claude/settings.local.json`, `.claude/agents/`, `.claude/worktrees/` are all gitignored. Configure them per-user. +- `/test-unit` — `make unit-test` +- `/test-all` — `make test` (long; benchmarks + notebooks too) +- `/format` — `make format` + `uv run mypy .` +- `/typecheck` — `uv run mypy .` +- `/build-docs` — sources `.env`, runs `uv run build-docs` +- `/verify-model ` — guided single-model `verify_models.py` run +- `/add-model-support ` — checklist-driven new-adapter workflow +- `/task-complete` — end-of-task gate: clean new comments, `/format`, `make test-pr` (unit + docstring + acceptance + integration), loop until clean. Skips notebook + benchmark tiers. -## Pointers when working with Claude Code in this repo +## Settings + +- [.claude/settings.json](.claude/settings.json) is the checked-in scaffold (hooks, env); ships with no permission allowlist. +- **First-time permission setup**: every command prompts on first use. Either (1) run `/fewer-permission-prompts` to populate `.claude/settings.local.json` from your transcripts, or (2) approve with "Don't ask again." Both write to the gitignored local file. +- `.claude/settings.local.json`, `.claude/agents/`, `.claude/worktrees/` are gitignored — per-user. + +## Pointers - [AGENTS.md §10](AGENTS.md#10-hard-rules) — hard rules; load-bearing. -- [AGENTS.md §2](AGENTS.md#2-two-systems-live-in-this-repo) — HT → Bridge mirroring rule; single most common source of PR-review pushback. -- [tests/QUARANTINES.md](tests/QUARANTINES.md) — check this before debugging any failing test. Known quarantines have "un-skip when…" lines; the macOS-arm64 KV-cache skip is the most common time-sink. -- [docs/source/content/debugging_numerical_divergence.md](docs/source/content/debugging_numerical_divergence.md) — bisection workflow for Bridge-vs-HF logit drift. -- [docs/source/content/compatibility_mode.md](docs/source/content/compatibility_mode.md) — what `bridge.enable_compatibility_mode()` actually does. Read before deciding whether a new test needs it. +- [AGENTS.md §2](AGENTS.md#2-two-systems-live-in-this-repo) — HT → Bridge mirroring; most common PR-review pushback. +- [tests/QUARANTINES.md](tests/QUARANTINES.md) — check before debugging any failing test. The macOS-arm64 KV-cache skip is the most common time-sink. +- [debugging_numerical_divergence.md](docs/source/content/debugging_numerical_divergence.md) — Bridge-vs-HF logit drift bisection. +- [compatibility_mode.md](docs/source/content/compatibility_mode.md) — `bridge.enable_compatibility_mode()` contract; read before adding tests that use it. -## Looking for a starter task? +## Starter tasks -- [Issues labeled `good first issue`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) -- [Issues labeled `help wanted`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) -- [Issues labeled `verification-request`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3Averification-request) — models awaiting verification on hardware the requester didn't have. Pick one that fits your machine and run `/verify-model `. -- Backfilling per-adapter unit tests under [`tests/unit/model_bridge/supported_architectures/`](tests/unit/model_bridge/supported_architectures/) is high-leverage — recent commits show this is an active push and the pattern is well-established (copy a sibling adapter test). +- [`good first issue`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) · [`help wanted`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3A%22help+wanted%22) +- [`verification-request`](https://github.com/TransformerLensOrg/TransformerLens/issues?q=is%3Aissue+is%3Aopen+label%3Averification-request) — models awaiting verification; pick one that fits your machine, run `/verify-model `. +- Backfilling per-adapter unit tests in [`tests/unit/model_bridge/supported_architectures/`](tests/unit/model_bridge/supported_architectures/) is high-leverage; copy a sibling adapter test. diff --git a/docs/source/content/compatibility_mode.md b/docs/source/content/compatibility_mode.md index 8a9c9f87f..de6eb49ab 100644 --- a/docs/source/content/compatibility_mode.md +++ b/docs/source/content/compatibility_mode.md @@ -83,9 +83,9 @@ New integration tests should use the variant that matches the property they're t `enable_compatibility_mode()` mutates the bridge's weights in-place. It is: -- **One-shot**: calling it twice on the same bridge is safe but pointless; the second call re-folds already-folded weights into the new (already-folded) LayerNorm modules — which is a no-op for `fold_ln` semantically, but it does re-run the centering subtractions. Don't. -- **Not reversible** from within the bridge: if you need raw weights again, re-boot the bridge. -- **Idempotent in `_setup_hook_compatibility`**: that method can be called multiple times safely; only `process_weights` mutates weights. +- **One-shot:** calling it twice re-runs the centering subtractions. Don't. +- **Not reversible** from within the bridge — re-boot for raw weights. +- **`_setup_hook_compatibility` is idempotent**; only `process_weights` mutates weights. ## See also diff --git a/docs/source/content/debugging_numerical_divergence.md b/docs/source/content/debugging_numerical_divergence.md index 8fa43fe3e..a0064f098 100644 --- a/docs/source/content/debugging_numerical_divergence.md +++ b/docs/source/content/debugging_numerical_divergence.md @@ -6,16 +6,16 @@ When a Bridge adapter's integration test fails by `~1e-3` (or any larger delta) --- -## 0. Setup checklist before debugging +## 0. Setup checklist -- **fp32, eager attention** on both sides. `bridge.boot_transformers(..., dtype=torch.float32, attn_implementation="eager")` and the HF `AutoModelForCausalLM.from_pretrained(..., torch_dtype=torch.float32, attn_implementation="eager")`. Mismatched dtype or `sdpa` / `flash_attention_2` masks bugs. -- **`enable_compatibility_mode(no_processing=True)`** for the first pass. This isolates forward-pass bugs from weight-processing bugs. See [compatibility_mode.md](compatibility_mode.md) for the four-quadrant matrix. -- **Single-token input**, then short input (5–10 tokens), then longer. Most adapter bugs surface on the single-token case; widen only after that's clean. -- **Same seed / disable dropout**. Adapters generally don't use dropout, but a stray `nn.Dropout(p=0.1)` in a generalized component will silently de-correlate runs. +- **fp32 + eager attention on both sides.** `dtype=torch.float32, attn_implementation="eager"`. `sdpa` / `flash_attention_2` mask bugs. +- **`enable_compatibility_mode(no_processing=True)`** for the first pass — isolates forward-pass bugs from weight-processing bugs. See [compatibility_mode.md](compatibility_mode.md). +- **Single-token first**, then 5–10, then longer. Most adapter bugs surface single-token. +- **Same seed / no dropout.** A stray `nn.Dropout(p=0.1)` in a generalized component silently de-correlates runs. ## 1. Bisect by component -Compare intermediate activations end-to-end. The Bridge exposes them via hooks; HF exposes them via `output_hidden_states=True` and `output_attentions=True`. Walk forward through the stack and find the **first** layer where Bridge ≠ HF: +Walk Bridge hooks vs HF `output_hidden_states=True` / `output_attentions=True` and find the first layer where they diverge: | Stage | Bridge hook | HF output | |---|---|---| diff --git a/docs/source/content/hook_system.md b/docs/source/content/hook_system.md index 04c7e8b6a..dbf17cf43 100644 --- a/docs/source/content/hook_system.md +++ b/docs/source/content/hook_system.md @@ -95,17 +95,17 @@ model.remove_all_hook_fns(including_permanent=True) # also removes permanent ## Hook naming -Hook names are stable strings you address from outside the model. They differ between HookedTransformer and TransformerBridge: +Stable strings; differ between HookedTransformer and TransformerBridge: | System | Style | Example | |---|---|---| | `HookedTransformer` (legacy) | Uniform across architectures | `blocks.5.attn.hook_q`, `blocks.5.hook_resid_post`, `hook_embed` | | `TransformerBridge` (default) | Architecture-native | `blocks.5.attn.q.hook_out`, `blocks.5.hook_out`, `embed.hook_out` | -| `TransformerBridge` + compatibility mode | Bridge-native **and** HT-style aliases both fire | Same as above, plus `blocks.5.attn.hook_q` etc. | +| `TransformerBridge` + compatibility mode | Bridge-native AND HT-style aliases | Above + `blocks.5.attn.hook_q` etc. | -For the full canonical-name catalogue, the demo notebooks ([Main Demo](generated/demos/Main_Demo), [Exploratory Analysis Demo](generated/demos/Exploratory_Analysis_Demo)) walk through them with examples. For an architecture diagram showing where each name fires, see [TransformerLens_Diagram.svg](../_static/TransformerLens_Diagram.svg). +Full catalogue: [Main Demo](generated/demos/Main_Demo), [Exploratory Analysis Demo](generated/demos/Exploratory_Analysis_Demo). Architecture diagram: [TransformerLens_Diagram.svg](../_static/TransformerLens_Diagram.svg). -When porting HookedTransformer code to TransformerBridge, calling `bridge.enable_compatibility_mode()` (see [Compatibility Mode](compatibility_mode.md)) registers the HT-style aliases so your existing hook names continue to resolve. +Porting HT code to Bridge: `bridge.enable_compatibility_mode()` (see [Compatibility Mode](compatibility_mode.md)) registers HT aliases so existing names resolve. --- diff --git a/tests/AGENTS.md b/tests/AGENTS.md index 17779c68a..d233280ec 100644 --- a/tests/AGENTS.md +++ b/tests/AGENTS.md @@ -8,12 +8,12 @@ Read [the root AGENTS.md](../AGENTS.md) for project-wide rules. This file covers ## TL;DR -- **Tier placement matters.** Use [the tier table](#test-tiers) below — a unit test loading a model belongs in `integration/`, not `unit/`. -- **No mocking of model loads or the HF Hub.** Tests load real models. The session-scoped fixtures amortize the cost. -- **Use cached models for fast tests** (`gpt2`, `attn-only-{1,2,3,4}l`, `tiny-stories-1M`). Anything else gets `@pytest.mark.slow`. -- **MPS is a carve-out, not a green light.** `TRANSFORMERLENS_ALLOW_MPS=1` is required; only [`tests/mps/`](mps/) runs there. -- **Hard rules in [AGENTS.md §10](../AGENTS.md#10-hard-rules) apply**: no `xfail`/`skipif` to dodge CI, no dismissing failing tests as "pre-existing," no platform skips outside MPS. -- **Before debugging a failing test, check [QUARANTINES.md](QUARANTINES.md)** — if the failure matches a documented quarantine, it's known. If not, treat it as a real bug. +- **Tier placement matters** — a unit test loading a model belongs in `integration/`. See [tier table](#test-tiers). +- **No mocking model loads or HF Hub.** Session-scoped fixtures amortize the cost. +- **Cached models for fast tests** (`gpt2`, `attn-only-{1,2,3,4}l`, `tiny-stories-1M`). Anything else → `@pytest.mark.slow`. +- **MPS is a carve-out** — `TRANSFORMERLENS_ALLOW_MPS=1` required; only [`tests/mps/`](mps/) runs there. +- **[AGENTS.md §10](../AGENTS.md#10-hard-rules) applies**: no `xfail`/`skipif` to dodge CI; no platform skips outside MPS. +- **Check [QUARANTINES.md](QUARANTINES.md) before debugging a failing test** — known quarantines have documented reasons. --- @@ -61,42 +61,40 @@ Two cross-cutting rules: ## Cached-model allowlist -The CI cache in [`.github/workflows/checks.yml`](../.github/workflows/checks.yml) covers: `gpt2`, `gpt2-xl`, `distilgpt2`, `pythia-70m`, `gpt-neo-125M`, `gemma-2-2b-it`, `bloom-560m`, `Qwen2-0.5B`, `bert-base-cased`, `NeelNanda/Attn_Only*`, `roneneldan/TinyStories-1M*`, `NeelNanda/SoLU*`, `redwood_attn_2l`, `tiny-random-llama-2`, `DialoGPT-medium`. +CI cache ([`checks.yml`](../.github/workflows/checks.yml)) covers: `gpt2`, `gpt2-xl`, `distilgpt2`, `pythia-70m`, `gpt-neo-125M`, `gemma-2-2b-it`, `bloom-560m`, `Qwen2-0.5B`, `bert-base-cased`, `NeelNanda/Attn_Only*`, `roneneldan/TinyStories-1M*`, `NeelNanda/SoLU*`, `redwood_attn_2l`, `tiny-random-llama-2`, `DialoGPT-medium`. -Of those, prefer **`attn-only-{1,2,3,4}l`** and **`tiny-stories-1M`** for fast tests — `gpt2` is "quite slow" on CI's CPU runners ([contributing.md](../docs/source/content/contributing.md)). Use `gpt2` when you specifically need GPT-2 numerics. - -Anything outside this set → `@pytest.mark.slow`. +Prefer `attn-only-{1,2,3,4}l` and `tiny-stories-1M` for fast tests — `gpt2` is slow on CI's CPU runners. Use `gpt2` only when you need GPT-2 numerics. Anything outside the cached set → `@pytest.mark.slow`. --- ## The `slow` marker -Defined in [`pyproject.toml`](../pyproject.toml) as `slow: marks tests as slow (deselect with '-m "not slow"')`. Add it when the test: +`pyproject.toml`: `"slow: marks tests as slow (deselect with '-m \"not slow\"')"`. Add when the test: - loads a non-cached model -- iterates exhaustively over many parameter combinations -- takes more than ~5 seconds per invocation +- iterates exhaustively over many param combos +- takes >5 s per invocation -Deselect with `pytest -m "not slow"`. Note: the default `make` targets do NOT filter out `slow`; the marker is for ad-hoc local runs. +Deselect with `pytest -m "not slow"`. Default `make` targets do NOT filter; the marker is for ad-hoc runs. --- ## MPS rules -- The macOS CI job [`mps-checks`](../.github/workflows/checks.yml) sets `TRANSFORMERLENS_ALLOW_MPS=1` and runs `tests/unit`, `tests/integration`, and `tests/mps` on `macos-latest`. -- `get_device()` returns `"cpu"` unless `TRANSFORMERLENS_ALLOW_MPS=1` is set — the default protects against silent MPS divergence. -- The long `--ignore=` list in the workflow is for tests known to diverge on MPS (MoE, optimizer compatibility, KV-cache layout). It is **not** a license to add new skips — it documents existing limitations. -- [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) is the template: float32 only (MPS lacks bfloat16), TinyStories-1M only (50 MB fits the runner), `torch.mps.empty_cache()` + `gc.collect()` between tests. -- Top of the module: `pytestmark = pytest.mark.skipif(not torch.backends.mps.is_available(), reason="MPS not available")` — required on any MPS-only test. +- [`mps-checks`](../.github/workflows/checks.yml) sets `TRANSFORMERLENS_ALLOW_MPS=1` and runs `tests/unit`, `tests/integration`, `tests/mps` on `macos-latest`. +- `get_device()` returns `"cpu"` unless `TRANSFORMERLENS_ALLOW_MPS=1` — protects against silent MPS divergence. +- The workflow's long `--ignore=` list documents existing MPS divergence (MoE, optimizer compat, KV-cache layout); it's **not** a license to add new skips. +- [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) is the template: float32 only (no bfloat16 on MPS), TinyStories-1M only (50 MB fits the runner), `torch.mps.empty_cache()` + `gc.collect()` between tests. +- MPS-only modules need: `pytestmark = pytest.mark.skipif(not torch.backends.mps.is_available(), reason="MPS not available")`. --- ## Hard "don'ts" -Repo-wide rules in [AGENTS.md §10](../AGENTS.md#10-hard-rules) apply. Specifics for this directory: +Plus [AGENTS.md §10](../AGENTS.md#10-hard-rules): -- **No mocking model loads.** Use the session-scoped fixtures; they're cheap enough. -- **No mocking the HF Hub.** Tests hit the real hub with `enable_hf_retry()` handling 429s. -- **No platform `skipif` other than MPS.** No `skipif(sys.platform == 'win32')` or `skipif(not torch.cuda.is_available())` to dodge CI. -- **No `xfail` to dodge a failing test.** Fix the underlying bug — even if it predates your PR. -- **No copying acceptance-tier tests as templates for unit tests.** Their model fixtures will time out or OOM in the unit tier. +- **No mocking model loads** — session-scoped fixtures are cheap enough. +- **No mocking the HF Hub** — tests hit the real hub with `enable_hf_retry()` handling 429s. +- **No platform `skipif` outside MPS** — no `skipif(sys.platform == 'win32')` or `skipif(not torch.cuda.is_available())` to dodge CI. +- **No `xfail` to dodge a failing test** — fix the bug, even if pre-existing. +- **No copying acceptance-tier tests as unit-test templates** — their model fixtures time out / OOM at the unit tier. diff --git a/tests/QUARANTINES.md b/tests/QUARANTINES.md index 1d8212bd4..5cf3e9fb2 100644 --- a/tests/QUARANTINES.md +++ b/tests/QUARANTINES.md @@ -1,124 +1,118 @@ # Test Quarantines -Inventory of every `@pytest.mark.skip`, `skipif`, and `xfail` in [`tests/`](.). Use this to triage failures: a test on this list with the matching reason is **not your bug** and shouldn't be debugged blindly. A test that fails but **isn't** on this list is something to investigate. +Inventory of every `skip` / `skipif` / `xfail` in [`tests/`](.). A test on this list with the matching reason is **not your bug** — don't debug blindly. A failure NOT on this list is real. -The repo-wide rule (see [AGENTS.md §10](../AGENTS.md#10-hard-rules)): **never add `xfail` / `skipif` to dodge a failing CI**. Every quarantine below has a category and a "When to un-skip" line. If you add a new skip, add it here too. +Rule ([AGENTS.md §10](../AGENTS.md#10-hard-rules)): **never add `xfail` / `skipif` to dodge a failing CI.** New skips need a row here. --- ## Permanent — optional dependency -These tests require an optional install group and are skipped on the default install. Will never enable in CI without infrastructure changes. - | Path | Marker | Trigger | |---|---|---| -| [`tests/unit/test_lit.py`](unit/test_lit.py) (×18) | `skipif(not LIT_AVAILABLE)` | `pip install lit-nlp` (the `lit` optional-dep group) | -| [`tests/unit/components/test_attention.py`](unit/components/test_attention.py) (line 48) | `skipif(not is_bitsandbytes_available())` | `uv sync --group quantization` | -| [`tests/unit/test_weight_processing.py`](unit/test_weight_processing.py) (line 477) | `skipif(not is_bitsandbytes_available())` | Same as above | -| [`tests/unit/factories/test_mlp_factory.py`](unit/factories/test_mlp_factory.py) (line 40) | `skipif(not is_bitsandbytes_available())` | Same as above | +| [`unit/test_lit.py`](unit/test_lit.py) (×18) | `skipif(not LIT_AVAILABLE)` | `pip install lit-nlp` (`lit` group) | +| [`unit/components/test_attention.py`:48](unit/components/test_attention.py) | `skipif(not is_bitsandbytes_available())` | `uv sync --group quantization` | +| [`unit/test_weight_processing.py`:477](unit/test_weight_processing.py) | same | same | +| [`unit/factories/test_mlp_factory.py`:40](unit/factories/test_mlp_factory.py) | same | same | -**Un-skip:** never (these are correct behaviour). To run locally, install the optional group. +**Un-skip:** never. Install the optional group to run locally. --- ## Permanent — hardware requirement -| Path | Marker | Required hardware | +| Path | Marker | Required | |---|---|---| -| [`tests/unit/test_next_sentence_prediction.py`](unit/test_next_sentence_prediction.py) (line 131) | `skipif(not torch.cuda.is_available())` | Any CUDA device | -| [`tests/unit/model_bridge/compatibility/test_next_sentence_prediction.py`](unit/model_bridge/compatibility/test_next_sentence_prediction.py) (line 95) | `skipif(not torch.cuda.is_available())` | Any CUDA device | -| [`tests/unit/components/test_attention.py`](unit/components/test_attention.py) (line 83) | `skipif(not torch.cuda.is_available())` (`reason="CUDA required for half/bfloat16 tests"`) | Any CUDA device | -| [`tests/acceptance/test_hooked_encoder.py`](acceptance/test_hooked_encoder.py) (line 227) | `skipif(not torch.cuda.is_available())` | Any CUDA device | -| [`tests/acceptance/test_hooked_encoder_decoder.py`](acceptance/test_hooked_encoder_decoder.py) (line 421) | `skipif(not torch.cuda.is_available())` | Any CUDA device | -| [`tests/acceptance/test_multi_gpu.py`](acceptance/test_multi_gpu.py) (line 91, 105) | `skipif(torch.cuda.device_count() < 2)` | 2+ CUDA devices | -| [`tests/acceptance/test_multi_gpu.py`](acceptance/test_multi_gpu.py) (line 22) | `skipif(torch.cuda.device_count() < 4)` | 4+ CUDA devices | -| [`tests/acceptance/model_bridge/test_multi_gpu_bridge.py`](acceptance/model_bridge/test_multi_gpu_bridge.py) (line 257) | `skipif(torch.cuda.device_count() < 2)` | 2+ CUDA devices | -| [`tests/mps/test_mps_basic.py`](mps/test_mps_basic.py) (module-level `pytestmark`) | `skipif(not torch.backends.mps.is_available())` | Apple Silicon | - -**Un-skip:** never. CI provides each hardware tier via its own dedicated job (CUDA via compatibility-checks → in practice CPU-only; MPS via `mps-checks`; multi-GPU is local-only). See [tests/AGENTS.md §MPS rules](AGENTS.md#mps-rules) and the carve-out list in [`.github/workflows/checks.yml`](../.github/workflows/checks.yml). +| [`unit/test_next_sentence_prediction.py`:131](unit/test_next_sentence_prediction.py) | `skipif(not cuda)` | Any CUDA | +| [`unit/model_bridge/compatibility/test_next_sentence_prediction.py`:95](unit/model_bridge/compatibility/test_next_sentence_prediction.py) | `skipif(not cuda)` | Any CUDA | +| [`unit/components/test_attention.py`:83](unit/components/test_attention.py) | `skipif(not cuda)` (half/bfloat16) | Any CUDA | +| [`acceptance/test_hooked_encoder.py`:227](acceptance/test_hooked_encoder.py) | `skipif(not cuda)` | Any CUDA | +| [`acceptance/test_hooked_encoder_decoder.py`:421](acceptance/test_hooked_encoder_decoder.py) | `skipif(not cuda)` | Any CUDA | +| [`acceptance/test_multi_gpu.py`:91,105](acceptance/test_multi_gpu.py) | `skipif(device_count < 2)` | 2+ CUDA | +| [`acceptance/test_multi_gpu.py`:22](acceptance/test_multi_gpu.py) | `skipif(device_count < 4)` | 4+ CUDA | +| [`acceptance/model_bridge/test_multi_gpu_bridge.py`:257](acceptance/model_bridge/test_multi_gpu_bridge.py) | `skipif(device_count < 2)` | 2+ CUDA | +| [`mps/test_mps_basic.py`](mps/test_mps_basic.py) module-level | `skipif(not mps)` | Apple Silicon | + +**Un-skip:** never. CI provides each tier (CUDA via compatibility-checks → CPU-only in practice; MPS via `mps-checks`; multi-GPU local-only). See [tests/AGENTS.md §MPS rules](AGENTS.md#mps-rules) and the `--ignore=` list in [`checks.yml`](../.github/workflows/checks.yml). --- ## Intentional — CI cost / network budget -These tests gate on `os.getenv("CI")` to avoid expensive HF Hub fetches or model loads that don't fit the runner. +`skipif(os.getenv("CI"))` to avoid expensive HF fetches / large loads. -| Path | Marker | +| Path | Reason | |---|---| -| [`tests/unit/model_bridge/supported_architectures/test_gemma2_adapter.py`](unit/model_bridge/supported_architectures/test_gemma2_adapter.py) (line 49) | `skipif(CI, reason="Network/disk fetch of tiny Gemma2 — skip in CI")` | -| [`tests/integration/model_bridge/test_bridge_integration.py`](integration/model_bridge/test_bridge_integration.py) (line 801) | `skipif(CI, reason="Skip Gemma2 test in CI to avoid timeout")` | -| [`tests/acceptance/model_bridge/compatibility/test_hook_completeness.py`](acceptance/model_bridge/compatibility/test_hook_completeness.py) (line 156) | `skipif(CI, reason="Gemma2 is too large for CI")` | +| [`unit/model_bridge/supported_architectures/test_gemma2_adapter.py`:49](unit/model_bridge/supported_architectures/test_gemma2_adapter.py) | "Network/disk fetch of tiny Gemma2" | +| [`integration/model_bridge/test_bridge_integration.py`:801](integration/model_bridge/test_bridge_integration.py) | "Skip Gemma2 in CI to avoid timeout" | +| [`acceptance/model_bridge/compatibility/test_hook_completeness.py`:156](acceptance/model_bridge/compatibility/test_hook_completeness.py) | "Gemma2 too large for CI" | -**Un-skip:** when local. These should run cleanly on a dev machine with `HF_TOKEN` sourced. +**Un-skip:** locally with `HF_TOKEN` sourced. --- ## Intentional — manual verification only -Skipped in pytest because they need real (large) weights, run as part of `verify_models`. - -| Path | Marker | +| Path | Reason | |---|---| -| [`tests/integration/model_bridge/test_qwen3_moe_bridge.py`](integration/model_bridge/test_qwen3_moe_bridge.py) (lines 155, 166) | `skip(reason="Requires real weights — run manually during verification")` | +| [`integration/model_bridge/test_qwen3_moe_bridge.py`:155,166](integration/model_bridge/test_qwen3_moe_bridge.py) | "Requires real weights — run via `verify_models`" | -**Un-skip:** run via `/verify-model Qwen/Qwen3-MoE-...` and the canonical [`verify_models`](../transformer_lens/tools/model_registry/AGENTS.md) workflow. +**Un-skip:** `/verify-model Qwen/Qwen3-MoE-...` ([tools/model_registry/AGENTS.md](../transformer_lens/tools/model_registry/AGENTS.md)). --- -## Upstream / platform bug — wait for fix +## Upstream / platform bug -| Path | Marker | Underlying issue | +| Path | Reason | Issue | |---|---|---| -| [`tests/unit/model_bridge/test_bridge_generate_no_tokenizer.py`](unit/model_bridge/test_bridge_generate_no_tokenizer.py) (lines 30, 128) | `skipif(_MACOS_ARM64, reason="Upstream macOS-arm64 KV-cache NaN; see linked issue.")` | Upstream PyTorch / HF KV-cache NaN on M-series Macs | +| [`unit/model_bridge/test_bridge_generate_no_tokenizer.py`:30,128](unit/model_bridge/test_bridge_generate_no_tokenizer.py) | `skipif(_MACOS_ARM64)` — KV-cache NaN | Upstream PyTorch/HF on M-series Macs | -**Un-skip:** when the upstream issue resolves. Don't bypass — the test will produce NaN logits. +**Un-skip:** when upstream resolves. Don't bypass — produces NaN logits. --- -## ⚠️ Technical debt — actively quarantined whole-file +## ⚠️ Technical debt — whole-file -These are **entire test modules** skipped via module-level `pytestmark`. They represent significant coverage gaps. Treat them as priority work to re-enable, not as permanent quarantine. +Entire test modules quarantined via module-level `pytestmark`. Significant coverage gap — priority to re-enable. | Path | Reason | |---|---| -| [`tests/acceptance/test_hooked_transformer.py`](acceptance/test_hooked_transformer.py) (line 19) | `"Temporarily skipped due to CI test pollution issues"` | -| [`tests/acceptance/test_hooked_encoder.py`](acceptance/test_hooked_encoder.py) (line 13) | `"Temporarily skipped due to CI test pollution issues"` | -| [`tests/acceptance/test_hooked_encoder_decoder.py`](acceptance/test_hooked_encoder_decoder.py) (line 10) | `"Temporarily skipped due to CI test pollution issues"` | +| [`acceptance/test_hooked_transformer.py`:19](acceptance/test_hooked_transformer.py) | "CI test pollution" | +| [`acceptance/test_hooked_encoder.py`:13](acceptance/test_hooked_encoder.py) | same | +| [`acceptance/test_hooked_encoder_decoder.py`:10](acceptance/test_hooked_encoder_decoder.py) | same | -**Un-skip:** root-cause the "test pollution issues" — likely a fixture-scope or import-ordering bug that leaks state across tests. Until then, these major acceptance tiers are dark. +**Un-skip:** root-cause the test pollution (fixture-scope or import-ordering bug). Until then, these acceptance tiers are dark. --- ## Technical debt — individual -Tests quarantined individually pending fixes. Each should have a tracking issue or a clear "fix me" note in the surrounding code. - -| Path | Marker | What it covers | +| Path | Marker | Covers | |---|---|---| -| [`tests/unit/factored_matrix/test_constructor.py`](unit/factored_matrix/test_constructor.py) (line 54) | `skip(...)` | FactoredMatrix constructor edge case | -| [`tests/unit/model_bridge/test_architecture_adapter.py`](unit/model_bridge/test_architecture_adapter.py) (line 436) | `skip(...)` | Adapter behaviour | -| [`tests/unit/model_bridge/test_bridge_vs_hooked_transformer_patching.py`](unit/model_bridge/test_bridge_vs_hooked_transformer_patching.py) (lines 138, 142) | `skipif(...) / xfail(...)` | Bridge↔HT patching parity | -| [`tests/unit/model_bridge/test_hook_alias_resolution.py`](unit/model_bridge/test_hook_alias_resolution.py) (line 89) | `xfail(strict=True, reason=...)` per-architecture | Hook-alias gaps | -| [`tests/unit/model_bridge/supported_architectures/test_qwen3_5_adapter.py`](unit/model_bridge/supported_architectures/test_qwen3_5_adapter.py) (lines 574, 609, 660, 680, 771) | `skipif(...)` × 5 | Specific Qwen3.5 adapter behaviours | -| [`tests/unit/model_bridge/supported_architectures/test_qwen3_next_adapter.py`](unit/model_bridge/supported_architectures/test_qwen3_next_adapter.py) (line 531) | `skipif(...)` | Qwen3-Next adapter behaviour | -| [`tests/integration/test_weight_processing_integration.py`](integration/test_weight_processing_integration.py) (line 238) | `skip(...)` | Weight-processing edge case | -| [`tests/integration/test_tensor_extraction_consistency.py`](integration/test_tensor_extraction_consistency.py) (line 33) | `skip(...)` | Tensor extraction consistency | -| [`tests/integration/test_tokenization_methods.py`](integration/test_tokenization_methods.py) (line 53) | `skipif(...)` | Tokenization method coverage | -| [`tests/integration/test_hooked_encoder_properties.py`](integration/test_hooked_encoder_properties.py) (line 71) | `xfail(...)` | HookedEncoder properties | -| [`tests/acceptance/model_bridge/compatibility/test_backward_hooks.py`](acceptance/model_bridge/compatibility/test_backward_hooks.py) (line 11) | `skip(...)` | Backward-hook compatibility | -| [`tests/acceptance/test_hooked_transformer.py`](acceptance/test_hooked_transformer.py) (lines 551, 560) | `skipif(...)` × 2 (inside the module-level skip) | Additional `from_pretrained_no_processing` cases | - -**Un-skip:** debug the underlying issue and remove the marker. Each removed marker should land in a focused PR with a regression test. +| [`unit/factored_matrix/test_constructor.py`:54](unit/factored_matrix/test_constructor.py) | `skip` | FactoredMatrix constructor edge case | +| [`unit/model_bridge/test_architecture_adapter.py`:436](unit/model_bridge/test_architecture_adapter.py) | `skip` | Adapter behaviour | +| [`unit/model_bridge/test_bridge_vs_hooked_transformer_patching.py`:138,142](unit/model_bridge/test_bridge_vs_hooked_transformer_patching.py) | `skipif`/`xfail` | Bridge↔HT patching parity | +| [`unit/model_bridge/test_hook_alias_resolution.py`:89](unit/model_bridge/test_hook_alias_resolution.py) | `xfail(strict=True)` per-arch | Hook-alias gaps | +| [`unit/model_bridge/supported_architectures/test_qwen3_5_adapter.py`:574,609,660,680,771](unit/model_bridge/supported_architectures/test_qwen3_5_adapter.py) | `skipif` ×5 | Qwen3.5 quirks | +| [`unit/model_bridge/supported_architectures/test_qwen3_next_adapter.py`:531](unit/model_bridge/supported_architectures/test_qwen3_next_adapter.py) | `skipif` | Qwen3-Next quirks | +| [`integration/test_weight_processing_integration.py`:238](integration/test_weight_processing_integration.py) | `skip` | Weight-processing edge case | +| [`integration/test_tensor_extraction_consistency.py`:33](integration/test_tensor_extraction_consistency.py) | `skip` | Tensor extraction | +| [`integration/test_tokenization_methods.py`:53](integration/test_tokenization_methods.py) | `skipif` | Tokenization coverage | +| [`integration/test_hooked_encoder_properties.py`:71](integration/test_hooked_encoder_properties.py) | `xfail` | HookedEncoder properties | +| [`acceptance/model_bridge/compatibility/test_backward_hooks.py`:11](acceptance/model_bridge/compatibility/test_backward_hooks.py) | `skip` | Backward-hook compatibility | +| [`acceptance/test_hooked_transformer.py`:551,560](acceptance/test_hooked_transformer.py) | `skipif` ×2 (inside module-level skip) | `from_pretrained_no_processing` | + +**Un-skip:** debug the underlying issue and remove the marker. Each removal lands in a focused PR with a regression test. --- ## Adding a new quarantine -If you're tempted to add a `skip` / `skipif` / `xfail`, **read [AGENTS.md §10](../AGENTS.md#10-hard-rules) first**. The default answer is "fix the bug instead." +Read [AGENTS.md §10](../AGENTS.md#10-hard-rules) first — default answer is "fix the bug instead." -If a quarantine really is the right call: +If a quarantine is genuinely right: -1. Pick the right marker — `skipif(condition)` for environment gates, `skip(reason=)` for known-bad code paths, `xfail(strict=True, reason=)` when you expect failure and want CI to alert you if it starts passing. -2. Use a `reason=` string descriptive enough to look up later — not `"flaky"`, not `"broken"`. -3. **Add a row to the right section above**, with the path, marker, and an "un-skip when…" line. -4. If you're skipping a whole module via `pytestmark = pytest.mark.skip(...)`, flag it in the "actively quarantined whole-file" section so it gets visibility. +1. Pick the right marker — `skipif(condition)` for env gates; `skip(reason=)` for known-bad paths; `xfail(strict=True, reason=)` when you expect failure and want CI to alert if it passes. +2. Use a `reason=` descriptive enough to look up — not `"flaky"` or `"broken"`. +3. Add a row above with path, marker, "un-skip when" line. +4. Whole-module `pytestmark` skips go in the ⚠️ section for visibility. diff --git a/transformer_lens/model_bridge/supported_architectures/AGENTS.md b/transformer_lens/model_bridge/supported_architectures/AGENTS.md index d1a20a5e9..5f3f46d7a 100644 --- a/transformer_lens/model_bridge/supported_architectures/AGENTS.md +++ b/transformer_lens/model_bridge/supported_architectures/AGENTS.md @@ -87,9 +87,9 @@ Optional overrides on the class: ## Config-attr propagation -HF raw config attributes are invisible to TL-side consumers unless explicitly propagated to `self.cfg`. After analysing the HF model, walk the `config.json` and list every non-standard attribute. For each, decide whether the base adapter machinery already handles it; if not, mirror it onto `self.cfg`. +HF raw config attributes are invisible to TL-side consumers unless propagated to `self.cfg`. Walk the `config.json` for non-standard attributes; mirror anything the base machinery doesn't already handle. -**Surface-on-cfg attributes** (set on `self.cfg.`; the rest of the bridge reads them from there): +**Surface-on-cfg attributes** (set on `self.cfg.`; bridge reads from there): | HF attribute | Surface as | Used by | |---|---|---| @@ -114,16 +114,14 @@ Rule of thumb: if the model card or HF source mentions a numerical knob, assume ## When to override `preprocess_weights` -The framework default for `preprocess_weights()` is a no-op pass-through. Override it when **a numerical operation that HF's forward applies natively must also be baked into the raw weights** — otherwise `bridge.enable_compatibility_mode()` (which calls `process_weights` on the raw weights, expecting them to already encode all the math) produces wrong results. +Default is a no-op pass-through. Override when a numerical op HF applies natively in forward must also be baked into raw weights — otherwise `bridge.enable_compatibility_mode()` (which calls `process_weights` expecting the math to already be in the weights) diverges. -The trigger rule: +> **Trigger:** if a config attr changes the forward-pass math AND isn't re-applied by compat-mode weight processing, fold it into the relevant weight in `preprocess_weights()`. -> If a config attr changes the math of the forward pass AND would not be re-applied during compatibility-mode weight processing, fold it into the relevant weight inside `preprocess_weights()`. +Examples: -Concrete examples in-tree: - -- **Cohere** — `cfg.logit_scale` (default `0.0625`) must be folded into `unembed.weight`. HF forward multiplies logits by `logit_scale`; compat-mode `process_weights` does not, so without the fold, compat-mode logits are off by `1/16`. -- **Gemma1 / Gemma2 / Gemma3** — embedding scale (`√d_model`) must be folded into `embed.weight`. HF's `GemmaTextScaledWordEmbedding` scales internally on forward; compat-mode reads embeddings raw. +- **Cohere** — `cfg.logit_scale` (default `0.0625`) folds into `unembed.weight`. HF forward multiplies logits by `logit_scale`; compat-mode doesn't. +- **Gemma1/2/3** — embedding scale (`√d_model`) folds into `embed.weight`. HF's `GemmaTextScaledWordEmbedding` scales on forward; compat-mode reads raw. Skeleton: @@ -145,47 +143,41 @@ def preprocess_weights(self, state_dict: dict[str, torch.Tensor]) -> dict[str, t return state_dict ``` -**When you do NOT need `preprocess_weights`:** if the numerical operation lives entirely inside an HF submodule that the bridge delegates to (e.g. RoPE inside `CohereRotaryEmbedding`), HF forward applies it on both paths and you inherit parity for free. +**Skip the override when:** the op lives inside an HF submodule the bridge delegates to (e.g. RoPE inside `CohereRotaryEmbedding`) — HF forward applies it on both paths, parity is free. -**Verification**: a missing `preprocess_weights` fold most often degrades Phase 4 (text-generation quality), and can show up as Phase 3 (weight processing) below 75% if severe enough to trip the strict gate. P4's 50% bar is intentionally lenient, so a sub-100% P4 on a small parity-test model is worth investigating even when the system reports `STATUS_VERIFIED`. See [tools/model_registry/AGENTS.md §Phase-score thresholds](../../tools/model_registry/AGENTS.md#phase-score-thresholds). +**Verification:** a missing fold degrades Phase 4 (text-generation), and Phase 3 if severe enough to trip the strict gate. P4's 50% bar is intentionally lenient, so sub-100% P4 on a small parity-test model is worth investigating even with `STATUS_VERIFIED`. See [tools/model_registry/AGENTS.md §Phase-score thresholds](../../tools/model_registry/AGENTS.md#phase-score-thresholds). --- ## Tokenizer policy -Tokenizer behaviour is **per-model**, not per-architecture. Sibling models in the same family routinely differ — the chat-instruct variant may prepend BOS where the base does not, the multilingual fork may pad on the left where the original pads on the right, the new-generation tokenizer may have a different EOS. **Do not copy these values from your starter adapter blindly; check the target's [tokenizer_config.json] and chat template first.** - -Tokenizer-policy fields the adapter / load path is responsible for: +Tokenizer behaviour is **per-model**, not per-architecture — siblings routinely differ (instruct vs base BOS, padding side, EOS, chat template). **Don't copy these from your starter adapter blindly.** -| Flag | Meaning | Default behaviour | What goes wrong if wrong | -|---|---|---|---| -| `default_prepend_bos` | Prepend BOS to every input by default | Framework default unless set | Off-by-one in logit positions; generation starting from wrong context | -| `default_padding_side` | `"left"` or `"right"` | HF tokenizer default | Generation produces garbage for batched inputs of unequal length | -| EOS handling in generation | Stop on `tokenizer.eos_token_id` (single or list) | Single token from `cfg.eos_token_id` | Generation runs past the natural stop or stops prematurely | -| Chat-template wiring | Whether the bridge auto-applies `tokenizer.apply_chat_template` | Off — user calls it explicitly | Instruct-tuned models produce base-model-style continuations | -| Tokenizer class mismatch (HF reports `Tokenizer`, model card uses `Tokenizer`) | n/a | Whatever HF resolves | Subtle BPE/SentencePiece divergence; rare but real for forks | +| Flag | Default | Failure mode | +|---|---|---| +| `default_prepend_bos` | Framework default | Off-by-one logits; generation starts from wrong context | +| `default_padding_side` (`"left"` / `"right"`) | HF tokenizer default | Batched-uneven-length generation produces garbage | +| EOS handling (single or list) | `cfg.eos_token_id` | Generation runs past or stops short | +| Chat-template auto-apply | Off (user calls explicitly) | Instruct models produce base-style continuations | +| Tokenizer class mismatch | HF resolves | Subtle BPE/SentencePiece divergence (rare, real for forks) | -**How to verify the right values:** +**How to verify:** -1. **Run the tokenizer.** `tokenizer_config.json` LIES for some architectures — Cohere is the canonical example: it declares `add_bos_token=False` but HF's `__call__` prepends BOS anyway via `add_special_tokens=True`. The only reliable check is to actually invoke it: +1. **Run the tokenizer.** `tokenizer_config.json` lies for some architectures — Cohere declares `add_bos_token=False` but `__call__` prepends BOS via `add_special_tokens=True`. Only reliable check is to invoke it: ```python from transformers import AutoTokenizer t = AutoTokenizer.from_pretrained("") - print("encode:", t.encode("hello")) # raw encode - print("__call__:", t("hello").input_ids) # what generation actually uses - print("bos_token_id:", t.bos_token_id) + print(t("hello").input_ids, t.bos_token_id) ``` - If `t("hello").input_ids[0] == t.bos_token_id`, set `cfg.default_prepend_bos = True`. If the first token is NOT BOS, leave the flag unset. + First-token == BOS → set `cfg.default_prepend_bos = True`. Otherwise leave unset. -2. **Cross-check against `tokenizer_config.json`** for `padding_side`, `bos_token`, `eos_token`, `chat_template`. These tend to be honest; the runtime override mostly affects BOS-prepending. +2. **Cross-check `tokenizer_config.json`** for `padding_side`, `bos_token`, `eos_token`, `chat_template` (these tend to be honest; runtime mostly overrides BOS-prepending). +3. **Compare to closest sibling** in registry; differences are usually deliberate. +4. **When unsure, leave the framework default** — explicit wrongness hides under "I configured it." -3. **Compare to the closest sibling already in the registry.** If they differ, treat that as a deliberate choice and propagate it. - -4. **When unsure, leave the framework default.** Explicit wrongness is worse than implicit defaulting — the wrong explicit value hides under "I configured it" assumptions. - -`default_prepend_bos` is the most common trap (Cohere's tokenizer-config-vs-runtime mismatch, instruct/base divergence within Llama / Mistral / Gemma families). The same recipe applies to verifying every flag above. +`default_prepend_bos` is the most common trap (Cohere config-vs-runtime mismatch, instruct/base divergence in Llama/Mistral/Gemma). --- @@ -227,7 +219,13 @@ After writing `myarch.py` with `MyArchitectureAdapter`, register in **all four** ``` Missing → generated docs table omits the new architecture. -**Verify the four-place wiring**: [`tests/unit/tools/test_model_registry.py`](../../../tests/unit/tools/test_model_registry.py) has a `TestRegistrySyncedWithFactory` class with four bidirectional invariants over `SUPPORTED_ARCHITECTURES`, `HF_SUPPORTED_ARCHITECTURES`, and `CANONICAL_AUTHORS_BY_ARCH`. Run `uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory` after registering — the failure message tells you which set is missing your new key. (There's an `INTENTIONAL_EXCLUDES` carve-out in the same class for internal-only architectures and HF-emits-different-casing aliases; almost no new adapters need to add themselves there.) +**Verify the four-place wiring**: [`TestRegistrySyncedWithFactory`](../../../tests/unit/tools/test_model_registry.py) asserts four bidirectional invariants across `SUPPORTED_ARCHITECTURES`, `HF_SUPPORTED_ARCHITECTURES`, `CANONICAL_AUTHORS_BY_ARCH`. Run: + +```bash +uv run pytest tests/unit/tools/test_model_registry.py -k TestRegistrySyncedWithFactory +``` + +Failure message names the missing set. (`INTENTIONAL_EXCLUDES` in the test handles internal-only architectures and HF-emits-different-casing aliases; new adapters rarely belong there.) --- @@ -251,7 +249,7 @@ After writing `myarch.py` with `MyArchitectureAdapter`, register in **all four** ## Adding the HF repo to the registry -After registration, add the model ID to [`../../tools/model_registry/data/supported_models.json`](../../tools/model_registry/data/supported_models.json) so `verify_models` can resolve it. The entry shape is: +After registration, add the model ID to [`data/supported_models.json`](../../tools/model_registry/data/supported_models.json) so `verify_models` can resolve it. Entry shape: ```json { @@ -261,39 +259,35 @@ After registration, add the model ID to [`../../tools/model_registry/data/suppor "verified_date": null, "metadata": null, "note": null, - "phase1_score": null, - "phase2_score": null, - "phase3_score": null, - "phase4_score": null, - "phase7_score": null, - "phase8_score": null + "phase1_score": null, "phase2_score": null, "phase3_score": null, + "phase4_score": null, "phase7_score": null, "phase8_score": null } ``` -**Hand-edits are allowed only for adding new model-ID entries.** The `status`, `verified_date`, `note`, and `phaseN_score` fields are written exclusively by `update_model_status()` — never set them manually. See [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md). +**Hand-edits add new entries only.** `status`, `verified_date`, `note`, `phaseN_score` are written by `update_model_status()`; never set manually. See [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md). -When adding the user-provided model, prompt the user whether to also add the canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]` — e.g. when adding `google/gemma-2-2b`, ask about `google/gemma-2-2b-it`, `google/gemma-2-9b`, `google/gemma-2-9b-it`, `google/gemma-2-27b`, `google/gemma-2-27b-it`. The HF scraper picks these up eventually, but explicit entries unblock `verify_models` against the siblings now. +Prompt the user about canonical sibling variants from `CANONICAL_AUTHORS_BY_ARCH[]` — e.g. adding `google/gemma-2-2b`, ask about `-2b-it`, `-9b`, `-9b-it`, `-27b`, `-27b-it`. The HF scraper picks these up eventually; explicit entries unblock `verify_models` immediately. -Do NOT add the model to [`transformer_lens/supported_models.py`](../../supported_models.py) — that file is HookedTransformer-only. +Do NOT add the model to [`supported_models.py`](../../supported_models.py) — HookedTransformer-only. --- ## Model source paths: `boot_transformers` vs `boot_native` -TransformerBridge has **two parallel load paths**, both routed through the same adapter system: +Two parallel load paths, same adapter system: | Source | Entry | Use case | |---|---|---| -| `boot_transformers` | `TransformerBridge.boot_transformers(model_name, ...)` | Default: wraps a HuggingFace model. Adapter maps HF module paths to canonical names. Every adapter in this directory supports this path. | -| `boot_native` | `TransformerBridge.boot_native(cfg, init_mode=...)` | TL-native transformer built from scratch, no HF dependency. Used by [`Realtime_Training_Telemetry_Demo.ipynb`](../../../demos/Realtime_Training_Telemetry_Demo.ipynb) and research that needs cfg-driven small models. | +| `boot_transformers` | `TransformerBridge.boot_transformers(model_name, ...)` | Default — wraps an HF model. Every adapter here supports it. | +| `boot_native` | `TransformerBridge.boot_native(cfg, init_mode=...)` | TL-native, no HF dependency. Used by [`Realtime_Training_Telemetry_Demo.ipynb`](../../../demos/Realtime_Training_Telemetry_Demo.ipynb) and cfg-driven small models. | -Native models are built by [`transformer_lens/model_bridge/sources/native/model.py`](../sources/native/model.py) and routed through [`native.py`](native.py) — the adapter whose `component_mapping` adapts to `cfg` (rotary drops `pos_embed`, RMS norm picks `RMSNormalizationBridge`, gated MLP picks `GatedMLPBridge`, `attn_only` drops MLP). Init policies live in [`sources/native/init.py`](../sources/native/init.py): `gpt2` (default, Normal with 1/√(2·n_layers) residual scaling), `xavier_uniform` / `xavier_normal`, `kaiming_uniform` / `kaiming_normal`. Determinism uses a scoped `torch.Generator` so seeded init does not perturb the caller's global RNG. +Native models live in [`sources/native/model.py`](../sources/native/model.py); routed through [`native.py`](native.py) whose `component_mapping` adapts to `cfg` (rotary drops `pos_embed`, RMS norm → `RMSNormalizationBridge`, gated → `GatedMLPBridge`, `attn_only` drops MLP). Init policies in [`sources/native/init.py`](../sources/native/init.py): `gpt2` (default, Normal + 1/√(2·n_layers) residual scaling), `xavier_uniform/normal`, `kaiming_uniform/normal`. Scoped `torch.Generator` keeps seeded init from perturbing global RNG. -Cfg-driven Native features: `normalization_type` (LN / RMS / RMSPre), `final_rms`, `gated_mlp`, `attn_only`, `n_key_value_heads` (GQA), `attn_scores_soft_cap`, `output_logits_soft_cap`, `positional_embedding_type` (standard / rotary), `rotary_dim` / `rotary_base` / `rope_scaling` (linear PI, dynamic/NTK, llama3 by-parts). +Cfg-driven Native features: `normalization_type` (LN / RMS / RMSPre), `final_rms`, `gated_mlp`, `attn_only`, `n_key_value_heads`, `attn_scores_soft_cap`, `output_logits_soft_cap`, `positional_embedding_type`, `rotary_dim`, `rotary_base`, `rope_scaling` (linear PI, dynamic/NTK, llama3 by-parts). -**Sister source paths** under [`sources/`](../sources/) — `transformers/`, `vllm/`, `inspect/` — are different backends; their existence is mostly an internal detail for new adapter authors. Almost all new work goes through `transformers/` (HF backend) and only writes a `supported_architectures/.py` file. +Sister backends under [`sources/`](../sources/) — `transformers/`, `vllm/`, `inspect/` — are internal. Almost all new work writes a `supported_architectures/.py` file targeting the HF backend. -When **does a new adapter need to support both paths?** Almost never — if you're adding support for an HF-released architecture, you're writing a `boot_transformers` adapter and you're done. `boot_native` is for research-grade from-scratch models and uses the single `native.py` adapter generically. Touch `native.py` only if you're adding a primitive (e.g. a new norm type) that the cfg-driven dispatch doesn't already cover. +**Need both paths?** Almost never. Touch `native.py` only when adding a primitive (e.g. new norm type) the cfg-driven dispatch doesn't cover. --- @@ -326,37 +320,31 @@ class TestMyArchHookCompatibility: def test_(self, adapter): ... ``` -Unit adapter tests don't load weights — they instantiate the adapter from a synthetic cfg and assert structural / hook-compatibility properties (component_mapping shape, whether `setup_hook_compatibility` is overridden, conversion shapes, etc.). They run in the default `make unit-test` tier with no HF Hub access. +No weight load, no HF Hub access — synthetic cfg + structural assertions only. Runs in default `make unit-test`. -Add a dedicated test for **each architecture quirk** your adapter handles — softcaps, RMSNorm offsets, sliding window, custom eps attr, MoE routing. The Gemma1 test is a great example of a one-quirk-one-test file ("the adapter must NOT override setup_hook_compatibility because the HF embedding scales internally"). +Add one test per architecture quirk (softcaps, RMSNorm offsets, sliding window, custom `eps_attr`, MoE routing). Gemma1's "must NOT override `setup_hook_compatibility`" is a good one-quirk-one-test example. -### 2. Integration parity test — `tests/integration/model_bridge/test__adapter.py` (or `test__bridge.py`) +### 2. Integration parity test — `tests/integration/model_bridge/test__adapter.py` -Loads a real cached HF model, asserts logit parity vs HuggingFace at fp32 + eager attention. Templates: [`test_deepseek_adapter.py`](../../../tests/integration/model_bridge/test_deepseek_adapter.py), [`test_falcon_adapter.py`](../../../tests/integration/model_bridge/test_falcon_adapter.py), [`test_mamba_adapter.py`](../../../tests/integration/model_bridge/test_mamba_adapter.py). - -If your architecture's reference model is large enough to OOM CI, gate it with `@pytest.mark.skipif(bool(os.getenv("CI")), reason="...")` and add a row to [`tests/QUARANTINES.md`](../../../tests/QUARANTINES.md) under "CI cost / network budget." +Loads a real cached HF model; asserts logit parity at fp32 + eager attention. Templates: [`test_deepseek_adapter.py`](../../../tests/integration/model_bridge/test_deepseek_adapter.py), [`test_falcon_adapter.py`](../../../tests/integration/model_bridge/test_falcon_adapter.py), [`test_mamba_adapter.py`](../../../tests/integration/model_bridge/test_mamba_adapter.py). If the reference model OOMs CI, gate with `@pytest.mark.skipif(bool(os.getenv("CI")), ...)` and add a row to [`QUARANTINES.md`](../../../tests/QUARANTINES.md) under "CI cost / network budget." ### End-to-end registry verification -After both tests pass: - ```bash set -a; source .env; set +a uv run python -m transformer_lens.tools.model_registry.verify_models --architectures MyArchForCausalLM --per-arch 1 ``` -This is what [`/verify-model`](../../../.claude/commands/verify-model.md) wraps. See [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md) for the canonical workflow and the [`verify_models` vs `main_benchmark` trap](../../tools/model_registry/AGENTS.md#tldr). - -If logit parity fails, see [docs/source/content/debugging_numerical_divergence.md](../../../docs/source/content/debugging_numerical_divergence.md) for the bisection workflow. +Wrapped by [`/verify-model`](../../../.claude/commands/verify-model.md). See [tools/model_registry/AGENTS.md](../../tools/model_registry/AGENTS.md) for the contract and the [`verify_models` vs `main_benchmark` trap](../../tools/model_registry/AGENTS.md#tldr). For parity failures: [debugging_numerical_divergence.md](../../../docs/source/content/debugging_numerical_divergence.md). --- ## Hard "don'ts" -- **No single-model files.** One file per architecture family. -- **No `# type: ignore` on `ComponentMapping` types.** Use `isinstance` / `typing.cast` if the type system disagrees. See [AGENTS.md §10](../../../AGENTS.md#10-hard-rules). -- **No skipping any of the four registration sites.** Each one breaks something different (import, factory, scraper, generated docs). The invariant test in [`test_model_registry.py`](../../../tests/unit/tools/test_model_registry.py) will catch most of these. -- **No HT-style hook aliases inside adapters.** Aliases live in [`../bridge.py`](../bridge.py) via `build_alias_to_canonical_map()`; adapters declare Bridge-native names only. -- **No inheriting tokenizer-policy flags from the starter adapter** (`default_prepend_bos`, padding side, EOS). Tokenizer policy is per-model; see [Tokenizer policy](#tokenizer-policy). -- **No manual edits to existing `supported_models.json` entries' status / phase fields.** Only `update_model_status()` writes those. -- **No skipping the unit adapter test.** Every architecture has one; an adapter PR without one is incomplete. +- **No single-model files** — one file per architecture family. +- **No `# type: ignore` on `ComponentMapping`** — `isinstance` / `cast` ([AGENTS.md §10](../../../AGENTS.md#10-hard-rules)). +- **No skipping any of the four registration sites** — each breaks something different. [Invariant test](../../../tests/unit/tools/test_model_registry.py) catches most. +- **No HT-style hook aliases inside adapters** — aliases live in [`bridge.py`](../bridge.py); adapters declare Bridge-native names only. +- **No inheriting tokenizer-policy flags from the starter** — see [Tokenizer policy](#tokenizer-policy). +- **No manual edits to existing `supported_models.json` entries' status/phase fields** — only `update_model_status()`. +- **No skipping the unit adapter test** — required for every architecture. diff --git a/transformer_lens/tools/model_registry/AGENTS.md b/transformer_lens/tools/model_registry/AGENTS.md index 270fe4628..e3ac11134 100644 --- a/transformer_lens/tools/model_registry/AGENTS.md +++ b/transformer_lens/tools/model_registry/AGENTS.md @@ -6,12 +6,12 @@ Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file ## TL;DR -> **Use `verify_models`, not `main_benchmark`.** Only `uv run python -m transformer_lens.tools.model_registry.verify_models …` updates `data/supported_models.json`. `uv run python -m transformer_lens.benchmarks.main_benchmark …` runs the same benchmark math but defaults to **not** writing the registry (needs `--update-registry`, and even with that flag misses Phase 7 / 8 scores and the resume checkpoint). **If you ran `main_benchmark`, the registry is stale — re-run via `verify_models`.** +> **Use `verify_models`, not `main_benchmark`.** Only `verify_models` writes `data/supported_models.json`. `main_benchmark` runs the same math but defaults to NOT writing the registry (needs `--update-registry`, and even with it misses Phase 7/8 scores and the resume checkpoint). If you ran `main_benchmark`, the registry is stale. -- **`update_model_status()` in `registry_io.py` is the only function that mutates status / phase / note fields on existing entries.** Never set those fields by hand. -- **Adding a brand-new model-ID entry is allowed (and required before `verify_models --model ` can find it).** See [Adding a new model entry](#adding-a-new-model-entry). -- **Never run `verify_models` in parallel.** A single CUDA/MPS device cannot hold concurrent loads (see [AGENTS.md §10](../../../AGENTS.md#10-hard-rules)). -- **HF token required.** `verify_models` calls `AutoConfig.from_pretrained(..., token=get_hf_token())` for parameter estimation; gated models fail without `HF_TOKEN`. Source `.env` first: `set -a; source .env; set +a`. +- **`update_model_status()` in `registry_io.py` is the only mutator of status/phase/note on existing entries.** Never set by hand. +- **Adding a new model-ID entry is allowed** (required before `verify_models --model ` can find it). See [Adding a new model entry](#adding-a-new-model-entry). +- **Never run in parallel** — single CUDA/MPS device OOMs ([AGENTS.md §10](../../../AGENTS.md#10-hard-rules)). +- **HF token required** for parameter estimation on gated models. Source `.env`: `set -a; source .env; set +a`. --- @@ -74,39 +74,32 @@ Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. This file ## Adding a new model entry -To verify a model that isn't already in `data/supported_models.json`, you must add the entry by hand first. This is the **only** allowed hand-edit: +To verify a model not yet in `data/supported_models.json`, hand-add the entry first. This is the **only** allowed hand-edit: ```json { "architecture_id": "MyArchForCausalLM", "model_id": "org/repo-name", "status": 0, - "verified_date": null, - "metadata": null, - "note": null, - "phase1_score": null, - "phase2_score": null, - "phase3_score": null, - "phase4_score": null, - "phase7_score": null, - "phase8_score": null + "verified_date": null, "metadata": null, "note": null, + "phase1_score": null, "phase2_score": null, "phase3_score": null, + "phase4_score": null, "phase7_score": null, "phase8_score": null } ``` -After adding the entry, `verify_models --model org/repo-name` will populate the status / score / note fields via `update_model_status()`. **Do not** set those fields manually in the JSON; they are owned by the verification pipeline. +`verify_models --model org/repo-name` then populates status/score/note via `update_model_status()`. Never set those fields manually. --- ## `data/verification_checkpoint.json` (gitignored) -Resume state for long-running `verify_models` runs. Lists tested / verified / failed / skipped model IDs plus start timestamp. Behaviour: +Resume state for long-running runs (tested/verified/failed/skipped IDs + timestamp): -- On Ctrl-C, the SIGINT handler in `verify_models.py` finishes the current model, persists the checkpoint, and exits cleanly. -- Re-running with `--resume` reads it and skips already-tested models. -- Deleted automatically on a successful full run. -- Missing or corrupt → fresh run from scratch (safe, just slow). +- Ctrl-C → SIGINT handler finishes current model, persists checkpoint, exits cleanly. +- `--resume` reads it, skips already-tested models. +- Deleted on successful full run; missing/corrupt → fresh run (safe). -Agents should never edit it manually; let `verify_models` manage it. +Never edit manually. --- @@ -136,25 +129,25 @@ Agents should never edit it manually; let `verify_models` manage it. | 7 | 75% | `multimodal_forward` | `STATUS_FAILED`. NULL score (processor unavailable) also fails. | | 8 | 75% | `audio_forward` | `STATUS_FAILED`. NULL score also fails. | -Phase 4 is intentionally lenient — the source comment ([`verify_models.py:554`](verify_models.py)) calls it *"a quality metric, not a correctness check."* The 50% bar asks "is the generated text coherent at all?" rather than "is this adapter clean?" +Phase 4 is intentionally lenient — source ([`verify_models.py:554`](verify_models.py)) calls it *"a quality metric, not a correctness check."* The 50% bar asks "is the text coherent at all?" not "is this adapter clean?" -**For adapter authors**: a `STATUS_VERIFIED` entry whose Phase 4 score is well below 100% on a small parity-test model (e.g. `tiny-ForCausalLM`) can still indicate a real adapter bug that the system intentionally doesn't gate on. Cohere's first cold-agent run produced `P4=74.8%, status=1, note="Full verification completed"` — the system was correct (74.8 > 50, P4 doesn't gate), but the score reflected a missing `preprocess_weights` fold for `logit_scale`. Treat sub-100% P4 on tiny test models as a yellow flag worth investigating, even when the system reports VERIFIED. +**For adapter authors:** a `STATUS_VERIFIED` entry with P4 well below 100% on a small parity-test model can still indicate a real bug the system doesn't gate on (e.g. missing `preprocess_weights` fold). Investigate manually even when VERIFIED. -**Reading the result**: +**Reading the result:** -- `status == 1` AND `note == "Full verification completed"` → all hard gates passed AND no quality flag. Good. -- `status == 1` AND `note` mentions `"low text quality"` → P4 < 50%; investigate. -- `status == 1` AND P4 well below 100% on a small model, no quality flag → still potentially a subtle weight-fold or tokenizer bug; investigate manually. -- `status == 3` (FAILED) → `note` carries the specific failure reason and which tests failed; debug from there. +- `status==1` + `note="Full verification completed"` → all gates passed, no quality flag. Good. +- `status==1` + `note` mentions `"low text quality"` → P4 < 50%; investigate. +- `status==1` + P4 < 100% on a small model, no quality flag → potential weight-fold/tokenizer bug; investigate. +- `status==3` (FAILED) → `note` carries the failure reason; debug from there. -For Phase-1 / Phase-3 failures, see [supported_architectures/AGENTS.md §When to override preprocess_weights](../../model_bridge/supported_architectures/AGENTS.md#when-to-override-preprocess_weights) and [debugging_numerical_divergence.md](../../../docs/source/content/debugging_numerical_divergence.md). For Phase-4 quality drift, see [supported_architectures/AGENTS.md §Tokenizer policy](../../model_bridge/supported_architectures/AGENTS.md#tokenizer-policy) and the same `preprocess_weights` reference (logit-scale / embedding-scale folds typically degrade P4 without crossing the 50% gate). +P1/P3 failures: [supported_architectures/AGENTS.md §When to override preprocess_weights](../../model_bridge/supported_architectures/AGENTS.md#when-to-override-preprocess_weights), [debugging_numerical_divergence.md](../../../docs/source/content/debugging_numerical_divergence.md). P4 drift: [§Tokenizer policy](../../model_bridge/supported_architectures/AGENTS.md#tokenizer-policy) (logit-scale / embedding-scale folds typically degrade P4 without crossing the 50% gate). --- ## Hard "don'ts" -- **No `main_benchmark` for registry updates.** It misses Phase 7 / 8, doesn't manage the checkpoint, and silently skips the registry write unless `--update-registry` is passed. -- **No parallel `verify_models` runs.** Device OOM. See [AGENTS.md §10](../../../AGENTS.md#10-hard-rules). -- **No manual edits to existing entries' `status` / `verified_date` / `note` / `phaseN_score` fields.** Only `update_model_status()` writes those. (Adding new model-ID entries is allowed — see [Adding a new model entry](#adding-a-new-model-entry).) -- **No deleting `data/verification_checkpoint.json` mid-run.** Let the SIGINT handler clean up. -- **No skipping `.env` sourcing.** Gated-model arch verification (Llama / Mistral / Gemma / gated Qwen) needs `HF_TOKEN`. +- **No `main_benchmark` for registry updates** — misses P7/P8, no checkpoint, no registry write without `--update-registry`. +- **No parallel `verify_models`** — device OOM ([AGENTS.md §10](../../../AGENTS.md#10-hard-rules)). +- **No manual edits to existing entries' `status`/`verified_date`/`note`/`phaseN_score`** — only `update_model_status()` writes those. (New entries OK — see [Adding a new model entry](#adding-a-new-model-entry).) +- **No deleting `data/verification_checkpoint.json` mid-run** — let SIGINT clean up. +- **No skipping `.env`** — gated-model verification needs `HF_TOKEN`. From 8ab579ca8a511b77b35ce35a66ef53333110bb42 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 19:52:02 -0500 Subject: [PATCH 09/11] additional adjustments --- .claude/commands/format.md | 10 +++---- .claude/commands/task-complete.md | 8 +++--- .../supported_architectures/AGENTS.md | 26 ++++++++++++++++++- 3 files changed, 35 insertions(+), 9 deletions(-) diff --git a/.claude/commands/format.md b/.claude/commands/format.md index 21be5a3a9..fec0d2dda 100644 --- a/.claude/commands/format.md +++ b/.claude/commands/format.md @@ -1,14 +1,14 @@ --- -description: Format the codebase (pycln + isort + black) and run mypy. +description: Type-check then format the working tree. --- -Format and type-check the working tree: +Run mypy first, then format. Mypy fixes (`isinstance`, `cast`, signatures) can introduce format drift — running format after means a single pass. ``` -make format uv run mypy . +make format ``` -`make format` runs `pycln --all` (unused imports), `isort` (import sorting), and `black` (line length 100). `uv run mypy .` runs the type checker with the config in [pyproject.toml](../../pyproject.toml). +`uv run mypy .` uses the config in [pyproject.toml](../../pyproject.toml). `make format` runs `pycln --all` (unused imports), `isort`, and `black` (line length 100). -If mypy reports new errors, fix them — do not add `# type: ignore`. Prefer `isinstance` assertions or `typing.cast` for narrowing. See [AGENTS.md §10](../../AGENTS.md#10-hard-rules). +If mypy reports errors, fix the underlying typing issue — never add `# type: ignore`. Prefer `isinstance` / `typing.cast` ([AGENTS.md §10](../../AGENTS.md#10-hard-rules)). diff --git a/.claude/commands/task-complete.md b/.claude/commands/task-complete.md index 9e36a5289..318a5bb91 100644 --- a/.claude/commands/task-complete.md +++ b/.claude/commands/task-complete.md @@ -15,14 +15,16 @@ Review every comment and docstring **added or modified during this task** agains Use `git diff` against the merge-base to scope the review to genuinely new comments — do NOT rewrite unrelated comments elsewhere in the file. -### 2. Format +### 2. Type-check, then format + +Run mypy **before** format. Mypy fixes (`isinstance`, `typing.cast`, signature changes) can introduce format drift — running format after mypy means a single format pass. ``` -make format uv run mypy . +make format ``` -If mypy reports new errors, fix the underlying typing issue. Do not add `# type: ignore` — prefer `isinstance` assertions or `typing.cast`. +If mypy reports new errors, fix the underlying typing issue. Do not add `# type: ignore`. ### 3. Run the standard test tiers diff --git a/transformer_lens/model_bridge/supported_architectures/AGENTS.md b/transformer_lens/model_bridge/supported_architectures/AGENTS.md index 5f3f46d7a..8d1bf047d 100644 --- a/transformer_lens/model_bridge/supported_architectures/AGENTS.md +++ b/transformer_lens/model_bridge/supported_architectures/AGENTS.md @@ -110,6 +110,8 @@ HF raw config attributes are invisible to TL-side consumers unless propagated to Rule of thumb: if the model card or HF source mentions a numerical knob, assume it needs to land on `self.cfg`. If that knob changes weights or final outputs and HF's forward applies it natively, you ALSO need a `preprocess_weights` override or compatibility mode will diverge. +**Passthrough gotcha:** if your attr is NOT a declared `TransformerBridgeConfig` field, `TransformerBridgeConfig.from_dict(hf_config)` silently filters it out — `getattr(cfg, "", None)` in your adapter will return `None` and your fallback default fires regardless of the model's actual value. To propagate, add the attr name to `_HF_PASSTHROUGH_ATTRS` in [`sources/transformers.py`](../sources/transformers.py) (and the duplicate list in [`sources/_bridge_builder.py`](../sources/_bridge_builder.py)). Verify with an integration-test assertion: `assert bridge.cfg. == hf_model.config.`. + --- ## When to override `preprocess_weights` @@ -326,7 +328,29 @@ Add one test per architecture quirk (softcaps, RMSNorm offsets, sliding window, ### 2. Integration parity test — `tests/integration/model_bridge/test__adapter.py` -Loads a real cached HF model; asserts logit parity at fp32 + eager attention. Templates: [`test_deepseek_adapter.py`](../../../tests/integration/model_bridge/test_deepseek_adapter.py), [`test_falcon_adapter.py`](../../../tests/integration/model_bridge/test_falcon_adapter.py), [`test_mamba_adapter.py`](../../../tests/integration/model_bridge/test_mamba_adapter.py). If the reference model OOMs CI, gate with `@pytest.mark.skipif(bool(os.getenv("CI")), ...)` and add a row to [`QUARANTINES.md`](../../../tests/QUARANTINES.md) under "CI cost / network budget." +Loads a real cached HF model; asserts logit parity at fp32 + eager attention. Templates: [`test_deepseek_adapter.py`](../../../tests/integration/model_bridge/test_deepseek_adapter.py), [`test_falcon_adapter.py`](../../../tests/integration/model_bridge/test_falcon_adapter.py), [`test_mamba_adapter.py`](../../../tests/integration/model_bridge/test_mamba_adapter.py). + +Required test classes (one per concern; copy-rename from a sibling): + +| Class | Asserts | +|---|---| +| `TestBridgeCreation` | Boot succeeds, expected components present, `cfg.` set correctly | +| `TestForwardEquivalence` | Logit parity vs HF at fp32 + eager attention | +| `TestHFDelegation` | Bridge submodules hold live HF objects (e.g. `bridge.blocks[0].attn.q.original_component is hf_model.model.layers[0].self_attn.q_proj`) | +| `TestHookShapes` | Hook fires; output shape matches expectation (replace with `ParallelHooks` / `MoEHooks` etc. for architecture-specific block shapes) | +| `Test` | One class per architectural quirk: `LogitScale`, `RMSNormOffset`, `Softcap`, `TiedEmbedding`, etc. — each propagated config attr should have a test asserting `bridge.cfg. == hf_model.config.` | + +The boot fixture uses `dtype=` (Bridge's API), NOT `torch_dtype=` (HF's): + +```python +@pytest.fixture(scope="module") +def bridge(): + return TransformerBridge.boot_transformers( + "", device="cpu", dtype=torch.float32, attn_implementation="eager", + ) +``` + +If the reference model OOMs CI, gate with `@pytest.mark.skipif(bool(os.getenv("CI")), ...)` and add a row to [`QUARANTINES.md`](../../../tests/QUARANTINES.md) under "CI cost / network budget." ### End-to-end registry verification From 220e312b3922944692e404180a90890c26e78c66 Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Tue, 2 Jun 2026 22:33:26 -0500 Subject: [PATCH 10/11] additional detail files --- transformer_lens/benchmarks/AGENTS.md | 19 ++++++++ transformer_lens/config/AGENTS.md | 44 +++++++++++++++++ .../model_bridge/sources/AGENTS.md | 47 +++++++++++++++++++ 3 files changed, 110 insertions(+) create mode 100644 transformer_lens/benchmarks/AGENTS.md create mode 100644 transformer_lens/config/AGENTS.md create mode 100644 transformer_lens/model_bridge/sources/AGENTS.md diff --git a/transformer_lens/benchmarks/AGENTS.md b/transformer_lens/benchmarks/AGENTS.md new file mode 100644 index 000000000..4f9159bb5 --- /dev/null +++ b/transformer_lens/benchmarks/AGENTS.md @@ -0,0 +1,19 @@ +# Benchmarks — AGENTS.md + +This is the benchmark library that `verify_models` calls into. Read [the root AGENTS.md](../../AGENTS.md) for project-wide rules. + +## ⚠ Not the registry-update path + +**Use [`verify_models`](../tools/model_registry/AGENTS.md), not [`main_benchmark.py`](main_benchmark.py), for registry updates.** `main_benchmark` runs the same benchmark math but defaults to NOT writing `data/supported_models.json` (needs `--update-registry`, and even with that flag misses Phase 7 / 8 scores and the resume checkpoint). + +If an agent is here because the user asked to "update the registry" or "verify a model," they took the wrong entry point — redirect to [tools/model_registry/AGENTS.md](../tools/model_registry/AGENTS.md). + +## What this directory IS for + +- The phase-by-phase benchmark implementations (`forward_pass.py`, `generation.py`, `hook_registration.py`, `weight_processing.py`, `multimodal.py`, `audio.py`, `text_quality.py`, `granular_weight_processing.py`, `component_outputs.py`, `backward_gradients.py`, `activation_cache.py`, `component_benchmark.py`, `hook_structure.py`). +- `main_benchmark.py` — exploratory benchmark runner for ad-hoc comparison. Useful for debugging a single model's phase scores without touching the registry. +- `utils.py` — shared helpers including `BenchmarkSeverity`. + +## When you ARE in the right place + +You're modifying a benchmark implementation (e.g., adding a new check to Phase 2 hook firing), debugging why a phase score is wrong, or adding a benchmark for a new architectural feature. In that case, follow project conventions in [AGENTS.md §10](../../AGENTS.md#10-hard-rules) and ensure the change is exercised by an existing or new `verify_models` phase. diff --git a/transformer_lens/config/AGENTS.md b/transformer_lens/config/AGENTS.md new file mode 100644 index 000000000..c88f0ef56 --- /dev/null +++ b/transformer_lens/config/AGENTS.md @@ -0,0 +1,44 @@ +# Config — AGENTS.md + +The config dataclasses that drive both `HookedTransformer` and `TransformerBridge`. Read [the root AGENTS.md](../../AGENTS.md) for project-wide rules. + +## File map + +| File | Class | Used by | +|---|---|---| +| [`transformer_lens_config.py`](transformer_lens_config.py) | `TransformerLensConfig` | Minimal base — only fields actually used by the system | +| [`transformer_bridge_config.py`](transformer_bridge_config.py) | `TransformerBridgeConfig(TransformerLensConfig)` | The Bridge config; what every Bridge adapter receives as `cfg` | +| [`hooked_transformer_config.py`](hooked_transformer_config.py) | `HookedTransformerConfig` | Legacy HT-only config (deprecated; see [AGENTS.md §2](../../AGENTS.md#2-two-systems-live-in-this-repo)) | + +## Adding a new HF-config attr to `TransformerBridgeConfig` — decision tree + +The `logit_scale` bug existed because the rules below weren't documented anywhere. **Pick exactly one of the four paths**; doing more than one creates silent-override risk. + +| Use case | Path | +|---|---| +| First-class TL field that adapters / hooks / weight processing read | **Declare as a dataclass parameter** on `TransformerBridgeConfig`. Set a sensible default. Update `map_default_transformer_lens_config` in [`sources/transformers.py`](../model_bridge/sources/transformers.py) to translate the HF-config attr name to your field name. | +| HF attr the adapter reads at runtime, no semantic translation needed | **Add to `_HF_PASSTHROUGH_ATTRS`** in BOTH [`sources/transformers.py:481`](../model_bridge/sources/transformers.py) AND [`sources/_bridge_builder.py:18`](../model_bridge/sources/_bridge_builder.py). The trap: adding to only one half-fixes. See [sources/AGENTS.md](../model_bridge/sources/AGENTS.md). | +| HF attr name differs from existing TL field | **Add an explicit handler** in `map_default_transformer_lens_config` (e.g. Gemma2's `final_logit_softcapping` → `output_logits_soft_cap`). Don't also add to PASSTHROUGH. | +| Just a derived view of an existing field | **Add a `@property`** on `TransformerBridgeConfig` (e.g. `head_dim` aliases `d_head`). | + +**Don't** declare the same attr both as a dataclass field AND a PASSTHROUGH entry — PASSTHROUGH writes happen AFTER `from_dict`, so the runtime value will silently overwrite whatever defaults / explicit handlers set. + +## Existing properties that may surprise you + +- **`head_dim`** is a read-only `@property` aliasing `d_head` (line 212). `setattr(cfg, "head_dim", val)` raises `AttributeError: property 'head_dim' has no setter`. Don't add it to PASSTHROUGH. +- **`n_heads`** has `= -1` as its placeholder in the constructor signature, deliberately. Comment in the source: *"Add n_heads to signature so it's not filtered out by from_dict"*. Don't "fix" this default. + +## Verifying a new field propagates + +Integration test pattern (the regression test that would have caught the `logit_scale` bug): + +```python +def test_cfg__matches_hf(bridge: TransformerBridge, hf_model: Any) -> None: + """Regression: must propagate from HF (not silently fall back).""" + bridge_val = getattr(bridge.cfg, "") + assert bridge_val == hf_model.config. +``` + +The `getattr` form sidesteps mypy's "undeclared field" complaint without `# type: ignore` (see [AGENTS.md §10](../../AGENTS.md#10-hard-rules)). + +Pick a test model whose `` value differs from the adapter's hardcoded fallback (if any) — otherwise the assertion passes by tautology. diff --git a/transformer_lens/model_bridge/sources/AGENTS.md b/transformer_lens/model_bridge/sources/AGENTS.md new file mode 100644 index 000000000..a75bbe210 --- /dev/null +++ b/transformer_lens/model_bridge/sources/AGENTS.md @@ -0,0 +1,47 @@ +# Sources — AGENTS.md + +Backend loaders that translate external models (HF, native) into a `TransformerBridge`. Read [the root AGENTS.md](../../../AGENTS.md) for project-wide rules. + +## File map + +| File | Role | +|---|---| +| [`_bridge_builder.py`](_bridge_builder.py) | Loader-agnostic helpers: `build_bridge_config_from_hf`, `build_bridge_from_module`, `detect_tokenizer_bos_eos` | +| [`transformers.py`](transformers.py) | HuggingFace backend: `boot` (entry point used by `TransformerBridge.boot_transformers`), `map_default_transformer_lens_config` (HF→TL config translation), `check_model_support`, `list_supported_models` | +| [`native/model.py`](native/model.py), [`native/init.py`](native/init.py) | TL-native models built from scratch (no HF), used by `TransformerBridge.boot_native` | + +## ⚠ The duplicate `_HF_PASSTHROUGH_ATTRS` trap + +There are **two** identical `_HF_PASSTHROUGH_ATTRS` lists: + +- `transformers.py:481` — inside `map_default_transformer_lens_config`, copies HF-config attrs onto `tl_config`. +- `_bridge_builder.py:18` — module-level, copies HF-config attrs onto `bridge_config` AFTER `TransformerBridgeConfig.from_dict` runs. + +Both fire sequentially on the same `hf_config` during `boot_transformers`. **When adding a passthrough attr, add it to BOTH lists** — adding to only one half-fixes the bug. A previous regen-agent run shipped a half-fix that the regression test caught only because we also added an assertion on the canonical bridge config. + +## Two-pass config-translation pipeline + +``` +hf_config + │ + ├─ map_default_transformer_lens_config() [transformers.py] + │ ├─ explicit handlers (d_model, n_heads, head_dim → d_head, etc.) + │ └─ _HF_PASSTHROUGH_ATTRS copy → tl_config + │ + ├─ TransformerBridgeConfig.from_dict() [filters to declared fields] + │ + └─ _HF_PASSTHROUGH_ATTRS copy → bridge_config [_bridge_builder.py] +``` + +**Implications:** + +- If an attr has an explicit handler in `map_default_transformer_lens_config` (like `head_dim` → `d_head`), it's translated to a declared `TransformerBridgeConfig` field — adding it to PASSTHROUGH is **wrong** (often raises `AttributeError` from a read-only property). +- If an attr is purely runtime / adapter-specific (like Cohere's `logit_scale`), it has no explicit handler and is dropped by `from_dict` — it MUST be in both PASSTHROUGH lists, or the adapter's `getattr(cfg, "", default)` silently falls back to its default forever. +- See [config/AGENTS.md](../../config/AGENTS.md) for the decision tree. + +## `boot_transformers` vs `boot_native` + +- `boot_transformers` is the HF path. Goes through `boot()` in `transformers.py` → `build_bridge_from_module` in `_bridge_builder.py` → adapter init. +- `boot_native` builds a TL-native transformer from a `TransformerBridgeConfig` directly (no HF dependency). Uses [`native.py`](../supported_architectures/native.py) as the single adapter; the model class is in [`native/model.py`](native/model.py). + +Almost every contributor change goes through the `boot_transformers` path. Touch `native/` only when adding a primitive (new norm type, new positional embedding variant) that the cfg-driven dispatch in `native/model.py` doesn't already cover. From d168d2f8487cf42e6dce7651f1309c72ed03479d Mon Sep 17 00:00:00 2001 From: jlarson4 Date: Wed, 3 Jun 2026 09:12:22 -0500 Subject: [PATCH 11/11] Additional documentation cleanup --- docs/source/content/contributing.md | 67 +++++++++---------- .../content/debugging_numerical_divergence.md | 6 +- 2 files changed, 36 insertions(+), 37 deletions(-) diff --git a/docs/source/content/contributing.md b/docs/source/content/contributing.md index 55202bbab..e23986081 100644 --- a/docs/source/content/contributing.md +++ b/docs/source/content/contributing.md @@ -96,28 +96,27 @@ Note that `black` line length is set to 100 in `pyproject.toml` (instead of the ## Project conventions -These rules apply across the codebase and reflect maintainer experience with recurring failure modes. +A few conventions have grown out of recurring failure modes. Following them tends to make reviews go faster. -### On test failures +### Test failures -- **Never dismiss a failing test as "pre-existing" or "unrelated"** — investigate every failure and fix the underlying issue. -- **Never add platform skips, `@pytest.mark.skip`, or `xfail` to make a failing CI green.** Debug the actual bug. If a skip is genuinely needed (optional dep, hardware requirement), add an entry to `tests/QUARANTINES.md` with the rationale. -- **If new tests surface pre-existing bugs, fix them.** Don't punt to a future PR. +Treat every failing test as a signal worth investigating, even if the failure looks unrelated to your change. "Pre-existing failure" is rarely true once you dig in, and the cost of finding out is small compared to shipping a regression. If a new test surfaces a pre-existing bug, it's worth fixing in the same PR rather than deferring. -### On code quality +Avoid reaching for `@pytest.mark.skip`, `xfail`, or platform-gated `skipif` to make a red CI go green. If a skip is genuinely required — an optional dependency, a hardware requirement, an upstream platform bug — document it in `tests/QUARANTINES.md` so the next person debugging knows why it's quarantined. -- **Never add `# type: ignore`.** Prefer `isinstance` assertions or `typing.cast` for narrowing. -- **Comments**: terse, one-line docstrings; inline comments should explain WHY, not WHAT. No multi-paragraph explanations. +### Code quality -### On numerical work +We avoid `# type: ignore` comments throughout the codebase; if the type system disagrees with you, the usual escape hatches are `isinstance` narrowing or `typing.cast`. Keep comments terse — one-line docstrings, and inline comments that explain *why* code does what it does rather than restating *what* it does. -- **Never claim observed drift is "fp noise" without empirical evidence.** Real bugs and accumulated rounding error look identical at noise scale until you measure. The cheap proof is fp64: if the diff stays the same magnitude in `dtype=torch.float64` on both sides, it's a bug; if it drops by ~8 orders of magnitude, it was noise. -- **Never run model verification or benchmarks in parallel.** A single CUDA/MPS device does not have memory for concurrent loads. +### Numerical work -### On environment and tooling +When a forward pass disagrees with HuggingFace, it's tempting to call the difference "floating-point noise" and move on. The trouble is that genuine bugs and accumulated rounding error are indistinguishable at small magnitudes until you measure. A cheap check: rerun the comparison in `dtype=torch.float64` on both sides. If the diff stays the same magnitude, it's a bug; if it drops by roughly eight orders of magnitude, it was noise. The [Debugging Numerical Divergence](debugging_numerical_divergence.md) guide walks the rest of the bisection. -- Use `uv`, never `pip` or `poetry`. Run commands via `uv run ` or the appropriate `make` target. -- Source `.env` before any HF-Hub-hitting command. +Model verification and benchmark runs each load a full model; a single CUDA or MPS device generally lacks the memory to host two at once. Run them serially. + +### Environment and tooling + +Use `uv` rather than `pip` or `poetry` — commands run via `uv run ` or the appropriate `make` target. Source `.env` (`set -a; source .env; set +a`) before any command that hits the HuggingFace Hub. ## Two systems live in this repo @@ -128,7 +127,7 @@ The library is mid-transition between two parallel paths: | `TransformerBridge` | v3 — default for new work | `transformer_lens/model_bridge/` | Raw HF weights by default; `bridge.enable_compatibility_mode()` for HT-equivalent — see [Compatibility Mode](compatibility_mode.md) | `transformer_lens/tools/model_registry/data/supported_models.json` | | `HookedTransformer` | Legacy, maintenance mode, deprecated in 3.0 | `transformer_lens/HookedTransformer.py` + `transformer_lens/components/` | Folds LayerNorm + centres weights → does NOT match HF | `transformer_lens/supported_models.py` (**HT-only**) | -**Mirroring rule**: if you change behaviour in `HookedTransformer` that has a counterpart in `TransformerBridge`, update both in the same PR. They are parallel implementations of the same surface; drift between them is a recurring source of bugs. Conversely, `supported_models.py` is HT-only — Bridge-only models live in the Bridge registry data file under `transformer_lens/tools/model_registry/`. +Because the two systems are parallel implementations of the same surface, behavioural changes on one side usually need a matching change on the other. If you change a feature in `HookedTransformer` that has a counterpart in `TransformerBridge` (or vice versa), update both in the same PR — drift between them has historically been a steady source of bugs. The registries are *not* parallel, though: `supported_models.py` is HookedTransformer-only, while Bridge-only models live in the Bridge registry data file under `transformer_lens/tools/model_registry/`. ## PR conventions @@ -258,12 +257,12 @@ Two guides walk through the process: - [Architecture Adapter Creation Guide](adapter_development/adapter-creation-guide.md) — start here. A step-by-step workflow for taking an HF model from unsupported to tested, registered adapter. - [HuggingFace Model Analysis Guide](adapter_development/hf-model-analysis-guide.md) — a reference for reading an HF model's `config.json` and source files to extract the attributes you'll set on `self.cfg`. -Adapters live in `transformer_lens/model_bridge/supported_architectures/.py` and are registered in **four** places: +Adapters live in `transformer_lens/model_bridge/supported_architectures/.py` and need to be registered in **four** places. Each registration site has a different consequence if you skip it, which is why the next section's invariant test is worth running before you open the PR. -1. **`transformer_lens/model_bridge/supported_architectures/__init__.py`** — import the adapter class and add to `__all__`. Missing → import error at boot. -2. **`transformer_lens/factories/architecture_adapter_factory.py`** — import the class and add to the `SUPPORTED_ARCHITECTURES` dict. The key must match `config.architectures[0]` exactly. Missing → `boot_transformers` raises "unsupported architecture." -3. **`transformer_lens/tools/model_registry/__init__.py`** — add to `HF_SUPPORTED_ARCHITECTURES` (the set) and `CANONICAL_AUTHORS_BY_ARCH` (the foundation-orgs map). Missing → HF scraper misses canonical models. -4. **`transformer_lens/tools/model_registry/generate_report.py`** — add a one-line entry to `ARCHITECTURE_DESCRIPTIONS` so the generated docs table covers the new architecture. +1. **`transformer_lens/model_bridge/supported_architectures/__init__.py`** — import the adapter class and add it to `__all__`. The package fails to import at boot if this is missed. +2. **`transformer_lens/factories/architecture_adapter_factory.py`** — import the class and add it to the `SUPPORTED_ARCHITECTURES` dict. The dict key must match `config.architectures[0]` exactly; otherwise `boot_transformers` raises "unsupported architecture." +3. **`transformer_lens/tools/model_registry/__init__.py`** — add the architecture name to `HF_SUPPORTED_ARCHITECTURES` (the set) and `CANONICAL_AUTHORS_BY_ARCH` (the foundation-orgs map). Without this, the HF scraper won't discover canonical models for the new architecture. +4. **`transformer_lens/tools/model_registry/generate_report.py`** — add a one-line entry to `ARCHITECTURE_DESCRIPTIONS` so the generated coverage table covers the new architecture. If you want a starter file, copy [adapter-template.py](../_static/adapter-template.py) into `supported_architectures/` and rename it. @@ -286,7 +285,7 @@ Two test layers, both required: - **HF raw config attributes are invisible to TL-side consumers unless explicitly propagated to `self.cfg`.** Walk the HF `config.json` and mirror any non-standard knobs (`final_logit_softcapping`, `attn_logit_softcapping`, `query_pre_attn_scalar`, `sliding_window`, `layer_types`, custom `eps_attr` names) onto `self.cfg` so weight processing and forward passes can see them. - **Some config attrs need both surface-on-cfg AND fold-into-weight** via a `preprocess_weights()` override. The trigger: a numerical operation HF's forward applies natively must also be baked into the raw weights, or `bridge.enable_compatibility_mode()` (which calls `process_weights` on raw weights) produces wrong results. Concrete examples in-tree: Cohere `logit_scale` → `unembed.weight`; Gemma embedding scale (`√d_model`) → `embed.weight`. Skip the fold and Phase 3 / Phase 4 of `verify_models` will silently degrade. -- **Tokenizer policy is per-model, not per-architecture.** Sibling models in the same family routinely differ — the chat-instruct variant may prepend BOS where the base does not, padding side can flip, EOS handling can differ. Never copy `default_prepend_bos`, padding side, or EOS handling from your starter adapter without checking the target's behaviour. **`tokenizer_config.json` lies for some architectures** (Cohere declares `add_bos_token=False` but HF's `__call__` prepends BOS anyway). The reliable check is to invoke the tokenizer: +- **Tokenizer policy is per-model, not per-architecture.** Sibling models in the same family routinely differ — the chat-instruct variant may prepend BOS where the base does not, padding side can flip, EOS handling can differ. It's worth re-checking `default_prepend_bos`, padding side, and EOS handling against the specific target rather than copying them from a starter adapter. `tokenizer_config.json` is not always reliable on its own — some architectures (Cohere is a notable example) declare `add_bos_token=False` but HF's `__call__` prepends BOS anyway. The most reliable check is to invoke the tokenizer directly: ```python from transformers import AutoTokenizer @@ -295,9 +294,9 @@ Two test layers, both required: print(t.bos_token_id) ``` - If `t("hello").input_ids[0] == t.bos_token_id`, set `cfg.default_prepend_bos = True`. Otherwise leave the flag unset. + If `t("hello").input_ids[0] == t.bos_token_id`, set `cfg.default_prepend_bos = True`; otherwise leave the flag unset. - **Hook names inside adapters are Bridge-native** (e.g., `blocks.{i}.hook_out`). HookedTransformer-style aliases (e.g., `blocks.{i}.hook_resid_post`) are registered elsewhere — in `transformer_lens/model_bridge/bridge.py` via `build_alias_to_canonical_map()`. Adapters declare canonical names only. -- **No `# type: ignore` on `ComponentMapping` types.** Use `isinstance` or `typing.cast` if the type system disagrees. +- **`ComponentMapping` types do not need `# type: ignore`.** If the type system disagrees, prefer `isinstance` narrowing or `typing.cast`; the project as a whole avoids `# type: ignore`. ### Verifying a new model @@ -308,22 +307,22 @@ set -a; source .env; set +a uv run python -m transformer_lens.tools.model_registry.verify_models --model ``` -`verify_models` runs phases 1–4 (forward correctness vs HF, hook firing + gradients, weight processing, generation quality) and updates `data/supported_models.json` with status and per-phase scores. **Always run `--dry-run` first** to project memory and parameter count without loading the model. Run one model at a time — concurrent loads OOM a single device. +`verify_models` runs phases 1–4 (forward correctness vs HF, hook firing + gradients, weight processing, generation quality) and updates `data/supported_models.json` with the resulting status and per-phase scores. We recommend running `--dry-run` first to project memory and parameter count without loading the model, and verifying one model at a time — concurrent loads tend to OOM a single device. -**Use `verify_models`, never `main_benchmark`** — only `verify_models` writes the registry. `main_benchmark` runs the same math but defaults to skipping the registry write (requires `--update-registry`, and even with that flag misses Phase 7 / 8 scores and the resume checkpoint). +A note on entry points: `verify_models` is the script that writes the registry. `main_benchmark` runs the same underlying benchmarks but defaults to *not* writing the registry (it requires `--update-registry`, and even then it doesn't record Phase 7 / 8 scores or the resume checkpoint). If you want the registry updated after your run, use `verify_models`. -**Read the per-phase scores, not just the status.** `verify_models` enforces hard pass/fail at the thresholds in `_MIN_PHASE_SCORES`; below threshold or a required-test failure causes `STATUS_FAILED`. The contract: +It's worth reading the per-phase scores in addition to the final status — the verifier enforces hard pass/fail thresholds, and a model that just clears the bar tells you something different than one that breezes through. The current thresholds: | Phase | Min score | Required tests | Effect when below threshold | |---|---|---|---| -| 1 | 100% | — | `STATUS_FAILED` | -| 2 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | -| 3 | 75% | `logits_equivalence`, `loss_equivalence` | `STATUS_FAILED` | -| 4 | 50% | — | **Non-gating** — below 50% adds `"low text quality"` to the registry `note`; never causes `STATUS_FAILED`. | -| 7 | 75% | `multimodal_forward` | `STATUS_FAILED`. NULL score also fails. | -| 8 | 75% | `audio_forward` | `STATUS_FAILED`. NULL score also fails. | - -Phase 4 is intentionally lenient — it's a coherence metric, not a correctness check. A sub-100% Phase-4 score on a small parity-test model can still indicate a real adapter bug that the system doesn't gate on (missing `preprocess_weights` fold, wrong `default_prepend_bos`, etc.); investigate manually even when the entry says VERIFIED. +| 1 | 100% | — | Verification fails | +| 2 | 75% | `logits_equivalence`, `loss_equivalence` | Verification fails | +| 3 | 75% | `logits_equivalence`, `loss_equivalence` | Verification fails | +| 4 | 50% | — | **Non-gating** — below 50% adds `"low text quality"` to the registry `note`; never fails verification. | +| 7 | 75% | `multimodal_forward` | Verification fails. A NULL score also fails. | +| 8 | 75% | `audio_forward` | Verification fails. A NULL score also fails. | + +Phase 4 is intentionally lenient — it's a coherence metric, not a correctness check. A sub-100% Phase-4 score on a small parity-test model can still indicate a real adapter bug that the gates don't catch (missing `preprocess_weights` fold, wrong `default_prepend_bos`, and so on); the model can pass verification overall and still be worth a manual look. If verification fails by `~1e-3` or more against the HF reference, the bisection workflow lives at [Debugging Numerical Divergence](debugging_numerical_divergence.md). diff --git a/docs/source/content/debugging_numerical_divergence.md b/docs/source/content/debugging_numerical_divergence.md index a0064f098..01568a3ce 100644 --- a/docs/source/content/debugging_numerical_divergence.md +++ b/docs/source/content/debugging_numerical_divergence.md @@ -2,7 +2,7 @@ When a Bridge adapter's integration test fails by `~1e-3` (or any larger delta) against the HuggingFace reference, the failure mode is almost always one of a small set of recurring bugs. This page walks the bisection workflow. -> **Hard rule** (see also the [project conventions in contributing.md](contributing.md#on-numerical-work)): never call observed drift "fp noise" without empirical evidence. Real bugs and accumulated rounding error look identical at noise scale until you measure. +> A note before you start: it's tempting to attribute small drift to "floating-point noise" and move on, but genuine bugs and accumulated rounding error are indistinguishable at small magnitudes until you measure. The [numerical-work conventions in contributing.md](contributing.md#numerical-work) describe the cheap fp64 check that disambiguates the two. --- @@ -97,7 +97,7 @@ Empirically, in this codebase: - **bf16, eager**: `< 1e-2` is the noise floor. - **fp32, sdpa**: `< 5e-4` due to sdpa's internal reductions. Use eager for parity tests. -If you suspect noise, the cheap proof is to run **fp64**: `dtype=torch.float64` on both sides. If the diff stays the same magnitude, it's a bug. If it drops by ~8 orders of magnitude, it was noise. See the [project conventions in contributing.md](contributing.md#on-numerical-work) for the rule behind this. +If you suspect noise, the cheap proof is to run **fp64**: `dtype=torch.float64` on both sides. If the diff stays the same magnitude, it's a bug. If it drops by ~8 orders of magnitude, it was noise. See the [numerical-work conventions in contributing.md](contributing.md#numerical-work) for more context on why this check is worth running. ## 7. Tooling @@ -108,4 +108,4 @@ If you suspect noise, the cheap proof is to run **fp64**: `dtype=torch.float64` --- -If you exhaust this guide and still can't localize the bug, the failure pattern is worth adding to §2 above so the next agent doesn't repeat the bisection. +If you exhaust this guide and still can't localize the bug, the failure pattern is worth adding to §2 above so the next contributor doesn't repeat the bisection.