diff --git a/README.md b/README.md index 030ee52..3eeaa37 100644 --- a/README.md +++ b/README.md @@ -29,27 +29,21 @@ A **Databricks-native document intelligence + agent** stack: parse PDFs once wit [2], regulation [3]…" ``` +For motivation, architecture diagrams, the Spec-Kit + Claude Code build workflow, and the chicken-egg deploy-ordering story, see [**`docs/design.md`**](./docs/design.md). For day-2 ops, see [**`docs/runbook.md`**](./docs/runbook.md). + --- ## Table of contents -- [Why this exists](#why-this-exists) - [Features](#features) - [Readiness levels](#readiness-levels) - [Prerequisites](#prerequisites) - - [Software](#software) - - [Databricks workspace](#databricks-workspace) - - [Free trial signup](#free-trial-signup) - [Getting started](#getting-started) -- [Architecture](#architecture) -- [How it's built — three pillars](#how-its-built--three-pillars) -- [Deploy ordering: foundation → consumers](#deploy-ordering-foundation--consumers) - [CLEARS quality gate](#clears-quality-gate) - [Configuration](#configuration) - [Testing & validation](#testing--validation) - [Deployment](#deployment) - [Repo layout](#repo-layout) -- [What you can learn from this repo](#what-you-can-learn-from-this-repo) - [Limitations](#limitations) - [Contributing](#contributing) - [Security](#security) @@ -58,14 +52,6 @@ A **Databricks-native document intelligence + agent** stack: parse PDFs once wit --- -## Why this exists - -Databricks shipped a lot of new generative-AI surface area in 2025–2026: `ai_parse_document`, Mosaic AI Vector Search, the Agent Framework, AI Gateway, Lakebase, Databricks Apps. Tutorials show each piece in isolation; nobody shows them wired together with **eval gates, governance, and reproducible deploys** the way you'd actually ship to analysts. - -This repo is that worked example. Drop a PDF into a governed UC volume; ten minutes later, an analyst can ask cited questions in plain English with end-to-end audit. The whole stack is described declaratively as one **Databricks Asset Bundle (DAB)** plus a small bootstrap script. DAB manages catalog/schema/volume, pipeline, jobs, the Vector Search **endpoint**, the Lakebase instance, the serving endpoint, the monitor, the app, and the dashboard; the Vector Search **index** itself is created and synced by `jobs/index_refresh/sync_index.py` (DAB doesn't yet manage indexes as a resource type), and the agent model version is registered by `agent/log_and_register.py`. The bootstrap script orchestrates them in the right order. - -It also demonstrates a development workflow: **Spec-Kit** for spec-driven design, **Claude Code** with Databricks skill bundles for AI-assisted implementation, six **non-negotiable constitution principles** that gate every plan. See [How it's built](#how-its-built--three-pillars). - ## Features - **End-to-end document intelligence pipeline** — Auto Loader ingest → `ai_parse_document` → section explosion → `ai_classify` + `ai_extract` → 5-dim quality rubric → Vector Search Delta-Sync index (the endpoint is DAB-managed; the index is created/synced by `jobs/index_refresh/sync_index.py`). SQL-only pipeline (Lakeflow Spark Declarative Pipelines). @@ -181,7 +167,7 @@ DOCINTEL_WAREHOUSE_ID= \ ./scripts/bootstrap-dev.sh ``` -The script handles the chicken-egg ordering automatically — see [Deploy ordering](#deploy-ordering-foundation--consumers). +The script handles the chicken-egg ordering automatically — see [`docs/design.md` § Deploy ordering](./docs/design.md#deploy-ordering-foundation--consumers). ### 5. 
Run the eval gate @@ -228,323 +214,9 @@ For a guided 30-minute tour, see [`specs/001-doc-intel-10k/quickstart.md`](./spe --- -## Architecture - -### Two halves: an offline pipeline, and an online agent - -``` - ╔═══════════════════════════════════════════════════════════════════╗ - ║ pipelines/sql/ (one SQL file per tier) ║ - ╚═══════════════════════════════════════════════════════════════════╝ - - raw_filings/ ┌─────────────────┐ ┌─────────────────┐ ┌──────────────────┐ - ACME_10K.pdf ──▶ │ bronze_filings │──▶│ silver_parsed_ │──▶│ gold_filing_ │ - BETA_10K.pdf │ (raw bytes, │ │ filings (parsed │ │ sections (one │ - GAMMA_10K.pdf │ filename, │ │ VARIANT — │ │ row per parsed │ - │ ingested_at) │ │ ai_parse_ │ │ $.sections[*]; │ - │ │ │ document) │ │ fallback to │ - │ >50MB rejects: │ │ │ │ full_document │ - │ bronze_filings │ │ Status: ok / │ │ if absent) │ - │ _rejected │ │ partial / error │ │ │ - └─────────────────┘ └─────────────────┘ │ gold_filing_kpis │ - 01_bronze.sql 02_silver_parse │ (typed columns: │ - .sql │ segment_revenue │ - │ ARRAY, │ - │ top_risks │ - │ ARRAY) │ - └──────────────────┘ - 03_gold_classify - _extract.sql - │ - ▼ - ┌──────────────────┐ - │ gold_filing_ │ - │ quality │ - │ (5-dim rubric: │ - │ parse, layout, │ - │ ocr, sections, │ - │ kpi → 0-30) │ - └──────────────────┘ - 04_gold_quality.sql -``` - -**Key idea — "parse once, extract many":** PDFs are expensive to parse. Silver runs `ai_parse_document` exactly once per file and stores the structured result as a `VARIANT`. Everything downstream — classification, KPI extraction, summarization, quality scoring — reads the parsed output, never the raw bytes. This is a non-negotiable constitution principle. - -**Triggering**: prod runs the pipeline in `continuous: true` mode so Auto Loader (`read_files`) reacts to new PDFs in the volume automatically. Dev overrides to `continuous: false` to avoid a 24/7 cluster during smoke iterations. See `resources/foundation/doc_intel.pipeline.yml` and the dev override block in `databricks.yml`. - -### Vector Search bridges data and agent - -``` - gold_filing_sections ┌─────────────────────────┐ - (governed Delta table) ─────▶ │ Mosaic AI Vector │ - │ Search Index │ - Filter: embed_eligible=true │ (Delta-Sync — auto- │ - Embed column: "summary" │ refreshes when Gold │ - │ updates) │ - └─────────────────────────┘ - - Why "summary" not the raw text? - ───────────────────────────── - Embedding a 50-page 10-K verbatim is noisy. We embed an LLM-written - summary instead — tighter, more searchable. Constitution principle IV: - "Quality before retrieval." -``` - -**Ownership note**: DAB manages the Vector Search **endpoint** (`resources/consumers/filings_index.yml`) and the index-refresh **job** (`resources/consumers/index_refresh.job.yml`). The **index** itself isn't yet a DAB-managed resource type as of CLI 0.298 — `jobs/index_refresh/sync_index.py` creates the Delta-Sync index on first run and triggers a sync on subsequent runs. That's why the bootstrap script's stage-2 deploy creates the endpoint + job, and the job's first execution materializes the actual index. - -### Agent has two paths, one endpoint - -``` - User question - │ - ▼ - ┌────────────────────────────────────────────┐ - │ AnalystAgent.predict() │ - │ ───────────────────── │ - │ contains "compare" / "vs" / │ - │ "between" + ≥2 company names? │ - └────────────┬─────────────────┬─────────────┘ - │ no │ yes - ▼ ▼ - ┌──────────────────────┐ ┌──────────────────────┐ - │ Single-filing path │ │ Supervisor path │ - │ │ │ │ - │ 1. 
Hybrid search │ │ For each company: │ - │ (keyword + vec) │ │ ▸ run analyst path │ - │ 2. Re-rank → top 5 │ │ ▸ pull KPIs from │ - │ 3. LLM generates │ │ gold_filing_kpis │ - │ answer w/ [1] [2] │ │ Format markdown │ - │ citations │ │ table with cites. │ - └──────────────────────┘ └──────────────────────┘ - │ │ - └────────┬────────┘ - ▼ - ┌──────────────────────┐ - │ Response JSON: │ - │ answer │ - │ citations[] │ - │ grounded: bool │ - │ latency_ms │ - └──────────────────────┘ -``` - -The agent is an `mlflow.pyfunc` model registered in Unity Catalog and served behind an **AI Gateway** (rate limiting per-user, usage tracking, inference-table audit). Identity passthrough is implemented at the *App layer* when the workspace has Databricks Apps user-token passthrough enabled: the Streamlit app extracts the user's `x-forwarded-access-token` header and constructs a user-scoped `WorkspaceClient`. The served model is OBO-ready via MLflow `auth_policy` and Model Serving user credentials. If app-level passthrough is not enabled, the app falls back to service-principal auth and the repo must be treated as a reference/dev deployment, not a production row-level-security deployment. See [`SECURITY.md`](./SECURITY.md) and `app/README.md`. - -### Runtime stack - -``` - ┌──────────────────────────────────────────────────────────────────┐ - │ │ - │ Databricks App (Streamlit) ← user interacts here │ - │ app/app.py │ - │ │ - │ ┌────────────────┐ ┌──────────────────┐ │ - │ │ Chat input box │ │ Citation chips │ │ - │ │ Thumbs up/down │ │ Markdown tables │ │ - │ └────────┬───────┘ └─────┬────────────┘ │ - │ │ │ │ - └──────────────│─────────────────│─────────────────────────────────┘ - │ │ - │ query │ feedback writes - ▼ ▼ - ┌────────────────────────┐ ┌────────────────────────┐ - │ Model Serving endpoint │ │ Lakebase Postgres │ - │ "analyst-agent-dev" │ │ ───────────────── │ - │ (CPU, scales to 0) │ │ conversation_history │ - │ │ │ query_logs │ - │ + AI Gateway: │ │ feedback │ - │ rate limit │ │ │ - │ (per-user key) │ │ (Postgres for tiny │ - │ inference-table │ │ per-turn writes — │ - │ audit │ │ Delta isn't great │ - │ usage tracking │ │ at row-by-row) │ - └────────────────────────┘ └────────────────────────┘ - - OBO (user identity end-to-end, when enabled): - ────────────────────────────── - App reads `x-forwarded-access-token` from the request, builds - `WorkspaceClient(token=...)`, calls the serving endpoint with the - user's identity. The agent-side MLflow auth policy and Model Serving - OBO credentials let downstream calls run as the user. If the app-side - feature is unavailable, the bootstrap script prints an explicit warning - and the deployment remains reference/dev only. -``` - -**Why Postgres for state?** Delta tables are great for analytics but bad at "insert one tiny row per chat turn at high frequency." Lakebase is Databricks's managed Postgres — same governance, right tool for the job. - ---- - -## How it's built — three pillars - -This repo is a worked example of combining three things that, together, change how you ship Databricks projects. - -### Pillar 1 — Spec-Kit (spec-driven development) - -[Spec-Kit](https://github.com/github/spec-kit) is a workflow that forces you to write — and *clarify* — a specification before writing code. 
Each phase is a slash-command in Claude Code that produces a checked-in artifact: - -``` - /speckit-specify → specs//spec.md What & why (no how) - │ - ▼ - /speckit-clarify → appended Q&A in spec.md Resolve ambiguity - │ - ▼ - /speckit-plan → specs//plan.md Tech stack + structure - │ + research.md, data-model.md, - │ contracts/, quickstart.md - ▼ - /speckit-tasks → specs//tasks.md Dependency-ordered tasks - │ - ▼ - /speckit-analyze → cross-artifact consistency check - │ - ▼ - /speckit-implement → the actual code -``` - -`.specify/extensions.yml` auto-commits at each phase boundary so the trail is clean. `.specify/memory/constitution.md` defines six **non-negotiable principles** every plan must respect: - -| # | Principle | What it means | -|---|---|---| -| I | **Unity Catalog source of truth** | Every table, volume, model, index, endpoint lives under `.` — no DBFS, no workspace-local resources | -| II | **Parse once, extract many** | `ai_parse_document` runs once at Silver → VARIANT; everything downstream reads the parsed output | -| III | **Declarative over imperative** | SDP SQL pipelines, Lakeflow Jobs, DAB resources — no production notebooks | -| IV | **Quality before retrieval** | 5-dim rubric scores every section; only ≥22/30 reach the index. Embed `summary`, not raw text | -| V | **Eval-gated agents** | MLflow CLEARS scores must clear thresholds before any deploy is considered complete | -| VI | **Reproducible deploys** | `databricks bundle deploy -t ` recreates the entire stack; `dev` and `prod` parity enforced | - -When you read `specs/001-doc-intel-10k/plan.md` you'll see a "Constitution Check" gate that maps each design decision back to the principle it satisfies. When you read `specs/001-doc-intel-10k/tasks.md` you'll see how each task derives from the plan, and how user-stories (P1, P2, P3) are independently demoable. - -### Pillar 2 — Databricks Asset Bundles + the Claude Code skill suite - -[**Databricks Asset Bundles**](https://docs.databricks.com/aws/en/dev-tools/bundles/) (DABs) describe most of the workspace state as YAML. One root `databricks.yml` declares variables and targets (`dev`, `prod`); `resources/**/*.yml` declares each resource (pipeline, jobs, Vector Search endpoint, index-refresh job, serving endpoint, app, monitor, dashboard, Lakebase instance + catalog). `databricks bundle deploy -t dev` reconciles workspace state to YAML. The two non-DAB-managed pieces — the Vector Search **index** itself and the registered **model version** — are produced at runtime by `jobs/index_refresh/sync_index.py` and `agent/log_and_register.py` respectively, which the bootstrap script orchestrates. - -This repo was built with Databricks-specific Claude Code skill bundles. Those bundles are distributed by Databricks via the CLI / Claude Code plugin channel and **are not vendored in this open-source tree** — install them locally if you have access, or reference the canonical Databricks docs (mapping in [`CONTRIBUTING.md`](./CONTRIBUTING.md)). 
- -| Skill bundle | What it provides | Canonical docs | -|---|---|---| -| **databricks-core** | Auth, profiles, data exploration, bundle basics | [docs](https://docs.databricks.com/aws/en/dev-tools/cli/) | -| **databricks-dabs** | DAB structure, validation, deploy workflow, target separation | [docs](https://docs.databricks.com/aws/en/dev-tools/bundles/) | -| **databricks-pipelines** | Lakeflow Spark Declarative Pipelines (`ai_parse_document`, `ai_classify`, `ai_extract`, `APPLY CHANGES INTO`) | [docs](https://docs.databricks.com/aws/en/dlt/) | -| **databricks-jobs** | Lakeflow Jobs with retries, schedules, table-update / file-arrival triggers | [docs](https://docs.databricks.com/aws/en/jobs/) | -| **databricks-apps** | Databricks Apps (Streamlit), App resource bindings | [docs](https://docs.databricks.com/aws/en/dev-tools/databricks-apps/) | -| **databricks-lakebase** | Lakebase Postgres instances, branches, computes, endpoint provisioning | [docs](https://docs.databricks.com/aws/en/oltp/) | -| **databricks-model-serving** | Model Serving endpoints, AI Gateway, served entities, scaling config | [docs](https://docs.databricks.com/aws/en/machine-learning/model-serving/) | - -Skills are loaded by Claude Code on demand. When you ask Claude to "wire up Vector Search," it should read the Databricks pipeline/model-serving guidance *before* writing YAML, so the output reflects current Databricks API shapes — not stale training data. - -### Pillar 3 — Claude Code as the implementation surface - -Spec-Kit produces the specs. The Databricks skills provide platform expertise. **Claude Code orchestrates both**: every phase artifact and every code file in this repo was authored by prompting Claude Code with the spec/plan/tasks as context. - -The workflow looks like: - -1. `/speckit-specify` → Claude writes spec.md from a natural-language description, you iterate via `/speckit-clarify` until ambiguity is resolved. -2. `/speckit-plan` → Claude consults the constitution + Databricks skills, drafts plan.md with research decisions and architecture. -3. `/speckit-tasks` → Claude generates a dependency-ordered task list grouped by user story (P1, P2, P3). -4. `/speckit-implement` → Claude writes the actual SQL/Python/YAML, one task at a time, committing per task. -5. Operational loops: when the deploy hits unexpected issues (it always does), Claude reads the runbook, fixes the issue, updates the runbook, commits. - -The "AI-driven" part isn't "the AI did it for you" — it's "the AI carries the boring parts (boilerplate YAML, retry-loop scripts, dependency analysis) so you focus on the actually-hard parts (what the spec should say, what the constitution should require)." - ---- - -## Deploy ordering: foundation → consumers - -DABs deploy *everything in one shot*. But our resources have a chicken-and-egg problem on a fresh workspace: - -``` - ┌────────────────────────────────────────────────┐ - │ What "bundle deploy" tries to create: │ - │ │ - │ ▸ Pipeline ────┐ │ - │ ▸ Tables ────┼──── all need each other │ - │ ▸ Vector idx ───┤ │ - │ ▸ Model ───┤ Monitor wants the │ - │ ▸ Endpoint ────┤ KPI table to exist │ - │ ▸ App ───┤ BEFORE it can attach │ - │ ▸ Monitor ────┘ │ - │ ▸ Lakebase ──── │ - └────────────────────────────────────────────────┘ - - Endpoint needs a registered model version. - Model version needs the model logged. - Model logging needs the agent code. - Monitor needs the table populated. - Table needs the pipeline to run. - - ▶ Single `bundle deploy` → 4+ errors on a fresh workspace. 
-``` - -The fix is a **staged deploy** orchestrated by `scripts/bootstrap-dev.sh`. Resources are split into two directories by data dependency: - -``` - resources/ - ├── foundation/ ← no data deps — deploy first - │ ├── catalog.yml (schema + volume + grants) - │ ├── doc_intel.pipeline.yml - │ ├── retention.job.yml - │ └── lakebase_instance.yml - │ - └── consumers/ ← need foundation to be RUNNING and producing data - ├── agent.serving.yml (needs registered model version) - ├── kpi_drift.yml (needs gold_filing_kpis table) - ├── filings_index.yml (VS endpoint) - ├── index_refresh.job.yml (needs source table) - ├── analyst.app.yml (needs Lakebase + agent endpoint) - ├── usage.dashboard.yml - └── lakebase_catalog.yml (needs instance AVAILABLE) -``` - -**The bootstrap script auto-detects which mode to run** by checking whether the agent serving endpoint already has a populated config: - -``` - does analyst-agent-${target} have served entities? - │ - no ◀───────┴───────▶ yes - │ │ - ▼ ▼ - ┌──────────────────┐ ┌──────────────────┐ - │ FIRST-DEPLOY │ │ STEADY-STATE │ - │ (staged) │ │ (full deploy) │ - ├──────────────────┤ ├──────────────────┤ - │ 1. temp-rename │ │ 1. bundle deploy │ - │ consumers/* │ │ (full bundle) │ - │ .yml.skip │ │ │ - │ 2. bundle deploy │ │ 2. refresh data: │ - │ (foundation) │ │ upload, run │ - │ 3. produce data: │ │ pipeline, │ - │ upload, run, │ │ register new │ - │ register │ │ model version │ - │ model │ │ + repoint │ - │ 4. wait Lakebase │ │ serving in- │ - │ AVAILABLE │ │ place │ - │ 5. restore yamls │ │ │ - │ 6. bundle deploy │ │ │ - │ (full bundle) │ │ │ - └────────┬─────────┘ └────────┬─────────┘ - │ │ - └───────────┬───────────┘ - ▼ - ┌──────────────────────────┐ - │ Common to both: │ - │ • bundle run analyst_app│ - │ • UC grants chain │ - │ • smoke check │ - └──────────────────────────┘ -``` - -**Why two modes?** DAB tracks resource state; if you run the temp-rename trick against an *existing* deployment, DAB sees the consumer YAMLs as removed and plans to **delete** the serving endpoint, app, monitor, etc. Safe-ish on a fresh workspace; destructive in steady-state. The script detects mode and does the right thing. - -CI (`.github/workflows/deploy.yml`) assumes steady-state — the first-ever bring-up of a workspace must be done locally with `./scripts/bootstrap-dev.sh`. After that, every push to `main` runs the steady-state path: full `bundle deploy` → refresh data → repoint serving endpoint → grants → CLEARS gate. - -Full breakdown in [`docs/runbook.md`](./docs/runbook.md). - ---- - ## CLEARS quality gate -Before any deploy reaches production, an evaluation must pass. This is constitution principle V — eval-gated agents. +Before any deploy reaches production, an evaluation must pass (constitution principle V — eval-gated agents). 
``` evals/dataset.jsonl (30 questions: 20 single-filing P2 + 10 cross-company P3) @@ -652,93 +324,21 @@ For day-2 ops (rolling agent versions, debugging low quality scores, inspecting ``` databricks/ ├── databricks.yml # Bundle root — variables + dev/prod targets -├── README.md # This file -├── CLAUDE.md # Runtime guidance for Claude Code sessions -├── CONTRIBUTING.md # Contribution guidelines -├── SECURITY.md # Identity modes, OBO, grants -├── PRODUCTION_READINESS.md # Reference / Pilot / Production checklists -├── VALIDATION.md # Validation procedure with expected outputs -├── REAL_10K_PILOT.md # Real EDGAR pilot guidance -├── LICENSE # MIT -│ -├── pipelines/sql/ # Lakeflow SDP — Bronze → Silver → Gold (SQL) -│ ├── 01_bronze.sql # Auto Loader BINARYFILE ingest + size filter -│ ├── 02_silver_parse.sql # ai_parse_document → VARIANT -│ ├── 03_gold_classify_extract.sql # ai_classify + ai_extract → typed KPIs -│ └── 04_gold_quality.sql # 5-dim rubric → embed_eligible filter -│ -├── agent/ # Mosaic AI Agent Framework -│ ├── analyst_agent.py # mlflow.pyfunc model + routing -│ ├── retrieval.py # Hybrid search + re-rank + OBO VS client -│ ├── supervisor.py # Cross-company fan-out -│ ├── tools.py # UC Function tool over gold_filing_kpis -│ ├── _obo.py # On-behalf-of credentials helpers -│ ├── log_and_register.py # Register + auth_policy + alias -│ └── tests/ # pytest unit tests -│ -├── app/ # Streamlit App on Databricks Apps -│ ├── app.py # Chat UI + citations + thumbs feedback + OBO -│ ├── lakebase_client.py # psycopg writes to query_logs / feedback -│ ├── app.yaml # App runtime config (port, CORS, XSRF) -│ └── README.md # App-specific runtime + local-dev notes -│ -├── evals/ # MLflow CLEARS eval gate -│ ├── dataset.jsonl # 30 hand-authored questions (P2 + P3) -│ └── clears_eval.py # mlflow.evaluate(model_type="databricks-agent") -│ -├── jobs/ # Lakeflow Jobs Python tasks -│ ├── retention/prune_volume.py # 90-day raw PDF cleanup -│ └── index_refresh/sync_index.py # Vector Search SYNC INDEX -│ -├── resources/ # DAB resources, split by data dependency -│ ├── foundation/ # Stage 1 — no data deps -│ └── consumers/ # Stage 2 — depend on foundation data -│ -├── scripts/ # Operational scripts -│ ├── bootstrap-dev.sh # Fresh-workspace bring-up (staged deploy) -│ └── wait_for_kpis.py # Poll helper used by bootstrap + CI -│ -├── samples/ # Synthetic 10-Ks for smoke tests + eval -│ ├── synthesize.py # Reproducible PDF generator -│ ├── ACME_10K_2024.pdf -│ ├── BETA_10K_2024.pdf -│ ├── GAMMA_10K_2024.pdf -│ └── garbage_10K_2024.pdf # SC-006 negative test (low quality) -│ -├── specs/ # Spec-Kit artifacts -│ └── 001-doc-intel-10k/ -│ ├── spec.md # What & why -│ ├── plan.md # Tech stack + Constitution Check -│ ├── tasks.md # Dependency-ordered implementation tasks -│ ├── research.md # Decision log -│ ├── data-model.md # Entity → table mapping -│ ├── quickstart.md # 30-min deploy walkthrough -│ └── contracts/ # JSON schemas for KPIs + agent I/O -│ -├── docs/ -│ └── runbook.md # Day-2 ops + bring-up workflow -│ -├── .specify/ # Spec-Kit machinery (constitution, hooks) -│ ├── memory/constitution.md # Six non-negotiable principles -│ └── extensions.yml # Auto-commit hooks per phase -│ -└── .github/workflows/ - └── deploy.yml # PR validate; main → steady-state deploy + CLEARS gate - # (first-ever bring-up must be done locally via bootstrap-dev.sh) +├── pipelines/sql/ # Lakeflow SDP — Bronze → Silver → Gold (SQL only) +├── agent/ # Mosaic AI Agent Framework — pyfunc, retrieval, OBO +├── app/ # Streamlit on Databricks 
Apps + Lakebase client +├── evals/ # MLflow CLEARS eval gate (dataset + runner) +├── jobs/ # Lakeflow Jobs (retention, index refresh) +├── resources/foundation/ # DAB resources with no data deps +├── resources/consumers/ # DAB resources that depend on foundation data +├── scripts/ # bootstrap-dev.sh + helpers +├── samples/ # Synthetic 10-K PDFs (regenerable) +├── specs/001-doc-intel-10k/ # Spec-Kit artifacts (spec, plan, tasks, etc.) +├── docs/ # design.md (this repo's "why") + runbook.md (day-2 ops) +└── .specify/ # Spec-Kit machinery (constitution, hooks) ``` ---- - -## What you can learn from this repo - -- **How to wire `ai_parse_document` into Lakeflow SDP** — pattern for streaming-tables + `STREAM(...)` views + `APPLY CHANGES INTO` keyed on filename. -- **How to score document quality before retrieval** — five 0–6 dimensions in SQL, threshold filter on the index source. -- **How to log a Mosaic AI agent to UC** — `mlflow.pyfunc` with both inputs *and* outputs in the signature (UC requirement), `AnyType` for variable-shape fields, `auth_policy` + `resources` for OBO. -- **How to ground an agent with citations** — hybrid Vector Search → re-rank → top-k → LLM with explicit "cite sources [1] [2]" prompt. -- **How to handle DAB deploy ordering** — chicken-egg dependencies between heterogeneous resources, solved with a 5-step bootstrap rather than `depends_on` (which DAB doesn't reliably honor across resource types). -- **How to gate deploys on MLflow eval** — `mlflow.evaluate(model_type="databricks-agent")` with documented metric keys, per-axis thresholds, exit-code gate in CI. -- **How to do end-to-end OBO** — `ModelServingUserCredentials` from `databricks_ai_bridge`, `CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS` for Vector Search, MLflow `auth_policy` with `model-serving` + `vector-search` user scopes, App-side `user_api_scopes` declaration. -- **How Spec-Kit + Claude Code + Databricks skills compose** — every artifact in `specs/` and `pipelines/` and `agent/` was generated through that loop. +Top-level docs: [`CLAUDE.md`](./CLAUDE.md) (runtime guidance for Claude Code), [`CONTRIBUTING.md`](./CONTRIBUTING.md), [`SECURITY.md`](./SECURITY.md), [`PRODUCTION_READINESS.md`](./PRODUCTION_READINESS.md), [`VALIDATION.md`](./VALIDATION.md), [`REAL_10K_PILOT.md`](./REAL_10K_PILOT.md), [`LICENSE`](./LICENSE). --- @@ -780,7 +380,5 @@ Released under the [**MIT License**](./LICENSE) — Copyright (c) 2026 Sathish K - [**Spec-Kit**](https://github.com/github/spec-kit) — spec-driven development workflow for AI coding agents. - [**Claude Code**](https://claude.com/claude-code) — Anthropic's CLI for AI-assisted development. -- [**Anthropic Skills**](https://github.com/anthropics/skills) — general-purpose Claude Code skill bundles. -- [**Databricks Lakehouse + Mosaic AI**](https://www.databricks.com/) — Unity Catalog, Lakeflow Spark Declarative Pipelines, Mosaic AI Vector Search, Agent Framework, Model Serving, AI Gateway, Databricks Apps, Lakebase, Lakehouse Monitoring. - -The 10-K analyst pattern is inspired by Databricks's own reference architecture for governed agent applications. +- [**Agent Skills**](https://github.com/anthropics/skills) — general-purpose Claude Code skill bundles. +- [**Databricks**](https://www.databricks.com/) — Unity Catalog, Lakeflow Spark Declarative Pipelines, Mosaic AI Vector Search, Agent Framework, Model Serving, AI Gateway, Databricks Apps, Lakebase, Lakehouse Monitoring. 
diff --git a/docs/_social_preview.py b/docs/_social_preview.py index f9a46a9..4bc55e1 100644 --- a/docs/_social_preview.py +++ b/docs/_social_preview.py @@ -26,17 +26,30 @@ ACCENT = "#FF3621" # Databricks orange LINE = "#252D3F" # subtle separator -# Arial bundles ship on macOS, support a wide glyph set including arrows, -# and have explicit Regular/Bold/Black files (no .ttc index guessing). -FONT_REG = "/System/Library/Fonts/Supplemental/Arial.ttf" -FONT_BOLD = "/System/Library/Fonts/Supplemental/Arial Bold.ttf" -FONT_BLACK = "/System/Library/Fonts/Supplemental/Arial Black.ttf" +# Prefer macOS Arial for local generation, but fall back to Liberation Sans in +# Linux devcontainers. +FONT_CANDIDATES = { + "regular": [ + "/System/Library/Fonts/Supplemental/Arial.ttf", + "/usr/share/fonts/truetype/liberation2/LiberationSans-Regular.ttf", + ], + "bold": [ + "/System/Library/Fonts/Supplemental/Arial Bold.ttf", + "/usr/share/fonts/truetype/liberation2/LiberationSans-Bold.ttf", + ], + "black": [ + "/System/Library/Fonts/Supplemental/Arial Black.ttf", + "/usr/share/fonts/truetype/liberation2/LiberationSans-Bold.ttf", + ], +} OUT = Path(__file__).parent / "social-preview.png" def font(size: int, weight: str = "regular") -> ImageFont.FreeTypeFont: - path = {"regular": FONT_REG, "bold": FONT_BOLD, "black": FONT_BLACK}[weight] + path = next((p for p in FONT_CANDIDATES[weight] if Path(p).exists()), None) + if path is None: + raise FileNotFoundError(f"No usable font found for weight={weight!r}") return ImageFont.truetype(path, size) @@ -70,7 +83,7 @@ def main() -> None: # One-line architecture summary, near bottom. ASCII arrows guarantee # glyph coverage across any future font swap. arch_f = font(22, "bold") - arch_text = "ai_parse_document -> typed KPIs -> Vector Search -> cited agent on Mosaic AI" + arch_text = "ai_parse_document -> typed KPIs -> Vector Search -> eval-gated cited agent" d.text((margin, H - margin - 80), arch_text, font=arch_f, fill=FG) # Separator + footer. diff --git a/docs/design.md b/docs/design.md new file mode 100644 index 0000000..f405159 --- /dev/null +++ b/docs/design.md @@ -0,0 +1,355 @@ +# Design — Databricks Document Intelligence Agent + +This document covers the *why*, the architecture, and the build workflow behind the repo. For setup and day-to-day use, see [`README.md`](../README.md). For day-2 ops, see [`runbook.md`](./runbook.md). 

+## Table of contents
+
+- [Why this exists](#why-this-exists)
+- [Architecture](#architecture)
+  - [Two halves: an offline pipeline, and an online agent](#two-halves-an-offline-pipeline-and-an-online-agent)
+  - [Vector Search bridges data and agent](#vector-search-bridges-data-and-agent)
+  - [Agent has two paths, one endpoint](#agent-has-two-paths-one-endpoint)
+  - [Runtime stack](#runtime-stack)
+- [How it's built — three pillars](#how-its-built--three-pillars)
+  - [Pillar 1 — Spec-Kit](#pillar-1--spec-kit-spec-driven-development)
+  - [Pillar 2 — Databricks Asset Bundles + the Claude Code skill suite](#pillar-2--databricks-asset-bundles--the-claude-code-skill-suite)
+  - [Pillar 3 — Claude Code as the implementation surface](#pillar-3--claude-code-as-the-implementation-surface)
+- [Deploy ordering: foundation → consumers](#deploy-ordering-foundation--consumers)
+- [What you can learn from this repo](#what-you-can-learn-from-this-repo)
+
+---
+
+## Why this exists
+
+Databricks shipped a lot of new generative-AI surface area in 2025–2026: `ai_parse_document`, Mosaic AI Vector Search, the Agent Framework, AI Gateway, Lakebase, Databricks Apps. Tutorials show each piece in isolation; few show them wired together with **eval gates, governance, and reproducible deploys** the way you'd actually ship to analysts.
+
+This repo is that worked example. Drop a PDF into a governed UC volume; ten minutes later, an analyst can ask cited questions in plain English with end-to-end audit. The whole stack is described declaratively as one **Databricks Asset Bundle (DAB)** plus a small bootstrap script. DAB manages catalog/schema/volume, pipeline, jobs, the Vector Search **endpoint**, the Lakebase instance, the serving endpoint, the monitor, the app, and the dashboard; the Vector Search **index** itself is created and synced by `jobs/index_refresh/sync_index.py` (DAB doesn't yet manage indexes as a resource type; a sketch follows below), and the agent model version is registered by `agent/log_and_register.py`. The bootstrap script orchestrates them in the right order.
+
+It also demonstrates a development workflow: **Spec-Kit** for spec-driven design, **Claude Code** with Databricks skill bundles for AI-assisted implementation, and six **non-negotiable constitution principles** that gate every plan. See [How it's built](#how-its-built--three-pillars).
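
+
+As a concrete sketch of that split, here is roughly what `jobs/index_refresh/sync_index.py` amounts to, assuming the `databricks-vectorsearch` SDK (the endpoint, index, key, and embedding-model names are illustrative):
+
+```python
+from databricks.vector_search.client import VectorSearchClient
+
+# Illustrative names: the real values come from bundle variables. The endpoint
+# is DAB-managed; this script owns the index itself.
+ENDPOINT = "filings-vs-endpoint"
+INDEX = "main.doc_intel.filings_sections_index"
+
+client = VectorSearchClient()
+try:
+    # Steady state: the index already exists, so trigger a Delta-Sync refresh.
+    client.get_index(endpoint_name=ENDPOINT, index_name=INDEX).sync()
+except Exception:
+    # First run: materialize the index from the quality-gated Gold table.
+    client.create_delta_sync_index(
+        endpoint_name=ENDPOINT,
+        index_name=INDEX,
+        source_table_name="main.doc_intel.gold_filing_sections",
+        pipeline_type="TRIGGERED",
+        primary_key="section_id",  # assumed key column
+        embedding_source_column="summary",
+        embedding_model_endpoint_name="databricks-gte-large-en",  # assumed
+    )
+```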
+ +--- + +## Architecture + +### Two halves: an offline pipeline, and an online agent + +``` + ╔═══════════════════════════════════════════════════════════════════╗ + ║ pipelines/sql/ (one SQL file per tier) ║ + ╚═══════════════════════════════════════════════════════════════════╝ + + raw_filings/ ┌─────────────────┐ ┌─────────────────┐ ┌──────────────────┐ + ACME_10K.pdf ──▶ │ bronze_filings │──▶│ silver_parsed_ │──▶│ gold_filing_ │ + BETA_10K.pdf │ (raw bytes, │ │ filings (parsed │ │ sections (one │ + GAMMA_10K.pdf │ filename, │ │ VARIANT — │ │ row per parsed │ + │ ingested_at) │ │ ai_parse_ │ │ $.sections[*]; │ + │ │ │ document) │ │ fallback to │ + │ >50MB rejects: │ │ │ │ full_document │ + │ bronze_filings │ │ Status: ok / │ │ if absent) │ + │ _rejected │ │ partial / error │ │ │ + └─────────────────┘ └─────────────────┘ │ gold_filing_kpis │ + 01_bronze.sql 02_silver_parse │ (typed columns: │ + .sql │ segment_revenue │ + │ ARRAY, │ + │ top_risks │ + │ ARRAY) │ + └──────────────────┘ + 03_gold_classify + _extract.sql + │ + ▼ + ┌──────────────────┐ + │ gold_filing_ │ + │ quality │ + │ (5-dim rubric: │ + │ parse, layout, │ + │ ocr, sections, │ + │ kpi → 0-30) │ + └──────────────────┘ + 04_gold_quality.sql +``` + +**Key idea — "parse once, extract many":** PDFs are expensive to parse. Silver runs `ai_parse_document` exactly once per file and stores the structured result as a `VARIANT`. Everything downstream — classification, KPI extraction, summarization, quality scoring — reads the parsed output, never the raw bytes. This is a non-negotiable constitution principle. + +**Triggering**: prod runs the pipeline in `continuous: true` mode so Auto Loader (`read_files`) reacts to new PDFs in the volume automatically. Dev overrides to `continuous: false` to avoid a 24/7 cluster during smoke iterations. See `resources/foundation/doc_intel.pipeline.yml` and the dev override block in `databricks.yml`. + +### Vector Search bridges data and agent + +``` + gold_filing_sections ┌─────────────────────────┐ + (governed Delta table) ─────▶ │ Mosaic AI Vector │ + │ Search Index │ + Filter: embed_eligible=true │ (Delta-Sync — auto- │ + Embed column: "summary" │ refreshes when Gold │ + │ updates) │ + └─────────────────────────┘ + + Why "summary" not the raw text? + ───────────────────────────── + Embedding a 50-page 10-K verbatim is noisy. We embed an LLM-written + summary instead — tighter, more searchable. Constitution principle IV: + "Quality before retrieval." +``` + +**Ownership note**: DAB manages the Vector Search **endpoint** (`resources/consumers/filings_index.yml`) and the index-refresh **job** (`resources/consumers/index_refresh.job.yml`). The **index** itself isn't yet a DAB-managed resource type as of CLI 0.298 — `jobs/index_refresh/sync_index.py` creates the Delta-Sync index on first run and triggers a sync on subsequent runs. That's why the bootstrap script's stage-2 deploy creates the endpoint + job, and the job's first execution materializes the actual index. + +### Agent has two paths, one endpoint + +``` + User question + │ + ▼ + ┌────────────────────────────────────────────┐ + │ AnalystAgent.predict() │ + │ ───────────────────── │ + │ contains "compare" / "vs" / │ + │ "between" + ≥2 company names? │ + └────────────┬─────────────────┬─────────────┘ + │ no │ yes + ▼ ▼ + ┌──────────────────────┐ ┌──────────────────────┐ + │ Single-filing path │ │ Supervisor path │ + │ │ │ │ + │ 1. Hybrid search │ │ For each company: │ + │ (keyword + vec) │ │ ▸ run analyst path │ + │ 2. 
Re-rank → top 5 │ │ ▸ pull KPIs from │ + │ 3. LLM generates │ │ gold_filing_kpis │ + │ answer w/ [1] [2] │ │ Format markdown │ + │ citations │ │ table with cites. │ + └──────────────────────┘ └──────────────────────┘ + │ │ + └────────┬────────┘ + ▼ + ┌──────────────────────┐ + │ Response JSON: │ + │ answer │ + │ citations[] │ + │ grounded: bool │ + │ latency_ms │ + └──────────────────────┘ +``` + +The agent is an `mlflow.pyfunc` model registered in Unity Catalog and served behind an **AI Gateway** (rate limiting per-user, usage tracking, inference-table audit). Identity passthrough is implemented at the *App layer* when the workspace has Databricks Apps user-token passthrough enabled: the Streamlit app extracts the user's `x-forwarded-access-token` header and constructs a user-scoped `WorkspaceClient`. The served model is OBO-ready via MLflow `auth_policy` and Model Serving user credentials. If app-level passthrough is not enabled, the app falls back to service-principal auth and the repo must be treated as a reference/dev deployment, not a production row-level-security deployment. See [`../SECURITY.md`](../SECURITY.md) and [`../app/README.md`](../app/README.md). + +### Runtime stack + +``` + ┌──────────────────────────────────────────────────────────────────┐ + │ │ + │ Databricks App (Streamlit) ← user interacts here │ + │ app/app.py │ + │ │ + │ ┌────────────────┐ ┌──────────────────┐ │ + │ │ Chat input box │ │ Citation chips │ │ + │ │ Thumbs up/down │ │ Markdown tables │ │ + │ └────────┬───────┘ └─────┬────────────┘ │ + │ │ │ │ + └──────────────│─────────────────│─────────────────────────────────┘ + │ │ + │ query │ feedback writes + ▼ ▼ + ┌────────────────────────┐ ┌────────────────────────┐ + │ Model Serving endpoint │ │ Lakebase Postgres │ + │ "analyst-agent-dev" │ │ ───────────────── │ + │ (CPU, scales to 0) │ │ conversation_history │ + │ │ │ query_logs │ + │ + AI Gateway: │ │ feedback │ + │ rate limit │ │ │ + │ (per-user key) │ │ (Postgres for tiny │ + │ inference-table │ │ per-turn writes — │ + │ audit │ │ Delta isn't great │ + │ usage tracking │ │ at row-by-row) │ + └────────────────────────┘ └────────────────────────┘ + + OBO (user identity end-to-end, when enabled): + ────────────────────────────── + App reads `x-forwarded-access-token` from the request, builds + `WorkspaceClient(token=...)`, calls the serving endpoint with the + user's identity. The agent-side MLflow auth policy and Model Serving + OBO credentials let downstream calls run as the user. If the app-side + feature is unavailable, the bootstrap script prints an explicit warning + and the deployment remains reference/dev only. +``` + +**Why Postgres for state?** Delta tables are great for analytics but bad at "insert one tiny row per chat turn at high frequency." Lakebase is Databricks's managed Postgres — same governance, right tool for the job. + +--- + +## How it's built — three pillars + +This repo combines three things: Spec-Kit for spec-driven design, Databricks Asset Bundles + Claude Code skill bundles for declarative platform work, and Claude Code as the implementation surface. + +### Pillar 1 — Spec-Kit (spec-driven development) + +[Spec-Kit](https://github.com/github/spec-kit) is a workflow that forces you to write — and *clarify* — a specification before writing code. 
Each phase is a slash-command in Claude Code that produces a checked-in artifact:
+
+```
+  /speckit-specify    →  specs/<feature>/spec.md     What & why (no how)
+         │
+         ▼
+  /speckit-clarify    →  appended Q&A in spec.md     Resolve ambiguity
+         │
+         ▼
+  /speckit-plan       →  specs/<feature>/plan.md     Tech stack + structure
+         │               + research.md, data-model.md,
+         │                 contracts/, quickstart.md
+         ▼
+  /speckit-tasks      →  specs/<feature>/tasks.md    Dependency-ordered tasks
+         │
+         ▼
+  /speckit-analyze    →  cross-artifact consistency check
+         │
+         ▼
+  /speckit-implement  →  the actual code
+```
+
+`.specify/extensions.yml` auto-commits at each phase boundary so the trail is clean. `.specify/memory/constitution.md` defines six **non-negotiable principles** every plan must respect:
+
+| # | Principle | What it means |
+|---|---|---|
+| I | **Unity Catalog source of truth** | Every table, volume, model, index, endpoint lives under `<catalog>.<schema>` — no DBFS, no workspace-local resources |
+| II | **Parse once, extract many** | `ai_parse_document` runs once at Silver → VARIANT; everything downstream reads the parsed output |
+| III | **Declarative over imperative** | SDP SQL pipelines, Lakeflow Jobs, DAB resources — no production notebooks |
+| IV | **Quality before retrieval** | 5-dim rubric scores every section; only ≥22/30 reach the index. Embed `summary`, not raw text |
+| V | **Eval-gated agents** | MLflow CLEARS scores must clear thresholds before any deploy is considered complete |
+| VI | **Reproducible deploys** | `databricks bundle deploy -t <target>` recreates the entire stack; `dev` and `prod` parity enforced |
+
+When you read `specs/001-doc-intel-10k/plan.md` you'll see a "Constitution Check" gate that maps each design decision back to the principle it satisfies. When you read `specs/001-doc-intel-10k/tasks.md` you'll see how each task derives from the plan, and how user stories (P1, P2, P3) are independently demoable.
+
+### Pillar 2 — Databricks Asset Bundles + the Claude Code skill suite
+
+[**Databricks Asset Bundles**](https://docs.databricks.com/aws/en/dev-tools/bundles/) (DABs) describe most of the workspace state as YAML. One root `databricks.yml` declares variables and targets (`dev`, `prod`); `resources/**/*.yml` declares each resource (pipeline, jobs, Vector Search endpoint, index-refresh job, serving endpoint, app, monitor, dashboard, Lakebase instance + catalog). `databricks bundle deploy -t dev` reconciles workspace state to YAML. The two non-DAB-managed pieces — the Vector Search **index** itself and the registered **model version** — are produced at runtime by `jobs/index_refresh/sync_index.py` and `agent/log_and_register.py` respectively, which the bootstrap script orchestrates.
+
+This repo was built with Databricks-specific Claude Code skill bundles. Those bundles are distributed by Databricks via the CLI / Claude Code plugin channel and **are not vendored in this open-source tree** — install them locally if you have access, or reference the canonical Databricks docs (mapping in [`../CONTRIBUTING.md`](../CONTRIBUTING.md)).
+
+| Skill bundle | What it provides | Canonical docs |
+|---|---|---|
+| **databricks-core** | Auth, profiles, data exploration, bundle basics | [docs](https://docs.databricks.com/aws/en/dev-tools/cli/) |
+| **databricks-dabs** | DAB structure, validation, deploy workflow, target separation | [docs](https://docs.databricks.com/aws/en/dev-tools/bundles/) |
+| **databricks-pipelines** | Lakeflow Spark Declarative Pipelines (`ai_parse_document`, `ai_classify`, `ai_extract`, `APPLY CHANGES INTO`) | [docs](https://docs.databricks.com/aws/en/dlt/) |
+| **databricks-jobs** | Lakeflow Jobs with retries, schedules, table-update / file-arrival triggers | [docs](https://docs.databricks.com/aws/en/jobs/) |
+| **databricks-apps** | Databricks Apps (Streamlit), App resource bindings | [docs](https://docs.databricks.com/aws/en/dev-tools/databricks-apps/) |
+| **databricks-lakebase** | Lakebase Postgres instances, branches, computes, endpoint provisioning | [docs](https://docs.databricks.com/aws/en/oltp/) |
+| **databricks-model-serving** | Model Serving endpoints, AI Gateway, served entities, scaling config | [docs](https://docs.databricks.com/aws/en/machine-learning/model-serving/) |
+
+Skills are loaded by Claude Code on demand. When you ask Claude to "wire up Vector Search," it should read the Databricks pipeline/model-serving guidance *before* writing YAML, so the output reflects current Databricks API shapes — not stale training data.
+
+### Pillar 3 — Claude Code as the implementation surface
+
+Spec-Kit produces the specs. The Databricks skills provide platform expertise. **Claude Code orchestrates both**: every phase artifact and every code file in this repo was authored by prompting Claude Code with the spec/plan/tasks as context.
+
+The workflow looks like:
+
+1. `/speckit-specify` → Claude writes spec.md from a natural-language description; you iterate via `/speckit-clarify` until ambiguity is resolved.
+2. `/speckit-plan` → Claude consults the constitution + Databricks skills, drafts plan.md with research decisions and architecture.
+3. `/speckit-tasks` → Claude generates a dependency-ordered task list grouped by user story (P1, P2, P3).
+4. `/speckit-implement` → Claude writes the actual SQL/Python/YAML, one task at a time, committing per task.
+5. Operational loops: when the deploy hits unexpected issues (it always does), Claude reads the runbook, fixes the issue, updates the runbook, commits.
+
+AI-driven here means Claude carries the boring parts (boilerplate YAML, retry-loop scripts, dependency analysis) so you spend time on what the spec should say and what the constitution should require.
+
+---
+
+## Deploy ordering: foundation → consumers
+
+DABs deploy *everything in one shot*. But our resources have a chicken-and-egg problem on a fresh workspace:
+
+```
+   ┌────────────────────────────────────────────────┐
+   │  What "bundle deploy" tries to create:         │
+   │                                                │
+   │   ▸ Pipeline   ────┐                           │
+   │   ▸ Tables     ────┼──── all need each other   │
+   │   ▸ Vector idx ────┤                           │
+   │   ▸ Model      ────┤  Monitor wants the        │
+   │   ▸ Endpoint   ────┤  KPI table to exist       │
+   │   ▸ App        ────┤  BEFORE it can attach     │
+   │   ▸ Monitor    ────┘                           │
+   │   ▸ Lakebase   ────                            │
+   └────────────────────────────────────────────────┘
+
+   Endpoint needs a registered model version.
+   Model version needs the model logged.
+   Model logging needs the agent code.
+   Monitor needs the table populated.
+   Table needs the pipeline to run.
+
+   ▶ Single `bundle deploy` → 4+ errors on a fresh workspace.
+
+``` + +The fix is a **staged deploy** orchestrated by `scripts/bootstrap-dev.sh`. Resources are split into two directories by data dependency: + +``` + resources/ + ├── foundation/ ← no data deps — deploy first + │ ├── catalog.yml (schema + volume + grants) + │ ├── doc_intel.pipeline.yml + │ ├── retention.job.yml + │ └── lakebase_instance.yml + │ + └── consumers/ ← need foundation to be RUNNING and producing data + ├── agent.serving.yml (needs registered model version) + ├── kpi_drift.yml (needs gold_filing_kpis table) + ├── filings_index.yml (VS endpoint) + ├── index_refresh.job.yml (needs source table) + ├── analyst.app.yml (needs Lakebase + agent endpoint) + ├── usage.dashboard.yml + └── lakebase_catalog.yml (needs instance AVAILABLE) +``` + +**The bootstrap script auto-detects which mode to run** by checking whether the agent serving endpoint already has a populated config: + +``` + does analyst-agent-${target} have served entities? + │ + no ◀───────┴───────▶ yes + │ │ + ▼ ▼ + ┌──────────────────┐ ┌──────────────────┐ + │ FIRST-DEPLOY │ │ STEADY-STATE │ + │ (staged) │ │ (full deploy) │ + ├──────────────────┤ ├──────────────────┤ + │ 1. temp-rename │ │ 1. bundle deploy │ + │ consumers/* │ │ (full bundle) │ + │ .yml.skip │ │ │ + │ 2. bundle deploy │ │ 2. refresh data: │ + │ (foundation) │ │ upload, run │ + │ 3. produce data: │ │ pipeline, │ + │ upload, run, │ │ register new │ + │ register │ │ model version │ + │ model │ │ + repoint │ + │ 4. wait Lakebase │ │ serving in- │ + │ AVAILABLE │ │ place │ + │ 5. restore yamls │ │ │ + │ 6. bundle deploy │ │ │ + │ (full bundle) │ │ │ + └────────┬─────────┘ └────────┬─────────┘ + │ │ + └───────────┬───────────┘ + ▼ + ┌──────────────────────────┐ + │ Common to both: │ + │ • bundle run analyst_app│ + │ • UC grants chain │ + │ • smoke check │ + └──────────────────────────┘ +``` + +**Why two modes?** DAB tracks resource state; if you run the temp-rename trick against an *existing* deployment, DAB sees the consumer YAMLs as removed and plans to **delete** the serving endpoint, app, monitor, etc. Safe-ish on a fresh workspace; destructive in steady-state. The script detects mode and does the right thing. + +CI (`.github/workflows/deploy.yml`) assumes steady-state — the first-ever bring-up of a workspace must be done locally with `./scripts/bootstrap-dev.sh`. After that, every push to `main` runs the steady-state path: full `bundle deploy` → refresh data → repoint serving endpoint → grants → CLEARS gate. + +For the per-step procedure and known failure modes, see [`runbook.md` § Known deploy ordering gaps](./runbook.md#known-deploy-ordering-gaps-discovered-in-the-2026-04-24-smoke-test). + +--- + +## What you can learn from this repo + +- **Wiring `ai_parse_document` into Lakeflow SDP** — pattern for streaming-tables + `STREAM(...)` views + `APPLY CHANGES INTO` keyed on filename. +- **Scoring document quality before retrieval** — five 0–6 dimensions in SQL, threshold filter on the index source. +- **Logging a Mosaic AI agent to UC** — `mlflow.pyfunc` with both inputs *and* outputs in the signature (UC requirement), `AnyType` for variable-shape fields, `auth_policy` + `resources` for OBO. +- **Grounding an agent with citations** — hybrid Vector Search → re-rank → top-k → LLM with explicit "cite sources [1] [2]" prompt. +- **Handling DAB deploy ordering** — chicken-egg dependencies between heterogeneous resources, solved with a 5-step bootstrap rather than `depends_on` (which DAB doesn't reliably honor across resource types). 
+- **Gating deploys on MLflow eval** — `mlflow.evaluate(model_type="databricks-agent")` with documented metric keys, per-axis thresholds, exit-code gate in CI. +- **End-to-end OBO** — `ModelServingUserCredentials` from `databricks_ai_bridge`, `CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS` for Vector Search, MLflow `auth_policy` with `model-serving` + `vector-search` user scopes, App-side `user_api_scopes` declaration. +- **Spec-Kit + Claude Code + Databricks skills composing** — every artifact in `specs/` and `pipelines/` and `agent/` was generated through that loop. diff --git a/docs/social-preview.png b/docs/social-preview.png index 735c2e7..d02501b 100644 Binary files a/docs/social-preview.png and b/docs/social-preview.png differ diff --git a/specs/001-doc-intel-10k/plan.md b/specs/001-doc-intel-10k/plan.md index d387a3a..8b6e801 100644 --- a/specs/001-doc-intel-10k/plan.md +++ b/specs/001-doc-intel-10k/plan.md @@ -124,7 +124,7 @@ Output: [research.md](./research.md). Decisions captured: | Idempotency | `APPLY CHANGES INTO` keyed on `filename` for Silver and Gold | SDP native CDC, deterministic on re-upload, no Python helper | Hand-rolled MERGE (rejected: more code paths); content hash key (deferred — filename is sufficient for v1) | | Quality rubric | 5 dimensions × 0–6 scale; threshold ≥ 22/30; computed via `ai_query` calls in `04_gold_quality.sql` | Mirrors Reffy's 31-point pattern; SQL-native means no Python helper; explicit dimensions help debug rejections | Single `extraction_confidence` (rejected: no debuggability); 3-dim avg (rejected: too coarse) | | Vector Search index | Delta-Sync index over `gold_filing_sections` filtered by `embed_eligible`; embed `summary` column | Managed sync, no manual refresh; embeds curated content per principle IV | Direct Vector Index (rejected: no managed sync); embedding raw `parsed.text_full` (rejected: noise) | -| Retrieval strategy | Hybrid (keyword + semantic) top-25 → re-rank → top-5 | Reffy pattern; re-rank improves relevance materially; CPU re-rank stays in budget | Pure semantic (rejected: misses exact filings/years); re-rank against top-100 (rejected: latency budget) | +| Retrieval strategy | Hybrid (keyword + semantic) top-25 → re-rank → top-5 | Reffy pattern; re-rank tightens top-5 ordering; CPU re-rank stays in budget | Pure semantic (rejected: misses exact filings/years); re-rank against top-100 (rejected: latency budget) | | Agent framework | Mosaic AI Agent Framework via `databricks-agents` SDK + MLflow `pyfunc` | First-class Knowledge Assistant + Supervisor primitives; logged + registered in UC | LangGraph standalone (rejected: more glue, no UC registration story) | | Serving | CPU instance behind AI Gateway; identity passthrough on | Cost-first per Reffy; Gateway gives audit + rate limit + on-behalf-of | GPU (rejected: not needed at scale of pilot); raw endpoint (rejected: no governance layer) | | State store | Lakebase Postgres (managed) | Native to platform, low-latency reads/writes, fits Reffy pattern; integrates with Apps | Delta tables (rejected: write throughput on small turn-level updates); external Postgres (rejected: governance gap) | diff --git a/specs/001-doc-intel-10k/quickstart.md b/specs/001-doc-intel-10k/quickstart.md index 4447677..e3152de 100644 --- a/specs/001-doc-intel-10k/quickstart.md +++ b/specs/001-doc-intel-10k/quickstart.md @@ -1,6 +1,6 @@ # Quickstart: Deploy and Test the 10-K Analyst -Goal: from a clean clone, stand up the entire stack on the Databricks `dev` target and verify P1, P2, P3 acceptance 
scenarios in under 30 minutes. +Goal: from a clean clone, stand up the entire stack on the Databricks `dev` target and verify P1, P2, P3 acceptance scenarios in 15–25 minutes. ## Prerequisites diff --git a/specs/001-doc-intel-10k/research.md b/specs/001-doc-intel-10k/research.md index 3ac6e85..ef9d4c6 100644 --- a/specs/001-doc-intel-10k/research.md +++ b/specs/001-doc-intel-10k/research.md @@ -55,7 +55,7 @@ tunable as a bundle parameter. **Rationale**: Reffy reports keyword-only sub-2s but reasoning needs LLM generation. Hybrid keyword + semantic retrieval to top-25, then a Mosaic AI re-ranker (CPU) trim to top-5, keeps single-filing P95 ≤ 8s achievable -on CPU serving while improving relevance materially. Bigger windows blow +on CPU serving while improving top-5 ordering qualitatively. Bigger windows blow the latency budget; pure semantic misses exact ticker/year matches in financial filings.