# Rubric saturation analysis

After six iteration passes, the figure system has 109 examples
attached (one per slug on `main`) and 109 figures registered in
`FIGURES` in `src/marginalia.py`. Coverage is 100%. Distribution
against `docs/example-figure-rubric.md`:

| band | count | composition |
|---|---:|---|
| 9.5 | 3 | the canonical pictures (`variables`, `mutability`, `copying-collections`) |
| 9.0 | ~35 | strong mechanism, single move, runs match cell |
| 8.5 | ~55 | strong but honest reuse, or generic placeholders |
| 8.0 | ~16 | binding pictures, abstract pictures, weak reuses |

Mean ≈ 8.6. **No figure scores below 8.0.** No figure exceeds 9.5.
Pushing further requires changes to the rubric itself, because the
remaining drag comes from criteria that are structurally over-strict
for a library this size.
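
That mean follows directly from the band table. A throwaway check,
taking the `~` counts at face value (a sketch, not part of the
figure system):

```python
# Band score -> approximate figure count, copied from the table above.
bands = {9.5: 3, 9.0: 35, 8.5: 55, 8.0: 16}

total = sum(bands.values())                          # 109 figures
mean = sum(score * n for score, n in bands.items()) / total

print(total, round(mean, 2))  # 109 8.61
```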

## Why every figure cannot reach 9.0 under the current rubric

Two criteria in `docs/example-figure-rubric.md` cap most figures
at 8.5 by design:

### Criterion 2 — "Match the running variables (0–1.0)"

A figure loses up to 1.0 when its placeholders (`a`, `b`, `xs`) do
not match the cell's specific names (`first`, `second`, `factor`,
`numbers`). For a library of 109 figures across 109 cells, matching
running variables one-for-one would require 109 bespoke paint
functions; reuse becomes impossible. Today 12 figures are reused
across multiple slugs precisely because they capture a *general*
mechanism (`iter-protocol` covers `iterators`,
`iterator-vs-iterable`, `iterating-over-iterables`,
`container-protocols`). Every reuse pays a tax against this
criterion.

The criterion was written for a small boutique catalogue where one
figure per lesson is the norm. At 109 figures the cost of strict
matching is unbounded; the criterion's *intent* — "make the figure
recognisably about this cell, not a different lesson" — is satisfied
already by criterion 1 (cell fidelity) plus criterion 4 (mechanism).

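To make the reuse tax concrete, here is a minimal sketch of how a
slug-to-figure attachment map surfaces reuse. The dict shape is an
assumption for illustration; the real `FIGURES` registry in
`src/marginalia.py` may be structured differently, and only the slug
and figure names below come from this document:

```python
from collections import Counter

# slug -> attached figure id (illustrative subset of the 109 entries).
ATTACHMENTS = {
    "iterators": "iter-protocol",
    "iterator-vs-iterable": "iter-protocol",
    "iterating-over-iterables": "iter-protocol",
    "container-protocols": "iter-protocol",
    "mutability": "mutability",
}

# Under the current rubric, every attachment beyond the first pays
# the criterion-2 tax, because a shared figure cannot match every
# cell's running variables.
reused = {fig: n for fig, n in Counter(ATTACHMENTS.values()).items() if n > 1}
print(reused)  # {'iter-protocol': 4}
```
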
### Criterion 9 — "Independence from lesson figures (0–1.0)"

A journey-section figure scoring 9 elsewhere loses up to 1.0 when
attached to a related lesson. `iter-protocol` is the section figure
for *Iteration · See the protocol behind `for`* and the cell figure
for four iteration-adjacent lessons. The rubric docks the lesson
attachments on independence, even though they are the most honest
depiction available.

The intent was to prevent a journey-section figure from being
literally re-rendered as the only diagram on its constituent lesson
pages — that *would* read as redundant. But in our flow, the
journey-section figure already sits at `/journeys/<slug>`, and the
lesson appears alone at `/examples/<slug>`; readers don't see both
beside each other. The "independence" penalty fires regardless.

## What the rubric needs

Four upgrades would let further iteration produce visible quality
gains rather than just shuffling the same band.

### 1. Tier figures into **library** and **canonical**

A *library* figure is a primitive of the system: meant for reuse,
generic by design (e.g. `iter-protocol`, `branch-fork`,
`class-triangle`). A *canonical* figure is unique to one cell, with
that cell's specific running variables baked in (e.g.
`aliasing-mutation`, `mutability`'s three-state strip).

For library figures: criteria 2 (running variables) and 9
(independence) should be **non-scored**. Score them once at
registration; cap their attached score at 9.0 (not 10).

For canonical figures: criteria 2 and 9 stay as written. Cap at
9.5 only if the figure is *the* picture for that mechanism — the
9.5 band is supposed to be rare and definitive.

Result: ~70 library figures (today reuse-shaped) all reach 9.0;
~30 canonical figures reach 9.0–9.5 by being slug-specific.

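A minimal sketch of what the tier split could look like in code,
assuming a hypothetical `tier` field and `capped` helper (none of
these names exist in `src/marginalia.py` today):

```python
from dataclasses import dataclass

LIBRARY, CANONICAL = "library", "canonical"

@dataclass
class ScoredFigure:
    name: str
    tier: str         # LIBRARY or CANONICAL (hypothetical field)
    raw_score: float  # sum of the scored criteria

def capped(fig: ScoredFigure) -> float:
    """Apply the tier caps proposed above: library figures skip
    criteria 2 and 9 and cap at 9.0; canonical figures keep the
    full rubric and cap at 9.5."""
    cap = 9.0 if fig.tier == LIBRARY else 9.5
    return min(fig.raw_score, cap)

print(capped(ScoredFigure("iter-protocol", LIBRARY, 9.4)))        # 9.0
print(capped(ScoredFigure("aliasing-mutation", CANONICAL, 9.7)))  # 9.5
```
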
### 2. Replace criterion 2 with **"the figure earns its place"**

Strict variable-matching stops being informative at scale. The
better question is "does swapping in this figure improve the cell
versus showing no figure?" If yes, full credit. If the figure
contains marks the cell's prose doesn't motivate, deduct.

Practical rewrite of criterion 2 (0–1.0):

> The figure adds something the prose cannot show in the same word
> count: a relationship, a before/after, a hidden mechanism. A
> figure that merely restates the prose in diagram form earns 0.5;
> a figure that surfaces a relationship invisible in the prose
> earns 1.0.

This rewards genuine pedagogical value and accepts honest reuse.

### 3. Add a **caption rubric**

Captions today are scored only as "present" (criterion 5). Quality
varies: some assert ("Two names share one mutable list — appending
through one name changes the object visible through both."); others
hedge ("The figure shows..."). A separate 0–1.0 criterion:

> Caption declares what is true, in the section summary's voice;
> does not narrate what the figure does. "Two names share one list"
> earns 1.0; "Here we see two names" earns 0.

Captions written under this criterion will pull weak figures up by
~0.5 points.

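The narrate-versus-assert distinction is easy to mechanise as a
first-pass lint. A sketch; the opener list is illustrative, not an
agreed style rule:

```python
# Openers that narrate the figure instead of asserting a fact.
HEDGED_OPENERS = (
    "the figure shows", "here we see", "this diagram", "we can see",
)

def caption_asserts(caption: str) -> bool:
    """True if the caption declares what is true rather than
    narrating what the figure does."""
    return not caption.strip().lower().startswith(HEDGED_OPENERS)

print(caption_asserts("Two names share one mutable list."))  # True
print(caption_asserts("Here we see two names."))             # False
```
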
### 4. Add **page-level coherence**

Currently a slug with three attached figures scores each figure
independently. A page that ships three 8.5 figures is *worse* than
one 9.0 figure on the same page (cognitive load, redundancy). A
page-level rubric (0–1.0) would score:

> When multiple figures attach to one slug, they form a coherent
> set — different aspects of the same lesson, not three angles on
> the same point.

Today this is a manual judgement; codifying it would prevent the
inevitable "too many figures" failure mode as coverage grows.

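Codifying it could start as small as flagging the slugs where the
question even arises. A sketch; the multi-figure attachment data is
invented for illustration:

```python
# slug -> attached figure ids (invented data for illustration).
ATTACHMENTS = {
    "copying-collections": ["copy-shallow", "copy-deep", "aliasing-mutation"],
    "variables": ["variables"],
}

# Slugs with one figure pass by construction; slugs with several
# need the coherence question asked: different aspects of the same
# lesson, or three angles on the same point?
for slug, figs in ATTACHMENTS.items():
    if len(figs) > 1:
        print(f"{slug}: {len(figs)} figures, review as a set")
```
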
## What this turn changed

- Fixed the layout regression: cells stay 2-col throughout; figures
  live in banner rows *between* cells. `hello-world` now matches
  production.
- Six targeted figure refinements: `tuple-frozen` shows the frozen
  aspect (a struck-through `.append`); `literal-forms` shows specific
  literal spellings per type; `function-with-body` shows a specific
  function with its return value; spec/rubric docs updated to reflect
  banner-between in production.
- Documented the rubric saturation: a 9.0 floor isn't reachable for
  every figure under the current rubric without designing slug-
  specific paint code for ~70 reusable library figures, which sells
  reuse for marginal score gain.

The rubric upgrades above are what would make the next pass produce
visible quality gains rather than re-shuffling the same 8.5 band.