Audit unsupported-cell prose; fix three more examples with the same bug

claude · claude · commit dceb83408f40 · 2026-05-11T20:55:20.000Z
The networking fix exposed a pattern: any :::unsupported block where
the prose leads with "Dynamic Workers do not provide X" copies the
runtime caveat instead of explaining the code. An audit across all
four examples that contain :::unsupported blocks (networking,
subprocesses, threads-and-processes, virtual-environments) found
the same prose shape in the three I hadn't yet fixed.

TDD:

  RED — tests/test_example_content.py adds Contract 11
  (UnsupportedCellProseContract). The heuristic: each unsupported
  cell's prose must reference at least 2 code identifiers from
  its own code block (variable names, function calls, method
  names). Two is the floor that proves the prose discusses this
  specific code rather than a generic Workers note. Pre-fix, the
  per-example counts were (the first two fail the contract):

    - virtual-environments  1 ident referenced ("venv")
    - threads-and-processes 0 idents referenced
    - subprocesses          2 idents (passed bar, but still
                            weak — fixed for consistency)
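
For reference, a minimal standalone sketch of the counting heuristic.
The keyword set is trimmed, and CODE, BEFORE and AFTER are stand-ins
written for this note (assumptions, not the real cell contents). It
also shows that the count uses substring matching, so "builder" scores
a hit inside "envbuilder"; overlap_words is a hypothetical stricter
variant that is not part of the contract.

```python
import re

# Trimmed keyword set, for illustration only (assumption).
KEYWORDS = {"import", "from", "as", "with", "for", "in", "print",
            "true", "false", "none"}


def code_identifiers(code: str) -> set[str]:
    """Lowercased identifier-like tokens of 2+ characters, minus keywords."""
    return {w for w in re.findall(r"\b[a-z_][a-z_0-9]+\b", code.lower())
            if w not in KEYWORDS}


def overlap(prose: str, code: str) -> int:
    """Count identifiers that occur anywhere in the prose (substring match)."""
    prose = prose.lower()
    return sum(1 for ident in code_identifiers(code) if ident in prose)


def overlap_words(prose: str, code: str) -> int:
    """Hypothetical stricter variant: identifiers must match as whole words."""
    prose = prose.lower()
    return sum(1 for ident in code_identifiers(code)
               if re.search(rf"\b{re.escape(ident)}\b", prose))


# Stand-in cell contents (assumptions made for this sketch).
CODE = 'builder = venv.EnvBuilder(with_pip=False)\nbuilder.create(".venv")\n'
BEFORE = ("Dynamic Workers do not provide the `venv` module or a project "
          "environment workflow.")
AFTER = ("`venv.EnvBuilder` configures a new environment; `create` "
         "materialises it. `with_pip=False` skips pip.")

before_hits = overlap(BEFORE, CODE)       # only "venv" appears in the prose
after_hits = overlap(AFTER, CODE)         # includes "builder" via "envbuilder"
after_strict = overlap_words(AFTER, CODE)
```

With these stand-ins the caveat-only prose scores 1 and the rewritten
prose clears the floor of 2 under either matching rule.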

  GREEN — rewrite all three prose blocks so each explains what
  ITS specific code does, with the runtime caveat moved to a closing
  parenthetical:

    subprocesses           "subprocess.run spawns a child Python
                            interpreter, captures stdout/stderr
                            (capture_output=True), decodes as text
                            (text=True), and raises
                            CalledProcessError if it exits non-zero
                            (check=True). The returned result holds
                            the captured streams and exit code as
                            portable evidence the child ran."

    threads-and-processes  "ThreadPoolExecutor runs square across
                            two worker threads sharing the same
                            interpreter (and the GIL);
                            ProcessPoolExecutor runs pow across two
                            child processes with isolated memory.
                            Each pool.map returns an iterator over
                            results in input order, and the
                            surrounding with block joins the workers
                            when the body exits."

    virtual-environments   "venv.EnvBuilder configures the
                            description of a new environment, then
                            create('.venv') materialises it on disk
                            as a directory containing its own
                            interpreter and site-packages.
                            with_pip=False skips bootstrapping pip."
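
A quick standalone check of two behaviours the new prose claims,
assuming standard CPython outside Dynamic Workers. The child commands,
the inputs and the local square function are made up for this sketch
and are not the examples' real code.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Illustrative child command (assumption, not the example's command).
ok = subprocess.run(
    [sys.executable, "-c", "print('child ran')"],
    capture_output=True,  # collect stdout and stderr instead of inheriting them
    text=True,            # decode the captured bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit
)

# A child that exits non-zero makes check=True raise.
try:
    subprocess.run(
        [sys.executable, "-c", "import sys; sys.exit(3)"],
        capture_output=True, text=True, check=True,
    )
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
else:
    failed_code = None


def square(n: int) -> int:
    return n * n


# map yields results in input order, whichever thread finishes first;
# leaving the with block waits for the workers to finish.
with ThreadPoolExecutor(max_workers=2) as pool:
    squares = list(pool.map(square, [3, 1, 2]))
```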

  REFACTOR — Contract 11 now passes; 58 tests total.

The second audit dimension (figure-caption vs example-title keyword
overlap) found 24 suspect attachments. Spot-checking confirms all
24 are false positives: captions use different word forms than
titles (e.g., "mutable" vs "mutability", "comprehension" vs
"comprehensions") but are conceptually aligned. No contract added —
heuristic too noisy without manual review.
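
A small sketch of why exact keyword overlap misfires on word forms.
The audit script itself is not part of this commit, so the tokenizer,
the captions and the prefix rule below are all assumptions made for
illustration rather than the audit's actual logic.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def exact_overlap(caption: str, title: str) -> set[str]:
    """Tokens shared verbatim by caption and title."""
    return tokens(caption) & tokens(title)


def prefix_overlap(caption: str, title: str, min_prefix: int = 5) -> set[tuple[str, str]]:
    """Token pairs sharing their first min_prefix letters (a crude stemmer)."""
    return {
        (c, t)
        for c in tokens(caption)
        for t in tokens(title)
        if len(c) >= min_prefix and len(t) >= min_prefix
        and c[:min_prefix] == t[:min_prefix]
    }


# Hypothetical caption/title pairs using the word forms named above.
PAIRS = [
    ("A mutable list shared by two names", "Mutability"),
    ("One comprehension builds the list", "Comprehensions"),
]

exact = [exact_overlap(c, t) for c, t in PAIRS]
loose = [prefix_overlap(c, t) for c, t in PAIRS]
```

Both pairs have no exact overlap, so an exact-match audit would flag
them, while the prefix rule pairs them up. A prefix rule has false
matches of its own, which is one more reason to leave this dimension
to manual review.
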
diff --git a/src/asset_manifest.py b/src/asset_manifest.py
@@ -1,3 +1,3 @@
 # Generated by scripts/fingerprint_assets.py. Do not edit by hand.
 ASSET_PATHS = {'SITE_CSS': '/site.f9a6740c684b.css', 'SYNTAX_JS': '/syntax-highlight.3b6c7f730d46.js', 'EDITOR_JS': '/editor.dd81f5171b14.js'}
-HTML_CACHE_VERSION = '4a27d9ac9a3e'
+HTML_CACHE_VERSION = 'c48d15294056'
diff --git a/src/example_sources/subprocesses.md b/src/example_sources/subprocesses.md
@@ -31,7 +31,7 @@ print(result.returncode)
 :::
 
 :::unsupported
-Dynamic Workers do not provide child processes.
+`subprocess.run` spawns a child Python interpreter, captures its stdout and stderr (`capture_output=True`), decodes them as text (`text=True`), and raises `CalledProcessError` if the child exits non-zero (`check=True`). The returned `result` holds the captured streams and exit code as portable evidence the child ran. (This fragment runs in standard Python only — Dynamic Workers don't provide child processes.)
 
 ```python
 result = subprocess.run(
diff --git a/src/example_sources/threads-and-processes.md b/src/example_sources/threads-and-processes.md
@@ -29,7 +29,7 @@ print(ProcessPoolExecutor.__name__)
 :::
 
 :::unsupported
-Dynamic Workers do not provide native threads or child processes.
+`ThreadPoolExecutor` runs `square` across two worker threads sharing the same interpreter (and the GIL); `ProcessPoolExecutor` runs `pow` across two child processes with isolated memory. Each `pool.map` returns an iterator over results in input order, and the surrounding `with` block joins the workers when the body exits. (This fragment runs in standard Python only — Dynamic Workers don't provide native threads or child processes.)
 
 ```python
 with ThreadPoolExecutor(max_workers=2) as pool:
diff --git a/src/example_sources/virtual-environments.md b/src/example_sources/virtual-environments.md
@@ -30,7 +30,7 @@ with tempfile.TemporaryDirectory() as directory:
 :::
 
 :::unsupported
-Dynamic Workers do not provide the `venv` module or a project environment workflow.
+`venv.EnvBuilder` configures the description of a new environment, then `create(".venv")` materialises it on disk as a directory containing its own interpreter and `site-packages`. `with_pip=False` skips bootstrapping pip — useful when the venv is for an isolated tool that doesn't need to install third-party packages. (This fragment runs in standard Python only — Dynamic Workers don't provide the `venv` module or a project environment workflow.)
 
 ```python
 builder = venv.EnvBuilder(with_pip=False)
diff --git a/tests/test_example_content.py b/tests/test_example_content.py
@@ -0,0 +1,80 @@
+"""Content contracts for example markdown sources.
+
+These contracts check pedagogical correctness rather than geometry:
+does each cell's prose explain the code in that cell? Does the
+unsupported-cell prose lead with the lesson or with a runtime
+caveat?
+
+The contracts complement tests/test_marginalia_geometry.py
+(geometry, palette, registration) by catching content-shaped bugs
+that geometry contracts can't see.
+"""
+from __future__ import annotations
+
+import re
+import unittest
+
+from src.example_loader import load_examples
+
+
+_, EXAMPLES = load_examples()
+
+
+def _code_identifiers(code: str) -> set[str]:
+    """Return identifier-like tokens from a code block.
+
+    Tokens are lowercased, then Python keywords and the ubiquitous
+    builtin `print` are dropped so the audit measures the
+    lesson-specific vocabulary, not the language.
+    """
+    keywords = {
+        "def", "return", "import", "from", "as", "if", "else",
+        "elif", "try", "except", "finally", "with", "for", "in",
+        "and", "or", "not", "is", "true", "false", "none",
+        "print", "pass", "raise", "while", "break", "continue",
+        "class", "lambda", "yield", "global", "nonlocal",
+    }
+    return {w for w in re.findall(r"\b[a-z_][a-z_0-9]+\b", code.lower())
+            if w not in keywords}  # the regex already requires 2+ chars
+
+
+class UnsupportedCellProseContract(unittest.TestCase):
+    """Contract 11: every :::unsupported cell's prose explains the
+    code, not just the runtime constraint.
+
+    The :::unsupported block is rendered on production pages as a
+    walkthrough cell with prose + code. When the prose only says
+    'Dynamic Workers do not provide X', the reader sees the
+    constraint but no pedagogical content. The fix is to lead with
+    what the code does and move the runtime caveat to a closing
+    parenthetical (or to the Notes section).
+
+    Heuristic: each unsupported cell's prose must mention at least
+    two code identifiers — variable names, function calls, or
+    method names from the code block. Two is the minimum that
+    proves the prose discusses *this specific code* rather than
+    a generic note about Workers.
+    """
+
+    MIN_IDENT_OVERLAP = 2
+
+    def test_unsupported_prose_mentions_code(self):
+        failures: list[str] = []
+        for ex in EXAMPLES:
+            for cell in ex.get("cells", []):
+                if cell.get("kind") != "unsupported":
+                    continue
+                prose = " ".join(cell.get("prose", [])).lower()
+                idents = _code_identifiers(cell.get("code", ""))
+                hits = sum(1 for ident in idents if ident in prose)
+                if hits < self.MIN_IDENT_OVERLAP:
+                    failures.append(
+                        f"{ex['slug']}: unsupported cell prose references "
+                        f"{hits} code identifier(s) (need ≥ {self.MIN_IDENT_OVERLAP}); "
+                        f"prose looks generic. idents: {sorted(idents)[:5]}…"
+                    )
+        self.assertEqual(failures, [], "\n  " + "\n  ".join(failures))
+
+
+if __name__ == "__main__":
+    unittest.main()