Merged
14 changes: 14 additions & 0 deletions CONTRIBUTING.md
@@ -62,6 +62,20 @@ make verify-examples
make check-generated
```

## Quality checks

Five scripts enforce the catalog-level rules from `docs/example-quality-rubric.md`. Run them together with `make quality-checks`.

| Script | What it gates |
| --- | --- |
| `scripts/check_registry_integrity.py` | Every owner slug in `docs/quality-registries.toml` exists in `manifest.toml`; tokens are present. |
| `scripts/check_confusable_pairs.py` | Each confusable pair's owning page contains every token that signals the contrast. |
| `scripts/check_broad_surface_tours.py` | Each broad-title page either covers every required form or sets `scope_first_pass = true` with `see_also` links to focused neighbors. |
| `scripts/check_footgun_coverage.py` | Each canonical Python footgun has a page that contains both broken-form and fixed-form tokens. |
| `scripts/check_notes_supported.py` | Every `:::note` bullet shares at least one keyword with the page body, so notes cannot assert behavior the page never demonstrates. |

The single source of truth for the registries is `docs/quality-registries.toml`. Add a new pair, broad tour, or footgun there, then update the owning page so the tokens appear in cells or prose.

## Style expectations

- Keep examples compact and language-tour focused.
21 changes: 19 additions & 2 deletions Makefile
@@ -1,4 +1,4 @@
.PHONY: test embed-examples build check-generated fingerprint browser-layout-test seo-cache-lint verify-examples format-examples verify-python-version verify dev deploy lint
.PHONY: test embed-examples build check-generated fingerprint browser-layout-test seo-cache-lint verify-examples check-registry-integrity check-confusable-pairs check-broad-surface-tours check-footgun-coverage check-notes-supported quality-checks format-examples verify-python-version verify dev deploy lint

test:
	python3 -m unittest discover -s tests -v
@@ -23,6 +23,23 @@ seo-cache-lint:
verify-examples: build
	scripts/verify_examples.py

check-registry-integrity:
	scripts/check_registry_integrity.py

check-confusable-pairs:
	scripts/check_confusable_pairs.py

check-broad-surface-tours:
	scripts/check_broad_surface_tours.py

check-footgun-coverage:
	scripts/check_footgun_coverage.py

check-notes-supported:
	scripts/check_notes_supported.py

quality-checks: check-registry-integrity check-confusable-pairs check-broad-surface-tours check-footgun-coverage check-notes-supported

format-examples:
	scripts/format_examples.py

@@ -32,7 +49,7 @@ verify-python-version: build
lint:
	uv run ruff check src tests scripts

verify: build test seo-cache-lint verify-examples browser-layout-test lint check-generated
verify: build test seo-cache-lint verify-examples quality-checks browser-layout-test lint check-generated

dev:
	uv run pywrangler dev --port 9696
58 changes: 58 additions & 0 deletions docs/example-quality-rubric.md
@@ -70,6 +70,62 @@ Flag these during review even when the code is correct:
- The page has no editorial progression: examples are technically related but ordered like a checklist rather than a learning path.
- The page reduces so aggressively that a necessary edge case or contrast disappears.
- `See also` links behave like tags instead of prerequisite, neighbor, or next-depth graph edges.
- A claim in the `:::note` block is not demonstrated by a cell on the same page.
- A confusable pair listed in the registry has only one side shown on its owning page.
- A canonical Python footgun listed in the registry has no page that shows the broken case and the fix.
- Pages whose titles differ only by a suffix or modifier (`iterators` vs `iterating-over-iterables`, `generators` vs `generator-expressions`) assert the relationship in prose without a cell that demonstrates it.

## Confusable-pair index

Each pair below names two or three concepts learners commonly confuse, and the single page that owns the contrast. The owning page must show both (or all) sides in cells, not merely mention them in prose. `scripts/check_confusable_pairs.py` enforces this.

| Pair | Owning page |
| --- | --- |
| `__str__` / `__repr__` | `special-methods.md` |
| `is` / `==` | `equality-and-identity.md` |
| list / tuple | `tuples.md` |
| `@classmethod` / `@staticmethod` / instance method | `classmethods-and-staticmethods.md` |
| `isinstance()` / `type() ==` | `runtime-type-checks.md` |
| generator / class iterator | `generators.md` |
| iterator / iterable | `iterator-vs-iterable.md` |
| mutable / immutable class attributes | `classes.md` |
| eager / lazy production | `generators.md` |
| `Protocol` / `ABC` | `abstract-base-classes.md` |
| `dataclass` / `NamedTuple` / `TypedDict` | `structured-data-shapes.md` |
| bound / unbound methods | `bound-and-unbound-methods.md` |
| `yield` / `return` | `generators.md` |
| shallow / deep copy | `copying-collections.md` |
| sync / async functions | `async-await.md` |
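
To make "show both sides in cells" concrete, here is a minimal sketch of what a cell on `copying-collections.md` might look like for the shallow / deep copy pair. The variable names are illustrative and not taken from the actual page.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)    # new outer list, inner lists are shared
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0].append(99)           # mutate an inner list through the original

print(shallow[0])  # [1, 2, 99]  (shallow copy sees the mutation)
print(deep[0])     # [1, 2]      (deep copy is unaffected)
```

Both calls run against the same input and the outputs differ, so the contrast is demonstrated rather than only described.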

## Broad-surface checklists

For each title that names a broad surface area, the page must touch every form below or scope itself down explicitly by setting `scope_first_pass = true` in its frontmatter and listing `see_also` links to focused neighbors. `scripts/check_broad_surface_tours.py` enforces this.

- **Special Methods** (`special-methods.md`): `__init__`, `__repr__`, `__str__`, `__eq__`, `__hash__`, `__lt__`, `__len__`, `__iter__`, `__contains__`, `__getitem__`, `__setitem__`, `__call__`, `__enter__`/`__exit__`, `__bool__`.
- **Operators** (`operators.md`): arithmetic, comparison, identity (`is`), membership (`in`), boolean short-circuit (`and`/`or`), bitwise, walrus (`:=`).
- **Type Hints** (`type-hints.md`): scalar annotations, container generics (`list[int]`), `|` unions, `Optional`, function signatures, `TypeAlias`, runtime visibility note.
- **Testing** (`testing.md`): `unittest.TestCase`, `assertEqual`/`assertRaises`, fixtures or `setUp`, parametrized cases or sub-tests, discovery convention.
- **Async Await** (`async-await.md`): `async def`, `await`, `asyncio.run`, `asyncio.gather`, `async for`, `async with`.
- **Packages** (`packages.md`): package layout, `__init__.py`, relative vs absolute imports, `__all__`, namespace packages.
- **Regular Expressions** (`regular-expressions.md`): `re.match`/`re.search`/`re.findall`, groups, named groups, `re.compile`, flags such as `re.IGNORECASE`/`re.MULTILINE`, substitution.
- **Literals** (`literals.md`): integer (decimal/hex/binary/underscored), float, string (raw/bytes/f-string), boolean, `None`, container literals.

## Footgun coverage

Each canonical Python surprise below must have a page that demonstrates the broken case and the fix. `scripts/check_footgun_coverage.py` enforces this.

| Footgun | Owning page |
| --- | --- |
| Mutable default class attribute (`items = []` on the class body) | `classes.md` |
| Mutable default function argument (`def f(items=[])`) | `functions.md` |
| Late-binding closure in a loop | `closures.md` |
| Integer identity caching (`is` for small ints) | `equality-and-identity.md` |
| Shallow vs deep copy on nested mutable structures | `copying-collections.md` |
| Generator one-pass exhaustion | `generators.md` |
| Dictionary mutation during iteration | `dicts.md` |
| Floating-point equality | `numbers.md` |
| `bool` as a subclass of `int` | `booleans.md` |
| Bare `except` swallowing `KeyboardInterrupt` / `SystemExit` | `exceptions.md` |
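
As an example of "the broken case and the fix", here is a minimal sketch for the mutable default function argument row. The names `append_broken` and `append_fixed` and the `items=[]` / `items=None` spellings come from the registry tokens; the function bodies are an assumption about what `functions.md` might contain.

```python
def append_broken(value, items=[]):
    # The default list is created once, at definition time, and then reused.
    items.append(value)
    return items


def append_fixed(value, items=None):
    # A None sentinel lets each call create its own fresh list.
    if items is None:
        items = []
    items.append(value)
    return items


print(append_broken(1))  # [1]
print(append_broken(2))  # [1, 2]  (state leaked from the first call)
print(append_fixed(1))   # [1]
print(append_fixed(2))   # [2]
```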

## Strengthening checklist

@@ -94,3 +150,5 @@ Before publishing or substantially editing an example, ask:
17. For broad pages, is this a map with categories and links, or should it be split?
18. Do edge cases appear close enough to the main idea that readers understand the boundary?
19. Do `See also` links express prerequisite, neighbor, or next-depth relationships rather than tags?
20. Is every claim in the `:::note` block demonstrated by a cell on this page?
21. If this page's title appears in the confusable-pair index or footgun registry, does the page show both sides (or both the broken and fixed forms)?
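
Item 20 is the one `scripts/check_notes_supported.py` approximates mechanically by requiring each `:::note` bullet to share a keyword with the page body. That script is not shown here, so the following is only a sketch of the idea under stated assumptions: the note block is closed by a bare `:::` line, bullets start with `- `, and a "keyword" is any identifier-like word of four or more characters.

```python
import re

# Assumption: a keyword is an identifier-like word of at least four characters.
WORD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]{3,}")


def keywords(text: str) -> set[str]:
    return {word.lower() for word in WORD_RE.findall(text)}


def unsupported_bullets(page: str) -> list[str]:
    """Hypothetical helper: note bullets sharing no keyword with the body."""
    body_lines: list[str] = []
    bullets: list[str] = []
    in_note = False
    for line in page.splitlines():
        stripped = line.strip()
        if stripped == ":::note":
            in_note = True
        elif in_note and stripped == ":::":
            in_note = False
        elif in_note and stripped.startswith("- "):
            bullets.append(stripped[2:])
        elif not in_note:
            body_lines.append(line)
    body_words = keywords("\n".join(body_lines))
    return [bullet for bullet in bullets if not (keywords(bullet) & body_words)]


PAGE = """\
values = [1, 2, 3]
doubled = [v * 2 for v in values]

:::note
- A comprehension builds the doubled list in one expression.
- Generators are exhausted after one pass.
:::
"""

print(unsupported_bullets(PAGE))  # ['Generators are exhausted after one pass.']
```

Keyword overlap is a weak proxy: it catches notes about topics the page never touches, but a bullet can share a word with the body and still assert something no cell demonstrates, so the manual question in item 20 still applies.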
180 changes: 180 additions & 0 deletions docs/quality-registries.toml
@@ -0,0 +1,180 @@
# Quality registries.
#
# Source of truth for the rubric checks in `scripts/check_*.py`.
# Each entry pins a contrast or footgun to a single owning page, so the
# catalog has exactly one home for the lesson and verifiers can prove it.

[[confusable_pairs]]
name = "__str__ vs __repr__"
owner = "special-methods"
tokens = ["__str__", "__repr__"]

[[confusable_pairs]]
name = "is vs =="
owner = "equality-and-identity"
tokens = [" is ", "=="]

[[confusable_pairs]]
name = "list vs tuple"
owner = "tuples"
tokens = ["list", "tuple"]

[[confusable_pairs]]
name = "classmethod vs staticmethod vs instance method"
owner = "classmethods-and-staticmethods"
tokens = ["@classmethod", "@staticmethod", "self"]

[[confusable_pairs]]
name = "isinstance vs type=="
owner = "runtime-type-checks"
tokens = ["isinstance(", "type("]

[[confusable_pairs]]
name = "generator vs class iterator"
owner = "generators"
tokens = ["yield", "__next__"]

[[confusable_pairs]]
name = "iterator vs iterable"
owner = "iterator-vs-iterable"
tokens = ["iterable", "iterator", "iter("]

[[confusable_pairs]]
name = "mutable vs immutable class attributes"
owner = "classes"
tokens = ["class attribute", "__init__"]

[[confusable_pairs]]
name = "eager vs lazy production"
owner = "generators"
tokens = ["return", "yield"]

[[confusable_pairs]]
name = "Protocol vs ABC"
owner = "abstract-base-classes"
tokens = ["Protocol", "ABC"]

[[confusable_pairs]]
name = "dataclass vs NamedTuple vs TypedDict"
owner = "structured-data-shapes"
tokens = ["@dataclass", "NamedTuple", "TypedDict"]

[[confusable_pairs]]
name = "bound vs unbound methods"
owner = "bound-and-unbound-methods"
tokens = ["bound method", "Class.method"]

[[confusable_pairs]]
name = "yield vs return"
owner = "generators"
tokens = ["yield", "return"]

[[confusable_pairs]]
name = "shallow vs deep copy"
owner = "copying-collections"
tokens = ["copy(", "deepcopy("]

[[confusable_pairs]]
name = "sync vs async functions"
owner = "async-await"
tokens = ["async def", "def "]

[broad_surface_tours]

[broad_surface_tours.special-methods]
required_tokens = [
"__init__", "__repr__", "__str__", "__eq__", "__hash__", "__lt__",
"__len__", "__iter__", "__contains__", "__getitem__", "__setitem__",
"__call__", "__enter__", "__exit__", "__bool__",
]

[broad_surface_tours.operators]
required_tokens = ["+", "==", " is ", " in ", "and", "or", "&", ":="]

[broad_surface_tours.type-hints]
required_tokens = ["list[", " | ", "Optional", "TypeAlias"]

[broad_surface_tours.testing]
required_tokens = ["TestCase", "assertEqual", "assertRaises", "setUp"]

[broad_surface_tours.async-await]
required_tokens = ["async def", "await", "asyncio.run", "asyncio.gather", "async for", "async with"]

[broad_surface_tours.packages]
required_tokens = ["__init__.py", "from .", "__all__"]

[broad_surface_tours.regular-expressions]
required_tokens = ["re.match", "re.search", "re.findall", "re.compile", "re.IGNORECASE", "re.sub"]

[broad_surface_tours.literals]
required_tokens = ["0x", "0b", "_", "f\"", "True", "None"]

[[footguns]]
name = "Mutable default class attribute"
owner = "classes"
broken_tokens = ["items = []"]
fixed_tokens = ["self.items = []"]

[[footguns]]
name = "Mutable default function argument"
owner = "functions"
broken_tokens = ["items=[]", "append_broken"]
fixed_tokens = ["items=None", "append_fixed"]

[[footguns]]
name = "Late-binding closure in a loop"
owner = "closures"
broken_tokens = ["lambda: i", "[2, 2, 2]"]
fixed_tokens = ["lambda i=i", "[0, 1, 2]"]

[[footguns]]
name = "Integer identity caching"
owner = "equality-and-identity"
broken_tokens = ["small_a is small_b", "big_a is big_b"]
fixed_tokens = ["big_a == big_b", "small-integer cache"]

[[footguns]]
name = "Shallow vs deep copy"
owner = "copying-collections"
broken_tokens = ["copy("]
fixed_tokens = ["deepcopy("]

[[footguns]]
name = "Generator one-pass exhaustion"
owner = "generators"
broken_tokens = ["yield"]
fixed_tokens = ["list("]

[[footguns]]
name = "Dictionary mutation during iteration"
owner = "dicts"
broken_tokens = ["for ", "del "]
fixed_tokens = ["list(", ".keys("]

[[footguns]]
name = "Floating-point equality"
owner = "numbers"
broken_tokens = ["0.1", "0.2"]
fixed_tokens = ["isclose", "math"]

[[footguns]]
name = "bool as a subclass of int"
owner = "booleans"
broken_tokens = ["isinstance(True, int)", "True + True"]
fixed_tokens = ["not isinstance(value, bool)", "is_strict_int"]

[[footguns]]
name = "Bare except swallowing KeyboardInterrupt"
owner = "exceptions"
broken_tokens = ["except:"]
fixed_tokens = ["except ValueError"]

[paired_pages]
# Pages whose titles differ only by a suffix or modifier; at least one
# member of each pair must demonstrate the relationship in a cell.
pairs = [
["iterators", "iterating-over-iterables"],
["iterators", "iterator-vs-iterable"],
["generators", "generator-expressions"],
["comprehensions", "comprehension-patterns"],
]
65 changes: 65 additions & 0 deletions scripts/check_broad_surface_tours.py
@@ -0,0 +1,65 @@
#!/usr/bin/env python3
"""Verify that each broad-surface tour page covers its required tokens.

Reads `docs/quality-registries.toml`. A page may opt out of the strict
check by adding `scope_first_pass = true` to its frontmatter, in which
case it must instead carry a non-empty `see_also` list pointing at
focused neighbors. The script does not verify that those neighbors exist.
"""
from __future__ import annotations

import re
import sys
import tomllib
from pathlib import Path

ROOT = Path(__file__).resolve().parents[1]
EXAMPLES_DIR = ROOT / "src" / "example_sources"
REGISTRY_PATH = ROOT / "docs" / "quality-registries.toml"


FRONTMATTER_RE = re.compile(r"^\+\+\+\n(.*?)\n\+\+\+\n", re.DOTALL)


def parse_frontmatter(text: str) -> dict:
    match = FRONTMATTER_RE.match(text)
    if not match:
        return {}
    return tomllib.loads(match.group(1))


def main() -> int:
    data = tomllib.loads(REGISTRY_PATH.read_text())
    tours = data.get("broad_surface_tours", {})
    errors: list[str] = []
    for slug, spec in tours.items():
        page = EXAMPLES_DIR / f"{slug}.md"
        if not page.exists():
            errors.append(f"{REGISTRY_PATH}: broad-tour page missing: {slug}.md")
            continue
        text = page.read_text()
        frontmatter = parse_frontmatter(text)
        required = spec.get("required_tokens", [])
        missing = [token for token in required if token not in text]
        if frontmatter.get("scope_first_pass"):
            see_also = frontmatter.get("see_also") or []
            if not see_also:
                errors.append(
                    f"{page}: scope_first_pass=true requires see_also links to focused neighbors"
                )
            continue
        if missing:
            errors.append(
                f"{page}: broad-tour {slug!r} missing tokens: {missing}; "
                "either add cells covering them or set scope_first_pass=true with see_also"
            )
    if errors:
        for error in errors:
            print(error, file=sys.stderr)
        return 1
    print(f"Broad-surface tour coverage OK ({len(tours)} pages).")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())