Context
A 42-round RLCR session reached maxiter without completion. The methodology analysis identified structural issues preventing convergence despite consistent forward progress.
Key Findings
- Asymptotic non-convergence: Reviews introduced more gaps per round (~5-8) than were resolved per round (~1-3), making convergence mathematically impossible regardless of iteration budget.
- Severe diminishing returns: Early rounds (0-5) produced ~50 tests and major infrastructure; later rounds (31-42) averaged <1 test/round on narrow micro-fixes (comment rewording, count corrections, single assertion changes).
- Over-sized scope: 8 acceptance criteria were too many for a single session; 4 were never substantively started despite the full 42-round budget.
- No intermediate exit pathway: The only exits were "all ACs fully met" or "hit maxiter." No individual AC could be declared "sufficient."
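The non-convergence finding follows from simple arithmetic on the reported rates. The sketch below is a simplified model, assuming every introduced gap is a distinct open item and using only the per-round ranges quoted above; the function name `net_gap_change` is hypothetical.

```python
def net_gap_change(introduced: int, resolved: int) -> int:
    """Net change in the open-gap count for one round."""
    return introduced - resolved


# Per-round ranges reported in the findings above.
INTRODUCED_RANGE = (5, 8)
RESOLVED_RANGE = (1, 3)

# Best case for convergence: fewest gaps introduced, most gaps resolved.
best_case = net_gap_change(INTRODUCED_RANGE[0], RESOLVED_RANGE[1])
# Worst case: most gaps introduced, fewest gaps resolved.
worst_case = net_gap_change(INTRODUCED_RANGE[1], RESOLVED_RANGE[0])

print(best_case, worst_case)  # 2 7
```

Under this model even the best case adds two open gaps per round, so a larger iteration budget alone would not close the set.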
Methodology Improvement Suggestions
1. Batch-width scaling for narrow fixes
When a round produces fewer than 3 meaningful tests/changes, allow bundling 3-5 trivial corrections into one round without full review ceremony.
2. Severity classification for review findings
Classify gaps as Critical (blocks acceptance), Major (should fix within 2 rounds), or Minor (can be batched). Only Critical items should prevent a COMPLETE verdict.
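One way to picture this rule is a small data model in which the verdict depends only on Critical findings. This is a minimal sketch, not an existing RLCR interface: the `Severity`, `Finding`, and `verdict` names are hypothetical, and the "INCOMPLETE" label is an assumed counterpart to COMPLETE.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"  # blocks acceptance
    MAJOR = "major"        # should be fixed within 2 rounds
    MINOR = "minor"        # can be batched


@dataclass(frozen=True)
class Finding:
    description: str
    severity: Severity


def verdict(findings: list[Finding]) -> str:
    """Return COMPLETE unless at least one open finding is Critical."""
    if any(f.severity is Severity.CRITICAL for f in findings):
        return "INCOMPLETE"  # assumed label; the source only names COMPLETE
    return "COMPLETE"


open_findings = [
    Finding("reword comment", Severity.MINOR),
    Finding("tighten assertion", Severity.MAJOR),
]
print(verdict(open_findings))  # COMPLETE
```

Major and Minor findings still exist in the list and can be scheduled or batched; they just do not block the verdict.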
3. Must-fix vs should-improve separation
Reviews should distinguish blocking gaps from enhancement suggestions. Only blocking gaps enter the next round's contract.
4. Invariant specifications in contracts
Instead of "Add feature X," specify "Add feature X such that invariant Y holds, verified by a test that would fail if Y is violated." This prevents multi-round fix-then-harden chains.
5. AC-level completion gates
Allow individual acceptance criteria to be marked "sufficient for MVP" and locked. Once an AC is gated, reviewers raise no new findings against it unless a regression is introduced.
6. Priority handshake (P0/P1/P2)
After each review, the reviewer ranks next-round gaps as P0, P1, or P2. The implementer's contract commits to all P0s and optionally to P1s. This creates alignment without reducing rigor.
7. Session scope limits
Scope RLCR sessions to 2-3 acceptance criteria maximum. If the plan has 8 ACs, split into multiple sessions with realistic 10-15 round budgets each.
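As a rough illustration of the split, the sketch below chunks a plan's acceptance criteria into sessions of at most three. The function name and the `AC-n` labels are hypothetical, and chunking by list order is an assumption; a real split would more likely group related ACs.

```python
def split_into_sessions(acs: list[str], max_per_session: int = 3) -> list[list[str]]:
    """Chunk acceptance criteria, in order, into sessions of at most max_per_session."""
    if max_per_session < 1:
        raise ValueError("max_per_session must be at least 1")
    return [acs[i:i + max_per_session] for i in range(0, len(acs), max_per_session)]


acs = [f"AC-{n}" for n in range(1, 9)]  # hypothetical labels for the 8 ACs
sessions = split_into_sessions(acs)
print([len(s) for s in sessions])  # [3, 3, 2]
```

With 8 ACs this yields three sessions, each of which would get its own 10-15 round budget.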
8. Mandatory breadth checks
After N rounds (e.g., 15), force attention to unstarted ACs. After 2N rounds, allow conditional acceptance for "close enough" items.
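The two thresholds can be read as a simple per-round check. The sketch below assumes both thresholds are inclusive; the function name and the returned action labels are hypothetical.

```python
def breadth_action(round_index: int, n: int, unstarted_acs: list[str]) -> str:
    """Pick the breadth-check action for a round, given threshold n."""
    if round_index >= 2 * n:
        # Past 2N rounds: "close enough" items may be conditionally accepted.
        return "allow-conditional-acceptance"
    if round_index >= n and unstarted_acs:
        # Past N rounds with untouched ACs: redirect attention to them.
        return "force-unstarted-acs"
    return "continue"


print(breadth_action(15, 15, ["AC-5", "AC-6"]))  # force-unstarted-acs
```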
9. Delta-only reporting for persistent items
After its first mention, a remaining item should reappear only if its status changed, it was reprioritized, or it is now blocking. This reduces repetitive summary and review content.
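Delta-only reporting amounts to comparing each item's state with the previous round and keeping only the differences. In the minimal sketch below, the item names and the "status/priority" strings are hypothetical placeholders, and a single string is assumed to capture status, priority, and blocking state together.

```python
def delta_report(previous: dict[str, str], current: dict[str, str]) -> dict[str, str]:
    """Return only items that are new or whose state differs from the previous round."""
    return {
        item: state
        for item, state in current.items()
        if previous.get(item) != state
    }


prev_round = {"gap-a": "open/P2", "gap-b": "open/P1"}
this_round = {"gap-a": "open/P2", "gap-b": "blocking/P0", "gap-c": "open/P2"}
print(delta_report(prev_round, this_round))
# {'gap-b': 'blocking/P0', 'gap-c': 'open/P2'}
```

Unchanged items such as `gap-a` drop out of the report, while newly raised or escalated items remain visible.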
Core Tension
The fundamental tension is between review rigor (which finds more gaps) and convergence (which requires gaps to decrease). These improvements address it by adding priority ranking, scope limits, severity classification, and intermediate milestones, so that rigorous review can coexist with achievable completion targets.
Quantitative Evidence
| Metric | Value |
|---|---|
| Total rounds | 42 |
| Test delta | +88 tests (~2/round average, heavily front-loaded) |
| ACs "fully met" per reviewer | 0/8 |
| Unique remaining items at end | ~8 (same as round 5) |
| ACs never started | ~4 |
| Rounds on single-topic fix chains | ~12 (28%) |