This document defines the quality strategy for the repository as it exists today:
- a working browser inference baseline,
- an improved model and training/export pipeline,
- an upcoming Cloudflare deployment milestone.
The current repo relies mainly on:
- manual validation,
- build verification,
- inspection of generated training artifacts.
There is not yet a committed automated test suite or lint pipeline in the root project scripts, so the acceptance criteria below are written to match the actual stage of the codebase.
The strategy currently covers:
- drawing input and reset behavior
- preprocessing pipeline (`280x280` -> `28x28`)
- model loading and warmup
- browser prediction flow
- top-class and confidence-bar rendering
- training artifact generation
- TF.js artifact regeneration
- Cloudflare Pages deployment through GitHub Actions

Out of scope until implemented:
- advanced visualization modules
- intermediate activation UI
- playback controls
Minimum checks for the current browser app:
- Open the app.
- Confirm the model reaches the ready state.
- Draw a clear 0, 1, 7, 8, and 9.
- Click `Predict` for each case.
- Confirm:
- the preview grid updates,
- the top class updates,
- the confidence bars update.
- Click `Clear`.
- Confirm canvas, preview, and prediction state reset.
The current preprocessing pipeline should be checked for:
- deterministic output for repeated identical input,
- resilience to thin strokes,
- resilience to small gaps in a stroke,
- sensible centering inside the `28x28` result,
- non-crashing behavior for an empty canvas.
Current note:
- empty canvas currently becomes an all-zero matrix rather than a special no-input state.
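A minimal sketch of the downscale step these checks target, assuming the canvas is exported as a flat grayscale array in [0, 255]. The function name and the block-averaging strategy are illustrative assumptions, not the repo's actual implementation:

```javascript
// Downscale a 280x280 grayscale canvas to a normalized 28x28 matrix by
// averaging each 10x10 block of source pixels. Pure function, so repeated
// identical input yields identical output (the determinism check above).
function downscaleToMnist(pixels, size = 280, out = 28) {
  const block = size / out; // 10x10 source pixels per output pixel
  const result = new Float32Array(out * out);
  for (let y = 0; y < out; y++) {
    for (let x = 0; x < out; x++) {
      let sum = 0;
      for (let dy = 0; dy < block; dy++) {
        for (let dx = 0; dx < block; dx++) {
          sum += pixels[(y * block + dy) * size + (x * block + dx)];
        }
      }
      // Average the block, then normalize to [0, 1].
      result[y * out + x] = sum / (block * block) / 255;
    }
  }
  return result;
}
```

Note that an empty canvas passes through as an all-zero matrix, matching the current behavior described in the note above.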
The current model integration passes when:
- `loadModel()` succeeds from `/model/model.json`,
- warmup finishes without user-visible errors,
- prediction returns exactly 10 confidences,
- the top class matches the highest-confidence output,
- repeated predictions do not visibly degrade the app.
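The exactly-10-confidences and top-class invariants above can be expressed as a small helper. `topClass` is a hypothetical name for illustration, not the app's actual API:

```javascript
// Return the index of the highest-confidence class, enforcing the
// "exactly 10 confidences" invariant from the acceptance criteria.
function topClass(confidences) {
  if (confidences.length !== 10) {
    throw new Error(`expected 10 confidences, got ${confidences.length}`);
  }
  let best = 0;
  for (let i = 1; i < confidences.length; i++) {
    if (confidences[i] > confidences[best]) best = i;
  }
  return best;
}
```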
The model-improvement phase passes when:
- `training/python/train_cnn.py` runs from the documented `uv` workflow,
- `training/python/artifacts/training-summary.json` is generated,
- `training/python/artifacts/cnn-weights.json` is generated,
- `training/export-python-model.js` regenerates `public/model/*`,
- the regenerated browser artifacts still load in the frontend.
The current committed training summary reports approximately:
- best validation accuracy: 0.9905
- test accuracy: 0.9910
These values are not a substitute for browser validation, but they are part of the acceptance evidence for the Phase 2 model-improvement work.
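As an illustration, a structural sanity check over the committed summary could look like the following sketch. The field names `bestValAccuracy` and `testAccuracy` are assumptions about the artifact layout, not a documented schema:

```javascript
// Return a list of structural problems with a parsed training-summary
// object; an empty list means the summary looks plausible.
// Field names here are assumed, not taken from a documented schema.
function validateTrainingSummary(summary) {
  const problems = [];
  for (const key of ["bestValAccuracy", "testAccuracy"]) {
    const value = summary[key];
    if (typeof value !== "number" || value < 0 || value > 1) {
      problems.push(`${key} missing or outside [0, 1]`);
    }
  }
  return problems;
}
```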
Before any deploy work is considered ready:
- run `npm ci`
- run `npm run build`
- confirm `dist/model/model.json` exists
- confirm `dist/model/group1-shard1of1.bin` exists
- open the local production preview if needed
For the upcoming deployment phase, acceptance requires:
- GitHub Actions workflow exists,
- workflow completes successfully on `main`,
- Cloudflare serves the app without 404s for model assets,
- draw -> predict works in production,
- workflow rerun produces a safe redeploy.
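A hedged sketch of what such a workflow might look like. The action versions, secret names, and project name are assumptions to verify against the actual `.github/workflows` file once it exists:

```yaml
# Hypothetical Cloudflare Pages deploy workflow; names and versions
# below are assumptions, not the repo's committed configuration.
name: deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm run build
      - uses: cloudflare/pages-action@v1
        with:
          apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
          accountId: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
          projectName: digit-recognizer # assumed project name
          directory: dist
```

Rerunning this workflow redeploys the same build output, which is what makes the "safe redeploy" criterion above checkable.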
When automated tests are added, prioritize:
- grayscale conversion and normalization math,
- bounding-box and centering logic,
- tensor shape construction,
- probability ranking logic,
- model asset existence checks in production builds.
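For the first item, a future unit test could pin down the conversion math along these lines. The BT.601 luminance weights are an assumption about the repo's formula, used here only to show the shape of such a test:

```javascript
// Convert an RGB pixel to a normalized grayscale value in [0, 1].
// The 0.299/0.587/0.114 weights are the standard BT.601 luma
// coefficients, assumed rather than confirmed against the app code.
function toNormalizedGray(r, g, b) {
  const gray = 0.299 * r + 0.587 * g + 0.114 * b; // luma in [0, 255]
  return gray / 255; // normalize to [0, 1]
}
```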
Severity levels:
- P0: app unusable or production deploy broken
- P1: core prediction flow incorrect or unstable
- P2: secondary issue with a workaround
- P3: cosmetic or low-impact UX issue
Release rule:
- P0 and P1 must be resolved before production rollout.
The current repo is ready for the next milestone only when:
- browser baseline checks pass,
- model artifacts are valid and load correctly,
- training/export flow remains reproducible,
- deployment docs and implementation agree on Cloudflare Pages.
When advanced visualization work begins, a separate acceptance layer should be added for:
- activation extraction,
- stage synchronization,
- visualization correctness,
- playback controls.
Those checks are intentionally not treated as current baseline acceptance criteria, because those features are not implemented yet.