HueCodes/sudoku-vision

Sudoku Vision

Solve sudoku puzzles from photos using computer vision and deep learning. The pipeline takes a photo of a puzzle, detects the grid, recognizes the digits with a CNN, validates the recognized grid against the sudoku constraints, and solves it with a backtracking solver written in C.

                        sample_1.jpg
    +-------+-------+-------+         +-------+-------+-------+
    | . . 3 | . 2 . | 6 . . |         | 4 8 3 | 9 2 1 | 6 5 7 |
    | 9 . . | 3 . 5 | . . 1 |  solve  | 9 6 7 | 3 4 5 | 8 2 1 |
    | . . 1 | 8 . 6 | 4 . . |  ---->  | 2 5 1 | 8 7 6 | 4 9 3 |
    +-------+-------+-------+         +-------+-------+-------+
    | . . 8 | 1 . 2 | 9 . . |         | 5 4 8 | 1 3 2 | 9 7 6 |
    | 7 . . | . . . | . . 8 |         | 7 2 9 | 5 6 4 | 1 3 8 |
    | . . 6 | 7 . 8 | 2 . . |         | 1 3 6 | 7 9 8 | 2 4 5 |
    +-------+-------+-------+         +-------+-------+-------+
    | . . 2 | 6 . 9 | 5 . . |         | 3 7 2 | 6 8 9 | 5 1 4 |
    | 8 . . | 2 . 3 | . . 9 |         | 8 1 4 | 2 5 3 | 7 6 9 |
    | . . 5 | . 1 . | 3 . . |         | 6 9 5 | 4 1 7 | 3 8 2 |
    +-------+-------+-------+         +-------+-------+-------+
         recognized grid                        solution

Architecture

         Photo
           |
           v
+----------------------+     +----------------------+     +----------------------+
|    Grid Detection    | --> |  Digit Recognition   | --> |      Validation      |
|                      |     |                      |     |                      |
| - Contour analysis   |     | - EmptyClassifier    |     | - Row/col/box        |
| - Hough lines        |     |   (binary, 20K)      |     |   constraint check   |
| - Harris + RANSAC    |     | - DigitCNNv3         |     | - Beam search        |
| - Rotation correct.  |     |   (residual, 280K)   |     |   conflict resolver  |
| - Quality scoring    |     | - Top-k alternatives |     | - Constraint prop.   |
+----------------------+     +----------------------+     +----------------------+
                                                                     |
                                                                     v
                                                          +----------------------+
                                                          |       C Solver       |
                                                          |                      |
                                                          | - Backtracking       |
                                                          | - ~20us/solve        |
                                                          | - WASM for web       |
                                                          +----------------------+
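
The row/column/box constraint check in the validation stage can be illustrated with a short Python sketch. This is not the code in validator.py: the function name find_conflicts, the parse helper, and the encoding (a 9x9 list of lists with 0 for an empty cell) are assumptions made for the example.

```python
from collections import defaultdict


def parse(s):
    """Turn an 81-character string of digits into a 9x9 list of ints (0 = empty)."""
    return [[int(ch) for ch in s[r * 9:(r + 1) * 9]] for r in range(9)]


def find_conflicts(grid):
    """Return the sorted (row, col) cells whose digit repeats in a row, column, or box."""
    seen = defaultdict(list)  # (unit kind, unit index, digit) -> cells holding that digit
    for r in range(9):
        for c in range(9):
            d = grid[r][c]
            if d == 0:
                continue
            seen[("row", r, d)].append((r, c))
            seen[("col", c, d)].append((r, c))
            seen[("box", (r // 3) * 3 + c // 3, d)].append((r, c))
    conflicts = set()
    for cells in seen.values():
        if len(cells) > 1:  # the same digit appears twice in one unit
            conflicts.update(cells)
    return sorted(conflicts)


# The recognized grid shown above for sample_1.jpg, with "." written as 0.
SAMPLE = parse(
    "003020600" "900305001" "001806400"
    "008102900" "700000008" "006708200"
    "002609500" "800203009" "005010300"
)
print(find_conflicts(SAMPLE))
```

A non-empty result is the kind of signal a conflict resolver could use to decide which cells to re-examine using the classifier's top-k alternatives.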

Quick Start

# Install
pip install -e .

# Solve a puzzle from a photo
sudoku-vision solve photo.jpg

# JSON output for scripting
sudoku-vision solve photo.jpg --output json

# Use as a library
python -c "from sudoku_vision import solve_from_image; print(solve_from_image('photo.jpg'))"
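
For scripting, the JSON mode can be wrapped from Python with the standard library. This is a sketch under two assumptions: that the CLI prints a single JSON document on stdout, and that a non-zero exit code signals failure. The helper name solve_to_json and its run parameter are hypothetical; the JSON schema is not documented here, so the parsed object is returned unchanged.

```python
import json
import subprocess


def solve_to_json(image_path, run=subprocess.run):
    """Run `sudoku-vision solve <image> --output json` and parse its stdout.

    `run` is injectable so the wrapper can be exercised without the CLI installed.
    """
    proc = run(
        ["sudoku-vision", "solve", str(image_path), "--output", "json"],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)


# Usage (requires the package to be installed):
#   result = solve_to_json("photo.jpg")
```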

Model Performance

Benchmarked on models/digit_cnn_v3_final.pt with 162 real cell samples:

Metric                   Result   Target
Overall digit accuracy   89.5%    >= 90%
Empty cell recall        93.0%    >= 95%
ECE (calibration)        0.065    < 0.10
Grid detection rate      100%     > 85%
E2E solve rate           80%      >= 80%
Pipeline latency         ~140ms   < 5s
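
ECE in the table stands for expected calibration error. As a rough sketch of how such a number is commonly computed: predictions are grouped into confidence bins, and the gap between accuracy and mean confidence in each bin is averaged, weighted by bin size. The 10 equal-width bins and the function name are assumptions; the binning used by the benchmark is not stated here.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Sample-weighted mean of |accuracy - mean confidence| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


# Hypothetical predictions: four cells at 0.9 confidence, three of them correct.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```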

Per-digit accuracy varies with sample size. Digits with fewer than 10 real samples (2, 3, 5, 7, 8, 9) show higher variance. Collecting more labeled data would improve these numbers.
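
To make the variance point concrete, a confidence interval for an accuracy measured on a handful of samples can be computed with the Wilson score interval. The counts used below (4 correct out of 6) are hypothetical and not taken from the benchmark; the function name is likewise an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


low, high = wilson_interval(4, 6)
print(f"{low:.2f} .. {high:.2f}")
```

With only six samples the interval spans roughly 0.30 to 0.90, so a per-digit accuracy measured on that few cells says little on its own.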

Project Structure

sudoku-vision/
  sudoku_vision/           # Python package
    __init__.py            #   Public API: solve_from_image()
    cli.py                 #   CLI: solve, detect, benchmark
    config.py              #   Centralized configuration
    cv/                    #   Computer vision
      preprocess_v2.py     #     Multi-strategy preprocessing (CLAHE, shadow removal)
      grid_v2.py           #     Multi-method grid detection
      grid_quality.py      #     Grid quality scoring
      extract.py           #     Cell extraction (81 cells from warped grid)
    ml/                    #   Machine learning
      model_v3.py          #     DigitCNNv3 (residual + SE blocks), EmptyClassifier
      classifier.py        #     Two-stage classifier (empty + digit)
      preprocessing.py     #     CellPreprocessor (consistent train/inference)
      datasets.py          #     SyntheticDataset, RealDataset
      train_v2.py          #     Training with augmentation, mixup, label smoothing
      train_progressive.py #     3-phase progressive training
      export.py            #     ONNX export
      convert_coreml.py    #     CoreML export (iOS)
    pipeline/              #   End-to-end orchestration
      run_v2.py            #     Main pipeline
      validator.py         #     Constraint validation
      conflict_resolver.py #     Beam search correction
      constraint_resolver.py # Sudoku constraint propagation
  solver/                  # C sudoku solver
    src/main.c             #   CLI + benchmark
    src/sudoku.c           #   Backtracking solver
    src/wasm_api.c         #   WebAssembly bindings
  tests/                   # Test suite
  tools/                   # Data labeling and extraction utilities
  models/                  # Trained models (gitignored)
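
The cell extraction step (81 cells from the warped grid) comes down to box arithmetic on a square image. A minimal sketch, assuming the grid has already been warped to a square of side pixels; the function name cell_boxes and the margin parameter are hypothetical and not the API of extract.py.

```python
def cell_boxes(side, margin=0.0):
    """Return 81 (row, col, x0, y0, x1, y1) boxes for a square grid image.

    `margin` is the fraction of each cell trimmed from every edge, which helps
    keep grid lines out of the crop.
    """
    boxes = []
    for r in range(9):
        for c in range(9):
            # Integer boundaries that tile the image exactly even if side % 9 != 0.
            x0, x1 = c * side // 9, (c + 1) * side // 9
            y0, y1 = r * side // 9, (r + 1) * side // 9
            dx = int((x1 - x0) * margin)
            dy = int((y1 - y0) * margin)
            boxes.append((r, c, x0 + dx, y0 + dy, x1 - dx, y1 - dy))
    return boxes


boxes = cell_boxes(450, margin=0.1)
print(len(boxes), boxes[0])
# With a NumPy image array, a cell crop would be warped[y0:y1, x0:x1].
```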

Tech Stack

  • Python 3.10+ -- package, CLI, pipeline orchestration
  • PyTorch -- CNN training and inference (DigitCNNv3, EmptyClassifier)
  • OpenCV -- image preprocessing, grid detection, perspective transform
  • C99 -- backtracking solver (~20us/solve, 1400 solves/sec)
  • ONNX -- model export for web inference
  • CoreML -- model export for iOS inference
  • WebAssembly -- solver compiled to WASM for browser use
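
For readers unfamiliar with backtracking, the idea behind the solver can be sketched in Python. The project's solver is written in C and is not reproduced here; this is a plain textbook version for illustration, and the names parse and solve are assumptions.

```python
def parse(s):
    """Turn an 81-character string of digits into a 9x9 list of ints (0 = empty)."""
    return [[int(ch) for ch in s[r * 9:(r + 1) * 9]] for r in range(9)]


def solve(grid):
    """Fill `grid` in place by backtracking; return True if a solution was found."""

    def allowed(r, c, d):
        if any(grid[r][j] == d for j in range(9)):
            return False
        if any(grid[i][c] == d for i in range(9)):
            return False
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return all(grid[i][j] != d for i in range(br, br + 3) for j in range(bc, bc + 3))

    for r in range(9):
        for c in range(9):
            if grid[r][c] != 0:
                continue
            for d in range(1, 10):
                if allowed(r, c, d):
                    grid[r][c] = d
                    if solve(grid):
                        return True
                    grid[r][c] = 0  # undo and try the next digit
            return False  # no digit fits this cell, so backtrack
    return True  # no empty cells left


# The recognized grid from sample_1.jpg, with "." written as 0.
puzzle = parse(
    "003020600" "900305001" "001806400"
    "008102900" "700000008" "006708200"
    "002609500" "800203009" "005010300"
)
if solve(puzzle):
    for row in puzzle:
        print(" ".join(str(d) for d in row))
```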

Development

# Setup
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
make solver

# Run tests (skips tests requiring model files)
make test

# Run all tests including model benchmarks
make test-all

# Lint and format
make lint
make format

# Export models
make export-onnx

# Run accuracy benchmarks
make benchmark

Training

# Progressive training: MNIST -> synthetic -> real fine-tuning
python -m sudoku_vision.ml.train_progressive --output-dir models/

# Train empty cell classifier
python -m sudoku_vision.ml.train_empty --data-dir data/real

# Calibrate model confidence (temperature scaling)
python -m sudoku_vision.ml.calibrate --model models/digit_cnn_v3_final.pt
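
Temperature scaling divides the model's logits by a single scalar T before the softmax and chooses T to minimise negative log-likelihood on held-out data. The sketch below illustrates the idea with a grid search in pure Python. It is not the calibrate module: the function names, the candidate range, and the toy logits are assumptions, and practical implementations usually fit T with a gradient-based optimiser instead.

```python
import math


def log_prob(logits, label, temperature):
    """Log-softmax probability of `label` after dividing logits by `temperature`."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scaled))
    return scaled[label] - log_norm


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    total = sum(-log_prob(row, y, temperature) for row, y in zip(logit_rows, labels))
    return total / len(labels)


def fit_temperature(logit_rows, labels, candidates=None):
    """Pick the candidate temperature with the lowest held-out NLL."""
    if candidates is None:
        candidates = [0.5 + 0.05 * i for i in range(71)]  # 0.5 .. 4.0
    return min(candidates, key=lambda t: nll(logit_rows, labels, t))


# Toy held-out set: an overconfident two-class model that is right 3 times out of 4.
rows = [[3.0, 0.0]] * 4
labels = [0, 0, 0, 1]
t = fit_temperature(rows, labels)
print(t, nll(rows, labels, 1.0), nll(rows, labels, t))
```

A fitted T above 1 softens the probabilities, which is the expected direction for an overconfident model.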

Known Limitations

  • Per-digit accuracy on real data is limited by the small labeled dataset (162 samples). Digits 2 and 5 are below 70% accuracy because each has only 5-6 real samples.
  • Empty cell recall (93%) is below the 95% target. The EmptyClassifier would benefit from more diverse empty cell examples.
  • Heavily skewed or low-contrast photos may fail grid detection.
  • The solver assumes a valid sudoku with a unique solution.
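
The unique-solution assumption in the last item can be checked before solving by counting solutions and stopping at two. The sketch below is illustrative Python, not part of the C solver; count_solutions and parse are hypothetical names, and the givens are assumed to be conflict-free already.

```python
def parse(s):
    """Turn an 81-character string of digits into a 9x9 list of ints (0 = empty)."""
    return [[int(ch) for ch in s[r * 9:(r + 1) * 9]] for r in range(9)]


def count_solutions(grid, limit=2):
    """Count solutions by backtracking, stopping early once `limit` is reached.

    The grid is restored to its original contents before returning.
    """

    def allowed(r, c, d):
        if any(grid[r][j] == d for j in range(9)):
            return False
        if any(grid[i][c] == d for i in range(9)):
            return False
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return all(grid[i][j] != d for i in range(br, br + 3) for j in range(bc, bc + 3))

    for r in range(9):
        for c in range(9):
            if grid[r][c] != 0:
                continue
            total = 0
            for d in range(1, 10):
                if allowed(r, c, d):
                    grid[r][c] = d
                    total += count_solutions(grid, limit - total)
                    grid[r][c] = 0  # restore the cell before trying the next digit
                    if total >= limit:
                        break
            return total
    return 1  # grid is full: exactly one completed assignment


# The recognized grid from sample_1.jpg, with "." written as 0.
puzzle = parse(
    "003020600" "900305001" "001806400"
    "008102900" "700000008" "006708200"
    "002609500" "800203009" "005010300"
)
n = count_solutions(puzzle)
print("unique" if n == 1 else "no solution" if n == 0 else "multiple solutions")
```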

License

MIT
