Virchow2 Spatial Transcriptomics Super-Resolution at 2µm

MSE vs Poisson Comparison

PIGR (Epithelial) — Poisson +0.20 PCC

The Problem: Why 2µm Prediction is Hard

Background

Methods like Img2ST-Net predict gene expression from H&E histology at the ~55µm spot resolution of standard Visium. With Visium HD's 2µm bins, we can push to subcellular resolution, but this introduces severe challenges.

The sparsity problem: At 2µm, most bins contain zero UMI counts:

Patient	UMI>0 Fraction	Resolution
P1	3.4%	2µm (128×128 per patch)
P2	5.2%	2µm (128×128 per patch)
P5	6.1%	2µm (128×128 per patch)

Computed from raw count matrices in processed_crc_raw_counts/
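
As a rough illustration of how such a fraction can be computed, here is a minimal NumPy sketch. The `counts` array is a synthetic stand-in rather than the repository's data, and `nonzero_fraction` is a hypothetical helper name, not a function from this codebase.

```python
import numpy as np

def nonzero_fraction(counts: np.ndarray) -> float:
    """Fraction of entries that contain at least one UMI."""
    return float(np.count_nonzero(counts)) / counts.size

# Synthetic stand-in for one 128×128 patch of 2µm bins (NOT real data):
# a low Poisson rate yields mostly-zero counts, similar in spirit to the table above.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=0.05, size=(128, 128))

frac = nonzero_fraction(counts)
print(f"UMI>0 fraction: {frac:.1%}")
```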

What Fails: MSE Loss on Z-Scored Data

The standard approach of normalizing counts to z-scores and training with MSE loss fails at 2µm:

MSE encourages: minimize (prediction - target)²

With ~95% zeros in ground truth:
  → Predicting a near-constant value (≈0 after z-scoring) nearly minimizes the loss
  → Model learns flat, uninformative predictions

This is regression to the mean: a confident peak that lands in the wrong bin is penalized quadratically, so the model hedges by predicting low values everywhere.
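
The incentive toward flat predictions can be reproduced on toy data. The sketch below is an illustration under simplifying assumptions (a synthetic 1-D signal with 5% non-zero bins, not the actual training setup): it compares a flat prediction against a perfectly sharp prediction that is misaligned by a single bin.

```python
import numpy as np

# Synthetic 1-D "expression track": 5% of bins are non-zero (NOT real data).
target = np.zeros(1000)
target[::20] = 1.0                      # 50 non-zero bins, evenly spaced

# Z-score the target, mirroring the MSE setup described above.
z = (target - target.mean()) / target.std()

flat_pred = np.zeros_like(z)            # predict the mean (0 after z-scoring) everywhere
shifted_pred = np.roll(z, 1)            # perfectly sharp peaks, but one bin off

mse_flat = np.mean((flat_pred - z) ** 2)
mse_shifted = np.mean((shifted_pred - z) ** 2)

print(f"flat: {mse_flat:.3f}  shifted: {mse_shifted:.3f}")
```

In this toy setting the flat prediction scores an MSE of about 1, while the sharp but misaligned one scores roughly twice that, so gradient descent on MSE is rewarded for flattening its output.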

The Solution: Poisson Loss on Raw Counts

Spatial transcriptomics produces count data (non-negative integers). The Poisson distribution naturally models count data:

Poisson NLL Loss: L = λ - k × log(λ)   (the constant log(k!) term is dropped)
  where k = observed counts, λ = predicted rate

Why this works:

Model outputs log(λ) (rate parameter)
Predicting λ→0 when k>0 → infinite loss (can't explain observed counts)
Predicting high λ when k=0 → moderate penalty (expected sometimes)

The model is forced to predict high values where counts exist, not hedge with averages.
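
The asymmetry described above can be checked numerically. The following is a minimal NumPy sketch of the loss formula, not the repository's training code; `poisson_nll` is a hypothetical helper name and the example values are made up.

```python
import numpy as np

def poisson_nll(log_rate, k):
    """Per-element Poisson NLL, L = λ - k·log(λ), with λ = exp(log_rate).

    The constant log(k!) term is dropped because it does not depend on the prediction.
    """
    return np.exp(log_rate) - k * log_rate

# Case 1: counts observed (k=5) but the model predicts a tiny rate (λ=0.001).
miss = poisson_nll(np.log(1e-3), 5.0)
# Case 2: no counts observed (k=0) but the model predicts a high rate (λ=5).
false_alarm = poisson_nll(np.log(5.0), 0.0)
# Case 3: the predicted rate matches the observation (λ = k = 5).
match = poisson_nll(np.log(5.0), 5.0)

print(miss, false_alarm, match)
```

Missing real counts is penalized far more heavily than a false alarm of the same size, and the penalty grows without bound as λ→0. In PyTorch, `torch.nn.functional.poisson_nll_loss` with `log_input=True` and `full=False` evaluates the same expression.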

Evidence: Poisson vs MSE (Same Test Patient P5)

Both models trained on P1+P2, tested on P5:

Model	Loss	Data	8µm PCC	4µm PCC	2µm PCC
v6.3b	MSE	z-scored	0.442	0.325	0.193
v7	Poisson	raw counts	0.526	0.461	0.355
		Improvement	1.19×	1.42×	1.84×

Metrics from results/*/best_metrics.json, computed as masked PCC averaged over 50 genes

Poisson loss yields a 1.84× higher mean PCC at 2µm (0.355 vs 0.193), and the gain grows as resolution gets finer (1.19× at 8µm, 1.42× at 4µm).

Visual Comparison

PIGR gene: WSI mosaic of 36 high-signal patches (UMI > 50). MSE produces noisy predictions (PCC=0.09, SSIM=0.04). Poisson captures tissue structure (PCC=0.26, SSIM=0.14). Gray regions = non-tissue mask.

Results

Evaluation Methodology

Metrics: Masked PCC and masked SSIM (only tissue regions, mask coverage ≥92%)
SSIM: Windowed (7×7), computed on normalized 0-1 range within tissue mask
Resolution: Predictions at native 2µm (128×128), coarser via sum-pooling
Test set: Patient P5 (570 patches), trained on P1+P2
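
For readers who want to reproduce this kind of evaluation, here is a minimal NumPy sketch of masked PCC at several resolutions. It is an illustration under assumptions, not the repository's evaluation code: `sum_pool` and `masked_pcc` are hypothetical helper names, the data is synthetic, and keeping only fully covered coarse bins is one possible way to propagate the tissue mask.

```python
import numpy as np

def sum_pool(x: np.ndarray, factor: int) -> np.ndarray:
    """Sum non-overlapping factor×factor blocks of a 2-D map (H, W divisible by factor)."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).sum(axis=(1, 3))

def masked_pcc(pred: np.ndarray, target: np.ndarray, mask: np.ndarray) -> float:
    """Pearson correlation over bins where mask is True; NaN if either side is constant."""
    p, t = pred[mask], target[mask]
    if p.std() == 0 or t.std() == 0:
        return float("nan")
    return float(np.corrcoef(p, t)[0, 1])

# Synthetic data (NOT real): a smooth left-to-right rate gradient and Poisson counts.
rng = np.random.default_rng(0)
rate = np.tile(np.linspace(0.01, 0.3, 128), (128, 1))
target = rng.poisson(rate).astype(float)
mask = np.ones((128, 128), dtype=bool)
mask[:, :8] = False                       # pretend the left strip is non-tissue

pcc = {}
for factor, name in [(1, "2µm"), (2, "4µm"), (4, "8µm")]:
    p, t = sum_pool(rate, factor), sum_pool(target, factor)
    # Assumption: a coarse bin is kept only if every 2µm bin inside it is tissue.
    m = sum_pool(mask.astype(float), factor) == factor ** 2
    pcc[name] = masked_pcc(p, t, m)
    print(name, round(pcc[name], 3))
```

In this synthetic example the correlation rises after pooling because summing neighboring bins averages out Poisson noise.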

WSI-Level Performance (Test Patient P5)

Gene	8µm PCC	8µm SSIM	4µm PCC	4µm SSIM	2µm PCC	2µm SSIM
MT-CYB	0.775	0.652	0.679	0.716	0.501	0.842
MT-CO2	0.775	0.681	0.681	0.726	0.505	0.823
MT-ATP6	0.769	0.709	0.688	0.754	0.527	0.845
MT-CO3	0.768	0.686	0.676	0.755	0.500	0.867
MT-ND4	0.745	0.652	0.660	0.715	0.496	0.830
CEACAM5	0.657	0.688	0.510	0.710	0.318	0.799
PIGR	0.643	0.770	0.532	0.802	0.364	0.857

From results/v7_poisson_testP5_20251221_085015/wsi_figures/visualization_metrics.json

Key insight: SSIM increases at finer resolutions even as PCC falls, suggesting that structural patterns are preserved when bin-level agreement (PCC) becomes harder to achieve.

Aggregate Metrics (50 genes)

Resolution	Mean PCC	Mean SSIM	Coverage
8µm	0.526	0.171	N/A
4µm	0.461	0.290	92.3%
2µm	0.355	0.548	92.1%

From best_metrics.json

The Model Genuinely Super-Resolves

Single patch: 2µm predictions capture gland ring structures that appear blocky and low-resolution at 8µm.

Architecture

H&E Image (224×224 pixels, ~256µm patch)
       ↓
Virchow2 Encoder (frozen, 632M params)
       ↓ [1280-dim patch embeddings]
Hist2ST Decoder (CNN + Transformer + GNN)
       ↓
log(λ) predictions (128×128 × 50 genes)
       ↓ exp()
Expected counts (λ) at 2µm resolution

Training Configuration

Parameter	Value
Loss	Poisson NLL
Multi-scale	2µm → 4µm → 8µm → 16µm (sum-pooling)
Loss balancing	GradNorm (α=1.5, lr=0.025)
Epochs	30 (early stopped at 21, best at epoch 11)
Batch size	8 × 4 gradient accumulation steps (effective batch size 32)
Learning rate	5e-5 with 2-epoch warmup
Optimizer	AdamW

Why Multi-Scale Supervision?

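Training only at 2µm fails even with Poisson loss because the signal is too sparse. We therefore also supervise at coarser scales, building the targets with count-conserving sum-pooling:

import torch.nn.functional as F

# labels_2um: float tensor of raw counts, shape (batch, genes, 128, 128)
# avg_pool2d averages each window; multiplying by the window area turns the
# average into a sum, so total counts are preserved at every scale
labels_4um = F.avg_pool2d(labels_2um, 2, 2) * 4   # 64×64
labels_8um = F.avg_pool2d(labels_2um, 4, 4) * 16  # 32×32
labels_16um = F.avg_pool2d(labels_2um, 8, 8) * 64 # 16×16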
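
To show how the pooled targets could feed a multi-scale loss, here is a NumPy sketch. It rests on an assumption that is not confirmed by this README: that predicted rates are sum-pooled in the same way as the labels (a sum of independent Poisson variables is again Poisson, with the summed rate). The helper names are hypothetical and the data is synthetic.

```python
import numpy as np

def sum_pool(x, factor):
    """Sum non-overlapping factor×factor blocks over the last two axes."""
    *lead, h, w = x.shape
    return x.reshape(*lead, h // factor, factor, w // factor, factor).sum(axis=(-3, -1))

def poisson_nll(rate, k, eps=1e-8):
    """Mean Poisson NLL with the constant log(k!) term dropped."""
    return float(np.mean(rate - k * np.log(rate + eps)))

def multiscale_poisson_losses(log_rate_2um, counts_2um, factors=(1, 2, 4, 8)):
    """One loss per scale; factor 1 is the native 2µm grid, factor 8 is 16µm."""
    rate_2um = np.exp(log_rate_2um)
    return [poisson_nll(sum_pool(rate_2um, f), sum_pool(counts_2um, f)) for f in factors]

rng = np.random.default_rng(0)
counts = rng.poisson(0.05, size=(50, 128, 128)).astype(float)   # synthetic, NOT real data
log_rate = np.full(counts.shape, np.log(0.05))                  # a constant-rate "prediction"

losses = multiscale_poisson_losses(log_rate, counts)
pooled_total = sum_pool(counts, 8).sum()                        # equals counts.sum()
print(losses, pooled_total, counts.sum())
```

The per-scale losses are returned separately so that a balancing scheme can weight them.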

GradNorm dynamically weights losses so coarser (cleaner) signals guide early training.
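
For orientation, a single GradNorm weight update can be sketched as follows. This is a simplified NumPy illustration of the published GradNorm rule, not the repository's implementation: gradient norms are passed in as plain numbers (in real training they come from autograd), the input values are hypothetical, and reading `lr=0.025` as the learning rate of the loss weights is an assumption.

```python
import numpy as np

def gradnorm_step(weights, grad_norms, losses, initial_losses, alpha=1.5, lr=0.025):
    """One simplified GradNorm update of the per-scale loss weights.

    grad_norms[i] is the gradient norm of the *unweighted* loss i with respect
    to the shared parameters.
    """
    weights = np.asarray(weights, dtype=float)
    grad_norms = np.asarray(grad_norms, dtype=float)
    weighted_norms = weights * grad_norms                    # G_i = w_i * ||grad L_i||
    loss_ratio = np.asarray(losses, dtype=float) / np.asarray(initial_losses, dtype=float)
    rel_inverse_rate = loss_ratio / loss_ratio.mean()        # > 1: this scale trains slowly
    target = weighted_norms.mean() * rel_inverse_rate ** alpha   # treated as a constant
    # Gradient of sum_i |G_i - target_i| with respect to w_i.
    grad_w = np.sign(weighted_norms - target) * grad_norms
    new_weights = weights - lr * grad_w
    # Renormalize so the weights sum to the number of scales.
    return new_weights * len(new_weights) / new_weights.sum()

# Hypothetical numbers for the four scales (2µm, 4µm, 8µm, 16µm).
w = gradnorm_step(
    weights=[1.0, 1.0, 1.0, 1.0],
    grad_norms=[1.0, 1.0, 1.0, 1.0],
    losses=[0.9, 0.8, 0.6, 0.5],
    initial_losses=[1.0, 1.0, 1.0, 1.0],
)
print(w)
```

Scales whose loss has dropped the least relative to its starting value have their weight raised, and the others have theirs lowered, while the weights keep a fixed sum.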

Gallery

Predicting WSI 2µm-resolution gene expression from H&E histology using Virchow2 + Poisson Loss

The subcellular resolution improvement is less striking visually at the WSI level, but 2µm achieves the highest SSIM across genes.

MT-ATP6 gene expression: H&E input → Model predictions at 8µm, 4µm, and 2µm resolution vs. ground truth Visium HD data.

Mitochondrial Genes

MT-CYB (Complex III)

MT-CO2 (Complex IV)

MT-CO3 (Complex IV)

MT-ND4 (Complex I)

Epithelial Markers

CEACAM5 (Tumor marker)

PIGR (Secretory epithelium)

Usage

Training

python scripts/train_poisson_v7.py \
    --data_dir /path/to/processed_crc_raw_counts \
    --raw_data_dir /path/to/crc_hd \
    --test_patient P5 \
    --epochs 40 \
    --batch_size 8 \
    --lr 5e-5 \
    --use_gradnorm

Visualization

# WSI multi-scale figures
python scripts/visualize_v7_multiscale.py \
    --model_dir results/v7_poisson_testP5_YYYYMMDD_HHMMSS \
    --genes MT-ATP6 PIGR CEACAM5

# Patch-level comparisons (with PCC and SSIM)
python scripts/visualize_v7_patches.py \
    --model_dir results/v7_poisson_testP5_YYYYMMDD_HHMMSS \
    --genes MT-ATP6 PIGR

Data

Training uses the 10x Genomics Visium HD CRC dataset:

Resolution: 2µm bins (vs ~55µm spots in standard Visium)
Genes: Top 50 by variance
Patients: P1, P2 (train), P5 (test)
Preprocessing: Raw UMI counts, no normalization
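
Gene selection by variance can be sketched as below. This is a hedged illustration only: the README does not say over which bins or patients the variance is computed, so the sketch assumes a single `(n_bins, n_genes)` matrix of raw counts; `top_variance_genes` is a hypothetical helper name and the data is synthetic.

```python
import numpy as np

def top_variance_genes(counts: np.ndarray, gene_names: list[str], n_top: int = 50) -> list[str]:
    """Return the n_top gene names with the highest variance across bins.

    counts has shape (n_bins, n_genes) and holds raw UMI counts.
    """
    variances = counts.var(axis=0)
    order = np.argsort(variances)[::-1][:n_top]     # indices by descending variance
    return [gene_names[i] for i in order]

# Synthetic stand-in: 200 genes with increasing expression rates (NOT real data).
rng = np.random.default_rng(0)
rates = np.linspace(0.01, 2.0, 200)
counts = rng.poisson(rates, size=(5000, 200))
names = [f"gene_{i}" for i in range(200)]

selected = top_variance_genes(counts, names)
print(selected[:5])
```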

Limitations

Single fold: Results are P1+P2 → P5 only; leave-one-patient-out cross-validation (LOOCV) is needed for generalization claims
CRC only: Tested on colorectal cancer; other tissue types may differ
Top 50 genes: High-variance genes selected; rare transcripts not evaluated

License

MIT License
