Add variance-budget and scale-invariant fit training diagnostics by bwengals · Pull Request #40 · pymc-devs/ptgp

bwengals · 2026-06-23T21:58:26Z

Adds training diagnostics that describe how a GP splits the response variance across the mean function, the GP signal, and the observation noise, plus a scale-invariant likelihood-based "excess fit" metric for both VFE and the exact GP.

Motivation

Previously, excess_fit_per_n was sensitive to both the mean and the scale of y. The GP defaults to a zero mean function and y enters the fit uncentered, and the metric referenced sigma**2, so the scale dependence of the log-determinant was not cancelled. The documented claim that it "goes to 0 at the noise floor" was also wrong: it sat at -0.5.
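The -0.5 offset and the scale sensitivity can be reproduced with a plain Gaussian. The sketch below is a standard-library illustration, not ptgp code: it assumes fit_per_n behaves like a per-point Gaussian log-likelihood, and every helper name in it (gaussian_loglik_per_n, excess_fit_sketch, demo) is hypothetical. The "old style" line is likewise an assumed form that adds back only the log term, with no + 0.5.

```python
import math
from statistics import fmean, pvariance


def gaussian_loglik_per_n(resid, var):
    """Mean log-density of the residuals under a zero-mean N(0, var)."""
    return fmean(
        -0.5 * math.log(2 * math.pi * var) - 0.5 * r * r / var for r in resid
    )


def excess_fit_sketch(fit_per_n, resid):
    """Add back the constant-mean Gaussian baseline: 0.5*log(2*pi*Var) + 0.5."""
    s2 = pvariance(resid)  # plays the role of Var(y - m(X))
    return fit_per_n + 0.5 * math.log(2 * math.pi * s2) + 0.5


def demo(y, model_var_fraction=1.0):
    """Score y with a Gaussian whose variance is a fixed fraction of Var(y)."""
    mu = fmean(y)
    resid = [yi - mu for yi in y]  # constant mean function, m(X) = mean(y)
    s2 = pvariance(resid)
    fit = gaussian_loglik_per_n(resid, model_var_fraction * s2)
    return fit, excess_fit_sketch(fit, resid), s2


y = [1.0, 2.5, 0.3, 4.1, 3.3]
y_rescaled = [1000.0 * yi + 50.0 for yi in y]  # same data, new units and offset

fit, excess, s2 = demo(y)
# Assumed old-style reading: only the log term is added back.
old_style = fit + 0.5 * math.log(2 * math.pi * s2)

# A deliberately misspecified fit, scored before and after rescaling y.
fit_a, excess_a, _ = demo(y, model_var_fraction=0.5)
fit_b, excess_b, _ = demo(y_rescaled, model_var_fraction=0.5)
```

Under these assumptions, excess is 0 up to rounding and old_style is -0.5, while excess_a and excess_b agree even though fit_a and fit_b differ by roughly log(1000).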

What's added

variance_budget(gp, X, y) (model-agnostic): decomposes the response variance into mean / GP-signal / noise via the law of total variance, Var(y) = Var(m(X)) + mean(diag(K)) + mean(sigma(X)**2). Returns the three contributions, their fractions (which sum to 1), and var_ratio = total / Var(y) for calibration, where total is the sum of the three contributions. Invariant to the mean and scale of y; works for any mean function, composed (sum) kernels, and scalar or heteroskedastic sigma.
vfe_diagnostics: gains the four budget fields, and excess_fit_per_n is redefined to fit_per_n + 0.5*log(2*pi*Var(y - m(X))) + 0.5. Referencing the residual variance instead of sigma**2 cancels the log-determinant's scale term, so the metric is scale-invariant and reads 0 against a constant-mean Gaussian.
unapproximated_diagnostics(gp, X, y) (new): the exact-GP analogue, built on marginal_log_likelihood. Reports mll/fit/logdet, per-point fit and complexity, the same scale-invariant excess_fit_per_n, and the variance budget.
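The decomposition behind variance_budget can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the ptgp implementation: the function variance_budget_sketch, its argument list (precomputed m(X), diag(K), and sigma rather than a gp object), and the dictionary keys are all hypothetical.

```python
from statistics import fmean, pvariance


def variance_budget_sketch(mean_vals, kernel_diag, sigma, y):
    """Split the modelled variance into mean / GP-signal / noise parts.

    mean_vals   : m(X) evaluated at the N training inputs
    kernel_diag : diag(K), the prior GP variance at each input
    sigma       : noise standard deviation, scalar or length-N sequence
    y           : observed responses, used only for var_ratio
    """
    n = len(y)
    if isinstance(sigma, (int, float)):
        sigma_vec = [float(sigma)] * n  # broadcast a scalar to length N
    else:
        sigma_vec = [float(s) for s in sigma]

    mean_var = pvariance(mean_vals)              # Var(m(X))
    gp_var = fmean(kernel_diag)                  # mean(diag(K))
    noise_var = fmean(s * s for s in sigma_vec)  # mean(sigma(X)**2)
    total = mean_var + gp_var + noise_var

    return {
        "mean_var": mean_var,
        "gp_var": gp_var,
        "noise_var": noise_var,
        "mean_frac": mean_var / total,
        "gp_frac": gp_var / total,
        "noise_frac": noise_var / total,
        "var_ratio": total / pvariance(y),  # near 1 suggests a calibrated total
    }


# Toy inputs: a linear mean function and a sum kernel (stationary + linear),
# so diag(K) varies with x.
x = [0.0, 1.0, 2.0, 3.0]
mean_vals = [1.0 + 0.5 * xi for xi in x]
kernel_diag = [1.0 + 0.1 * xi * xi for xi in x]
y = [0.8, 2.9, 1.7, 4.2]

budget = variance_budget_sketch(mean_vals, kernel_diag, 0.5, y)

# Rescale and shift y, and rescale the model consistently with it.
c, shift = 10.0, 3.0
budget_rescaled = variance_budget_sketch(
    [c * m + shift for m in mean_vals],
    [c * c * k for k in kernel_diag],
    c * 0.5,
    [c * yi + shift for yi in y],
)
```

In this sketch the three fractions and var_ratio are identical in budget and budget_rescaled, which is the invariance to the mean and scale of y described above, provided the fitted model rescales along with the data.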

All new fields flow through compile_scipy_diagnostics, tracked_minimize, and to_idata automatically via the namedtuple plumbing.

Heteroskedastic sigma

Every metric accepts sigma as either a scalar or an X-dependent length-N vector: sigma is first broadcast to length N (sigma * ones(N)), and the noise term is then mean(sigma**2).
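A minimal sketch of that broadcast-then-average step, in plain Python. The helper noise_variance is hypothetical and only mirrors the description above; it is not a ptgp function.

```python
from statistics import fmean


def noise_variance(sigma, n):
    """Average noise variance for a scalar or length-n sigma."""
    if isinstance(sigma, (int, float)):
        sigma_vec = [float(sigma)] * n  # scalar case: sigma * ones(N)
    else:
        sigma_vec = [float(s) for s in sigma]
        if len(sigma_vec) != n:
            raise ValueError("sigma must be a scalar or have length n")
    return fmean(s * s for s in sigma_vec)  # mean(sigma**2)


homoskedastic = noise_variance(0.5, 4)
as_vector = noise_variance([0.5, 0.5, 0.5, 0.5], 4)
heteroskedastic = noise_variance([0.1, 0.2, 0.3, 0.4], 4)
```

The scalar and constant-vector calls give the same value. In the heteroskedastic case the result is the mean of the squared values, which is not the same as squaring the mean of sigma.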

Test plan

pytest tests/: 275 passed
pre-commit run --all-files: clean (runs on every commit)
scripts/run_mypy.py: 37/37 pass

📚 Documentation preview 📚: https://ptgp--40.org.readthedocs.build/en/40/

bwengals added 4 commits June 23, 2026 14:31

Add variance_budget: mean/GP/noise variance decomposition

d2d1b7c

Make excess_fit_per_n scale-invariant; add budget to vfe_diagnostics

a33d0d5

Add unapproximated_diagnostics for the exact GP

3714a9d

Test variance-budget fields reach idata optimizer_result

c0f36d1

jessegrabowski approved these changes Jun 23, 2026

bwengals wants to merge 4 commits into main from gp-variance-diagnostics
