Make sure GP model are stored as CPU-only models in MLflow by RemiLehe · Pull Request #420 · BLAST-AI-ML/synapse

RemiLehe · 2026-04-08T00:44:25Z

Summary

GP models trained on GPU failed to load on CPU-only machines (e.g. the dashboard's synapse-gui environment) with the error: "Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False"
Root cause: during calibration, posterior() sets a prediction_strategy on each sub-GP holding cached CUDA tensors (Cholesky factors). These are plain Python attributes, so gpytorch's _apply() does not move them when .cpu() is called, and they get serialized as CUDA tensors into MLflow.
Fix: clear prediction_strategy on each sub-GP before model.cpu() in build_lume_model. NN/ensemble_NN are unaffected as they have no such cache.
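
The fix described above can be sketched as follows. This is a minimal illustration, not the actual change to build_lume_model: the helper name clear_prediction_caches is hypothetical, and it assumes the sub-GPs are exposed through a .models attribute (as in a model-list GP). It is written against duck-typed objects, so the stand-ins below replace real GP models and no GPU is needed to run it.

```python
from types import SimpleNamespace


def clear_prediction_caches(model):
    """Reset prediction_strategy on a GP, or on each sub-GP of a model list.

    Returns the number of caches that were cleared.
    Assumption: sub-GPs, if any, are reachable through `model.models`.
    """
    sub_models = getattr(model, "models", None)
    targets = list(sub_models) if sub_models is not None else [model]
    cleared = 0
    for gp in targets:
        # The cache is a plain Python attribute, so it must be reset by hand;
        # moving the model to another device does not touch it.
        if getattr(gp, "prediction_strategy", None) is not None:
            gp.prediction_strategy = None
            cleared += 1
    return cleared


# Stand-in objects (hypothetical): one sub-GP with a populated cache,
# one sub-GP that was never used for prediction.
fake_model = SimpleNamespace(
    models=[
        SimpleNamespace(prediction_strategy=object()),
        SimpleNamespace(prediction_strategy=None),
    ]
)

n_cleared = clear_prediction_caches(fake_model)
remaining = [m.prediction_strategy for m in fake_model.models]
```

With a real model, the intended order would be to call such a helper first, then model.cpu(), and only then log the model to MLflow, so that no cached CUDA tensors are left to be serialized.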

During calibration, posterior() sets a prediction_strategy on each sub-GP that holds cached CUDA tensors (Cholesky factors). These are plain Python attributes, so gpytorch's _apply() does not move them when .cpu() is called, causing deserialization to fail on machines without CUDA. Fix by clearing prediction_strategy before .cpu().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-authored-by: Remi Lehe <remi.lehe@normalesup.org>

RemiLehe changed the title ~~Make sure GP model are stored as CPU-only models in MLflow~~ [WIP] Make sure GP model are stored as CPU-only models in MLflow Apr 8, 2026

RemiLehe commented May 2, 2026

Comment thread ml/train_model.py Outdated

Apply suggestions from code review

df222af

Co-authored-by: Remi Lehe <remi.lehe@normalesup.org>

RemiLehe changed the title ~~[WIP] Make sure GP model are stored as CPU-only models in MLflow~~ Make sure GP model are stored as CPU-only models in MLflow May 2, 2026

RemiLehe wants to merge 2 commits into BLAST-AI-ML:main from RemiLehe:fix-gp-cuda-serialization
