[1843][DRAFT] Visualize Denoised Latents in Diffusion Engine #1845

Closed
moritzhauschulz wants to merge 244 commits into ecmwf:develop from moritzhauschulz:mh/develop/1843_viz_denoised_image

@moritzhauschulz
Contributor

Description

This is a DRAFT PR that collects the changes made so far in an attempt to visualize denoised images with the existing inference and evaluation pipelines. The changes currently still cause errors.

Issue Number

Relates to #1843

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hedgedoc in the github issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

sophie-xhonneux and others added 30 commits October 30, 2025 17:27
Implemented Identity class

TODO: implement EMATeacher
The big question on the EMA teacher side to me is how to allow for a
flexible teacher and student architecture that can differ

We updated some APIs of the abstract base class to allow the ema_model
forward, subject to change given the loss calculator, which is imho the
second big question mark
Easier to read and as batchsize gets more complicated in SSL this will
be a useful abstraction
It runs so far. Next steps:
 - Route all the config options
 - Start writing the loss functions to understand the state requirements
…andom and healpix masking. Open issues with _coords_local, centroids and probably other things.
TODO:
- Forecast still needs to be adapted
- Some more cleanup of variable naming, return values etc
Jubeku and others added 16 commits January 14, 2026 16:09
* initial commit [draft]

* adapt noise conditioner to make it closer to DiT

* adapt dimensionalities – code runs with default config

* lint

* fix: add conditional prediction mode handling

This commit resolves architectural incompatibilities when integrating
diffusion-based forecast engines:

1. FSDP Sharding: DiffusionForecastEngine wraps ForecastingEngine
   as `self.net`, but trainer code assumed direct `fe_blocks` access. Fixed by:
   - Adding fe_diffusion_model conditional check in init_model_and_shard()
   - Routing to model.forecast_engine.net.fe_blocks for diffusion mode

2. Model Initialization: Reordered ForecastingEngine creation to handle both
   standard and diffusion-wrapped variants with proper fallback.

3. Target Format Handling: Autoencoder mode uses different target
   structure than diffusion mode. Added conditional formatting:
   - Diffusion: targets = {"targets": [targets], "aux_outputs": aux}
   - Autoencoder: targets = {"physical": batch[0]}

4. Config Updates: added file config/diffusion_config.yml for diffusion
   model config

* added forecast engine argument

* removed unnecessary logging

* reverting back to the previous config

* replaced getattr by get

* modification of forecasting engine initialization

---------

Co-authored-by: moritzhauschulz <moritz.hauschulz@gmail.com>
Co-authored-by: Matthias Karlbauer <matthias.karlbauer@ecmwf.int>
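
The conditional handling described in the "fix: add conditional prediction mode handling" commit above can be illustrated with a minimal sketch. This is not the code from the PR: the function names `select_fe_blocks` and `format_targets`, the dict-like config `cf`, and the stand-in objects are hypothetical. Only the `fe_diffusion_model` flag, the `forecast_engine.net.fe_blocks` path, and the two target layouts are taken from the commit message. The standard-mode path `forecast_engine.fe_blocks` is an assumption based on the remark that the trainer expected direct `fe_blocks` access.

```python
from types import SimpleNamespace


def select_fe_blocks(model, cf):
    """Pick the forecast-engine blocks that the trainer should shard."""
    if cf.get("fe_diffusion_model", False):
        # Diffusion mode: the wrapper holds the ForecastingEngine as `net`.
        return model.forecast_engine.net.fe_blocks
    # Standard mode (assumed layout): blocks sit directly on the engine.
    return model.forecast_engine.fe_blocks


def format_targets(batch, targets, aux, cf):
    """Build the target dict in the layout each mode expects."""
    if cf.get("fe_diffusion_model", False):
        return {"targets": [targets], "aux_outputs": aux}
    return {"physical": batch[0]}


# Stand-in objects, for illustration only (not real model classes).
plain = SimpleNamespace(forecast_engine=SimpleNamespace(fe_blocks=["b0", "b1"]))
wrapped = SimpleNamespace(
    forecast_engine=SimpleNamespace(net=SimpleNamespace(fe_blocks=["b0", "b1"]))
)

print(select_fe_blocks(wrapped, {"fe_diffusion_model": True}))
print(format_targets(["x"], "t", {"k": 1}, {}))
```

Keeping both branches behind a single config lookup means the trainer does not need to know whether the engine is wrapped. The actual PR may structure this differently.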
@moritzhauschulz moritzhauschulz marked this pull request as draft February 17, 2026 08:25
@moritzhauschulz moritzhauschulz changed the base branch from develop to mk/develop/1300_assemble_diffusion_model February 17, 2026 08:34
@moritzhauschulz moritzhauschulz changed the base branch from mk/develop/1300_assemble_diffusion_model to develop February 18, 2026 10:35
@github-actions github-actions bot added the model Related to model training or definition (not generic infra) label Feb 19, 2026
@moritzhauschulz moritzhauschulz changed the title [1843][DRAFT] Visualize Denoised Images [1843][DRAFT] Visualize Denoised Latents in Diffusion Engine Feb 22, 2026

Labels

model Related to model training or definition (not generic infra)

Projects

Status: Done

9 participants