
HealDA layers and DiT patches for PNM Core#1371

Merged
pzharrington merged 26 commits into NVIDIA:main from aayushg55:healda-core
Feb 11, 2026

Conversation

Contributor

@aayushg55 aayushg55 commented Feb 4, 2026

PhysicsNeMo Pull Request

Description

Adds the necessary layers and patches the existing DiT to enable HealDA integration into PhysicsNeMo.

DiT

  • Makes the DiT conditioning module for timestep and condition modular, allowing custom conditioning_embedder modules. The existing method embedded the timestep and condition separately and then added the two (following the original DiT implementation), as opposed to the EDM/SongUNet-style approach, where the two are embedded jointly.
  • Adds DropPath to the DiT
  • Adds a final_dropout toggle to the Mlp
  • Adds options for qk normalization in DiT attention with timm and TE (Note: TE does not support qk_norm_affine unlike timm)
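The difference between the two conditioning styles can be sketched in a few lines. This is a minimal illustration, not the PhysicsNeMo API; the names, dimensions, and the linear joint projection are assumptions for clarity:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # embedding dimension (illustrative)

t_emb = rng.standard_normal(D)  # timestep embedding
c_emb = rng.standard_normal(D)  # condition embedding

# Original DiT-style conditioning: embed timestep and condition
# separately, then sum the two embeddings.
dit_style = t_emb + c_emb

# EDM/SongUNet-style conditioning: embed the two jointly, sketched
# here as concatenation followed by a shared linear projection.
W = rng.standard_normal((D, 2 * D))
edm_style = W @ np.concatenate([t_emb, c_emb])

assert dit_style.shape == edm_style.shape == (D,)
```

A pluggable conditioning_embedder lets a model choose either behavior without changing the DiT block itself.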

HealDA

  • Scatter aggregation method for tokenizing sparse data onto a dense grid
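Scatter aggregation of this kind can be sketched with a scatter-add followed by normalization. The grid size, indices, and values below are illustrative, and the mean reduction is one plausible choice of aggregation:

```python
import numpy as np

# Sparse observations, each mapped to a flat index of a dense grid.
grid_size = 6
idx = np.array([0, 2, 2, 5])           # target grid cell per observation
vals = np.array([1.0, 2.0, 3.0, 4.0])  # observation values

# Scatter-add values into the dense grid; duplicate indices accumulate.
dense = np.zeros(grid_size)
np.add.at(dense, idx, vals)

# Count observations per cell to turn sums into per-cell means.
counts = np.zeros(grid_size)
np.add.at(counts, idx, 1.0)
mean = np.where(counts > 0, dense / np.maximum(counts, 1), 0.0)
```

Cell 2 receives two observations (2.0 and 3.0), so its aggregated mean is 2.5, while empty cells stay at 0.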

Breaking Changes/Bug fixes:

  • TE MultiHeadAttention defaults to the sequence-first sbhd qkv_format, matching PyTorch. This matches neither the PNM documentation of TESelfAttention, which suggests batch-first (B, L, D), nor the timm default of batch-first, so the qkv_format argument is now exposed and its default changed to bshd (batch-first) for consistency.
  • TE MultiHeadAttention does not support projection dropout, so the corresponding dropout layer was previously skipped, making the timm and TE backends incompatible. A dropout layer is now applied after the attn_op for consistency. Breaking change for any model trained with the TE backend and proj_drop_rate > 0.
  • Previously in the DiTBlock, the dropout param of the Mlp was hardcoded to 0, so mlp_drop_rate was ignored. It is now propagated correctly. Breaking change for any model trained with mlp_drop_rate > 0.
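The sbhd/bshd distinction is purely a memory-layout convention for the attention inputs. A small shape-only sketch (dimensions are illustrative):

```python
import numpy as np

S, B, H, D = 4, 2, 3, 8  # sequence length, batch, heads, head dim

# sbhd: sequence-first layout (the TE default before this change).
q_sbhd = np.zeros((S, B, H, D))

# bshd: batch-first layout (the new default, consistent with timm
# and the batch-first (B, L, D) convention in the PNM docs).
q_bshd = q_sbhd.transpose(1, 0, 2, 3)

assert q_bshd.shape == (B, S, H, D)
```

Feeding batch-first tensors to a backend expecting sequence-first silently mixes batch and sequence axes, which is why pinning the default matters.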

Checklist

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score. This score reflects the AI's assessment of merge readiness and is not a qualitative judgment of your work, nor an indication that the PR will be accepted or rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

@NickGeneva NickGeneva self-requested a review February 4, 2026 23:02
@aayushg55 aayushg55 marked this pull request as ready for review February 5, 2026 18:41
@greptile-apps
Contributor

greptile-apps Bot commented Feb 5, 2026

Greptile Overview

Greptile Summary

This PR adds HealDA integration layers and patches to the DiT model for PhysicsNeMo Core. The changes include new HEALPix-based tokenizer/detokenizer modules, modular conditioning embedders, and several enhancements to the DiT architecture.

Key Changes:

  • New HealDA modules: HPXPatchTokenizer, HPXPatchDetokenizer, CalendarEmbedding, FrequencyEmbedding, and ScatterAggregator
  • Modular conditioning system: replaced hardcoded timestep/condition embedding with pluggable ConditioningEmbedderBase classes (DiTConditionEmbedder, EDMConditionEmbedder, ZeroConditioningEmbedder)
  • Added DropPath support to DiT blocks with linear scheduling
  • Added QK normalization options (RMSNorm, LayerNorm) for attention modules
  • Added final_dropout toggle to Mlp module
  • Fixed bug where mlp_drop_rate was ignored in DiTBlock (previously hardcoded to 0)
  • Fixed bug where TE MultiHeadAttention projection dropout was not applied
  • Changed TE qkv_format default from sbhd to bshd for batch-first consistency
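The linear drop-path (stochastic depth) schedule mentioned above is commonly computed as rates evenly spaced from 0 at the first block to a configured maximum at the last, as in timm. A sketch with illustrative values (note a later commit in this PR passes an explicit list of rates instead of hardcoding this schedule):

```python
import numpy as np

depth = 6             # number of DiT blocks (illustrative)
drop_path_rate = 0.1  # maximum stochastic-depth rate at the last block

# Linear schedule: shallow blocks are almost never dropped,
# deep blocks are dropped with probability up to drop_path_rate.
rates = np.linspace(0.0, drop_path_rate, depth)
```

Each block then applies DropPath with its own rate, so regularization strength grows with depth.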

Breaking Changes (as noted in PR description):

  • TE backend now defaults to bshd format instead of sbhd
  • TE backend now correctly applies projection dropout (affects models trained with proj_drop_rate > 0)
  • DiTBlock now correctly propagates mlp_drop_rate (affects models trained with mlp_drop_rate > 0)

Issues Found:

  • QK norm implementation for LayerNorm doesn't pass the qk_norm_affine parameter (line 220 in layers.py)
  • pos_embed initialization assumes it's always a Parameter, but it can be scalar 0 when pos_embed != "learnable" (line 827 in layers.py)
  • Empty ValueError message in CalendarEmbedding (line 104-105 in embedding.py)

Important Files Changed

  • physicsnemo/experimental/models/dit/layers.py: Added new conditioning embedders (DiT, EDM, Zero), qk_norm support, DropPath, PerSampleDropout; fixed TE proj_drop and mlp_drop bugs; changed TE qkv_format default
  • physicsnemo/experimental/models/healda/embedding.py: New FrequencyEmbedding and CalendarEmbedding modules for time-based embeddings

Contributor

@greptile-apps greptile-apps Bot left a comment


2 files reviewed, 4 comments


@greptile-apps
Contributor

greptile-apps Bot commented Feb 5, 2026

Additional Comments (1)

physicsnemo/experimental/models/dit/layers.py
The code attempts to initialize the pos_embed parameter without checking whether it exists; when pos_embed != "learnable", self.pos_embed is set to the scalar 0 at line 817, so this line will error:

        if isinstance(self.pos_embed, nn.Parameter):
            nn.init.normal_(self.pos_embed, std=0.02)

aayushg55 and others added 3 commits February 5, 2026 10:51
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
@pzharrington pzharrington self-requested a review February 6, 2026 18:43
@NickGeneva
Collaborator

Moved the HPX layers into a module folder; this PR can now be updated to use the embedding layers there.

#1377

@aayushg55
Contributor Author

Thanks @NickGeneva, updated to use the new hpx embedding layers.

@NickGeneva
Collaborator

NickGeneva commented Feb 9, 2026

Overall I think this looks good; the only big pending item left IMO is verifying that stormscope checkpoints remain operational. Other than that, the changes are reasonable extensions that don't have major API breaks.

@pzharrington
Collaborator

/blossom-ci

@NickGeneva
Collaborator

/blossom-ci

@NickGeneva
Collaborator

/blossom-ci

@NickGeneva
Collaborator

/blossom-ci

@NickGeneva
Collaborator

/blossom-ci

@pzharrington pzharrington added this pull request to the merge queue Feb 11, 2026
Merged via the queue into NVIDIA:main with commit 37b3e5c Feb 11, 2026
4 checks passed
@aayushg55 aayushg55 deleted the healda-core branch February 11, 2026 01:16
nbren12 pushed a commit to nbren12/modulus that referenced this pull request Mar 24, 2026
* Add HealDA layers and DiT patches

* renamed the conditioning_embedder for clarity

* delete duplicate embedding module

* update docs

* updated test

* update license

* Propogate qk_norm_affine to layerNorm in timm Attention

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

* make ValueError more descriptive

* fix tokenizer pos_embed weight init

* remove hpx layer additions and import from new hpx module

* move conditioning embedder to separate file

* pass list of drop path rates instead of hardcoded linear schedule

* changed default of ditBlock kwargs from None to {}

* change ConditioningEmbedder configuration to enum, cleanup EDMConditionEmbedder, pass timestep_embed_kwargs

* cleanup cond embedder tests

* testing checkpoint compat

* add docstring to mlp

* Test fix

* Update

* Test fix

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Peter Harrington <pharrington@nvidia.com>
Co-authored-by: Nicholas Geneva <5533524+NickGeneva@users.noreply.github.com>
Co-authored-by: Nicholas Geneva <ngeneva@nvidia.com>