feat(dp, pt): add force l2 norm loss & mae loss#5294

Open
iProzd wants to merge 4 commits into deepmodeling:master from iProzd:0305_l2norm

Conversation

iProzd (Collaborator) commented Mar 5, 2026

Summary by CodeRabbit

  • New Features

    • Added use_mae_loss to switch energy/force/virial losses between MAE and MSE.
    • Added f_use_norm to enable vector‑norm based force/virial loss computation.
    • Validation preventing invalid flag combinations (f_use_norm requires MAE or Huber).
  • Tests

    • Updated loss tests to cover MAE and force‑norm modes; adjusted test skipping logic.
  • Documentation

    • Exposed and documented new loss configuration options.
  • Chores

    • Persisted new flags in configuration serialization.

Copilot AI review requested due to automatic review settings March 5, 2026 16:48
@dosubot dosubot bot added the new feature label Mar 5, 2026

coderabbitai bot commented Mar 5, 2026

📝 Walkthrough

Adds two flags to the energy loss code: use_mae_loss (replacing use_l1_all) to switch between MAE and L2/Huber pathways for energy, force, and virial losses, and f_use_norm to enable a vector-norm MAE for forces. Also adds validation, updates the loss branches, extends test parameterization, and updates serialization across backends.
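For readers unfamiliar with the distinction the walkthrough draws, the loss formulations discussed in this PR can be sketched as standalone NumPy functions. These are illustrative helpers, not the actual deepmd-kit API; only the flag names (use_mae_loss, f_use_norm, huber_delta) come from the PR:

```python
import numpy as np

def force_mse(pred: np.ndarray, label: np.ndarray) -> float:
    # Component-wise mean squared error (the default L2 pathway).
    return float(np.mean((pred - label) ** 2))

def force_mae(pred: np.ndarray, label: np.ndarray) -> float:
    # Component-wise mean absolute error (what use_mae_loss switches to).
    return float(np.mean(np.abs(pred - label)))

def force_norm_mae(pred: np.ndarray, label: np.ndarray) -> float:
    # Vector-norm MAE (what f_use_norm enables): take the Euclidean norm
    # of each per-atom force-error vector, then average over atoms.
    diff = pred - label  # shape (natoms, 3)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))

def force_huber(pred: np.ndarray, label: np.ndarray, delta: float = 0.01) -> float:
    # Huber loss: quadratic while |error| <= delta, linear beyond it.
    err = np.abs(pred - label)
    quad = 0.5 * err**2
    lin = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quad, lin)))
```

Note the practical difference: for a single atom with force error (3, 4, 0), the component-wise MAE averages |3|, |4|, |0| to 7/3, while the vector-norm MAE reports the magnitude of the error vector, 5.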

Changes

Cohort / File(s) Summary
Core DPModel energy loss
deepmd/dpmodel/loss/ener.py
Adds use_mae_loss and f_use_norm to constructor and serialization. Enforces f_use_norm only when use_huber or use_mae_loss true. Refactors energy/force/virial loss branches to support MAE vs L2/Huber and norm-based force MAE; updates reported metrics.
PyTorch energy loss
deepmd/pt/loss/ener.py
Replaces use_l1_all with use_mae_loss; adds f_use_norm. Implements component-wise MAE or vector-norm MAE for forces/virial when enabled, integrates with Huber branch, updates display metrics and serialization.
Paddle energy loss
deepmd/pd/loss/ener.py
Adds f_use_norm and use_mae_loss handling (backwards-compatible alias for use_l1_all), stores self.f_use_norm, and serializes both fields.
TensorFlow energy loss
deepmd/tf/loss/ener.py
Extends EnerStdLoss init signature with use_mae_loss and f_use_norm, stores them on instance, and includes them in serialization.
Argument validation & docs
deepmd/utils/argcheck.py
Adds documentation and command/config argument entries for use_mae_loss (alias use_l1_all) and f_use_norm for loss_ener.
Consistent loss tests
source/tests/consistent/loss/test_ener.py
Adds skip_tf and skip_pd properties, extends test parameterization to include use_mae_loss and f_use_norm, enforces validation (requires huber or mae for f_use_norm), and aggregates mae_ metrics in results.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • wanghan-iapcm
  • njzjz
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 41.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main changes: addition of force L2 norm loss and MAE loss functionality across multiple loss modules (dp, pt, pd, tf).


coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt/loss/ener.py`:
- Around line 56-60: The constructor added new state flags (use_mae_loss,
f_use_norm, use_huber, huber_delta) but serialize() does not emit them, so
deserialization resets behavior; update the class's serialize() to include these
fields (use_mae_loss, f_use_norm, use_huber, huber_delta) in the serialized
dict/object and also ensure the corresponding deserialization/load path reads
them back into the instance (same names as the constructor args) so the loss
behavior is preserved after reload; reference the serialize() method and the
constructor parameter names when making the change.

In `@source/tests/consistent/loss/test_ener.py`:
- Around line 88-95: In skip_tf and skip_pd properties replace the full tuple
unpacking that creates unused locals with wildcard placeholders so Ruff won't
flag RUF059; locate the properties skip_tf and skip_pd and change the unpacking
from "(use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm) =
self.param" to only bind the used elements (e.g. "(_, _, use_mae_loss,
f_use_norm) = self.param" or similar) so only use_mae_loss and f_use_norm are
named.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4125c703-fe2c-4e34-b371-8b1aa866ee9a

📥 Commits

Reviewing files that changed from the base of the PR and between 4a29836 and beebc5b.

📒 Files selected for processing (4)
  • deepmd/dpmodel/loss/ener.py
  • deepmd/pt/loss/ener.py
  • deepmd/utils/argcheck.py
  • source/tests/consistent/loss/test_ener.py

Comment on lines +56 to 60
use_mae_loss: bool = False,
inference: bool = False,
use_huber: bool = False,
f_use_norm: bool = False,
huber_delta: float = 0.01,

⚠️ Potential issue | 🟠 Major

Persist new flags in serialize() to prevent silent behavior changes after reload.

use_mae_loss and f_use_norm are now constructor state, but they are not emitted in serialize(). After deserialization, both revert to defaults, changing loss behavior unexpectedly.

💡 Suggested patch
 def serialize(self) -> dict:
@@
         return {
             "@class": "EnergyLoss",
             "@version": 2,
@@
             "use_huber": self.use_huber,
             "huber_delta": self.huber_delta,
+            "use_mae_loss": self.use_mae_loss,
+            "f_use_norm": self.f_use_norm,
         }

Also applies to: 147-154
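The failure mode the reviewer describes can be demonstrated with a minimal serialize/deserialize round trip. TinyLoss is a hypothetical stand-in for illustration, not the real EnergyLoss class:

```python
class TinyLoss:
    """Minimal stand-in illustrating the serialization bug, not the real class."""

    def __init__(self, use_mae_loss: bool = False) -> None:
        self.use_mae_loss = use_mae_loss

    def serialize(self) -> dict:
        # Bug: use_mae_loss is omitted here, so the flag silently
        # reverts to its default after a save/load round trip.
        return {"@class": "TinyLoss"}

    @classmethod
    def deserialize(cls, data: dict) -> "TinyLoss":
        data = dict(data)
        data.pop("@class")
        return cls(**data)

loss = TinyLoss(use_mae_loss=True)
restored = TinyLoss.deserialize(loss.serialize())
# restored.use_mae_loss is now False: the loss behavior changed on reload
# without any error, which is exactly what the suggested patch prevents.
```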


Comment on lines +88 to +95
(use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm) = self.param
# Skip TF for MAE loss tests (not implemented in TF backend)
return CommonTest.skip_tf or use_mae_loss or f_use_norm

@property
def skip_pd(self) -> bool:
(use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm) = self.param
# Skip Paddle for MAE loss tests (not implemented in Paddle backend)

⚠️ Potential issue | 🟡 Minor

Fix unused unpacked variables to satisfy Ruff (RUF059).

A few tuple-unpacked values are never used in skip/setup paths.

✂️ Suggested cleanup
-        (use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm) = self.param
+        (_, _, use_mae_loss, f_use_norm) = self.param
@@
-        (use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm) = self.param
+        (_, _, use_mae_loss, f_use_norm) = self.param
@@
-        (use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm) = self.param
+        (use_huber, _enable_atom_ener_coeff, use_mae_loss, f_use_norm) = self.param

As per the coding guidelines for **/*.py: always run `ruff check .` and `ruff format .` before committing changes, or CI will fail.

Also applies to: 111-111
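As an aside, the RUF059 fix pattern from the suggested cleanup can be shown in isolation (the tuple values here are illustrative):

```python
# (use_huber, enable_atom_ener_coeff, use_mae_loss, f_use_norm)
param = (False, True, True, False)

# Binding every element names values that are never read, which Ruff flags
# as RUF059; underscore placeholders keep the unpacking explicit while
# only naming what the property actually uses.
_, _, use_mae_loss, f_use_norm = param
```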

🧰 Tools
🪛 Ruff (0.15.2)

[warning] 88-88, 94-94: Unpacked variables use_huber and enable_atom_ener_coeff are never used; prefix them with an underscore or another dummy-variable pattern (RUF059)


Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@github-actions github-actions bot added the Python label Mar 5, 2026

codecov bot commented Mar 6, 2026

Codecov Report

❌ Patch coverage is 97.33333% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.32%. Comparing base (73bb1b7) to head (945d423).

Files with missing lines Patch % Lines
deepmd/dpmodel/loss/ener.py 97.56% 1 Missing ⚠️
deepmd/pt/loss/ener.py 96.29% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5294      +/-   ##
==========================================
+ Coverage   82.31%   82.32%   +0.01%     
==========================================
  Files         768      768              
  Lines       77056    77097      +41     
  Branches     3660     3659       -1     
==========================================
+ Hits        63428    63472      +44     
+ Misses      12458    12453       -5     
- Partials     1170     1172       +2     


coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pd/loss/ener.py`:
- Around line 60-61: The code stores the f_use_norm flag but silently ignores it
in the force-loss computation; add a fail-fast check for f_use_norm in the
EnerLoss class (constructor or at start of forward()) so that if f_use_norm is
True you raise a clear exception (e.g. NotImplementedError or ValueError)
explaining vector-norm force loss is not implemented, instead of silently
proceeding; update all places that set or pass f_use_norm (the constructor
signature and where it is stored) to ensure the flag triggers this check, and
ensure the error message references f_use_norm and suggests implementing
vector-norm force loss or setting f_use_norm=False.

In `@deepmd/tf/loss/ener.py`:
- Around line 152-155: The TF backend currently stores use_mae_loss and
f_use_norm on self but build() never uses them, so accept-but-ignore can mislead
users; update the class constructor (where self.use_huber, self.huber_delta,
self.use_mae_loss, self.f_use_norm are set) to validate and explicitly reject
unsupported flags: if use_mae_loss or f_use_norm is True, raise a ValueError (or
TypeError) with a clear message referencing the TF backend and the unsupported
options; mention the symbols use_mae_loss, f_use_norm and the build() method in
the error/message so callers know these flags are not implemented in the TF
implementation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 62d5f40b-7bc8-4e83-846a-cc5d6e489ca0

📥 Commits

Reviewing files that changed from the base of the PR and between a4d97a1 and 945d423.

📒 Files selected for processing (2)
  • deepmd/pd/loss/ener.py
  • deepmd/tf/loss/ener.py

Comment on lines +60 to +61
f_use_norm: bool = False,
use_mae_loss: bool | None = None,

⚠️ Potential issue | 🟠 Major

Fail fast when f_use_norm=True in the PD backend.

Line 158 stores f_use_norm, but forward() still computes force loss component-wise and never switches to a vector-norm formulation. Right now f_use_norm=True is silently ignored.

Proposed fix
         self.use_l1_all = use_l1_all
         self.inference = inference
         self.use_huber = use_huber
         self.huber_delta = huber_delta
         self.f_use_norm = f_use_norm
+        if self.f_use_norm:
+            raise RuntimeError("f_use_norm is not implemented in the PD backend.")
         if self.use_huber and (
             self.has_pf or self.has_gf or self.relative_f is not None
         ):

Also applies to: 114-119, 158-158


Comment on lines 152 to +155
self.use_huber = use_huber
self.huber_delta = huber_delta
self.use_mae_loss = use_mae_loss
self.f_use_norm = f_use_norm

⚠️ Potential issue | 🟠 Major

Reject unsupported MAE / force-norm flags in the TF backend.

Lines 154-155 persist use_mae_loss and f_use_norm, but build() never reads either flag. Passing either option will silently fall back to the existing L2/Huber behavior, which makes the config lie about the loss being optimized.

Proposed fix
         self.use_huber = use_huber
         self.huber_delta = huber_delta
         self.use_mae_loss = use_mae_loss
         self.f_use_norm = f_use_norm
+        if self.use_mae_loss or self.f_use_norm:
+            raise RuntimeError(
+                "use_mae_loss and f_use_norm are not implemented in the TF backend."
+            )
         if self.use_huber and (
             self.has_pf or self.has_gf or self.relative_f is not None
         ):
