Numerically stable sigmoid function #436

Open
lukelowry wants to merge 3 commits into develop from lukel/sigmoid-dev

Collaborator

@lukelowry lukelowry commented Jun 5, 2026

Description

Updates CommonMath smooth step function behavior and documentation, and removes a stale expected-failure marker for the New England contingency-analysis example now that it passes.

Proposed changes

  • Replace the logistic sigmoid with the stable tanh form.
  • Define the sigmoid scale MU in one place and use it consistently in sigmoid, ramp, and dsigmoid.
  • Correct dsigmoid to include the sigmoid scale factor.
  • Update CommonMath.md equations for the tanh sigmoid and smooth ramp form.
  • Remove WILL_FAIL TRUE from newengland_ca, since ContingencyAnalysis now completes successfully.
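The changes listed above can be sketched roughly as follows. This is a minimal illustration, not the GridKit source: the constant names `MU`, `HALF`, and `ONE` mirror identifiers that appear in the PR, but their definitions here, the example value of `MU`, and the use of plain `RealT` arguments (instead of `ScalarTraits`) are assumptions made to keep the sketch self-contained.

```cpp
#include <cmath>

// Hypothetical stand-ins for the CommonMath constants (assumed definitions).
template <typename RealT>
constexpr RealT MU = static_cast<RealT>(240.0); // sigmoid scale, example value only

template <typename RealT>
constexpr RealT HALF = static_cast<RealT>(0.5);

template <typename RealT>
constexpr RealT ONE = static_cast<RealT>(1.0);

// sigma(x) = (1 + tanh(mu * x / 2)) / 2.
// tanh saturates at +/-1, so no intermediate value can overflow.
template <typename RealT>
RealT sigmoid(RealT x)
{
    return HALF<RealT> * (ONE<RealT> + std::tanh(HALF<RealT> * MU<RealT> * x));
}

// rho(x) = (x + |x|) / 2 + ln(1 + exp(-mu * |x|)) / mu.
// The exponent is never positive, so exp() cannot overflow;
// log1p keeps precision when exp() is tiny.
template <typename RealT>
RealT ramp(RealT x)
{
    const RealT ax = std::abs(x);
    return HALF<RealT> * (x + ax) + std::log1p(std::exp(-MU<RealT> * ax)) / MU<RealT>;
}

// d(sigma)/dx = mu * sigma * (1 - sigma), using the same MU as sigmoid().
template <typename RealT>
RealT dsigmoid(RealT x)
{
    const RealT s = sigmoid(x);
    return MU<RealT> * s * (ONE<RealT> - s);
}
```

Defining `MU` once and reading it in all three functions is what keeps the sigmoid, its derivative, and the ramp consistent with each other when the scale is changed.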

Checklist

  • All tests pass for the focused coverage listed below.
  • Code compiles cleanly with flags -Wall -Wpedantic -Wconversion -Wextra.
  • The new code follows GridKit™ style guidelines.
  • There are unit tests for the new code.
  • The new code is documented.
  • The feature branch is rebased with respect to the target branch.

Changelog changes N/A

Further comments

The ContingencyAnalysis fix was completely unexpected. I needed to fix this function for my feature branch of the REECB implementation to run correctly. I was happy to see that this fixed another issue.

@PhilipFackler How often did you encounter this when implementing ContingencyAnalysis? I am curious because it would be useful for related commentary in the paper.

I'd also like to note that this likely means we can fix or remove the inconsistent scaling that exists in IEEET1, TGOV1, and some other models. The failures were originally thought to be a solver issue, but they were a NaN/inf issue. Before this change I was unable to make MU much larger than 240; now I can make it very large, so the piecewise approximations can be made very sharp, which is helpful for validation purposes.
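The thread does not say exactly where the NaN/inf values arose. One plausible mechanism, offered here only as an assumption, is the derivative of the logistic form evaluated through `exp`: in double precision `exp` overflows to infinity once its argument exceeds roughly 709.78, and the chain-rule expression then becomes inf/inf. The function names below are hypothetical and are not part of GridKit.

```cpp
#include <cmath>

// Derivative of 1 / (1 + exp(-mu * x)) written out via the chain rule,
// as a differentiation tool might evaluate it (an assumption for illustration).
inline double dlogistic_naive(double x, double mu)
{
    const double e = std::exp(-mu * x);        // +inf once -mu * x exceeds ~709.78
    return mu * e / ((1.0 + e) * (1.0 + e));   // inf / inf evaluates to NaN
}

// The same derivative from the tanh form: (mu / 4) * (1 - tanh^2(mu * x / 2)).
inline double dlogistic_tanh(double x, double mu)
{
    const double t = std::tanh(0.5 * mu * x);  // always within [-1, 1]
    return 0.25 * mu * (1.0 - t * t);
}
```

With `mu = 240` and `x = -3`, the naive form computes `exp(720)`, which overflows, and the result is NaN; the tanh form returns 0 for the same inputs. Near the transition (for example `x = 0.01`) the two agree to rounding error.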

@lukelowry lukelowry requested review from PhilipFackler and pelesh June 5, 2026 12:26
@lukelowry lukelowry added the bug Something isn't working label Jun 5, 2026
@lukelowry lukelowry force-pushed the lukel/sigmoid-dev branch from 17bb4ac to 7bc160b Compare June 5, 2026 18:40
Collaborator

@nkoukpaizan nkoukpaizan left a comment

A few minor comments.

The sensitivity of tanh to mu differs from that of exp (and of the abs forms), so it makes sense that NaNs occur at different values of mu.

@lukelowry please make sure to document (plot) the different forms we are exploring somewhere (PR, technical document or paper) for future reference.

Comment thread GridKit/CommonMath.hpp
{
namespace Math
{
template <typename RealT>

Add some documentation here for what MU is and how it's later used, since the name is not self-explanatory.

Comment thread GridKit/CommonMath.hpp
{
using RealT = typename GridKit::ScalarTraits<ScalarT>::RealT;
- return FOUR<RealT> * sigmoid(x) * (ONE<RealT> - sigmoid(x));
+ return MU<RealT> * sigmoid(x) * (ONE<RealT> - sigmoid(x));

The factor 4 was there to get a unit magnitude (the true derivative times $4/\mu$). Since dsigmoid is no longer used, I recommend removing the function altogether.
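For reference, the relation between the two scalings can be worked out from the sigmoid definition in CommonMath.md. The derivation below is a sketch assuming $\mu > 0$; the logistic and tanh forms are the same function, so it applies to both.

```latex
\begin{align*}
\sigma(x) &= \frac{1}{1+e^{-\mu x}}
           = \frac{1}{2}\left(1+\tanh\left(\frac{\mu x}{2}\right)\right), \\
\frac{d\sigma}{dx} &= \mu\,\sigma(x)\,\bigl(1-\sigma(x)\bigr), \qquad
\max_x \frac{d\sigma}{dx} = \left.\frac{d\sigma}{dx}\right|_{x=0} = \frac{\mu}{4}, \\
4\,\sigma(x)\,\bigl(1-\sigma(x)\bigr) &= \frac{4}{\mu}\,\frac{d\sigma}{dx} \in (0,\,1].
\end{align*}
```

So the version with the factor 4 is a bump normalized to peak at 1, while the version with the factor $\mu$ is the true derivative, which peaks at $\mu/4$.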

Comment thread GridKit/CommonMath.md
- \sigma(x) &= \dfrac{1}{1+e^{-\mu x}} \\
- \rho(x) &= \dfrac{(\mu x+\lvert\mu x\rvert)/2+\log(1+e^{-\lvert\mu x\rvert})}{\mu} \\
+ \sigma(x) &= \dfrac{1}{2}\left(1+\tanh\left(\dfrac{\mu x}{2}\right)\right) \\
+ \rho(x) &= \dfrac{x+\lvert x\rvert}{2}+\dfrac{\ln(1+e^{-\mu\lvert x\rvert})}{\mu} \\

Missed this in a previous merge, but the explanation of the "softplus" form for $\rho$ is no longer in the documentation. I'm noticing it now because of the tanh versus exp difference. Could $\rho$ also be written with the same elemental functions as $\sigma$?
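One possible way to express $\rho$ through $\sigma$ follows from the identity $1+e^{-\mu\lvert x\rvert} = 1/\sigma(\lvert x\rvert)$. This is only a sketch assuming $\mu > 0$; whether this form is numerically preferable to the exponential one is not addressed in the thread.

```latex
\begin{align*}
1+e^{-\mu\lvert x\rvert} &= \frac{1}{\sigma(\lvert x\rvert)}
  = \frac{2}{1+\tanh\left(\mu\lvert x\rvert/2\right)}, \\
\rho(x) &= \frac{x+\lvert x\rvert}{2} - \frac{1}{\mu}\ln\sigma(\lvert x\rvert) \\
        &= \frac{x+\lvert x\rvert}{2}
         + \frac{1}{\mu}\left[\ln 2 - \ln\left(1+\tanh\left(\frac{\mu\lvert x\rvert}{2}\right)\right)\right].
\end{align*}
```

Because $\sigma(\lvert x\rvert)$ lies in $[1/2, 1)$, the logarithm stays within $[-\ln 2, 0)$ for any $x$. In exact arithmetic this equals the softplus $\ln(1+e^{\mu x})/\mu$, whose derivative is $\sigma(x)$.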
