Numerically stable sigmoid function #436
17bb4ac to
7bc160b
nkoukpaizan
left a comment
A few minor comments.
The sensitivity of tanh to mu is different from that of exp (and of the abs forms), so it makes sense that NaNs occur at different values of mu.
@lukelowry please make sure to document (plot) the different forms we are exploring somewhere (PR, technical document or paper) for future reference.
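To make the NaN remark concrete, here is a minimal sketch, not the GridKit implementation. It assumes the NaNs arise when a derivative of the exp form is evaluated as a ratio of overflowing exponentials, which this thread does not confirm. The function names and the explicit mu parameter are hypothetical.

```cpp
#include <cmath>

// Hypothetical helpers for illustration only; not GridKit's API.

// Derivative of 1 / (1 + exp(-mu * x)), written as a ratio of exponentials.
// For large negative mu * x, exp(-mu * x) overflows to +inf and the ratio
// becomes inf / inf, which is NaN.
double dsigmoidExpForm(double x, double mu)
{
    const double e = std::exp(-mu * x);
    return mu * e / ((1.0 + e) * (1.0 + e));
}

// Derivative of 0.5 * (1 + tanh(mu * x / 2)), i.e. (mu / 4) * (1 - tanh^2).
// tanh saturates at +/-1 instead of overflowing, so the result stays finite.
double dsigmoidTanhForm(double x, double mu)
{
    const double t = std::tanh(0.5 * mu * x);
    return 0.25 * mu * (1.0 - t * t);
}
```

Both functions agree near x = 0; they differ only once the exponential leaves the representable range, which is why the failure point depends on mu and on the form used.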
| { | ||
| namespace Math | ||
| { | ||
| template <typename RealT> |
Add some documentation here for what MU is and how it's later used, since the name is not self-explanatory.
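As a rough sketch of what such documentation could look like: the namespace, the wording of the comment, and the value 240 below are all placeholders chosen for illustration, not what the PR defines.

```cpp
namespace Example // hypothetical namespace, not GridKit::Math
{
    /// Sharpness parameter of the smooth approximations, in units of 1/x.
    ///
    /// sigmoid(x) = 0.5 * (1 + tanh(MU * x / 2)) moves from 0 to 1 over a
    /// width of roughly 1/MU around x = 0, so a larger MU gives a sharper
    /// step and a ramp closer to max(x, 0).
    ///
    /// The value 240 is a placeholder for this sketch only.
    template <typename RealT>
    constexpr RealT MU = static_cast<RealT>(240);
} // namespace Example
```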
| { | ||
| using RealT = typename GridKit::ScalarTraits<ScalarT>::RealT; | ||
| return FOUR<RealT> * sigmoid(x) * (ONE<RealT> - sigmoid(x)); | ||
| return MU<RealT> * sigmoid(x) * (ONE<RealT> - sigmoid(x)); |
The factor of 4 was there to get a unit magnitude (the true derivative times a constant scale factor). dsigmoid is no longer used; I recommend removing the function altogether.
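The two scalings under discussion can be compared in a short sketch. The function names and the explicit mu argument are hypothetical; the sigmoid uses the tanh form from the updated equations.

```cpp
#include <cmath>

// Hypothetical free functions for illustration; mu is passed explicitly.
double sigmoid(double x, double mu)
{
    return 0.5 * (1.0 + std::tanh(0.5 * mu * x));
}

// Unit-peak variant: sigmoid * (1 - sigmoid) peaks at 1/4 at x = 0,
// so the factor 4 normalizes the peak to 1.
double dsigmoidUnitPeak(double x, double mu)
{
    const double s = sigmoid(x, mu);
    return 4.0 * s * (1.0 - s);
}

// True derivative d(sigmoid)/dx = mu * sigmoid * (1 - sigmoid),
// which peaks at mu / 4 at x = 0.
double dsigmoidTrue(double x, double mu)
{
    const double s = sigmoid(x, mu);
    return mu * s * (1.0 - s);
}
```

The two differ only by the constant factor 4 / mu, so which one is appropriate depends on whether callers expect a normalized bump or the actual slope.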
| \sigma(x) &= \dfrac{1}{1+e^{-\mu x}} \\ | ||
| \rho(x) &= \dfrac{(\mu x+\lvert\mu x\rvert)/2+\log(1+e^{-\lvert\mu x\rvert})}{\mu} \\ | ||
| \sigma(x) &= \dfrac{1}{2}\left(1+\tanh\left(\dfrac{\mu x}{2}\right)\right) \\ | ||
| \rho(x) &= \dfrac{x+\lvert x\rvert}{2}+\dfrac{\ln(1+e^{-\mu\lvert x\rvert})}{\mu} \\ |
Missed this in a previous merge, but the explanation of the "softplus" form for tanh versus exp . Could
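For reference, a minimal sketch of why the rewritten ramp is better behaved than a naive softplus. The function names are hypothetical, mu is passed explicitly, and the use of std::log1p is a choice of this sketch rather than something the equations prescribe.

```cpp
#include <cmath>

// Naive softplus: log(1 + exp(mu * x)) / mu.
// exp overflows to +inf once mu * x exceeds roughly 709 in double precision.
double rampNaive(double x, double mu)
{
    return std::log(1.0 + std::exp(mu * x)) / mu;
}

// Form from the updated equation:
//   (x + |x|) / 2 + ln(1 + exp(-mu * |x|)) / mu
// The exponent is never positive, so exp stays in [0, 1] and cannot overflow.
double rampStable(double x, double mu)
{
    const double ax = std::fabs(x);
    return 0.5 * (x + ax) + std::log1p(std::exp(-mu * ax)) / mu;
}
```

For mu > 0 the two are algebraically identical; the stable form simply splits off max(x, 0) before taking the logarithm.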
Description
Updates CommonMath smooth step function behavior and documentation, and removes a stale expected-failure marker for the New England contingency-analysis example now that it passes.

Proposed changes
- Use the tanh form of the sigmoid.
- Define MU in one place and use it consistently in sigmoid, ramp, and dsigmoid.
- Update dsigmoid to include the sigmoid scale factor.
- Update the CommonMath.md equations for the tanh sigmoid and smooth ramp form.
- Remove WILL_FAIL TRUE from newengland_ca, since ContingencyAnalysis now completes successfully.

Checklist
- Compiler flags: -Wall -Wpedantic -Wconversion -Wextra.
- Changelog changes: N/A
Further comments
The ContingencyAnalysis fix was completely unexpected. I needed to fix this function for my feature branch of the REECB implementation to run correctly. I was happy to see that this fixed another issue.

@PhilipFackler How often did you encounter this when implementing ContingencyAnalysis? I am curious; that would be useful for related commentary in the paper.

I'd also like to note here that this likely means we can fix or remove the inconsistent scaling that exists in IEEET1, TGOV1, and some other models, I think. The failures were originally thought to be a solver issue, but it was a NaN/inf issue. Before this change I was unable to make MU much larger than 240, but now I can increase it to be very large, so I can make the piecewise approximations very sharp, which is helpful for validation purposes.
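As a small illustration of the sharpness point, the sketch below evaluates the tanh-form sigmoid with a very large sharpness value. The function name, the explicit mu argument, and the value 1e6 are assumptions made for this example only.

```cpp
#include <cmath>

// Hypothetical helper: tanh-form sigmoid with explicit sharpness mu.
// tanh saturates at +/-1 for large arguments, so no intermediate overflows.
double sigmoidTanh(double x, double mu)
{
    return 0.5 * (1.0 + std::tanh(0.5 * mu * x));
}
```

With mu = 1e6 the function is already indistinguishable from a unit step at |x| = 1e-3 in double precision, while still returning 0.5 at x = 0.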