Fix convergence issue in power_upper_limit and improve robustness in #942 #954
Conversation
@Omiii-215 thanks for your PR. I understand the issue. However, I would like to avoid regressions, so we should make sure that the code still works where it already worked. Also, I would like to have a clear example of the code failing for one or more power values. My suggestion is to:
@matteobachetti Regarding the existing test: my fix correctly identifies this case as having an upper limit of 0.0. The previous result (0.058) was likely due to the numerical instability of the minimizer in this range, so I believe the new result is actually the correct one.
I'll update the PR with tests once you confirm the failures in issue #956.
@Omiii-215 I don't see why the upper limit should be 0. It's an upper limit, i.e. the highest signal power expected to produce the measured power, so it should not be 0. Or am I missing something?
@matteobachetti I agree that, by assumption, an upper limit shouldn't be zero; it usually implies "we can't rule out a signal up to X". However, in this specific case the measured power falls below what pure noise alone would exceed at the given confidence level, so zero is the correct answer. Regarding the "systematic" part: it wasn't random instability, but a systematic artifact. The old nonzero value (0.058) came from the minimizer's restrictive bounds and poor initial guess, not from the statistics.
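A minimal numeric sketch of the noise-threshold argument, using scipy only (the n and c values here are arbitrary, and the chi-squared form assumes the same normalization used in stingray.stats, where a sum of n noise powers follows chi2 with 2n degrees of freedom): a measured power below the (1 - c) noise quantile cannot support a nonzero signal.

```python
import numpy as np
from scipy import stats

# Noise-only powers (summed over n harmonics/bins) follow a chi-squared
# distribution with 2n degrees of freedom in this normalization.
n, c = 1, 0.95
plow = stats.chi2(2 * n).ppf(1 - c)  # (1 - c) quantile of pure noise

# For chi2 with 2 dof this quantile has a closed form: -2 ln(c)
print(plow)  # any measured power below this is consistent with zero signal
```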
@Omiii-215 ok, thanks, I see and agree. A few remaining nitpicks. I propose the following as a robust test of functionality:

def ncx2_ppf(xs, n=1, c=0.99, summed_flag=True):
if isinstance(xs, Iterable):
return [ncx2_ppf(x, n, c, summed_flag) for x in xs]
if not summed_flag:
xs = xs * n
rv = stats.ncx2(2 * n, xs)
val = rv.ppf(1 - c)
if not summed_flag:
val = val / n
return val
(...)
@pytest.mark.parametrize("n", [1, 10, 100, 1000])
@pytest.mark.parametrize("c", [0.50, 0.95, 0.99])
@pytest.mark.parametrize("summed_flag", [True, False])
def test_power_upper_limit_roundtrip(self, n, c, summed_flag):
# our pmeas should be the lower limit of a certain preal. Let us go backwards, and simulate preal.
preal = 10 ** np.random.uniform(-5, 3, 100)
# and create the pmeas from there
pmeas = ncx2_ppf(preal, n=n, c=c, summed_flag=summed_flag)
# Now calculate the upper limit, which should coincide with preal.
uplim = [power_upper_limit(p, c=c, n=n, summed_flag=summed_flag) for p in pmeas]
assert np.allclose(uplim, preal, rtol=0.1)
@pytest.mark.parametrize("n", [1, 100])
@pytest.mark.parametrize("c", [0.50, 0.99])
@pytest.mark.parametrize("summed_flag", [True, False])
def test_power_upper_limit_low_val(self, n, c, summed_flag):
# consistent with power==0 if the measured power is below the given confidence limit
# for no power
maxval = ncx2_ppf(0, n=n, c=c, summed_flag=summed_flag)
vals = np.random.uniform(0, maxval, 10)
for val in vals:
assert np.isclose(power_upper_limit(val, c=c, n=n, summed_flag=summed_flag), 0.0)

As for the function itself, I would also simplify it a bit. I don't think we should set an upper limit on the maximum power and return nan; at most, we could raise an exception if the root finding fails. I suggest revising your function along the following lines, keeping your main ideas of using brentq instead of minimize and of returning an upper limit of 0 below the noise threshold, but changing little of the rest:

def power_upper_limit(pmeas, n=1, c=0.95, summed_flag=True):
"""Upper limit on signal power, given a measured power in the PDS/Z search.
Adapted from Vaughan et al. 1994, noting that, after appropriate
normalization of the spectral stats, the distribution of powers in the PDS
and the Z^2_n searches is always described by a noncentral chi squared
distribution.
Note that Vaughan+94 gives p(pmeas | preal), while we are interested in
p(real | pmeas), which is not described by the NCX2 stat. Rather than
integrating the CDF of this probability distribution, we start from a
reasonable approximation and fit to find the preal that gives pmeas as
a (e.g. 95%) confidence limit.
As Vaughan+94 shows, this power is always larger than the observed one.
This is because we are looking for the maximum signal power that,
combined with noise powers, would give the observed power. This involves
the possibility that noise powers partially cancel out some signal power.
Parameters
----------
pmeas: float
The measured value of power
Other Parameters
----------------
n: int
The number of summed powers to obtain pmeas. It can be multiple
harmonics of the PDS, adjacent bins in a PDS summed to collect all the
power in a QPO, the n in Z^2_n or the number of averaged PDS
c: float
The confidence value for the probability (e.g. 0.95 = 95%)
summed_flag: bool
If True, pmeas is the sum of n powers. If False, pmeas is the average
of n powers. This is relevant when dealing with averaged PDS, where
the powers are averaged rather than summed. For example, Z^2_n searches
deal with summed powers (i.e. summed_flag=True), while if power spectrum
is averaged to improve the statistics the summed_flag should be set to False.
Returns
-------
psig: float
The signal power that could produce P>pmeas with 1 - c probability
Examples
--------
>>> pup = power_upper_limit(40, 1, 0.99)
>>> assert np.isclose(pup, 75, atol=2)
"""
def ppf(x):
rv = stats.ncx2(2 * n, x)
return rv.ppf(1 - c)
def isf(x):
rv = stats.ncx2(2 * n, x)
return rv.ppf(c)
if summed_flag:
pow = pmeas
else:
pow = pmeas * n
def func_to_minimize(x):
return ppf(x) - pow
rv = stats.chi2(2 * n)
plow = rv.ppf(1 - c)
if pow < plow:
# Any power below plow is consistent with noise,
# so the upper limit on the signal power is 0.
return 0.0
initial = isf(pow)
sol = brentq(func_to_minimize, 0, initial * 2)
if summed_flag:
return sol
else:
return sol / n

The function now robustly gives the correct result where the old function failed in multiple places:

from collections.abc import Iterable
from stingray.stats import power_upper_limit
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
def ppf(xs, n=1, c=0.99, summed_flag=True):
if isinstance(xs, Iterable):
return [ppf(x, n, c, summed_flag) for x in xs]
if not summed_flag:
xs = xs * n
rv = stats.ncx2(2 * n, xs)
val = rv.ppf(1 - c)
if not summed_flag:
val = val / n
return val
n=100
c=0.99
preal = 10**np.random.uniform(-5, 3, 100)
pl = ppf(preal, n=n, c=c)
pl_plus = np.concatenate((np.random.uniform(np.min(pl) / 1000, np.min(pl), 20), pl))
uplim = [power_upper_limit(p, c=c, n=n) for p in pl_plus]
plt.scatter(preal, pl, label="forward (measured vs real)", s=40)
plt.scatter(uplim, pl_plus, label="back (measured vs upper lim)", s=20)
plt.legend()
plt.title(f"Roundtrip of power upper limit (n={n}, c={c})")
plt.show()
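As a side note, the reason brentq is safe here can be checked numerically: the (1 - c) quantile of the noncentral chi-squared grows monotonically with the noncentrality parameter, so the bracketed root of func_to_minimize is unique. A quick sketch (the n, c, and grid values are arbitrary; the lower bound starts slightly above 0 to avoid the nc=0 edge case in older scipy versions):

```python
import numpy as np
from scipy import stats

n, c = 10, 0.99
# Quantile of ncx2(2n, x) as a function of the noncentrality parameter x
xs = np.linspace(0.1, 200, 50)
qs = np.array([stats.ncx2(2 * n, x).ppf(1 - c) for x in xs])

# Strictly increasing => at most one root in any bracket
assert np.all(np.diff(qs) > 0)
print(qs[0], qs[-1])
```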
Updated.
The changelog file should be named
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main     #954       +/-   ##
===========================================
- Coverage   96.14%   80.31%   -15.83%
===========================================
  Files          48       48
  Lines        9899     9893        -6
===========================================
- Hits         9517     7946     -1571
- Misses        382     1947     +1565
matteobachetti
left a comment
Thanks @Omiii-215 , for finding the issue, correcting it... and patiently explaining why it was an issue.
Please fix the small remaining failures (run black, rename the changelog, etc.), then I'll merge.
Thank you @matteobachetti. All the tests should pass now.

power_upper_limit failed to converge for low power values or summed powers because minimize used restrictive bounds and poor initial guesses. This PR replaces the minimize approach (Nelder-Mead) with robust root-finding (brentq), which exploits the problem's monotonic nature for guaranteed convergence. New tests in test_stats.py specifically target low-power inputs that previously caused failures.

Resolves #956
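For readers landing here from the issue, the approach can be sketched self-contained with scipy only, so it runs without stingray (the function name power_upper_limit_sketch is illustrative and handles only summed powers; see stingray.stats for the actual implementation):

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq

def power_upper_limit_sketch(pmeas, n=1, c=0.95):
    """Upper limit on signal power for a summed measured power pmeas (sketch)."""
    # Powers below the pure-noise (1 - c) quantile are consistent with
    # zero signal, so the upper limit is 0 by construction.
    if pmeas < stats.chi2(2 * n).ppf(1 - c):
        return 0.0

    def func(x):
        # Zero where a signal of power x yields pmeas as its (1 - c) quantile
        return stats.ncx2(2 * n, x).ppf(1 - c) - pmeas

    # Upper bracket: the c quantile of ncx2 centered on pmeas, doubled.
    # The tiny positive lower bracket avoids the nc=0 edge case in older scipy.
    initial = stats.ncx2(2 * n, pmeas).ppf(c)
    return brentq(func, 1e-12, 2 * initial)

# Roundtrip check: if pmeas is the (1 - c) quantile for a true power of 50,
# the recovered upper limit should come back as 50.
pmeas = stats.ncx2(2, 50).ppf(0.01)
print(power_upper_limit_sketch(pmeas, n=1, c=0.99))
```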