Feature/differential sensitivity#253
Conversation
|
The PR should be working, but it needs some testing. Besides me, I assigned @adesai90 so that he can check that the code is now producing the differential sensitivity that he's willing to compute. Maybe we could use Figure 5 of this paper to cross-check that the values we get are approximately correct, although that paper used only 7 years of data, whereas we have 10 (partially reprocessed) years in the public data release. |
|
Thinking about this a bit more... The energy range should be a property of the signal generator that we can change at run time without re-instantiating the analysis object. The public data implementation would naturally support an architecture in which the energy_range is treated as a kwarg that is passed at runtime. However, the function that calculates the factor to convert a number of signal events into a flux is a property of the analysis object, which doesn't use the signal generator. What we should do:
Note that the signal generator is created when the |
|
@adesai90, the conversion from ns to flux should be fixed now! There was some code restructuring involved. So here's a summary of how it works now: You pass the energy range to the Internally, this sets the signal generator's which correctly generates the signal events: So there is no The returns now different values: I also added 2 utility methods that are just wrappers around
|
Thanks, Just as an update. I tested it with 8 years of data and dec=0 at index of 2.0 to get the following result This can be compared to Figure 4 from https://arxiv.org/pdf/1811.07979 (attached here) |
I think the issue here is that you should not multiply your flux norm at 1TeV by By multiplying the flux norm by that factor, you are introducing an energy dependence on a If you wanna transform your flux norm from 1/GeV/cm^2/s/sr to TeV/cm^2/s, you can simply multiply your flux norm by 1e3. This automatically also corresponds to a flux norm at 1TeV times E^2, because you are implicitly also multiplying by (1TeV/1TeV)^2. And because your spectrum is now flat, the flux norm is the same also in the tested energy bin. Then, hopefully, your differential sensitivity will really look similar to Rene's! |
There was a problem hiding this comment.
Pull request overview
This PR adds support for injecting / evaluating signal within a user-defined true-neutrino energy range (GeV), primarily targeting differential sensitivity workflows (issue #249), and refactors flux normalization conversion to use a signal-generator-provided scaling factor.
Changes:
- Introduces a shared
energy_rangecapability and propagates it through analysis + public-data signal generation components. - Adds energy-range correction factors for public-data signal generators and enforces consistent configuration across datasets when computing flux scaling.
- Refactors sensitivity/discovery utilities to use
calculate_fluxmodel_scaling_factor()as a per-event scaling factor and apply* mean_ns.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
tests/core/test_signal_generator.py |
Adds tests for dataset-to-dataset energy_range consistency and kwarg policy around MultiDatasetSignalGenerator. |
skyllh/core/signal_generation.py |
Adds HasEnergyRange capability interface and clarifies energy-range units (GeV) for signal generation methods. |
skyllh/core/analysis.py |
Adds Analysis.energy_range property + propagation to energy-range-capable components; updates flux scaling conversion API and adds mu2flux/flux2mu. |
skyllh/core/signal_generator.py |
Adds fluxmodel_scaling_factor() to multi-dataset generators and integrates optional per-dataset/source correction factors. |
skyllh/i3/signal_generation.py |
Documentation clarification that energy_range is in true neutrino energy (GeV). |
skyllh/i3/utils/analysis.py |
Updates sensitivity/discovery curve utilities to compute flux scaling via calculate_fluxmodel_scaling_factor() * mean_ns. |
skyllh/analyses/i3/publicdata_ps/time_integrated_ps.py |
Adds energy_range parameter to analysis factory and applies it to the constructed analysis. |
skyllh/analyses/i3/publicdata_ps/time_integrated_ps_function_energy_spectrum.py |
Updates flux/ns conversion helpers to use the new scaling-factor API. |
skyllh/analyses/i3/publicdata_ps/signal_generator.py |
Adds energy_range property to the public-data dataset signal generator; rebuilds source-dependent structures and computes energy-range correction factors. |
Comments suppressed due to low confidence (1)
skyllh/core/signal_generator.py:389
MultiDatasetSignalGenerator.generate_signal_eventsno longer accepts thesrc_detsigyield_weights_servicekeyword that is part of theSignalGenerator.generate_signal_eventsinterface (it used to be swallowed by**kwargs). This makes the subclass signature incompatible with the base interface and breaks callers that pass that kwarg. Consider addingsrc_detsigyield_weights_service=Noneto the signature (even if unused) or otherwise maintaining signature compatibility while still rejecting unsupported kwargs likeenergy_range.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # ref_N is the expected number of detected signal events for the reference flux normalization. | ||
| ref_N = np.sum(a_jk_eff) | ||
|
|
||
| if per_source: | ||
| a_k = np.sum(a_jk_eff, axis=0) | ||
| return a_k / (ref_N**2) | ||
|
|
||
| scaling_factor = 1.0 / ref_N | ||
|
|
There was a problem hiding this comment.
ref_N = np.sum(a_jk_eff) can become 0 (e.g. if an energy_range cut yields zero expected events), which will cause a division by zero in both the global and per-source branches. Please add a guard for ref_N <= 0 and raise a clear ValueError indicating that the configured signal hypothesis/energy_range produces zero detectable signal.
| logger.warning( | ||
| 'The fluxmodel_scaling_factor method of the MCMultiDatasetSignalGenerator class is currently only ' | ||
| 'implemented for the case of a single source or for multiple sources with the same flux model.' | ||
| 'Multiple sources withdifferent flux models are currently not correctly handled.' | ||
| ) |
There was a problem hiding this comment.
Spelling/formatting in the warning message: "withdifferent" is missing a space, and there should be a space between sentences after "flux model.".
| # Convert the energy boundaries to the closest SM bin edges. | ||
| idx0 = self.sm.get_log10_true_e_idx(np.log10(r[0])) | ||
| idx1 = self.sm.get_log10_true_e_idx(np.log10(r[1])) | ||
| r_log10 = (self.sm.true_e_bin_edges[idx0], self.sm.true_e_bin_edges[idx1]) | ||
|
|
||
| if r_log10[0] >= r_log10[1]: | ||
| raise ValueError( | ||
| 'The first element of the energy_range sequence must be strictly smaller than the second element!' | ||
| ) |
There was a problem hiding this comment.
The bin-alignment of energy_range appears to compute the upper edge incorrectly: get_log10_true_e_idx returns a bin index, but true_e_bin_edges[idx1] is the lower edge of that bin. This makes ranges collapse when both bounds fall in the same bin (raising erroneously) and prevents selecting up to the last bin edge. Consider using true_e_bin_edges[idx1 + 1] (with clipping) for the upper boundary so the aligned range fully covers the requested E_max.
There was a problem hiding this comment.
implemented in ad090bf
It makes sense to me as we also do this:
if self._energy_range_log10 is None:
return (self.sm.true_e_bin_edges[0], self.sm.true_e_bin_edges[-1])
@chiarabellenghi please take a look.
I guess neither implementation passes in case the energy_range is outside true_e_bin_edges as the get_log10_true_e_idx raises an error instead of clipping indices
There was a problem hiding this comment.
and also if passed energy_range exactly matches the SM binning this will take an additional bin, annoying 😓 . Now I don't know what's better
There was a problem hiding this comment.
implemented in ad090bf
It makes sense to me as we also do this:
if self._energy_range_log10 is None: return (self.sm.true_e_bin_edges[0], self.sm.true_e_bin_edges[-1])@chiarabellenghi please take a look.
I guess neither implementation passes in case the energy_range is outside true_e_bin_edges as the
get_log10_true_e_idxraises an error instead of clipping indices
I think we can simply make this error clearer by saying what the full range supported by the SM is
There was a problem hiding this comment.
and also if passed
energy_rangeexactly matches the SM binning this will take an additional bin, annoying 😓 . Now I don't know what's better
Fixed in a683414. Now the sm helper function handles the two cases separately using an additional input flag.
Also added a dedicated unittest.
| def _calculate_flux_weight(self, src_idx, log_e_min, log_e_max): | ||
| """Calculates an unnormalized detectable flux weight for one source | ||
| and one true-energy range. | ||
| """ | ||
| src = self._shg_mgr.source_list[src_idx] | ||
| fluxmodel = self.shg_mgr.get_fluxmodel_by_src_idx(src_idx=src_idx) | ||
|
|
||
| effA = PDAeff( | ||
| pathfilenames=self.ds.get_abs_pathfilename_list(self.ds.get_aux_data_definition('eff_area_datafile')) | ||
| ) | ||
|
|
There was a problem hiding this comment.
_calculate_flux_weight constructs a new PDAeff instance for every call, and _calculate_energy_range_correction_factors calls it twice per source. Since PDAeff.__init__ loads the effective area file(s), this can become a significant performance bottleneck. Consider caching a single PDAeff instance (or reusing already loaded effective-area data) and reusing it across sources/ranges.
There was a problem hiding this comment.
ah, yes. This is a good point
| ) | ||
|
|
||
| mean_ns_arr[sin_dec_idx, iter_idx] = mean_ns | ||
| mean_ns_err_arr[sin_dec_idx, iter_idx] = mean_ns_err | ||
| flux_scaling_arr[sin_dec_idx, iter_idx] = ana.calculate_fluxmodel_scaling_factor( | ||
| mean_ns=mean_ns, fitparam_values=np.array(fitparam_values) | ||
| ) | ||
| flux_scaling_arr[sin_dec_idx, iter_idx] = ana.calculate_fluxmodel_scaling_factor() * mean_ns | ||
|
|
There was a problem hiding this comment.
fitparam_values is still a required parameter of estimate_ps_sin_dec_sensitivity_curve, but after switching to ana.calculate_fluxmodel_scaling_factor() it is no longer used (and the docstring still claims it is used for the conversion). Either remove/deprecate the parameter or update the docstring/signature to reflect the new behavior to avoid confusion.
There was a problem hiding this comment.
Yep... and this entire library has to be cleaned up at some point...
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ) | ||
|
|
||
| # Set the energy range for signal generation if it is explicitly given. | ||
| ana.energy_range = energy_range |
There was a problem hiding this comment.
This comment says the energy range is set "if it is explicitly given", but the code assigns ana.energy_range = energy_range unconditionally (including when it is None). Consider guarding this with if energy_range is not None: (or update the comment), to avoid marking the analysis energy_range as explicitly set and to avoid unnecessary component rebuilds.
| ana.energy_range = energy_range | |
| if energy_range is not None: | |
| ana.energy_range = energy_range |
|
|
||
| with self.assertRaises(TypeError): | ||
| sig_gen.generate_signal_events(rss=rss, mean=10, poisson=False, energy_range=(1e2, 1e3)) | ||
|
|
There was a problem hiding this comment.
This test enforces that MultiDatasetSignalGenerator.generate_signal_events rejects an energy_range kwarg. The PR description currently states that energy_range can be passed via sig_kwargs to trial/sensitivity helpers; that no longer matches the behavior. Please update the PR description (and/or user-facing docs) to reflect that energy_range must be set on the analysis/signal generator, not passed as sig_kwargs.
| import os | ||
| import unittest | ||
|
|
||
| import numpy as np | ||
|
|
There was a problem hiding this comment.
These new public-data tests are placed under tests/publicdata_ps/, but the repository test runner (tests/run.sh) only discovers tests/core and tests/i3, so they won’t run in CI as-is. Either move them under a discovered package (e.g. tests/i3/...) or update the test runner to include this directory; if enabled in CI, also consider skipping when dataset downloads aren’t available (Config defaults download_from_origin=True).
tomaskontrimas
left a comment
There was a problem hiding this comment.
Looks good to me! 🥳







Feature to address issue #249
This PR implements the option to inject a neutrino signal in a user-defined energy range.
The energy_range is now a property of the public data signal generator and of the analysis object.