-
Notifications
You must be signed in to change notification settings - Fork 0
Improve numerical stability of variance calculation in envelope #107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -74,10 +74,9 @@ def from_values( | |
| if not values: | ||
| raise ValueError("Cannot create envelope from empty values list") | ||
|
|
||
| # Single pass to calculate everything efficiently | ||
| # First pass: build frequency map and compute mean | ||
| frequencies = {} | ||
| total_sum = 0.0 | ||
| sum_squares = 0.0 | ||
| min_capacity = float("inf") | ||
| max_capacity = float("-inf") | ||
|
|
||
|
|
@@ -87,17 +86,22 @@ def from_values( | |
|
|
||
| # Update statistics | ||
| total_sum += value | ||
| sum_squares += value * value | ||
| min_capacity = min(min_capacity, value) | ||
| max_capacity = max(max_capacity, value) | ||
|
|
||
| # Calculate derived statistics | ||
| n = len(values) | ||
| mean_capacity = total_sum / n | ||
|
|
||
| # Use computational formula for variance: Var(X) = E[X²] - (E[X])² | ||
| variance = (sum_squares / n) - (mean_capacity * mean_capacity) | ||
| stdev_capacity = variance**0.5 | ||
| # Second pass over unique values: compute variance using the | ||
| # numerically stable formula sum((x - mean)^2) / n. | ||
| # Iterating over the frequency map is efficient when there are | ||
| # many duplicate values (common in Monte Carlo results). | ||
| variance_sum = 0.0 | ||
| for value, count in frequencies.items(): | ||
| diff = value - mean_capacity | ||
| variance_sum += count * diff * diff | ||
| stdev_capacity = (variance_sum / n) ** 0.5 | ||
|
Comment on lines
+96
to
+104
|
||
|
|
||
| # Process flow summaries if provided | ||
| flow_summary_stats = {} | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR description mentions refactoring
Envelope.from_values(), but the code change is inCapacityEnvelope.from_values(). Please update the description (or code) to match the actual API being modified to avoid confusion for reviewers and future readers.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@claude need an update?