refac: Harmonize linopy operations and introduce a new predictable and strict convention#591
Conversation
…sets and supersets
Add le(), ge(), eq() methods to LinearExpression and Variable classes, mirroring the pattern of add/sub/mul/div methods. These methods support the join parameter for flexible coordinate alignment when creating constraints.
Consolidate repetitive alignment handling in _add_constant and _apply_constant_op into a single _align_constant method. This eliminates code duplication and makes the alignment behavior (handling join parameter, fill_value, size-aware defaults) testable and maintainable in one place.
numpy_to_dataarray no longer inflates ndim beyond arr.ndim, fixing lower-dim numpy arrays as constraint RHS. Also reject higher-dim constant arrays (numpy/pandas) consistently with DataArray behavior. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use "exact" join for +/- (raises ValueError on mismatch), "inner" join for * and / (intersection), and "exact" for constraint DataArray RHS. Named methods (.add(), .sub(), .mul(), .div(), .le(), .ge(), .eq()) accept an explicit join= parameter as escape hatch.

- Remove shape-dependent "override" heuristic from merge() and _align_constant()
- Add join parameter support to to_constraint() for DataArray RHS
- Forbid extra dimensions on constraint RHS
- Update tests with structured raise-then-recover pattern
- Update coordinate-alignment notebook with examples and migration guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
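The commit's join rules can be sketched on plain Python dicts (a toy `join_labels` helper over label→coefficient mappings, not linopy code):

```python
# Toy illustration of the proposed join semantics, using plain dicts of
# coordinate-label -> coefficient instead of real linopy expressions.
# The function name and data layout are hypothetical, not linopy API.

def join_labels(a, b, join):
    """Return the coordinate labels an operation would act on."""
    if join == "exact":
        if set(a) != set(b):
            raise ValueError(f"coordinate mismatch: {set(a) ^ set(b)}")
        return sorted(a)
    if join == "inner":          # intersection, used for * and /
        return sorted(set(a) & set(b))
    if join == "outer":          # union, the explicit escape hatch
        return sorted(set(a) | set(b))
    raise ValueError(f"unknown join: {join}")

x = {"wind": 1.0, "solar": 2.0}
y = {"solar": 3.0, "gas": 4.0}

print(join_labels(x, y, "inner"))   # shared labels only
print(join_labels(x, y, "outer"))   # all labels
try:
    join_labels(x, y, "exact")      # default for + and -: mismatch raises
except ValueError as e:
    print("raised:", e)
```

The key property the commit aims for: the default (`exact`) is loud about mismatches, and anything looser must be requested by name.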
@FabianHofmann I'm quite happy with the notebook now. It showcases the convention and its consequences.
…ords. Here's what changed:

- test_linear_expression_sum / test_linear_expression_sum_with_const: v.loc[:9].add(v.loc[10:], join="override") → v.loc[:9] + v.loc[10:].assign_coords(dim_2=v.loc[:9].coords["dim_2"])
- test_add_join_override → test_add_positional_assign_coords: uses v + disjoint.assign_coords(...)
- test_add_constant_join_override → test_add_constant_positional: now uses different coords [5, 6, 7] + assign_coords to make the test meaningful
- test_same_shape_add_join_override → test_same_shape_add_assign_coords: uses + c.to_linexpr().assign_coords(...)
- test_add_constant_override_positional → test_add_constant_positional_different_coords: expr + other.assign_coords(...)
- test_sub_constant_override → test_sub_constant_positional: expr - other.assign_coords(...)
- test_mul_constant_override_positional → test_mul_constant_positional: expr * other.assign_coords(...)
- test_div_constant_override_positional → test_div_constant_positional: expr / other.assign_coords(...)
- test_variable_mul_override → test_variable_mul_positional: a * other.assign_coords(...)
- test_variable_div_override → test_variable_div_positional: a / other.assign_coords(...)
- test_add_same_coords_all_joins: removed "override" from loop, added assign_coords variant
- test_add_scalar_with_explicit_join → test_add_scalar: simplified to expr + 10
The convention should be:

**Why**

```python
cost = xr.DataArray([10, 20], coords=[("tech", ["wind", "solar"])])
capacity  # dims: (tech=["wind", "solar"], region=["A", "B"])
cost * capacity  # ✓ tech matches exactly, region broadcasts freely
capacity.sel(tech=["wind", "solar"]) * renewable_cost
```

**No operation should introduce new dimensions**

Neither side of any arithmetic operation should be allowed to introduce dimensions the other doesn't have. The same problem applies to:

```python
cost_expr  # dims: (tech, time)
regional_expr  # dims: (tech, time, region)
cost_expr + regional_expr  # ✗ silently expands to (tech, time, region)

capacity  # dims: (tech, region, time)
risk  # dims: (tech, scenario)
risk * capacity  # ✗ silently expands to (tech, region, time, scenario)
```

An explicit pre-check on all operations:

```python
asymmetric_dims = set(other.dims).symmetric_difference(set(self.dims))
if asymmetric_dims:
    raise ValueError(f"Operation introduces new dimensions: {asymmetric_dims}")
```

**Summary**
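The pre-check runs unchanged as plain Python once wrapped in a function (`check_no_new_dims` is a hypothetical helper name, not part of linopy):

```python
# Runnable version of the dimension pre-check, wrapped in a hypothetical
# helper so it can be applied symmetrically to both operands.

def check_no_new_dims(self_dims, other_dims):
    """Raise if either operand would introduce dimensions the other lacks."""
    asymmetric_dims = set(other_dims).symmetric_difference(set(self_dims))
    if asymmetric_dims:
        raise ValueError(
            f"Operation introduces new dimensions: {asymmetric_dims}"
        )

check_no_new_dims(("tech", "time"), ("tech", "time"))  # identical dims: ok
try:
    # (tech, time) vs (tech, time, region): 'region' is one-sided
    check_no_new_dims(("tech", "time"), ("tech", "time", "region"))
except ValueError as e:
    print(e)
```

Because `symmetric_difference` is order-independent, the check rejects a one-sided dimension no matter which operand carries it.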
Let's clearly differentiate between dimensions and labels.

**labels**

I agree with "exact" for labels by default, but we need an easy way to get inner or outer joining characteristics. I found the pyoframe conventions: x + y.keep_extras() to say that an outer join is in order and mismatches should fill with 0, and x + y.drop_extras() to say that you want an inner join. I have in a different project used | 0 to indicate keep_extras, i.e. (x + y | 0).

**dimensions**

I am actually fond of the ability to auto-broadcast over different dimensions and would want to keep that (this is actually my main problem with pyoframe). Your first example actually implicitly assumes broadcasting.
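The `| 0` idea can be prototyped with a small deferred-sum object: `+` returns a pending result that remembers one-sided labels, and `| fill` resolves them. The classes and names below are hypothetical, not pyoframe or linopy API:

```python
# Minimal sketch of "(x + y | 0)" keep_extras semantics on dict-backed
# series. Hypothetical classes, for illustration only.

class Series:
    def __init__(self, data):
        self.data = dict(data)  # label -> value

    def __add__(self, other):
        return PendingSum(self, other)

class PendingSum:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __or__(self, fill):
        # keep_extras semantics: outer join, one-sided labels use `fill`
        labels = set(self.a.data) | set(self.b.data)
        return Series({
            k: self.a.data.get(k, fill) + self.b.data.get(k, fill)
            for k in labels
        })

x = Series({"wind": 1.0, "solar": 2.0})
y = Series({"solar": 3.0, "gas": 4.0})
z = (x + y | 0)
print(sorted(z.data.items()))
```

The downside the next comment points out is visible here: `PendingSum` is a stateful intermediate, and every other operation would have to know what to do with it.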
**Dimensions and broadcasting**

I agree that auto broadcasting is helpful in some cases. So the full convention requires two separate things:

**labels**

I'm not sure if I like this approach, as it needs careful state management of the flags on expressions. The flag (keep or drop extras) needs to be handled.

```python
import linopy

# outer join — fill gaps with 0 before adding
x_aligned, y_aligned = linopy.align(x, y, join="outer", fill_value=0)
x_aligned + y_aligned

# inner join — drop non-matching coords before adding
x_aligned, y_aligned = linopy.align(x, y, join="inner")
x_aligned + y_aligned
```

Combining disjoint expressions would then still need the explicit methods though.
The proposed convention for all arithmetic operations in linopy:

I'm not sure how to implement the | operator yet. Might need some sort of flag/state for deferred indexing.
I thought about the pipe operator: Would this be an issue for you?
…erations-mixed

# Conflicts:
#	linopy/expressions.py
#	linopy/variables.py
#	test/conftest.py
#	test/test_constraints.py
#	test/test_linear_expression.py
- merge() in v1 mode now pre-validates that shared user-dimension coordinates match exactly, then uses outer join for xr.concat (helper dims like _term/_factor are excluded from the check)
- Removed redundant pre-checks from LinearExpression.__add__ and QuadraticExpression.__add__ — merge handles enforcement now
- Added scalar fast path in _apply_constant_op (mul/div skip alignment)
- Wrapped AlignmentError import in try/except for xarray compat
- Fixed missing space in __div__ error message
- Added .fillna() as escape hatch option 5 in notebook
- Updated merge docstring with convention behavior summary
- Added explanatory comments (stacklevel, numpy_to_dataarray filtering)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Deduplicate convention-specific test files into single files

Merge 4 pairs of v1/legacy test files into single files, eliminating ~2600 lines of duplicated test code. Convention-specific alignment tests are kept in separate classes (V1/Legacy) with autouse fixtures, while shared tests run under the module-level v1 convention.

- test_typing_legacy.py -> merged into test_typing.py (parametrized)
- test_common_legacy.py -> merged into test_common.py (legacy align test)
- test_constraints_legacy.py -> merged into test_constraints.py (legacy alignment class)
- test_linear_expression_legacy.py -> merged into test_linear_expression.py (legacy alignment + join classes)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address PR review: consistency, dedup fixtures, missing test

- Add legacy_convention fixture to conftest.py; use it consistently instead of manual try/finally blocks (#1)
- Parametrize test_constant_with_extra_dims_broadcasts with convention fixture so it runs under both conventions (#2)
- Add missing test_quadratic_add_expr_join_inner to TestJoinParameterLegacy (#3)
- Extract shared fixtures into _CoordinateAlignmentFixtures and _ConstraintAlignmentFixtures mixin classes to eliminate fixture duplication between V1/Legacy alignment test classes (#4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Restore master tests, add autouse convention fixture

- Restore test files to match master exactly (legacy behavior)
- Delete legacy duplicate test files
- Add autouse parametrized convention fixture: every test runs under both 'legacy' and 'v1' conventions by default
- Add legacy_convention/v1_convention opt-out fixtures for convention-specific tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Mark legacy-only tests, add v1 counterparts for differing behavior

Tests that differ between conventions are split:

- Legacy-only: marked with legacy_convention fixture (skipped under v1)
- V1-only: marked with v1_convention fixture (skipped under legacy)
- All other tests: run under both conventions via autouse fixture

Files changed:

- test_common.py: split test_align into legacy/v1 versions
- test_constraints.py: mark TestConstraintCoordinateAlignment as legacy-only, add TestConstraintCoordinateAlignmentV1, split higher-dim RHS tests
- test_linear_expression.py: mark TestCoordinateAlignment as legacy-only, add TestCoordinateAlignmentV1, split sum/join tests
- test_piecewise_constraints.py: mark legacy-only (implementation not yet v1-compatible)
- test_sos_reformulation.py: mark legacy-only (implementation not yet v1-compatible)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Fix mypy error, strengthen tests, and add convention test coverage

- Fix mypy: use typed list[JoinOptions] for loop variable in test_linear_expression.py
- Strengthen assert_linequal in test_algebraic_properties.py to verify coefficients and vars
- Fix Variable.reindex_like() to handle DataArray inputs correctly
- Add test_convention.py covering config validation, deprecation warnings, scalar fast path, NaN edge cases, convention switching, error messages, and Variable.reindex/reindex_like

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add reindex/reindex_like tests for Expression and Constraint, fix DataArray bug

- Fix LinearExpression.reindex_like() to handle DataArray inputs (same bug as Variable)
- Add TestExpressionReindex: subset, superset, fill_value, type preservation, reindex_like with Expression/Variable/DataArray/Dataset
- Add TestConstraintReindex: subset, superset, reindex_like with Dataset/DataArray

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…parts (#611)

* Suppress LinopyDeprecationWarning in tests, add v1 constraint counterparts

- Convention fixture now filters LinopyDeprecationWarning under legacy, reducing test warnings from 9262 to 213. Dedicated tests in test_convention.py still verify warnings are emitted.
- test_repr.py: suppress module-level deprecation warnings from collection-time model setup.
- TestConstraintCoordinateAlignmentV1: add comprehensive v1 counterparts covering all comparison operators (<=, >=, ==), subset/superset/expr raises, explicit join= escape hatches, assign_coords pattern, and higher-dim DataArray broadcast vs mismatch behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…tion
Introduce @pytest.mark.legacy_only and @pytest.mark.v1_only markers as the
single, consistent mechanism for convention-specific tests. This replaces
five different patterns (class-level autouse fixtures, function fixture
params, module-level autouse fixtures, and inconsistent naming variants)
with one visible, declarative approach.
Changes:
- conftest.py: register markers, skip logic in convention fixture
- Remove all _legacy_only/_v1_only/_use_legacy/_use_v1 autouse fixtures
- Remove legacy_convention/v1_convention fixture params from signatures
- Module-level: pytestmark = pytest.mark.{legacy,v1}_only
- Class-level: @pytest.mark.{legacy,v1}_only decorator
- Function-level: @pytest.mark.{legacy,v1}_only decorator
- Supports pytest -m "legacy_only" / -m "v1_only" for filtering
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
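The skip rule behind the markers can be summarized as a pure function — a rough sketch standing in for the real conftest.py fixture logic, with a hypothetical `should_skip` name so the decision table is easy to test:

```python
# Sketch of the marker-based skip decision described in the commit above.
# Not the actual conftest.py code — a testable stand-in for its logic.

def should_skip(markers, convention):
    """Return True if a test carrying `markers` must be skipped under the
    given arithmetic convention ('legacy' or 'v1')."""
    if "legacy_only" in markers and convention != "legacy":
        return True
    if "v1_only" in markers and convention != "v1":
        return True
    return False  # unmarked tests run under both conventions

print(should_skip({"legacy_only"}, "v1"))    # marked legacy: skipped under v1
print(should_skip({"v1_only"}, "legacy"))    # marked v1: skipped under legacy
print(should_skip(set(), "v1"))              # unmarked: runs everywhere
```

In the real suite the convention fixture would read the marker via pytest's marker API and call `pytest.skip()` when this predicate fires.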
This is definitely a worthwhile update. How we've handled it in calliope is to allow dims to have mismatched labels but to require a default value for each parameter, expr, and variable, which is used when there is a mismatch. This aligns with:

Raising exceptions on misalignment can be a pain to deal with. We often have patchwork data that we want to combine with a clear fill value in mind that is equivalent to "no effect". I can see alignment being used a reasonable amount to deal with this. The issue is that you end up with lots of intermediate arrays in memory to achieve alignment. Having a pre-defined default, or the ability to define a default on applying arithmetic operations, would keep in-memory arrays to a minimum. This would be easy to then integrate with #561.

On fill values: this PR only makes them available on parameters ("data"), but I'd say they're also useful for decision variables and expressions, too. E.g., when creating a total cost expression you might want to combine several instances of
…eview

Merge separate TestConstraintCoordinateAlignmentLegacy/V1 and TestCoordinateAlignmentLegacy/V1 classes into unified classes where legacy and v1 test methods sit side-by-side, grouped by scenario. Test names now describe behavior (e.g. test_mul_subset_fills_zeros for legacy, test_mul_subset_raises for v1) rather than using class-level Legacy/V1 suffixes, making it clear what each convention expects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 22 missing v1_only counterparts across test_linear_expression.py:
- TestSuperset: v1 raises for add/sub/mul commutativity and div
- TestDisjoint: v1 raises for div
- TestCommutativity: parametrized v1 raises for all ops
- TestQuadratic: v1 raises for sub, reverse mul, reverse add
- TestMissingValues: v1 NaN propagates for sub, div, commutativity, quadexpr
- TestExpressionWithNaN: v1 NaN propagates for add/mul/div array,
sub/div scalar
- Add v1 negative assertions in test_linear_expression_sum_v1 and
test_linear_expression_sum_with_const_v1 (assert mismatched coords
raise before showing assign_coords workaround)
- Add TestNoDeprecationWarnings v1 counterpart in test_convention.py
- Fix test_align_v1: use pytest.raises(ValueError) instead of bare Exception
- Remove redundant test_superset_comparison_raises (covered by parametrized
test_superset_comparison_var_raises)
- Remove v1_only marker from TestScalarFastPath (convention-independent)
- Un-mark test_variable_to_linexpr_nan_coefficient as legacy_only
(to_linexpr fills NaN under both conventions)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use drop=True on scalar isel calls to prevent residual scalar coordinates from causing exact-join mismatches under the v1 arithmetic convention. Also align binary_hi coordinates with delta_lo in incremental PWL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@brynpickering As I understand your comment, you are mostly OK with the convention, but have thoughts on NaN data. Am I right? This is what I'm currently not that opinionated about, so I'm happy to hear your suggestions.
NaN in linopy v1 means "absent term" — it marks individual terms as missing without masking entire coordinates. User-supplied NaN at API boundaries (constants, factors, constraint RHS) raises ValueError; masking must be explicit via .sel() or mask=.

Implementation:
- FILL_VALUE["coeffs"] changed from NaN to 0 (structural "no term")
- NaN validation added in _add_constant, _apply_constant_op, to_constraint
- Piecewise internals use .fillna(0) on breakpoint data
- Tests updated to expect ValueError for NaN operands under v1

Key design decisions:
- NaN enters only via mask= or structural ops (shift, reindex, where)
- Combining expressions: absent terms do not poison valid terms (xr.sum skipna=True preserves valid contributions)
- A coordinate is fully absent only when ALL terms have vars=-1 AND const is NaN — this is what isnull() checks
- lhs >= rhs ≡ lhs - rhs >= 0, so RHS follows the same rules as constants

Documentation:
- New missing-data.ipynb: convention principles, fillna patterns, masking with .sel() and mask=, migration guide from legacy
- New nan-edge-cases.ipynb: investigation of shift, roll, where, reindex, isnull, arithmetic on shifted expressions, sanitize_missings
- arithmetic-convention.ipynb: updated to reference missing-data notebook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shift, where, reindex, reindex_like, unstack produce absent terms. roll, sel, isel, drop_sel, expand_dims, broadcast_like do not. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Coeffs=0 was an implicit choice about the neutral element for multiplication. NaN is more honest — it means "absent", which is what FILL_VALUE is for. Both NaN and 0 coeffs get filtered by filter_nulls_polars at solve time, so behavior is unchanged. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
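The claim that NaN and 0 coefficients behave identically at solve time can be illustrated with a toy filter (a hypothetical helper, not the real filter_nulls_polars):

```python
# Toy version of solve-time term filtering: terms whose coefficient is
# NaN (absent) or 0 contribute nothing to the objective/constraints,
# so both are dropped before the problem reaches the solver.
import math

def filter_terms(terms):
    """Drop (var, coeff) pairs with a missing var label (-1), a NaN
    coefficient, or a zero coefficient."""
    return [
        (var, coeff)
        for var, coeff in terms
        if var != -1 and not math.isnan(coeff) and coeff != 0.0
    ]

terms = [(0, 2.0), (1, float("nan")), (-1, float("nan")), (2, 0.0), (3, -1.5)]
print(filter_terms(terms))
```

Since both fill values are invisible to the solver, the choice between NaN and 0 for FILL_VALUE is purely about which one states intent more honestly, which is the point of the commit.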
New cases: x + y.shift(1) + 5, x + (y+5).shift(1) + 5 (shifted const is lost), x.shift(1) + y.shift(1) (fully absent coordinate). Updated FILL_VALUE docs to reflect coeffs=NaN (not 0). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@brynpickering Thanks for the detailed feedback. Fill values depend on context (efficiency=1 vs cost=0) — this is why I'd like to make it a user choice via .fillna(value) rather than picking a default. Misalignment doesn't have to raise. The escape hatches are lightweight:

```python
total = cost_a.add(cost_b, join="outer")   # absent terms → zero at solve time
scaled = var * efficiency.fillna(1)        # you choose the fill
```

As far as I know, memory overhead is the same either way — fillna() on a parameter is not much larger than the parameter itself, and join="outer" does alignment inside xarray's concat anyway. But maybe I'm wrong here.

For expressions with mismatched labels, join="outer" already works: absent terms carry NaN coefficients that are filtered out when flattening for the solver. No explicit fill value needed.

Maybe adding a fill_value= parameter on .mul() / .add() resolves your issue, if the fillna + join= pattern proves too verbose in practice?
@coroa @FabianHofmann I'm quite happy with this. But I think it desperately needs discussion.
@FBumann your fillna suggestion is fine for parameters, but doesn't address NaN filling further down the line for variables or expressions. As far as I understand linopy objects, it isn't possible to NaN-fill them in the same way. This risks NaNs making their way into arithmetic when one would rather have some numeric value there that has no effect.
Harmonize linopy arithmetic with legacy/v1 convention transition
This PR harmonizes coordinate alignment and NaN handling in linopy arithmetic, with a non-breaking transition layer.
**`linopy.options["arithmetic_convention"]`**

Two modes:

- `"legacy"` (default) — reproduces current master behavior exactly. Emits `LinopyDeprecationWarning` on every legacy codepath.
- `"v1"` — strict exact-join semantics: mismatched coordinates raise `ValueError` with helpful messages. NaN in user data raises immediately.

**v1 behavior**

- **Exact coordinate matching** — Arithmetic operators (`+`, `-`, `*`, `/`) require matching coordinates on shared dimensions. Mismatched coordinates raise `ValueError` with suggestions.
- **Named methods with explicit join** — `.add()`, `.sub()`, `.mul()`, `.div()`, `.le()`, `.ge()`, `.eq()` accept a `join=` parameter.
- **Free broadcasting** — Constants can introduce new dimensions. All algebraic laws hold.
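The two-mode switch with per-codepath deprecation warnings can be sketched as a tiny validated options store — mirroring the behavior described here, not linopy's actual config implementation (class and function names are hypothetical):

```python
# Sketch of a validated option store with a legacy deprecation warning.
import warnings

class LinopyDeprecationWarning(DeprecationWarning):
    pass

class Options(dict):
    _allowed = {"arithmetic_convention": {"legacy", "v1"}}

    def __setitem__(self, key, value):
        # reject values outside the documented set of modes
        if value not in self._allowed.get(key, {value}):
            raise ValueError(f"invalid value {value!r} for option {key!r}")
        super().__setitem__(key, value)

options = Options(arithmetic_convention="legacy")

def warn_if_legacy():
    # every legacy codepath would call something like this
    if options["arithmetic_convention"] == "legacy":
        warnings.warn("legacy arithmetic is deprecated",
                      LinopyDeprecationWarning, stacklevel=2)

options["arithmetic_convention"] = "v1"         # valid mode: accepted
try:
    options["arithmetic_convention"] = "outer"  # not a mode: rejected
except ValueError as e:
    print(e)
```

This also shows why the test suite filters `LinopyDeprecationWarning` under legacy mode: with a warning per codepath, unfiltered runs drown in them.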
**v1 NaN convention**

NaN means "absent term" — never a numeric value.

**How NaN enters**

Only two sources:

- `mask=` argument at construction (`add_variables`, `add_constraints`)
- Structural operations: `.shift()`, `.where()`, `.reindex()`, `.reindex_like()`, `.unstack()` (with missing combinations)

Operations that do not produce NaN: `.roll()` (circular), `.sel()`/`.isel()` (subset), `.drop_sel()` (drops), `.expand_dims()`/`.broadcast_like()` (broadcast existing data).

**How NaN propagates**

NaN marks an individual term as absent, not the entire coordinate. When expressions are combined (e.g., `x*2 + y.shift(time=1)`), each term is independent — an absent term from `y.shift` does not mask the valid `x` term at the same coordinate.

A coordinate is only fully absent when all terms have `vars=-1` and `const` is NaN. This is what `isnull()` checks.

**What raises**

Any user-supplied NaN at an API boundary — in constants, factors, or constraint RHS — raises `ValueError` immediately:

Users handle NaN explicitly:

**Why this is consistent**

- `lhs >= rhs` ≡ `lhs - rhs >= 0` — RHS follows the same rules as any constant
- NaN from structural operations (`shift`, `mask=`) is structural; user NaN is always an error
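The per-term absence rule can be mimicked in a few lines of plain Python (a toy data layout where each coordinate holds a list of terms and `None` marks an absent one — not linopy internals):

```python
# Toy model of per-term NaN propagation: combining expressions keeps
# valid terms alive; a coordinate is absent only if ALL its terms are.

def combine(expr_a, expr_b):
    """Concatenate the term lists of two expressions per coordinate."""
    return {t: expr_a.get(t, []) + expr_b.get(t, []) for t in expr_a | expr_b}

def value(terms):
    """Sum present terms; a coordinate with no present term is absent."""
    present = [t for t in terms if t is not None]
    return sum(present) if present else None

x2 = {0: [2.0], 1: [4.0], 2: [6.0]}          # stands in for x*2
y_shift = {0: [None], 1: [1.0], 2: [3.0]}    # y.shift(time=1): t=0 absent

combined = combine(x2, y_shift)
print({t: value(terms) for t, terms in sorted(combined.items())})
```

At t=0 the absent shifted term does not poison the valid term from `x*2` — the skipna-style sum keeps its contribution, exactly the behavior the convention specifies. Only when every term at a coordinate is absent (the `vars=-1`/`const=NaN` case) does the coordinate read as null.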
**Implementation details**

- `FILL_VALUE = {"vars": -1, "coeffs": NaN, "const": NaN}` — all float fields use NaN for absence, integer fields use -1
- NaN validation in `_add_constant`, `_apply_constant_op`, `to_constraint`
- Piecewise internals use `.fillna(0)` on breakpoint data

**Legacy behavior (default, backward-compatible)**

**merge() behavior**

- `merge()` enforces exact matching on shared user-dimension coordinates in v1 mode. Helper dims (`_term`, `_factor`) and the concat dim are excluded from this check. The actual `xr.concat` uses `join="outer"`.
- In legacy mode, merge uses `override` when shared user dims have matching sizes, `outer` otherwise.

**Source changes**

- `config.py` — `LinopyDeprecationWarning`, `arithmetic_convention` setting
- `expressions.py` — `merge()` pre-validates user-dim coords under v1; `to_constraint` has separate legacy/v1 paths; NaN validation at API boundaries
- `common.py` — `align()` reads convention (legacy→inner, v1→exact)
- `variables.py` — `__mul__`, explicit `TypeError` in `__div__`, `.reindex()` methods
- `piecewise.py` — `.fillna(0)` on breakpoint data for v1 compatibility
- `monkey_patch_xarray.py`
- `model.py`
**Documentation**

- `arithmetic-convention.ipynb`
- `missing-data.ipynb`
- `nan-edge-cases.ipynb`

**Test structure**

- `@pytest.mark.v1_only` and `@pytest.mark.legacy_only` for convention-specific tests
- `conftest.py` with auto-convention switching

**Rollout plan**

1. Default stays `"legacy"` — nothing breaks
2. Users opt in via `linopy.options["arithmetic_convention"] = "v1"`
3. Later, switch the default to `"v1"` and drop legacy mode

**Open questions**

- `from_tuples`/`linexpr()` — Currently follows the global convention. In practice always called with same-coord variables, so the convention doesn't matter. Low priority.
- `nan_as_mask` option — A future global setting could allow NaN in user data to be treated as masking (equivalent to `.fillna(0)` + `mask=data.notnull()`). Deferred for team discussion.

**Test plan**
🤖 Generated with Claude Code