refac: Harmonize linopy operations and introduce a new predictable and strict convention#591

Open
FBumann wants to merge 71 commits into master from harmonize-linopy-operations-mixed

Conversation


@FBumann FBumann commented Feb 20, 2026

Harmonize linopy arithmetic with legacy/v1 convention transition

This PR harmonizes coordinate alignment and NaN handling in linopy arithmetic, with a non-breaking transition layer.

linopy.options["arithmetic_convention"]

Two modes:

  • "legacy" (default) — reproduces current master behavior exactly. Emits LinopyDeprecationWarning on every legacy codepath.
  • "v1" — strict exact-join semantics: mismatched coordinates raise ValueError with helpful messages. NaN in user data raises immediately.
linopy.options["arithmetic_convention"] = "v1"

v1 behavior

Exact coordinate matching — Arithmetic operators (+, -, *, /) require matching coordinates on shared dimensions. Mismatched coordinates raise ValueError with suggestions:

x[i=0,1,2] + y[i=1,2,3]  # ValueError: use .add(y, join="inner")

Named methods with explicit join — .add(), .sub(), .mul(), .div(), .le(), .ge(), .eq() accept a join= parameter:

x.add(y, join="inner")      # intersection
x.add(y, join="outer")      # union with fill
x.add(y, join="left")       # keep x's coordinates
x.add(y, join="override")   # positional alignment

Free broadcasting — Constants can introduce new dimensions. All algebraic laws hold.
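To make the rule concrete, here is a minimal standalone sketch of the v1 check (plain dicts stand in for coordinate indexes; check_v1_alignment is a hypothetical helper, not linopy API): shared dimensions must match exactly, dimensions present on only one side broadcast freely.

```python
# Hypothetical sketch of the v1 rule (not linopy's actual code):
# shared dimensions must carry identical coordinates; dimensions
# present on only one side broadcast freely.

def check_v1_alignment(dims_a: dict, dims_b: dict) -> None:
    """dims_* map a dimension name to a tuple of coordinate labels."""
    for dim in dims_a.keys() & dims_b.keys():
        if dims_a[dim] != dims_b[dim]:
            raise ValueError(
                f"coordinates on shared dim {dim!r} differ; "
                'use .add(other, join="inner") or reindex explicitly'
            )

x = {"i": (0, 1, 2)}
y = {"i": (0, 1, 2), "region": ("A", "B")}  # extra dim: broadcasts
check_v1_alignment(x, y)  # no error

z = {"i": (1, 2, 3)}  # shifted labels on the shared dim
try:
    check_v1_alignment(x, z)
    raised = False
except ValueError:
    raised = True
assert raised
```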


v1 NaN convention

NaN means "absent term" — never a numeric value.

How NaN enters

Only two sources:

  1. mask= argument at construction (add_variables, add_constraints)
  2. Structural operations: .shift(), .where(), .reindex(), .reindex_like(), .unstack() (with missing combinations)

Operations that do not produce NaN: .roll() (circular), .sel() / .isel() (subset), .drop_sel() (drops), .expand_dims() / .broadcast_like() (broadcast existing data).
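As a toy illustration of the distinction (plain lists stand in for the vars field, with -1 marking an absent slot; this is not linopy code):

```python
# .shift() vacates slots -> absent terms; .roll() is circular -> no absence.

def shift_vars(vars_, n=1, fill=-1):
    # vacated leading slots become absent (fill = -1)
    return [fill] * n + vars_[: len(vars_) - n]

def roll_vars(vars_, n=1):
    # circular rotation: every slot stays filled
    return vars_[-n:] + vars_[:-n]

v = [10, 11, 12]
assert shift_vars(v) == [-1, 10, 11]  # leading slot is now absent
assert roll_vars(v) == [12, 10, 11]   # no absence introduced
```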

How NaN propagates

NaN marks an individual term as absent, not the entire coordinate. When expressions are combined (e.g., x*2 + y.shift(time=1)), each term is independent — an absent term from y.shift does not mask the valid x term at the same coordinate.

A coordinate is only fully absent when all terms have vars=-1 and const is NaN. This is what isnull() checks.
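A minimal sketch of that rule, assuming the FILL_VALUE layout described in this PR (vars = -1 for an absent term slot, NaN const); the helper name is invented:

```python
import math

# Hypothetical helper mirroring the isnull() rule: a coordinate is
# fully absent only when every term slot is absent (vars == -1)
# AND the constant is NaN.

def is_fully_absent(term_vars: list, const: float) -> bool:
    return all(v == -1 for v in term_vars) and math.isnan(const)

# x*2 + y.shift(...): the y term is absent, the x term is not,
# so the coordinate is NOT null.
assert not is_fully_absent([3, -1], const=0.0)
# All terms absent and const NaN -> isnull() would be True.
assert is_fully_absent([-1, -1], const=float("nan"))
```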

What raises

Any user-supplied NaN at an API boundary — in constants, factors, or constraint RHS — raises ValueError immediately:

x + data_with_nans        # ValueError
x * factor_with_nans      # ValueError
expr >= rhs_with_nans     # ValueError

Users handle NaN explicitly:

# Fill before arithmetic — you choose the fill value
x + data.fillna(0)        # NaN = "no offset"
x * factor.fillna(0)      # NaN = "exclude this term"
x * factor.fillna(1)      # NaN = "no scaling"

# Mask constraints — .sel() (preferred) or mask=
valid = rhs.notnull()
m.add_constraints(expr.sel(i=valid) <= rhs.sel(i=valid), name="c")
# or
m.add_constraints(expr <= rhs.fillna(0), mask=rhs.notnull(), name="c")

Why this is consistent

  • lhs >= rhs ≡ lhs - rhs >= 0 — RHS follows the same rules as any constant
  • No dual role for NaN: internal NaN (from shift, mask=) is structural; user NaN is always an error
  • Absent terms, not absent coordinates: combining a valid expression with a partially-absent one preserves the valid part

Implementation details

  • FILL_VALUE = {"vars": -1, "coeffs": NaN, "const": NaN} — all float fields use NaN for absence, integer fields use -1
  • NaN checks in _add_constant, _apply_constant_op, to_constraint
  • Piecewise internals use .fillna(0) on breakpoint data

Legacy behavior (default, backward-compatible)

  • Size-aware alignment: override when sizes match, left-join otherwise
  • NaN as neutral element: filled with 0 (add/sub/mul) or 1 (div)
  • Constraint RHS: NaN means "no constraint", preserved through subtraction

merge() behavior

merge() enforces exact matching on shared user-dimension coordinates in v1 mode. Helper dims (_term, _factor) and the concat dim are excluded from this check. The actual xr.concat uses join="outer".

In legacy mode, merge uses override when shared user dims have matching sizes, outer otherwise.

Source changes

File | Change
config.py | LinopyDeprecationWarning, arithmetic_convention setting
expressions.py | All arithmetic paths branch on convention; merge() pre-validates user-dim coords under v1; to_constraint has separate legacy/v1 paths; NaN validation at API boundaries
common.py | align() reads convention (legacy→inner, v1→exact)
variables.py | Scalar fast path in __mul__, explicit TypeError in __div__, .reindex() methods
piecewise.py | .fillna(0) on breakpoint data for v1 compatibility
monkey_patch_xarray.py | DataArray/Dataset arithmetic with linopy types
model.py | Convention-aware model methods

Documentation

Notebook | Content
arithmetic-convention.ipynb | Coordinate alignment rules, join parameter, migration guide
missing-data.ipynb | NaN convention principles, fillna patterns, masking with .sel() and mask=, legacy comparison
_nan-edge-cases.ipynb | Dev notebook: investigation of shift, roll, where, reindex, isnull, arithmetic on shifted expressions, FILL_VALUE internals

Test structure

  • Marker-based separation: @pytest.mark.v1_only and @pytest.mark.legacy_only for convention-specific tests
  • Shared fixtures in conftest.py with auto-convention switching
  • Tests validate: strict matching, explicit join, NaN raises, NaN propagation in shifted expressions, piecewise/SOS under both conventions
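The marker dispatch boils down to logic like the following (a simplified standalone sketch of what the conftest fixture presumably does; the marker names are from the PR, the function is invented):

```python
def should_skip(markers: set, active_convention: str) -> bool:
    # v1_only tests run only under "v1", legacy_only only under "legacy";
    # unmarked tests run under both conventions via the autouse fixture.
    if "v1_only" in markers and active_convention != "v1":
        return True
    if "legacy_only" in markers and active_convention != "legacy":
        return True
    return False

assert should_skip({"v1_only"}, "legacy")
assert not should_skip({"legacy_only"}, "legacy")
assert not should_skip(set(), "v1")  # unmarked: runs everywhere
```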

Rollout plan

  1. This PR: Default "legacy" — nothing breaks
  2. Downstream: Users opt in with linopy.options["arithmetic_convention"] = "v1"
  3. linopy v1: Flip default to "v1" and drop legacy mode

Open questions

  • from_tuples / linexpr() — Currently follows the global convention. In practice always called with same-coord variables, so convention doesn't matter. Low-priority.
  • nan_as_mask option — A future global setting could allow NaN in user data to be treated as masking (equivalent to .fillna(0) + mask=data.notnull()). Deferred for team discussion.
  • Pipe operator — Only linopy objects, or also constants? (follow-up PR)

Test plan

  • All tests pass under both conventions (3450 passed)
  • Legacy tests validate backward compatibility
  • v1 tests validate strict coordinate matching and NaN raises
  • Piecewise and SOS tests run under both conventions
  • Documentation notebooks execute cleanly

🤖 Generated with Claude Code

FabianHofmann and others added 19 commits February 9, 2026 14:28
Add le(), ge(), eq() methods to LinearExpression and Variable classes,
mirroring the pattern of add/sub/mul/div methods. These methods support
the join parameter for flexible coordinate alignment when creating constraints.
Consolidate repetitive alignment handling in _add_constant and
_apply_constant_op into a single _align_constant method. This
eliminates code duplication and makes the alignment behavior
(handling join parameter, fill_value, size-aware defaults) testable
and maintainable in one place.
numpy_to_dataarray no longer inflates ndim beyond arr.ndim, fixing
lower-dim numpy arrays as constraint RHS. Also reject higher-dim
constant arrays (numpy/pandas) consistently with DataArray behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use "exact" join for +/- (raises ValueError on mismatch), "inner" join
for *// (intersection), and "exact" for constraint DataArray RHS.
Named methods (.add(), .sub(), .mul(), .div(), .le(), .ge(), .eq())
accept explicit join= parameter as escape hatch.

- Remove shape-dependent "override" heuristic from merge() and
  _align_constant()
- Add join parameter support to to_constraint() for DataArray RHS
- Forbid extra dimensions on constraint RHS
- Update tests with structured raise-then-recover pattern
- Update coordinate-alignment notebook with examples and migration guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Feb 20, 2026

@FabianHofmann I'm quite happy with the notebook now. It showcases the convention and its consequences.
Tests need some work though, and migration as well.
Looking forward to your opinion on the convention.

FBumann and others added 2 commits February 20, 2026 13:51
…ords. Here's what changed:

  - test_linear_expression_sum / test_linear_expression_sum_with_const: v.loc[:9].add(v.loc[10:], join="override") → v.loc[:9] + v.loc[10:].assign_coords(dim_2=v.loc[:9].coords["dim_2"])
  - test_add_join_override → test_add_positional_assign_coords: uses v + disjoint.assign_coords(...)
  - test_add_constant_join_override → test_add_constant_positional: now uses different coords [5,6,7] + assign_coords to make the test meaningful
  - test_same_shape_add_join_override → test_same_shape_add_assign_coords: uses + c.to_linexpr().assign_coords(...)
  - test_add_constant_override_positional → test_add_constant_positional_different_coords: expr + other.assign_coords(...)
  - test_sub_constant_override → test_sub_constant_positional: expr - other.assign_coords(...)
  - test_mul_constant_override_positional → test_mul_constant_positional: expr * other.assign_coords(...)
  - test_div_constant_override_positional → test_div_constant_positional: expr / other.assign_coords(...)
  - test_variable_mul_override → test_variable_mul_positional: a * other.assign_coords(...)
  - test_variable_div_override → test_variable_div_positional: a / other.assign_coords(...)
  - test_add_same_coords_all_joins: removed "override" from loop, added assign_coords variant
  - test_add_scalar_with_explicit_join → test_add_scalar: simplified to expr + 10

FBumann commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Why "exact" instead of "inner" for * and /

"exact" still broadcasts freely over dimensions that only exist on one side — it only enforces strict matching on shared dimensions. So the common scaling pattern works fine:

cost = xr.DataArray([10, 20], coords=[("tech", ["wind", "solar"])])
capacity  # dims: (tech=["wind", "solar"], region=["A", "B"])

cost * capacity  # ✓ tech matches exactly, region broadcasts freely

"inner" is dangerous: if coords on a shared dimension don't match due to a typo or upstream change, it silently drops values. The explicit and safe way to subset before multiplying is:

capacity.sel(tech=["wind", "solar"]) * renewable_cost

No operation should introduce new dimensions

Neither side of any arithmetic operation should be allowed to introduce dimensions the other doesn't have. The same problem applies to + and - as to * and / — new dimensions silently expand the optimization problem in unintended ways:

cost_expr      # dims: (tech, time)
regional_expr  # dims: (tech, time, region)

cost_expr + regional_expr  # ✗ silently expands to (tech, time, region)

capacity  # dims: (tech, region, time)
risk      # dims: (tech, scenario)
risk * capacity  # ✗ silently expands to (tech, region, time, scenario)

An explicit pre-check on all operations:

asymmetric_dims = set(other.dims).symmetric_difference(set(self.dims))
if asymmetric_dims:
    raise ValueError(f"Operation introduces new dimensions: {asymmetric_dims}")

Summary

Operation | Convention
+, -, *, / | "exact" on shared dims; neither side may introduce dims the other doesn't have


coroa commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Let's clearly differentiate between dimensions and labels.

labels

I agree with "exact" for labels by default, but we need an easy way to have inner or outer joining characteristics. I found the pyoframe conventions
strange at the beginning, but they grew on me:

x + y.keep_extras() to say that an outer join is in order and mismatches should fill with 0.

x + y.drop_extras() to say that you want an inner join.
x.drop_extras() + y does the same, though.

I have in a different project used | 0 to indicate keep_extras, i.e. (x + y | 0).

dimensions

I am actually fond of the ability to auto-broadcast over different dimensions and would want to keep that (its absence is actually my main problem with pyoframe).

Your first example actually implicitly assumes broadcasting.


FBumann commented Feb 28, 2026

Dimensions and broadcasting

I agree that auto broadcasting is helpful in some cases.
I'm happy with allowing broadcasting of constants. We could allow this always...?
But I would enforce that the constant never has more dims than the variable/expression.
Or is there a use case for this?

So the full convention requires two separate things:
1. "exact" join — shared dims must have matching coords (xarray handles this)
2. Subset dim check — the constant side's dims must be a subset of the variable/expression's dims (custom pre-check needed)

labels

I'm not sure if I like this approach, as it needs careful state management of the flags on expressions. The flag (keep or drop extras) has to be tracked and handled.
I would rather require users to reindex or fill data to the correct index.
I think aligning is the correct approach:

import linopy

# outer join — fill gaps with 0 before adding
x_aligned, y_aligned = linopy.align(x, y, join="outer", fill_value=0)
x_aligned + y_aligned

# inner join — drop non-matching coords before adding
x_aligned, y_aligned = linopy.align(x, y, join="inner")
x_aligned + y_aligned

Combining disjoint expressions would then still need the explicit methods, though.
I'm interested in your take on this.


FBumann commented Feb 28, 2026

The proposed convention for all arithmetic operations in linopy:
1. "exact" join by default — shared coords must match exactly, raises on mismatch
2. Constant broadcasting — constants may introduce dimensions the variable/expression doesn't have
3. No implicit inner join — use .sel() explicitly instead
4. Outer join with fill — use x + (y | 0) or .add(join="outer", fill_value=0)
The escape hatches in order of preference: .sel() for subsetting, | 0 for inline fill, named method .add(join=...) for everything else. No context manager needed.

I'm not sure how to implement the | operator yet. It might need some sort of flag/state for deferred indexing.
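One conceivable shape for that flag/state (purely hypothetical, not part of this PR): | returns a small wrapper that records the fill value, and the left operand's __add__ unwraps it into an outer join. A toy sketch with string results standing in for real expressions:

```python
class _Filled:
    # Lightweight marker produced by `expr | fill_value`.
    def __init__(self, obj, fill_value):
        self.obj, self.fill_value = obj, fill_value

class Expr:
    # Stand-in for a linopy expression; only sketches the dispatch.
    def __init__(self, name):
        self.name = name

    def __or__(self, fill_value):
        return _Filled(self, fill_value)

    def add(self, other, join="exact", fill_value=None):
        return f"add({self.name}, {other.name}, join={join}, fill={fill_value})"

    def __add__(self, other):
        if isinstance(other, _Filled):
            # `x + (y | 0)` becomes an explicit outer join with fill
            return self.add(other.obj, join="outer", fill_value=other.fill_value)
        return self.add(other)

x, y = Expr("x"), Expr("y")
assert x + (y | 0) == "add(x, y, join=outer, fill=0)"
assert x + y == "add(x, y, join=exact, fill=None)"
```

The wrapper is consumed immediately by the enclosing operation, so no flag ever lingers on an expression.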


FBumann commented Feb 28, 2026

I thought about the pipe operator:
I think it should only work with linopy-internal types (Variables/expressions), not constants (scalar, numpy, pandas, DataArray), as supporting constants would require a lot of monkey patching and would be hard to keep stable.

Would this be an issue for you?

FBumann and others added 3 commits March 10, 2026 17:20
…erations-mixed

# Conflicts:
#	linopy/expressions.py
#	linopy/variables.py
#	test/conftest.py
#	test/test_constraints.py
#	test/test_linear_expression.py
- merge() in v1 mode now pre-validates that shared user-dimension
  coordinates match exactly, then uses outer join for xr.concat
  (helper dims like _term/_factor are excluded from the check)
- Removed redundant pre-checks from LinearExpression.__add__ and
  QuadraticExpression.__add__ — merge handles enforcement now
- Added scalar fast path in _apply_constant_op (mul/div skip alignment)
- Wrapped AlignmentError import in try/except for xarray compat
- Fixed missing space in __div__ error message
- Added .fillna() as escape hatch option 5 in notebook
- Updated merge docstring with convention behavior summary
- Added explanatory comments (stacklevel, numpy_to_dataarray filtering)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FBumann and others added 5 commits March 10, 2026 21:38
* Deduplicate convention-specific test files into single files

Merge 4 pairs of v1/legacy test files into single files, eliminating
~2600 lines of duplicated test code. Convention-specific alignment tests
are kept in separate classes (V1/Legacy) with autouse fixtures, while
shared tests run under the module-level v1 convention.

- test_typing_legacy.py -> merged into test_typing.py (parametrized)
- test_common_legacy.py -> merged into test_common.py (legacy align test)
- test_constraints_legacy.py -> merged into test_constraints.py (legacy alignment class)
- test_linear_expression_legacy.py -> merged into test_linear_expression.py (legacy alignment + join classes)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address PR review: consistency, dedup fixtures, missing test

- Add legacy_convention fixture to conftest.py; use it consistently
  instead of manual try/finally blocks (#1)
- Parametrize test_constant_with_extra_dims_broadcasts with convention
  fixture so it runs under both conventions (#2)
- Add missing test_quadratic_add_expr_join_inner to
  TestJoinParameterLegacy (#3)
- Extract shared fixtures into _CoordinateAlignmentFixtures and
  _ConstraintAlignmentFixtures mixin classes to eliminate fixture
  duplication between V1/Legacy alignment test classes (#4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Restore master tests, add autouse convention fixture

- Restore test files to match master exactly (legacy behavior)
- Delete legacy duplicate test files
- Add autouse parametrized convention fixture: every test runs
  under both 'legacy' and 'v1' conventions by default
- Add legacy_convention/v1_convention opt-out fixtures for
  convention-specific tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Mark legacy-only tests, add v1 counterparts for differing behavior

Tests that differ between conventions are split:
- Legacy-only: marked with legacy_convention fixture (skipped under v1)
- V1-only: marked with v1_convention fixture (skipped under legacy)
- All other tests: run under both conventions via autouse fixture

Files changed:
- test_common.py: split test_align into legacy/v1 versions
- test_constraints.py: mark TestConstraintCoordinateAlignment as
  legacy-only, add TestConstraintCoordinateAlignmentV1, split
  higher-dim RHS tests
- test_linear_expression.py: mark TestCoordinateAlignment as
  legacy-only, add TestCoordinateAlignmentV1, split sum/join tests
- test_piecewise_constraints.py: mark legacy-only (implementation
  not yet v1-compatible)
- test_sos_reformulation.py: mark legacy-only (implementation
  not yet v1-compatible)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Fix mypy error, strengthen tests, and add convention test coverage

- Fix mypy: use typed list[JoinOptions] for loop variable in test_linear_expression.py
- Strengthen assert_linequal in test_algebraic_properties.py to verify coefficients and vars
- Fix Variable.reindex_like() to handle DataArray inputs correctly
- Add test_convention.py covering config validation, deprecation warnings, scalar fast path,
  NaN edge cases, convention switching, error messages, and Variable.reindex/reindex_like

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add reindex/reindex_like tests for Expression and Constraint, fix DataArray bug

- Fix LinearExpression.reindex_like() to handle DataArray inputs (same bug as Variable)
- Add TestExpressionReindex: subset, superset, fill_value, type preservation,
  reindex_like with Expression/Variable/DataArray/Dataset
- Add TestConstraintReindex: subset, superset, reindex_like with Dataset/DataArray

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…parts (#611)


* Suppress LinopyDeprecationWarning in tests, add v1 constraint counterparts

- Convention fixture now filters LinopyDeprecationWarning under legacy,
  reducing test warnings from 9262 to 213.  Dedicated tests in
  test_convention.py still verify warnings are emitted.
- test_repr.py: suppress module-level deprecation warnings from
  collection-time model setup.
- TestConstraintCoordinateAlignmentV1: add comprehensive v1 counterparts
  covering all comparison operators (<=, >=, ==), subset/superset/expr
  raises, explicit join= escape hatches, assign_coords pattern, and
  higher-dim DataArray broadcast vs mismatch behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…tion

Introduce @pytest.mark.legacy_only and @pytest.mark.v1_only markers as the
single, consistent mechanism for convention-specific tests.  This replaces
five different patterns (class-level autouse fixtures, function fixture
params, module-level autouse fixtures, and inconsistent naming variants)
with one visible, declarative approach.

Changes:
- conftest.py: register markers, skip logic in convention fixture
- Remove all _legacy_only/_v1_only/_use_legacy/_use_v1 autouse fixtures
- Remove legacy_convention/v1_convention fixture params from signatures
- Module-level: pytestmark = pytest.mark.{legacy,v1}_only
- Class-level: @pytest.mark.{legacy,v1}_only decorator
- Function-level: @pytest.mark.{legacy,v1}_only decorator
- Supports pytest -m "legacy_only" / -m "v1_only" for filtering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@brynpickering

This is definitely a worthwhile update. How we've handled it in calliope is to allow dims to have mismatched labels but to require a default value for each parameter, expression, and variable, which is used when there is a mismatch. This aligns with keep_extras in pyoframe, except that the default value can change (e.g., for technology efficiency you would rather have a default value of 1 than of 0). It is the same as the fill_value proposed on aligning mentioned here, but extended to include vars and exprs.

Raising exceptions on misalignment can be a pain to deal with. We often have patchwork data that we want to combine with a clear fill value in mind that is equivalent to "no effect". I can see alignment being used a reasonable amount to deal with this. The issue is that you end up with lots of intermediate arrays in memory just to achieve alignment. Having a pre-defined default, or the ability to define a default when applying arithmetic operations, would keep in-memory arrays to a minimum. This would then be easy to integrate with #561.

On fill values: this PR only makes them available on parameters ("data"), but I'd say they're also useful for decision variables and expressions. E.g., when creating a total cost expression you might want to combine several instances of var1 * cost_param1 + var2 * cost_param2 + ... where dimension labels align neither between var1 and cost_param1 nor between var1 * cost_param1 and var2 * cost_param2, but you just always want the default to be zero whenever there is misalignment, since you'll later be summing all this to define your objective function.

FBumann and others added 3 commits March 11, 2026 13:31
…eview

Merge separate TestConstraintCoordinateAlignmentLegacy/V1 and
TestCoordinateAlignmentLegacy/V1 classes into unified classes where
legacy and v1 test methods sit side-by-side, grouped by scenario.

Test names now describe behavior (e.g. test_mul_subset_fills_zeros for
legacy, test_mul_subset_raises for v1) rather than using class-level
Legacy/V1 suffixes, making it clear what each convention expects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 22 missing v1_only counterparts across test_linear_expression.py:
  - TestSuperset: v1 raises for add/sub/mul commutativity and div
  - TestDisjoint: v1 raises for div
  - TestCommutativity: parametrized v1 raises for all ops
  - TestQuadratic: v1 raises for sub, reverse mul, reverse add
  - TestMissingValues: v1 NaN propagates for sub, div, commutativity, quadexpr
  - TestExpressionWithNaN: v1 NaN propagates for add/mul/div array,
    sub/div scalar
- Add v1 negative assertions in test_linear_expression_sum_v1 and
  test_linear_expression_sum_with_const_v1 (assert mismatched coords
  raise before showing assign_coords workaround)
- Add TestNoDeprecationWarnings v1 counterpart in test_convention.py
- Fix test_align_v1: use pytest.raises(ValueError) instead of bare Exception
- Remove redundant test_superset_comparison_raises (covered by parametrized
  test_superset_comparison_var_raises)
- Remove v1_only marker from TestScalarFastPath (convention-independent)
- Un-mark test_variable_to_linexpr_nan_coefficient as legacy_only
  (to_linexpr fills NaN under both conventions)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use drop=True on scalar isel calls to prevent residual scalar coordinates
from causing exact-join mismatches under the v1 arithmetic convention.
Also align binary_hi coordinates with delta_lo in incremental PWL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Mar 11, 2026

@brynpickering As I understand your comment, you are mostly OK with the convention but have thoughts on NaN data. Am I right? This is what I'm currently not that opinionated about, so I'm happy to hear your suggestions.

FBumann and others added 5 commits March 12, 2026 08:22
NaN in linopy v1 means "absent term" — it marks individual terms as
missing without masking entire coordinates. User-supplied NaN at API
boundaries (constants, factors, constraint RHS) raises ValueError;
masking must be explicit via .sel() or mask=.

Implementation:
- FILL_VALUE["coeffs"] changed from NaN to 0 (structural "no term")
- NaN validation added in _add_constant, _apply_constant_op, to_constraint
- Piecewise internals use .fillna(0) on breakpoint data
- Tests updated to expect ValueError for NaN operands under v1

Key design decisions:
- NaN enters only via mask= or structural ops (shift, reindex, where)
- Combining expressions: absent terms do not poison valid terms
  (xr.sum skipna=True preserves valid contributions)
- A coordinate is fully absent only when ALL terms have vars=-1 AND
  const is NaN — this is what isnull() checks
- lhs >= rhs ≡ lhs - rhs >= 0, so RHS follows the same rules as constants

Documentation:
- New missing-data.ipynb: convention principles, fillna patterns,
  masking with .sel() and mask=, migration guide from legacy
- New nan-edge-cases.ipynb: investigation of shift, roll, where,
  reindex, isnull, arithmetic on shifted expressions, sanitize_missings
- arithmetic-convention.ipynb: updated to reference missing-data notebook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shift, where, reindex, reindex_like, unstack produce absent terms.
roll, sel, isel, drop_sel, expand_dims, broadcast_like do not.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Coeffs=0 was an implicit choice about the neutral element for
multiplication. NaN is more honest — it means "absent", which is
what FILL_VALUE is for. Both NaN and 0 coeffs get filtered by
filter_nulls_polars at solve time, so behavior is unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New cases: x + y.shift(1) + 5, x + (y+5).shift(1) + 5 (shifted
const is lost), x.shift(1) + y.shift(1) (fully absent coordinate).
Updated FILL_VALUE docs to reflect coeffs=NaN (not 0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Mar 12, 2026

@brynpickering Thanks for the detailed feedback.

Fill values depend on context (efficiency=1 vs cost=0) — this is why I'd like to make it a user choice via .fillna(value) rather than picking a default.

Misalignment doesn't have to raise. The escape hatches are lightweight:

total = cost_a.add(cost_b, join="outer")  # absent terms → zero at solve time
scaled = var * efficiency.fillna(1)        # you choose the fill

As far as I know, memory overhead is the same either way — fillna() on a parameter is not much larger than the parameter itself, and join="outer" does the alignment inside xarray's concat anyway. But maybe I'm wrong here.

For expressions with mismatched labels, join="outer" already works: absent terms carry NaN coefficients that are filtered out when flattening for the solver. No explicit fill value needed.
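A toy sketch of that flatten-time filtering (the function and names are invented for illustration; per the commit messages, the actual routine is filter_nulls_polars):

```python
import math

# Entries whose var slot is -1 or whose coefficient is NaN
# never reach the solver.

def flatten_terms(vars_, coeffs):
    return [
        (v, c)
        for v, c in zip(vars_, coeffs)
        if v != -1 and not math.isnan(c)
    ]

terms = flatten_terms([3, -1, 7], [2.0, float("nan"), 1.5])
assert terms == [(3, 2.0), (7, 1.5)]  # the absent term is dropped
```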

Maybe adding a fill_value= parameter on .mul() / .add() resolves your issue if the fillna + join= pattern proves too verbose in practice?

@FBumann FBumann requested review from FabianHofmann and coroa March 12, 2026 08:45

FBumann commented Mar 12, 2026

@coroa @FabianHofmann I'm quite happy with this, but I think it desperately needs discussion.

@FBumann FBumann marked this pull request as ready for review March 12, 2026 08:46
@FBumann FBumann changed the title refac: Harmonize linopy operations with breaking convention refac: Harmonize linopy operations and introduce a new predictable and strict convention Mar 12, 2026
@brynpickering

@FBumann your fillna suggestion is fine for parameters, but it doesn't address NaN filling further down the line for variables or expressions. As far as I understand linopy objects, it isn't possible to NaN-fill them in the same way. This risks NaNs making their way into arithmetic where one would rather have some numeric value with no effect.
