refac: Harmonize linopy operations and introduce a new predictable and strict convention#591

Open
FBumann wants to merge 71 commits into master from harmonize-linopy-operations-mixed

Conversation


@FBumann FBumann commented Feb 20, 2026

Harmonize linopy arithmetic with legacy/v1 convention transition

This PR harmonizes coordinate alignment and NaN handling in linopy arithmetic, with a non-breaking transition layer.

linopy.options["arithmetic_convention"]

Two modes:

  • "legacy" (default) — reproduces current master behavior exactly. Emits LinopyDeprecationWarning on every legacy codepath.
  • "v1" — strict exact-join semantics: mismatched coordinates raise ValueError with helpful messages. NaN in user data raises immediately.
linopy.options["arithmetic_convention"] = "v1"

v1 behavior

Exact coordinate matching — Arithmetic operators (+, -, *, /) require matching coordinates on shared dimensions. Mismatched coordinates raise ValueError with suggestions:

x[i=0,1,2] + y[i=1,2,3]  # ValueError: use .add(y, join="inner")

Named methods with explicit join — .add(), .sub(), .mul(), .div(), .le(), .ge(), .eq() accept a join= parameter:

x.add(y, join="inner")      # intersection
x.add(y, join="outer")      # union with fill
x.add(y, join="left")       # keep x's coordinates
x.add(y, join="override")   # positional alignment

Free broadcasting — Constants can introduce new dimensions. All algebraic laws hold.
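To make the rule concrete, here is a minimal standalone sketch of the v1 check (plain dicts stand in for coordinate indexes; check_v1_alignment is a hypothetical helper, not linopy API): shared dimensions must match exactly, dimensions present on only one side broadcast freely.

```python
# Hypothetical sketch of the v1 rule (not linopy's actual code):
# shared dimensions must carry identical coordinates; dimensions
# present on only one side broadcast freely.

def check_v1_alignment(dims_a: dict, dims_b: dict) -> None:
    """dims_* map a dimension name to a tuple of coordinate labels."""
    for dim in dims_a.keys() & dims_b.keys():
        if dims_a[dim] != dims_b[dim]:
            raise ValueError(
                f"coordinates on shared dim {dim!r} differ; "
                'use .add(other, join="inner") or reindex explicitly'
            )

x = {"i": (0, 1, 2)}
y = {"i": (0, 1, 2), "region": ("A", "B")}  # extra dim: broadcasts
check_v1_alignment(x, y)  # no error

z = {"i": (1, 2, 3)}  # shifted labels on the shared dim
try:
    check_v1_alignment(x, z)
    raised = False
except ValueError:
    raised = True
assert raised
```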


v1 NaN convention

NaN means "absent term" — never a numeric value.

How NaN enters

Only two sources:

  1. mask= argument at construction (add_variables, add_constraints)
  2. Structural operations: .shift(), .where(), .reindex(), .reindex_like(), .unstack() (with missing combinations)

Operations that do not produce NaN: .roll() (circular), .sel() / .isel() (subset), .drop_sel() (drops), .expand_dims() / .broadcast_like() (broadcast existing data).
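As a toy illustration of the distinction (plain lists stand in for the vars field, with -1 marking an absent slot; this is not linopy code):

```python
# .shift() vacates slots -> absent terms; .roll() is circular -> no absence.

def shift_vars(vars_, n=1, fill=-1):
    # vacated leading slots become absent (fill = -1)
    return [fill] * n + vars_[: len(vars_) - n]

def roll_vars(vars_, n=1):
    # circular rotation: every slot stays filled
    return vars_[-n:] + vars_[:-n]

v = [10, 11, 12]
assert shift_vars(v) == [-1, 10, 11]  # leading slot is now absent
assert roll_vars(v) == [12, 10, 11]   # no absence introduced
```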

How NaN propagates

NaN marks an individual term as absent, not the entire coordinate. When expressions are combined (e.g., x*2 + y.shift(time=1)), each term is independent — an absent term from y.shift does not mask the valid x term at the same coordinate.

A coordinate is only fully absent when all terms have vars=-1 and const is NaN. This is what isnull() checks.
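A minimal sketch of that rule, assuming the FILL_VALUE layout described in this PR (vars = -1 for an absent term slot, NaN const); the helper name is invented:

```python
import math

# Hypothetical helper mirroring the isnull() rule: a coordinate is
# fully absent only when every term slot is absent (vars == -1)
# AND the constant is NaN.

def is_fully_absent(term_vars: list, const: float) -> bool:
    return all(v == -1 for v in term_vars) and math.isnan(const)

# x*2 + y.shift(...): the y term is absent, the x term is not,
# so the coordinate is NOT null.
assert not is_fully_absent([3, -1], const=0.0)
# All terms absent and const NaN -> isnull() would be True.
assert is_fully_absent([-1, -1], const=float("nan"))
```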

What raises

Any user-supplied NaN at an API boundary — in constants, factors, or constraint RHS — raises ValueError immediately:

x + data_with_nans        # ValueError
x * factor_with_nans      # ValueError
expr >= rhs_with_nans     # ValueError

Users handle NaN explicitly:

# Fill before arithmetic — you choose the fill value
x + data.fillna(0)        # NaN = "no offset"
x * factor.fillna(0)      # NaN = "exclude this term"
x * factor.fillna(1)      # NaN = "no scaling"

# Mask constraints — .sel() (preferred) or mask=
valid = rhs.notnull()
m.add_constraints(expr.sel(i=valid) <= rhs.sel(i=valid), name="c")
# or
m.add_constraints(expr <= rhs.fillna(0), mask=rhs.notnull(), name="c")

Why this is consistent

  • lhs >= rhs ≡ lhs - rhs >= 0 — RHS follows the same rules as any constant
  • No dual role for NaN: internal NaN (from shift, mask=) is structural; user NaN is always an error
  • Absent terms, not absent coordinates: combining a valid expression with a partially-absent one preserves the valid part

Implementation details

  • FILL_VALUE = {"vars": -1, "coeffs": NaN, "const": NaN} — all float fields use NaN for absence, integer fields use -1
  • NaN checks in _add_constant, _apply_constant_op, to_constraint
  • Piecewise internals use .fillna(0) on breakpoint data

Legacy behavior (default, backward-compatible)

  • Size-aware alignment: override when sizes match, left-join otherwise
  • NaN as neutral element: filled with 0 (add/sub/mul) or 1 (div)
  • Constraint RHS: NaN means "no constraint", preserved through subtraction

merge() behavior

merge() enforces exact matching on shared user-dimension coordinates in v1 mode. Helper dims (_term, _factor) and the concat dim are excluded from this check. The actual xr.concat uses join="outer".

In legacy mode, merge uses override when shared user dims have matching sizes, outer otherwise.

Source changes

File | Change
config.py | LinopyDeprecationWarning, arithmetic_convention setting
expressions.py | All arithmetic paths branch on convention; merge() pre-validates user-dim coords under v1; to_constraint has separate legacy/v1 paths; NaN validation at API boundaries
common.py | align() reads convention (legacy→inner, v1→exact)
variables.py | Scalar fast path in __mul__, explicit TypeError in __div__, .reindex() methods
piecewise.py | .fillna(0) on breakpoint data for v1 compatibility
monkey_patch_xarray.py | DataArray/Dataset arithmetic with linopy types
model.py | Convention-aware model methods

Documentation

Notebook | Content
arithmetic-convention.ipynb | Coordinate alignment rules, join parameter, migration guide
missing-data.ipynb | NaN convention principles, fillna patterns, masking with .sel() and mask=, legacy comparison
_nan-edge-cases.ipynb | Dev notebook: investigation of shift, roll, where, reindex, isnull, arithmetic on shifted expressions, FILL_VALUE internals

Test structure

  • Marker-based separation: @pytest.mark.v1_only and @pytest.mark.legacy_only for convention-specific tests
  • Shared fixtures in conftest.py with auto-convention switching
  • Tests validate: strict matching, explicit join, NaN raises, NaN propagation in shifted expressions, piecewise/SOS under both conventions
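The marker dispatch boils down to logic like the following (a simplified standalone sketch of what the conftest fixture presumably does; the marker names are from the PR, the function is invented):

```python
def should_skip(markers: set, active_convention: str) -> bool:
    # v1_only tests run only under "v1", legacy_only only under "legacy";
    # unmarked tests run under both conventions via the autouse fixture.
    if "v1_only" in markers and active_convention != "v1":
        return True
    if "legacy_only" in markers and active_convention != "legacy":
        return True
    return False

assert should_skip({"v1_only"}, "legacy")
assert not should_skip({"legacy_only"}, "legacy")
assert not should_skip(set(), "v1")  # unmarked: runs everywhere
```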

Rollout plan

  1. This PR: Default "legacy" — nothing breaks
  2. Downstream: Users opt in with linopy.options["arithmetic_convention"] = "v1"
  3. linopy v1: Flip default to "v1" and drop legacy mode

Open questions

  • from_tuples / linexpr() — Currently follows the global convention. In practice always called with same-coord variables, so convention doesn't matter. Low-priority.
  • nan_as_mask option — A future global setting could allow NaN in user data to be treated as masking (equivalent to .fillna(0) + mask=data.notnull()). Deferred for team discussion.
  • Pipe operator — Only linopy objects, or also constants? (follow-up PR)

Test plan

  • All tests pass under both conventions (3450 passed)
  • Legacy tests validate backward compatibility
  • v1 tests validate strict coordinate matching and NaN raises
  • Piecewise and SOS tests run under both conventions
  • Documentation notebooks execute cleanly

🤖 Generated with Claude Code

FabianHofmann and others added 19 commits February 9, 2026 14:28
Add le(), ge(), eq() methods to LinearExpression and Variable classes,
mirroring the pattern of add/sub/mul/div methods. These methods support
the join parameter for flexible coordinate alignment when creating constraints.
Consolidate repetitive alignment handling in _add_constant and
_apply_constant_op into a single _align_constant method. This
eliminates code duplication and makes the alignment behavior
(handling join parameter, fill_value, size-aware defaults) testable
and maintainable in one place.
numpy_to_dataarray no longer inflates ndim beyond arr.ndim, fixing
lower-dim numpy arrays as constraint RHS. Also reject higher-dim
constant arrays (numpy/pandas) consistently with DataArray behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use "exact" join for +/- (raises ValueError on mismatch), "inner" join
for *// (intersection), and "exact" for constraint DataArray RHS.
Named methods (.add(), .sub(), .mul(), .div(), .le(), .ge(), .eq())
accept explicit join= parameter as escape hatch.

- Remove shape-dependent "override" heuristic from merge() and
  _align_constant()
- Add join parameter support to to_constraint() for DataArray RHS
- Forbid extra dimensions on constraint RHS
- Update tests with structured raise-then-recover pattern
- Update coordinate-alignment notebook with examples and migration guide

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Feb 20, 2026

@FabianHofmann I'm quite happy with the notebook now. It showcases the convention and its consequences.
Tests need some work though, and migration as well.
Looking forward to your opinion on the convention.

FBumann and others added 2 commits February 20, 2026 13:51
…ords. Here's what changed:

  - test_linear_expression_sum / test_linear_expression_sum_with_const: v.loc[:9].add(v.loc[10:], join="override") → v.loc[:9] + v.loc[10:].assign_coords(dim_2=v.loc[:9].coords["dim_2"])
  - test_add_join_override → test_add_positional_assign_coords: uses v + disjoint.assign_coords(...)
  - test_add_constant_join_override → test_add_constant_positional: now uses different coords [5,6,7] + assign_coords to make the test meaningful
  - test_same_shape_add_join_override → test_same_shape_add_assign_coords: uses + c.to_linexpr().assign_coords(...)
  - test_add_constant_override_positional → test_add_constant_positional_different_coords: expr + other.assign_coords(...)
  - test_sub_constant_override → test_sub_constant_positional: expr - other.assign_coords(...)
  - test_mul_constant_override_positional → test_mul_constant_positional: expr * other.assign_coords(...)
  - test_div_constant_override_positional → test_div_constant_positional: expr / other.assign_coords(...)
  - test_variable_mul_override → test_variable_mul_positional: a * other.assign_coords(...)
  - test_variable_div_override → test_variable_div_positional: a / other.assign_coords(...)
  - test_add_same_coords_all_joins: removed "override" from loop, added assign_coords variant
  - test_add_scalar_with_explicit_join → test_add_scalar: simplified to expr + 10

FBumann commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Why "exact" instead of "inner" for * and /

"exact" still broadcasts freely over dimensions that only exist on one side — it only enforces strict matching on shared dimensions. So the common scaling pattern works fine:

cost = xr.DataArray([10, 20], coords=[("tech", ["wind", "solar"])])
capacity  # dims: (tech=["wind", "solar"], region=["A", "B"])

cost * capacity  # ✓ tech matches exactly, region broadcasts freely

"inner" is dangerous: if coords on a shared dimension don't match due to a typo or upstream change, it silently drops values. The explicit and safe way to subset before multiplying is:

capacity.sel(tech=["wind", "solar"]) * renewable_cost

No operation should introduce new dimensions

Neither side of any arithmetic operation should be allowed to introduce dimensions the other doesn't have. The same problem applies to + and - as to * and / — new dimensions silently expand the optimization problem in unintended ways:

cost_expr      # dims: (tech, time)
regional_expr  # dims: (tech, time, region)

cost_expr + regional_expr  # ✗ silently expands to (tech, time, region)

capacity  # dims: (tech, region, time)
risk      # dims: (tech, scenario)
risk * capacity  # ✗ silently expands to (tech, region, time, scenario)

An explicit pre-check on all operations:

asymmetric_dims = set(other.dims).symmetric_difference(set(self.dims))
if asymmetric_dims:
    raise ValueError(f"Operation introduces new dimensions: {asymmetric_dims}")

Summary

Operation | Convention
+, -, *, / | "exact" on shared dims; neither side may introduce dims the other doesn't have


coroa commented Feb 27, 2026

The convention should be "exact" for all of +, -, *, /, with an additional check that neither side may introduce dimensions the other doesn't have — also for all operations.

Let's clearly differentiate between dimensions and labels.

labels

I agree with "exact" for labels by default, but we need an easy way to have inner or outer joining characteristics. I found the pyoframe conventions
strange at the beginning, but they grew on me:

x + y.keep_extras() to say that an outer join is in order and mismatches should fill with 0.

x + y.drop_extras() to say that you want an inner join.
x.drop_extras() + y does the same, though.

I have in a different project used | 0 to indicate keep_extras, i.e. (x + y | 0).

dimensions

I am actually fond of the ability to auto-broadcast over different dimensions and would want to keep that (its absence is actually my main problem with pyoframe).

Your first example actually implicitly assumes broadcasting.


FBumann commented Feb 28, 2026

Dimensions and broadcasting

I agree that auto broadcasting is helpful in some cases.
I'm happy with allowing broadcasting of constants. We could allow this always...?
But I would enforce that the constant never has more dims than the variable/expression.
Or is there a use case for this?

So the full convention requires two separate things:
1. "exact" join — shared dims must have matching coords (xarray handles this)
2. Subset dim check — the constant side's dims must be a subset of the variable/expression's dims (custom pre-check needed)

labels

I'm not sure if I like this approach, as it needs careful state management of the flags on expressions. The flag (keep or drop extras) has to be tracked and handled.
I would rather require users to reindex or fill data to the correct index.
I think aligning is the correct approach:

import linopy

# outer join — fill gaps with 0 before adding
x_aligned, y_aligned = linopy.align(x, y, join="outer", fill_value=0)
x_aligned + y_aligned

# inner join — drop non-matching coords before adding
x_aligned, y_aligned = linopy.align(x, y, join="inner")
x_aligned + y_aligned

Combining disjoint expressions would then still need the explicit methods, though.
I'm interested in your take on this.


FBumann commented Feb 28, 2026

The proposed convention for all arithmetic operations in linopy:
1. "exact" join by default — shared coords must match exactly, raises on mismatch
2. Constant broadcasting — constants may introduce dimensions the variable/expression doesn't have
3. No implicit inner join — use .sel() explicitly instead
4. Outer join with fill — use x + (y | 0) or .add(join="outer", fill_value=0)
The escape hatches in order of preference: .sel() for subsetting, | 0 for inline fill, named method .add(join=...) for everything else. No context manager needed.

I'm not sure how to implement the | operator yet. It might need some sort of flag/state for deferred indexing.
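One conceivable shape for that flag/state (purely hypothetical, not part of this PR): | returns a small wrapper that records the fill value, and the left operand's __add__ unwraps it into an outer join. A toy sketch with string results standing in for real expressions:

```python
class _Filled:
    # Lightweight marker produced by `expr | fill_value`.
    def __init__(self, obj, fill_value):
        self.obj, self.fill_value = obj, fill_value

class Expr:
    # Stand-in for a linopy expression; only sketches the dispatch.
    def __init__(self, name):
        self.name = name

    def __or__(self, fill_value):
        return _Filled(self, fill_value)

    def add(self, other, join="exact", fill_value=None):
        return f"add({self.name}, {other.name}, join={join}, fill={fill_value})"

    def __add__(self, other):
        if isinstance(other, _Filled):
            # `x + (y | 0)` becomes an explicit outer join with fill
            return self.add(other.obj, join="outer", fill_value=other.fill_value)
        return self.add(other)

x, y = Expr("x"), Expr("y")
assert x + (y | 0) == "add(x, y, join=outer, fill=0)"
assert x + y == "add(x, y, join=exact, fill=None)"
```

The wrapper is consumed immediately by the enclosing operation, so no flag ever lingers on an expression.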


FBumann commented Feb 28, 2026

I thought about the pipe operator:
I think it should only work with linopy-internal types (Variables/expressions), not constants (scalar, numpy, pandas, DataArray), as supporting constants would require a lot of monkey patching and would be hard to keep stable.

Would this be an issue for you?

FBumann and others added 3 commits March 10, 2026 17:20
…erations-mixed

# Conflicts:
#	linopy/expressions.py
#	linopy/variables.py
#	test/conftest.py
#	test/test_constraints.py
#	test/test_linear_expression.py
- merge() in v1 mode now pre-validates that shared user-dimension
  coordinates match exactly, then uses outer join for xr.concat
  (helper dims like _term/_factor are excluded from the check)
- Removed redundant pre-checks from LinearExpression.__add__ and
  QuadraticExpression.__add__ — merge handles enforcement now
- Added scalar fast path in _apply_constant_op (mul/div skip alignment)
- Wrapped AlignmentError import in try/except for xarray compat
- Fixed missing space in __div__ error message
- Added .fillna() as escape hatch option 5 in notebook
- Updated merge docstring with convention behavior summary
- Added explanatory comments (stacklevel, numpy_to_dataarray filtering)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FBumann and others added 5 commits March 10, 2026 21:38
* Deduplicate convention-specific test files into single files

Merge 4 pairs of v1/legacy test files into single files, eliminating
~2600 lines of duplicated test code. Convention-specific alignment tests
are kept in separate classes (V1/Legacy) with autouse fixtures, while
shared tests run under the module-level v1 convention.

- test_typing_legacy.py -> merged into test_typing.py (parametrized)
- test_common_legacy.py -> merged into test_common.py (legacy align test)
- test_constraints_legacy.py -> merged into test_constraints.py (legacy alignment class)
- test_linear_expression_legacy.py -> merged into test_linear_expression.py (legacy alignment + join classes)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address PR review: consistency, dedup fixtures, missing test

- Add legacy_convention fixture to conftest.py; use it consistently
  instead of manual try/finally blocks (#1)
- Parametrize test_constant_with_extra_dims_broadcasts with convention
  fixture so it runs under both conventions (#2)
- Add missing test_quadratic_add_expr_join_inner to
  TestJoinParameterLegacy (#3)
- Extract shared fixtures into _CoordinateAlignmentFixtures and
  _ConstraintAlignmentFixtures mixin classes to eliminate fixture
  duplication between V1/Legacy alignment test classes (#4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Restore master tests, add autouse convention fixture

- Restore test files to match master exactly (legacy behavior)
- Delete legacy duplicate test files
- Add autouse parametrized convention fixture: every test runs
  under both 'legacy' and 'v1' conventions by default
- Add legacy_convention/v1_convention opt-out fixtures for
  convention-specific tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Mark legacy-only tests, add v1 counterparts for differing behavior

Tests that differ between conventions are split:
- Legacy-only: marked with legacy_convention fixture (skipped under v1)
- V1-only: marked with v1_convention fixture (skipped under legacy)
- All other tests: run under both conventions via autouse fixture

Files changed:
- test_common.py: split test_align into legacy/v1 versions
- test_constraints.py: mark TestConstraintCoordinateAlignment as
  legacy-only, add TestConstraintCoordinateAlignmentV1, split
  higher-dim RHS tests
- test_linear_expression.py: mark TestCoordinateAlignment as
  legacy-only, add TestCoordinateAlignmentV1, split sum/join tests
- test_piecewise_constraints.py: mark legacy-only (implementation
  not yet v1-compatible)
- test_sos_reformulation.py: mark legacy-only (implementation
  not yet v1-compatible)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Fix mypy error, strengthen tests, and add convention test coverage

- Fix mypy: use typed list[JoinOptions] for loop variable in test_linear_expression.py
- Strengthen assert_linequal in test_algebraic_properties.py to verify coefficients and vars
- Fix Variable.reindex_like() to handle DataArray inputs correctly
- Add test_convention.py covering config validation, deprecation warnings, scalar fast path,
  NaN edge cases, convention switching, error messages, and Variable.reindex/reindex_like

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add reindex/reindex_like tests for Expression and Constraint, fix DataArray bug

- Fix LinearExpression.reindex_like() to handle DataArray inputs (same bug as Variable)
- Add TestExpressionReindex: subset, superset, fill_value, type preservation,
  reindex_like with Expression/Variable/DataArray/Dataset
- Add TestConstraintReindex: subset, superset, reindex_like with Dataset/DataArray

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…parts (#611)


* Suppress LinopyDeprecationWarning in tests, add v1 constraint counterparts

- Convention fixture now filters LinopyDeprecationWarning under legacy,
  reducing test warnings from 9262 to 213.  Dedicated tests in
  test_convention.py still verify warnings are emitted.
- test_repr.py: suppress module-level deprecation warnings from
  collection-time model setup.
- TestConstraintCoordinateAlignmentV1: add comprehensive v1 counterparts
  covering all comparison operators (<=, >=, ==), subset/superset/expr
  raises, explicit join= escape hatches, assign_coords pattern, and
  higher-dim DataArray broadcast vs mismatch behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…tion

Introduce @pytest.mark.legacy_only and @pytest.mark.v1_only markers as the
single, consistent mechanism for convention-specific tests.  This replaces
five different patterns (class-level autouse fixtures, function fixture
params, module-level autouse fixtures, and inconsistent naming variants)
with one visible, declarative approach.

Changes:
- conftest.py: register markers, skip logic in convention fixture
- Remove all _legacy_only/_v1_only/_use_legacy/_use_v1 autouse fixtures
- Remove legacy_convention/v1_convention fixture params from signatures
- Module-level: pytestmark = pytest.mark.{legacy,v1}_only
- Class-level: @pytest.mark.{legacy,v1}_only decorator
- Function-level: @pytest.mark.{legacy,v1}_only decorator
- Supports pytest -m "legacy_only" / -m "v1_only" for filtering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@brynpickering

This is definitely a worthwhile update. How we've handled it in calliope is to allow dims to have mismatched labels but to require a default value for each parameter, expression, and variable, which is used when there is a mismatch. This aligns with keep_extras in pyoframe, except that the default value can change (e.g., for technology efficiency you would rather have a default value of 1 than of 0). It is the same as the fill_value proposed on aligning mentioned here, but extended to include vars and exprs.

Raising exceptions on misalignment can be a pain to deal with. We often have patchwork data that we want to combine with a clear fill value in mind that is equivalent to "no effect". I can see alignment being used a reasonable amount to deal with this. The issue is that you end up with lots of intermediate arrays in memory just to achieve alignment. Having a pre-defined default, or the ability to define a default when applying arithmetic operations, would keep in-memory arrays to a minimum. This would then be easy to integrate with #561.

On fill values: this PR only makes them available on parameters ("data"), but I'd say they're also useful for decision variables and expressions. E.g., when creating a total cost expression you might want to combine several instances of var1 * cost_param1 + var2 * cost_param2 + ... where dimension labels align neither between var1 and cost_param1 nor between var1 * cost_param1 and var2 * cost_param2, but you just always want the default to be zero whenever there is misalignment, since you'll later be summing all this to define your objective function.

FBumann and others added 3 commits March 11, 2026 13:31
…eview

Merge separate TestConstraintCoordinateAlignmentLegacy/V1 and
TestCoordinateAlignmentLegacy/V1 classes into unified classes where
legacy and v1 test methods sit side-by-side, grouped by scenario.

Test names now describe behavior (e.g. test_mul_subset_fills_zeros for
legacy, test_mul_subset_raises for v1) rather than using class-level
Legacy/V1 suffixes, making it clear what each convention expects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add 22 missing v1_only counterparts across test_linear_expression.py:
  - TestSuperset: v1 raises for add/sub/mul commutativity and div
  - TestDisjoint: v1 raises for div
  - TestCommutativity: parametrized v1 raises for all ops
  - TestQuadratic: v1 raises for sub, reverse mul, reverse add
  - TestMissingValues: v1 NaN propagates for sub, div, commutativity, quadexpr
  - TestExpressionWithNaN: v1 NaN propagates for add/mul/div array,
    sub/div scalar
- Add v1 negative assertions in test_linear_expression_sum_v1 and
  test_linear_expression_sum_with_const_v1 (assert mismatched coords
  raise before showing assign_coords workaround)
- Add TestNoDeprecationWarnings v1 counterpart in test_convention.py
- Fix test_align_v1: use pytest.raises(ValueError) instead of bare Exception
- Remove redundant test_superset_comparison_raises (covered by parametrized
  test_superset_comparison_var_raises)
- Remove v1_only marker from TestScalarFastPath (convention-independent)
- Un-mark test_variable_to_linexpr_nan_coefficient as legacy_only
  (to_linexpr fills NaN under both conventions)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use drop=True on scalar isel calls to prevent residual scalar coordinates
from causing exact-join mismatches under the v1 arithmetic convention.
Also align binary_hi coordinates with delta_lo in incremental PWL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Mar 11, 2026

@brynpickering As I understand your comment, you are mostly OK with the convention but have thoughts on NaN data. Am I right? This is what I'm currently not that opinionated about, so I'm happy to hear your suggestions.

FBumann and others added 5 commits March 12, 2026 08:22
NaN in linopy v1 means "absent term" — it marks individual terms as
missing without masking entire coordinates. User-supplied NaN at API
boundaries (constants, factors, constraint RHS) raises ValueError;
masking must be explicit via .sel() or mask=.

Implementation:
- FILL_VALUE["coeffs"] changed from NaN to 0 (structural "no term")
- NaN validation added in _add_constant, _apply_constant_op, to_constraint
- Piecewise internals use .fillna(0) on breakpoint data
- Tests updated to expect ValueError for NaN operands under v1

Key design decisions:
- NaN enters only via mask= or structural ops (shift, reindex, where)
- Combining expressions: absent terms do not poison valid terms
  (xr.sum skipna=True preserves valid contributions)
- A coordinate is fully absent only when ALL terms have vars=-1 AND
  const is NaN — this is what isnull() checks
- lhs >= rhs ≡ lhs - rhs >= 0, so RHS follows the same rules as constants

Documentation:
- New missing-data.ipynb: convention principles, fillna patterns,
  masking with .sel() and mask=, migration guide from legacy
- New nan-edge-cases.ipynb: investigation of shift, roll, where,
  reindex, isnull, arithmetic on shifted expressions, sanitize_missings
- arithmetic-convention.ipynb: updated to reference missing-data notebook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shift, where, reindex, reindex_like, unstack produce absent terms.
roll, sel, isel, drop_sel, expand_dims, broadcast_like do not.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Coeffs=0 was an implicit choice about the neutral element for
multiplication. NaN is more honest — it means "absent", which is
what FILL_VALUE is for. Both NaN and 0 coeffs get filtered by
filter_nulls_polars at solve time, so behavior is unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New cases: x + y.shift(1) + 5, x + (y+5).shift(1) + 5 (shifted
const is lost), x.shift(1) + y.shift(1) (fully absent coordinate).
Updated FILL_VALUE docs to reflect coeffs=NaN (not 0).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

FBumann commented Mar 12, 2026

@brynpickering Thanks for the detailed feedback.

Fill values depend on context (efficiency=1 vs cost=0) — this is why I'd like to make it a user choice via .fillna(value) rather than picking a default.

Misalignment doesn't have to raise. The escape hatches are lightweight:

total = cost_a.add(cost_b, join="outer")  # absent terms → zero at solve time
scaled = var * efficiency.fillna(1)        # you choose the fill

As far as I know, memory overhead is the same either way — fillna() on a parameter is not much larger than the parameter itself, and join="outer" does the alignment inside xarray's concat anyway. But maybe I'm wrong here.

For expressions with mismatched labels, join="outer" already works: absent terms carry NaN coefficients that are filtered out when flattening for the solver. No explicit fill value needed.
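A toy sketch of that flatten-time filtering (the function and names are invented for illustration; per the commit messages, the actual routine is filter_nulls_polars):

```python
import math

# Entries whose var slot is -1 or whose coefficient is NaN
# never reach the solver.

def flatten_terms(vars_, coeffs):
    return [
        (v, c)
        for v, c in zip(vars_, coeffs)
        if v != -1 and not math.isnan(c)
    ]

terms = flatten_terms([3, -1, 7], [2.0, float("nan"), 1.5])
assert terms == [(3, 2.0), (7, 1.5)]  # the absent term is dropped
```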

Maybe adding a fill_value= parameter on .mul() / .add() resolves your issue if the fillna + join= pattern proves too verbose in practice?

@FBumann FBumann requested review from FabianHofmann and coroa March 12, 2026 08:45

FBumann commented Mar 12, 2026

@coroa @FabianHofmann I'm quite happy with this, but I think it desperately needs discussion.

@FBumann FBumann marked this pull request as ready for review March 12, 2026 08:46
@FBumann FBumann changed the title refac: Harmonize linopy operations with breaking convention refac: Harmonize linopy operations and introduce a new predictable and strict convention Mar 12, 2026
@brynpickering

@FBumann your fillna suggestion is fine for parameters, but it doesn't address NaN filling further down the line for variables or expressions. As far as I understand linopy objects, it isn't possible to NaN-fill them in the same way. This risks NaNs making their way into arithmetic where one would rather have some numeric value with no effect.
