Redesign as_dataarray: strict public API + internal ensure_dataarray helper #551
Closed
FBumann wants to merge 19 commits into PyPSA:master from
FBumann (Collaborator, Author): Maybe remove explicit broadcast from masking added in a prior patch
Force-pushed from 75b0610 to 20b2831
Previously, when a DataArray was passed to as_dataarray(), the coords parameter was silently ignored. This was inconsistent with other input types (numpy, pandas) where coords are applied. Now, when coords is provided as a dict and the input is a DataArray, the function will reindex the array to match the provided coordinates. This ensures consistent behavior across all input types. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2. Expands to new dims from coords (broadcast)

Summary:
- as_dataarray now consistently applies coords for all input types
- DataArrays with fewer dims are expanded to match the full coords specification
- This fixes the inconsistency when creating variables with DataArray bounds
Replace reindex with a strict equality check for DataArray inputs. Silent reindexing is dangerous as it introduces NaNs for missing indices and drops unmatched ones, masking user bugs. Now raises ValueError if coords don't match, while still allowing expand_dims for broadcasting to new dimensions.
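A minimal sketch of the hazard this commit message describes, using plain xarray. The helper name check_coords_match is hypothetical and only illustrates the idea of a strict equality check; it is not the code from this PR.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Bounds indexed by time 0..2, while the variable is meant to live on time 1..3.
bounds = xr.DataArray([10.0, 20.0, 30.0], coords={"time": [0, 1, 2]}, dims="time")
target = {"time": [1, 2, 3]}

# reindex never fails: label 0 is dropped and label 3 is filled with NaN.
reindexed = bounds.reindex(target)
print(reindexed.values)  # [20. 30. nan]
print(int(np.isnan(reindexed.values).sum()))  # 1


def check_coords_match(arr, coords):
    """Hypothetical strict check: raise on mismatch instead of reindexing."""
    for dim, values in coords.items():
        if dim in arr.dims and not arr.get_index(dim).equals(pd.Index(values)):
            raise ValueError(f"coordinates of dimension {dim!r} do not match")


try:
    check_coords_match(bounds, target)
except ValueError as err:
    print(err)
```

With the strict check the mismatch surfaces immediately as an error, whereas the reindexed array would carry a NaN bound into the model unnoticed.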
Strict by default: raises ValueError if a DataArray has dimensions not present in coords. Call sites that need broadcasting (multiply, dot, add) opt in with allow_extra_dims=True. Structural call sites like add_variables bounds/mask remain strict.
When coords is a sequence (e.g. from add_variables), convert it to a dict using dims or Index names so the same validation applies. This closes the gap where sequence coords were silently ignored for DataArray inputs. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Unifies the sequence-to-dict coords conversion used in pandas_to_dataarray, numpy_to_dataarray, and the DataArray branch of as_dataarray into a single helper. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document the coord validation and broadcasting behavior from the user perspective.
Force-pushed from 20b2831 to 3f5d866
- Skip coord validation for DataArray inputs in arithmetic contexts (allow_extra_dims=True) to preserve xarray's native alignment
- Add allow_extra_dims=True to comparison operator and quadratic dot as_dataarray calls for consistent broadcasting
- Handle MultiIndex levels in expand_dims guard

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 3f5d866 to 9508117
Move validation out of as_dataarray into model.add_variables directly. This removes the allow_extra_dims flag and all changes to expressions.py and variables.py — arithmetic call sites are unaffected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FBumann (Collaborator, Author): Should be merged before #591
Resolve conflict in model.py: keep both semi-continuous variable validation (from master) and DataArray coord validation (from this PR). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix "coordinates are taken considered" → "are considered" in pandas_to_dataarray warning
- Add TODO noting mask DataArray validation is intentionally skipped to preserve broadcast_mask's fill-with-False behavior
- Clarify coords docstring for add_variables

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FBumann (Collaborator, Author): Adjust mask validation in #591
FBumann (Collaborator, Author): @FabianHofmann I think merging this would be a great UX improvement
FBumann (Collaborator, Author): Fixed #450
…ay helper

Split as_dataarray into two functions with distinct responsibilities:
- as_dataarray (public): strict coord validation when coords provided, rejects extra dims, expands missing dims, raises on coord mismatch
- _coerce_to_dataarray (internal): pure type conversion using coords only as construction hints, no validation or expansion

This removes the allow_extra_dims flag and makes the API predictable: callers that need lenient conversion (expression arithmetic, masks) use _coerce_to_dataarray, while add_variables bounds use as_dataarray.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…to optional coords

- Rename internal helper to ensure_dataarray (no underscore, consistent with other common.py conventions like broadcast_mask)
- Revert as_dataarray coords back to optional: when coords is None, delegates to _type_dispatch for pure type conversion
- Deduplicate _type_dispatch call in as_dataarray (always called, validation conditional)
- Update all call sites in expressions.py, model.py, variables.py
- Update test names to match new function name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Raise ValueError when dims length doesn't match coords sequence length
- Raise ValueError on duplicate .name in coords sequence
- Warn on unnamed items in coords sequence (when dims is None)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
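The three checks listed in this commit message could look roughly like the sketch below. The function name coords_sequence_to_dict is hypothetical, and the choice to skip unnamed items after warning is an assumption; the commit message only says that a warning is issued.

```python
import warnings

import pandas as pd


def coords_sequence_to_dict(coords, dims=None):
    """Hypothetical helper: turn a sequence of coords into a {dim: values} dict."""
    if dims is not None:
        dims = [dims] if isinstance(dims, str) else list(dims)
        if len(dims) != len(coords):
            raise ValueError(
                f"got {len(dims)} dims but {len(coords)} coords; lengths must match"
            )
        return dict(zip(dims, coords))

    result = {}
    for position, item in enumerate(coords):
        name = getattr(item, "name", None)
        if name is None:
            # Assumption: unnamed items are skipped after the warning.
            warnings.warn(f"coords[{position}] has no name and is ignored")
            continue
        if name in result:
            raise ValueError(f"duplicate coordinate name {name!r}")
        result[name] = item
    return result


named = [pd.Index([0, 1, 2], name="time"), pd.Index(["a", "b"], name="region")]
print(list(coords_sequence_to_dict(named)))  # ['time', 'region']
print(list(coords_sequence_to_dict([[0, 1], [5, 6, 7]], dims=["x", "y"])))  # ['x', 'y']
```

Raising on duplicates and length mismatches keeps the resulting dict unambiguous, so the same validation that applies to dict coords can then apply to sequence coords.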
FBumann (Collaborator, Author): Closed in favor of #614
Summary
Redesigns as_dataarray() by splitting it into two functions with clear responsibilities:

- as_dataarray(arr, coords?, dims?): Public API. When coords is provided: converts to DataArray, validates that shared dim coords match exactly, rejects extra dims, broadcasts missing dims via expand_dims. When coords is None: pure type conversion only.
- ensure_dataarray(arr, coords?, dims?): Internal helper. Pure type conversion (scalar, numpy, pandas, polars, list → DataArray). No validation, no expand_dims. Coords are used only as construction hints. Callers handle alignment themselves (xarray broadcasting, reindex_like, etc.).

Both share a common _type_dispatch function for the actual type conversion logic.

Motivation
Previously as_dataarray() mixed two conflicting roles: lenient type conversion for callers such as expression arithmetic, and strict handling for add_variables, where user-provided bounds must match coords exactly.

This led to bugs where DataArray inputs to add_variables had their coords silently ignored, and attempts to fix it broke expression arithmetic.

How variable coordinates are determined
- coords provided, bounds are scalars/numpy/pandas: coords defines the variable's coordinate space; inputs are converted using coords
- coords provided, bounds are DataArrays: coords defines the variable's coordinate space; bounds are validated against coords (shared dims must match, extra dims raise ValueError, missing dims are broadcast)
- coords is None, bounds are DataArrays
- coords is None, bounds are scalars

Changes
- linopy/common.py: _type_dispatch (shared conversion), _expand_missing_dims, _validate_dataarray_coords (strict, no allow_extra_dims flag). Add ensure_dataarray. Refactor as_dataarray to delegate to ensure_dataarray when no coords, or validate+expand when coords provided.
- linopy/expressions.py: ensure_dataarray (arithmetic, constants, constraint comparisons, dot)
- linopy/variables.py: Variable.to_linexpr to ensure_dataarray
- linopy/model.py: add_variables lower/upper use as_dataarray (strict). Masks, sign, rhs use ensure_dataarray (lenient).
- test/test_common.py: _validate_dataarray_coords, ensure_dataarray (no expand, allows extra dims, no coord validation), and add_variables integration.
- doc/release_notes.rst

Mask handling
The mask parameter uses ensure_dataarray (no validation). The existing broadcast_mask function handles misaligned mask coords by filling with False. A TODO is left for a future deprecation path toward stricter validation (see #591).

Examples
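A minimal sketch of the strict behavior described in the Summary, written against plain xarray. The function sketch_validate_and_expand is hypothetical and stands in for the validate-then-expand step; it is not the implementation from this PR, and the final transpose to the order of coords is an assumption.

```python
import pandas as pd
import xarray as xr


def sketch_validate_and_expand(arr, coords):
    """Hypothetical strict path: reject extra dims, check shared dims, expand missing ones."""
    extra = set(arr.dims) - set(coords)
    if extra:
        raise ValueError(f"dimensions {sorted(extra)} are not present in coords")
    for dim in arr.dims:
        if not arr.get_index(dim).equals(pd.Index(coords[dim])):
            raise ValueError(f"coordinates of dimension {dim!r} do not match")
    missing = {d: list(v) for d, v in coords.items() if d not in arr.dims}
    if missing:
        arr = arr.expand_dims(missing)  # broadcast to the dims that were missing
    return arr.transpose(*coords)  # assumption: follow the dim order of coords


coords = {"time": [0, 1, 2], "region": ["north", "south"]}
lower = xr.DataArray([0.0, 1.0, 2.0], coords={"time": [0, 1, 2]}, dims="time")

expanded = sketch_validate_and_expand(lower, coords)
print(expanded.dims, expanded.shape)  # ('time', 'region') (3, 2)

shifted = xr.DataArray([0.0, 1.0, 2.0], coords={"time": [1, 2, 3]}, dims="time")
try:
    sketch_validate_and_expand(shifted, coords)
except ValueError as err:
    print(err)

# Lenient contexts leave alignment to xarray: arithmetic joins on shared labels.
print((lower + shifted).sizes["time"])  # 2
```

The last line shows why arithmetic call sites are kept lenient: xarray's own alignment already resolves differing labels there, while bounds passed to add_variables are expected to match the declared coordinates exactly.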
Checklist