perf: use numpy array lookup for solution unpacking #619

Open
MaykThewessen wants to merge 2 commits into PyPSA:master from MaykThewessen:perf/vectorize-solution-unpacking
@MaykThewessen
Summary

Replace pandas Series indexing with numpy array lookup in the solution unpacking loop (Model.solve(), lines 1570-1594).

Before:

sol = set_int_index(sol)
sol.loc[-1] = nan
for name, var in self.variables.items():
    idx = np.ravel(var.labels)
    vals = sol[idx].values.reshape(var.labels.shape)  # pandas indexing per variable
    var.solution = xr.DataArray(vals, var.coords)

After:

sol = set_int_index(sol)
# Build dense numpy lookup array once
sol_arr = np.full(sol_max_idx + 1, nan)
sol_arr[sol.index[sol.index >= 0]] = sol.values[sol.index >= 0]

for name, var in self.variables.items():
    idx = np.ravel(var.labels)
    vals = sol_arr[np.clip(idx, 0, sol_max_idx)]  # numpy indexing
    vals[idx < 0] = nan
    var.solution = xr.DataArray(vals.reshape(var.labels.shape), var.coords)

The same pattern is applied to the dual-value unpacking loop.
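
The snippets above depend on surrounding state in Model.solve() (for example sol_max_idx, whose definition is not shown here). As a rough, self-contained sketch of the two lookup strategies, the following compares them on a toy solution. The function names are hypothetical and not part of linopy; max_idx is assumed here to be the largest non-negative label in the Series, and .loc is used as the explicit label-based form of the original sol[idx].

```python
import numpy as np
import pandas as pd


def unpack_with_pandas(sol: pd.Series, labels: np.ndarray) -> np.ndarray:
    """Label-based pandas lookup; the label -1 is mapped to NaN."""
    sol = sol.copy()
    sol.loc[-1] = np.nan  # sentinel entry for masked labels
    idx = np.ravel(labels)
    return sol.loc[idx].to_numpy().reshape(labels.shape)


def unpack_with_numpy(sol: pd.Series, labels: np.ndarray) -> np.ndarray:
    """Dense numpy lookup array, built once and indexed positionally."""
    index = sol.index.to_numpy()
    values = sol.to_numpy(dtype=float)
    valid = index >= 0
    # Assumption: at least one non-negative label exists.
    max_idx = int(index[valid].max())
    lookup = np.full(max_idx + 1, np.nan)
    lookup[index[valid]] = values[valid]

    idx = np.ravel(labels)
    vals = lookup[np.clip(idx, 0, max_idx)]  # fancy indexing returns a copy
    vals[idx < 0] = np.nan                   # restore NaN for masked labels
    return vals.reshape(labels.shape)


# Toy data: four solution values, one masked label (-1).
sol = pd.Series([10.0, 20.0, 30.0, 40.0], index=[0, 1, 2, 3])
labels = np.array([[0, 1], [-1, 3]])

from_pandas = unpack_with_pandas(sol, labels)
from_numpy = unpack_with_numpy(sol, labels)
np.testing.assert_array_equal(from_pandas, from_numpy)  # NaNs compare equal
```

One behavioural difference in this sketch: a label larger than max_idx raises a KeyError in the pandas version, whereas np.clip silently maps it to the last entry in the numpy version. Whether that can occur in practice depends on how the labels are generated.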

Motivation

After HiGHS solves, the solution is a pandas Series with integer labels. The unpacking loop indexes this Series once per variable type (~20 types in a typical PyPSA model). Each sol[idx].values call goes through pandas' Series.__getitem__, which performs a label-based lookup against the index. Converting the Series to a dense numpy array once and then using positional array indexing avoids that per-variable overhead.
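
To check the overhead on a given machine, a micro-benchmark along the following lines could be used. This is only a sketch under stated assumptions: the problem size, the roughly 20 label arrays, the every-tenth-label masking, and all function names are made up for illustration and are not taken from linopy or PyPSA. No timings are claimed here; benchmark() simply returns the best wall-clock time for each approach.

```python
import timeit

import numpy as np
import pandas as pd


def make_problem(n_labels=200_000, n_vars=20, seed=0):
    """Synthetic solution Series plus one flat label array per 'variable'."""
    rng = np.random.default_rng(seed)
    sol = pd.Series(rng.random(n_labels), index=np.arange(n_labels))
    label_arrays = []
    for chunk in np.array_split(np.arange(n_labels), n_vars):
        chunk = chunk.copy()
        chunk[::10] = -1  # mask every tenth label
        label_arrays.append(chunk)
    return sol, label_arrays


def pandas_loop(sol, label_arrays):
    """One label-based pandas lookup per label array."""
    sol = sol.copy()
    sol.loc[-1] = np.nan
    return [sol.loc[idx].to_numpy() for idx in label_arrays]


def numpy_loop(sol, label_arrays):
    """Build a dense lookup array once, then index it positionally."""
    # Assumption: all labels in sol.index are non-negative here.
    max_idx = int(sol.index.max())
    lookup = np.full(max_idx + 1, np.nan)
    lookup[sol.index.to_numpy()] = sol.to_numpy()
    out = []
    for idx in label_arrays:
        vals = lookup[np.clip(idx, 0, max_idx)]
        vals[idx < 0] = np.nan
        out.append(vals)
    return out


def benchmark(repeat=5, **problem_kwargs):
    """Return (best pandas time, best numpy time) in seconds."""
    sol, label_arrays = make_problem(**problem_kwargs)
    t_pandas = min(timeit.repeat(lambda: pandas_loop(sol, label_arrays),
                                 number=1, repeat=repeat))
    t_numpy = min(timeit.repeat(lambda: numpy_loop(sol, label_arrays),
                                number=1, repeat=repeat))
    return t_pandas, t_numpy
```

Calling benchmark() and comparing the two returned values gives a first impression; results will vary with pandas and numpy versions and with the model size.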

Context

See #198 (comment), item 5 in the priority list.

Test plan

  • test_optimization.py highs-direct: 24/25 pass (one pre-existing failure in test_modified_model)
  • Solution values verified correct via test_default_setting_sol_and_dual_accessor
  • Expression solution accessor verified via test_default_setting_expression_sol_accessor
  • Duplicated variables test passes

🤖 Generated with Claude Code

MaykThewessen and others added 2 commits March 13, 2026 22:23
Convert the primal/dual pandas Series to a dense numpy lookup array
before the per-variable/per-constraint unpacking loop. This replaces
pandas indexing (sol[idx].values) with direct numpy array indexing
(sol_arr[idx]), avoiding pandas overhead per variable type.

The loop over variable/constraint types still exists (needed to set
each variable's .solution xr.DataArray), but the inner indexing
operation is now pure numpy instead of pandas Series.__getitem__.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>