@nathanneike
Contributor

Add lazy EMD solver with on-the-fly distance computation

  • Implement emd_c_lazy in C++ network simplex for memory-efficient OT
  • Add lazy mode to emd2() accepting coordinates (X_a, X_b) instead of cost matrix
  • Support sqeuclidean, euclidean, and cityblock metrics
  • Add __restrict__ for SIMD optimization
  • Remove debug output from network_simplex_simple.h
  • Add tests for lazy solver and metric variants

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Motivation and context / Related issue

Addresses memory limitations when computing OT with large point clouds. Instead of pre-computing and storing the full n×m cost matrix, the lazy solver computes distances on-the-fly during the network simplex algorithm. This reduces memory from O(nm) to O(n+m) while maintaining exact EMD solutions.
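
The idea of computing distances on demand can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LazyCost` is a hypothetical helper, not part of this PR or of the library, and the actual implementation lives in the C++ network simplex. It stores only the coordinates and evaluates a single cost entry whenever one is requested, for the three metrics listed above.

```python
import math


# Hypothetical helper for illustration only; not the PR's C++ implementation.
class LazyCost:
    """Evaluate cost(i, j) from coordinates on demand instead of storing an n x m matrix."""

    def __init__(self, X_a, X_b, metric="sqeuclidean"):
        if metric not in ("sqeuclidean", "euclidean", "cityblock"):
            raise ValueError(f"unsupported metric: {metric}")
        self.X_a = X_a
        self.X_b = X_b
        self.metric = metric

    def __call__(self, i, j):
        x, y = self.X_a[i], self.X_b[j]
        if self.metric == "cityblock":
            # Sum of absolute coordinate differences.
            return sum(abs(u - v) for u, v in zip(x, y))
        sq = sum((u - v) ** 2 for u, v in zip(x, y))
        return sq if self.metric == "sqeuclidean" else math.sqrt(sq)


X_a = [(0.0, 0.0), (1.0, 0.0)]
X_b = [(3.0, 4.0)]
cost = LazyCost(X_a, X_b, metric="euclidean")
print(cost(0, 0))  # distance from (0, 0) to (3, 4): 5.0
```

Only the two coordinate arrays are kept in memory; each entry is recomputed every time the solver asks for it, which trades extra arithmetic for the smaller footprint.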

How has this been tested

  • All existing tests pass (1031 passed, 69 skipped)
  • Added new tests in test_solvers.py: test_solve_sample_lazy and test_solve_sample_lazy_emd
  • Verified correctness against standard EMD solver
  • Tested all three metrics (sqeuclidean, euclidean, cityblock)
  • Confirmed SIMD optimization with compiler output analysis

PR checklist

  • I have read the CONTRIBUTING document.
  • The documentation is up-to-date with the changes I made (check build artifacts).
  • All tests passed, and additional code has been covered with new tests.
  • I have added the PR and Issue fix to the RELEASES.md file.

@rflamary changed the title from "Add lazy EMD solver with on-the-fly distance computation" to "Add lazy EMD solver with O(n) memory requirement" on Jan 20, 2026
X_sb, X_tb, ab, bb = nx.from_numpy(X_s, X_t, a, b)

# Test all supported metrics
for metric in ["sqeuclidean", "euclidean", "cityblock"]:
Collaborator

Use @pytest.mark.parametrize to check the metrics instead of a loop inside the test.
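
A minimal sketch of what that suggestion could look like, assuming a hypothetical `point_cost` helper that stands in for the code under test (the real test would call the lazy and dense solvers instead). Each metric becomes its own test case, so a failure is reported per metric rather than stopping the loop at the first one.

```python
import math

import pytest


def point_cost(x, y, metric):
    """Cost between two points; hypothetical stand-in for the code under test."""
    if metric == "cityblock":
        return sum(abs(u - v) for u, v in zip(x, y))
    sq = sum((u - v) ** 2 for u, v in zip(x, y))
    return sq if metric == "sqeuclidean" else math.sqrt(sq)


# One generated test case per (metric, expected) pair.
@pytest.mark.parametrize(
    "metric, expected",
    [("sqeuclidean", 25.0), ("euclidean", 5.0), ("cityblock", 7.0)],
)
def test_point_cost(metric, expected):
    assert point_cost((0.0, 0.0), (3.0, 4.0), metric) == pytest.approx(expected)
```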

return G


def emd2(
Collaborator

This is too much of an API change. Please create a new function, emd2_lazy.

@codecov

codecov bot commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 93.04348% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.03%. Comparing base (4c49769) to head (cd61070).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #788      +/-   ##
==========================================
- Coverage   97.07%   97.03%   -0.04%     
==========================================
  Files         107      107              
  Lines       22156    22249      +93     
==========================================
+ Hits        21507    21590      +83     
- Misses        649      659      +10     
