fix: cap KNNVectorDistanceExec and FTS search parallelism to target_partitions #7194

Draft
wjones127 wants to merge 1 commit into lance-format:main from wjones127:worktree-delegated-inventing-swing

@wjones127

Contributor

Summary

Follow-up to #7087. Two more CPU-bound execution nodes were using get_num_compute_intensive_cpus() directly without respecting DataFusion's target_partitions session config:

  • KNNVectorDistanceExec: caps the buffer_unordered concurrency of the distance-compute stream at target_partitions. Each concurrent task calls spawn_cpu to run the vector distance calculation on the CPU pool.
  • MatchQueryExec / PhraseQueryExec: pass a capped parallelism value into search_segments(), which controls concurrent BM25 scoring across FTS index segments.
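
The capping pattern described in the two bullets above can be illustrated with a minimal, std-only sketch. This is not the code from this PR: `capped_parallelism` and `run_bounded` are hypothetical names, plain scoped threads stand in for `buffer_unordered` and `spawn_cpu`, and the clamp to at least 1 is an assumption added for safety.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical helper: bound a CPU-derived default by `target_partitions`.
/// The clamp to at least 1 is an assumption, so a zero setting cannot stall work.
pub fn capped_parallelism(compute_cpus: usize, target_partitions: usize) -> usize {
    compute_cpus.min(target_partitions).max(1)
}

/// Run `work` over `items` with at most `parallelism` worker threads.
/// Returns the results in input order plus the peak number of tasks
/// that were observed running at the same time.
pub fn run_bounded<T, R, F>(items: &[T], parallelism: usize, work: F) -> (Vec<R>, usize)
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    let next = AtomicUsize::new(0); // index of the next unclaimed item
    let active = AtomicUsize::new(0); // tasks currently running
    let peak = AtomicUsize::new(0); // highest value `active` reached
    let workers = parallelism.max(1).min(items.len().max(1));

    let mut indexed: Vec<(usize, R)> = thread::scope(|scope| {
        let mut handles = Vec::with_capacity(workers);
        for _ in 0..workers {
            handles.push(scope.spawn(|| {
                let mut local = Vec::new();
                loop {
                    // Each worker claims one item at a time until none remain.
                    let i = next.fetch_add(1, Ordering::Relaxed);
                    if i >= items.len() {
                        break;
                    }
                    let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                    peak.fetch_max(now, Ordering::SeqCst);
                    local.push((i, work(&items[i])));
                    active.fetch_sub(1, Ordering::SeqCst);
                }
                local
            }));
        }
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    });

    // Restore input order, since workers finish in arbitrary order.
    indexed.sort_by_key(|(i, _)| *i);
    let results = indexed.into_iter().map(|(_, r)| r).collect();
    (results, peak.load(Ordering::SeqCst))
}
```

Calling `run_bounded(&segments, capped_parallelism(8, 2), score)` would never run more than two `score` calls at once, which is the property the cap is meant to guarantee for the distance and BM25 scoring tasks.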

IO-bound parallelism paths (TakeExec.map_batch, LancePushdownScanExec.batch_readahead) are intentionally left uncapped, consistent with the approach in #7087.

Does this change default parallelism?

No. The cap only takes effect when a caller explicitly lowers target_partitions below the default (which equals the logical CPU count), matching the behavior established in #7087.
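
As a rough illustration of why the default is unchanged: if the cap is a simple minimum, it can only bite when target_partitions is smaller than the CPU-pool size. The sketch below uses hypothetical names (`effective_parallelism`, `default_target_partitions`) and assumes the CPU-pool size never exceeds the logical CPU count; it is not the code from this PR.

```rust
use std::thread::available_parallelism;

/// Hypothetical stand-in for the capped value: the smaller of the
/// CPU-pool size and the session's `target_partitions`.
pub fn effective_parallelism(compute_cpus: usize, target_partitions: usize) -> usize {
    compute_cpus.min(target_partitions)
}

/// Logical CPU count, used here as the assumed default for `target_partitions`.
/// Falls back to 1 if the platform cannot report it.
pub fn default_target_partitions() -> usize {
    available_parallelism().map(|n| n.get()).unwrap_or(1)
}
```

With the default setting, `effective_parallelism(compute_cpus, default_target_partitions())` returns `compute_cpus` for any pool size up to the logical CPU count, so existing behavior is preserved. Only an explicit lower value, such as `effective_parallelism(8, 1)`, reduces concurrency.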

Test plan

  • Added test_knn_vector_distance_respects_target_partitions: executes KNNVectorDistanceExec with target_partitions=1 and verifies all rows are returned correctly.
  • All existing KNN and FTS tests pass.

🤖 Generated with Claude Code

…artitions

Follow-up to lance-format#7087. Two CPU-bound execution nodes were using
`get_num_compute_intensive_cpus()` directly without respecting DataFusion's
`target_partitions` session config:

- `KNNVectorDistanceExec`: caps `buffer_unordered` for the distance-compute
  stream with `target_partitions`.
- `MatchQueryExec` / `PhraseQueryExec`: pass a capped `parallelism` value into
  `search_segments()`, which controls concurrent BM25 scoring across FTS index
  segments.

IO-bound parallelism paths (`TakeExec`, `LancePushdownScanExec.batch_readahead`)
are intentionally left uncapped, consistent with the approach in lance-format#7087.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
github-actions bot added the bug ("Something isn't working") label on Jun 9, 2026
@codecov

codecov Bot commented Jun 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
