Commit 2b5f39b

committed

perf(planner): use USearch index.get() for low-selectivity path

Replace the parquet-native full scan with direct vector retrieval from the USearch index. The index stores vectors alongside the HNSW graph, so index.get(key) retrieves them in O(1) per key. Previously, the low-selectivity path scanned the entire Parquet file including the vector column (e.g. 6.95GB for 1.2M rows) just to compute distances for the few rows matching the WHERE clause. Now it retrieves vectors only for valid_keys collected during the pre-scan, computes distances, maintains a top-k heap, then fetches result rows from the lookup provider. This eliminates the full_scan DataSourceExec at runtime for filtered queries. The parquet-native code is retained but unused, pending removal after production validation.

1 parent edb6c94 commit 2b5f39bCopy full SHA for 2b5f39b

2 files changed

src
- planner.rs
- registry.rs

Comments

(0)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Commit 2b5f39b

File tree

0 commit comments