Milvus optimize can start search before post-compaction query view is stable #784

@foxspy

Summary

The Milvus backend optimize() path can finish before Milvus reaches a stable post-compaction query view.

For clean VectorDBBench runs on OpenAI 500K with SVS_VAMANA_LEANVEC, the current implementation waits only for the returned compaction id to become Completed, then for pending index rows to reach zero, and then calls refresh_load(). In practice this barrier is too weak. The returned force-merge compaction can be planned over only a partial segment view while other compactions are still converging, so the first search may start with a nondeterministic number of QueryNode-visible segments.

This makes consecutive clean benchmark runs hard to compare because the benchmark can start from different post-optimize segment distributions even though optimize() has returned successfully.

Environment

  • VectorDBBench commit: 0c20701725a84fbcd2a14b5d628c77cac2beb071
  • Milvus commit: ddd76bccd9a1173e6a97221e834315dcbb55271f (origin/3.0, built with USE_SVS=ON)
  • PyMilvus: 2.6.8
  • Deployment: Milvus standalone
  • Host: AWS m6id.2xlarge, Intel Xeon Platinum 8375C, 8 vCPU, about 30 GiB RAM
  • Dataset/index: OpenAI 500K, dim 1536, cosine, SVS_VAMANA_LEANVEC, topK 10

Reproduction

Run consecutive clean VectorDBBench runs:

vectordbbench milvussvsvamanaleanvec \
  --uri http://127.0.0.1:20643 \
  --svs-graph-max-degree 64 \
  --svs-construction-window-size 200 \
  --svs-storage-kind leanvec4x8 \
  --svs-search-window-size 1000 \
  --svs-search-buffer-capacity 1000 \
  --case-type Performance1536D500K \
  --k 10 \
  --svs-leanvec-dim 768 \
  --skip-search-concurrent

The relevant current Milvus optimize flow is:

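import time

def _wait_for_segments_sorted(self):
    segments = self.client.list_persistent_segments(self.collection_name)
    unsorted = [s for s in segments if not s.is_sorted]
    ...

def _wait_for_index(self):
    # Poll until no rows are still waiting to be indexed.
    while True:
        info = self.client.describe_index(self.collection_name, self._vector_index_name)
        if info.get("pending_index_rows", -1) == 0:
            break
        time.sleep(5)  # poll interval shown for illustration

def _wait_for_compaction(self, compaction_id):
    # Poll until the returned compaction id reports Completed.
    while True:
        state = self.client.get_compaction_state(compaction_id)
        if state == "Completed":
            break
        time.sleep(5)  # poll interval shown for illustration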

def _optimize(self):
    self.client.flush(self.collection_name)
    self._wait_for_segments_sorted()
    self._wait_for_index()
    compaction_id = self.client.compact(self.collection_name, target_size=(2**63 - 1))
    if compaction_id > 0:
        self._wait_for_compaction(compaction_id)
    log.info("force merge compaction completed.")
    self._wait_for_index()
    self.client.refresh_load(self.collection_name)

Actual Behavior

Across clean runs of the same workload, VectorDBBench reached serial search with different QueryNode-visible sealed segment counts:

Run  First-search sealedSegmentNum  Recall@10
1    4                              0.9929
2    4                              0.9721
3    1                              0.9914
4    4                              0.9775
5    1                              0.8933
6    2                              0.9926
7    4                              0.9847
8    1                              0.9914
9    1                              0.8933
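
To make the spread in the table easier to see, the runs can be grouped by first-search segment count. This is a small illustrative sketch: the tuples are transcribed from the table above, and the helper name `recall_by_segment_count` is hypothetical.

```python
from collections import defaultdict

# (run, first-search sealedSegmentNum, Recall@10), transcribed from the table above.
RUNS = [
    (1, 4, 0.9929), (2, 4, 0.9721), (3, 1, 0.9914),
    (4, 4, 0.9775), (5, 1, 0.8933), (6, 2, 0.9926),
    (7, 4, 0.9847), (8, 1, 0.9914), (9, 1, 0.8933),
]


def recall_by_segment_count(runs):
    """Group Recall@10 values by the segment count seen at first search."""
    groups = defaultdict(list)
    for _run, segments, recall in runs:
        groups[segments].append(recall)
    return dict(groups)


if __name__ == "__main__":
    for segments, recalls in sorted(recall_by_segment_count(RUNS).items()):
        print(
            f"sealedSegmentNum={segments}: runs={len(recalls)} "
            f"min={min(recalls):.4f} max={max(recalls):.4f}"
        )
```

Four runs started search with 4 segments, one with 2, and four with 1. The 1-segment group contains both 0.9914 and 0.8933, so segment count by itself does not determine recall in this data.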

In one representative clean run, VectorDBBench's force-merge compaction completed, but its plan compacted only one current segment of about 98K rows:

11:41:14 DataCoord force-merge calculation:
  targetSegmentCount=1
  triggerID=466419709007132860

11:41:14 Compaction plan submitted:
  planID=466419709007132861
  inputSegments=[466419709003500393]

11:41:23 get_compaction_state(466419709007132860) returned Completed

compactTo=466419709007132863
numRows=98000

Other auto/mix compactions were still producing the rest of the 500K-row segment set around the same time. Before/during the first serial search, QueryNode was fully loaded but still had multiple sealed segments:

11:45:38 QueryNode query view:
  loadedRatio=1
  loadedSealedRowCount=500000
  unloadedSealedSegmentNum=0
  sealedSegmentNum=7

11:45:48 QueryNode query view:
  loadedRatio=1
  loadedSealedRowCount=500000
  unloadedSealedSegmentNum=0
  sealedSegmentNum=5

11:46:18 QueryNode query view:
  loadedRatio=1
  loadedSealedRowCount=500000
  unloadedSealedSegmentNum=0
  sealedSegmentNum=5

When a manual force-merge was called later, after the system had converged to those 5 segments, Milvus planned over all five current segments and QueryNode eventually reached one loaded 500K segment:

11:52:37 Compaction plan submitted:
  planID=466419709007266916
  inputSegments=[five current segments]

11:57:48 QueryNode query view:
  loadedRatio=1
  loadedSealedRowCount=500000
  unloadedSealedSegmentNum=0
  sealedSegmentNum=1
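
The four query-view snapshots above show why load-based checks alone cannot act as the barrier. In the sketch below, the values are transcribed from those logs; the dict layout and the helper name `is_fully_loaded` are assumptions made for illustration, not a Milvus API.

```python
EXPECTED_ROWS = 500_000

# Query-view snapshots transcribed from the logs above (dict layout is illustrative).
VIEWS = [
    {"time": "11:45:38", "loadedRatio": 1, "loadedSealedRowCount": 500000,
     "unloadedSealedSegmentNum": 0, "sealedSegmentNum": 7},
    {"time": "11:45:48", "loadedRatio": 1, "loadedSealedRowCount": 500000,
     "unloadedSealedSegmentNum": 0, "sealedSegmentNum": 5},
    {"time": "11:46:18", "loadedRatio": 1, "loadedSealedRowCount": 500000,
     "unloadedSealedSegmentNum": 0, "sealedSegmentNum": 5},
    {"time": "11:57:48", "loadedRatio": 1, "loadedSealedRowCount": 500000,
     "unloadedSealedSegmentNum": 0, "sealedSegmentNum": 1},
]


def is_fully_loaded(view, expected_rows):
    """Load-only readiness check: everything loaded, all rows covered."""
    return (
        view["loadedRatio"] == 1
        and view["loadedSealedRowCount"] == expected_rows
        and view["unloadedSealedSegmentNum"] == 0
    )


if __name__ == "__main__":
    for view in VIEWS:
        print(
            view["time"],
            "fully_loaded:", is_fully_loaded(view, EXPECTED_ROWS),
            "sealedSegmentNum:", view["sealedSegmentNum"],
        )
```

Every snapshot passes the load-only check, yet the segment count moves from 7 to 5 to 1, so a readiness barrier also has to look at the segment set itself.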

Expected Behavior

VectorDBBench's Milvus optimize() should only return when the benchmark is ready to run a stable search phase.

At minimum, after compaction and refresh_load(), VectorDBBench should wait for a post-compaction steady state, for example:

  • describe_index(...).pending_index_rows == 0
  • persistent segments are sorted and cover the expected row count
  • loaded/query segments cover the expected row count
  • unloadedSealedSegmentNum == 0
  • the loaded/query segment id set and row counts remain unchanged for several consecutive polls

For force-merge benchmarks where one compacted segment per channel is expected, VectorDBBench could optionally assert the expected compacted segment count before starting search. If the expected segment count cannot be hardcoded for all Milvus topologies, VectorDBBench should at least wait until the segment id set is stable for a configurable interval.
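
The steady-state conditions above could be sketched as a small polling barrier. This is a minimal sketch under stated assumptions, not VectorDBBench's implementation: `wait_for_stable_segments` and all of its parameters are hypothetical names, and `snapshot` is an assumed caller-supplied callable that returns a mapping of loaded segment id to row count. How that mapping is obtained from PyMilvus is deliberately left open.

```python
import time
from typing import Callable, Dict, Optional


def wait_for_stable_segments(
    snapshot: Callable[[], Dict[int, int]],
    expected_rows: int,
    stable_polls: int = 3,
    interval_s: float = 5.0,
    timeout_s: float = 1800.0,
    expected_segments: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[int, int]:
    """Block until the segment view is unchanged for `stable_polls` polls.

    A poll counts toward the streak only if the segments cover
    `expected_rows` and, when given, match `expected_segments`.
    """
    deadline = clock() + timeout_s
    previous: Optional[Dict[int, int]] = None
    streak = 0
    while True:
        current = snapshot()
        acceptable = sum(current.values()) == expected_rows and (
            expected_segments is None or len(current) == expected_segments
        )
        if not acceptable:
            streak = 0          # view is incomplete or has the wrong shape
        elif current == previous:
            streak += 1         # same segment ids and row counts as last poll
        else:
            streak = 1          # acceptable but changed: restart the streak
        previous = current
        if streak >= stable_polls:
            return current
        if clock() >= deadline:
            raise TimeoutError(
                f"segment view not stable after {timeout_s}s: {current}"
            )
        sleep(interval_s)
```

In the optimize flow, a barrier like this would run after refresh_load(), with expected_rows=500000 for this dataset. Passing expected_segments=1 corresponds to the optional force-merge assertion; leaving it as None only requires the segment id set to hold still for stable_polls * interval_s seconds.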

Why this matters

Without a stronger post-optimize readiness barrier, consecutive clean VectorDBBench runs may benchmark different Milvus states.

In the OpenAI 500K SVS investigation, repeated search-only runs on the same already-loaded collection were stable. The instability was associated with clean runs that rebuild and reload the collection and then immediately enter search after the current optimize() flow.

Note: the low ~0.8933 SVS recall state can also be reproduced directly with Knowhere without Milvus segments or QueryNode loading, so this issue is not claiming that segment count alone causes the SVS recall drop. The VectorDBBench issue is narrower: the current Milvus optimize path does not guarantee a stable benchmark start state.
