MB-59670: Add GPU support #385

Open
capemox wants to merge 14 commits into pre_gpu from gpu-support

Conversation

@capemox
Member

@capemox capemox commented Mar 25, 2026

  • Add faissGPUFloat32Index, which implements faissIndex and transparently runs each operation on the CPU or the GPU as appropriate.
  • Support training and search on the GPU, falling back to the CPU when appropriate.
  • Reorganize makeFaissIndex to avoid losing direct map information when transferring the index to the GPU.
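
For illustration only, the wrapper-with-fallback idea in the bullets above might look roughly like the sketch below. It is not the code in this PR: `searcher`, `fakeIndex`, `gpuBackedIndex`, `newGPUBackedIndex`, and `errNoGPU` are hypothetical names, and the `clone` callback merely stands in for whatever transfers the index to the GPU. The only name borrowed from the diff is `waitGPU`, and its behavior here is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoGPU is a hypothetical error for a failed clone to the GPU.
var errNoGPU = errors.New("no GPU device")

// searcher is a hypothetical stand-in for a Faiss index handle.
type searcher interface {
	Search(query []float32, k int) ([]int64, error)
}

// fakeIndex returns fixed IDs, or an error when fail is set.
type fakeIndex struct {
	ids  []int64
	fail bool
}

func (f fakeIndex) Search(_ []float32, k int) ([]int64, error) {
	if f.fail {
		return nil, errors.New("device error")
	}
	if k > len(f.ids) {
		k = len(f.ids)
	}
	return f.ids[:k], nil
}

// gpuBackedIndex always keeps the CPU index and may also hold a GPU copy.
type gpuBackedIndex struct {
	cpu      searcher
	gpu      searcher      // stays nil if the clone failed
	gpuReady chan struct{} // closed once the async clone attempt has ended
}

// newGPUBackedIndex starts the clone in the background and returns at once.
func newGPUBackedIndex(cpu searcher, clone func(searcher) (searcher, error)) *gpuBackedIndex {
	idx := &gpuBackedIndex{cpu: cpu, gpuReady: make(chan struct{})}
	go func() {
		defer close(idx.gpuReady)
		if g, err := clone(cpu); err == nil {
			idx.gpu = g
		}
	}()
	return idx
}

// waitGPU blocks until the clone attempt has finished, successfully or not.
func (i *gpuBackedIndex) waitGPU() { <-i.gpuReady }

// Search prefers the GPU copy and falls back to the CPU index when the GPU
// copy is missing or its search returns an error.
func (i *gpuBackedIndex) Search(q []float32, k int) ([]int64, error) {
	i.waitGPU()
	if i.gpu != nil {
		if ids, err := i.gpu.Search(q, k); err == nil {
			return ids, nil
		}
	}
	return i.cpu.Search(q, k)
}

func main() {
	cpu := fakeIndex{ids: []int64{1, 2, 3}}
	failedClone := func(searcher) (searcher, error) { return nil, errNoGPU }
	idx := newGPUBackedIndex(cpu, failedClone)
	ids, err := idx.Search([]float32{0.1}, 2)
	fmt.Println(ids, err) // prints "[1 2] <nil>": served by the CPU index
}
```

Closing `gpuReady` after the write to `gpu` is what makes the later read in `Search` safe without a mutex; callers never need to know which device answered.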

@CascadingRadium
Member

Hey @capemox, please check the new batchSearch API in the request batcher module.

Please add the batcher as an optional field in your struct, and have the struct implement the batchSearch API. That way, when we call SearchWithoutIDs, we forward the request to the batcher module via its search API.

When the batcher has finished batching and needs to execute a batch, it will call batchSearch on your GPU struct, which executes the search on the GPU index. Take special care NOT to involve the batcher if the clone to GPU fails or the GPU index is unavailable; in those cases we fall back to the CPU.
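
A rough sketch of the wiring described above, under stated assumptions: the names `batchSearch`, `SearchWithoutIDs`, and the batcher's `search` come from this comment, but every signature below is invented for illustration, and `request`, `batcher`, `gpuIndex`, `newGPUIndex`, and `cpuSearch` are hypothetical. The stand-in batcher is synchronous and flushes a batch of one; the real module would queue requests and decide when to flush.

```go
package main

import "fmt"

// request is a hypothetical single search request.
type request struct {
	query []float32
	k     int
}

// batchSearcher is what the batcher calls back into once a batch is ready.
type batchSearcher interface {
	batchSearch(reqs []request) [][]int64
}

// batcher is a synchronous stand-in for the request batcher module.
type batcher struct {
	target batchSearcher
}

// search forwards one request; here each batch holds exactly one request.
func (b *batcher) search(r request) []int64 {
	return b.target.batchSearch([]request{r})[0]
}

// gpuIndex holds the batcher as an optional field: it stays nil when the
// clone to GPU failed or the GPU index is unavailable.
type gpuIndex struct {
	batcher *batcher
}

// newGPUIndex attaches a batcher only when a GPU index is available.
func newGPUIndex(gpuAvailable bool) *gpuIndex {
	idx := &gpuIndex{}
	if gpuAvailable {
		idx.batcher = &batcher{target: idx}
	}
	return idx
}

// batchSearch runs a whole batch on the GPU index (placeholder result 100).
func (g *gpuIndex) batchSearch(reqs []request) [][]int64 {
	out := make([][]int64, len(reqs))
	for i := range reqs {
		out[i] = []int64{100}
	}
	return out
}

// cpuSearch is the CPU fallback path (placeholder result 1).
func (g *gpuIndex) cpuSearch(_ request) []int64 {
	return []int64{1}
}

// SearchWithoutIDs bypasses the batcher entirely on the CPU fallback path.
func (g *gpuIndex) SearchWithoutIDs(r request) []int64 {
	if g.batcher == nil {
		return g.cpuSearch(r)
	}
	return g.batcher.search(r)
}

func main() {
	r := request{query: []float32{0.5}, k: 1}
	fmt.Println(newGPUIndex(true).SearchWithoutIDs(r))  // prints "[100]"
	fmt.Println(newGPUIndex(false).SearchWithoutIDs(r)) // prints "[1]"
}
```

Keeping the batcher nil on the fallback path means a single nil check decides the route, so CPU-only searches never pay for batching.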

Please resolve the comments and fix the merge conflicts ASAP; I am working on resolving merge conflicts on the base pre_gpu branch.

Thanks
cc @abhinavdangeti

@CascadingRadium
Member

Hey @capemox, I think you're right. I was rebasing over fastmerge locally and should have pushed it later. I have reverted it.

@capemox capemox requested a review from CascadingRadium March 31, 2026 08:32
@capemox capemox self-assigned this Apr 1, 2026
Member

@CascadingRadium CascadingRadium left a comment

first pass

Contributor

Copilot AI left a comment

Pull request overview

Adds GPU acceleration support for Faiss vector indexes by introducing a GPU-backed float32 index wrapper and threading a per-field useGPU option through index creation, merge, and cache-loading paths. It also adjusts the IVF/SQ build flow to preserve direct map / nprobe behavior across GPU↔CPU sync and improves shutdown semantics in the request batcher.

Changes:

  • Introduce faissGPUFloat32Index and select it via faissIndexFactory when useGPU is enabled for IVF-based float32 indexes.
  • Plumb useGPU through vector indexing/merging and vector index cache loading/creation.
  • Update training/build sequence to trainAndAdd() (then set direct map / nprobe) and improve batcher shutdown by waiting for monitor goroutine exit.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
|------|-------------|
| section_faiss_vector_index.go | Threads useGPU into index build/merge; reorganizes IVF/SQ train+add vs add flow. |
| faiss_vector_request_batcher.go | Adds doneCh to allow stop() to wait for monitor goroutine exit. |
| faiss_vector_posting.go | Passes per-field useGPU into vector index cache load/create. |
| faiss_vector_index.go | Updates IVF/SQ interfaces to use trainAndAdd. |
| faiss_vector_index_gpu_float32.go | New GPU-backed float32 index wrapper with async GPU init + batched GPU search + CPU fallback. |
| faiss_vector_index_float32.go | Adds trainAndAdd for CPU float32 index wrapper. |
| faiss_vector_index_bivf.go | Replaces train with trainAndAdd for binary IVF wrapper. |
| faiss_vector_cache.go | Extends cache load/create API to optionally wrap indexes with GPU support. |
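
As a minimal sketch of the shutdown pattern summarized for faiss_vector_request_batcher.go (a doneCh that lets stop() wait for the monitor goroutine to exit), something like the following could apply. Only the names `doneCh` and `stop` come from the summary; `monitor`, `newMonitor`, `stopCh`, `ticks`, and the use of `sync.Once` are assumptions made for illustration and may not match the actual batcher.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// monitor is a hypothetical background worker with a blocking stop().
type monitor struct {
	stopCh chan struct{} // closed by stop() to request shutdown
	doneCh chan struct{} // closed by the goroutine just before it exits
	once   sync.Once     // makes stop() safe to call more than once
	ticks  int           // written only by the monitor goroutine
}

// newMonitor starts the monitor goroutine, which ticks at the given interval.
func newMonitor(interval time.Duration) *monitor {
	m := &monitor{
		stopCh: make(chan struct{}),
		doneCh: make(chan struct{}),
	}
	go m.run(interval)
	return m
}

// run loops until stopCh is closed, then signals its exit through doneCh.
func (m *monitor) run(interval time.Duration) {
	defer close(m.doneCh)
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-m.stopCh:
			return
		case <-t.C:
			m.ticks++
		}
	}
}

// stop requests shutdown and returns only after the goroutine has exited,
// so reading m.ticks afterwards does not race with the goroutine.
func (m *monitor) stop() {
	m.once.Do(func() { close(m.stopCh) })
	<-m.doneCh
}

func main() {
	m := newMonitor(time.Millisecond)
	time.Sleep(5 * time.Millisecond)
	m.stop()
	fmt.Println("monitor exited after", m.ticks, "ticks")
}
```

Without the wait on doneCh, stop() could return while the goroutine is still running, and anything released right after stop() might still be touched by it.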

Member

@CascadingRadium CascadingRadium left a comment

Please pull in minor fixes from the batcher branch as well.

func (f *faissGPUFloat32Index) setNProbe(nprobe int32) {
	f.cpuIdx.SetNProbe(nprobe)
	f.waitGPU()
	if f.gpuIdx != nil {
Member

I don't think the GPU index supports this API. We should set nprobe only on the CPU index, since this is in the indexing path and we serialize only the CPU index anyway.
