perf(warm): warm the full search path at startup, not just the model by Neverdecel · Pull Request #61 · Neverdecel/CodeRAG

Neverdecel · 2026-06-19T15:50:11Z

Context

The embed-vs-store badge from #60 paid off immediately. On the public demo a query showed embed 26 ms vs store 363 ms over 548 chunks. 548 chunks brute-forced is sub-millisecond, so 363 ms isn't compute — it's a cold index load.

Root cause: warm() ran status() + embed_query(), so only the model was warmed. The store's vector/FTS/scalar indexes and LanceDB's query path load lazily on the first real query, and that entire cold-load lands in store_ms. warm()'s own docstring already promised the first query should "reflect warm performance" — the store half just wasn't being warmed.
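To make the lazy-load behaviour concrete, here is a minimal toy sketch. `LazyIndex` and its methods are hypothetical and are not LanceDB's API; the sketch only illustrates that with lazy loading the first query pays the load cost and later queries do not.

```python
class LazyIndex:
    """Toy index that loads its data on first use (hypothetical, not LanceDB)."""

    def __init__(self, rows):
        self._rows_on_disk = rows   # stands in for index files on disk
        self._resident = None       # nothing is loaded at construction time
        self.loads = 0              # counts how often the "disk" load ran

    def _ensure_loaded(self):
        # The load happens lazily, on whichever query arrives first.
        if self._resident is None:
            self.loads += 1
            self._resident = list(self._rows_on_disk)

    def query(self, needle):
        self._ensure_loaded()
        return [row for row in self._resident if needle in row]


index = LazyIndex(["def warm()", "def search()"])
loads_before = index.loads          # still 0: constructing the index loads nothing
first = index.query("warm")         # this call triggers the one-time load
second = index.query("search")      # this call reuses the resident data
```

In this toy, whoever issues the first `query()` absorbs the load. Moving that first call into startup is the idea behind the change.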

Change

Run one representative search() in warm() so the retrieval indexes are resident before the first user query. Guarded and best-effort so warm-up can never block startup; a no-op on an empty index.
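A rough sketch of what a guarded, best-effort warm-up along these lines could look like. This is not the code from the PR: `FakeRetriever`, `WARM_QUERY`, the `limit` parameter, the `"chunks"` status key, and the boolean return value are all assumptions made for illustration. Only the names `warm()`, `status()`, `embed_query()`, and `search()` come from the description above.

```python
import logging

logger = logging.getLogger(__name__)

WARM_QUERY = "warm-up"  # hypothetical placeholder query text


class FakeRetriever:
    """Stand-in for the real retriever; every signature here is an assumption."""

    def __init__(self, chunk_count, fail_search=False):
        self.chunk_count = chunk_count
        self.fail_search = fail_search
        self.calls = []

    def status(self):
        self.calls.append("status")
        return {"chunks": self.chunk_count}

    def embed_query(self, text):
        self.calls.append("embed_query")
        return [0.0]

    def search(self, text, limit=1):
        self.calls.append("search")
        if self.fail_search:
            raise RuntimeError("index unavailable")
        return []


def warm(retriever):
    """Best-effort warm-up; returns True only if the search path was exercised."""
    try:
        status = retriever.status()
        retriever.embed_query(WARM_QUERY)      # warms the embedding model
        if status.get("chunks", 0) == 0:
            return False                       # empty index: nothing to warm
        retriever.search(WARM_QUERY, limit=1)  # touches the retrieval indexes
        return True
    except Exception:
        # Any failure is logged and swallowed so startup continues.
        logger.warning("warm-up failed; continuing startup", exc_info=True)
        return False
```

Catching `Exception` broadly is deliberate in this sketch, since a failed warm-up should cost nothing more than a cold first query. It does not protect against a search that hangs, which would need a timeout on top.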

Verification

Reproduced locally (~550 chunks, fresh process):

| query | total (ms) | embed (ms) | store (ms) | dense / lex / hydrate (ms) |
|---|---|---|---|---|
| cold first query | 35.4 | 0.2 | 35.0 | 14.4 / 9.4 / 11.2 |
| after warm search | 14.7 | 0.1 | 14.5 | 5.2 / 4.9 / 4.4 |

The cold penalty is spread evenly across dense/lexical/hydrate — the signature of cold index loading, not one slow op. On the demo's slower disk this cold-load is far larger (the observed ~360 ms). Tests: retrieval / store / surfaces / webui all pass; lint + format clean.
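The per-stage penalty can be checked directly from the local measurements above. The snippet below is only arithmetic on those reported numbers; the variable names are illustrative.

```python
# Store-stage timings in milliseconds, copied from the local measurements above.
cold = {"dense": 14.4, "lexical": 9.4, "hydrate": 11.2}
warm = {"dense": 5.2, "lexical": 4.9, "hydrate": 4.4}

# Extra time each stage spends on the cold first query.
penalty = {stage: round(cold[stage] - warm[stage], 1) for stage in cold}

# Total cold penalty across the three stages.
total_penalty = round(sum(penalty.values()), 1)

# Fraction of the total penalty contributed by each stage.
share = {stage: round(penalty[stage] / total_penalty, 2) for stage in penalty}

print(penalty)
print(total_penalty)
print(share)
```

The three deltas sum to the same 20.5 ms difference seen between the cold and warm store totals (35.0 vs 14.5), and every stage is slower when cold rather than a single one.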

🤖 Generated with Claude Code

warm() ran status() + embed_query(), so the store's vector/FTS/scalar indexes and LanceDB's query path stayed cold until the first real query, which then paid the entire index-load cost. With the new badge breakdown this is visible as a large store_ms (e.g. embed 26 ms vs store 363 ms over 548 chunks on the demo) while embed is already warm.

Run one representative search() in warm() so the retrieval indexes are resident before the first user query. Measured locally (~550 chunks): first-query store drops from ~35 ms to ~14 ms; on a slower deployed host the cold-load is far larger. Best-effort and guarded so warm-up can never block startup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Y1DfHPqxHppXF6zEYgFKi3

codecov-commenter · 2026-06-19T15:51:33Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


Neverdecel merged commit 77c0ade into master Jun 19, 2026
13 checks passed

Neverdecel merged 1 commit into master from claude/warm-search-path
