docs: update LoCoMo retrieval-only SochDB NVIDIA benchmark #7

Merged: sushanthpy merged 2 commits into sochdb:main from tatavishnurao:report/locomo-metadata-retrieval on May 30, 2026

@tatavishnurao tatavishnurao commented May 29, 2026

Summary

This PR updates the LoCoMo retrieval-only benchmark report with a full SochDB-backed NVIDIA retrieval run using metadata-rich memory rendering.

The evaluated configuration uses:

  • SochDB vector backend
  • NVIDIA llama-nemotron-embed-1b-v2
  • 2048-dimensional embeddings
  • BM25 + vector hybrid retrieval
  • RRF fusion
  • single-query retrieval
  • metadata-rich memory rendering
  • no reranker
  • no decomposed retrieval
  • no evidence completion
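
The BM25 + vector fusion step listed above can be sketched as weighted reciprocal rank fusion (RRF). This is a minimal illustration, not the runner's code: the function name `weighted_rrf` and the memory ids are hypothetical, and the PR does not show how `--bm25-weight` and `--vector-weight` are applied, so the sketch assumes each weight simply scales that retriever's `1 / (rrf_k + rank)` term.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, rrf_k=60, top_k=100):
    """Fuse several ranked id lists with weighted reciprocal rank fusion.

    rankings: dict mapping retriever name -> list of ids, best first.
    weights:  dict mapping retriever name -> weight (assumed multiplicative).
    """
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights.get(name, 1.0)
        for rank, mem_id in enumerate(ranked_ids, start=1):
            # Each list contributes weight / (rrf_k + rank) for the ids it returned.
            scores[mem_id] += weight / (rrf_k + rank)
    # Sort by fused score, highest first; break ties by id for determinism.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [mem_id for mem_id, _ in ordered[:top_k]]


# Toy candidate lists (hypothetical ids), using the weights from the command below.
bm25_hits = ["m3", "m1", "m7"]
vector_hits = ["m1", "m9", "m3"]
fused = weighted_rrf(
    {"bm25": bm25_hits, "vector": vector_hits},
    {"bm25": 1.5, "vector": 0.75},
    rrf_k=60,
    top_k=100,
)
print(fused)
```

With `rrf_k=60`, rank differences within a list shift scores only slightly, so an id returned by both retrievers tends to outrank an id returned by only one.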

Result

On the full LoCoMo retrieval-only evaluation:

| Variant | Hit@100 | Recall@100 |
| --- | --- | --- |
| Raw full K100 | 82.70 | 77.46 |
| Metadata full K100 | 89.53 | 84.69 |
| Delta | +6.83 | +7.24 |

The strongest gains were in:

| Category | Δ Hit@100 | Δ Recall@100 |
| --- | --- | --- |
| multi_hop | +14.61 | +15.51 |
| single_hop | +9.96 | +12.16 |
| temporal | +7.19 | +7.84 |

Notes

This is a retrieval-only benchmark. It measures gold evidence recovery using Hit@K and Recall@K. It does not report final LoCoMo answer accuracy or judge-based QA accuracy.
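
For reference, the two metrics can be sketched as follows. This assumes the common definitions: Hit@K is 1 when at least one gold evidence id appears in the top K retrieved ids, and Recall@K is the fraction of gold evidence ids that appear there. The function names and sample ids are hypothetical, and the benchmark's own scorer may differ in detail.

```python
def hit_and_recall_at_k(retrieved_ids, gold_ids, k=100):
    """Return (hit, recall) for one question, given ranked retrieved ids."""
    gold = set(gold_ids)
    if not gold:
        raise ValueError("a scored question needs at least one gold evidence id")
    top_k = set(retrieved_ids[:k])
    found = len(gold & top_k)
    hit = 1.0 if found > 0 else 0.0
    recall = found / len(gold)
    return hit, recall


def mean_metrics(questions, k=100):
    """Average Hit@K and Recall@K over questions, reported as percentages."""
    if not questions:
        raise ValueError("no questions to score")
    pairs = [hit_and_recall_at_k(q["retrieved"], q["gold"], k) for q in questions]
    n = len(pairs)
    return {
        "hit": 100.0 * sum(h for h, _ in pairs) / n,
        "recall": 100.0 * sum(r for _, r in pairs) / n,
    }


# Two toy questions with hypothetical ids, scored at K=2.
toy = [
    {"retrieved": ["a", "b", "c"], "gold": ["a", "z"]},  # one of two gold ids found
    {"retrieved": ["x", "y", "q"], "gold": ["q"]},       # gold id sits at rank 3
]
summary = mean_metrics(toy, k=2)
print(summary)
```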

Validation

The full metadata run produced:

  • 1,986 retrieval rows
  • 1,977 scored questions
  • 10 LoCoMo samples
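
Counts like these can be re-derived from the output JSONL with a short script. The sketch below is illustrative only: the field names `sample_id` and `gold_ids` are assumptions, not the runner's documented schema, and treating a row with no gold evidence as unscored is a guess at why the row and scored-question counts differ, which the PR does not explain.

```python
import json


def summarize_rows(lines, gold_field="gold_ids", sample_field="sample_id"):
    """Count rows, rows with gold evidence, and distinct samples in JSONL lines."""
    rows = 0
    scored = 0
    samples = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        record = json.loads(line)
        rows += 1
        if sample_field in record:
            samples.add(record[sample_field])
        # Assumption: a row counts as scored only if it lists gold evidence.
        if record.get(gold_field):
            scored += 1
    return {"rows": rows, "scored": scored, "samples": len(samples)}


# Toy input standing in for retrieval.jsonl (hypothetical records).
toy_lines = [
    '{"sample_id": "s1", "gold_ids": ["m1"]}',
    '{"sample_id": "s1", "gold_ids": []}',
    '{"sample_id": "s2", "gold_ids": ["m4", "m5"]}',
    "",
]
counts = summarize_rows(toy_lines)
print(counts)
```

To run it against a real file, pass an open file handle in place of `toy_lines`, since the function only needs an iterable of lines.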

Command Configuration

uv run python benchmarks/paper/locomo/runners/run_hybrid_locomo_retrieval.py \
  --memories benchmarks/paper/locomo/data/locomo_memories.jsonl \
  --questions benchmarks/paper/locomo/data/locomo_questions.jsonl \
  --embedding-provider nvidia \
  --embedding-model nvidia/llama-nemotron-embed-1b-v2 \
  --embedding-dim 2048 \
  --vector-backend sochdb \
  --host "$SOCHDB_HOST" \
  --port "$SOCHDB_PORT" \
  --collection-prefix locomo_sochdb_nvidia_dim2048_metadata_full_k100 \
  --k 100 \
  --candidate-k 200 \
  --bm25-weight 1.5 \
  --vector-weight 0.75 \
  --rrf-k 60 \
  --query-mode single \
  --memory-render-mode metadata \
  --retrieval-plan one_shot \
  --evidence-completion none \
  --retrieved-id-mode memory \
  --reranker-provider none \
  --out benchmarks/paper/locomo/results/full_single_metadata_sochdb_nvidia_dim2048_k100/retrieval.jsonl

@sushanthpy sushanthpy merged commit 43f50af into sochdb:main May 30, 2026