Production-grade intent-first search using BM25 + query-time boosts.
ML suggests. Rules protect. Users decide.
A production-oriented reference implementation of a modern search system that enforces user intent first using BM25 + query-time lexical boosts, before introducing semantic or ML-based ranking.
This repository demonstrates how real search systems are built in production:
deterministic first, intelligent later.
Search failures rarely come from bad algorithms; they come from broken boundaries between components.
This system enforces the following invariant:
Intent is enforced deterministically before any ML is allowed to influence ranking.
Search is a funnel, not a brain.
User Query
↓
Intent Control (BM25 + Query-Time Boosts)
↓
Candidate Retrieval (Elasticsearch)
---
- API Layer → input validation, orchestration
- Query Understanding → intent extraction & boosts
- Search Orchestrator → controls flow & guarantees
- Elasticsearch → retrieval + BM25 scoring
🔁 Runtime Flow (Sequence Diagram)
🔍 Low-Level Design (LLD)
1️⃣ Query Understanding Service
Responsible for:
- Tokenization
- Stop-word removal
- Intent-based boost assignment (category > material > attribute)
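
The three responsibilities above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the vocabularies, the boost weights, and the names `BoostedTerm`, `classify`, and `understand_query` are all assumptions chosen for the example. The only thing taken from the list above is the ordering category > material > attribute.

```python
import re
from dataclasses import dataclass

# Hypothetical vocabularies and weights; a real system would load these
# from a catalog or configuration, not hard-code them.
STOP_WORDS = frozenset({"a", "an", "the", "for", "with", "of", "in"})
CATEGORY_TERMS = frozenset({"shirt", "shoes", "jacket"})
MATERIAL_TERMS = frozenset({"cotton", "leather", "wool"})
BOOSTS = {"category": 3.0, "material": 2.0, "attribute": 1.0}


@dataclass(frozen=True)
class BoostedTerm:
    term: str
    intent: str
    boost: float


def classify(token: str) -> str:
    """Map a token to an intent; anything unrecognised is an attribute."""
    if token in CATEGORY_TERMS:
        return "category"
    if token in MATERIAL_TERMS:
        return "material"
    return "attribute"


def understand_query(query: str) -> list:
    """Tokenize, drop stop words, and attach an intent-based boost."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    result = []
    for token in kept:
        intent = classify(token)
        result.append(BoostedTerm(token, intent, BOOSTS[intent]))
    return result
```

With these example vocabularies, `understand_query("The red cotton shirt")` drops "the" and returns `red` as an attribute (1.0), `cotton` as a material (2.0), and `shirt` as a category (3.0). Everything is a plain lookup, so the same query always produces the same boosts.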
2️⃣ Search Orchestrator
Responsible for:
- Enforcing architecture rules
- Building Elasticsearch queries
- Preventing ML/semantic override
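
As a rough sketch of what query building and override prevention might look like, the example below assembles an Elasticsearch request body as a plain dictionary. The field names (`title`, `category`, `material`, `attributes`), the function names, and the choice of forbidden keys are assumptions made for illustration. The body uses the standard `bool`, `multi_match`, and `match` query clauses, with boosts applied at query time.

```python
# Hypothetical index fields; adjust to the real mapping.
SEARCH_FIELDS = ["title", "category", "material", "attributes"]

# Top-level request sections that would let non-lexical scoring in.
# Treating these two as forbidden is an assumption of this sketch.
FORBIDDEN_TOP_LEVEL_KEYS = frozenset({"knn", "rescore"})


def build_query(raw_query: str, boosted_terms: list, size: int = 20) -> dict:
    """Build a lexical-only request body.

    boosted_terms is a list of (term, field, boost) tuples.
    The `must` clause does BM25 retrieval over the raw query;
    each `should` clause adds a query-time boost for one intent term.
    """
    should = [
        {"match": {field: {"query": term, "boost": boost}}}
        for term, field, boost in boosted_terms
    ]
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"query": raw_query, "fields": SEARCH_FIELDS}}
                ],
                "should": should,
            }
        },
    }


def assert_lexical_only(body: dict) -> None:
    """Reject request bodies that carry vector or rescoring sections."""
    offending = FORBIDDEN_TOP_LEVEL_KEYS & body.keys()
    if offending:
        raise ValueError(f"non-lexical sections not allowed: {sorted(offending)}")
```

Calling `assert_lexical_only(build_query(...))` just before the request is sent gives the orchestrator one place to enforce the rule, whatever upstream code has done to the body.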
3️⃣ Elasticsearch (Logical View)
Elasticsearch is not the brain.
It is a retrieval engine only.
✅ What This Repo Demonstrates
- Intent-first search design
- BM25 as the authority
- Query-time lexical boosting
- Clean architectural boundaries
- Explainable, deterministic ranking
- Production-realistic patterns (not toy ML)
🧩 Extension Path (Future Work)
This architecture safely supports:
- Hybrid BM25 + vector search
- Offline Learning-to-Rank
- Feature flags & fallbacks
- Multi-region deployment
- Caching & performance tuning
All without breaking correctness.
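
One way to read "without breaking correctness" is that any future ML stage may reorder the deterministic candidate set but never add to it, drop from it, or take the request down with it. The sketch below illustrates that idea for the feature-flag and fallback item. The function name `rank_with_fallback`, the `ml_enabled` flag, and the reranker signature are hypothetical and not part of this repository.

```python
from typing import Callable, List, Optional, Sequence


def rank_with_fallback(
    bm25_ids: Sequence[str],
    reranker: Optional[Callable[[List[str]], Sequence[str]]],
    ml_enabled: bool,
) -> List[str]:
    """Return an ML ordering only if it is a pure reordering of the BM25 results."""
    baseline = list(bm25_ids)

    # Feature flag off (or no model wired in): deterministic order wins.
    if not ml_enabled or reranker is None:
        return baseline

    try:
        # Pass a copy so the reranker cannot mutate the baseline.
        proposed = list(reranker(list(baseline)))
    except Exception:
        # Any model failure falls back to the deterministic ranking.
        return baseline

    # ML suggests, rules protect: the model may reorder candidates,
    # but it may not add or remove any.
    if sorted(proposed) != sorted(baseline):
        return baseline
    return proposed
```

With `ml_enabled=False`, or with a reranker that raises or changes the candidate set, the caller gets the BM25 order back unchanged.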