Intent-First Search Engine

Production-grade intent-first search using BM25 + query-time boosts.

ML suggests. Rules protect. Users decide.

A production-oriented reference implementation of a modern search system that enforces user intent first using BM25 + query-time lexical boosts, before introducing semantic or ML-based ranking.

This repository demonstrates a pattern common in production search systems:
deterministic first, intelligent later.

🎯 Core Philosophy

Search failures rarely come from bad algorithms; they come from broken boundaries between components.

This system enforces the following invariant:

Intent is enforced deterministically before any ML is allowed to influence ranking.
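
As a rough illustration of this invariant, the sketch below gates candidates with a deterministic intent check before an optional ML scorer is allowed to reorder anything. The `Candidate` type, the `matches_intent` flag, and the `rank` function are hypothetical names chosen for this example and are not taken from the repository.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    matches_intent: bool   # assumed to be set by deterministic rules, never by ML
    lexical_score: float   # assumed to be BM25 plus query-time boosts


def rank(
    candidates: List[Candidate],
    ml_score: Optional[Callable[[Candidate], float]] = None,
) -> List[Candidate]:
    # Step 1 (deterministic): drop anything that violates the user's intent.
    allowed = [c for c in candidates if c.matches_intent]
    # Step 2 (deterministic): order survivors by lexical score, doc_id as tie-break.
    ordered = sorted(allowed, key=lambda c: (-c.lexical_score, c.doc_id))
    if ml_score is None:
        return ordered
    # Step 3 (optional): ML may reorder survivors but can never re-admit
    # a candidate that the intent gate rejected.
    return sorted(ordered, key=lambda c: (-ml_score(c), c.doc_id))
```

With this shape, a document that fails the intent check cannot surface no matter how high the ML scorer rates it, and the ranking stays fully deterministic when no scorer is supplied.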

🧠 Architectural Overview

Search is a funnel, not a brain.


User Query
↓
Intent Control (BM25 + Query-Time Boosts)
↓
Candidate Retrieval (Elasticsearch)
---

🏗 High-Level Design (HLD)

Responsibilities by Layer

API Layer → input validation, orchestration
Query Understanding → intent extraction & boosts
Search Orchestrator → controls flow & guarantees
Elasticsearch → retrieval + BM25 scoring

🔁 Runtime Flow (Sequence Diagram)

🔍 Low-Level Design (LLD)

1️⃣ Query Understanding Service

Responsible for:

Tokenization

Stop-word removal

Intent-based boost assignment (category > material > attribute)
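
The three responsibilities above can be sketched roughly as follows. The stop-word list, the category and material vocabularies, and the numeric boost values are all assumptions made for illustration; the only detail taken from the text is the ordering category > material > attribute.

```python
import re
from typing import List, Tuple

# Hypothetical vocabularies and weights; a real system would load these from data.
STOP_WORDS = frozenset({"a", "an", "the", "for", "with", "in", "of", "and"})
CATEGORY_TERMS = frozenset({"sofa", "table", "chair"})
MATERIAL_TERMS = frozenset({"leather", "wood", "oak"})
BOOSTS = {"category": 3.0, "material": 2.0, "attribute": 1.0}


def tokenize(query: str) -> List[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", query.lower())


def understand(query: str) -> List[Tuple[str, str, float]]:
    """Return (term, intent_type, boost) for each non-stop-word token."""
    result = []
    for token in tokenize(query):
        if token in STOP_WORDS:
            continue
        if token in CATEGORY_TERMS:
            intent_type = "category"
        elif token in MATERIAL_TERMS:
            intent_type = "material"
        else:
            intent_type = "attribute"
        result.append((token, intent_type, BOOSTS[intent_type]))
    return result
```

For example, `understand("A brown leather sofa")` drops "a", tags "sofa" as a category term with the highest boost, "leather" as a material, and "brown" as a plain attribute.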

2️⃣ Search Orchestrator

Responsible for:

Enforcing architecture rules

Building Elasticsearch queries

Preventing ML/semantic override
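
One way the orchestrator's query building and override guard might look is sketched below. It builds a standard Elasticsearch `bool` query with boosted `match` clauses as a plain dictionary and rejects request bodies that carry top-level `knn` or `rescore` sections. The field names (`title`, `category`, `material`) and the function names are hypothetical, since the index mapping is not shown here.

```python
from typing import Dict, List, Tuple

# Top-level search sections treated as non-lexical in this sketch (an assumption).
FORBIDDEN_TOP_LEVEL_KEYS = frozenset({"knn", "rescore"})


def build_query(raw_query: str, boosts: List[Tuple[str, str, float]]) -> Dict:
    """Build a bool query: the raw query must match, intent terms add boosts."""
    should = [
        {"match": {field: {"query": term, "boost": boost}}}
        for term, field, boost in boosts
    ]
    return {
        "query": {
            "bool": {
                # "title" is a hypothetical full-text field scored with BM25.
                "must": [{"match": {"title": {"query": raw_query}}}],
                "should": should,
            }
        }
    }


def enforce_lexical_only(body: Dict) -> Dict:
    """Raise if the request body contains ML or semantic sections."""
    offending = FORBIDDEN_TOP_LEVEL_KEYS.intersection(body)
    if offending:
        raise ValueError(f"non-lexical sections not allowed: {sorted(offending)}")
    return body
```

Keeping the guard as a separate function means it can run on every request body right before it is sent, regardless of which code path built it.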

3️⃣ Elasticsearch (Logical View)

Elasticsearch is not the brain.

It is a retrieval engine only.

✅ What This Repo Demonstrates

Intent-first search design

BM25 as the authority

Query-time lexical boosting

Clean architectural boundaries

Explainable, deterministic ranking

Production-realistic patterns (not toy ML)

🧩 Extension Path (Future Work)

This architecture safely supports:

Hybrid BM25 + vector search

Offline Learning-to-Rank

Feature flags & fallbacks

Multi-region deployment

Caching & performance tuning

All without breaking correctness.
