NLWeb is a natural language search system that provides intelligent query processing, multi-source retrieval, and AI-powered response generation. The system consists of a Python backend serving a modern JavaScript frontend via HTTP/HTTPS.
- `GET/POST /ask` - Main query endpoint
  - Parameters:
    - `query` (string): User's natural language query
    - `site` (string/array): Target site(s) to search
    - `generate_mode` (string): "list", "summarize", or "generate"
    - `streaming` (boolean): Enable server-sent events streaming
    - `prev` (array): Previous queries for context
    - `last_ans` (array): Previous answers for context
    - `item_to_remember` (string): Items to remember in conversation
    - `model` (string): LLM model to use
    - `oauth_id` (string): User ID for authenticated storage
    - `thread_id` (string): Conversation thread ID
  - Returns: JSON response or SSE stream of results
- `GET /sites` - Get list of available sites
  - Parameters: `streaming` (boolean)
  - Returns: Array of site names
- `GET /who` - Handle "who" queries
  - Parameters: Same as `/ask`
  - Returns: Person/entity information
- `GET /api/oauth/config` - Get OAuth configuration
  - Returns: Enabled providers and client IDs
- `POST /api/oauth/token` - Exchange OAuth code for token
  - Body: `{ code, provider }`
  - Returns: User info and auth token
- `GET /api/conversations` - Get user's conversations
  - Headers: `Authorization: Bearer <token>`
  - Returns: Array of conversations
- `POST /api/conversations` - Create/update conversation
  - Headers: `Authorization: Bearer <token>`
  - Body: Conversation object
  - Returns: Saved conversation
- `DELETE /api/conversations/{id}` - Delete conversation
  - Headers: `Authorization: Bearer <token>`
  - Returns: Success status
- `api_version` - API version information
- `query_analysis` - Query understanding results
- `decontextualized_query` - Reformulated query for context
- `remember` - Items to remember
- `asking_sites` - Sites being queried
- `result_batch` - Batch of search results
- `summary` - Summarized response
- `nlws` - Natural language web search response (for generate mode)
- `ensemble_result` - Multi-source recommendations
- `chart_result` - Data visualization HTML
- `results_map` - Location-based results for mapping
- `intermediate_message` - Progress updates
- `complete` - Stream completion signal
- `error` - Error messages
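A client consuming the stream typically branches on `message_type`. A sketch covering a few of the types, where the actions (and the `handle_stream_message` helper itself) are purely illustrative:

```python
def handle_stream_message(msg: dict) -> str:
    """Route a streaming message by its message_type.

    Returns a short description of the action taken; a real client
    would update the UI instead. Only a few types are shown.
    """
    mt = msg.get("message_type")
    if mt == "result_batch":
        return f"render {len(msg.get('results', []))} results"
    if mt == "summary":
        return "show summary text"
    if mt == "nlws":
        return "show generated answer with cited items"
    if mt == "error":
        return f"display error: {msg.get('message', '')}"
    if mt == "complete":
        return "close stream"
    return "ignore"  # unknown or auxiliary message types


print(handle_stream_message({"message_type": "result_batch", "results": [{}, {}]}))
# → render 2 results
```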
```python
# /ask request parameters
{
    "query": str,                  # User's query text
    "site": Union[str, List[str]], # Target site(s)
    "generate_mode": str,          # "list", "summarize", "generate"
    "streaming": bool,             # Enable SSE streaming
    "prev": List[str],             # Previous queries
    "last_ans": List[Dict],        # Previous answers [{title, url}]
    "item_to_remember": str,       # Memory items
    "model": str,                  # LLM model name
    "oauth_id": str,               # User identifier
    "thread_id": str,              # Conversation thread
    "display_mode": str,           # "full" or other display modes
}

# Internal representation of a search result
[url, json_data, name, site]  # Tuple format
```
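Assuming the tuple layout above, converting an internal result into the API response format might look like the sketch below. The default `score` and `description` values are an assumption; in practice ranking fills them in later:

```python
def tuple_to_response(item: tuple) -> dict:
    """Convert an internal (url, json_data, name, site) tuple into the
    API response shape. score/description default to 0.0 / "" here,
    on the assumption that later ranking stages populate them."""
    url, json_data, name, site = item
    return {
        "url": url,
        "name": name,
        "site": site,
        "score": 0.0,
        "description": "",
        "schema_object": json_data,  # Schema.org structured data
        "details": {},
    }


resp = tuple_to_response(
    ("https://example.com/a", {"@type": "Recipe"}, "A", "example.com")
)
print(resp["name"], resp["schema_object"]["@type"])
# → A Recipe
```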
```python
# API response format for a search result
{
    "url": str,
    "name": str,
    "site": str,
    "score": float,
    "description": str,
    "schema_object": dict,  # Schema.org structured data
    "details": dict,        # Additional details
}

# Conversation object
{
    "id": str,          # Unique conversation ID
    "title": str,       # Conversation title
    "messages": List[{
        "content": str,          # Message content
        "type": str,             # "user" or "assistant"
        "timestamp": int,        # Unix timestamp
        "parsedAnswers": List[{  # For assistant messages
            "title": str,
            "url": str
        }]
    }],
    "timestamp": int,   # Last update timestamp
    "site": str,        # Associated site
    "user_id": str,     # Owner user ID
}

# Streaming message format
{
    "message_type": str,    # Type of message
    "query_id": str,        # Query identifier
    # Type-specific fields:
    "message": str,         # For text messages
    "results": List[dict],  # For result batches
    "answer": str,          # For nlws responses
    "items": List[dict],    # For nlws items
    "html": str,            # For chart results
    "locations": List[{     # For map results
        "title": str,
        "address": str
    }]
}
```

HTTP Request → Route Matching → Handler Selection → Parameter Parsing
NLWebHandler Creation → State Initialization → Streaming Setup
Parallel execution of:
Direct Vector Search → Early Results → Stream if Available
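This parallel fast path can be sketched with `asyncio`: a direct vector search task runs alongside the full analysis pipeline, so early results can stream before the slower path finishes. Both coroutines below are stand-ins with artificial delays:

```python
import asyncio


async def fast_vector_search(query: str) -> list:
    # Hypothetical fast path: direct vector lookup, no query analysis.
    await asyncio.sleep(0.01)
    return [{"name": "early result", "url": "https://example.com/e"}]


async def analyzed_search(query: str) -> list:
    # Hypothetical slower path: decontextualize, analyze, then search.
    await asyncio.sleep(0.05)
    return [{"name": "ranked result", "url": "https://example.com/r"}]


async def run(query: str):
    fast = asyncio.create_task(fast_vector_search(query))
    full = asyncio.create_task(analyzed_search(query))
    early = await fast   # stream early results as soon as available
    final = await full   # then the fully processed results
    return early, final


early, final = asyncio.run(run("pasta recipes"))
print(len(early), len(final))
```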
1. Decontextualization (if prev queries exist)
- Use LLM to reformulate query with context
2. Query Analysis
- Item type detection
- Relevance checking
- Memory processing
3. Tool Selection
- Load tool definitions
- Evaluate tools against query
- Route to specialized handler if matched
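The tool-selection step above can be sketched as scoring tool definitions against the query. The real system evaluates tools with an LLM; the keyword-overlap scorer and the `TOOLS` table below are simplified stand-ins:

```python
# Hypothetical tool definitions; real definitions are loaded from config.
TOOLS = {
    "ensemble": {"keywords": {"compare", "recommend", "plan"}},
    "chart": {"keywords": {"chart", "plot", "graph"}},
    "map": {"keywords": {"near", "map", "location"}},
}


def select_tool(query: str):
    """Pick the best-matching tool, or None to fall through to the
    default search path. Keyword overlap stands in for LLM evaluation."""
    words = set(query.lower().split())
    best, best_score = None, 0
    for name, tool in TOOLS.items():
        score = len(words & tool["keywords"])
        if score > best_score:
            best, best_score = name, score
    return best


print(select_tool("plot calories per serving as a chart"))
# → chart
```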
1. Prepare Query
- Apply site filters
- Format for vector DB
2. Parallel Search
- Query multiple vector DB endpoints
- Aggregate results
- Deduplicate by URL
3. Result Processing
- Convert to standard format
- Apply initial filtering
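The parallel search and URL-based deduplication steps might look like the following sketch, assuming each endpoint returns internal `(url, json_data, name, site)` tuples. The endpoint names and data here are made up:

```python
import asyncio

# Fake endpoint data standing in for real vector DB backends.
DATA = {
    "db1": [("https://a.com/1", {}, "One", "a.com"),
            ("https://a.com/2", {}, "Two", "a.com")],
    "db2": [("https://a.com/2", {}, "Two", "a.com"),
            ("https://b.com/3", {}, "Three", "b.com")],
}


async def search_endpoint(endpoint: str, query: str) -> list:
    """Hypothetical vector DB query returning result tuples."""
    await asyncio.sleep(0)
    return DATA.get(endpoint, [])


async def retrieve(query: str, endpoints: list) -> list:
    # Query all endpoints concurrently, then merge.
    batches = await asyncio.gather(
        *(search_endpoint(e, query) for e in endpoints)
    )
    seen, merged = set(), []
    for batch in batches:
        for item in batch:
            url = item[0]
            if url not in seen:  # deduplicate by URL
                seen.add(url)
                merged.append(item)
    return merged


results = asyncio.run(retrieve("pasta", ["db1", "db2"]))
print([r[0] for r in results])
# → ['https://a.com/1', 'https://a.com/2', 'https://b.com/3']
```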
1. LLM-based Ranking (if enabled)
- Score results for relevance
- Apply query-specific criteria
2. Post-Ranking Tasks
- Additional filtering
- Result enrichment
- Score normalization
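The score-normalization step is not specified in detail; one common interpretation is min-max scaling into [0, 1], sketched here as an assumption:

```python
def normalize_scores(results: list) -> list:
    """Min-max normalize ranking scores into [0, 1]. This is one
    plausible reading of the "score normalization" step, not the
    system's documented behavior."""
    scores = [r["score"] for r in results]
    lo, hi = min(scores), max(scores)
    span = hi - lo or 1.0  # avoid division by zero when all scores equal
    return [dict(r, score=(r["score"] - lo) / span) for r in results]


ranked = [{"url": "u1", "score": 2.0}, {"url": "u2", "score": 7.0}]
print([r["score"] for r in normalize_scores(ranked)])
# → [0.0, 1.0]
```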
Based on generate_mode:
- list: Format Results → Stream result_batch messages → Complete
- summarize: Results → LLM Summarization → Stream summary + results → Complete
- generate: Results → GenerateAnswer Handler → RAG Generation → Stream nlws message → Complete

Conversation storage: Collect Results → Format Conversation → Store to Database
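The conversation-storage step assembles a record matching the conversation object schema shown earlier. A sketch using those documented field names (the `build_conversation` helper itself is hypothetical):

```python
import time


def build_conversation(conv_id, title, site, user_id, messages):
    """Assemble a conversation record. Field names follow the
    conversation object schema documented above."""
    return {
        "id": conv_id,
        "title": title,
        "messages": messages,
        "timestamp": int(time.time()),  # last update time
        "site": site,
        "user_id": user_id,
    }


conv = build_conversation(
    "c1", "Pasta ideas", "seriouseats.com", "u42",
    [{"content": "best pasta?", "type": "user", "timestamp": 1700000000}],
)
print(sorted(conv.keys()))
# → ['id', 'messages', 'site', 'timestamp', 'title', 'user_id']
```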
┌─────────────┐ ┌──────────────┐ ┌─────────────┐
│ Browser │────▶│ WebServer │────▶│ Router │
│ (JS Client) │◀────│ (HTTP) │◀────│ (Tools) │
└─────────────┘ └──────────────┘ └─────────────┘
│ │
▼ ▼
┌──────────────┐ ┌─────────────┐
│ NLWebHandler │────▶│ Specialized │
│ (Base) │ │ Handlers │
└──────────────┘ └─────────────┘
│
┌───────┴───────┐
▼ ▼
┌─────────────┐ ┌─────────────┐
│ Retriever │ │ LLM │
│ (Vector DB) │ │ Provider │
└─────────────┘ └─────────────┘
- `config.yaml` - Main configuration
- `config_retrieval.yaml` - Retrieval endpoints
- `config_llm.yaml` - LLM provider settings
- `oauth_config.yaml` - OAuth provider configuration
- Site-specific configs - Per-site customization
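How these layers combine is not specified; a plausible sketch is a deep merge where site-specific values override the base config once each YAML file is loaded into a nested dict. Both `deep_merge` and the sample keys below are assumptions:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge override into base, returning a new dict.
    Nested dicts merge key-by-key; scalars in override win."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Illustrative contents; real keys come from the YAML files listed above.
base = {"llm": {"provider": "openai", "model": "gpt-4o"}, "streaming": True}
site = {"llm": {"model": "gpt-4o-mini"}}
print(deep_merge(base, site))
# → {'llm': {'provider': 'openai', 'model': 'gpt-4o-mini'}, 'streaming': True}
```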
The system is designed for extensibility, supporting multiple vector databases, LLM providers, and specialized tools while maintaining a consistent API interface.