search-proxy

A rate-limiting proxy for SearXNG that prevents upstream search engine throttling when AI agents fire many concurrent queries.

Problem

AI coding agents (Claude Code, Open WebUI, etc.) send parallel search queries through SearXNG. Each SearXNG query fans out to multiple upstream engines (Google, Bing, DuckDuckGo). Burst requests overwhelm upstream rate limits, triggering CAPTCHAs and empty results.

Solution

search-proxy sits between your AI tools and SearXNG, limiting concurrent upstream queries with an async semaphore. Excess requests queue in memory and drain as slots free. Callers see slower responses under load instead of failures.

AI Agent --> search-proxy:8086 --> SearXNG:8089 --> Google/Bing/DDG
                (max 4 concurrent)
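
To illustrate the queueing behavior described above, here is a minimal standalone sketch (not the project's actual code). It fires a burst of simulated queries through an asyncio.Semaphore and reports the peak number in flight. The names run_burst and fake_upstream_query are hypothetical, and the sleep stands in for a real SearXNG round trip.

```python
import asyncio

MAX_CONCURRENT = 4  # mirrors the proxy's documented default


async def run_burst(n_requests: int, limit: int = MAX_CONCURRENT) -> int:
    """Fire n_requests simulated queries at once; return the peak number in flight."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def fake_upstream_query() -> None:
        nonlocal in_flight, peak
        async with semaphore:          # waits here while every slot is taken
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the SearXNG round trip
            in_flight -= 1

    # All callers start together; none fail, the excess simply wait their turn.
    await asyncio.gather(*(fake_upstream_query() for _ in range(n_requests)))
    return peak
```

Calling asyncio.run(run_burst(20)) completes all 20 simulated queries while never running more than 4 at the same time.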

Quick Start

docker compose up -d

By default, the proxy listens on port 8088 inside its container and forwards to http://searxng:8080. The Compose examples under Deployment publish it on host port 8086. Configure via environment variables:

Variable	Default	Description
`UPSTREAM_URL`	`http://searxng:8080`	SearXNG base URL
`MAX_CONCURRENT`	`4`	Max simultaneous upstream queries
`REQUEST_TIMEOUT`	`30`	Upstream timeout in seconds
`PORT`	`8088`	Proxy listen port
`LOG_FILE`	`/var/log/search-proxy/proxy.log`	Log file path
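
As a rough sketch of how these variables might be read at startup, the snippet below loads them with the defaults from the table. ProxyConfig and load_config are hypothetical names chosen for illustration; the real search_proxy.py may organize this differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ProxyConfig:
    upstream_url: str
    max_concurrent: int
    request_timeout: float
    port: int
    log_file: str


def load_config(env: Mapping[str, str] = os.environ) -> ProxyConfig:
    """Read settings from the environment, falling back to the documented defaults."""
    return ProxyConfig(
        upstream_url=env.get("UPSTREAM_URL", "http://searxng:8080"),
        max_concurrent=int(env.get("MAX_CONCURRENT", "4")),
        request_timeout=float(env.get("REQUEST_TIMEOUT", "30")),
        port=int(env.get("PORT", "8088")),
        log_file=env.get("LOG_FILE", "/var/log/search-proxy/proxy.log"),
    )
```

Passing a plain dict instead of os.environ, for example load_config({"MAX_CONCURRENT": "2"}), makes the defaults easy to check in isolation.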

Endpoints

GET /search?q=QUERY&format=json -- proxied to SearXNG with rate limiting. All query parameters are forwarded as-is.
GET /health -- returns {"status": "ok", "active_requests": N, "queue_depth": N, "max_concurrent": N}
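
A client could call these endpoints along the following lines. This is a sketch using only the standard library. PROXY_BASE assumes the host port 8086 from the Compose examples, and build_search_url, fetch_json, and is_saturated are hypothetical helpers, not part of the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PROXY_BASE = "http://localhost:8086"  # assumption: host port from the Compose examples


def build_search_url(query: str, base: str = PROXY_BASE) -> str:
    """Build a /search URL that asks SearXNG for JSON results."""
    return f"{base}/search?{urlencode({'q': query, 'format': 'json'})}"


def fetch_json(url: str, timeout: float = 35.0) -> dict:
    """GET a URL and decode the JSON body (needs a running proxy)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def is_saturated(health: dict) -> bool:
    """True when every slot is busy and at least one request is waiting."""
    return (
        health["active_requests"] >= health["max_concurrent"]
        and health["queue_depth"] > 0
    )
```

With the proxy running, fetch_json(build_search_url("asyncio semaphore")) returns the SearXNG JSON response, and is_saturated(fetch_json(PROXY_BASE + "/health")) gives a quick signal that callers are currently queueing.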

Deployment

Same Docker network as SearXNG

services:
  search-proxy:
    build: .
    ports:
      - "8086:8088"
    environment:
      - UPSTREAM_URL=http://searxng:8080
      - MAX_CONCURRENT=4

SearXNG on host network

services:
  search-proxy:
    build: .
    ports:
      - "8086:8088"
    environment:
      - UPSTREAM_URL=http://host.docker.internal:8089
      - MAX_CONCURRENT=4

Point your DNS or client configuration at the proxy port (8086 in the examples above) instead of at SearXNG directly.

How it works

FastAPI with asyncio.Semaphore(MAX_CONCURRENT) gates upstream requests
Excess requests queue in memory (no Redis/database needed)
Callers block until their request completes (no polling or job IDs)
Health endpoint exposes real-time queue depth and active request count
Dual logging to stdout and file
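
The gating and health bookkeeping listed above could be structured roughly as follows. This is a framework-free sketch under stated assumptions, not the actual implementation: the Gate class and its method names are hypothetical, and the FastAPI routing and upstream HTTP call are left out so the core logic runs on its own.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class Gate:
    """Limits concurrent upstream calls and tracks the counters /health reports."""

    def __init__(self, max_concurrent: int = 4) -> None:
        self.max_concurrent = max_concurrent
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self.active_requests = 0  # calls currently holding a slot
        self.queue_depth = 0      # calls waiting for a slot

    async def run(self, upstream_call: Callable[[], Awaitable[T]]) -> T:
        # The caller blocks here until a slot frees; no job IDs or polling.
        self.queue_depth += 1
        try:
            await self._semaphore.acquire()
        finally:
            self.queue_depth -= 1  # also runs if the caller is cancelled while waiting
        self.active_requests += 1
        try:
            return await upstream_call()
        finally:
            self.active_requests -= 1
            self._semaphore.release()

    def health(self) -> dict:
        """Snapshot in the shape of the documented /health response."""
        return {
            "status": "ok",
            "active_requests": self.active_requests,
            "queue_depth": self.queue_depth,
            "max_concurrent": self.max_concurrent,
        }
```

Because the counters live on a single in-process object, nothing external such as Redis is needed, at the cost that queued requests are lost if the process restarts. Create the Gate from inside the running event loop (for example in an application startup hook) so the semaphore is bound to the right loop on older Python versions.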

Running tests

pip install -r requirements.txt
pytest tests/ -v

Why MAX_CONCURRENT=4?

Google starts returning CAPTCHAs at roughly 10-20 queries/minute from the same IP. With 5 upstream engines per SearXNG query, 4 concurrent queries mean about 20 upstream requests in flight at once (4 x 5), right at that threshold. Adjust MAX_CONCURRENT based on your upstream engine configuration and IP reputation.
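
The arithmetic behind that default can be written as a small helper for trying other settings. This is only a back-of-the-envelope sketch: the function names are hypothetical, and the in-flight budget is an assumption you would pick for your own setup, not a measured limit.

```python
def upstream_in_flight(max_concurrent: int, engines_per_query: int) -> int:
    """Worst-case upstream requests in flight when every slot is busy."""
    return max_concurrent * engines_per_query


def max_concurrent_for(in_flight_budget: int, engines_per_query: int) -> int:
    """Largest MAX_CONCURRENT that stays within the budget (never below 1)."""
    return max(1, in_flight_budget // engines_per_query)


# The README's own numbers: 4 slots, 5 engines per query.
print(upstream_in_flight(4, 5))
```

For instance, enabling 8 engines while keeping the same budget of 20 would suggest lowering MAX_CONCURRENT to max_concurrent_for(20, 8).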

License

MIT
