AutoSQL

Self-optimizing SQL query pipeline. Inspired by karpathy/autoresearch.

LLM rewrites query → run it → measure (speed + correctness) → keep if better → repeat

One file. Zero fluff.

The Problem

Slow SQL queries are everywhere. Correlated subqueries, redundant scans, missing pre-aggregations — they kill app performance and nobody catches them until prod is on fire.

AutoSQL fixes this automatically:

You give it a query
It runs the query, records the baseline time and result hash
An LLM rewrites the query trying to make it faster
It runs the new query, checks the result is identical, measures the speedup
Keeps the best version, feeds the history back into the next iteration
Repeats until the budget is spent

Quickstart

# 1. Install dependencies
pip install openai            # covers Groq, OpenAI, Ollama, Together, OpenRouter
pip install anthropic         # only needed for Claude models

# 2. Set the API key for your chosen provider
export GROQ_API_KEY=gsk_...        # Groq  (free at https://console.groq.com)
export OPENAI_API_KEY=sk_...       # OpenAI
export ANTHROPIC_API_KEY=sk-ant-... # Anthropic
# Ollama needs no key — just have it running locally

# 3. Optimize your query
python AutoSQL.py --query slow.sql --db myapp.sqlite

# OpenAI
python AutoSQL.py --query slow.sql --db myapp.sqlite --model gpt-4o --provider openai

# Anthropic / Claude
python AutoSQL.py --query slow.sql --db myapp.sqlite --model claude-opus-4-5 --provider anthropic

# Ollama (local)
python AutoSQL.py --query slow.sql --db myapp.sqlite --model llama3 --provider ollama

# Any OpenAI-compatible endpoint
python AutoSQL.py --query slow.sql --db myapp.sqlite --model my-model --base-url http://localhost:8080/v1

# More iterations or a different Groq model
python AutoSQL.py --query "SELECT * FROM ..." --db myapp.sqlite --iterations 15
python AutoSQL.py --query slow.sql --db myapp.sqlite --model llama-3.1-8b-instant

Example output

──────────────────────────────────────────────────────────────
  AutoSQL — Self-optimizing Query Pipeline
  Model  : llama-3.3-70b-versatile
  Provider : groq
──────────────────────────────────────────────────────────────

  Measuring baseline … 1842.3 ms

  [01/10] Generating … ✓  312.4 ms  (5.89× faster)  ← NEW BEST
  [02/10] Generating … ✓  298.1 ms  (6.18× faster)  ← NEW BEST
  [03/10] Generating … ✗  wrong result
  [04/10] Generating … ✓  291.7 ms  (6.32× faster)  ← NEW BEST
  ...

──────────────────────────────────────────────────────────────
  Final Report
──────────────────────────────────────────────────────────────
  Baseline : 1842.3 ms
  Best     : 291.7 ms  (6.32× faster)

What it optimizes

AutoSQL tackles the classic anti-patterns that kill query performance:

Anti-pattern	What AutoSQL does
Correlated subquery in `WHERE`	Replaces with pre-aggregated JOIN / CTE
Correlated subquery in `SELECT`	Replaces with pre-aggregated JOIN
Multiple passes over the same table	Collapses into a single scan

Correctness guarantee

AutoSQL never accepts a faster query that returns different results.
Every candidate is run and its output hashed. Only an exact match against the baseline hash counts as correct.

Outputs

After the run, autosql_log.json contains every attempt:

{
  "baseline_ms": 1842.3,
  "best_ms": 291.7,
  "speedup": 6.32,
  "iterations": [
    { "iteration": 1, "time_ms": 312.4, "speedup": 5.89, "correct": true, "query": "..." },
    ...
  ]
}

Options

Flag	Default	Description
`--query`	(required)	SQL string or `.sql` file path
`--db`	(required)	SQLite database path
`--iterations`	`10`	How many rewrites to attempt
`--model`	`llama-3.3-70b-versatile`	LLM model to use
`--provider`	(auto-detect)	`groq`, `openai`, `anthropic`, `ollama`, `together`, `openrouter`
`--base-url`	(none)	Custom OpenAI-compatible endpoint URL
`--min-speedup`	`0` (disabled)	Stop early once this speedup multiplier is reached (e.g. `3.0`)
`--timeout`	`0` (unlimited)	Per-execution timeout in milliseconds; aborts runaway queries

Extending

AutoSQL intentionally stays minimal. Easy extension points:

Other databases — swap sqlite3 for psycopg2, duckdb, etc. Only run_query() changes.
Other metrics — add memory, I/O reads, or query plan cost alongside wall-clock time.
Indexes — let the LLM propose CREATE INDEX statements alongside the query rewrite.
Batch mode — feed a folder of .sql files, get a report for each.

Requirements

Python 3.10+
openai (pip install openai) — covers Groq, OpenAI, Ollama, Together, OpenRouter
anthropic (pip install anthropic) — only for Claude/Anthropic models
At least one API key set as an environment variable (see Quickstart above)
SQLite database

Supported providers

Provider	`--provider`	Key env var	Notes
Groq	`groq`	`GROQ_API_KEY`	Default; free tier available
OpenAI	`openai`	`OPENAI_API_KEY`	GPT-4o, o3, etc.
Anthropic	`anthropic`	`ANTHROPIC_API_KEY`	Claude models; needs `pip install anthropic`
Ollama	`ollama`	(none)	Runs locally; no key required
Together AI	`together`	`TOGETHER_API_KEY`
OpenRouter	`openrouter`	`OPENROUTER_API_KEY`
Any OpenAI-compatible	`openai`	`OPENAI_API_KEY`	Pass `--base-url` to override endpoint

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
AutoSQL.py		AutoSQL.py
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AutoSQL

The Problem

Quickstart

Example output

What it optimizes

Correctness guarantee

Outputs

Options

Extending

Requirements

Supported providers

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AutoSQL

The Problem

Quickstart

Example output

What it optimizes

Correctness guarantee

Outputs

Options

Extending

Requirements

Supported providers

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages