33 changes: 23 additions & 10 deletions README.md
@@ -5,9 +5,9 @@
[![PyPI version](https://img.shields.io/pypi/v/qql-cli?color=blue&label=PyPI)](https://pypi.org/project/qql-cli/)
[![Python 3.12+](https://img.shields.io/pypi/pyversions/qql-cli)](https://pypi.org/project/qql-cli/)
[![MIT License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
[![Tests](https://img.shields.io/badge/tests-549%20passing-brightgreen)](tests/)
[![Tests](https://img.shields.io/badge/tests-635%20passing-brightgreen)](tests/)

Write `INSERT`, `SELECT`, `SEARCH`, `SCROLL`, `RECOMMEND`, `UPDATE`, `DELETE`, and `CREATE COLLECTION` statements instead of Python SDK calls. Supports hybrid dense+sparse vector search, grouped search (GROUP BY), cross-encoder reranking, quantization (scalar, turbo, binary, product), SQL-style `WHERE` filters, script execution, and collection dump/restore.
Write `INSERT`, `SELECT`, `SEARCH`, `SCROLL`, `RECOMMEND`, `UPDATE`, `DELETE`, and `CREATE COLLECTION` statements instead of Python SDK calls. Supports hybrid dense+sparse vector search, grouped search (GROUP BY), cross-encoder reranking, quantization (scalar, turbo, binary, product), SQL-style `WHERE` filters, script execution, collection dump/restore, async execution, gRPC transport, parameterized queries, and batched query execution.

```
qql> INSERT INTO COLLECTION notes VALUES {'text': 'Qdrant is a vector database', 'author': 'alice', 'year': 2024}
Expand Down Expand Up @@ -50,16 +50,23 @@ Your query string

When you run `INSERT`, the `text` field is automatically converted into a dense vector using [Fastembed](https://github.com/qdrant/fastembed). In **hybrid mode** (`USING HYBRID`), a sparse BM25 vector is also generated alongside the dense vector, and searches use Qdrant's Reciprocal Rank Fusion (RRF) by default to merge the results of both retrieval methods. You can switch hybrid search to DBSF with `FUSION 'dbsf'`.
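For intuition, the rank-fusion step can be sketched in plain Python. This illustrates the general RRF formula only; it is not QQL's or Qdrant's code. The function name `rrf_fuse`, the sample rankings, and the constant `k=60` (the value commonly quoted for RRF) are assumptions, and Qdrant's internal constant may differ.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. `k` is an illustrative default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda pid: (-scores[pid], str(pid)))


dense_hits = ["a", "b", "c"]   # hypothetical dense-vector ranking
sparse_hits = ["b", "d", "a"]  # hypothetical BM25 ranking
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

An ID that ranks well in both lists (here `b` and `a`) ends up ahead of IDs that appear in only one list, which is the behaviour hybrid search relies on.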

QQL also exposes a **programmatic API** for use inside Python applications — no CLI required:
QQL also exposes a **programmatic API** for use inside Python applications — no CLI required. Use `Connection` for sync code, `AsyncConnection` for async apps, and batch helpers when you want QQL to combine compatible operations into fewer Qdrant requests:

```python
from qql import Connection
from qql import Connection, QQLBatch

with Connection("http://localhost:6333") as conn:
    conn.run_query("INSERT INTO COLLECTION notes VALUES {'text': 'Qdrant is fast'}")
    result = conn.run_query("SEARCH notes SIMILAR TO 'vector database' LIMIT 5")
    for hit in result.data:
        print(hit["score"], hit["payload"])
    result = conn.run_parameterized_query(
        "SEARCH notes SIMILAR TO :query LIMIT 5",
        {"query": "vector database"},
    )

    with QQLBatch(conn) as batch:
        neurology = batch.add("SEARCH notes SIMILAR TO 'neurology' LIMIT 5")
        cardiology = batch.add("SEARCH notes SIMILAR TO 'cardiology' LIMIT 5")

    print(neurology.result.data, cardiology.result.data)
```

---
@@ -97,8 +104,8 @@ Full documentation lives in the [`docs/`](docs/) folder and at **[pavanjava.gith
| [SEARCH / SELECT / SCROLL / RECOMMEND / Hybrid / GROUP BY / RERANK](docs/search.md) | Semantic search, grouped search, point retrieval, pagination, hybrid, reranking, recommendations |
| [WHERE Filters](docs/filters.md) | Full SQL-style filter operators |
| [Collections & Quantization](docs/collections.md) | SHOW, CREATE, DROP, QUANTIZE (scalar/turbo/binary/product), CREATE INDEX, UPDATE VECTOR, UPDATE PAYLOAD |
| [Scripts: EXECUTE / DUMP](docs/scripts.md) | Script files, collection backup/restore |
| [Programmatic Usage](docs/programmatic.md) | Use QQL as a Python library via `Connection` or `run_query()` |
| [Scripts: EXECUTE / DUMP](docs/scripts.md) | Script files, `BEGIN BATCH` blocks, collection backup/restore |
| [Programmatic Usage](docs/programmatic.md) | Sync/async Python APIs, parameterized queries, batching, gRPC |
| [Reference: Models / Config / Errors](docs/reference.md) | Embedding models, config file, error reference |

---
@@ -170,6 +177,12 @@ DELETE FROM articles WHERE year < 2020
-- Scripts
EXECUTE /path/to/script.qql
DUMP articles /path/to/backup.qql

-- Batch block
BEGIN BATCH;
SEARCH articles SIMILAR TO 'query one' LIMIT 5;
SEARCH articles SIMILAR TO 'query two' LIMIT 5;
END BATCH
```

---
@@ -182,7 +195,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
pytest tests/ -v
```

Expected: **549 tests passing**.
Expected: **635 tests passing**.

---

11 changes: 9 additions & 2 deletions docs/getting-started.md
@@ -5,7 +5,7 @@ title: "Getting Started"

# Getting Started with QQL

QQL is a SQL-like query language and CLI for [Qdrant](https://qdrant.tech). Instead of writing Python SDK calls you write natural query statements to insert, search, manage, and delete vector data.
QQL is a SQL-like query language and CLI for [Qdrant](https://qdrant.tech). Instead of writing Python SDK calls you write natural query statements to insert, search, manage, and delete vector data. It can also be used as a sync or async Python library with batching, parameterized queries, and optional gRPC transport.

---

@@ -154,6 +154,12 @@ SHOW COLLECTION notes

-- Retrieve a point by ID
SELECT * FROM notes WHERE id = 1

-- Run compatible queries as one batch
BEGIN BATCH;
SEARCH notes SIMILAR TO 'vector databases' LIMIT 5;
SEARCH notes SIMILAR TO 'semantic search' LIMIT 5;
END BATCH
```

---
@@ -164,5 +170,6 @@ SELECT * FROM notes WHERE id = 1
- [SEARCH / SELECT / SCROLL / RECOMMEND / Hybrid / RERANK](search.md) — querying
- [WHERE Filters](filters.md) — payload filtering
- [Collections & Quantization](collections.md) — managing collections
- [Scripts: EXECUTE / DUMP](scripts.md) — automating with script files
- [Scripts: EXECUTE / DUMP](scripts.md) — automating with script files and batch blocks
- [Programmatic Usage](programmatic.md) — sync/async APIs, batching, parameterized queries, gRPC
- [Embedding Models](reference.md#embedding-models) — model reference
129 changes: 128 additions & 1 deletion docs/programmatic.md
@@ -16,6 +16,8 @@ single connection to Qdrant once and reuses it for every `run_query()` call —
more efficient than the legacy `run_query()` function, which creates a new
client on every invocation.

Use `AsyncConnection` when your application already runs on `asyncio`.

### Basic usage

```python
@@ -70,6 +72,22 @@ with Connection("https://<your-cluster>.qdrant.io", secret="<your-api-key>") as
print(result.data)
```

### gRPC transport

QQL can ask the Qdrant client to prefer gRPC for lower request overhead:

```python
from qql import Connection

with Connection(
    "http://localhost:6333",
    prefer_grpc=True,
    grpc_port=6334,
) as conn:
    result = conn.run_query("SHOW COLLECTIONS")
    print(result.data)
```

### Custom embedding model

```python
@@ -155,9 +173,117 @@ with Connection("http://localhost:6333") as conn:
| `url` | `str` | `"http://localhost:6333"` | Qdrant instance URL |
| `secret` | `str \| None` | `None` | API key; `None` for unauthenticated |
| `default_model` | `str \| None` | `None` → `sentence-transformers/all-MiniLM-L6-v2` | Dense embedding model used when no `USING MODEL` clause is given |
| `prefer_grpc` | `bool` | `False` | Passes `prefer_grpc=True` to the Qdrant client |
| `grpc_port` | `int` | `6334` | gRPC port used when `prefer_grpc=True` |
| `default_dense_vector_name` | `str` | `"dense"` | Dense vector name used when QQL creates a collection and no explicit `USING VECTOR` name is given |
| `default_sparse_vector_name` | `str` | `"sparse"` | Sparse vector name used when QQL creates a hybrid collection and no explicit sparse vector name is given |

---

## Parameterized Queries

Parameterized helpers render `:name` placeholders before parsing the QQL statement. String values are quoted and escaped; booleans are rendered as `true` / `false`.

```python
from qql import Connection

with Connection("http://localhost:6333") as conn:
    result = conn.run_parameterized_query(
        "SEARCH notes SIMILAR TO :query LIMIT 5 WHERE author = :author",
        {"query": "vector database", "author": "alice"},
    )

    results = conn.run_parameterized_batch(
        "SEARCH notes SIMILAR TO :query LIMIT 5 WHERE category = :category",
        [
            {"query": "brain stroke", "category": "Neurology"},
            {"query": "heart attack", "category": "Cardiology"},
        ],
    )
```

Parameterized queries are a convenience for building QQL strings safely in application code; they are not sent to Qdrant as server-side prepared statements.
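As a rough illustration of client-side rendering, the following sketch substitutes `:name` placeholders with quoted literals. It is not QQL's implementation: the helper names `render_value` and `render_template` are hypothetical, and the backslash-escaping rule for quotes is an assumption, since the exact escaping QQL applies is not shown here.

```python
import re


def render_value(value):
    """Render one Python value as a QQL-style literal (illustrative rules)."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Assumption: backslash-escape backslashes and single quotes.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        return f"'{escaped}'"
    raise TypeError(f"unsupported parameter type: {type(value).__name__}")


def render_template(template, params):
    """Replace each :name placeholder with its rendered literal."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing parameter: {name}")
        return render_value(params[name])

    return re.sub(r":([A-Za-z_][A-Za-z0-9_]*)", substitute, template)


rendered = render_template(
    "SEARCH notes SIMILAR TO :query LIMIT 5 WHERE author = :author",
    {"query": "it's fast", "author": "alice"},
)
print(rendered)
```

This naive regex would also match a `:name` that happens to sit inside a quoted literal in the template; a production renderer would need to skip those.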

---

## Batch Execution

`run_queries_batch()` parses multiple QQL strings into a `BatchBlockStmt`. The executor groups compatible statements:

- compatible `SEARCH` / `RECOMMEND` statements use Qdrant `query_batch_points`
- compatible `INSERT` statements become one `INSERT BULK`
- mixed or incompatible statements still execute in order

```python
from qql import Connection

with Connection("http://localhost:6333") as conn:
    results = conn.run_queries_batch([
        "SEARCH docs SIMILAR TO 'neurology' LIMIT 5",
        "SEARCH docs SIMILAR TO 'cardiology' LIMIT 5",
    ])

    for result in results:
        print(result.message)
```
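The grouping rules listed above can be pictured with a small standalone sketch. It assumes that "compatible" means consecutive statements of the same kind against the same collection, and it only recognises `SEARCH` and `INSERT`; QQL's real criteria may be stricter. The helpers `statement_key` and `group_statements` are hypothetical names, not part of the QQL API.

```python
from itertools import groupby


def statement_key(query):
    """Return (kind, collection) for a QQL string, or None if not batchable.

    Assumption: only SEARCH and INSERT are grouped in this sketch.
    """
    words = query.split()
    if not words:
        return None
    kind = words[0].upper()
    if kind == "SEARCH" and len(words) > 1:
        # SEARCH <collection> SIMILAR TO ...
        return ("SEARCH", words[1])
    if kind == "INSERT" and len(words) > 3:
        # INSERT INTO COLLECTION <collection> VALUES ...
        return ("INSERT", words[3])
    return None


def group_statements(queries):
    """Group consecutive compatible statements, preserving order."""
    groups = []
    for key, run in groupby(queries, key=statement_key):
        run = list(run)
        if key is None:
            # Non-batchable statements still run one at a time, in order.
            groups.extend([stmt] for stmt in run)
        else:
            groups.append(run)
    return groups


queries = [
    "SEARCH docs SIMILAR TO 'neurology' LIMIT 5",
    "SEARCH docs SIMILAR TO 'cardiology' LIMIT 5",
    "SHOW COLLECTIONS",
    "INSERT INTO COLLECTION notes VALUES {'text': 'a'}",
    "INSERT INTO COLLECTION notes VALUES {'text': 'b'}",
]
groups = group_statements(queries)
print([len(group) for group in groups])
```

Each multi-statement group corresponds to one request in the batched model, while single-statement groups fall back to ordinary in-order execution.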

For ergonomic batching in application code, use `QQLBatch`:

```python
from qql import Connection, QQLBatch

with Connection("http://localhost:6333") as conn:
    with QQLBatch(conn) as batch:
        neuro = batch.add("SEARCH docs SIMILAR TO 'neurology' LIMIT 5")
        cardio = batch.add("SEARCH docs SIMILAR TO 'cardiology' LIMIT 5")

    print(neuro.result.data)
    print(cardio.result.data)
```

Each proxy's `.result` becomes available after the context manager exits.
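The deferred-result behaviour can be sketched with a generic proxy pattern. The classes `DeferredResult` and `MiniBatch` below are hypothetical stand-ins, not QQL classes, and raising `RuntimeError` is an assumption; only the error message text is taken from the QQL error reference.

```python
class DeferredResult:
    """Placeholder whose value is filled in when the batch executes."""

    _UNSET = object()

    def __init__(self):
        self._value = self._UNSET

    @property
    def result(self):
        if self._value is self._UNSET:
            raise RuntimeError("Batch has not been executed yet.")
        return self._value


class MiniBatch:
    """Collect queries inside a with-block and run them all on exit."""

    def __init__(self, run_many):
        self._run_many = run_many  # callable: list of queries -> list of results
        self._queries = []
        self._proxies = []

    def add(self, query):
        proxy = DeferredResult()
        self._queries.append(query)
        self._proxies.append(proxy)
        return proxy

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            results = self._run_many(self._queries)
            for proxy, value in zip(self._proxies, results):
                proxy._value = value
        return False  # never swallow exceptions


# Stand-in for a real connection: upper-case each query string.
with MiniBatch(lambda qs: [q.upper() for q in qs]) as batch:
    first = batch.add("search a")
    try:
        first.result
        early_access_failed = False
    except RuntimeError:
        early_access_failed = True

print(early_access_failed, first.result)
```

Reading `.result` inside the block fails because nothing has run yet; the same read succeeds once the block exits and the collected queries have executed.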

---

## Async API

`AsyncConnection` mirrors the sync API for `asyncio` applications and uses `AsyncQdrantClient` under the hood.

```python
import asyncio

from qql import AsyncConnection


async def main():
    async with AsyncConnection("http://localhost:6333") as conn:
        await conn.run_query(
            "INSERT INTO COLLECTION notes VALUES {'text': 'async QQL'}"
        )
        result = await conn.run_query(
            "SEARCH notes SIMILAR TO 'async vector search' LIMIT 5"
        )
        print(result.data)


asyncio.run(main())
```

Async batching and parameterized helpers are also available:

```python
import asyncio

from qql import AsyncConnection, QQLAsyncBatch


async def main():
    async with AsyncConnection("http://localhost:6333", prefer_grpc=True) as conn:
        result = await conn.run_parameterized_query(
            "SEARCH docs SIMILAR TO :query LIMIT 5",
            {"query": "clinical notes"},
        )

        async with QQLAsyncBatch(conn) as batch:
            first = batch.add("SEARCH docs SIMILAR TO 'neurology' LIMIT 5")
            second = batch.add("SEARCH docs SIMILAR TO 'cardiology' LIMIT 5")

        print(first.result.data, second.result.data)


asyncio.run(main())
```

The async executor preserves the same `ExecutionResult` shape as the sync executor.

---

### Power-user: `executor` property

For low-level access to the pipeline, use `conn.executor` directly:
@@ -250,7 +376,8 @@ class ExecutionResult:
|---|---|
| INSERT (dense) | `{"id": int \| "<uuid>", "collection": "<name>"}` |
| INSERT (hybrid) | `{"id": int \| "<uuid>", "collection": "<name>"}` |
| INSERT BULK | `None` (count in `result.message`) |
| INSERT BULK | `{"ids": [int \| "<uuid>", ...]}` |
| BEGIN BATCH / programmatic batch | `[ExecutionResult, ...]` |
| SELECT | `{"id": str, "payload": dict}` or `None` when not found |
| SEARCH | `[{"id": str, "score": float, "payload": dict}, ...]` |
| SCROLL | `{"points": [{"id": str, "payload": dict}, ...], "next_offset": str \| int \| None}` |
39 changes: 34 additions & 5 deletions docs/reference.md
@@ -5,7 +5,7 @@ title: "Reference"

# Reference — Models, Config, Project Structure, Errors

Default embedding models, configuration parameters, project layout, and common error codes for troubleshooting.
Default embedding models, configuration parameters, public APIs, project layout, and common error codes for troubleshooting.

---

@@ -147,30 +147,56 @@ You can edit this file directly to change the default model without reconnecting

---

## Public Python API

| API | Description |
|---|---|
| `Connection` | Stateful sync QQL client backed by `QdrantClient` |
| `AsyncConnection` | Stateful async QQL client backed by `AsyncQdrantClient` |
| `QQLBatch` | Sync context manager for collecting statements and resolving per-statement results after execution |
| `QQLAsyncBatch` | Async context manager equivalent of `QQLBatch` |
| `Executor` | Low-level sync AST executor |
| `AsyncExecutor` | Low-level async AST executor |
| `ExecutionResult` | Standard result object returned by all operations |

Both sync and async connections support:

- `run_query(query)`
- `run_queries_batch([query, ...])`
- `run_parameterized_query(template, params)`
- `run_parameterized_batch(template, [params, ...])`
- `prefer_grpc=True` and `grpc_port=<port>` connection options

---

## Project Structure

```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language to fenced code block for lint compliance.

Line 174 opens a fenced block without a language, which trips MD040 and can fail docs linting pipelines.

Proposed fix
-```
+```text
 qql/
 ├── pyproject.toml          # Package config; installs the `qql` CLI command
 ...
-```
+```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 174-174: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference.md` at line 174, Add a language identifier to the opening
fenced code block so the markdown linter (MD040) passes: change the opening ```
to ```text (i.e., update the fenced block that currently starts with ``` before
the listing "qql/" to begin with ```text) so the code block has an explicit
language.

qql/
├── pyproject.toml          # Package config; installs the `qql` CLI command
├── src/
│   └── qql/
│       ├── __init__.py          # Public API: Connection, run_query()
│       ├── __init__.py          # Public API exports: sync, async, batching, parser/executor
│       ├── cli.py               # CLI entry point: connect, disconnect, execute, dump, REPL
│       ├── config.py            # QQLConfig dataclass + ~/.qql/config.json I/O
│       ├── connection.py        # Connection class — stateful programmatic API
│       ├── connection.py        # Sync Connection, QQLBatch, parameterized query helpers
│       ├── async_connection.py  # AsyncConnection and QQLAsyncBatch
│       ├── exceptions.py        # QQLError, QQLSyntaxError, QQLRuntimeError
│       ├── lexer.py             # Tokenizer: string → List[Token]
│       ├── ast_nodes.py         # Frozen dataclasses for each statement and filter type
│       ├── parser.py            # Recursive descent parser: tokens → AST node
│       ├── embedder.py          # Embedder (dense) + SparseEmbedder (BM25) + CrossEncoderEmbedder (rerank)
│       ├── executor.py          # AST node → Qdrant client call + filter + hybrid search
│       ├── executor.py          # Sync AST node → Qdrant client call
│       ├── async_executor.py    # Async AST node → AsyncQdrantClient call
│       ├── utils.py             # Shared pure helpers for parsing, filters, batching, vectors
│       ├── script.py            # Script runner: parse and execute .qql files statement by statement
│       └── dumper.py            # Collection exporter: scroll all points → .qql INSERT BULK script
└── tests/
    ├── test_lexer.py             # Tokenizer unit tests
    ├── test_parser.py            # Parser unit tests
    ├── test_executor.py          # Executor unit tests (mocked Qdrant client)
    ├── test_connection.py        # Connection class unit tests (mocked Qdrant client)
    ├── test_async_connection.py  # AsyncConnection / AsyncExecutor tests
    ├── test_script.py            # Script runner unit tests
    └── test_dumper.py            # Dumper unit tests
```
@@ -185,7 +211,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
pytest tests/ -v
```

Expected output: **604 tests passing**.
Expected output: **635 tests passing**.

---

@@ -218,3 +244,6 @@ Expected output: **604 tests passing**.
| `Unknown index type '...'` | Invalid schema type in CREATE INDEX | Use one of: `keyword`, `integer`, `float`, `bool`, `text`, `geo`, `datetime`, `uuid` |
| `Unknown CREATE INDEX option '...'` | Unsupported advanced option for the chosen payload index type | Check which `WITH { ... }` keys are supported for `keyword`, `uuid`, or `text` |
| `Qdrant error during CREATE INDEX: ...` | Qdrant rejected the index creation | Check field name and collection state |
| `Unterminated batch block; expected END BATCH` | A `BEGIN BATCH` block was not closed | Add `END BATCH` at the end of the block |
| `Batch has not been executed yet.` | Read a `QQLBatch` proxy result before leaving the context manager | Access `.result` only after the `with QQLBatch(...)` block exits |
| `AsyncBatch has not been executed yet.` | Read a `QQLAsyncBatch` proxy result before leaving the async context manager | Access `.result` only after the `async with QQLAsyncBatch(...)` block exits |