feat: add async execution and batched programmatic APIs by srimon12 · Pull Request #45 · pavanjava/qql

srimon12 · 2026-05-24T14:02:03Z

Summary

This PR upgrades QQL from a primarily sync, one-query-at-a-time execution layer into a stronger application-facing client API.

It adds async execution, sync/async batching, parameterized query helpers, optional gRPC transport, and a shared utility layer that reduces duplicated executor/parser logic.

What Changed

Programmatic API

Added AsyncConnection for async applications using AsyncQdrantClient
Added AsyncExecutor to mirror the sync executor behavior with async / await
Added QQLBatch and QQLAsyncBatch context managers
Added batch helpers:
- run_queries_batch()
- run_parameterized_batch()
Added parameterized query helpers:
- run_parameterized_query()
- run_parameterized_batch()
Added optional gRPC connection support:
- prefer_grpc
- grpc_port

QQL Language

Added BEGIN BATCH ... END BATCH syntax
Added semicolon token support for batch blocks
Updated script splitting so batch blocks are treated as one executable statement

Executor / Parser Internals

Added BatchBlockStmt
Added grouped batch execution for compatible statements
Combined compatible inserts into bulk insert execution
Combined compatible search/recommend statements through Qdrant batch query APIs
Moved shared pure logic into qql.utils, including:
- filter conversion
- dense query shaping
- vector payload shaping
- topology parsing helpers
- batch grouping
- search parser helpers
- parameter rendering
Integrated the latest collection topology improvements from PR fix for the issue#42 #44

Tests and Docs

Added async connection/executor coverage
Added tests for batching, parameterized queries, gRPC wiring, and async lifecycle behavior
Updated docs for async usage, batching, parameterized queries, gRPC, and batch-block syntax

Why

These changes make QQL more useful inside real Python applications, not only as a CLI.

Before this, repeated programmatic usage was mostly centered around one statement at a time. This PR adds the pieces needed for higher-throughput and async workloads:

reuse one connection across many queries
run multiple compatible queries with fewer Qdrant round trips
use QQL from async services
build query templates with parameters
opt into gRPC transport where appropriate
keep sync and async behavior aligned through shared helpers

Validation

uv run pytest -q
Result: 635 passed

Summary by CodeRabbit

New Features
- Batch query execution with BEGIN BATCH...END BATCH syntax for grouped operations.
- Async Python API with AsyncConnection for asyncio-based applications.
- gRPC transport support via prefer_grpc option.
- Parameterized queries with template-based named parameters.
Documentation
- Expanded programmatic API guide covering sync/async clients, batching, and parameterized queries.
- Added gRPC configuration and batching examples.
Tests
- Increased test coverage to 635 tests (from 549).

Add async QQL support with AsyncConnection and AsyncExecutor, plus sync and async batching helpers for running multiple statements through one programmatic API. Introduce BEGIN BATCH syntax, parameterized query helpers, and optional gRPC connection settings. Refactor shared parser and executor logic into qql.utils so sync and async paths can reuse filter conversion, vector shaping, topology parsing, batch grouping, and search parsing helpers. Update tests and docs for async usage, batching, parameterized queries, gRPC configuration, and batch block execution.

coderabbitai · 2026-05-24T14:02:15Z

📝 Walkthrough

Walkthrough

This pull request implements comprehensive batch query execution and asynchronous support for QQL. The changes enable BEGIN BATCH ... END BATCH syntax for grouped operations, introduce AsyncConnection and AsyncExecutor for async workloads with optional gRPC transport, add parameterized query templating, extract shared utility functions used across both sync and async paths, and expand the public API surface with all new client types.

Changes

Batch Execution & Async Support

Layer / File(s)	Summary
Language Foundation: Lexer, Parser, AST, and Script Splitting `src/qql/lexer.py`, `src/qql/parser.py`, `src/qql/ast_nodes.py`, `src/qql/script.py`	Lexer emits BEGIN/BATCH/END/SEMICOLON tokens; parser routes BEGIN to _parse_batch_block which parses semicolon-separated statements until END BATCH; AST includes new BatchBlockStmt node; script splitting tracks batch_depth to handle nested batch blocks correctly.
Shared Utilities: Batching, Filtering, and Search Parsing `src/qql/utils.py`	New module provides BatchGroup and search option dataclasses, batch statement grouping logic, parameterized query rendering, filter/Qdrant conversion, hybrid fusion and MMR handling, and search clause parsing (LOOKUP, USING, WITH, GROUP BY) extracted for reuse by both executors.
Sync Executor: Refactoring and Batch Support `src/qql/executor.py`	Executor refactored to delegate vector/filter/search helpers to utils; supports BatchBlockStmt via _execute_batch_block and _execute_query_batch; INSERT BULK returns {ids: [...]} instead of single count; collection/topology handling simplified via imported utilities.
Sync Connection & Batch APIs: gRPC, Batching, and Proxy Objects `src/qql/connection.py`	Connection constructor adds prefer_grpc and grpc_port for optional gRPC transport; implements run_queries_batch, run_parameterized_query, and run_parameterized_batch methods; introduces QQLBatch context manager and OperationProxy for deferred batch execution.
Async Executor: Full Async Implementation for All Operations `src/qql/async_executor.py`	Complete AsyncExecutor supporting all collection and data operations (insert/delete/update/search/recommend/batch) with async Qdrant client, hybrid/dense/sparse search modes, GROUP BY support, optional reranking, and concurrent collection creation coordination via async lock.
Async Connection: Async Context Manager and Batch APIs `src/qql/async_connection.py`	AsyncConnection constructs AsyncQdrantClient and AsyncExecutor, exposes async run_query and batching methods, supports async context manager lifecycle, and provides config/executor properties. QQLAsyncBatch context manager queues and executes batches via async run_queries_batch, resolving AsyncOperationProxy instances.
Public API Exports `src/qql/__init__.py`	Expands all to export AsyncConnection, AsyncExecutor, QQLBatch, QQLAsyncBatch, OperationProxy, async variants, error types, parser/lexer, and run_query convenience function.
Documentation: User Guides and API Reference `README.md`, `docs/getting-started.md`, `docs/programmatic.md`, `docs/reference.md`, `docs/scripts.md`	Updates badges (549→635 tests), README examples showing parameterized queries and QQLBatch, Getting Started batch example, Programmatic Usage documentation for AsyncConnection and gRPC, Reference section for public API, and Scripts documentation for batch block execution semantics.
Tests: Async Connection and Batching Validation `tests/test_async_connection.py`, `tests/test_connection.py`	New comprehensive AsyncConnection test suite covering initialization, lifecycle, batching, parameterized rendering, and proxy behavior. New TestConnectionBatching suite validates gRPC parameters, run_queries_batch, parameterized batch/query, and QQLBatch context manager; includes architectural tests for batch grammar strictness, ID propagation, and concurrent collection creation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

pavanjava

Poem

🐰 Hark! Batch blocks now bloom in QQL's domain,
With BEGIN...END, async queries reign—
Async/sync twins dance with gRPC grace,
Parameterized queries find their place,
One 635-test suite shows the gain! 🚀

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 26.52% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat: add async execution and batched programmatic APIs' accurately and concisely summarizes the main additions (async execution, batching, and programmatic APIs) without being misleading or vague.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/reference.md`:
- Line 174: Add a language identifier to the opening fenced code block so the
markdown linter (MD040) passes: change the opening ``` to ```text (i.e., update
the fenced block that currently starts with ``` before the listing "qql/" to
begin with ```text) so the code block has an explicit language.

In `@src/qql/async_connection.py`:
- Line 193: The batch resolver currently pairs proxies and results with
zip(self._proxies, results) in async_connection.py (and the analogous loop in
connection.py), which silently drops unmatched proxies or results; update the
post-run handling in the async method that calls run_queries_batch(...) to
explicitly compare len(results) to len(self._proxies) and, if they differ, raise
a clear RuntimeError (or set exception on each unmatched Proxy via
Proxy.set_exception) instead of silently leaving proxies unresolved; for matched
entries continue to call proxy.set_result(res) as before, and mirror the same
length-check and unmatched-proxy exception-handling logic in the sync batch
context manager in connection.py so all proxies are either resolved or fail
deterministically.

In `@src/qql/async_executor.py`:
- Around line 293-301: The code currently calls build_dense_point_vector (and
similarly build sparse vectors/PointStruct.vector) using a topology captured
before calling self._ensure_collection, which can race with another coroutine
that creates the collection with different vector names; change the flow so that
you first call self._ensure_collection(...) (and any hybrid creation lock), then
refresh/re-resolve the topology from the collection metadata that was actually
created/returned, and only after that call build_dense_point_vector (and the
corresponding sparse/vector construction) using the refreshed topology; apply
the same pattern for the other occurrences that build vectors (the blocks using
build_dense_point_vector and setting PointStruct.vector at the other mentioned
sites).

In `@src/qql/parser.py`:
- Around line 73-76: The parser currently calls parse() ->
_parse_single_statement() and immediately expects TokenKind.EOF, rejecting
trailing SEMICOLON tokens; update parse() to allow and consume any number of
trailing TokenKind.SEMICOLON tokens after the node is parsed (e.g., loop while
next token is TokenKind.SEMICOLON call the existing token-consumption helper
such as self._expect/_advance/_consume), then call self._expect(TokenKind.EOF)
and return the node so top-level statements like "SHOW COLLECTIONS;" succeed.

In `@src/qql/utils.py`:
- Around line 80-93: The function render_parameterized_query currently
serializes parameters via str() and inadequate escaping, causing None to become
"None" and backslashes/control chars to be unescaped; update
render_parameterized_query to: map None to the QQL NULL literal, escape strings
for single quotes and backslashes and encode common control sequences (e.g., \n,
\t, \\) so they are preserved as literals, format booleans as "true"/"false" and
numbers via their canonical representation, and then replace placeholders (keep
the current placeholder logic and sorted key order) with these properly
serialized QQL literals.
- Around line 160-170: The batching logic currently treats every SearchStmt as a
plain "query" which strips group semantics; change the branch so a SearchStmt is
only added to the "query" batch when it has no grouping (i.e., stmt.group_by and
stmt.group_size are falsy). For SearchStmt instances with group_by/group_size,
flush the current group via _append_batch_group and start a new group that is
not the plain "query" path (e.g., set current_type to a distinct token like
"grouped_search" and current_group = [stmt]) so
_execute_query_batch/run_queries_batch won’t treat it as a flat QueryRequest;
keep the existing behavior for RecommendStmt and non-grouped SearchStmt.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 20d846a7-d3f5-4169-a380-a106ae17a1cb

📥 Commits

Reviewing files that changed from the base of the PR and between dbb6713 and 8a066e1.

📒 Files selected for processing (17)

README.md
docs/getting-started.md
docs/programmatic.md
docs/reference.md
docs/scripts.md
src/qql/__init__.py
src/qql/ast_nodes.py
src/qql/async_connection.py
src/qql/async_executor.py
src/qql/connection.py
src/qql/executor.py
src/qql/lexer.py
src/qql/parser.py
src/qql/script.py
src/qql/utils.py
tests/test_async_connection.py
tests/test_connection.py

coderabbitai · 2026-05-24T14:09:22Z

+
 ## Project Structure

 ```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add language to fenced code block for lint compliance.

Line 174 opens a fenced block without a language, which trips MD040 and can fail docs linting pipelines.

Proposed fix

-``` +```text qql/ ├── pyproject.toml # Package config; installs the `qql` CLI command ... -``` +```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

```

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 174-174: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference.md` at line 174, Add a language identifier to the opening fenced code block so the markdown linter (MD040) passes: change the opening ``` to ```text (i.e., update the fenced block that currently starts with ``` before the listing "qql/" to begin with ```text) so the code block has an explicit language.

coderabbitai · 2026-05-24T14:09:22Z

+        if not self._queries:
+            return
+        results = await self.connection.run_queries_batch(self._queries)
+        for proxy, res in zip(self._proxies, results):


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Inspect the relevant section in src/qql/async_connection.py sed -n '150,230p' src/qql/async_connection.py | nl -ba | sed -n '1,120p' # 2) Find where `_proxies` is populated/used and where `results` comes from rg -n "_proxies|results" src/qql/async_connection.py # 3) Check Python version / ruff config for zip strict support expectations ls -la fd -t f "pyproject.toml" . -x printf "%s\n" {} fd -t f "*ruff*" . -x printf "%s\n" {} # Show python requires and ruff settings if present if [ -f pyproject.toml ]; then rg -n "requires-python|python|target-version|ruff" pyproject.toml || true sed -n '1,200p' pyproject.toml fi

Repository: pavanjava/qql

Length of output: 100

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Inspect the relevant section in src/qql/async_connection.py # Prefer cat -n (available) over nl if [ ! -f src/qql/async_connection.py ]; then echo "Missing src/qql/async_connection.py" exit 1 fi sed -n '170,220p' src/qql/async_connection.py | cat -n # 2) Locate the exact zip usage and surrounding context (more precise) rg -n "zip$self\._proxies,\s*results$" src/qql/async_connection.py rg -n "zip\(" src/qql/async_connection.py # 3) Find where `_proxies` is defined/populated and where `results` comes from rg -n "_proxies" src/qql/async_connection.py rg -n "results" src/qql/async_connection.py # 4) Check Python version / ruff config for strict zip expectations ls -la fd -t f "pyproject.toml" . -x printf "%s\n" {} || true fd -t f "*ruff*" . -x printf "%s\n" {} || true if [ -f pyproject.toml ]; then echo "---- pyproject.toml (head) ----" sed -n '1,220p' pyproject.toml | cat -n echo "---- ruff/python target hints ----" rg -n "requires-python|target-version|python-version|ruff" pyproject.toml || true fi

Repository: pavanjava/qql

Length of output: 6113

🏁 Script executed:

#!/bin/bash set -euo pipefail # Find AsyncConnection and its run_queries_batch implementation rg -n "class AsyncConnection|def run_queries_batch" -S src # Show the implementation of run_queries_batch (and any helpers it calls) rg -n "run_queries_batch" -S src # Inspect AsyncConnection around the discovered definition # (Use file/line reporting first, then print a local window.) matches=$(rg -n "def run_queries_batch" -S src | head -n 5) echo "$matches" file=$(echo "$matches" | awk -F: '{print $1}' | head -n 1) line=$(echo "$matches" | awk -F: '{print $2}' | head -n 1) echo "Printing $file around line $line" sed -n "$((line-40)),$((line+120))p" "$file" | cat -n # Check ruff config for B905 (zip without strict) if [ -f ruff.toml ]; then rg -n "B905|zip|strict" ruff.toml pyproject.toml || true sed -n '1,200p' ruff.toml | cat -n fi

Repository: pavanjava/qql

Length of output: 9098

🏁 Script executed:

#!/bin/bash set -euo pipefail # Locate BatchBlockStmt definition and executor handling rg -n "BatchBlockStmt" -S src/qql # Inspect AST node definition files=$(rg -l "class BatchBlockStmt" -S src/qql) echo "$files" for f in $files; do echo "---- $f ----" # print small window around class line=$(rg -n "class BatchBlockStmt" "$f" | head -n 1 | awk -F: '{print $2}') start=$((line-30)) end=$((line+120)) sed -n "${start},${end}p" "$f" | cat -n done # Inspect Executor.execute branch/handling for BatchBlockStmt executor_files=$(rg -l "def execute\(" -S src/qql) echo "Executor-related files:" echo "$executor_files" | head -n 20 # Search specifically for BatchBlockStmt inside executor implementation for f in $executor_files; do if rg -q "BatchBlockStmt" "$f"; then echo "---- executor handling in $f ----" # show relevant region rg -n "BatchBlockStmt" "$f" # print around first occurrence occ_line=$(rg -n "BatchBlockStmt" "$f" | head -n 1 | awk -F: '{print $2}') sed -n "$((occ_line-80)),$((occ_line+160))p" "$f" | cat -n fi done # Also inspect ExecutionResult.data definition / type hints res_files=$(rg -l "class ExecutionResult|ExecutionResult" -S src/qql) echo "---- ExecutionResult files ----" echo "$res_files" | head -n 20 for f in $(echo "$res_files" | head -n 5); do echo "---- $f (head) ----" sed -n '1,220p' "$f" | cat -n done

Repository: pavanjava/qql

Length of output: 1243

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Inspect BatchBlockStmt sed -n '320,430p' src/qql/ast_nodes.py | cat -n # 2) Inspect sync executor batch execution sed -n '1600,1760p' src/qql/executor.py | cat -n rg -n "def _execute_batch_block" src/qql/executor.py rg -n "BatchBlockStmt" src/qql/executor.py | head -n 20 # 3) Inspect async executor batch execution sed -n '1160,1260p' src/qql/async_executor.py | cat -n rg -n "def _execute_batch_block" src/qql/async_executor.py rg -n "BatchBlockStmt" src/qql/async_executor.py | head -n 20 # 4) Inspect ExecutionResult structure (especially `.data`) rg -n "class ExecutionResult" -S src/qql res_file=$(rg -l "class ExecutionResult" -S src/qql | head -n 1 || true) echo "ExecutionResult file: ${res_file:-<none>}" if [ -n "${res_file:-}" ]; then sed -n '1,220p' "$res_file" | cat -n fi

Repository: pavanjava/qql

Length of output: 23074

🏁 Script executed:

#!/bin/bash set -euo pipefail # 1) Find group_batch_statements + inserted_point_results in utils rg -n "def group_batch_statements|def inserted_point_results" -S src/qql # 2) Print those helper implementations utils_file=$(rg -l "def group_batch_statements" -S src/qql | head -n 1) echo "utils_file=$utils_file" sed -n '1,260p' "$utils_file" | cat -n # narrower windows around the functions (avoid huge output) gline=$(rg -n "def group_batch_statements" "$utils_file" | head -n 1 | awk -F: '{print $2}') iline=$(rg -n "def inserted_point_results" "$utils_file" | head -n 1 | awk -F: '{print $2}') sed -n "$((gline-80)),$((gline+200))p" "$utils_file" | cat -n sed -n "$((iline-80)),$((iline+240))p" "$utils_file" | cat -n # 3) Inspect _execute_query_batch return shape rg -n "def _execute_query_batch" -S src/qql/executor.py src/qql/async_executor.py # Print _execute_query_batch in sync executor enough to see how it builds its result list sync_file="src/qql/executor.py" qline=$(rg -n "def _execute_query_batch" "$sync_file" | head -n 1 | awk -F: '{print $2}') sed -n "$((qline-40)),$((qline+240))p" "$sync_file" | cat -n # 4) Inspect async executor _execute_query_batch similarly (shorter window) async_file="src/qql/async_executor.py" aline=$(rg -n "async def _execute_query_batch" "$async_file" | head -n 1 | awk -F: '{print $2}') sed -n "$((aline-40)),$((aline+260))p" "$async_file" | cat -n

Repository: pavanjava/qql

Length of output: 10007

🏁 Script executed:

#!/bin/bash set -euo pipefail # Sync: inspect full _execute_query_batch body boundaries and key return points sync_file="src/qql/executor.py" sync_start=$(rg -n "def _execute_query_batch" "$sync_file" | head -n 1 | cut -d: -f1) sync_next=$(awk -v start="$sync_start" 'NR>start && /^ def / {print NR; exit}' "$sync_file" || true) echo "sync_start=$sync_start sync_next=${sync_next:-<none>}" # Print from start through (start..start+260) and also around first return/all_results usage sed -n "${sync_start},$((sync_start+320))p" "$sync_file" | cat -n | sed -n '1,220p' rg -n "all_results|return .*res|return .*results|return .*ExecutionResult" -S "$sync_file" | head -n 50 # Async: inspect full _execute_query_batch boundaries async_file="src/qql/async_executor.py" async_start=$(rg -n "async def _execute_query_batch" "$async_file" | head -n 1 | cut -d: -f1) async_next=$(awk -v start="$async_start" 'NR>start && /^ async def / {print NR; exit}' "$async_file" || true) echo "async_start=$async_start async_next=${async_next:-<none>}" sed -n "${async_start},$((async_start+320))p" "$async_file" | cat -n | sed -n '1,220p' rg -n "all_results|return .*res|return .*results|return .*ExecutionResult" -S "$async_file" | head -n 50

Repository: pavanjava/qql

Length of output: 23181

🏁 Script executed:

#!/bin/bash set -euo pipefail # Check ruff B905 config (whether it's selected and whether any ignores exist) sed -n '1,120p' ruff.toml | cat -n rg -n "B905" ruff.toml pyproject.toml || true # Find other zip usages to see if strict zip is commonly used elsewhere rg -n "zip\(" src/qql | head -n 50 # Specifically find other proxy resolution zips rg -n "zip\(self\._proxies" -S src/qql

Repository: pavanjava/qql

Length of output: 1799

Guard against silent proxy/result mismatch in batch resolution.

src/qql/async_connection.py resolves proxies with zip(self._proxies, results) without length checking; if run_queries_batch(...) yields fewer results than queued statements, trailing proxies never get resolved (their .result will later raise). The query-batch execution path also doesn’t validate response length vs requests.

Also present in src/qql/connection.py for the sync batch context manager.

Proposed fix

- for proxy, res in zip(self._proxies, results): + for proxy, res in zip(self._proxies, results, strict=True): proxy._resolve(res)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

for proxy, res in zip(self._proxies, results):

for proxy, res in zip(self._proxies, results, strict=True):

proxy._resolve(res)

🧰 Tools

🪛 Ruff (0.15.13)

[warning] 193-193: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/qql/async_connection.py` at line 193, The batch resolver currently pairs proxies and results with zip(self._proxies, results) in async_connection.py (and the analogous loop in connection.py), which silently drops unmatched proxies or results; update the post-run handling in the async method that calls run_queries_batch(...) to explicitly compare len(results) to len(self._proxies) and, if they differ, raise a clear RuntimeError (or set exception on each unmatched Proxy via Proxy.set_exception) instead of silently leaving proxies unresolved; for matched entries continue to call proxy.set_result(res) as before, and mirror the same length-check and unmatched-proxy exception-handling logic in the sync batch context manager in connection.py so all proxies are either resolved or fail deterministically.

coderabbitai · 2026-05-24T14:09:22Z

+        await self._ensure_collection(
+            node.collection, len(vector), topology, node.dense_vector
+        )
+        point_vector = build_dense_point_vector(
+            topology,
+            vector,
+            node.dense_vector,
+            self._default_dense_vector_name(),
+        )


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Re-resolve topology before shaping vectors after the create race.

These branches build point_vector / PointStruct.vector from the pre-lock topology, then _ensure_collection() or the hybrid creation lock can discover that another coroutine already created the collection with a different single named vector. In that race, the upsert still uses the stale vector key and will write to the wrong name or fail.

A safer pattern is: resolve or create the collection first, refresh the topology that actually won, and only then build the point vectors from the resolved dense/sparse names.

Also applies to: 355-367, 370-386, 413-418, 423-425

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/qql/async_executor.py` around lines 293 - 301, The code currently calls build_dense_point_vector (and similarly build sparse vectors/PointStruct.vector) using a topology captured before calling self._ensure_collection, which can race with another coroutine that creates the collection with different vector names; change the flow so that you first call self._ensure_collection(...) (and any hybrid creation lock), then refresh/re-resolve the topology from the collection metadata that was actually created/returned, and only after that call build_dense_point_vector (and the corresponding sparse/vector construction) using the refreshed topology; apply the same pattern for the other occurrences that build vectors (the blocks using build_dense_point_vector and setting PointStruct.vector at the other mentioned sites).

coderabbitai · 2026-05-24T14:09:22Z

    def parse(self) -> ASTNode:
+        node = self._parse_single_statement()
+        self._expect(TokenKind.EOF)
+        return node


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Allow trailing semicolons after top-level statements.

SEMICOLON is now tokenized, but parse() still requires EOF immediately after the first statement. That means SHOW COLLECTIONS; and script chunks ending with ; still fail, so semicolons only work inside BEGIN BATCH.

Suggested fix

def parse(self) -> ASTNode: node = self._parse_single_statement() + while self._peek().kind == TokenKind.SEMICOLON: + self._advance() self._expect(TokenKind.EOF) return node

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/qql/parser.py` around lines 73 - 76, The parser currently calls parse() -> _parse_single_statement() and immediately expects TokenKind.EOF, rejecting trailing SEMICOLON tokens; update parse() to allow and consume any number of trailing TokenKind.SEMICOLON tokens after the node is parsed (e.g., loop while next token is TokenKind.SEMICOLON call the existing token-consumption helper such as self._expect/_advance/_consume), then call self._expect(TokenKind.EOF) and return the node so top-level statements like "SHOW COLLECTIONS;" succeed.

coderabbitai · 2026-05-24T14:09:22Z

+def render_parameterized_query(template: str, params: dict[str, Any]) -> str:
+    query_str = template
+    for k in sorted(params.keys(), key=len, reverse=True):
+        val = params[k]
+        placeholder = f":{k}"
+        if isinstance(val, str):
+            escaped_val = val.replace("'", "\\'")
+            repr_val = f"'{escaped_val}'"
+        elif isinstance(val, bool):
+            repr_val = "true" if val else "false"
+        else:
+            repr_val = str(val)
+        query_str = query_str.replace(placeholder, repr_val)
+    return query_str


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Render real QQL literals here instead of raw str() output.

This helper currently mangles valid parameter values: None becomes None instead of NULL, and string backslashes are not escaped, so values containing \n, \t, or \\ are reinterpreted by the lexer. Parameterized queries can therefore change user input or fail to parse.

Suggested direction

+def _render_qql_literal(val: Any) -> str: + if val is None: + return "NULL" + if isinstance(val, str): + escaped = val.replace("\\", "\\\\").replace("'", "\\'") + return f"'{escaped}'" + if isinstance(val, bool): + return "true" if val else "false" + if isinstance(val, list): + return "[" + ", ".join(_render_qql_literal(v) for v in val) + "]" + if isinstance(val, dict): + return ( + "{" + + ", ".join(f"'{k}': {_render_qql_literal(v)}" for k, v in val.items()) + + "}" + ) + return str(val) + def render_parameterized_query(template: str, params: dict[str, Any]) -> str: query_str = template for k in sorted(params.keys(), key=len, reverse=True): - val = params[k] placeholder = f":{k}" - if isinstance(val, str): - escaped_val = val.replace("'", "\\'") - repr_val = f"'{escaped_val}'" - elif isinstance(val, bool): - repr_val = "true" if val else "false" - else: - repr_val = str(val) - query_str = query_str.replace(placeholder, repr_val) + query_str = query_str.replace(placeholder, _render_qql_literal(params[k])) return query_str

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/qql/utils.py` around lines 80 - 93, The function render_parameterized_query currently serializes parameters via str() and inadequate escaping, causing None to become "None" and backslashes/control chars to be unescaped; update render_parameterized_query to: map None to the QQL NULL literal, escape strings for single quotes and backslashes and encode common control sequences (e.g., \n, \t, \\) so they are preserved as literals, format booleans as "true"/"false" and numbers via their canonical representation, and then replace placeholders (keep the current placeholder logic and sorted key order) with these properly serialized QQL literals.

coderabbitai · 2026-05-24T14:09:22Z

+    for stmt in statements:
+        if isinstance(stmt, (SearchStmt, RecommendStmt)):
+            coll = stmt.collection
+            if current_type == "query" and current_collection == coll:
+                current_group.append(stmt)
+                continue
+            _append_batch_group(groups, current_type, current_collection, current_group)
+            current_type = "query"
+            current_collection = coll
+            current_group = [stmt]
+            continue


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Exclude SEARCH ... GROUP BY ... from the flat query batching path.

This batches every SearchStmt into a "query" group, but both _execute_query_batch() implementations only build plain QueryRequests and read response.points. A grouped search run via BEGIN BATCH or run_queries_batch() will therefore lose group_by/group_size semantics and return a flat result list instead.

Suggested fix

- if isinstance(stmt, (SearchStmt, RecommendStmt)): + if isinstance(stmt, RecommendStmt) or ( + isinstance(stmt, SearchStmt) and stmt.group_by is None + ): coll = stmt.collection if current_type == "query" and current_collection == coll: current_group.append(stmt) continue

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/qql/utils.py` around lines 160 - 170, The batching logic currently treats every SearchStmt as a plain "query" which strips group semantics; change the branch so a SearchStmt is only added to the "query" batch when it has no grouping (i.e., stmt.group_by and stmt.group_size are falsy). For SearchStmt instances with group_by/group_size, flush the current group via _append_batch_group and start a new group that is not the plain "query" path (e.g., set current_type to a distinct token like "grouped_search" and current_group = [stmt]) so _execute_query_batch/run_queries_batch won’t treat it as a flat QueryRequest; keep the existing behavior for RecommendStmt and non-grouped SearchStmt.

srimon12 requested a review from pavanjava May 24, 2026 14:02

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add async execution and batched programmatic APIs#45

feat: add async execution and batched programmatic APIs#45
srimon12 wants to merge 1 commit into
pavanjava:mainfrom
srimon12:feat/refractor-batch-grpc-async-support

srimon12 commented May 24, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 24, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 24, 2026

Uh oh!

coderabbitai Bot May 24, 2026

Uh oh!

coderabbitai Bot May 24, 2026

Uh oh!

coderabbitai Bot May 24, 2026

Uh oh!

coderabbitai Bot May 24, 2026

Uh oh!

coderabbitai Bot May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	for proxy, res in zip(self._proxies, results):
	for proxy, res in zip(self._proxies, results, strict=True):
	proxy._resolve(res)

Conversation

srimon12 commented May 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What Changed

Programmatic API

QQL Language

Executor / Parser Internals

Tests and Docs

Why

Validation

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 24, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

srimon12 commented May 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 24, 2026 •

edited

Loading