Merged
2 changes: 1 addition & 1 deletion docs/_index.md
@@ -55,7 +55,7 @@ As of May 2026, SQLRite has:
- Full-text search + hybrid retrieval (Phase 8 complete): FTS5-style inverted index with BM25 ranking + `fts_match` / `bm25_score` scalar functions + `try_fts_probe` optimizer hook + on-disk persistence with on-demand v4 → v5 file-format bump (8a-8c), a worked hybrid-retrieval example combining BM25 with vector cosine via raw arithmetic (8d), and a `bm25_search` MCP tool symmetric with `vector_search` (8e). See [`docs/fts.md`](fts.md).
- SQL surface + DX follow-ups (Phase 9 complete, v0.2.0 → v0.9.1): DDL completeness — `DEFAULT`, `DROP TABLE` / `DROP INDEX`, `ALTER TABLE` (9a); free-list + manual `VACUUM` (9b) + auto-VACUUM (9c); `IS NULL` / `IS NOT NULL` (9d); `GROUP BY` + aggregates + `DISTINCT` + `LIKE` + `IN` (9e); four flavors of `JOIN` — INNER, LEFT, RIGHT, FULL OUTER (9f); prepared statements + `?` parameter binding with a per-connection LRU plan cache (9g); HNSW probe widened to cosine + dot via `WITH (metric = …)` (9h); `PRAGMA` dispatcher with the `auto_vacuum` knob (9i)
- Benchmarks against SQLite + DuckDB (Phase 10 complete, SQLR-4 / SQLR-16): twelve-workload bench harness with a pluggable `Driver` trait, criterion-driven, pinned-host runs published. See [`docs/benchmarks.md`](benchmarks.md).
- Phase 11 (concurrent writes via MVCC + `BEGIN CONCURRENT`, SQLR-22) is **shipped end-to-end through 11.11a** plus the 11.12 docs sweep — a small set of follow-ups (checkpoint-drain to enable `Mvcc → Wal` downgrade; indexes under MVCC; the "N concurrent writers" benchmark workload) remain explicitly parked. `Connection` is `Send + Sync`; `Connection::connect()` mints sibling handles. `sqlrite::mvcc` exposes `MvccClock`, `ActiveTxRegistry`, `MvStore`, `ConcurrentTx`, and the `MvccCommitBatch` / `MvccLogRecord` WAL codec. WAL header v1 → v2 persisted the clock high-water mark; v2 → v3 added typed MVCC log-record frames. `PRAGMA journal_mode = mvcc;` opts a database into MVCC. `BEGIN CONCURRENT` writes commit-validate against `MvStore`, abort with `SQLRiteError::Busy`, and append a typed MVCC log-record frame to the WAL — covered by the same fsync as the legacy page commit. Reopen replays those frames into `MvStore` and seeds `MvccClock` past the highest committed `commit_ts`, so the MVCC conflict-detection window survives a process restart. Reads via `Statement::query` see the BEGIN-time snapshot. Per-commit GC + `vacuum_mvcc()` bound version-chain growth. C FFI / Python / Node / Go propagate `Busy` / `BusySnapshot` as typed retryable errors; the FFI's `sqlrite_connect_sibling`, Python's `Connection.connect()`, and Node's `db.connect()` mint sibling handles that share backing state. The `sqlrite` REPL ships `.spawn` / `.use` / `.conns` for interactive demos. **User-facing reference:** [`docs/concurrent-writes.md`](concurrent-writes.md); runnable example at [`examples/rust/concurrent_writers.rs`](../examples/rust/concurrent_writers.rs). Original design proposal: [`docs/concurrent-writes-plan.md`](concurrent-writes-plan.md).
- Phase 11 (concurrent writes via MVCC + `BEGIN CONCURRENT`, SQLR-22) is **shipped end-to-end** — `Connection` is `Send + Sync`; `Connection::connect()` mints sibling handles. `sqlrite::mvcc` exposes `MvccClock`, `ActiveTxRegistry`, `MvStore`, `ConcurrentTx`, and the `MvccCommitBatch` / `MvccLogRecord` WAL codec. WAL header v1 → v2 persisted the clock high-water mark; v2 → v3 added typed MVCC log-record frames. `PRAGMA journal_mode = mvcc;` opts a database into MVCC. `BEGIN CONCURRENT` writes commit-validate against `MvStore`, abort with `SQLRiteError::Busy`, and append a typed MVCC log-record frame to the WAL — covered by the same fsync as the legacy page commit. Reopen replays those frames into `MvStore` and seeds `MvccClock` past the highest committed `commit_ts`. Reads via `Statement::query` see the BEGIN-time snapshot. Per-commit GC + `vacuum_mvcc()` bound version-chain growth. C FFI / Python / Node / Go all propagate `Busy` / `BusySnapshot` as typed retryable errors *and* mint sibling handles that share backing state — Go's process-level path registry (Phase 11.11c) handles cross-`*sql.DB` sharing too. The `sqlrite` REPL ships `.spawn` / `.use` / `.conns` for interactive demos; the SQLR-16 benchmark suite adds `W13` (concurrent writers, mostly disjoint rows) as the Phase-11 differentiator workload. The only remaining items are deferred-by-design or foundation work: indexes under MVCC (11.10) and the checkpoint-drain follow-up (parked half of 11.9). **User-facing reference:** [`docs/concurrent-writes.md`](concurrent-writes.md); runnable example at [`examples/rust/concurrent_writers.rs`](../examples/rust/concurrent_writers.rs). Original design proposal: [`docs/concurrent-writes-plan.md`](concurrent-writes-plan.md).
- A fully-automated release pipeline that ships every product to its registry on every release with one human action — Rust engine + `sqlrite-ask` + `sqlrite-mcp` to crates.io, Python wheels to PyPI (`sqlrite`), Node.js + WASM to npm (`@joaoh82/sqlrite` + `@joaoh82/sqlrite-wasm`), Go module via `sdk/go/v*` git tag, plus C FFI tarballs, MCP binary tarballs, and unsigned desktop installers as GitHub Release assets (Phase 6 complete)

See the [Roadmap](roadmap.md) for the full phase plan.
4 changes: 2 additions & 2 deletions docs/concurrent-writes.md
@@ -193,10 +193,10 @@ Sibling propagation across each SDK (Phase 11.7 + 11.8):
| C FFI | `sqlrite_connect_sibling(existing, out)` | `SqlriteStatus::Busy` / `BusySnapshot`; `sqlrite_status_is_retryable` |
| Python | `conn.connect()` | `sqlrite.BusyError` / `sqlrite.BusySnapshotError` (both subclass `SQLRiteError`) |
| Node.js | `db.connect()` | `errorKind(message)` returns `'Busy'` / `'BusySnapshot'` / `'Other'` |
| Go | `(via database/sql pool — see notes below)` | `errors.Is(err, sqlrite.ErrBusy)` / `ErrBusySnapshot`; `sqlrite.IsRetryable(err)` |
| Go | `database/sql` pool + cross-pool path registry (Phase 11.11c) | `errors.Is(err, sqlrite.ErrBusy)` / `ErrBusySnapshot`; `sqlrite.IsRetryable(err)` |
| WASM | *(deferred — single-threaded runtime)* | *(deferred)* |

For Go, each `sql.Open("sqlrite", path)` still constructs its own backing DB; siblings within a single `sql.DB` pool share state automatically. Cross-pool sharing is a separate follow-up (Phase 11.11b).
For Go, every `sql.Open("sqlrite", path)` against a file-backed read-write DB routes through a process-level path registry (Phase 11.11c) — multiple `sql.Open` calls for the same canonical path mint sibling handles off a shared primary, so each `*sql.DB`'s pool can issue its own `BEGIN CONCURRENT` against the same backing engine. `:memory:` opens stay isolated by design; read-only opens (via `sqlrite.OpenReadOnly`) take a shared lock and bypass the registry. See [`sdk/go/README.md`](../sdk/go/README.md#multi-handle-reads--writes-phase-1111c) for the runnable cross-pool example.

### The retry loop

26 changes: 26 additions & 0 deletions docs/design-decisions.md
@@ -366,6 +366,32 @@ dispatch tree, every REPL line goes through it.

---

### 12i. Go SDK uses a process-level path registry to mint siblings (Phase 11.11c)

**Decision.** The Go SDK at [`sdk/go/`](../sdk/go/) keeps a process-level `map[string]*sharedEntry` keyed by canonical absolute path. Every file-backed read-write `sql.Open("sqlrite", path)` resolves the path through `filepath.Abs` + `filepath.Clean`, then either creates a registry entry (paying for a real `sqlrite_open`) or mints a **sibling handle** off the existing entry's primary via the FFI's `sqlrite_connect_sibling`. The registry holds a refcount of outstanding siblings; the last close fires `sqlrite_close` on the primary and removes the entry.

**Why.** `database/sql`'s pool model expects `driver.Open` to be cheap and idempotent: a single `*sql.DB` will call it whenever it needs another pool slot, and applications routinely hold multiple `*sql.DB` instances against the same file (one for the API server, one for a background worker, …). SQLRite's engine takes `flock(LOCK_EX)` on the WAL sidecar at the first `Connection::open`, so the second `sqlrite_open` for the same path would deadlock against the first one in the *same* process. This was a real defect: the existing `TestFileBackedPersistsAcrossConnections` test passed only because each `db.Close()` released the lock before the next `sql.Open`.

The registry is the smallest thing that makes the SDK match the Phase 11.7 / 11.8 contract: sibling handles share `Arc<Mutex<Database>>` and each can hold its own `BEGIN CONCURRENT`. The Python / Node / C SDKs have had this since 11.8 because they expose sibling creation directly (`Connection.connect()` / `db.connect()` / `sqlrite_connect_sibling`). Go's quirk is `database/sql`'s pool, which asks the driver for connections on its own schedule, so the work happens transparently inside `newConn`.
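
To make the pool behaviour concrete, here is a minimal sketch using only the standard library. `countingDriver` and `fakeConn` are hypothetical stand-ins (not part of the SQLRite SDK) that simply count how often `database/sql` asks the driver for a connection when two pooled connections are held at once.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync/atomic"
)

// countingDriver is a hypothetical stand-in for a real driver. It only
// records how many times database/sql asks it for a new connection.
type countingDriver struct{ opens atomic.Int64 }

func (d *countingDriver) Open(name string) (driver.Conn, error) {
	d.opens.Add(1)
	return fakeConn{}, nil
}

// fakeConn satisfies driver.Conn without talking to any engine.
type fakeConn struct{}

func (fakeConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not implemented") }
func (fakeConn) Close() error                        { return nil }
func (fakeConn) Begin() (driver.Tx, error)           { return nil, errors.New("not implemented") }

// opensForTwoHeldConns checks out two pooled connections at the same time
// and reports how many driver-level opens the pool needed to satisfy them.
// driverName must be unique per call because sql.Register rejects duplicates.
func opensForTwoHeldConns(driverName string) (int64, error) {
	drv := &countingDriver{}
	sql.Register(driverName, drv)

	db, err := sql.Open(driverName, "demo.db") // lazy: no driver.Open yet
	if err != nil {
		return 0, err
	}
	defer db.Close()

	ctx := context.Background()
	first, err := db.Conn(ctx)
	if err != nil {
		return 0, err
	}
	defer first.Close()

	// first is still checked out, so the pool has to open another connection.
	second, err := db.Conn(ctx)
	if err != nil {
		return 0, err
	}
	defer second.Close()

	return drv.opens.Load(), nil
}

func main() {
	n, err := opensForTwoHeldConns("counting-demo")
	if err != nil {
		panic(err)
	}
	fmt.Println("driver.Open calls:", n)
}
```

With a single `*sql.DB` the driver is opened once per held connection, which is the situation in which a second engine-level open would have collided with the first opener's exclusive lock.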

**Why a process-level (not `*sql.DB`-level) registry.** Cross-`*sql.DB` sharing was the original 11.8 gap. Keying the registry on path rather than on a pool instance is what closes that gap; two `sql.Open` calls in the same process for the same file converge on the same backing engine, same as the FFI / Python / Node story.

**Why a hidden "primary" + refcount, not just the first opener's handle.** The first opener could close *before* the second opener finishes its work. If the registry held only "the first opener's handle" the close would either:
- close the underlying engine (leaving the second opener with a dead handle), or
- need to transfer ownership somewhere, complicating the close path

A hidden primary that the registry itself owns sidesteps both problems: every `*conn` gets its own sibling, closes are independent, and the registry tears the primary down only when the refcount reaches zero.
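
The hidden-primary-plus-refcount shape can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: `engine` stands in for the handle a real `sqlrite_open` would return, `acquire` and `release` are hypothetical names, and every caller receives the same `*engine` pointer, whereas the real driver would mint a distinct sibling handle via `sqlrite_connect_sibling`.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sync"
)

// engine is a hypothetical stand-in for the primary handle.
type engine struct{ closed bool }

// sharedEntry owns the hidden primary and counts outstanding siblings.
type sharedEntry struct {
	primary  *engine
	refcount int
}

var (
	registryMu sync.Mutex
	registry   = map[string]*sharedEntry{}
)

// acquire canonicalizes path lexically and returns the registry key plus the
// shared engine, creating the hidden primary on first use.
func acquire(path string) (string, *engine, error) {
	abs, err := filepath.Abs(path)
	if err != nil {
		return "", nil, err
	}
	key := filepath.Clean(abs)

	registryMu.Lock()
	defer registryMu.Unlock()
	entry, ok := registry[key]
	if !ok {
		entry = &sharedEntry{primary: &engine{}} // stands in for a real open
		registry[key] = entry
	}
	entry.refcount++ // stands in for minting a sibling off the primary
	return key, entry.primary, nil
}

// release drops one sibling. Only the last release tears the primary down.
func release(key string) {
	registryMu.Lock()
	defer registryMu.Unlock()
	entry, ok := registry[key]
	if !ok {
		return
	}
	entry.refcount--
	if entry.refcount == 0 {
		entry.primary.closed = true // stands in for closing the primary
		delete(registry, key)
	}
}

func main() {
	k1, eng, _ := acquire("data/app.db")
	k2, _, _ := acquire("./data/../data/app.db") // same file, different spelling

	release(k1) // the first opener closes before the second is done
	fmt.Println("same key:", k1 == k2, "engine closed:", eng.closed)

	release(k2)
	fmt.Println("engine closed:", eng.closed, "entries:", len(registry))
}
```

Because the registry rather than the first opener owns the primary, the first `release` leaves the engine usable for the second opener, and teardown happens only when the count reaches zero.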

**Why `:memory:` and read-only opens bypass the registry.** `:memory:` databases are isolated by design — each `sql.Open(":memory:")` is its own DB, matching SQLite. Read-only opens take a shared `flock(LOCK_SH)` that already coexists with other readers; routing them through the registry would give every read-only opener a read-write sibling (since `connect_sibling` doesn't downgrade access mode), which is the wrong abstraction. Cross-pool read-only sharing is a clean follow-up if anyone surfaces the use case.

**Why no symlink resolution.** `filepath.Abs` + `filepath.Clean` is lexical only: two `sql.Open` calls via different symlinks pointing at the same file end up as separate registry entries, and the second one fails to acquire the flock. Resolving symlinks via `filepath.EvalSymlinks` would close that gap but is racy if intermediate path components are themselves symlinks that change between the resolution and the file open. v0 keeps the simple key and documents the symlink caveat in [`sdk/go/README.md`](../sdk/go/README.md); callers who care can pass a `filepath.EvalSymlinks`-canonicalized path.
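
The caveat is easy to reproduce with the standard library alone. In this sketch `lexicalKey`, `resolvedKey`, and `compareKeys` are hypothetical helper names: the first mirrors the `filepath.Abs` + `filepath.Clean` key described above, the second shows what a caller could pass instead.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// lexicalKey mirrors the registry key described above: no filesystem lookups.
func lexicalKey(path string) (string, error) {
	abs, err := filepath.Abs(path)
	if err != nil {
		return "", err
	}
	return filepath.Clean(abs), nil
}

// resolvedKey resolves symlinks first, then applies the same lexical cleanup.
func resolvedKey(path string) (string, error) {
	resolved, err := filepath.EvalSymlinks(path)
	if err != nil {
		return "", err
	}
	return lexicalKey(resolved)
}

// compareKeys creates a file and a symlink to it in a temp directory, then
// reports whether the two spellings agree under each keying scheme.
func compareKeys() (lexicalEqual, resolvedEqual bool, err error) {
	dir, err := os.MkdirTemp("", "registry-key-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "app.db")
	alias := filepath.Join(dir, "alias.db")
	if err := os.WriteFile(target, nil, 0o600); err != nil {
		return false, false, err
	}
	if err := os.Symlink(target, alias); err != nil {
		return false, false, err
	}

	keys := make([]string, 0, 4)
	for _, p := range []string{target, alias} {
		lex, err := lexicalKey(p)
		if err != nil {
			return false, false, err
		}
		res, err := resolvedKey(p)
		if err != nil {
			return false, false, err
		}
		keys = append(keys, lex, res)
	}
	// keys = [lex(target), res(target), lex(alias), res(alias)]
	return keys[0] == keys[2], keys[1] == keys[3], nil
}

func main() {
	lexEq, resEq, err := compareKeys()
	if err != nil {
		panic(err)
	}
	fmt.Println("lexical keys equal:", lexEq)
	fmt.Println("resolved keys equal:", resEq)
}
```

The lexical keys differ because the two paths are spelled differently, while the resolved keys agree because both point at the same file.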

**Lock order.** Two locks are in play: each `*conn`'s `c.mu`, and the registry's `registryMu`. Acquisition order is always `c.mu → registryMu`, never the reverse. `newConn` only holds `registryMu` (the `*conn` doesn't exist yet); `conn.Close` takes `c.mu` first, then `registryMu`. Other operations only hold `c.mu`.

**Plan-doc reference.** [`concurrent-writes-plan.md`](concurrent-writes-plan.md) §10.8 (multi-handle SDK shape, Go follow-up).

---

## Query execution

### 13. `NULL`-as-false in `WHERE` clauses
15 changes: 12 additions & 3 deletions docs/roadmap.md
@@ -2,7 +2,7 @@

The project is staged in phases. Each phase is shippable on its own, ends with a working build + full test suite + a commit on `main`, and can be paused between. The README's roadmap section is a summary of this doc.

> **Active frontier (May 2026):** Phases 0–10 shipped end-to-end. After Phase 8 closed the v0.1.x cycle, the v0.2.0 → v0.9.1 wave (Phase 9, sub-phases 9a–9i) landed the SQL surface that had been parked under "possible extras": DDL completeness (DEFAULT, DROP TABLE/INDEX, ALTER TABLE), free-list + auto-VACUUM, IS NULL, GROUP BY + aggregates + DISTINCT + LIKE + IN, four flavors of JOIN, prepared statements with parameter binding, HNSW metric extension, and the PRAGMA dispatcher. Phase 10 published the SQLR-4 / SQLR-16 benchmarks against SQLite + DuckDB. **Current head: v0.9.1.** **Phase 11 (concurrent writes via MVCC + `BEGIN CONCURRENT`, SQLR-22) is shipped end-to-end through 11.12** — the multi-connection foundation, logical clock, `MvStore`, `BEGIN CONCURRENT` writes + commit-time validation, snapshot-isolated reads, garbage collection, SDK propagation across C / Python / Node / Go, multi-handle SDK shape, WAL log-record durability + crash recovery, REPL `.spawn` for interactive demos, and the canonical user-facing reference all landed. A small set of follow-ups (checkpoint-drain to enable `Mvcc → Wal` downgrade, indexes under MVCC, the "N concurrent writers" benchmark workload) remain explicitly parked. See [`concurrent-writes.md`](concurrent-writes.md) for the user-facing reference; [`concurrent-writes-plan.md`](concurrent-writes-plan.md) for the design rationale.
> **Active frontier (May 2026):** Phases 0–10 shipped end-to-end. After Phase 8 closed the v0.1.x cycle, the v0.2.0 → v0.9.1 wave (Phase 9, sub-phases 9a–9i) landed the SQL surface that had been parked under "possible extras": DDL completeness (DEFAULT, DROP TABLE/INDEX, ALTER TABLE), free-list + auto-VACUUM, IS NULL, GROUP BY + aggregates + DISTINCT + LIKE + IN, four flavors of JOIN, prepared statements with parameter binding, HNSW metric extension, and the PRAGMA dispatcher. Phase 10 published the SQLR-4 / SQLR-16 benchmarks against SQLite + DuckDB. **Current head: v0.9.1.** **Phase 11 (concurrent writes via MVCC + `BEGIN CONCURRENT`, SQLR-22) is shipped end-to-end through 11.12 + 11.11b + 11.11c** — the multi-connection foundation, logical clock, `MvStore`, `BEGIN CONCURRENT` writes + commit-time validation, snapshot-isolated reads, garbage collection, SDK propagation across C / Python / Node / Go (cross-pool sibling shape on Go via the path registry), multi-handle SDK shape, WAL log-record durability + crash recovery, REPL `.spawn` for interactive demos, the `W13` concurrent-writers bench workload, and the canonical user-facing reference all landed. The only remaining items are deferred-by-design or foundation work: indexes under MVCC (11.10, Turso punted on the same problem), and the checkpoint-drain follow-up (parked half of 11.9, enables `set_journal_mode(Mvcc → Wal)` once `MvStore` is drainable). See [`concurrent-writes.md`](concurrent-writes.md) for the user-facing reference; [`concurrent-writes-plan.md`](concurrent-writes-plan.md) for the design rationale.

## ✅ Phase 0 — Modernization

@@ -708,9 +708,18 @@ New `W13` workload in [`benchmarks/`](../benchmarks/) pits SQLRite-MVCC against

Headline numbers will land with the first pinned-host re-publication; v1 ships the workload + correctness gate so any future numbers stand on a verified base.

### Phase 11.11c — Go SDK cross-pool sibling shape *(planned)*
### Phase 11.11c — Go SDK cross-pool sibling shape

Each `sql.Open("sqlrite", path)` today builds an independent backing DB; sharing engine state across `sql.DB` pools needs a process-level registry keyed by path. Bundled into Phase 11.11 originally; split out because it touches the Go binding architecture (cgo + the `database/sql` driver model) rather than the bench harness or the engine. See [`sdk/go/README.md`](../sdk/go/README.md) for the current single-pool sibling story.
The Go SDK ([`sdk/go/`](../sdk/go/)) used to take one engine-level `Connection::open` per `sql.Open("sqlrite", path)`. A second `sql.Open` (or a single pool that grew past one connection) collided with the first opener's `flock(LOCK_EX)` and deadlocked — `database/sql`'s pool model + SQLRite's exclusive-writer lock disagreed.

This slice adds a **process-level path registry** (in [`sdk/go/sqlrite.go`](../sdk/go/sqlrite.go)) keyed by canonical absolute path. File-backed read-write opens now route through it: the first opener pays for a real `sqlrite_open` and the resulting handle is stashed as a hidden "primary" in the registry; subsequent openers mint a **sibling** off that primary via the C FFI's [`sqlrite_connect_sibling`](../sqlrite-ffi/include/sqlrite.h) (shipped in 11.8), sharing the engine's `Arc<Mutex<Database>>` underneath. A refcount tracks outstanding siblings; the registry closes the primary when it hits zero.

- `:memory:` opens stay isolated by design (matches SQLite); each `sql.Open(":memory:")` is its own DB.
- Read-only opens (`sqlrite.OpenReadOnly`) bypass the registry — they take a shared `flock(LOCK_SH)` that can coexist with other readers but conflicts with any writer in the same process.
- Symlinks are **not** resolved; the registry key is `filepath.Abs` + `filepath.Clean`. Symlink-equality is the caller's job (pass paths already resolved with `filepath.EvalSymlinks`).
- New tests cover cross-`*sql.DB` state sharing, BEGIN CONCURRENT across separate pools with a real Busy + retry, and the refcount dropping to zero on the last close.

End result: every shipped SDK (C FFI / Python / Node / Go) now mints sibling handles that share backing state. The 11.7 retryable-error machinery (`sqlrite.ErrBusy`, `sqlrite.ErrBusySnapshot`, `sqlrite.IsRetryable`) can finally be exercised cross-pool from Go.

### ✅ Phase 11.12 — Docs sweep *(plan-doc "Phase 10.9")*
