feat(payload): positional-cursor /check paging; remove the spool#52
Open
andinux wants to merge 14 commits into
Open
feat(payload): positional-cursor /check paging; remove the spool#52andinux wants to merge 14 commits into
andinux wants to merge 14 commits into
Conversation
cloudsync_payload_chunks could only resume a window from since>db_version, so a chunk boundary that lands inside a single committed db_version (or inside a fragmented oversized value) was not addressable — which is the whole reason the server stages the stream into a cloudsync_payload_spool table to page it out. This makes the vtab resumable by position instead. Add an optional positional cursor: hidden inputs resume_db_version, resume_seq, resume_frag_offset start the scan at (db_version, seq) inclusive and re-enter a mid-value fragment at a byte offset; new outputs next_db_version, next_seq, next_frag_offset and is_final report where the emitted chunk stopped. A stateless /check can then page the whole window with an O(1) seek per call (vs O(N^2) replay-from-since), no spool table, no server-side state. Legacy since>db_version callers (send path, spool fill) are unchanged: columns are appended and the positional branch only activates when resume_db_version is bound. Tiling is exact, not idempotent-overlap: (db_version, seq) is a unique total order (changes rowid = (db_version<<30)|seq), next_* names the row that did not fit or the exact byte already emitted, and the fragment plan is a deterministic function of the row, so a resumed fragment tiles identically. The drain-start cursor must seek exclusive-after the last applied change (see docs/internal design note) so the protocol never relies on changes-level idempotency to absorb a re-sent row. New unit test Payload Chunks Positional Resume drives a window mixing a db_version split across chunks with a value larger than the chunk budget, pages it one chunk per call via the cursor, and asserts byte-identity with a full-window scan (including a mid-fragment resume). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…into codex/positional-cursor-chunks
… SRF Mirror the SQLite vtab's positional cursor on the PostgreSQL SRF so the /check job can page the node one chunk per round-trip without the cloudsync_payload_spool table. Adds inputs resume_db_version, resume_seq, resume_frag_offset (start the scan at (db_version, seq) inclusive and re-enter a mid-value fragment at a byte offset) and outputs next_db_version, next_seq, next_frag_offset, is_final (where the emitted chunk stopped). The positional query reuses the (db_version > $ OR (db_version = $ AND seq >= $)) shape already used by payload_blob_checked's estimate, with an inclusive seq to match the vtab's exact tiling. Fragment setup is factored into payload_chunks_pg_begin_fragment so a streamed and a resumed fragment build identically; the next cursor is read from the buffered source row (or the same row mid-fragment), peeking one row ahead to set is_final. Legacy callers (send path, spool fill) are unchanged: new args default to NULL so the positional branch only activates when resume_db_version is bound, and the existing output columns keep their positions. New test 55_payload_chunks_positional_resume.sql resumes at every non-final chunk's reported cursor and asserts the fetched chunk equals the next chunk of a full-window scan byte-for-byte, including a mid-fragment resume. Verified against PostgreSQL 17; existing chunk/spool/fragment tests (52-54) still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Part 1 of test 55 proves the positional cursor produces byte-identical chunks to a full scan. Part 2 now closes the loop end-to-end: a plpgsql helper drains the whole window the way the /check job will (legacy exclusive since=0, then the positional cursor one chunk per call, ORDER BY chunk_index LIMIT 1 so each value-per-call SRF runs to completion), the drained stream is applied to a fresh database, and the receiver's table content is hashed and compared to the source. This validates the real path (positional drain -> apply -> faithful replica), not just byte-identity against a baseline. Verified on PostgreSQL 17: 501 rows reproduced, hashes match. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Mirror the PostgreSQL test's end-to-end check on the SQLite side: after asserting the positional drain reproduces the full-scan chunks byte-for-byte, apply that drained stream to a fresh receiver database and assert its split_test content matches the source via test_split_tables_equal. Chunks are applied in reverse drain order so the apply path must reassemble v3 fragments and merge rows independent of transport order. Validates positional drain -> apply -> faithful replica, the real path the /check job will use. All unit tests pass, no leaks. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
chunk_bench builds a window of N chunks and times paging the whole window one chunk per call via the positional cursor on cloudsync_payload_chunks, reporting wall time, per-chunk cost and throughput. Local-only (loads the built dylib, no network). Env-tunable rows/row_bytes/txns/repeats/chunk_size (TXNS splits the rows across db_versions). Lets the drain's computational growth be tracked — e.g. to confirm a future indexed (db_version, seq) seek flattens it from O(N^2) to O(N). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ions The /check job now pages the tenant via the positional cursor on cloudsync_payload_chunks, so the server-side download spool is dead. Remove the cloudsync_payload_spool table and the cloudsync_payload_spool_fill / _drop / _drop_chunk functions on both engines (SQLite: sql_sqlite.c constants, sql.h decls, cloudsync_sqlite.c functions + registration; PostgreSQL: cloudsync.sql.in and the 1.0->1.1 migration), plus the SQLite "Payload Download Spool" unit test and the spool sections of the PostgreSQL chunk test. The client-side cursor paging (network.c) is unchanged: the wire protocol still pages chunks by an integer cursor; only the server's node-side staging mechanism changed. All remaining chunk/fragment/positional tests pass on both engines. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… max network_sync_state_update_from_response kept the maximum optimistic/confirmed version seen across a multi-chunk send. But the server can move lastOptimisticVersion BACKWARD when a later chunk fails (rollback), and that value becomes the durable send checkpoint (CLOUDSYNC_KEY_SEND_DBVERSION) — so masking a decrease advanced the checkpoint past the rolled-back changes, which were then never re-sent (silent data loss). Take the latest valid (>= 0) value instead, matching how `gaps` is already handled; a missing/unparseable field (-1) still leaves the value untouched. Pre-existing since the chunked-payload transport work, unreleased. Add a network-layer unit test harness (the layer had none): network_unit.c is built with networking compiled in (T_CFLAGS minus OMIT_NETWORK, linked against curl) so the internal parsing/state functions can be exercised on in-memory NETWORK_RESULT buffers with no server. `make network-unittest` runs it (also wired into `make test`). The new test reproduces the rollback (50→100→50 must end at 50, not 100) and covers the missing-field no-op and network_compute_status; network_sync_state_update_from_response / network_compute_status are de-static'd for it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The network-unittest target (added in 01f6d25) links dist/network_unit as an executable from NT_LDFLAGS, derived from the platform LDFLAGS. Linux, Android and Windows put -shared in LDFLAGS (the extension is a .so there); the filter only stripped macOS's -dynamiclib, so -shared leaked into the executable link and ld tried to emit a shared object from non-PIC objects: relocation R_X86_64_PC32 ... can not be used when making a shared object Strip -shared too so it links as a plain executable. macOS is unaffected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e range cloudsync_network_send_changes announced each chunk's actual [db_version_min, db_version_max] to /apply. When the local db_version clock has advanced past the site's own changes — e.g. after merging applied remote changes, or any transaction that produced no local-site change — those skipped versions are absent from the sent ranges, so the server's per-site coverage (ComputeGaps walks it from version 1) reports them as gaps and never advances lastOptimisticVersion to the local version. The send then churns and never reaches "synced". Announce the covered window instead: min = a running window lower bound (send_checkpoint+1, chained to the previous chunk's max+1), max = the chunk's own max, tiling [send_checkpoint+1 .. watermark] contiguously. network_announce_min() clamps the min to the chunk's own min so consecutive chunks sharing a db_version (a value fragmented across chunks) keep min<=max; a partial-fragment failure stays safe because the receiver stages incomplete fragments (0 applied rows), so coverage is only recorded once the value is whole. Pre-existing since the chunked-payload transport (f1ace81), unreleased. Tests: network-unit case for the tiling (leading hole / inter-chunk hole / same-version fragment pair); e2e repro test_send_gap_from_clock_hole forces a clock jump with cloudsync_db_version_next() and asserts serverVersion reaches localVersion (red before this change, green after). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e binary The Android CI runs test binaries on the emulator from a command script generated by `make test -n` (it pushes the workspace and executes the script via adb). The network-unittest recipe linked dist/network_unit *inside the recipe*, so `make -n` emitted the cross-compiler link command into that on-device script; the emulator has no host NDK toolchain, so it failed with x86_64-linux-android26-clang: not found (EXIT_CODE=127). Move the link to a file-target rule (like dist/unit) and keep only the run in the network-unittest recipe, so once the host build is done `make -n` emits just `./dist/network_unit`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add step-by-step comments through the benchmark flow and, in trace builds, print the raw JSON returned by the cloudsync_network_* functions: route the pre-measure-sync and cleanup-send calls through a trace-aware db_exec_network helper, and surface the send/check response JSON from timed_request. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Android CI host runs `make test`, which aborts when `unittest` runs the cross-built dist/unit (an Android binary on the Linux host). dist/network_unit is not in TEST_TARGET (network_unit.c is filtered out of TEST_SRC), so it was built only via the network-unittest prerequisite — ordered after unittest, i.e. never, once make aborted. `make test -n` then saw the binary missing and emitted the NDK link into the on-emulator command script (2028cc3's file-rule split was necessary but not sufficient). List dist/network_unit as a direct `test` prerequisite before unittest so it builds during the file-build phase, exactly like dist/unit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The send path's status/version reporting had no e2e coverage — both the spurious gap and the optimistic-version regression moved data correctly while reporting the wrong status, so data-only assertions missed them. Add two helpers: - db_send_ok(): run a send expected to succeed and assert it did not fail at the protocol level (status != "error", no send.lastFailure) — catches a server-reported apply/check failure that does not raise a SQL error, which a plain db_exec of the send ignored. Applied at the basic and chunked primary send sites (not the two failure-path tests, nor the best-effort in-loop sends). - db_send_await_converge(): send, then poll until send.serverVersion reaches send.localVersion (durably covered, no gap), robust to the server's async apply. test_send_gap_from_clock_hole now uses it instead of a single synchronous assert. Full e2e suite (incl. chunked) green against a live server. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replace the server-side download spool with a positional cursor on
cloudsync_payload_chunks, so the/checkjob can page a tenant's change windowone chunk per round-trip with no temporary table. This is the extension half of the
change; the companion server PR is sqliteai/cloudsync#52.
This branch also carries three client send-path fixes found while benchmarking
the change — see Send-path fixes.
Why
The spool stages a whole
/checkwindow into acloudsync_payload_spooltable onthe tenant, then pages it out. That has three operational costs this PR removes:
spool_fillissuesCREATE TABLE, which the tenant's defaultreadwriteapikey is not authorized for →failures.check: database_permission_denied(the original bug; the bench polled forever). Positional needs no DDL.
Positional writes nothing to the tenant.
creatable.
How it works
cloudsync_payload_chunksgains an optional positional cursor:resume_db_version,resume_seq,resume_frag_offset— start the scanat
(db_version, seq)inclusive and re-enter a mid-value fragment at a byte offset;next_db_version,next_seq,next_frag_offset,is_final— where theemitted chunk stopped.
The server pages with
… LIMIT 1, threading each chunk'snext_*back as the nextresume_*under a pinneduntilwatermark, untilis_final. Tiling is exact —(db_version, seq)is a unique total order, so each change is emitted in exactly onechunk with no overlap and no reliance on changes-level idempotency. The first fetch
uses the legacy exclusive
since(drain-start "exclusive-after" rule).The
/checkclient wire protocol is unchanged — the positional cursor isserver-internal (job ↔ node); the client still pages by an integer artifact cursor.
The positional cursor itself touches no client code; the send-path fixes below are a
separate, unrelated change that does modify
network.c.Feature comparison
CREATE TABLEprivilege/checkPerformance (
test/chunk_bench.c, local SQLite, end-to-end paging)Chunk counts/bytes match exactly (positional is correct, just slower); the ratio
doubles as the window doubles ⇒ positional is O(N²). Root cause: each resume runs
… FROM cloudsync_changes WHERE (db_version,seq) ≥ cursor ORDER BY db_version, seq,and
cloudsync_changeshas no(db_version, seq)index, so it re-scans + re-sortsthe window every call.
Accepted trade-off:
/checkis an async background job, so the O(N²) is node CPUon cold-start/large windows — fine for typical windows and mitigatable. The real O(N)
fix is a follow-up (indexed
(db_version, seq)seek oncloudsync_changes);test/chunk_bench.cis kept as the guard to confirm it flattens the curve.Changes
cloudsync_payload_chunkspositional cursor — SQLite vtab (cloudsync_sqlite.c)and PostgreSQL SRF (
cloudsync_postgresql.c+ SQL signatures). Legacy callers(send path) unchanged; columns appended, positional branch only active when
resume_db_versionis bound.cloudsync_payload_spooltable +spool_fill/spool_drop/drop_chunkon both engines, the SQLite
Payload Download Spoolunit test, and the spoolsections of the PG chunk test.
an end-to-end drain→apply→content-hash round-trip.
test/chunk_bench.c— local positional-drain benchmark / O(N²)→O(N) guard.Send-path fixes (independent of the positional cursor)
Three pre-existing bugs in the unreleased 1.1 chunked send path, surfaced while
benchmarking. No CHANGELOG entries (introduced and fixed within the same unreleased
version).
Send announces the covered window, not the raw change range.
cloudsync_network_send_changesannounced each chunk's actual[db_version_min, db_version_max]to/apply. When the local db_version clock has run ahead of thesite's own changes — after merging applied remote changes, or any transaction that
produced no local-site change — the skipped versions are absent from the sent
ranges, so the server's per-site coverage (
ComputeGapswalks it from version 1)reports them as gaps and never advances
lastOptimisticVersionto the localversion; the send then churns and never reaches
synced. Now it announces thecovered window
[min(running_lo, chunk_min) … chunk_max],running_lochainedfrom
send_checkpoint+1, tiling[send_checkpoint+1 … watermark]contiguously.The
min(…, chunk_min)clamp keepsmin ≤ maxfor consecutive chunks that sharea db_version (a value fragmented across chunks); a partial-fragment failure stays
safe because the receiver stages incomplete fragments (0 applied rows), so coverage
is recorded only once a value is whole.
lastOptimisticVersiontaken latest-valid, not monotonic-max.network_sync_state_update_from_responsekept the maximum optimistic/confirmedversion across a multi-chunk send. The server moves the optimistic version
backward when a later chunk fails (rollback), and that value becomes the durable
send checkpoint — so the max masked the rollback and advanced the checkpoint past
rolled-back changes (silent data loss). Now it takes the latest valid (
>= 0)value, matching how
gapsis already handled.Client fail-fast on a non-retryable
failures.check— surfaces a permanenterror immediately instead of polling until timeout.
New network-layer unit-test harness (
make network-unittest,test/network_unit.c),built with networking compiled in so
network.c's internal functions run on in-memorybuffers with no server — the layer previously had none. Plus an e2e repro
test_send_gap_from_clock_holethat forces a clock hole withcloudsync_db_version_next()and asserts
serverVersionreacheslocalVersion(red before fix 1, green after).Verification
the spool removed.
make network-unittestpasses (version fold, announce-window tiling — leadinghole / inter-chunk hole / same-version fragment — and status computation).
test_send_gap_from_clock_holerepro.Follow-ups (not blocking)
(db_version, seq)seek oncloudsync_changes→ positional drain O(N)./checkjob-duration metric/alert for pathological large windows.🤖 Generated with Claude Code