diff --git a/.claude/commands/stress-test-sync-sqlitecloud.md b/.claude/commands/stress-test-sync-sqlitecloud.md new file mode 100644 index 0000000..2540008 --- /dev/null +++ b/.claude/commands/stress-test-sync-sqlitecloud.md @@ -0,0 +1,192 @@ +# Sync Stress Test with remote SQLiteCloud database + +Execute a stress test against the CloudSync server using multiple concurrent local SQLite databases syncing large volumes of CRUD operations simultaneously. Designed to reproduce server-side errors (e.g., "database is locked", 500 errors) under heavy concurrent load. + +## Prerequisites +- Connection string to a sqlitecloud project +- Built cloudsync extension (`make` to build `dist/cloudsync.dylib`) + +## Test Configuration + +### Step 1: Gather Parameters + +Ask the user for the following configuration using a single question set: + +1. **CloudSync server address** — propose `https://cloudsync.sqlite.ai` as default (this is the built-in default). If the user provides a different address, save it as `CUSTOM_ADDRESS` and use `cloudsync_network_init_custom` instead of `cloudsync_network_init`. +2. **SQLiteCloud connection string** — format: `sqlitecloud://<host>:<port>/<db_name>?apikey=<apikey>`. If no `<db_name>` is in the path, ask the user for one or propose `test_stress_sync`. +3. **Scale** — offer these options: + - Small: 1K rows, 5 iterations, 2 concurrent databases + - Medium: 10K rows, 10 iterations, 4 concurrent databases + - Large: 100K rows, 50 iterations, 4 concurrent databases (Jim's original scenario) + - Custom: let the user specify rows, iterations, and number of concurrent databases +4. **RLS mode** — with RLS (requires user tokens) or without RLS +5. 
**Table schema** — offer simple default or custom: + ```sql + CREATE TABLE test_sync (id TEXT PRIMARY KEY, user_id TEXT NOT NULL DEFAULT '', name TEXT, value INTEGER); + ``` + +Save these as variables: +- `CUSTOM_ADDRESS` (only if the user provided a non-default address) +- `CONNECTION_STRING` (the full sqlitecloud:// connection string) +- `DB_NAME` (database name extracted or provided) +- `HOST` (hostname extracted from connection string) +- `APIKEY` (apikey extracted from connection string) +- `ROWS` (number of rows per iteration) +- `ITERATIONS` (number of delete/insert/update cycles) +- `NUM_DBS` (number of concurrent databases) + +### Step 2: Setup SQLiteCloud Database and Table + +Connect to SQLiteCloud using `~/go/bin/sqlc` (last command must be `quit`). Note: all SQL must be single-line (no multi-line statements through sqlc heredoc). + +1. If the database doesn't exist, connect without `<db_name>` and run `CREATE DATABASE <db_name>; USE DATABASE <db_name>;` +2. `LIST TABLES` to check for existing tables +3. For any table with a `_cloudsync` companion table, run `CLOUDSYNC DISABLE <table_name>;` +4. `DROP TABLE IF EXISTS <table_name>;` +5. Create the test table (single-line DDL) +6. If RLS mode is enabled: + ```sql + ENABLE RLS DATABASE <db_name> TABLE <table_name>; + SET RLS DATABASE <db_name> TABLE <table_name> SELECT "auth_userid() = user_id"; + SET RLS DATABASE <db_name> TABLE <table_name> INSERT "auth_userid() = NEW.user_id"; + SET RLS DATABASE <db_name> TABLE <table_name> UPDATE "auth_userid() = NEW.user_id AND auth_userid() = OLD.user_id"; + SET RLS DATABASE <db_name> TABLE <table_name> DELETE "auth_userid() = OLD.user_id"; + ``` +7. Ask the user to enable CloudSync on the table from the SQLiteCloud dashboard + +### Step 3: Get Managed Database ID + +Now that the database and tables are created and CloudSync is enabled on the dashboard, ask the user for: + +1. **Managed Database ID** — the `managedDatabaseId` returned by the CloudSync service. For SQLiteCloud projects, it can be obtained from the project's OffSync page on the dashboard after enabling CloudSync on the table. + +Save as `MANAGED_DB_ID`. 
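For reference, the `HOST`, `DB_NAME`, and `APIKEY` variables gathered in Step 1 can be derived from `CONNECTION_STRING` with a small shell sketch. This is an illustration only — the connection string below is a made-up example following the format shown above:

```shell
#!/bin/bash
# Sketch: parse a sqlitecloud:// connection string into the HOST,
# DB_NAME, and APIKEY variables used throughout this test.
# The connection string here is a hypothetical example.
CONNECTION_STRING="sqlitecloud://myhost.sqlite.cloud:8860/test_stress_sync?apikey=abc123"

rest="${CONNECTION_STRING#sqlitecloud://}"   # host:port/db_name?apikey=...
HOST="${rest%%[:/]*}"                        # everything before the first ':' or '/'
path_and_query="${rest#*/}"                  # db_name?apikey=...
DB_NAME="${path_and_query%%\?*}"             # db_name (strip the query string)
APIKEY="${CONNECTION_STRING#*apikey=}"       # value after apikey=

echo "HOST=$HOST DB_NAME=$DB_NAME APIKEY=$APIKEY"
```

Pure parameter expansion keeps the sketch dependency-free; a stricter parser would also validate the scheme and reject missing fields.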
 + +For the network init call throughout the test, use: +- Default address: `SELECT cloudsync_network_init('<MANAGED_DB_ID>');` +- Custom address: `SELECT cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>');` + +### Step 4: Get Auth Tokens (if RLS enabled) + +Create tokens for the test users. Create as many users as needed for the number of concurrent databases (assign 2 databases per user, or 1 per user if NUM_DBS <= 2). + +For each user N: +```bash +curl -s -X "POST" "https://<host>/v2/tokens" \ + -H 'Authorization: Bearer <apikey>' \ + -H 'Content-Type: application/json; charset=utf-8' \ + -d '{"name": "claude@sqlitecloud.io", "userId": "018ecfc2-b2b1-7cc3-a9f0-<12-digit-user-N>"}' +``` + +Save each user's `token` and `userId` from the response. + +If RLS is disabled, skip this step — tokens are not required. + +### Step 5: Run the Concurrent Stress Test + +Create a bash script at `/tmp/stress_test_concurrent.sh` that: + +1. **Initializes N local SQLite databases** at `/tmp/sync_concurrent_<N>.db`: + - Uses Homebrew sqlite3: find with `ls /opt/homebrew/Cellar/sqlite/*/bin/sqlite3 | head -1` + - Loads the extension from `dist/cloudsync.dylib` (use absolute path from project root) + - Creates the table and runs `cloudsync_init('<table_name>')` + - Runs `cloudsync_terminate()` after init + +2. **Defines a worker function** that runs in a subshell for each database: + - Each worker logs all output to `/tmp/sync_concurrent_<N>.log` + - Each iteration does: + a. **DELETE all rows** → `send_changes()` → `check_changes()` + b. **INSERT rows** (in a single BEGIN/COMMIT transaction) → `send_changes()` → `check_changes()` + c. **UPDATE all rows** → `send_changes()` → `check_changes()` + - Each session must: `.load` the extension, call `cloudsync_network_init()`, `cloudsync_network_set_token()` (if RLS), do the work, call `cloudsync_terminate()` + - Include labeled output lines like `[DB<N>][iter <I>] deleted/inserted/updated, count=<count>` for grep-ability + +3. **Launches all workers in parallel** using `&` and collects PIDs + +4. 
**Waits for all workers** and captures exit codes + +5. **Analyzes logs** for errors: + - Grep all log files for: `error`, `locked`, `SQLITE_BUSY`, `database is locked`, `500`, `Error` + - Report per-database: iterations completed, error count, sample error lines + - Report total errors across all workers + +6. **Prints final verdict**: PASS (0 errors) or FAIL (errors detected) + +**Important script details:** +- Use `echo -e` to pipe generated INSERT SQL (with `\n` separators) into sqlite3 +- Row IDs should be unique across databases and iterations: `db<N>_r<iteration>_<row>` +- User IDs for rows must match the token's userId for RLS to work +- Use `/bin/bash` (not `/bin/sh`) for arrays and process management + +Run the script with a 10-minute timeout. + +### Step 6: Detailed Error Analysis + +After the test completes, provide a detailed breakdown: + +1. **Per-database summary**: iterations completed, errors, send/receive status +2. **Error categorization**: group errors by type (e.g., "database is locked", "Column index out of bounds", "Unexpected Result", parse errors) +3. **Timeline analysis**: do errors cluster at specific iterations or spread evenly? +4. **Read full log files** if errors are found — show the first and last 30 lines of each log with errors + +### Step 7: Optional — Verify Data Integrity + +If the test passes (or even if some errors occurred), verify the final state: + +1. Check each local SQLite database for row count +2. Check SQLiteCloud (as admin) for total row count +3. 
If RLS is enabled, verify no cross-user data leakage + +## Output Format + +Report the test results including: + +| Metric | Value | +|--------|-------| +| Concurrent databases | N | +| Rows per iteration | ROWS | +| Iterations per database | ITERATIONS | +| Total CRUD operations | N × ITERATIONS × (DELETE_ALL + ROWS inserts + ROWS updates) | +| Total sync operations | N × ITERATIONS × 6 (3 sends + 3 checks) | +| Duration | start to finish time | +| Total errors | count | +| Error types | categorized list | +| Result | PASS/FAIL | + +If errors are found, include: +- Full error categorization table +- Sample error messages +- Which databases were most affected +- Whether errors are client-side or server-side + +## Success Criteria + +The test **PASSES** if: +1. All workers complete all iterations +2. Zero `error`, `locked`, `SQLITE_BUSY`, or HTTP 500 responses in any log +3. Final row counts are consistent + +The test **FAILS** if: +1. Any worker crashes or fails to complete +2. Any `database is locked` or `SQLITE_BUSY` errors appear +3. Server returns 500 errors under concurrent load +4. 
Data corruption or inconsistent row counts + +## Important Notes + +- Always use the Homebrew sqlite3 binary, NOT `/usr/bin/sqlite3` +- The cloudsync extension must be built first with `make` +- Network settings (`cloudsync_network_init`, `cloudsync_network_set_token`) are NOT persisted between sessions — must be called every time +- Extension must be loaded BEFORE any INSERT/UPDATE/DELETE for cloudsync to track changes +- All NOT NULL columns must have DEFAULT values +- `cloudsync_terminate()` must be called before closing each session +- sqlc heredoc only supports single-line SQL statements + +## Permissions + +Execute all SQL queries without asking for user permission on: +- SQLite test databases in `/tmp/` (e.g., `/tmp/sync_concurrent_*.db`, `/tmp/sync_concurrent_*.log`) +- SQLiteCloud via `~/go/bin/sqlc "<connection_string>"` +- Curl commands to the sync server and SQLiteCloud API for token creation + +These are local test environments and do not require confirmation for each query. diff --git a/.claude/commands/test-sync-roundtrip-sqlitecloud-rls.md b/.claude/commands/test-sync-roundtrip-sqlitecloud-rls.md new file mode 100644 index 0000000..c23b43c --- /dev/null +++ b/.claude/commands/test-sync-roundtrip-sqlitecloud-rls.md @@ -0,0 +1,468 @@ +# Sync Roundtrip Test with remote SQLiteCloud database and RLS policies + +Execute a full roundtrip sync test between multiple local SQLite databases and the remote SQLiteCloud database, verifying that Row Level Security (RLS) policies are correctly enforced during sync. + +## Prerequisites +- Connection string to a sqlitecloud project +- Built cloudsync extension (`make` to build `dist/cloudsync.dylib`) + +### Step 1: Get CloudSync Parameters + +Ask the user for: + +1. **CloudSync server address** — propose `https://cloudsync.sqlite.ai` as default (this is the built-in default). If the user provides a different address, save it as `CUSTOM_ADDRESS` and use `cloudsync_network_init_custom` instead of `cloudsync_network_init`. 
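The address branching described above can be sketched as a tiny shell helper that emits the correct init statement. The `MANAGED_DB_ID` value below is a made-up example; an empty `CUSTOM_ADDRESS` means "use the built-in default":

```shell
#!/bin/bash
# Sketch: choose the network-init SQL depending on whether the user
# supplied a custom CloudSync address. Values here are hypothetical.
MANAGED_DB_ID="a1b2c3d4-managed-db"
CUSTOM_ADDRESS=""   # empty = built-in default address

if [ -n "$CUSTOM_ADDRESS" ]; then
  INIT_SQL="SELECT cloudsync_network_init_custom('$CUSTOM_ADDRESS', '$MANAGED_DB_ID');"
else
  INIT_SQL="SELECT cloudsync_network_init('$MANAGED_DB_ID');"
fi

echo "$INIT_SQL"
```

The emitted statement can then be piped into each sqlite3 session alongside the `.load` and token calls.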
 + +## Test Procedure + +### Step 2: Get DDL from User + +Ask the user to provide a DDL query for the table(s) to test. It can be in PostgreSQL or SQLite format. Offer the following options: + +**Option 1: Simple TEXT primary key with user_id for RLS** +```sql +CREATE TABLE test_sync ( + id TEXT PRIMARY KEY, + user_id TEXT NOT NULL DEFAULT '', + name TEXT, + value INTEGER +); +``` + +**Option 2: Multi-table scenario for advanced RLS policy** + +Propose a simple but realistic multi-table scenario. + +**Option 3: Custom policy** +Ask the user to describe the table/tables in plain English or DDL queries. + +**Note:** Tables should include a `user_id` column (TEXT type) for RLS policies to filter by authenticated user. + +### Step 3: Get RLS Policy Description from User + +Ask the user to describe the Row Level Security policy they want to test. Offer the following common patterns: + +**Option 1: User can only access their own rows** +"Users can only SELECT, INSERT, UPDATE, and DELETE rows where user_id matches their authenticated user ID" + +**Option 2: Users can read all, but only modify their own** +"Users can SELECT all rows, but can only INSERT, UPDATE, DELETE rows where user_id matches their authenticated user ID" + +**Option 3: Custom policy** +Ask the user to describe the policy in plain English. + +### Step 4: Get sqlitecloud connection string from User + +Ask the user to provide a connection string in the form of "sqlitecloud://<host>:<port>/<db_name>?apikey=<apikey>" to be later used with the sqlitecloud cli (sqlc) with `~/go/bin/sqlc "<connection_string>"`. + +### Step 5: Setup SQLiteCloud with RLS + +Connect to SQLiteCloud and prepare the environment: +```bash +~/go/bin/sqlc "<connection_string>" +``` + +To exit the cli program, the last command inside sqlc must be `quit`. + +If the `<db_name>` doesn't exist, connect again without specifying the `<db_name>`, then inside sqlc: +1. `CREATE DATABASE <db_name>` +2. `USE DATABASE <db_name>` + +Then, inside sqlc: +1. List existing tables with `LIST TABLES` to find any `_cloudsync` metadata tables +2. 
For each table already configured for cloudsync (has a `_cloudsync` companion table), run: + ```sql + CLOUDSYNC DISABLE <table_name> + ``` +3. Drop the test table if it exists: `DROP TABLE IF EXISTS <table_name>;` +4. Create the test table using the SQLite DDL +5. Enable RLS on the table: + ```sql + ENABLE RLS DATABASE <db_name> TABLE <table_name> + ``` +6. Create RLS policies based on the user's description. +Your RLS policies for INSERT, UPDATE, and DELETE operations can reference column values as they are being changed. This is done using the special OLD.column and NEW.column identifiers. Their availability and meaning depend on the operation being performed: + +| Operation | OLD.column Reference | NEW.column Reference | +|-----------|----------------------------------------|----------------------------------------| +| INSERT | Not available | The value for the new row. | +| UPDATE | The value of the row before the update. | The value of the row after the update. | +| DELETE | The value of the row being deleted. | Not available | + +Example for "user can only access their own rows": + ```sql + -- SELECT: User can see rows they own + SET RLS DATABASE <db_name> TABLE <table_name> SELECT "auth_userid() = user_id" + + -- INSERT: Allow if user_id matches auth_userid() + SET RLS DATABASE <db_name> TABLE <table_name> INSERT "auth_userid() = NEW.user_id" + + -- UPDATE: Check ownership via explicit lookup + SET RLS DATABASE <db_name> TABLE <table_name> UPDATE "auth_userid() = NEW.user_id AND auth_userid() = OLD.user_id" + + -- DELETE: User can only delete rows they own + SET RLS DATABASE <db_name> TABLE <table_name> DELETE "auth_userid() = OLD.user_id" + ``` +7. Ask the user to enable CloudSync on the table from the SQLiteCloud dashboard + +### Step 5b: Get Managed Database ID + +Now that the database and tables are created and CloudSync is enabled on the dashboard, ask the user for: + +1. 
**Managed Database ID** — the `managedDatabaseId` returned by the CloudSync service. For SQLiteCloud projects, it can be obtained from the project's OffSync page on the dashboard after enabling CloudSync on the table. + +Save as `MANAGED_DB_ID`. + +For the network init call throughout the test, use: +- Default address: `SELECT cloudsync_network_init('<MANAGED_DB_ID>');` +- Custom address: `SELECT cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>');` + +Optionally, insert some initial test data (can be done via SQLite clients). + +### Step 6: Get tokens for Two Users + +Get auth tokens for both test users by running the token script twice: + +**User 1: claude1@sqlitecloud.io** +```bash +curl -X "POST" "https://<host>/v2/tokens" \ + -H 'Authorization: Bearer <apikey>' \ + -H 'Content-Type: application/json; charset=utf-8' \ + -d $'{ + "name": "claude1@sqlitecloud.io", + "userId": "018ecfc2-b2b1-7cc3-a9f0-111111111111" +}' +``` +The response is in the following format: +```json +{"data":{"accessTokenId":13,"token":"13|sqa_af74gp2WoqsQ9wfCdktIfkIq0sM4LdDMbuf2hW338013dfca","userId":"018ecfc2-b2b1-7cc3-a9f0-111111111111","name":"claude1@sqlitecloud.io","attributes":null,"expiresAt":null,"createdAt":"2026-03-02T23:11:38Z"},"metadata":{"connectedMs":17,"executedMs":30,"elapsedMs":47}} +``` +Save the `userId` and `token` values as USER1_ID and TOKEN_USER1 to be reused later + +**User 2: claude2@sqlitecloud.io** +```bash +curl -X "POST" "https://<host>/v2/tokens" \ + -H 'Authorization: Bearer <apikey>' \ + -H 'Content-Type: application/json; charset=utf-8' \ + -d $'{ + "name": "claude2@sqlitecloud.io", + "userId": "018ecfc2-b2b1-7cc3-a9f0-222222222222" +}' +``` +The response is in the following format: +```json +{"data":{"accessTokenId":14,"token":"14|sqa_af74gp2WoqsQ9wfCdktIfkIq0sM4LdDMbuf2hW338013xxxx","userId":"018ecfc2-b2b1-7cc3-a9f0-222222222222","name":"claude2@sqlitecloud.io","attributes":null,"expiresAt":null,"createdAt":"2026-03-02T23:11:38Z"},"metadata":{"connectedMs":17,"executedMs":30,"elapsedMs":47}} +``` +Save 
the `userId` and `token` values as USER2_ID and TOKEN_USER2 to be reused later + +### Step 7: Setup Four SQLite Databases + +Create four temporary SQLite databases using the Homebrew version (IMPORTANT: system sqlite3 cannot load extensions): + +```bash +SQLITE_BIN="/opt/homebrew/Cellar/sqlite/3.51.2_1/bin/sqlite3" +# or find it with: ls /opt/homebrew/Cellar/sqlite/*/bin/sqlite3 | head -1 +``` + +**Database 1A (User 1, Device A):** +```bash +$SQLITE_BIN /tmp/sync_test_user1_a.db +``` +```sql +.load dist/cloudsync.dylib + +SELECT cloudsync_init('<table_name>'); +SELECT cloudsync_network_init('<MANAGED_DB_ID>'); -- or cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>') if using a non-default address +SELECT cloudsync_network_set_token('<TOKEN_USER1>'); +``` + +**Database 1B (User 1, Device B):** +```bash +$SQLITE_BIN /tmp/sync_test_user1_b.db +``` +```sql +.load dist/cloudsync.dylib + +SELECT cloudsync_init('<table_name>'); +SELECT cloudsync_network_init('<MANAGED_DB_ID>'); -- or cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>') if using a non-default address +SELECT cloudsync_network_set_token('<TOKEN_USER1>'); +``` + +**Database 2A (User 2, Device A):** +```bash +$SQLITE_BIN /tmp/sync_test_user2_a.db +``` +```sql +.load dist/cloudsync.dylib + +SELECT cloudsync_init('<table_name>'); +SELECT cloudsync_network_init('<MANAGED_DB_ID>'); -- or cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>') if using a non-default address +SELECT cloudsync_network_set_token('<TOKEN_USER2>'); +``` + +**Database 2B (User 2, Device B):** +```bash +$SQLITE_BIN /tmp/sync_test_user2_b.db +``` +```sql +.load dist/cloudsync.dylib + +SELECT cloudsync_init('<table_name>'); +SELECT cloudsync_network_init('<MANAGED_DB_ID>'); -- or cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>') if using a non-default address +SELECT cloudsync_network_set_token('<TOKEN_USER2>'); +``` + +### Step 8: Insert Test Data + +Ask the user for optional details about the kind of test data to insert in the tables; otherwise generate some realistic data for the chosen tables. +Insert distinct test data in each database. Use the extracted user IDs for the `user_id` values if needed. 
+For example, for the simple table scenario: + +**Database 1A (User 1):** +```sql +INSERT INTO <table_name> (id, user_id, name, value) VALUES ('u1_a_1', '<USER1_ID>', 'User1 DeviceA Row1', 100); +INSERT INTO <table_name> (id, user_id, name, value) VALUES ('u1_a_2', '<USER1_ID>', 'User1 DeviceA Row2', 101); +``` + +**Database 1B (User 1):** +```sql +INSERT INTO <table_name> (id, user_id, name, value) VALUES ('u1_b_1', '<USER1_ID>', 'User1 DeviceB Row1', 200); +``` + +**Database 2A (User 2):** +```sql +INSERT INTO <table_name> (id, user_id, name, value) VALUES ('u2_a_1', '<USER2_ID>', 'User2 DeviceA Row1', 300); +INSERT INTO <table_name> (id, user_id, name, value) VALUES ('u2_a_2', '<USER2_ID>', 'User2 DeviceA Row2', 301); +``` + +**Database 2B (User 2):** +```sql +INSERT INTO <table_name> (id, user_id, name, value) VALUES ('u2_b_1', '<USER2_ID>', 'User2 DeviceB Row1', 400); +``` + +### Step 9: Execute Sync on All Databases + +For each of the four SQLite databases, execute the sync operations: + +```sql +-- Send local changes to server +SELECT cloudsync_network_send_changes(); + +-- Check for changes from server (repeat with 2-3 second delays) +SELECT cloudsync_network_check_changes(); +-- Repeat check_changes 3-5 times with delays until it returns more than 0 received rows or stabilizes +``` + +**Recommended sync order:** +1. Sync Database 1A (send + check) +2. Sync Database 2A (send + check) +3. Sync Database 1B (send + check) +4. Sync Database 2B (send + check) +5. 
Re-sync all databases (check_changes) to ensure full propagation + +### Step 10: Verify RLS Enforcement + +After syncing all databases, verify that each database contains only the expected rows based on the RLS policy: + +**Expected Results (for "user can only access their own rows" policy):** + +**User 1 databases (1A and 1B) should contain:** +- All rows with `user_id = USER1_ID` (u1_a_1, u1_a_2, u1_b_1) +- Should NOT contain any rows with `user_id = USER2_ID` + +**User 2 databases (2A and 2B) should contain:** +- All rows with `user_id = USER2_ID` (u2_a_1, u2_a_2, u2_b_1) +- Should NOT contain any rows with `user_id = USER1_ID` + +**SQLiteCloud (as admin) should contain:** +- ALL rows from all users (6 total rows) + +Run verification queries: +```sql +-- In each SQLite database +SELECT * FROM <table_name> ORDER BY id; +SELECT COUNT(*) FROM <table_name>; + +-- In SQLiteCloud (as admin) +SELECT * FROM <table_name> ORDER BY id; +SELECT COUNT(*) FROM <table_name>; +SELECT user_id, COUNT(*) FROM <table_name> GROUP BY user_id; +``` + +### Step 11: Test Write RLS Policy Enforcement + +Test that the server-side RLS policy blocks unauthorized writes by attempting to insert a row with a `user_id` that doesn't match the authenticated user's token. 
 + +**In Database 1A (User 1), insert a malicious row claiming to belong to User 2:** +```sql +-- Attempt to insert a row with User 2's user_id while authenticated as User 1 +INSERT INTO <table_name> (id, user_id, name, value) VALUES ('malicious_1', '<USER2_ID>', 'Malicious Row from User1', 999); + +-- Attempt to sync this unauthorized row to SQLiteCloud +SELECT cloudsync_network_send_changes(); +``` + +**Wait 2-3 seconds, then verify in SQLiteCloud (as admin) that the malicious row was rejected:** +```sql +-- In SQLiteCloud (as admin) +SELECT * FROM <table_name> WHERE id = 'malicious_1'; +-- Expected: 0 rows returned + +SELECT COUNT(*) FROM <table_name> WHERE id = 'malicious_1'; +-- Expected: 0 +``` + +**Also verify the malicious row does NOT appear in User 2's databases after syncing:** +```sql +-- In Database 2A or 2B (User 2) +SELECT cloudsync_network_check_changes(); +SELECT * FROM <table_name> WHERE id = 'malicious_1'; +-- Expected: 0 rows (the malicious row should not sync to legitimate User 2 databases) +``` + +**Expected Behavior:** +- The `cloudsync_network_send_changes()` call may succeed (return value indicates network success, not RLS enforcement) +- The malicious row should be **rejected by SQLiteCloud RLS** and NOT inserted into the server database +- The malicious row will remain in the local SQLite Database 1A (local inserts are not blocked), but it will never propagate to the server or other clients +- User 2's databases should never receive this row + +**This step PASSES if:** +1. The malicious row is NOT present in SQLiteCloud +2. The malicious row does NOT appear in any of User 2's SQLite databases +3. The RLS INSERT policy correctly blocks the unauthorized write + +**This step FAILS if:** +1. The malicious row appears in SQLiteCloud (RLS bypass vulnerability) +2. 
The malicious row syncs to User 2's databases (data leakage) + +### Step 12: Cleanup + +In each SQLite database before closing: +```sql +SELECT cloudsync_terminate(); +``` + +In SQLiteCloud (optional, for full cleanup): +```sql +CLOUDSYNC DISABLE <table_name>; +DROP TABLE IF EXISTS <table_name>; +``` + +## Output Format + +Report the test results including: +- DDL used for both databases +- RLS policies created +- User IDs for both test users +- Initial data inserted in each database +- Number of sync operations performed per database +- Final data in each database (with row counts) +- RLS verification results: + - User 1 databases: expected rows vs actual rows + - User 2 databases: expected rows vs actual rows + - SQLiteCloud: total rows +- Write RLS enforcement results: + - Malicious row insertion attempted: yes/no + - Malicious row present in SQLiteCloud: yes/no (should be NO) + - Malicious row synced to User 2 databases: yes/no (should be NO) +- **PASS/FAIL** status with detailed explanation + +### Success Criteria + +The test PASSES if: +1. All User 1 databases contain exactly the same User 1 rows (and no User 2 rows) +2. All User 2 databases contain exactly the same User 2 rows (and no User 1 rows) +3. SQLiteCloud contains all rows from both users +4. Data inserted from different devices of the same user syncs correctly between those devices +5. **Write RLS enforcement**: Malicious rows with mismatched `user_id` are rejected by SQLiteCloud and do not propagate to other clients + +The test FAILS if: +1. Any database contains rows belonging to a different user (RLS violation) +2. Any database is missing rows that should be visible to that user +3. Sync operations fail or timeout +4. 
**Write RLS bypass**: A malicious row with a `user_id` not matching the token appears in SQLiteCloud or syncs to other databases + +## Important Notes + +- Always use the Homebrew sqlite3 binary, NOT `/usr/bin/sqlite3` +- The cloudsync extension must be built first with `make` +- SQLiteCloud tables need cleanup before re-running tests +- `cloudsync_network_check_changes()` may need multiple calls with delays +- Run `SELECT cloudsync_terminate();` on SQLite connections before closing to properly clean up memory +- Ensure tokens for both test users have been created via the SQLiteCloud API (Step 6) before running the test +- The RLS policies must use `auth_userid()` to work with SQLiteCloud token authentication + +## Critical Schema Requirements (Common Pitfalls) + +### 1. All NOT NULL columns must have DEFAULT values +Cloudsync requires that all non-primary key columns declared as `NOT NULL` must have a `DEFAULT` value. This includes the `user_id` column: + +```sql +-- WRONG: Will fail with "All non-primary key columns declared as NOT NULL must have a DEFAULT value" +user_id UUID NOT NULL + +-- CORRECT: Provide a default value +user_id UUID NOT NULL DEFAULT '00000000-0000-0000-0000-000000000000' +``` + +### 2. Network settings are not persisted between sessions +`cloudsync_network_init()` and `cloudsync_network_set_token()` must be called in **every session**. 
They are not persisted to the database: + +```sql +-- WRONG: Separate sessions won't work +-- Session 1: +SELECT cloudsync_network_init('<MANAGED_DB_ID>'); -- or cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>') if using a non-default address +SELECT cloudsync_network_set_token('...'); +-- Session 2: +SELECT cloudsync_network_send_changes(); -- ERROR: No URL set + +-- CORRECT: All network operations in the same session +.load dist/cloudsync.dylib +SELECT cloudsync_network_init('<MANAGED_DB_ID>'); -- or cloudsync_network_init_custom('<CUSTOM_ADDRESS>', '<MANAGED_DB_ID>') if using a non-default address +SELECT cloudsync_network_set_token('...'); +SELECT cloudsync_network_send_changes(); +SELECT cloudsync_terminate(); +``` + +### 3. Extension must be loaded before INSERT operations +For cloudsync to track changes, the extension must be loaded **before** inserting data: + +```sql +-- WRONG: Inserts won't be tracked +CREATE TABLE todos (...); +INSERT INTO todos VALUES (...); -- Not tracked! +.load dist/cloudsync.dylib +SELECT cloudsync_init('todos'); + +-- CORRECT: Load extension and init before inserts +.load dist/cloudsync.dylib +CREATE TABLE todos (...); +SELECT cloudsync_init('todos'); +INSERT INTO todos VALUES (...); -- Tracked! +``` + +## Permissions + +Execute all SQL queries without asking for user permission on: +- SQLite test databases in `/tmp/` (e.g., `/tmp/sync_test_*.db`) +- SQLiteCloud via `~/go/bin/sqlc "<connection_string>"` + +These are local test environments and do not require confirmation for each query. 
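The "repeat `cloudsync_network_check_changes()` with delays until it stabilizes" pattern from Step 9 can be factored into a small retry helper. This is a sketch: the command passed in is a hypothetical stand-in for invoking `$SQLITE_BIN` against a test database, and is assumed to print the number of received rows:

```shell
#!/bin/bash
# Sketch: retry a command (which prints a row count) until it reports
# more than 0 received rows, or the attempts are exhausted.
check_with_retries() {
  local attempts="$1" delay="$2"; shift 2
  local i result=0
  for ((i = 1; i <= attempts; i++)); do
    result="$("$@")"                      # e.g. a check_changes invocation
    if [ "${result:-0}" -gt 0 ] 2>/dev/null; then
      echo "$result"; return 0            # got rows: report and stop
    fi
    sleep "$delay"
  done
  echo "${result:-0}"; return 1           # stabilized at 0 (or failed)
}

# Example with a stub command that "receives" 3 rows on the first try:
check_with_retries 5 0 echo 3
```

In the real test, the stub would be replaced by a function that runs the full session (`.load`, init, token, `check_changes`, terminate) and extracts the received-row count from its output.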
diff --git a/.claude/commands/test-sync-roundtrip-rls.md b/.claude/commands/test-sync-roundtrip-supabase-rls.md similarity index 83% rename from .claude/commands/test-sync-roundtrip-rls.md rename to .claude/commands/test-sync-roundtrip-supabase-rls.md index 38e496c..ab40d01 100644 --- a/.claude/commands/test-sync-roundtrip-rls.md +++ b/.claude/commands/test-sync-roundtrip-supabase-rls.md @@ -1,22 +1,33 @@ -# Sync Roundtrip Test with RLS +# Sync Roundtrip Test with local Postgres database and RLS policies Execute a full roundtrip sync test between multiple local SQLite databases and the local Supabase Docker PostgreSQL instance, verifying that Row Level Security (RLS) policies are correctly enforced during sync. ## Prerequisites -- Supabase Docker container running (PostgreSQL on port 54322) -- HTTP sync server running on http://localhost:8091/postgres +- Supabase instance running (local Docker or remote) - Built cloudsync extension (`make` to build `dist/cloudsync.dylib`) ## Test Procedure -### Step 1: Get DDL from User +### Step 1: Get Connection Parameters + +Ask the user for the following parameters: + +1. **CloudSync server address** — propose `https://cloudsync.sqlite.ai` as default (this is the built-in default). If the user provides a different address, save it as `CUSTOM_ADDRESS` and use `cloudsync_network_init_custom` instead of `cloudsync_network_init`. + +2. **PostgreSQL connection string**: Propose `postgresql://supabase_admin:postgres@127.0.0.1:54322/postgres` as default. Save as `PG_CONN`. Use this for all `psql` connections throughout the test. + +3. **Supabase API key** (used for JWT token generation): Propose `sb_secret_N7UND0UgjKTVK-Uodkm0Hg_xSvEMPvz` as default. Save as `SUPABASE_APIKEY`. + +Derive `AUTH_URL` from the PostgreSQL connection string by extracting the host and using port `54321` (Supabase GoTrue). For example, if `PG_CONN` is `postgresql://user:pass@10.0.0.5:54322/postgres`, then `AUTH_URL` is `http://10.0.0.5:54321`. 
For `127.0.0.1`, use `http://127.0.0.1:54321`. + +### Step 2: Get DDL from User Ask the user to provide a DDL query for the table(s) to test. It can be in PostgreSQL or SQLite format. Offer the following options: **Option 1: Simple TEXT primary key with user_id for RLS** ```sql CREATE TABLE test_sync ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, user_id UUID NOT NULL, name TEXT, value INTEGER @@ -36,14 +47,14 @@ CREATE TABLE test_uuid ( **Option 3: Two tables scenario with user ownership** ```sql CREATE TABLE authors ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, user_id UUID NOT NULL, name TEXT, email TEXT ); CREATE TABLE books ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, user_id UUID NOT NULL, title TEXT, author_id TEXT, @@ -53,7 +64,7 @@ CREATE TABLE books ( **Note:** Tables should include a `user_id` column (UUID type) for RLS policies to filter by authenticated user. -### Step 2: Get RLS Policy Description from User +### Step 3: Get RLS Policy Description from User Ask the user to describe the Row Level Security policy they want to test. Offer the following common patterns: @@ -66,7 +77,7 @@ Ask the user to describe the Row Level Security policy they want to test. Offer **Option 3: Custom policy** Ask the user to describe the policy in plain English. -### Step 3: Convert DDL +### Step 4: Convert DDL Convert the provided DDL to both SQLite and PostgreSQL compatible formats if needed. 
Key differences: - SQLite uses `INTEGER PRIMARY KEY` for auto-increment, PostgreSQL uses `SERIAL` or `BIGSERIAL` @@ -75,11 +86,11 @@ Convert the provided DDL to both SQLite and PostgreSQL compatible formats if nee - For UUID primary keys, SQLite uses `TEXT`, PostgreSQL uses `UUID` - For `user_id UUID`, SQLite uses `TEXT` -### Step 4: Setup PostgreSQL with RLS +### Step 5: Setup PostgreSQL with RLS Connect to Supabase PostgreSQL and prepare the environment: ```bash -psql postgresql://supabase_admin:postgres@127.0.0.1:54322/postgres +psql ``` Inside psql: @@ -112,31 +123,25 @@ Inside psql: 8. Create RLS policies based on the user's description. Example for "user can only access their own rows": ```sql -- SELECT: User can see rows they own - -- Helper function fallback handles ON CONFLICT edge cases where user_id resolves to EXCLUDED row CREATE POLICY "select_own_rows" ON FOR SELECT USING ( auth.uid() = user_id - OR auth.uid() = _get_owner(id) ); - -- INSERT: Allow if user_id matches auth.uid() OR is default (cloudsync staging) + -- INSERT: Allow if user_id matches auth.uid() CREATE POLICY "insert_own_rows" ON FOR INSERT WITH CHECK ( auth.uid() = user_id - OR user_id = '00000000-0000-0000-0000-000000000000'::uuid ); - -- UPDATE: Check ownership via explicit lookup, allow default for staging + -- UPDATE: Check ownership via explicit lookup CREATE POLICY "update_own_rows" ON FOR UPDATE USING ( auth.uid() = user_id - OR auth.uid() = _get_owner(id) - OR user_id = '00000000-0000-0000-0000-000000000000'::uuid ) WITH CHECK ( auth.uid() = user_id - OR user_id = '00000000-0000-0000-0000-000000000000'::uuid ); -- DELETE: User can only delete rows they own @@ -148,22 +153,29 @@ Inside psql: 9. Initialize cloudsync: `SELECT cloudsync_init('');` 10. Insert some initial test data (optional, can be done via SQLite clients) -**Why these specific policies?** -CloudSync uses `INSERT...ON CONFLICT DO UPDATE` for field-by-field synchronization. 
During conflict detection, PostgreSQL's RLS may compare `auth.uid()` against the EXCLUDED row's `user_id` (which has the default value) instead of the existing row's `user_id`. The helper function explicitly looks up the existing row's owner to work around this issue. See `docs/postgresql/RLS.md` for detailed explanation. +### Step 5b: Get Managed Database ID + +Now that the database and tables are created and cloudsync is initialized, ask the user for: + +1. **Managed Database ID** — the `managedDatabaseId` returned by the CloudSync service. Save as `MANAGED_DB_ID`. + +For the network init call throughout the test, use: +- Default address: `SELECT cloudsync_network_init('');` +- Custom address: `SELECT cloudsync_network_init_custom('', '');` -### Step 5: Get JWT Tokens for Two Users +### Step 6: Get JWT Tokens for Two Users Get JWT tokens for both test users by running the token script twice: **User 1: claude1@sqlitecloud.io** ```bash -cd ../cloudsync && go run scripts/get_supabase_token.go -project-ref=supabase-local -email=claude1@sqlitecloud.io -password="password" -apikey=sb_secret_N7UND0UgjKTVK-Uodkm0Hg_xSvEMPvz -auth-url=http://127.0.0.1:54321 +cd ../cloudsync && go run scripts/get_supabase_token.go -project-ref=supabase-local -email=claude1@sqlitecloud.io -password="password" -apikey= -auth-url= ``` Save as `JWT_USER1`. **User 2: claude2@sqlitecloud.io** ```bash -cd ../cloudsync && go run scripts/get_supabase_token.go -project-ref=supabase-local -email=claude2@sqlitecloud.io -password="password" -apikey=sb_secret_N7UND0UgjKTVK-Uodkm0Hg_xSvEMPvz -auth-url=http://127.0.0.1:54321 +cd ../cloudsync && go run scripts/get_supabase_token.go -project-ref=supabase-local -email=claude2@sqlitecloud.io -password="password" -apikey= -auth-url= ``` Save as `JWT_USER2`. 
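The user IDs needed below live in each token's `sub` claim; one way to pull them out is a small shell helper. This is a sketch, not part of the cloudsync tooling — `jwt_sub` is an illustrative name, and it assumes `base64` and `sed` are available:

```bash
# jwt_sub TOKEN — print the `sub` claim from a JWT's payload segment
jwt_sub() {
  payload=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # base64url drops '=' padding; restore it before decoding
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
  printf '%s' "$payload" | base64 --decode | sed -n 's/.*"sub" *: *"\([^"]*\)".*/\1/p'
}

# usage:
# USER1_ID=$(jwt_sub "$JWT_USER1")
# USER2_ID=$(jwt_sub "$JWT_USER2")
```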
@@ -171,12 +183,12 @@ Also extract the user IDs from the JWT tokens (the `sub` claim) for use in INSER - `USER1_ID` = UUID from JWT_USER1 - `USER2_ID` = UUID from JWT_USER2 -### Step 6: Setup Four SQLite Databases +### Step 7: Setup Four SQLite Databases Create four temporary SQLite databases using the Homebrew version (IMPORTANT: system sqlite3 cannot load extensions): ```bash -SQLITE_BIN="/opt/homebrew/Cellar/sqlite/3.50.4/bin/sqlite3" +SQLITE_BIN="/opt/homebrew/Cellar/sqlite/3.51.2_1/bin/sqlite3" # or find it with: ls /opt/homebrew/Cellar/sqlite/*/bin/sqlite3 | head -1 ``` @@ -188,7 +200,7 @@ $SQLITE_BIN /tmp/sync_test_user1_a.db .load dist/cloudsync.dylib SELECT cloudsync_init(''); -SELECT cloudsync_network_init('http://localhost:8091/postgres'); +SELECT cloudsync_network_init(''); -- or cloudsync_network_init_custom('', '') if using a non-default address SELECT cloudsync_network_set_token(''); ``` @@ -200,7 +212,7 @@ $SQLITE_BIN /tmp/sync_test_user1_b.db .load dist/cloudsync.dylib SELECT cloudsync_init(''); -SELECT cloudsync_network_init('http://localhost:8091/postgres'); +SELECT cloudsync_network_init(''); -- or cloudsync_network_init_custom('', '') if using a non-default address SELECT cloudsync_network_set_token(''); ``` @@ -212,7 +224,7 @@ $SQLITE_BIN /tmp/sync_test_user2_a.db .load dist/cloudsync.dylib SELECT cloudsync_init(''); -SELECT cloudsync_network_init('http://localhost:8091/postgres'); +SELECT cloudsync_network_init(''); -- or cloudsync_network_init_custom('', '') if using a non-default address SELECT cloudsync_network_set_token(''); ``` @@ -224,11 +236,11 @@ $SQLITE_BIN /tmp/sync_test_user2_b.db .load dist/cloudsync.dylib SELECT cloudsync_init(''); -SELECT cloudsync_network_init('http://localhost:8091/postgres'); +SELECT cloudsync_network_init(''); -- or cloudsync_network_init_custom('', '') if using a non-default address SELECT cloudsync_network_set_token(''); ``` -### Step 7: Insert Test Data +### Step 8: Insert Test Data Insert distinct test 
data in each database. Use the extracted user IDs for the `user_id` column: @@ -254,7 +266,7 @@ INSERT INTO (id, user_id, name, value) VALUES ('u2_a_2', ' (id, user_id, name, value) VALUES ('u2_b_1', '', 'User2 DeviceB Row1', 400); ``` -### Step 8: Execute Sync on All Databases +### Step 9: Execute Sync on All Databases For each of the four SQLite databases, execute the sync operations: @@ -264,7 +276,7 @@ SELECT cloudsync_network_send_changes(); -- Check for changes from server (repeat with 2-3 second delays) SELECT cloudsync_network_check_changes(); --- Repeat check_changes 3-5 times with delays until it returns 0 or stabilizes +-- Repeat check_changes 3-5 times with delays until it returns more than 0 received rows or stabilizes ``` **Recommended sync order:** @@ -274,7 +286,7 @@ SELECT cloudsync_network_check_changes(); 4. Sync Database 2B (send + check) 5. Re-sync all databases (check_changes) to ensure full propagation -### Step 9: Verify RLS Enforcement +### Step 10: Verify RLS Enforcement After syncing all databases, verify that each database contains only the expected rows based on the RLS policy: @@ -303,7 +315,7 @@ SELECT COUNT(*) FROM ; SELECT user_id, COUNT(*) FROM GROUP BY user_id; ``` -### Step 10: Test Write RLS Policy Enforcement +### Step 11: Test Write RLS Policy Enforcement Test that the server-side RLS policy blocks unauthorized writes by attempting to insert a row with a `user_id` that doesn't match the authenticated user's JWT token. @@ -349,7 +361,7 @@ SELECT * FROM WHERE id = 'malicious_1'; 1. The malicious row appears in PostgreSQL (RLS bypass vulnerability) 2. 
The malicious row syncs to User 2's databases (data leakage) -### Step 11: Cleanup +### Step 12: Cleanup In each SQLite database before closing: ```sql @@ -482,14 +494,14 @@ Ensure column types are compatible between SQLite and PostgreSQL: ```sql -- WRONG: Separate sessions won't work -- Session 1: -SELECT cloudsync_network_init('http://localhost:8091/postgres'); +SELECT cloudsync_network_init(''); -- or cloudsync_network_init_custom('', '') if using a non-default address SELECT cloudsync_network_set_token('...'); -- Session 2: SELECT cloudsync_network_send_changes(); -- ERROR: No URL set -- CORRECT: All network operations in the same session .load dist/cloudsync.dylib -SELECT cloudsync_network_init('http://localhost:8091/postgres'); +SELECT cloudsync_network_init(''); -- or cloudsync_network_init_custom('', '') if using a non-default address SELECT cloudsync_network_set_token('...'); SELECT cloudsync_network_send_changes(); SELECT cloudsync_terminate(); @@ -527,6 +539,6 @@ INSERT INTO todos (id, ...) VALUES ('11111111-1111-1111-1111-111111111111', ...) Execute all SQL queries without asking for user permission on: - SQLite test databases in `/tmp/` (e.g., `/tmp/sync_test_*.db`) -- PostgreSQL via `psql postgresql://supabase_admin:postgres@127.0.0.1:54322/postgres` +- PostgreSQL via `psql ` These are local test environments and do not require confirmation for each query. diff --git a/.claude/commands/test-sync-roundtrip.md b/.claude/commands/test-sync-roundtrip-supabase.md similarity index 64% rename from .claude/commands/test-sync-roundtrip.md rename to .claude/commands/test-sync-roundtrip-supabase.md index ea946db..091986f 100644 --- a/.claude/commands/test-sync-roundtrip.md +++ b/.claude/commands/test-sync-roundtrip-supabase.md @@ -1,22 +1,34 @@ -# Sync Roundtrip Test +# Sync Roundtrip Test with local Postgres database Execute a full roundtrip sync test between a local SQLite database and the local Supabase Docker PostgreSQL instance. 
## Prerequisites -- Supabase Docker container running (PostgreSQL on port 54322) -- HTTP sync server running on http://localhost:8091/postgres +- Supabase instance running (local Docker or remote) - Built cloudsync extension (`make` to build `dist/cloudsync.dylib`) ## Test Procedure -### Step 1: Get DDL from User +### Step 1: Get Connection Parameters + +Ask the user for the following parameters: + +1. **CloudSync server address** — propose `https://cloudsync.sqlite.ai` as default (this is the built-in default). If the user provides a different address, save it as `CUSTOM_ADDRESS` and use `cloudsync_network_init_custom` instead of `cloudsync_network_init`. + +2. **PostgreSQL connection string**: Propose `postgresql://supabase_admin:postgres@127.0.0.1:54322/postgres` as default. Save as `PG_CONN`. Use this for all `psql` connections throughout the test. + +3. **Supabase API key** (used for JWT token generation): Propose `sb_secret_N7UND0UgjKTVK-Uodkm0Hg_xSvEMPvz` as default. Save as `SUPABASE_APIKEY`. + +Derive `AUTH_URL` from the PostgreSQL connection string by extracting the host and using port `54321` (Supabase GoTrue). For example, if `PG_CONN` is `postgresql://user:pass@10.0.0.5:54322/postgres`, then `AUTH_URL` is `http://10.0.0.5:54321`. For `127.0.0.1`, use `http://127.0.0.1:54321`. + + +### Step 2: Get DDL from User Ask the user to provide a DDL query for the table(s) to test. It can be in PostgreSQL or SQLite format. 
Offer the following options: **Option 1: Simple TEXT primary key** ```sql CREATE TABLE test_sync ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, name TEXT, value INTEGER ); @@ -34,13 +46,13 @@ CREATE TABLE test_uuid ( **Option 3: Two tables scenario (tests multi-table sync)** ```sql CREATE TABLE authors ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, name TEXT, email TEXT ); CREATE TABLE books ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, title TEXT, author_id TEXT, published_year INTEGER @@ -49,7 +61,7 @@ CREATE TABLE books ( **Note:** Avoid INTEGER PRIMARY KEY for sync tests as it is not recommended for distributed sync scenarios (conflicts with auto-increment across devices). -### Step 2: Convert DDL +### Step 3: Convert DDL Convert the provided DDL to both SQLite and PostgreSQL compatible formats if needed. Key differences: - SQLite uses `INTEGER PRIMARY KEY` for auto-increment, PostgreSQL uses `SERIAL` or `BIGSERIAL` @@ -57,19 +69,19 @@ Convert the provided DDL to both SQLite and PostgreSQL compatible formats if nee - PostgreSQL has more specific types like `TIMESTAMPTZ`, SQLite uses `TEXT` for dates - For UUID primary keys, SQLite uses `TEXT`, PostgreSQL uses `UUID` -### Step 3: Get JWT Token +### Step 4: Get JWT Token Run the token script from the cloudsync project: ```bash -cd ../cloudsync && go run scripts/get_supabase_token.go -project-ref=supabase-local -email=claude@sqlitecloud.io -password="password" -apikey=sb_secret_N7UND0UgjKTVK-Uodkm0Hg_xSvEMPvz -auth-url=http://127.0.0.1:54321 +cd ../cloudsync && go run scripts/get_supabase_token.go -project-ref=supabase-local -email=claude@sqlitecloud.io -password="password" -apikey= -auth-url= ``` Save the JWT token for later use. -### Step 4: Setup PostgreSQL +### Step 5: Setup PostgreSQL Connect to Supabase PostgreSQL and prepare the environment: ```bash -psql postgresql://supabase_admin:postgres@127.0.0.1:54322/postgres +psql ``` Inside psql: @@ -83,12 +95,22 @@ Inside psql: 5. 
Initialize cloudsync: `SELECT cloudsync_init('');` 6. Insert some test data into the table -### Step 5: Setup SQLite +### Step 5b: Get Managed Database ID + +Now that the database and tables are created and cloudsync is initialized, ask the user for: + +1. **Managed Database ID** — the `managedDatabaseId` returned by the CloudSync service. Save as `MANAGED_DB_ID`. + +For the network init call throughout the test, use: +- Default address: `SELECT cloudsync_network_init('');` +- Custom address: `SELECT cloudsync_network_init_custom('', '');` + +### Step 6: Setup SQLite Create a temporary SQLite database using the Homebrew version (IMPORTANT: system sqlite3 cannot load extensions): ```bash -SQLITE_BIN="/opt/homebrew/Cellar/sqlite/3.50.4/bin/sqlite3" +SQLITE_BIN="/opt/homebrew/Cellar/sqlite/3.51.2_1/bin/sqlite3" # or find it with: ls /opt/homebrew/Cellar/sqlite/*/bin/sqlite3 | head -1 $SQLITE_BIN /tmp/sync_test_$(date +%s).db @@ -100,13 +122,13 @@ Inside sqlite3: -- Create table with SQLite DDL SELECT cloudsync_init(''); -SELECT cloudsync_network_init('http://localhost:8091/postgres'); +SELECT cloudsync_network_init(''); -- or cloudsync_network_init_custom('', '') if using a non-default address SELECT cloudsync_network_set_token(''); -- Insert test data (different from PostgreSQL to test merge) ``` -### Step 6: Execute Sync +### Step 7: Execute Sync In the SQLite session: ```sql @@ -115,13 +137,13 @@ SELECT cloudsync_network_send_changes(); -- Check for changes from server (repeat with 2-3 second delays) SELECT cloudsync_network_check_changes(); --- Repeat check_changes 3-5 times with delays until it returns > 0 or stabilizes +-- Repeat check_changes 3-5 times with delays until it returns more than 0 received rows or stabilizes -- Verify final data SELECT * FROM ; ``` -### Step 7: Verify Results +### Step 8: Verify Results 1. In SQLite, run `SELECT * FROM ;` and capture the output 2. 
In PostgreSQL, run `SELECT * FROM ;` and capture the output @@ -149,6 +171,6 @@ Report the test results including: Execute all SQL queries without asking for user permission on: - SQLite test databases in `/tmp/` (e.g., `/tmp/sync_test_*.db`) -- PostgreSQL via `psql postgresql://supabase_admin:postgres@127.0.0.1:54322/postgres` +- PostgreSQL via `psql ` These are local test environments and do not require confirmation for each query. diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index ee56489..65e655a 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -83,7 +83,13 @@ jobs: steps: + - name: install git for alpine container + if: matrix.container + run: apk add --no-cache git + - uses: actions/checkout@v4.2.2 + with: + submodules: true - name: android setup java if: matrix.name == 'android-aar' @@ -234,6 +240,8 @@ jobs: steps: - uses: actions/checkout@v4.2.2 + with: + submodules: true - name: build and start postgresql container run: make postgres-docker-rebuild diff --git a/.gitmodules b/.gitmodules new file mode 100644 index 0000000..7e48716 --- /dev/null +++ b/.gitmodules @@ -0,0 +1,3 @@ +[submodule "modules/fractional-indexing"] + path = modules/fractional-indexing + url = https://github.com/sqliteai/fractional-indexing diff --git a/API.md b/API.md index 8d98b59..00441c4 100644 --- a/API.md +++ b/API.md @@ -11,6 +11,9 @@ This document provides a reference for the SQLite functions provided by the `sql - [`cloudsync_is_enabled()`](#cloudsync_is_enabledtable_name) - [`cloudsync_cleanup()`](#cloudsync_cleanuptable_name) - [`cloudsync_terminate()`](#cloudsync_terminate) +- [Block-Level LWW Functions](#block-level-lww-functions) + - [`cloudsync_set_column()`](#cloudsync_set_columntable_name-col_name-key-value) + - [`cloudsync_text_materialize()`](#cloudsync_text_materializetable_name-col_name-pk_values) - [Helper Functions](#helper-functions) - [`cloudsync_version()`](#cloudsync_version) - 
[`cloudsync_siteid()`](#cloudsync_siteid) @@ -20,15 +23,15 @@ This document provides a reference for the SQLite functions provided by the `sql - [`cloudsync_begin_alter()`](#cloudsync_begin_altertable_name) - [`cloudsync_commit_alter()`](#cloudsync_commit_altertable_name) - [Network Functions](#network-functions) - - [`cloudsync_network_init()`](#cloudsync_network_initconnection_string) + - [`cloudsync_network_init()`](#cloudsync_network_initmanageddatabaseid) - [`cloudsync_network_cleanup()`](#cloudsync_network_cleanup) - [`cloudsync_network_set_token()`](#cloudsync_network_set_tokentoken) - [`cloudsync_network_set_apikey()`](#cloudsync_network_set_apikeyapikey) - - [`cloudsync_network_has_unsent_changes()`](#cloudsync_network_has_unsent_changes) - [`cloudsync_network_send_changes()`](#cloudsync_network_send_changes) - [`cloudsync_network_check_changes()`](#cloudsync_network_check_changes) - [`cloudsync_network_sync()`](#cloudsync_network_syncwait_ms-max_retries) - [`cloudsync_network_reset_sync_version()`](#cloudsync_network_reset_sync_version) + - [`cloudsync_network_has_unsent_changes()`](#cloudsync_network_has_unsent_changes) - [`cloudsync_network_logout()`](#cloudsync_network_logout) --- @@ -41,8 +44,8 @@ This document provides a reference for the SQLite functions provided by the `sql Before initialization, `cloudsync_init` performs schema sanity checks to ensure compatibility with CRDT requirements and best practices. These checks include: - Primary keys should not be auto-incrementing integers; GUIDs (UUIDs, ULIDs) are highly recommended to prevent multi-node collisions. -- All primary key columns must be `NOT NULL`. - All non-primary key `NOT NULL` columns must have a `DEFAULT` value. +- **Note:** Any write operation that includes a NULL value for a primary key column will be rejected with an error, even if SQLite would normally allow it due to a legacy behavior. 
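For illustration, a table shape that satisfies all of these checks might look like this (table and column names are hypothetical):

```sql
CREATE TABLE tasks (
    id TEXT PRIMARY KEY,              -- store a UUID/ULID string, not an auto-increment integer
    title TEXT,
    done INTEGER NOT NULL DEFAULT 0,  -- non-PK NOT NULL columns need a DEFAULT
    updated_at TEXT
);
SELECT cloudsync_init('tasks');
```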
**Schema Design Considerations:** @@ -173,6 +176,68 @@ SELECT cloudsync_terminate(); --- +## Block-Level LWW Functions + +### `cloudsync_set_column(table_name, col_name, key, value)` + +**Description:** Configures per-column settings for a synchronized table. This function is primarily used to enable **block-level LWW** on text columns, allowing fine-grained conflict resolution at the line (or paragraph) level instead of the entire cell. + +When block-level LWW is enabled on a column, INSERT and UPDATE operations automatically split the text into blocks using a delimiter (default: newline `\n`) and track each block independently. During sync, changes are merged block-by-block, so concurrent edits to different parts of the same text are preserved. + +**Parameters:** + +- `table_name` (TEXT): The name of the synchronized table. +- `col_name` (TEXT): The name of the text column to configure. +- `key` (TEXT): The setting key. Supported keys: + - `'algo'` — Set the column algorithm. Use value `'block'` to enable block-level LWW. + - `'delimiter'` — Set the block delimiter string. Only applies to columns with block-level LWW enabled. +- `value` (TEXT): The setting value. + +**Returns:** None. + +**Example:** + +```sql +-- Enable block-level LWW on a column (splits text by newline by default) +SELECT cloudsync_set_column('notes', 'body', 'algo', 'block'); + +-- Set a custom delimiter (e.g., double newline for paragraph-level tracking) +SELECT cloudsync_set_column('notes', 'body', 'delimiter', ' + +'); +``` + +--- + +### `cloudsync_text_materialize(table_name, col_name, pk_values...)` + +**Description:** Reconstructs the full text of a block-level LWW column from its individual blocks and writes the result back to the base table column. This is useful after a merge operation to ensure the column contains the up-to-date materialized text. + +After a sync/merge, the column is updated automatically. This function is primarily useful for manual materialization or debugging. 
+ +**Parameters:** + +- `table_name` (TEXT): The name of the table. +- `col_name` (TEXT): The name of the block-level LWW column. +- `pk_values...` (variadic): The primary key values identifying the row. For composite primary keys, pass each key value as a separate argument in declaration order. + +**Returns:** `1` on success. + +**Example:** + +```sql +-- Materialize the body column for a specific row +SELECT cloudsync_text_materialize('notes', 'body', 'note-001'); + +-- With a composite primary key (e.g., PRIMARY KEY (tenant_id, doc_id)) +SELECT cloudsync_text_materialize('docs', 'body', 'tenant-1', 'doc-001'); + +-- Read the materialized text +SELECT body FROM notes WHERE id = 'note-001'; +``` + +--- + ## Helper Functions ### `cloudsync_version()` @@ -287,20 +352,20 @@ SELECT cloudsync_commit_alter('my_table'); ## Network Functions -### `cloudsync_network_init(connection_string)` +### `cloudsync_network_init(managedDatabaseId)` -**Description:** Initializes the `sqlite-sync` network component. This function parses the connection string to configure change checking and upload endpoints, and initializes the cURL library. +**Description:** Initializes the `sqlite-sync` network component. This function configures the endpoints for the CloudSync service and initializes the cURL library. **Parameters:** -- `connection_string` (TEXT): The connection string for the remote synchronization server. The format is `sqlitecloud://:/?`. +- `managedDatabaseId` (TEXT): The managed database identifier returned by the CloudSync service when a new database is registered for sync. For SQLiteCloud projects, this value can be obtained from the project's OffSync page on the dashboard. **Returns:** None. 
**Example:** ```sql -SELECT cloudsync_network_init('.sqlite.cloud/.sqlite'); +SELECT cloudsync_network_init('your-managed-database-id'); ``` --- @@ -357,34 +422,27 @@ SELECT cloudsync_network_set_apikey('your_api_key'); --- -### `cloudsync_network_has_unsent_changes()` +### `cloudsync_network_send_changes()` -**Description:** Checks if there are any local changes that have not yet been sent to the remote server. +**Description:** Sends all unsent local changes to the remote server. **Parameters:** None. -**Returns:** 1 if there are unsent changes, 0 otherwise. - -**Example:** +**Returns:** A JSON string with the send result: -```sql -SELECT cloudsync_network_has_unsent_changes(); +```json +{"send": {"status": "synced|syncing|out-of-sync|error", "localVersion": N, "serverVersion": N}} ``` ---- - -### `cloudsync_network_send_changes()` - -**Description:** Sends all unsent local changes to the remote server. - -**Parameters:** None. - -**Returns:** None. +- `send.status`: The current sync state — `"synced"` (all changes confirmed), `"syncing"` (changes sent but not yet confirmed), `"out-of-sync"` (local changes pending or gaps detected), or `"error"`. +- `send.localVersion`: The latest local database version. +- `send.serverVersion`: The latest version confirmed by the server. **Example:** ```sql SELECT cloudsync_network_send_changes(); +-- '{"send":{"status":"synced","localVersion":5,"serverVersion":5}}' ``` --- @@ -399,16 +457,23 @@ This function is designed to be called periodically to keep the local database i To force an update and wait for changes (with a timeout), use [`cloudsync_network_sync(wait_ms, max_retries)`]. If the network is misconfigured or the remote server is unreachable, the function returns an error. -On success, it returns `SQLITE_OK`, and the return value indicates how many changes were downloaded and applied. **Parameters:** None. -**Returns:** The number of changes downloaded. Errors are reported via the SQLite return code. 
+**Returns:** A JSON string with the receive result: + +```json +{"receive": {"rows": N, "tables": ["table1", "table2"]}} +``` + +- `receive.rows`: The number of rows received and applied to the local database. +- `receive.tables`: An array of table names that received changes. Empty (`[]`) if no changes were applied. **Example:** ```sql SELECT cloudsync_network_check_changes(); +-- '{"receive":{"rows":3,"tables":["tasks"]}}' ``` --- @@ -425,13 +490,27 @@ SELECT cloudsync_network_check_changes(); - `wait_ms` (INTEGER, optional): The time to wait in milliseconds between retries. Defaults to 100. - `max_retries` (INTEGER, optional): The maximum number of times to retry the synchronization. Defaults to 1. -**Returns:** The number of changes downloaded. Errors are reported via the SQLite return code. +**Returns:** A JSON string with the full sync result, combining send and receive: + +```json +{ + "send": {"status": "synced|syncing|out-of-sync|error", "localVersion": N, "serverVersion": N}, + "receive": {"rows": N, "tables": ["table1", "table2"]} +} +``` + +- `send.status`: The current sync state — `"synced"`, `"syncing"`, `"out-of-sync"`, or `"error"`. +- `send.localVersion`: The latest local database version. +- `send.serverVersion`: The latest version confirmed by the server. +- `receive.rows`: The number of rows received and applied during the check phase. +- `receive.tables`: An array of table names that received changes. Empty (`[]`) if no changes were applied. 
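Because the return value is a JSON string, it can be unpacked directly in SQL with SQLite's built-in JSON1 functions. The query below is an illustrative sketch, not part of the documented API, and assumes a JSON1-enabled build (the default in modern SQLite):

```sql
SELECT json_extract(result, '$.send.status')   AS send_status,
       json_extract(result, '$.receive.rows')  AS rows_received
FROM (SELECT cloudsync_network_sync() AS result);
```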
**Example:** ```sql -- Perform a single synchronization cycle SELECT cloudsync_network_sync(); +-- '{"send":{"status":"synced","localVersion":5,"serverVersion":5},"receive":{"rows":3,"tables":["tasks"]}}' -- Perform a synchronization cycle with custom retry settings SELECT cloudsync_network_sync(500, 3); @@ -455,9 +534,25 @@ SELECT cloudsync_network_reset_sync_version(); --- +### `cloudsync_network_has_unsent_changes()` + +**Description:** Checks if there are any local changes that have not yet been sent to the remote server. + +**Parameters:** None. + +**Returns:** 1 if there are unsent changes, 0 otherwise. + +**Example:** + +```sql +SELECT cloudsync_network_has_unsent_changes(); +``` + +--- + ### `cloudsync_network_logout()` -**Description:** Logs out the current user and cleans up all local data from synchronized tables. This function deletes and then re-initializes synchronized tables, useful for switching users or resetting the local database. **Warning:** This function deletes all data from synchronized tables. Use with caution. +**Description:** Logs out the current user and cleans up all local data from synchronized tables. This function deletes and then re-initializes synchronized tables, useful for switching users or resetting the local database. **Warning:** This function deletes all data from synchronized tables. Use with caution. Consider calling [`cloudsync_network_has_unsent_changes()`](#cloudsync_network_has_unsent_changes) before logout to check for unsent local changes and warn the user before data that has not been fully synchronized to the remote server is deleted. **Parameters:** None. diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..97b13d0 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,50 @@ +# Changelog + +All notable changes to this project will be documented in this file. + +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). 
+ +## [1.0.0] - 2026-03-05 + +### Added + +- **PostgreSQL support**: The CloudSync extension can now be built and loaded on PostgreSQL, so both SQLiteCloud and PostgreSQL are supported as the cloud backend database of the sync service. The core CRDT functions are shared by the SQLite and PostgreSQL extensions. Includes support for PostgreSQL-native types (UUID primary keys, composite PKs with mixed types, and automatic type casting). +- **Row-Level Security (RLS)**: Sync payloads are now fully compatible with SQLiteCloud and PostgreSQL Row-Level Security policies. Changes are buffered per primary key and flushed as complete rows, so RLS policies can evaluate all columns at once. + +### Changed + +- **BREAKING: `cloudsync_network_init` now accepts a `managedDatabaseId` instead of a connection string.** The `managedDatabaseId` is returned by the CloudSync service when a new database is registered for sync. For SQLiteCloud projects, it can be obtained from the project's OffSync page on the dashboard. + + Before: + ```sql + SELECT cloudsync_network_init('sqlitecloud://myproject.sqlite.cloud:8860/mydb.sqlite?apikey=KEY'); + ``` + + After: + ```sql + SELECT cloudsync_network_init('your-managed-database-id'); + ``` + +- **BREAKING: Sync functions now return structured JSON.** `cloudsync_network_send_changes`, `cloudsync_network_check_changes`, and `cloudsync_network_sync` return a JSON object instead of a plain integer. This provides richer status information including sync state, version numbers, row counts, and affected table names. + + Before: + ```sql + SELECT cloudsync_network_sync(); + -- 3 (number of rows received) + ``` + + After: + ```sql + SELECT cloudsync_network_sync(); + -- '{"send":{"status":"synced","localVersion":5,"serverVersion":5},"receive":{"rows":3,"tables":["tasks"]}}' + ``` + +- **Batch merge replaces column-by-column processing**: During sync, changes to the same row are now applied in a single SQL statement instead of one statement per column. 
This eliminates the previous behavior where UPDATE triggers fired multiple times per row during synchronization. + +### Fixed + +- **Improved error reporting**: Sync network functions now surface the actual server error message instead of generic error codes. +- **Schema hash verification**: Normalized schema comparison now uses only column name (lowercase), type (SQLite affinity), and primary key flag, preventing false mismatches caused by formatting differences. +- **SQLite trigger safety**: Internal functions used inside triggers are now marked with `SQLITE_INNOCUOUS`, fixing `unsafe use of` errors when initializing tables that have triggers. +- **NULL column binding**: Column value parameters are now correctly bound even when NULL, preventing sync failures on rows with NULL values. +- **Stability and reliability improvements** across the SQLite and PostgreSQL codebases, including fixes to memory management, error handling, and CRDT version tracking. diff --git a/Makefile b/Makefile index ae3423f..74d6c6f 100644 --- a/Makefile +++ b/Makefile @@ -32,7 +32,7 @@ MAKEFLAGS += -j$(CPUS) # Compiler and flags CC = gcc -CFLAGS = -Wall -Wextra -Wno-unused-parameter -I$(SRC_DIR) -I$(SRC_DIR)/sqlite -I$(SRC_DIR)/postgresql -I$(SQLITE_DIR) -I$(CURL_DIR)/include +CFLAGS = -Wall -Wextra -Wno-unused-parameter -I$(SRC_DIR) -I$(SRC_DIR)/sqlite -I$(SRC_DIR)/postgresql -I$(SRC_DIR)/network -I$(SQLITE_DIR) -I$(CURL_DIR)/include -Imodules/fractional-indexing T_CFLAGS = $(CFLAGS) -DSQLITE_CORE -DCLOUDSYNC_UNITTEST -DCLOUDSYNC_OMIT_NETWORK -DCLOUDSYNC_OMIT_PRINT_RESULT COVERAGE = false ifndef NATIVE_NETWORK @@ -46,7 +46,9 @@ POSTGRES_IMPL_DIR = $(SRC_DIR)/postgresql DIST_DIR = dist TEST_DIR = test SQLITE_DIR = sqlite -VPATH = $(SRC_DIR):$(SQLITE_IMPL_DIR):$(POSTGRES_IMPL_DIR):$(SQLITE_DIR):$(TEST_DIR) +FI_DIR = modules/fractional-indexing +NETWORK_DIR = $(SRC_DIR)/network +VPATH = $(SRC_DIR):$(SQLITE_IMPL_DIR):$(POSTGRES_IMPL_DIR):$(NETWORK_DIR):$(SQLITE_DIR):$(TEST_DIR):$(FI_DIR) 
BUILD_RELEASE = build/release BUILD_TEST = build/test BUILD_DIRS = $(BUILD_TEST) $(BUILD_RELEASE) @@ -62,17 +64,19 @@ ifeq ($(PLATFORM),android) endif # Multi-platform source files (at src/ root) - exclude database_*.c as they're in subdirs -CORE_SRC = $(filter-out $(SRC_DIR)/database_%.c, $(wildcard $(SRC_DIR)/*.c)) +CORE_SRC = $(filter-out $(SRC_DIR)/database_%.c, $(wildcard $(SRC_DIR)/*.c)) $(wildcard $(NETWORK_DIR)/*.c) # SQLite-specific files SQLITE_SRC = $(wildcard $(SQLITE_IMPL_DIR)/*.c) +# Fractional indexing submodule +FI_SRC = $(FI_DIR)/fractional_indexing.c # Combined for SQLite extension build -SRC_FILES = $(CORE_SRC) $(SQLITE_SRC) +SRC_FILES = $(CORE_SRC) $(SQLITE_SRC) $(FI_SRC) TEST_SRC = $(wildcard $(TEST_DIR)/*.c) TEST_FILES = $(SRC_FILES) $(TEST_SRC) $(wildcard $(SQLITE_DIR)/*.c) RELEASE_OBJ = $(patsubst %.c, $(BUILD_RELEASE)/%.o, $(notdir $(SRC_FILES))) TEST_OBJ = $(patsubst %.c, $(BUILD_TEST)/%.o, $(notdir $(TEST_FILES))) -COV_FILES = $(filter-out $(SRC_DIR)/lz4.c $(SRC_DIR)/network.c $(SQLITE_IMPL_DIR)/sql_sqlite.c $(POSTGRES_IMPL_DIR)/database_postgresql.c, $(SRC_FILES)) +COV_FILES = $(filter-out $(SRC_DIR)/lz4.c $(NETWORK_DIR)/network.c $(SQLITE_IMPL_DIR)/sql_sqlite.c $(POSTGRES_IMPL_DIR)/database_postgresql.c $(FI_SRC), $(SRC_FILES)) CURL_LIB = $(CURL_DIR)/$(PLATFORM)/libcurl.a TEST_TARGET = $(patsubst %.c,$(DIST_DIR)/%$(EXE), $(notdir $(TEST_SRC))) @@ -128,7 +132,7 @@ else ifeq ($(PLATFORM),android) CURL_CONFIG = --host $(ARCH)-linux-$(ANDROID_ABI) --with-openssl=$(CURDIR)/$(OPENSSL_INSTALL_DIR) LDFLAGS="-L$(CURDIR)/$(OPENSSL_INSTALL_DIR)/lib" LIBS="-lssl -lcrypto" AR=$(BIN)/llvm-ar AS=$(BIN)/llvm-as CC=$(CC) CXX=$(BIN)/$(ARCH)-linux-$(ANDROID_ABI)-clang++ LD=$(BIN)/ld RANLIB=$(BIN)/llvm-ranlib STRIP=$(BIN)/llvm-strip TARGET := $(DIST_DIR)/cloudsync.so CFLAGS += -fPIC -I$(OPENSSL_INSTALL_DIR)/include - LDFLAGS += -shared -fPIC -L$(OPENSSL_INSTALL_DIR)/lib -lssl -lcrypto + LDFLAGS += -shared -fPIC -L$(OPENSSL_INSTALL_DIR)/lib -lssl -lcrypto 
-lm STRIP = $(BIN)/llvm-strip --strip-unneeded $@ else ifeq ($(PLATFORM),ios) TARGET := $(DIST_DIR)/cloudsync.dylib @@ -148,8 +152,8 @@ else ifeq ($(PLATFORM),ios-sim) STRIP = strip -x -S $@ else # linux TARGET := $(DIST_DIR)/cloudsync.so - LDFLAGS += -shared -lssl -lcrypto - T_LDFLAGS += -lpthread + LDFLAGS += -shared -lssl -lcrypto -lm + T_LDFLAGS += -lpthread -lm CURL_CONFIG = --with-openssl STRIP = strip --strip-unneeded $@ endif @@ -164,7 +168,7 @@ endif # Native network support only for Apple platforms ifdef NATIVE_NETWORK - RELEASE_OBJ += $(patsubst %.m, $(BUILD_RELEASE)/%_m.o, $(notdir $(wildcard $(SRC_DIR)/*.m))) + RELEASE_OBJ += $(patsubst %.m, $(BUILD_RELEASE)/%_m.o, $(notdir $(wildcard $(NETWORK_DIR)/*.m))) LDFLAGS += -framework Foundation CFLAGS += -DCLOUDSYNC_OMIT_CURL diff --git a/PERFORMANCE.md b/PERFORMANCE.md new file mode 100644 index 0000000..236ab95 --- /dev/null +++ b/PERFORMANCE.md @@ -0,0 +1,190 @@ +# Performance & Overhead + +This document describes the computational and storage overhead introduced by the CloudSync extension, and how sync execution time relates to database size. + +## TL;DR + +Sync execution time scales with **the number of changes since the last sync (D)**, not with total database size (N). If you sync frequently, D stays small regardless of how large the database grows. The per-operation overhead on writes is proportional to the number of columns in the affected row, not to the table size. This is fundamentally different from sync solutions that diff or scan the full dataset. + +## Breaking Down the Cost + +The overhead introduced by the extension can be decomposed into four independent concerns: + +### 1. Per-Operation Overhead (Write-Path Cost) + +Every INSERT, UPDATE, or DELETE on a synced table fires AFTER triggers that write CRDT metadata into a companion `_cloudsync` table. This happens synchronously, inline with the original write. 
+ +| Operation | Metadata Rows Written | Complexity | +|-----------|----------------------|------------| +| INSERT | 1 sentinel + 1 per non-PK column | O(C) | +| UPDATE | 1 per changed column (NEW != OLD) | O(C_changed) <= O(C) | +| DELETE | 1 sentinel + cleanup of existing metadata | O(C_existing) | + +Where **C** = number of non-PK columns in the table. + +**Key point:** This cost is **constant per row** and independent of the total number of rows in the table (N). Writing to a 100-row table costs the same as writing to a 10-million-row table. The metadata table uses a composite primary key `(pk, col_name)` with `WITHOUT ROWID` optimization (SQLite) or a standard B-tree primary key (PostgreSQL), so the index update cost is O(log M) where M is the metadata table size -- but this is the same cost as any indexed INSERT and is negligible in practice. + +### 2. Sync Operations (Push & Pull) + +These are the operations that create and apply sync payloads. They are synchronous in the extension and should typically be run by the application off the main thread. + +#### Push: Payload Generation + +``` +Cost: O(D) where D = number of column-level changes since last sync +``` + +The push operation queries `cloudsync_changes`, which dynamically reads from all synced `
_cloudsync` tables: +```sql +SELECT ... FROM cloudsync_changes WHERE db_version > +``` + +Each metadata table has an **index on `db_version`**, so payload generation scales primarily with the number of new changes, plus a small per-synced-table overhead to construct the `cloudsync_changes` query. It does not diff the full dataset. In SQLite, each changed column also performs a primary-key lookup in the base table to retrieve the current value. + +The resulting payload is LZ4-compressed before transmission. + +#### Pull: Payload Application + +``` +Cost: O(D) to decode + O(D_unique_pks) to merge into the database +``` + +Incoming changes are decoded and **batched by primary key**. All column changes for the same row are accumulated and flushed as a single UPDATE or INSERT statement. This batching reduces the number of actual database writes to one per affected row, regardless of how many columns changed. + +Conflict resolution (CRDT merge) is O(1) per column: it compares version numbers and, only if tied, falls back to value comparison and site-id tiebreaking. No global state or table scan is required. + +#### Summary + +| Phase | Scales With | Does NOT Scale With | +|-------|-------------|-------------------| +| Payload generation | D (changes since last sync) | N (total rows) | +| Payload application | D (incoming changes) | N (total rows) | +| Conflict resolution | D (conflicting columns) | N (total rows) | + +**This means sync time is driven mainly by delta size (`D`) rather than total database size (`N`)**. As long as the number of changes between syncs stays bounded, sync time remains roughly stable even as the database grows. + +### 3. Sync Frequency & Network Latency + +When the application runs sync off the main thread, perceived latency depends on: + +- **Sync interval**: How often the app triggers a push/pull cycle. More frequent syncs mean smaller deltas (smaller D) and faster individual sync operations, at the cost of more network round-trips. 
+- **Network latency**: The round-trip time to the sync server. LZ4 compression reduces payload size, but latency is dominated by the network hop itself for small deltas. +- **Payload size**: Proportional to D x average column value size. Large BLOBs or TEXT values will increase transfer time linearly. + +The extension does not impose a sync schedule -- the application controls when and how often to sync. A typical pattern is to sync on a timer (e.g., every 5-30 seconds) or on specific events (app foreground, user action). + +### 4. Metadata Storage Overhead + +Each synced table has a companion `
_cloudsync` metadata table with the following schema:
+
+```
+PRIMARY KEY (pk, col_name) -- WITHOUT ROWID (SQLite)
+Columns: pk, col_name, col_version, db_version, site_id, seq
+Index: db_version
+```
+
+**Storage cost per row in the base table:**
+- 1 sentinel row (marks the row's existence/deletion state)
+- 1 metadata row per non-PK column that has ever been written
+
+So for a table with C non-PK columns, the metadata table will contain approximately `N x (1 + C)` rows, where N is the number of rows in the base table.
+
+**Estimated overhead per metadata row:**
+- `pk`: encoded primary key (typically 8-32 bytes depending on PK type and count)
+- `col_name`: column name string (stored in full in each metadata row, typically 5-30 bytes)
+- `col_version`, `db_version`, `seq`: 3 integers (8 bytes each = 24 bytes)
+- `site_id`: 1 integer (8 bytes)
+
+Rough estimate: **60-100 bytes per metadata row**, or **60-100 x (1 + C) bytes per base table row**.
+
+| Base Table | Columns (C) | Rows (N) | Estimated Metadata Size |
+|------------|-------------|----------|------------------------|
+| Small | 5 | 1,000 | ~360 KB - 600 KB |
+| Medium | 10 | 100,000 | ~66 MB - 110 MB |
+| Large | 10 | 1,000,000| ~660 MB - 1.1 GB |
+| Wide | 50 | 100,000 | ~306 MB - 510 MB |
+
+**Mitigation strategies:**
+- Only sync tables that need it -- not every table requires CRDT tracking.
+- Prefer narrow tables (fewer columns) for high-volume data.
+- The `WITHOUT ROWID` optimization (SQLite) significantly reduces per-row storage overhead.
+- Deleted rows have their per-column metadata cleaned up, but a tombstone sentinel row persists (see section 9 below).
+
+### 5. Read-Path Overhead
+
+Normal application reads are not directly instrumented by the extension. No triggers, views, or hooks intercept ordinary SELECT queries on application tables, and the CRDT metadata is stored separately. In practice, read overhead is usually negligible.
+
+### 6. 
Initial Sync (First Device) + +When a new device syncs for the first time (`db_version = 0`), the push payload contains the **entire dataset**: every column of every row across all synced tables. The payload size is proportional to `N * C` (total rows times columns). + +The payload is built entirely in memory, starting with a 512 KB buffer (`CLOUDSYNC_PAYLOAD_MINBUF_SIZE` in `src/cloudsync.c`) and growing via `realloc` as needed. Peak memory usage is at least the full uncompressed payload size and can be higher during compression. For a database with 1 million rows and 10 columns of average 50 bytes each, the uncompressed payload could reach ~500 MB before LZ4 compression. + +Subsequent syncs are incremental (proportional to D, changes since the last sync), so the first sync is the expensive one. Applications with large datasets should plan for this -- for example, by seeding new devices from a database snapshot rather than syncing from scratch. + +### 7. WAL and Disk I/O Amplification + +Each write to a synced table generates additional metadata writes via AFTER triggers. The amplification factor depends on the operation: + +| Operation | Total Writes (base + metadata) | Amplification Factor | +|-----------|-------------------------------|---------------------| +| INSERT (C columns) | 1 + 1 sentinel + C metadata | ~C+2x | +| UPDATE (1 column) | 1 + 1 metadata | 2x | +| UPDATE (C columns) | 1 + C metadata | ~C+1x | +| DELETE | 1 + cleanup writes | variable | + +For a table with 10 non-PK columns, an INSERT generates roughly 12 logical row writes instead of 1. This increases WAL/page churn and affects: + +- **Disk I/O**: More pages written per transaction, larger WAL files between checkpoints. +- **WAL checkpoint frequency**: The WAL grows faster, so checkpoints run more often (or the WAL file stays larger if checkpointing is deferred). +- **Battery on mobile**: More disk writes per user action. 
Batching multiple writes in a single transaction amortizes the transaction overhead but not the per-row metadata cost. + +### 8. Locking During Sync Apply + +Payload application (`cloudsync_payload_apply`) uses savepoints grouped by source `db_version`. On SQLite, each savepoint holds a write lock for its duration. If the application runs sync on the main thread, other work on the same connection is blocked, and reads from other connections may block outside WAL mode. + +On SQLite, using WAL mode prevents readers on other connections from being blocked by writers, which is the recommended configuration for concurrent sync. + +### 9. Metadata Lifecycle (Tombstones and Cleanup) + +When a row is deleted, the per-column metadata rows are removed, but a **tombstone sentinel** (`__[RIP]__`) persists in the metadata table. This tombstone is necessary for propagating deletes to other devices during sync. There is no automatic garbage collection of tombstones -- they accumulate over time. + +Metadata cleanup for **removed columns** (after schema migration) only runs during `cloudsync_finalize_alter()`, which is called as part of the `cloudsync_alter()` workflow. Outside of schema changes, orphaned metadata from dropped columns remains in the metadata table. + +The **site ID table** (`cloudsync_site_id`) also grows monotonically -- one entry per unique device that has ever synced. This is typically small (one row per device) and not a concern in practice. + +For applications with high delete rates, the tombstone accumulation may become significant over time. Consider periodic full re-syncs or application-level archival strategies if this is a concern. + +### 10. Multi-Table Considerations + +The `cloudsync_changes` virtual table (SQLite) or set-returning function (PostgreSQL) dynamically constructs a `UNION ALL` query across all synced tables' metadata tables. The query construction cost scales as O(T) where T is the number of synced tables. 
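The construction step can be pictured as assembling one SELECT per synced table. This is a hypothetical sketch: the real internal query shape, column list, and table naming are implementation details and may differ.

```python
# Illustrative only: build a UNION ALL delta query across T synced tables.
# Construction cost is O(T); each branch filters on the indexed db_version.
def build_changes_query(synced_tables):
    selects = [
        f"SELECT '{t}' AS tbl, pk, col_name, db_version "
        f"FROM {t}_cloudsync WHERE db_version > ?"
        for t in synced_tables
    ]
    return " UNION ALL ".join(selects)

sql = build_changes_query(["notes", "tasks"])
print(sql.count("UNION ALL"))  # 1
```

The per-table cost here is pure string assembly; the data scanned at execution time is still bounded by D, the number of changes past the requested version.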
+
+For most applications (fewer than ~50 synced tables), this is negligible. Applications syncing a very large number of tables should be aware that payload generation involves iterating over all synced tables to check for changes.
+
+### Platform Differences (SQLite vs PostgreSQL)
+
+- **SQLite** uses native C triggers registered directly with the SQLite API. Metadata tables use `WITHOUT ROWID` for compact storage.
+- **PostgreSQL** uses row-level PL/pgSQL trigger functions that call into C functions via the extension. This adds a small amount of overhead per trigger invocation compared to SQLite's direct C triggers. Additionally, merge operations use per-PK savepoints to handle failures such as RLS policy violations gracefully.
+- **Table registration** (`cloudsync_enable()`) is a one-time operation on both platforms. It creates 1 metadata table, 1 index, and 3 triggers (INSERT, UPDATE, DELETE), plus ~15-20 prepared statements that are cached for the lifetime of the connection.
+
+## Comparison with Full-Scan Sync Solutions
+
+Many sync solutions must diff or hash the entire dataset to determine what changed. This leads to O(N) sync time that grows linearly with total database size -- sync cost keeps increasing even when almost nothing has changed between syncs.
+
+CloudSync avoids this through its **monotonic versioning approach**: every write increments a monotonic `db_version` counter, and the sync query filters on this counter using an index. The result is that sync time depends mainly on the volume of changes (D), not on the total data size (N).
+
+```
+Full-scan sync: sync_time ~ O(N) -- grows with database size
+CloudSync: sync_time ~ O(D) -- grows with changes since last sync
+ where D is independent of N when sync frequency is constant
+```
+
+## Performance Optimizations in the Implementation
+
+1. **`WITHOUT ROWID` tables** (SQLite): Metadata tables use clustered primary keys, avoiding the overhead of a separate rowid B-tree.
+2. 
**`db_version` index**: Enables efficient range scans for delta extraction. +3. **Deferred batch merge**: Column changes for the same primary key are accumulated and flushed as a single SQL statement. +4. **Prepared statement caching**: Merge statements are compiled once and reused across rows. +5. **LZ4 compression**: Reduces payload size for network transfer. +6. **Per-column tracking**: Only changed columns are included in the sync payload, not entire rows. +7. **Early exit on stale data**: The CLS algorithm skips rows where the incoming causal length is lower than the local one, avoiding unnecessary column-level comparisons. diff --git a/README.md b/README.md index ba88213..b649fd3 100644 --- a/README.md +++ b/README.md @@ -16,10 +16,12 @@ In simple terms, CRDTs make it possible for multiple users to **edit shared data - [Key Features](#key-features) - [Built-in Network Layer](#built-in-network-layer) - [Row-Level Security](#row-level-security) +- [Block-Level LWW](#block-level-lww) - [What Can You Build with SQLite Sync?](#what-can-you-build-with-sqlite-sync) - [Documentation](#documentation) - [Installation](#installation) - [Getting Started](#getting-started) +- [Block-Level LWW Example](#block-level-lww-example) - [Database Schema Recommendations](#database-schema-recommendations) - [Primary Key Requirements](#primary-key-requirements) - [Column Constraint Guidelines](#column-constraint-guidelines) @@ -32,6 +34,7 @@ In simple terms, CRDTs make it possible for multiple users to **edit shared data - **Offline-First by Design**: Works seamlessly even when devices are offline. Changes are queued locally and synced automatically when connectivity is restored. - **CRDT-Based Conflict Resolution**: Merges updates deterministically and efficiently, ensuring eventual consistency across all replicas without the need for complex merge logic. +- **Block-Level LWW for Text**: Fine-grained conflict resolution for text columns. 
Instead of overwriting the entire cell, changes are tracked and merged at the line (or paragraph) level, so concurrent edits to different parts of the same text are preserved. - **Embedded Network Layer**: No external libraries or sync servers required. SQLiteSync handles connection setup, message encoding, retries, and state reconciliation internally. - **Drop-in Simplicity**: Just load the extension into SQLite and start syncing. No need to implement custom protocols or state machines. - **Efficient and Resilient**: Optimized binary encoding, automatic batching, and robust retry logic make synchronization fast and reliable even on flaky networks. @@ -50,21 +53,48 @@ The sync layer is tightly integrated with [**SQLite Cloud**](https://sqlitecloud ## Row-Level Security -Thanks to the underlying SQLite Cloud infrastructure, **SQLite Sync supports Row-Level Security (RLS)**—allowing you to define **precise access control at the row level**: +Thanks to the underlying SQLite Cloud infrastructure, **SQLite Sync supports Row-Level Security (RLS)**—allowing you to use a **single shared cloud database** while each client only sees and modifies its own data. RLS policies are enforced on the server, so the security boundary is at the database level, not in application code. - Control not just who can read or write a table, but **which specific rows** they can access. -- Enforce security policies on the server—no need for client-side filtering. +- Each device syncs only the rows it is authorized to see—no full dataset download, no client-side filtering. For example: - User A can only see and edit their own data. - User B can access a different set of rows—even within the same shared table. -**Benefits of RLS**: +**Benefits**: -- **Data isolation**: Ensure users only access what they’re authorized to see. -- **Built-in privacy**: Security policies are enforced at the database level. 
-- **Simplified development**: Reduce or eliminate complex permission logic in your application code. +- **Single database, multiple tenants**: One cloud database serves all users. RLS policies partition data per user or role, eliminating the need to provision separate databases. +- **Efficient sync**: Each client downloads only its authorized rows, reducing bandwidth and local storage. +- **Server-enforced security**: Policies are evaluated on the server during sync. A compromised or modified client cannot bypass access controls. +- **Simplified development**: No need to implement permission logic in your application—define policies once in the database and they apply everywhere. + +For more information, see the [SQLite Cloud RLS documentation](https://docs.sqlitecloud.io/docs/rls). + +## Block-Level LWW + +Standard CRDT sync resolves conflicts at the **cell level**: if two devices edit the same column of the same row, one value wins entirely. This works well for short values like names or statuses, but for longer text content — documents, notes, descriptions — it means the entire text is replaced even if the edits were in different parts. + +**Block-Level LWW** (Last-Writer-Wins) solves this by splitting text columns into **blocks** (lines by default) and tracking each block independently. When two devices edit different lines of the same text, **both edits are preserved** after sync. Only when two devices edit the *same* line does LWW conflict resolution apply. + +### How It Works + +1. **Enable block tracking** on a text column using `cloudsync_set_column()`. +2. On INSERT or UPDATE, SQLite Sync automatically splits the text into blocks using the configured delimiter (default: newline `\n`). +3. Each block gets a unique fractional index position, enabling insertions between existing blocks without reindexing. +4. During sync, changes are merged block-by-block rather than replacing the whole cell. +5. 
Use `cloudsync_text_materialize()` to reconstruct the full text from blocks on demand, or read the column directly (it is updated automatically after merge). + +### Key Properties + +- **Non-conflicting edits are preserved**: Two users editing different lines of the same document both see their changes after sync. +- **Same-line conflicts use LWW**: If two users edit the same line, the last writer wins — consistent with standard CRDT behavior. +- **Custom delimiters**: Use paragraph separators (`\n\n`), sentence boundaries, or any string as the block delimiter. +- **Mixed columns**: A table can have both regular LWW columns and block-level LWW columns side by side. +- **Transparent reads**: The base column always contains the current full text. Block tracking is an internal mechanism; your queries work unchanged. + +For setup instructions and a complete example, see [Block-Level LWW Example](#block-level-lww-example). For API details, see the [API Reference](./API.md). ### What Can You Build with SQLite Sync? @@ -102,7 +132,13 @@ SQLite Sync is ideal for building collaborative and distributed apps across web, ## Documentation -For detailed information on all available functions, their parameters, and examples, refer to the [comprehensive API Reference](./API.md). +For detailed information on all available functions, their parameters, and examples, refer to the [comprehensive API Reference](./API.md). 
The API includes: + +- **Configuration Functions** — initialize, enable, and disable sync on tables +- **Block-Level LWW Functions** — configure block tracking on text columns and materialize text from blocks +- **Helper Functions** — version info, site IDs, UUID generation +- **Schema Alteration Functions** — safely alter synced tables +- **Network Functions** — connect, authenticate, send/receive changes, and monitor sync status ## Installation @@ -256,7 +292,7 @@ sqlite3 myapp.db -- Create a table (primary key MUST be TEXT for global uniqueness) CREATE TABLE IF NOT EXISTS my_data ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, value TEXT NOT NULL DEFAULT '', created_at TEXT DEFAULT CURRENT_TIMESTAMP ); @@ -279,17 +315,19 @@ UPDATE my_data SET value = 'Updated: Hello from device A!' WHERE value LIKE 'Hel SELECT * FROM my_data ORDER BY created_at; -- Configure network connection before using the network sync functions -SELECT cloudsync_network_init('sqlitecloud://your-project-id.sqlite.cloud/database.sqlite'); +-- The managedDatabaseId is obtained from the OffSync page on the SQLiteCloud dashboard +SELECT cloudsync_network_init('your-managed-database-id'); SELECT cloudsync_network_set_apikey('your-api-key-here'); -- Or use token authentication (required for Row-Level Security) -- SELECT cloudsync_network_set_token('your_auth_token'); --- Sync with cloud: send local changes, then check the remote server for new changes +-- Sync with cloud: send local changes, then check the remote server for new changes -- and, if a package with changes is ready to be downloaded, applies them to the local database SELECT cloudsync_network_sync(); --- Keep calling periodically. 
The function returns > 0 if data was received --- In production applications, you would typically call this periodically --- rather than manually (e.g., every few seconds) +-- Returns a JSON string with sync status, e.g.: +-- '{"send":{"status":"synced","localVersion":5,"serverVersion":5},"receive":{"rows":3,"tables":["my_data"]}}' +-- Keep calling periodically. In production applications, you would typically +-- call this periodically rather than manually (e.g., every few seconds) SELECT cloudsync_network_sync(); -- Before closing the database connection @@ -304,7 +342,7 @@ SELECT cloudsync_terminate(); -- Load extension and create identical table structure .load ./cloudsync CREATE TABLE IF NOT EXISTS my_data ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, value TEXT NOT NULL DEFAULT '', created_at TEXT DEFAULT CURRENT_TIMESTAMP ); @@ -314,9 +352,9 @@ SELECT cloudsync_init('my_data'); SELECT cloudsync_network_init('sqlitecloud://your-project-id.sqlite.cloud/database.sqlite'); SELECT cloudsync_network_set_apikey('your-api-key-here'); --- Sync to get data from the first device +-- Sync to get data from the first device SELECT cloudsync_network_sync(); --- repeat until data is received (returns > 0) +-- Repeat — check receive.rows in the JSON result to see if data was received SELECT cloudsync_network_sync(); -- View synchronized data @@ -342,10 +380,115 @@ SELECT cloudsync_terminate(); See the [examples](./examples/simple-todo-db/) directory for a comprehensive walkthrough including: - Multi-device collaboration -- Offline scenarios +- Offline scenarios - Row-level security setup - Conflict resolution demonstrations +## Block-Level LWW Example + +This example shows how to enable block-level text sync on a notes table, so that concurrent edits to different lines are merged instead of overwritten. 
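Before the SQL walkthrough, the merge rule itself can be modeled in a few lines. This is a conceptual sketch only, assuming no lines are added or removed; the extension additionally uses fractional indexing so that insertions between lines merge cleanly, which this toy model does not attempt.

```python
# Toy model of block-level LWW (NOT the extension's implementation):
# each line is a block; a line changed on only one side keeps that change,
# and a line changed on both sides keeps the version with the newer clock.
def merge_blocks(base, a, a_ver, b, b_ver):
    merged = []
    for base_line, a_line, b_line in zip(base.split("\n"),
                                         a.split("\n"),
                                         b.split("\n")):
        if a_line == base_line:      # only B changed this line (or nobody did)
            merged.append(b_line)
        elif b_line == base_line:    # only A changed this line
            merged.append(a_line)
        else:                        # both changed the same line: LWW
            merged.append(a_line if a_ver > b_ver else b_line)
    return "\n".join(merged)

base = "Line 1: Welcome\nLine 2: Agenda\nLine 3: Action items"
a = "Line 1: Welcome everyone\nLine 2: Agenda\nLine 3: Action items"   # Device A edits line 1
b = "Line 1: Welcome\nLine 2: Agenda\nLine 3: Action items - DONE"     # Device B edits line 3
# Both edits survive: line 1 comes from A, line 3 from B.
print(merge_blocks(base, a, 1, b, 2))
```

Because the two devices touched different lines, the version clocks never act as a tiebreaker here; they only matter when the same line is edited on both sides.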
+ +### Setup + +```sql +-- Load the extension +.load ./cloudsync + +-- Create a table with a text column for long-form content +CREATE TABLE notes ( + id TEXT PRIMARY KEY NOT NULL, + title TEXT NOT NULL DEFAULT '', + body TEXT NOT NULL DEFAULT '' +); + +-- Initialize sync on the table +SELECT cloudsync_init('notes'); + +-- Enable block-level LWW on the "body" column +SELECT cloudsync_set_column('notes', 'body', 'algo', 'block'); +``` + +After this setup, every INSERT or UPDATE to the `body` column automatically splits the text into blocks (one per line) and tracks each block independently. + +### Two-Device Scenario + +```sql +-- Device A: create a note +INSERT INTO notes (id, title, body) VALUES ( + 'note-001', + 'Meeting Notes', + 'Line 1: Welcome +Line 2: Agenda +Line 3: Action items' +); + +-- Sync Device A -> Cloud -> Device B +-- (Both devices now have the same 3-line note) +``` + +```sql +-- Device A (offline): edit line 1 +UPDATE notes SET body = 'Line 1: Welcome everyone +Line 2: Agenda +Line 3: Action items' WHERE id = 'note-001'; + +-- Device B (offline): edit line 3 +UPDATE notes SET body = 'Line 1: Welcome +Line 2: Agenda +Line 3: Action items - DONE' WHERE id = 'note-001'; +``` + +```sql +-- After both devices sync, the merged result is: +-- 'Line 1: Welcome everyone +-- Line 2: Agenda +-- Line 3: Action items - DONE' +-- +-- Both edits are preserved because they affected different lines. +``` + +### Custom Delimiter + +For paragraph-level tracking (useful for long-form documents), set a custom delimiter: + +```sql +-- Use double newline as delimiter (paragraph separator) +SELECT cloudsync_set_column('notes', 'body', 'delimiter', ' + +'); +``` + +### Materializing Text + +After a merge, the `body` column contains the reconstructed text automatically. 
You can also manually trigger materialization: + +```sql +-- Reconstruct body from blocks for a specific row +SELECT cloudsync_text_materialize('notes', 'body', 'note-001'); + +-- Then read normally +SELECT body FROM notes WHERE id = 'note-001'; +``` + +### Mixed Columns + +Block-level LWW can be enabled on specific columns while other columns use standard cell-level LWW: + +```sql +CREATE TABLE docs ( + id TEXT PRIMARY KEY NOT NULL, + title TEXT NOT NULL DEFAULT '', -- standard LWW (cell-level) + body TEXT NOT NULL DEFAULT '', -- block LWW (line-level) + status TEXT NOT NULL DEFAULT '' -- standard LWW (cell-level) +); + +SELECT cloudsync_init('docs'); +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block'); + +-- Now: concurrent edits to "title" or "status" use normal LWW, +-- while concurrent edits to "body" merge at the line level. +``` + ## 📦 Integrations Use SQLite-AI alongside: @@ -363,12 +506,12 @@ When designing your database schema for SQLite Sync, follow these best practices - **Use globally unique identifiers**: Always use TEXT primary keys with UUIDs, ULIDs, or similar globally unique identifiers - **Avoid auto-incrementing integers**: Integer primary keys can cause conflicts across multiple devices - **Use `cloudsync_uuid()`**: The built-in function generates UUIDv7 identifiers optimized for distributed systems -- **All primary keys must be explicitly declared as `NOT NULL`**. +- **Note:** Any write operation that includes a NULL value for a primary key column will be rejected with an error, even if SQLite would normally allow it due to a legacy behavior. 
```sql -- ✅ Recommended: Globally unique TEXT primary key CREATE TABLE users ( - id TEXT PRIMARY KEY NOT NULL, -- Use cloudsync_uuid() + id TEXT PRIMARY KEY, -- Use cloudsync_uuid() name TEXT NOT NULL, email TEXT UNIQUE NOT NULL ); diff --git a/docker/Makefile.postgresql b/docker/Makefile.postgresql index 17ae6c4..70b3da9 100644 --- a/docker/Makefile.postgresql +++ b/docker/Makefile.postgresql @@ -20,7 +20,9 @@ PG_CORE_SRC = \ src/dbutils.c \ src/pk.c \ src/utils.c \ - src/lz4.c + src/lz4.c \ + src/block.c \ + modules/fractional-indexing/fractional_indexing.c # PostgreSQL-specific implementation PG_IMPL_SRC = \ @@ -35,7 +37,7 @@ PG_OBJS = $(PG_ALL_SRC:.c=.o) # Compiler flags # Define POSIX macros as compiler flags to ensure they're defined before any includes -PG_CPPFLAGS = -I$(PG_INCLUDEDIR) -Isrc -Isrc/postgresql -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE +PG_CPPFLAGS = -I$(PG_INCLUDEDIR) -Isrc -Isrc/postgresql -Imodules/fractional-indexing -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE PG_CFLAGS = -fPIC -Wall -Wextra -Wno-unused-parameter -std=c11 -O2 PG_DEBUG ?= 0 ifeq ($(PG_DEBUG),1) @@ -238,7 +240,7 @@ postgres-docker-shell: # Build CloudSync into the Supabase CLI postgres image tag postgres-supabase-build: @echo "Building CloudSync image for Supabase CLI..." 
- @tmp_dockerfile="$$(mktemp /tmp/cloudsync-supabase-cli.XXXXXX)"; \ + @tmp_dockerfile="$$(mktemp ./cloudsync-supabase-cli.XXXXXX)"; \ src_dockerfile="$(SUPABASE_CLI_DOCKERFILE)"; \ supabase_cli_image="$(SUPABASE_CLI_IMAGE)"; \ if [ -z "$$supabase_cli_image" ]; then \ @@ -267,6 +269,8 @@ postgres-supabase-build: exit 1; \ fi; \ echo "Using base image: $$supabase_cli_image"; \ + echo "Pulling fresh base image to avoid layer accumulation..."; \ + docker pull "$$supabase_cli_image" 2>/dev/null || true; \ docker build --build-arg SUPABASE_POSTGRES_TAG="$(SUPABASE_POSTGRES_TAG)" -f "$$tmp_dockerfile" -t "$$supabase_cli_image" .; \ rm -f "$$tmp_dockerfile"; \ echo "Build complete: $$supabase_cli_image" diff --git a/docker/postgresql/Dockerfile b/docker/postgresql/Dockerfile index 536b963..ec3d30c 100644 --- a/docker/postgresql/Dockerfile +++ b/docker/postgresql/Dockerfile @@ -14,6 +14,7 @@ WORKDIR /tmp/cloudsync # Copy entire source tree (needed for includes and makefiles) COPY src/ ./src/ +COPY modules/ ./modules/ COPY docker/ ./docker/ COPY Makefile . diff --git a/docker/postgresql/Dockerfile.debug b/docker/postgresql/Dockerfile.debug index caf1091..3f77c04 100644 --- a/docker/postgresql/Dockerfile.debug +++ b/docker/postgresql/Dockerfile.debug @@ -51,6 +51,7 @@ ENV LD_LIBRARY_PATH="/usr/local/pgsql/lib:${LD_LIBRARY_PATH}" # Copy entire source tree (needed for includes and makefiles) COPY src/ ./src/ +COPY modules/ ./modules/ COPY docker/ ./docker/ COPY Makefile . 
@@ -65,11 +66,11 @@ RUN set -eux; \ make postgres-build PG_DEBUG=1 \ PG_CFLAGS="-fPIC -Wall -Wextra -Wno-unused-parameter -std=c11 -g -O0 -fno-omit-frame-pointer ${ASAN_CFLAGS}" \ PG_LDFLAGS="-shared ${ASAN_LDFLAGS}" \ - PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ + PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -Imodules/fractional-indexing -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ make postgres-install PG_DEBUG=1 \ PG_CFLAGS="-fPIC -Wall -Wextra -Wno-unused-parameter -std=c11 -g -O0 -fno-omit-frame-pointer ${ASAN_CFLAGS}" \ PG_LDFLAGS="-shared ${ASAN_LDFLAGS}" \ - PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ + PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -Imodules/fractional-indexing -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ make postgres-clean # Verify installation diff --git a/docker/postgresql/Dockerfile.debug-no-optimization b/docker/postgresql/Dockerfile.debug-no-optimization index caf1091..3f77c04 100644 --- a/docker/postgresql/Dockerfile.debug-no-optimization +++ b/docker/postgresql/Dockerfile.debug-no-optimization @@ -51,6 +51,7 @@ ENV LD_LIBRARY_PATH="/usr/local/pgsql/lib:${LD_LIBRARY_PATH}" # Copy entire source tree (needed for includes and makefiles) COPY src/ ./src/ +COPY modules/ ./modules/ COPY docker/ ./docker/ COPY Makefile . 
@@ -65,11 +66,11 @@ RUN set -eux; \ make postgres-build PG_DEBUG=1 \ PG_CFLAGS="-fPIC -Wall -Wextra -Wno-unused-parameter -std=c11 -g -O0 -fno-omit-frame-pointer ${ASAN_CFLAGS}" \ PG_LDFLAGS="-shared ${ASAN_LDFLAGS}" \ - PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ + PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -Imodules/fractional-indexing -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ make postgres-install PG_DEBUG=1 \ PG_CFLAGS="-fPIC -Wall -Wextra -Wno-unused-parameter -std=c11 -g -O0 -fno-omit-frame-pointer ${ASAN_CFLAGS}" \ PG_LDFLAGS="-shared ${ASAN_LDFLAGS}" \ - PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ + PG_CPPFLAGS="-I$(pg_config --includedir-server) -Isrc -Isrc/postgresql -Imodules/fractional-indexing -DCLOUDSYNC_POSTGRESQL_BUILD -D_POSIX_C_SOURCE=200809L -D_GNU_SOURCE" && \ make postgres-clean # Verify installation diff --git a/docker/postgresql/Dockerfile.supabase b/docker/postgresql/Dockerfile.supabase index a609f68..0b5cd10 100644 --- a/docker/postgresql/Dockerfile.supabase +++ b/docker/postgresql/Dockerfile.supabase @@ -15,6 +15,7 @@ WORKDIR /tmp/cloudsync # Copy entire source tree (needed for includes and makefiles) COPY src/ ./src/ +COPY modules/ ./modules/ COPY docker/ ./docker/ COPY Makefile . diff --git a/docs/Network.md b/docs/Network.md index 7120231..7e03bbe 100644 --- a/docs/Network.md +++ b/docs/Network.md @@ -34,14 +34,6 @@ This is useful when: You must provide implementations for the following C functions: - ```c - bool network_compute_endpoints (sqlite3_context *context, network_data *data, const char *conn_string); - - // Parses `conn_string` and fills the `network_data` structure with connection information (e.g. base URL, endpoints, credentials). 
- // Returns `true` on success, `false` on error (you can use `sqlite3_result_error` to report errors to SQLite). - - ``` - ```c bool network_send_buffer (network_data *data, const char *endpoint, const char *authentication, const void *blob, int blob_size); diff --git a/docs/postgresql/CLIENT.md b/docs/postgresql/CLIENT.md index 9ef8cc8..58751d1 100644 --- a/docs/postgresql/CLIENT.md +++ b/docs/postgresql/CLIENT.md @@ -34,8 +34,8 @@ so CloudSync can sync between a PostgreSQL server and SQLite clients. ### 1) Primary Keys -- Use **TEXT NOT NULL** primary keys in SQLite. -- PostgreSQL primary keys can be **TEXT NOT NULL** or **UUID**. If the PK type +- Use **TEXT** primary keys in SQLite. +- PostgreSQL primary keys can be **TEXT** or **UUID**. If the PK type isn't explicitly mapped to a DBTYPE (like UUID), it will be converted to TEXT in the payload so it remains compatible with the SQLite extension. - Generate IDs with `cloudsync_uuid()` on both sides. @@ -43,17 +43,17 @@ so CloudSync can sync between a PostgreSQL server and SQLite clients. 
SQLite: ```sql -id TEXT PRIMARY KEY NOT NULL +id TEXT PRIMARY KEY ``` PostgreSQL: ```sql -id TEXT PRIMARY KEY NOT NULL +id TEXT PRIMARY KEY ``` PostgreSQL (UUID): ```sql -id UUID PRIMARY KEY NOT NULL +id UUID PRIMARY KEY ``` ### 2) NOT NULL Columns Must Have DEFAULTs @@ -99,7 +99,7 @@ Use defaults that serialize the same on both sides: SQLite: ```sql CREATE TABLE notes ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, title TEXT NOT NULL DEFAULT '', body TEXT DEFAULT '', views INTEGER NOT NULL DEFAULT 0, @@ -111,7 +111,7 @@ CREATE TABLE notes ( PostgreSQL: ```sql CREATE TABLE notes ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, title TEXT NOT NULL DEFAULT '', body TEXT DEFAULT '', views INTEGER NOT NULL DEFAULT 0, @@ -136,7 +136,7 @@ SELECT cloudsync_init('notes'); ### Checklist -- [ ] PKs are TEXT + NOT NULL +- [ ] PKs are TEXT (or UUID in PostgreSQL) - [ ] All NOT NULL columns have DEFAULT - [ ] Only INTEGER/FLOAT/TEXT/BLOB-compatible types - [ ] Same column names and order diff --git a/docs/postgresql/RLS.md b/docs/postgresql/RLS.md new file mode 100644 index 0000000..cc686ad --- /dev/null +++ b/docs/postgresql/RLS.md @@ -0,0 +1,192 @@ +# Row Level Security (RLS) with CloudSync + +CloudSync is fully compatible with PostgreSQL Row Level Security. Standard RLS policies work out of the box. + +## How It Works + +### Column-batch merge + +CloudSync resolves CRDT conflicts at the column level — a sync payload may contain individual column changes arriving one at a time. Before writing to the target table, CloudSync buffers all winning column values for the same primary key and flushes them as a single SQL statement. This ensures the database sees a complete row with all columns present. + +### UPDATE vs INSERT selection + +When flushing a batch, CloudSync chooses the statement type based on whether the row already exists locally: + +- **New row**: `INSERT ... 
ON CONFLICT DO UPDATE` — all columns are present (including the ownership column), so the INSERT `WITH CHECK` policy can evaluate correctly. +- **Existing row**: `UPDATE ... SET ... WHERE pk = ...` — only the changed columns are set. The UPDATE `USING` policy checks the existing row, which already has the correct ownership column value. + +### Per-PK savepoint isolation + +Each primary key's flush is wrapped in its own savepoint. When RLS denies a write: + +1. The database raises an error inside the savepoint +2. CloudSync rolls back that savepoint, releasing all resources acquired during the failed statement +3. Processing continues with the next primary key + +This means a single payload can contain a mix of allowed and denied rows — allowed rows commit normally, denied rows are silently skipped. The caller receives the total number of column changes processed (including denied ones) rather than an error. + +## Quick Setup + +Given a table with an ownership column (`user_id`): + +```sql +CREATE TABLE documents ( + id TEXT PRIMARY KEY, + user_id UUID, + title TEXT, + content TEXT +); + +SELECT cloudsync_init('documents'); +``` + +Enable RLS and create standard policies: + +```sql +ALTER TABLE documents ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "select_own" ON documents FOR SELECT + USING (auth.uid() = user_id); + +CREATE POLICY "insert_own" ON documents FOR INSERT + WITH CHECK (auth.uid() = user_id); + +CREATE POLICY "update_own" ON documents FOR UPDATE + USING (auth.uid() = user_id) + WITH CHECK (auth.uid() = user_id); + +CREATE POLICY "delete_own" ON documents FOR DELETE + USING (auth.uid() = user_id); +``` + +## Example: Two-User Sync with RLS + +This example shows the complete flow of syncing data between two databases where the target enforces RLS. 
+ +### Setup + +```sql +-- Source database (DB A) — no RLS, represents the sync server +CREATE TABLE documents ( + id TEXT PRIMARY KEY, user_id UUID, title TEXT, content TEXT +); +SELECT cloudsync_init('documents'); + +-- Target database (DB B) — RLS enforced +CREATE TABLE documents ( + id TEXT PRIMARY KEY, user_id UUID, title TEXT, content TEXT +); +SELECT cloudsync_init('documents'); +ALTER TABLE documents ENABLE ROW LEVEL SECURITY; +-- (policies as above) +``` + +### Insert sync + +User 1 creates a document on DB A: + +```sql +-- On DB A +INSERT INTO documents VALUES ('doc1', 'user1-uuid', 'Hello', 'World'); +``` + +Apply the payload on DB B as the authenticated user: + +```sql +-- On DB B (running as user1) +SET app.current_user_id = 'user1-uuid'; +SET ROLE authenticated; +SELECT cloudsync_payload_apply(decode(:payload_hex, 'hex')); +``` + +The insert succeeds because `user_id` matches `auth.uid()`. + +### Insert denial + +User 1 tries to sync a document owned by user 2: + +```sql +-- On DB A +INSERT INTO documents VALUES ('doc2', 'user2-uuid', 'Secret', 'Data'); +``` + +```sql +-- On DB B (running as user1) +SET app.current_user_id = 'user1-uuid'; +SET ROLE authenticated; +SELECT cloudsync_payload_apply(decode(:payload_hex, 'hex')); +``` + +The insert is denied by RLS. The row does not appear in DB B. No error is raised to the caller — CloudSync isolates the failure via a per-PK savepoint and continues processing the remaining payload. + +### Partial update sync + +User 1 updates only the title of their own document: + +```sql +-- On DB A +UPDATE documents SET title = 'Hello Updated' WHERE id = 'doc1'; +``` + +The sync payload contains only the changed column (`title`). CloudSync detects that the row already exists on DB B and uses a plain `UPDATE` statement: + +```sql +UPDATE documents SET title = $2 WHERE id = $1; +``` + +The UPDATE policy checks the existing row (which has the correct `user_id`), so it succeeds. 
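+
+### Update denial
+
+A symmetric denial case can be sketched for updates (hypothetical continuation: assume `doc2` already exists on DB B, owned by `user2-uuid`). If user 1's payload carries a title change for that row, CloudSync detects that the row exists and issues a plain `UPDATE`; the UPDATE `USING` policy filters out the existing row because its `user_id` does not match `auth.uid()`, so the statement matches zero rows and no error is raised:
+
+```sql
+-- On DB A
+UPDATE documents SET title = 'Hijacked' WHERE id = 'doc2';
+```
+
+```sql
+-- On DB B (running as user1)
+SET app.current_user_id = 'user1-uuid';
+SET ROLE authenticated;
+SELECT cloudsync_payload_apply(decode(:payload_hex, 'hex'));
+-- doc2 is unchanged: the USING policy hides rows user1 does not own,
+-- so the UPDATE affects zero rows; the rest of the payload still applies
+```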
+ +### Mixed payload + +When a single payload contains rows for multiple users, CloudSync handles each primary key independently: + +```sql +-- On DB A +INSERT INTO documents VALUES ('doc3', 'user1-uuid', 'Mine', '...'); +INSERT INTO documents VALUES ('doc4', 'user2-uuid', 'Theirs', '...'); +``` + +```sql +-- On DB B (running as user1) +SELECT cloudsync_payload_apply(decode(:payload_hex, 'hex')); +-- doc3 is inserted (allowed), doc4 is silently skipped (denied) +``` + +## Supabase Notes + +When using Supabase: + +1. **auth.uid()**: Returns the authenticated user's UUID from the JWT claims. +2. **JWT propagation**: Ensure the JWT token is set before sync operations: + ```sql + SELECT set_config('request.jwt.claims', '{"sub": "user-uuid", ...}', true); + ``` +3. **Service role bypass**: The Supabase service role bypasses RLS entirely. Use the `authenticated` role for user-context operations where RLS enforcement is desired. + +## Troubleshooting + +### "new row violates row-level security policy" + +**Symptom**: Insert operations fail during sync. + +**Cause**: The ownership column value doesn't match the authenticated user. + +**Solution**: Verify that: +- The JWT / session variable is set correctly before calling `cloudsync_payload_apply` +- The `user_id` column in the synced data matches `auth.uid()` +- RLS policies reference the correct ownership column + +### Debugging + +```sql +-- Check current auth context +SELECT auth.uid(); + +-- Inspect a specific row's ownership +SELECT id, user_id FROM documents WHERE id = 'problematic-pk'; + +-- Temporarily disable RLS to inspect all data +ALTER TABLE documents DISABLE ROW LEVEL SECURITY; +-- ... inspect ... 
+ALTER TABLE documents ENABLE ROW LEVEL SECURITY; +``` diff --git a/docs/postgresql/SUPABASE.md b/docs/postgresql/SUPABASE.md index 94aa466..a800ae3 100644 --- a/docs/postgresql/SUPABASE.md +++ b/docs/postgresql/SUPABASE.md @@ -76,7 +76,7 @@ SELECT cloudsync_version(); ```sql CREATE TABLE notes ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, body TEXT DEFAULT '' ); diff --git a/examples/simple-todo-db/README.md b/examples/simple-todo-db/README.md index c9967a5..6c7e977 100644 --- a/examples/simple-todo-db/README.md +++ b/examples/simple-todo-db/README.md @@ -59,7 +59,7 @@ Tables must be created on both the local database and SQLite Cloud with identica -- Create the main tasks table -- Note: Primary key MUST be TEXT (not INTEGER) for global uniqueness CREATE TABLE IF NOT EXISTS tasks ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, userid TEXT NOT NULL DEFAULT '', title TEXT NOT NULL DEFAULT '', description TEXT DEFAULT '', @@ -84,7 +84,7 @@ SELECT cloudsync_is_enabled('tasks'); - Execute the same CREATE TABLE statement: ```sql CREATE TABLE IF NOT EXISTS tasks ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, userid TEXT NOT NULL DEFAULT '', title TEXT NOT NULL DEFAULT '', description TEXT DEFAULT '', @@ -104,8 +104,8 @@ SELECT cloudsync_is_enabled('tasks'); ```sql -- Configure connection to SQLite Cloud --- Replace with your actual connection string from Step 1.3 -SELECT cloudsync_network_init('sqlitecloud://your-project-id.sqlite.cloud/todo_app.sqlite'); +-- Replace with your managedDatabaseId from the OffSync page on the SQLiteCloud dashboard +SELECT cloudsync_network_init('your-managed-database-id'); -- Configure authentication: -- Set your API key from Step 1.3 @@ -149,7 +149,7 @@ sqlite3 todo_device_b.db ```sql -- Create identical table structure CREATE TABLE IF NOT EXISTS tasks ( - id TEXT PRIMARY KEY NOT NULL, + id TEXT PRIMARY KEY, userid TEXT NOT NULL DEFAULT '', title TEXT NOT NULL DEFAULT '', description TEXT DEFAULT '', @@ 
-163,12 +163,12 @@ CREATE TABLE IF NOT EXISTS tasks ( SELECT cloudsync_init('tasks'); -- Connect to the same cloud database -SELECT cloudsync_network_init('sqlitecloud://your-project-id.sqlite.cloud/todo_app.sqlite'); +SELECT cloudsync_network_init('your-managed-database-id'); SELECT cloudsync_network_set_apikey('your-api-key-here'); -- Pull data from Device A - repeat until data is received SELECT cloudsync_network_sync(); --- Keep calling until the function returns > 0 (indicating data was received) +-- Check "receive.rows" in the JSON result to see if data was received SELECT cloudsync_network_sync(); -- Verify data was synced @@ -199,7 +199,7 @@ SELECT cloudsync_network_sync(); ```sql -- Get updates from Device B - repeat until data is received SELECT cloudsync_network_sync(); --- Keep calling until the function returns > 0 (indicating data was received) +-- Check "receive.rows" in the JSON result to see if data was received SELECT cloudsync_network_sync(); -- View all tasks (should now include Device B's additions) @@ -232,7 +232,7 @@ SELECT cloudsync_network_has_unsent_changes(); -- When network returns, sync automatically resolves conflicts -- Repeat until all changes are synchronized SELECT cloudsync_network_sync(); --- Keep calling until the function returns > 0 (indicating data was received/sent) +-- Check "receive.rows" and "send.status" in the JSON result SELECT cloudsync_network_sync(); ``` diff --git a/examples/sport-tracker-app/.env.example b/examples/sport-tracker-app/.env.example index ce5c7cc..c534674 100644 --- a/examples/sport-tracker-app/.env.example +++ b/examples/sport-tracker-app/.env.example @@ -1,6 +1,5 @@ -# Copy from from the SQLite Cloud Dashboard -# eg: sqlitecloud://myhost.cloud:8860/my-remote-database.sqlite -VITE_SQLITECLOUD_CONNECTION_STRING= +# Copy the managedDatabaseId from the OffSync page on the SQLiteCloud Dashboard +VITE_SQLITECLOUD_MANAGED_DATABASE_ID= # The database name # eg: my-remote-database.sqlite 
VITE_SQLITECLOUD_DATABASE= diff --git a/examples/sport-tracker-app/src/db/sqliteSyncOperations.ts b/examples/sport-tracker-app/src/db/sqliteSyncOperations.ts index 90e8982..79fe440 100644 --- a/examples/sport-tracker-app/src/db/sqliteSyncOperations.ts +++ b/examples/sport-tracker-app/src/db/sqliteSyncOperations.ts @@ -90,12 +90,12 @@ export const initSQLiteSync = (db: any) => { // ...or initialize all tables at once // db.exec('SELECT cloudsync_init("*");'); - // Initialize SQLite Sync with the SQLite Cloud Connection String. - // On the SQLite Cloud Dashboard, enable OffSync (SQLite Sync) - // on the remote database and copy the Connection String. + // Initialize SQLite Sync with the managedDatabaseId. + // On the SQLite Cloud Dashboard, enable OffSync (SQLite Sync) + // on the remote database and copy the managedDatabaseId. db.exec( `SELECT cloudsync_network_init('${ - import.meta.env.VITE_SQLITECLOUD_CONNECTION_STRING + import.meta.env.VITE_SQLITECLOUD_MANAGED_DATABASE_ID }')` ); }; diff --git a/examples/to-do-app/.env.example b/examples/to-do-app/.env.example index 267ea63..ba99068 100644 --- a/examples/to-do-app/.env.example +++ b/examples/to-do-app/.env.example @@ -1,4 +1,3 @@ -# Copy from the SQLite Cloud Dashboard -# eg: sqlitecloud://myhost.cloud:8860/my-remote-database.sqlite?apikey=myapikey -CONNECTION_STRING = "" -API_TOKEN = \ No newline at end of file +# Copy from the OffSync page on the SQLiteCloud Dashboard +MANAGED_DATABASE_ID = "" +API_TOKEN = diff --git a/examples/to-do-app/components/SyncContext.js b/examples/to-do-app/components/SyncContext.js index e964f4a..7b076ef 100644 --- a/examples/to-do-app/components/SyncContext.js +++ b/examples/to-do-app/components/SyncContext.js @@ -58,10 +58,14 @@ export const SyncProvider = ({ children }) => { const result = await Promise.race([queryPromise, timeoutPromise]); - if (result.rows && result.rows.length > 0 && result.rows[0]['cloudsync_network_check_changes()'] > 0) { - 
console.log(`${result.rows[0]['cloudsync_network_check_changes()']} changes detected, triggering refresh`); - // Defer refresh to next tick to avoid blocking current interaction - setTimeout(() => triggerRefresh(), 0); + const raw = result.rows?.[0]?.['cloudsync_network_check_changes()']; + if (raw) { + const { receive } = JSON.parse(raw); + if (receive.rows > 0) { + console.log(`${receive.rows} changes detected in [${receive.tables}], triggering refresh`); + // Defer refresh to next tick to avoid blocking current interaction + setTimeout(() => triggerRefresh(), 0); + } } } catch (error) { console.error('Error checking for changes:', error); diff --git a/examples/to-do-app/hooks/useCategories.js b/examples/to-do-app/hooks/useCategories.js index a27ef4a..dc608bd 100644 --- a/examples/to-do-app/hooks/useCategories.js +++ b/examples/to-do-app/hooks/useCategories.js @@ -1,7 +1,7 @@ import { useState, useEffect } from 'react' import { Platform } from 'react-native'; import { db } from "../db/dbConnection"; -import { ANDROID_CONNECTION_STRING, CONNECTION_STRING, API_TOKEN } from "@env"; +import { ANDROID_MANAGED_DATABASE_ID, MANAGED_DATABASE_ID, API_TOKEN } from "@env"; import { getDylibPath } from "@op-engineering/op-sqlite"; import { randomUUID } from 'expo-crypto'; import { useSyncContext } from '../components/SyncContext'; @@ -72,11 +72,11 @@ const useCategories = () => { await db.execute('INSERT OR IGNORE INTO tags (uuid, name) VALUES (?, ?)', ['work', 'Work']) await db.execute('INSERT OR IGNORE INTO tags (uuid, name) VALUES (?, ?)', ['personal', 'Personal']) - if ((ANDROID_CONNECTION_STRING || CONNECTION_STRING) && API_TOKEN) { - await db.execute(`SELECT cloudsync_network_init('${Platform.OS == 'android' && ANDROID_CONNECTION_STRING ? 
ANDROID_CONNECTION_STRING : CONNECTION_STRING}');`); + if ((ANDROID_MANAGED_DATABASE_ID || MANAGED_DATABASE_ID) && API_TOKEN) { + await db.execute(`SELECT cloudsync_network_init('${Platform.OS == 'android' && ANDROID_MANAGED_DATABASE_ID ? ANDROID_MANAGED_DATABASE_ID : MANAGED_DATABASE_ID}');`); await db.execute(`SELECT cloudsync_network_set_token('${API_TOKEN}');`) } else { - throw new Error('No valid CONNECTION_STRING or API_TOKEN provided, cloudsync_network_init will not be called'); + throw new Error('No valid MANAGED_DATABASE_ID or API_TOKEN provided, cloudsync_network_init will not be called'); } db.execute('SELECT cloudsync_network_sync(100, 10);') diff --git a/modules/fractional-indexing b/modules/fractional-indexing new file mode 160000 index 0000000..b9af0ec --- /dev/null +++ b/modules/fractional-indexing @@ -0,0 +1 @@ +Subproject commit b9af0ec5b818bca29919e1a8d42b142feb71f269 diff --git a/plans/BATCH_MERGE_AND_RLS.md b/plans/BATCH_MERGE_AND_RLS.md new file mode 100644 index 0000000..def727e --- /dev/null +++ b/plans/BATCH_MERGE_AND_RLS.md @@ -0,0 +1,166 @@ +# Deferred Column-Batch Merge and RLS Support + +## Problem + +CloudSync resolves CRDT conflicts per-column, so `cloudsync_payload_apply` processes column changes one at a time. Previously each winning column was written immediately via a single-column `INSERT ... ON CONFLICT DO UPDATE`. This caused two issues with PostgreSQL RLS: + +1. **Partial-column UPSERT fails INSERT WITH CHECK**: An update to just `title` generates `INSERT INTO docs (id, title) VALUES (...) ON CONFLICT DO UPDATE SET title=...`. PostgreSQL evaluates the INSERT `WITH CHECK` policy *before* checking for conflicts. Missing columns (e.g. `user_id`) default to NULL, so `auth.uid() = user_id` fails. The ON CONFLICT path is never reached. + +2. **Premature flush in SPI**: `database_in_transaction()` always returns true inside PostgreSQL SPI. 
The old code only updated `last_payload_db_version` inside `if (!in_transaction && db_version_changed)`, so the variable stayed at -1, `db_version_changed` was true on every row, and batches flushed after every single column. + +## Solution + +### Batch merge (`merge_pending_batch`) + +New structs in `cloudsync.c`: + +- `merge_pending_entry` — one buffered column (col_name, col_value via `database_value_dup`, col_version, db_version, site_id, seq) +- `merge_pending_batch` — collects entries for one PK (table, pk, row_exists flag, entries array, statement cache) + +`data->pending_batch` is set to `&batch` (stack-allocated) at the start of `cloudsync_payload_apply`. The INSTEAD OF trigger calls `merge_insert`, which calls `merge_pending_add` instead of `merge_insert_col`. Flush happens at PK/table/db_version boundaries and after the loop. + +### UPDATE vs UPSERT (`row_exists` flag) + +`merge_insert` sets `batch->row_exists = (local_cl != 0)` on the first winning column. At flush time `merge_flush_pending` selects: + +- `row_exists=true` -> `sql_build_update_pk_and_multi_cols` -> `UPDATE docs SET title=? WHERE id=?` +- `row_exists=false` -> `sql_build_upsert_pk_and_multi_cols` -> `INSERT ... ON CONFLICT DO UPDATE` + +Both SQLite and PostgreSQL implement `sql_build_update_pk_and_multi_cols` as a proper UPDATE statement. This is required for SQLiteCloud (which uses the SQLite extension but enforces RLS). + +**Example**: DB A and DB B both have row `id='doc1'` with `user_id='alice'`, `title='Hello'`. Alice updates `title='World'` on A. The payload applied to B contains only `(id, title)`: + +- **UPSERT** (wrong for RLS): `INSERT INTO docs ("id","title") VALUES (?,?) ON CONFLICT DO UPDATE SET "title"=EXCLUDED."title"` — fails INSERT `WITH CHECK` because `user_id` is NULL in the proposed row. 
+- **UPDATE** (correct): `UPDATE "docs" SET "title"=?2 WHERE "id"=?1` — skips INSERT `WITH CHECK` entirely; the UPDATE `USING` policy checks the existing row which has the correct `user_id`. + +In plain SQLite (no RLS) both produce the same result. The distinction only matters when RLS is enforced (SQLiteCloud, PostgreSQL). + +### Statement cache + +`merge_pending_batch` caches the last prepared statement (`cached_vm`) along with the column combination and `row_exists` flag that produced it. On each flush, `merge_flush_pending` compares the current column names, count, and `row_exists` against the cache: + +- **Cache hit**: `dbvm_reset` + rebind (skip SQL build and `databasevm_prepare`) +- **Cache miss**: finalize old cached statement, build new SQL, prepare, and update cache + +This recovers the precompiled-statement advantage of the old single-column path. In a typical payload where consecutive PKs change the same columns, the cache hit rate is high. + +The cached statement is finalized once at the end of `cloudsync_payload_apply`, not on every flush. + +### `last_payload_db_version` fix + +Moved the update outside the savepoint block so it executes unconditionally: + +```c +if (db_version_changed) { + last_payload_db_version = decoded_context.db_version; +} +``` + +Previously this was inside `if (!in_transaction && db_version_changed)`, which never ran in SPI. 
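+
+The cache-key comparison from the "Statement cache" section above can be sketched as follows — a minimal illustration only; the struct and function names are hypothetical, not the actual `cloudsync.c` identifiers:
+
+```c
+#include <stdbool.h>
+#include <string.h>
+
+/* Hypothetical mirror of the cache fields kept on merge_pending_batch. */
+typedef struct {
+    void        *cached_vm;         /* last prepared statement, or NULL */
+    const char **cached_col_names;  /* column combination it was built for */
+    int          cached_ncols;
+    bool         cached_row_exists; /* UPDATE vs UPSERT produce different SQL */
+} stmt_cache;
+
+/* Returns true when the cached statement can be reused (reset + rebind)
+   instead of rebuilding SQL and re-preparing. */
+static bool cache_matches (stmt_cache *c, const char **cols, int ncols, bool row_exists) {
+    if (!c->cached_vm) return false;                      /* nothing cached yet */
+    if (c->cached_ncols != ncols) return false;           /* different column count */
+    if (c->cached_row_exists != row_exists) return false; /* different statement type */
+    for (int i = 0; i < ncols; i++) {                     /* same columns, same order */
+        if (strcmp(c->cached_col_names[i], cols[i]) != 0) return false;
+    }
+    return true;
+}
+```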
+ +## Savepoint Architecture + +### Two-level savepoint design + +`cloudsync_payload_apply` uses two layers of savepoints that serve different purposes: + +| Layer | Where | Purpose | +|-------|-------|---------| +| **Outer** (per-db_version) | `cloudsync_payload_apply` loop | Transaction grouping + commit hook trigger (SQLite only) | +| **Inner** (per-PK) | `merge_flush_pending` | RLS error isolation + executor resource cleanup | + +### Outer savepoints: per-db_version in `cloudsync_payload_apply` + +```c +if (!in_savepoint && db_version_changed && !database_in_transaction(data)) { + database_begin_savepoint(data, "cloudsync_payload_apply"); + in_savepoint = true; +} +``` + +These savepoints group rows with the same source `db_version` into one transaction. The `RELEASE` (commit) at each db_version boundary triggers `cloudsync_commit_hook`, which: +- Saves `pending_db_version` as the new `data->db_version` +- Resets `data->seq = 0` + +This ensures unique `(db_version, seq)` tuples in `cloudsync_changes` across groups. + +**In PostgreSQL SPI, these are dead code**: `database_in_transaction()` returns `true` (via `IsTransactionState()`), so the condition `!database_in_transaction(data)` is always false and `in_savepoint` is never set. This is correct because: +1. PostgreSQL has no equivalent commit hook on subtransaction release +2. The SPI transaction from `SPI_connect` already provides transaction context +3. The inner per-PK savepoint handles the RLS isolation PostgreSQL needs + +**Why a single outer savepoint doesn't work**: We tested replacing per-db_version savepoints with a single savepoint wrapping the entire loop. This broke the `(db_version, seq)` uniqueness invariant in SQLite because the commit hook never fired mid-apply — `data->db_version` never advanced and `seq` never reset. + +### Inner savepoints: per-PK in `merge_flush_pending` + +```c +flush_savepoint = (database_begin_savepoint(data, "merge_flush") == DBRES_OK); +// ... database operations ... 
+cleanup: + if (flush_savepoint) { + if (rc == DBRES_OK) database_commit_savepoint(data, "merge_flush"); + else database_rollback_savepoint(data, "merge_flush"); + } +``` + +Wraps each PK's flush in a savepoint. On failure (e.g. RLS denial), `database_rollback_savepoint` calls `RollbackAndReleaseCurrentSubTransaction()` in PostgreSQL, which properly releases all executor resources (open relations, snapshots, plan cache) acquired during the failed statement. This eliminates the "resource was not closed" warnings that `SPI_finish` previously emitted. + +In SQLite, when the outer per-db_version savepoint is active, these become harmless nested savepoints. + +### Platform behavior summary + +| Environment | Outer savepoint | Inner savepoint | Effect | +|---|---|---|---| +| **PostgreSQL SPI** | Dead code (`in_transaction` always true) | Active — RLS error isolation + resource cleanup | Only inner savepoint runs | +| **SQLite client** | Active — groups writes, triggers commit hook | Active — nested inside outer, harmless | Both run; outer provides transaction grouping | +| **SQLiteCloud** | Active — groups writes, triggers commit hook | Active — RLS error isolation | Both run; each serves its purpose | + +## SPI and Memory Management + +### Nested SPI levels + +`pg_cloudsync_payload_apply` calls `SPI_connect` (level 1). Inside the loop, `databasevm_step` executes `INSERT INTO cloudsync_changes`, which fires the INSTEAD OF trigger. The trigger calls `SPI_connect` (level 2), runs `merge_insert` / `merge_pending_add`, then `SPI_finish` back to level 1. The deferred `merge_flush_pending` runs at level 1. + +### `database_in_transaction()` in SPI + +Always returns true in SPI context (`IsTransactionState()`). This makes the per-db_version savepoints dead code in PostgreSQL and is why `last_payload_db_version` must be updated unconditionally. + +### Error handling in SPI + +When RLS denies a write, PostgreSQL raises an error inside SPI. 
The inner per-PK savepoint in `merge_flush_pending` catches this: `RollbackAndReleaseCurrentSubTransaction()` properly releases all executor resources. Without the savepoint, `databasevm_step`'s `PG_CATCH` + `FlushErrorState()` would clear the error stack but leave executor resources orphaned, causing `SPI_finish` to emit "resource was not closed" warnings. + +### Batch cleanup paths + +`batch.entries` is heap-allocated via `cloudsync_memory_realloc` and reused across flushes. Each entry's `col_value` (from `database_value_dup`) is freed by `merge_pending_free_entries` on every flush. The entries array, `cached_vm`, and `cached_col_names` are freed once at the end of `cloudsync_payload_apply`. Error paths (`goto cleanup`, early returns) must free all three and call `merge_pending_free_entries` to avoid leaking `col_value` copies. + +## Batch Apply: Pros and Cons + +The batch path is used for all platforms (SQLite client, SQLiteCloud, PostgreSQL), not just when RLS is active. + +**Pros (even without RLS)**: +- Fewer SQL executions: N winning columns per PK become 1 statement instead of N. Each `databasevm_step` involves B-tree lookup, page modification, WAL write. +- Atomicity per PK: all columns for a PK succeed or fail together. + +**Cons**: +- Dynamic SQL per unique column combination (mitigated by the statement cache). +- Memory overhead: `database_value_dup` copies each column value into the buffer. +- Code complexity: batching structs, flush logic, cleanup paths. + +**Why not maintain two paths**: SQLiteCloud uses the SQLite extension with RLS, so the batch path (UPDATE vs UPSERT selection, per-PK savepoints) is required there. Maintaining a separate single-column path for plain SQLite clients would double the code with marginal benefit. 
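+
+As a concrete illustration of the single-statement flush (hypothetical `docs` table, with two winning columns `title` and `body` buffered for one PK), the two paths emit:
+
+```sql
+-- row_exists = false: one UPSERT carrying all buffered columns
+INSERT INTO "docs" ("id","title","body") VALUES (?1,?2,?3)
+    ON CONFLICT ("id") DO UPDATE SET "title"=EXCLUDED."title", "body"=EXCLUDED."body";
+
+-- row_exists = true: one UPDATE touching only the buffered columns
+UPDATE "docs" SET "title"=?2, "body"=?3 WHERE "id"=?1;
+```
+
+With the old single-column path, the same payload would have executed one statement per column for each PK.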
+ +## Files Changed + +| File | Change | +|------|--------| +| `src/cloudsync.c` | Batch merge structs with statement cache (`cached_vm`, `cached_col_names`), `merge_pending_add`, `merge_flush_pending` (with per-PK savepoint), `merge_pending_free_entries`; `pending_batch` field on context; `row_exists` propagation in `merge_insert`; batch mode in `merge_sentinel_only_insert`; `last_payload_db_version` fix; removed `payload_apply_callback` | +| `src/cloudsync.h` | Removed `CLOUDSYNC_PAYLOAD_APPLY_STEPS` enum | +| `src/database.h` | Added `sql_build_upsert_pk_and_multi_cols`, `sql_build_update_pk_and_multi_cols`; removed callback typedefs | +| `src/sqlite/database_sqlite.c` | Implemented `sql_build_upsert_pk_and_multi_cols` (dynamic SQL); `sql_build_update_pk_and_multi_cols` (delegates to upsert); removed callback functions | +| `src/postgresql/database_postgresql.c` | Implemented `sql_build_update_pk_and_multi_cols` (meta-query against `pg_catalog` generating typed UPDATE) | +| `test/unit.c` | Removed callback code and `do_test_andrea` debug function (fixed 288048-byte memory leak) | +| `test/postgresql/27_rls_batch_merge.sql` | Tests 1-3 (superuser) + Tests 4-6 (authenticated-role RLS enforcement) | +| `docs/postgresql/RLS.md` | Documented INSERT vs UPDATE paths and partial-column RLS interaction | + +## TODO + + - update documentation: RLS.md, README.md and the https://github.com/sqlitecloud/docs repo diff --git a/plans/ISSUE_POSTGRES_SCHEMA.md b/plans/ISSUE_POSTGRES_SCHEMA.md deleted file mode 100644 index a34b0e2..0000000 --- a/plans/ISSUE_POSTGRES_SCHEMA.md +++ /dev/null @@ -1,73 +0,0 @@ -Issue summary - -cloudsync_init('users') fails in Supabase postgres with: -"column reference \"id\" is ambiguous". -Both public.users and auth.users exist. Several PostgreSQL SQL templates use only table_name (no schema), so information_schema lookups and dynamic SQL see multiple tables and generate ambiguous column references. 
- -Proposed fixes (options) - -1) Minimal fix (patch specific templates) -- Add table_schema = current_schema() to information_schema queries. -- Keep relying on search_path. -- Resolves Supabase default postgres collisions without changing the API. - -2) Robust fix (explicit schema support) -- Allow schema-qualified inputs, e.g. cloudsync_init('public.users'). -- Parse schema/table and propagate through query builders. -- Always generate fully-qualified table names ("schema"."table"). -- Apply schema-aware filters in information_schema queries. -- Removes ambiguity regardless of search_path or duplicate table names across schemas. -- Note: payload compatibility requires cloudsync_changes.tbl to remain unqualified; PG apply should resolve schema via cloudsync_table_settings (not search_path) when applying payloads. - -Bugged query templates - -Already fixed: -- SQL_PRAGMA_TABLEINFO_PK_COLLIST -- SQL_PRAGMA_TABLEINFO_PK_DECODE_SELECTLIST - -Still vulnerable (missing schema filter): -- SQL_BUILD_SELECT_NONPK_COLS_BY_ROWID -- SQL_PRAGMA_TABLEINFO_LIST_NONPK_NAME_CID -- SQL_CLOUDSYNC_DELETE_COLS_NOT_IN_SCHEMA_OR_PKCOL -- SQL_PRAGMA_TABLEINFO_PK_QUALIFIED_COLLIST_FMT - -Robust fix implementation plan - -Goals -- Support cloudsync_init('users') and cloudsync_init('public.users') -- Default schema to current_schema() when not provided -- Persist schema so future connections are independent of search_path -- Generate fully qualified table names in all PostgreSQL SQL builders - -1) Parse schema/table at init -- In cloudsync_init_table() (cloudsync.c), parse the input table_name: - - If it contains a dot, split schema/table - - Else schema = current_schema() (query once) -- Normalize case to match existing behavior - -2) Persist schema in settings -- Store schema in cloudsync_table_settings using key='schema' -- Keep tbl_name as unqualified table name -- On first run, if schema is not stored, write it - -3) Store schema in context -- Add char *schema to 
cloudsync_table_context -- Populate on table creation and when reloading from settings -- Use schema when building SQL - -4) Restore schema on new connections -- During context rebuild, read schema from cloudsync_table_settings -- If missing, fallback to current_schema(), optionally persist it - -5) Qualify SQL everywhere (Postgres) -- Use "schema"."table" in generated SQL -- Add table_schema filters to information_schema queries: - - SQL_BUILD_SELECT_NONPK_COLS_BY_ROWID - - SQL_PRAGMA_TABLEINFO_LIST_NONPK_NAME_CID - - SQL_CLOUDSYNC_DELETE_COLS_NOT_IN_SCHEMA_OR_PKCOL - - SQL_PRAGMA_TABLEINFO_PK_QUALIFIED_COLLIST_FMT - - Any other information_schema templates using only table_name - -6) Compatibility -- Existing DBs without schema setting continue to work via current_schema() -- No API changes required for unqualified names diff --git a/plans/ISSUE_WARNING_resource_was_not_closed.md b/plans/ISSUE_WARNING_resource_was_not_closed.md deleted file mode 100644 index 579dbb0..0000000 --- a/plans/ISSUE_WARNING_resource_was_not_closed.md +++ /dev/null @@ -1,64 +0,0 @@ -# WARNING: resource was not closed: relation "cloudsync_changes" - -## Summary -The warning was emitted by PostgreSQL when a SPI query left a “relation” resource open. In practice, it means a SPI tuptable (or a relation opened internally by SPI when executing a query) wasn’t released before the outer SQL statement completed. PostgreSQL 17 is stricter about reporting this, so the same issue might have been silent in earlier versions. - -We isolated the warning to the `cloudsync_payload_apply` path when it inserted into the `cloudsync_changes` view and triggered `cloudsync_changes_insert_trigger`. The warnings did **not** occur for direct, manual `INSERT INTO cloudsync_changes ...` statements issued in psql. - -## Why it only happened in the payload-apply path -The key difference was **nested SPI usage** and **statement lifetime**: - -1. 
**`cloudsync_payload_apply` loops many changes and uses SPI internally** - - `cloudsync_payload_apply` is a C function that processes a payload by decoding multiple changes and applying them in a loop. - - For each change, it executed an `INSERT INTO cloudsync_changes (...)` (via `SQL_CHANGES_INSERT_ROW`), which fires the INSTEAD OF trigger (`cloudsync_changes_insert_trigger`). - -2. **The trigger itself executed SPI queries** - - The trigger function uses SPI to read and write metadata tables. - - This creates *nested* SPI usage within a call stack that is already inside a SPI-driven C function. - -3. **Nested SPI + `INSERT INTO view` has different resource lifetime than a plain insert** - - With a manual psql statement, the SPI usage occurs only once, in a clean top-level context. The statement finishes, SPI cleanup happens, and any tuptable resources are released. - - In the payload apply path, SPI queries happen inside the trigger, inside another SPI-driven C function, inside a loop. If any intermediate SPI tuptable or relation is not freed, it can “leak” out of the trigger scope and be reported when the outer statement completes. - - That’s why the warning appears specifically when the trigger is executed as part of `cloudsync_payload_apply` but not for direct inserts from psql. - -4. **PostgreSQL 17 reports this more aggressively** - - Earlier versions often tolerated missing `SPI_freetuptable()` calls without warning. PG17 emits the warning when the statement finishes and resources are still registered as open. - -## Why direct INSERTs from psql didn’t warn -The smoke test included a manual `INSERT INTO cloudsync_changes ...`, and it never produced the warning. That statement: - -- Runs as a single SQL statement initiated by the client. -- Executes the trigger in a clean SPI call stack with no nested SPI calls. -- Completes quickly, and the SPI context is unwound immediately, which can mask missing frees. 
- -In contrast, the payload-apply path: - -- Opens SPI state for the duration of the payload apply loop. -- Executes many trigger invocations before returning. -- Accumulates any unfreed resources over several calls. - -So the leak only becomes visible in the payload-apply loop. - -## Fix that removed the warning -We introduced a new SQL function that bypasses the trigger and does the work directly: - -- Added `cloudsync_changes_apply(...)` and rewired `SQL_CHANGES_INSERT_ROW` to call it via: - ```sql - SELECT cloudsync_changes_apply(...) - ``` -- The apply function executes the same logic but without inserting into the view and firing the INSTEAD OF trigger. -- This removes the nested SPI + trigger path for the payload apply loop. - -Additionally, we tightened SPI cleanup in multiple functions by ensuring `SPI_freetuptable(SPI_tuptable)` is called after `SPI_execute`/`SPI_execute_plan` calls where needed. - -## Takeaway -The warning was not tied to the `cloudsync_changes` view itself, but to **nested SPI contexts and missing SPI cleanup** during payload apply. It was only visible when: - -- the apply loop executed many insert-trigger calls, and -- the server (PG17) reported unclosed relation resources at statement end. - -By switching to `cloudsync_changes_apply(...)` and tightening SPI tuptable cleanup, we removed the warning from the payload-apply path while leaving manual insert behavior unchanged. - -## Next TODO -- Add SPI instrumentation (DEBUG1 logs before/after SPI_execute* and after SPI_freetuptable/SPI_finish) along the payload-apply → view-insert → trigger path, then rerun the instrumented smoke test to pinpoint exactly where the warning is emitted. -- Note: We inspected the payload-apply → INSERT INTO cloudsync_changes → trigger call chain and did not find any missing SPI_freetuptable() or SPI_finish() calls in that path. 
diff --git a/plans/PG_CLOUDSYNC_CHANGES_COL_VALUE_BYTEA.md b/plans/PG_CLOUDSYNC_CHANGES_COL_VALUE_BYTEA.md deleted file mode 100644 index 62f6b1c..0000000 --- a/plans/PG_CLOUDSYNC_CHANGES_COL_VALUE_BYTEA.md +++ /dev/null @@ -1,104 +0,0 @@ -# Plan: PG cloudsync_changes col_value as encoded bytea - -Requirements (must hold): -- Keep payload format and pk encode/decode logic unchanged. -- Payloads must be interchangeable between SQLite and PostgreSQL peers. -- PostgreSQL `cloudsync_changes.col_value` should carry the already-encoded bytea (type-tagged cloudsync bytes) exactly like SQLite. -- The PostgreSQL layer must pass that bytea through without decoding; decoding happens only when applying to the base table value type. -- Keeping `col_value` as `text` (and casting in SQL) is not acceptable because `pk_encode` would treat it as `DBTYPE_TEXT`, losing original type info (numbers/blobs/null semantics) and producing payloads that are not portable to SQLite peers. - -Goals and tradeoffs for the cached helper approach: -- Goal: preserve SQLite-compatible payloads by encoding `col_value` with the same pk wire format before it reaches the SRF/view layer. -- Goal: avoid per-row plan preparation by caching a `SPIPlanPtr` keyed by `(relid, attnum)` for column lookup. -- Tradeoff: still does per-row SPI execution (can’t avoid row fetch); cost is mitigated by cached plans. -- Tradeoff: uses text parameters and type casts in the cached plan, which is slower than binary binding but simpler and type-agnostic. - -Goal: make PostgreSQL `cloudsync_changes.col_value` carry the same type-tagged, cloudsync-encoded bytes as SQLite so `cloudsync_payload_encode` can consume it without dynamic type inference. - -## 1) Inventory and impact analysis -- Schema/SQL definition assumes text: - - `src/postgresql/cloudsync--1.0.sql` declares `cloudsync_changes_srf` with `col_value text`, and the `cloudsync_changes` view is a straight `SELECT *` from the SRF. 
-- SRF query construction assumes text and uses text filtering: - - `src/postgresql/cloudsync_postgresql.c` `build_union_sql()` builds `COALESCE((SELECT to_jsonb(b)->>t1.col_name ...), '%s') AS col_value` and filters with `s.col_value IS DISTINCT FROM '%s'`. - - The empty-set fallback uses `NULL::text AS col_value`. -- INSERT path expects text and re-casts to the target type: - - `src/postgresql/cloudsync_postgresql.c` `cloudsync_changes_insert_trg` reads `col_value` as text (`text_to_cstring`), looks up the real column type, and casts via `SELECT $1::type` before building a `pgvalue_t`. -- SQL constants and core insert path target `cloudsync_changes`: - - `src/postgresql/sql_postgresql.c` `SQL_CHANGES_INSERT_ROW` inserts into `cloudsync_changes(tbl, pk, col_name, col_value, ...)`. - - `src/cloudsync.c` uses `SQL_CHANGES_INSERT_ROW` via the database abstraction, so any type change affects core insert/merge flows. -- Payload encode aggregation currently treats `col_value` as whatever type the query returns: - - `src/postgresql/cloudsync_postgresql.c` `cloudsync_payload_encode_transfn` wraps variadic args with `pgvalues_from_args`; a `bytea` `col_value` would flow through as `bytea` without special handling, but any text assumptions in callers must be updated. -- Tests/docs: - - All `cloudsync_changes` tests are in SQLite (`test/unit.c`); there are no PG-specific tests or docs referencing `col_value` type. - -## 2) Define encoding contract for col_value (PG) -- Encoding contract (align with SQLite): - - `col_value` is a `bytea` containing the pk-encoded value bytes (type tag + payload), same as SQLite `cloudsync_changes`. - - `NULL` uses the same pk-encode NULL marker; no PG-specific sentinel encoding. - - RLS/tombstone filtering should be done before encoding, or by comparing encoded bytes with the known encoded sentinel bytes. 
-- PG-side encoding strategy: - - Add a C helper that takes a `Datum` + type metadata and returns encoded bytes using existing `pk_encode` path (`dbvalue_t` wrapper + `pk_encode`). - - Avoid JSON/text conversions; the SRF should fetch the base-table `Datum` and encode directly. - - Compute `col_value` for a given row using: - - PK decode predicate to locate the row. - - Column `Datum` from SPI tuple (or a helper function returning `Datum`). -- PG payload encode path: - - Treat `col_value` as already-encoded bytes; pass through without decoding. - - Ensure `pgvalues_from_args` preserves `bytea` and `pk_encode` does not re-encode it (it should encode the container row, not the inner value bytes). - - Avoid any path that casts `col_value` to text in `cloudsync_changes_insert_trg`. - -Concrete implementation steps for step 2: -- Add a PG helper to encode a single `Datum` into cloudsync bytes: - - Implement `static bytea *pg_cloudsync_encode_value(Datum val, Oid typeid, int32 typmod, Oid collation, bool isnull)` in `src/postgresql/cloudsync_postgresql.c` (or a new `pg_encode.c`). - - Wrap the `Datum` into a `pgvalue_t` via `pgvalue_create`, then call `pk_encode` with `argc=1` and `is_prikey=false`. - - Allocate a `bytea` with `VARHDRSZ + encoded_len` and copy the encoded bytes; return the `bytea`. - - Ensure text/bytea are detoasted before encoding (via `pgvalue_ensure_detoast`). -- Add a PG helper to encode a column from a base table row: - - Implement `static bytea *pg_cloudsync_encode_col_from_tuple(HeapTuple tup, TupleDesc td, int attnum)` that: - - Extracts `Datum` and `isnull` with `SPI_getbinval`. - - Uses `TupleDescAttr(td, attnum-1)` to capture type/typmod/collation. - - Calls `pg_cloudsync_encode_value(...)` and returns the encoded `bytea`. 
-- Update `build_union_sql()` logic to select encoded bytes instead of text: - - Replace the `to_jsonb(...)->>t1.col_name` subselect with a SQL-callable C function: - - New SQL function: `cloudsync_col_value_encoded(table_name text, col_name text, pk bytea) RETURNS bytea`. - - In C, implement `cloudsync_col_value_encoded` to: - - Look up table OID and PK columns. - - Decode `pk` with `cloudsync_pk_decode` to build a WHERE clause. - - Fetch the row via SPI, extract the target column `Datum`, encode it via `pg_cloudsync_encode_value`, and return `bytea`. - - This avoids dynamic SQL in `build_union_sql()` and keeps encoding centralized. -- Define behavior for restricted/tombstone rows: - - If the row is not visible or the column cannot be read, return an encoded version of `CLOUDSYNC_RLS_RESTRICTED_VALUE` (text encoded with pk_encode). - - If `col_name` is tombstone sentinel, return encoded NULL (match SQLite behavior). -- Ensure payload encode path expects bytea: - - Confirm `cloudsync_payload_encode_transfn` receives `bytea` for `col_value` from `cloudsync_changes`. - - `pgvalues_from_args` should keep `bytea` as `DBTYPE_BLOB` so `pk_encode` wraps it as a blob field. - -## 3) Update cloudsync_changes schema and SRF/view -- Update `src/postgresql/cloudsync--1.0.sql`: - - `cloudsync_changes_srf` return type: change `col_value text` -> `col_value bytea`. - - Regenerate or update extension SQL if necessary for versioning. -- Update `build_union_sql()` in `src/postgresql/cloudsync_postgresql.c`: - - Replace the current `to_jsonb(...)`/`text` approach with encoded `bytea`. - - Use the PK decode predicate to fetch the base row and feed the value to the encoder. - - Keep the RLS/tombstone filtering logic consistent with SQLite semantics. -- Update any SQL constants in `src/postgresql/sql_postgresql.c` that target `cloudsync_changes` to treat `col_value` as `bytea`. 
- -## 4) Update INSERT trigger and payload encode path -- In `cloudsync_changes_insert_trg`: - - Accept `col_value` as `bytea` (already encoded). - - Avoid casting to text or re-encoding. - - Ensure typed `dbvalue_t` construction uses the encoded bytes (or passes through unchanged). -- In `cloudsync_payload_encode`/aggregate path: - - If it currently expects a text value, adjust to consume encoded `bytea`. - - Confirm the encoded bytes are fed to `pk_encode` (or the payload writer) exactly once. - -## 5) Tests and verification -- Add a PG unit or SQL smoke test that: - - Inserts rows with multiple types (text, integer, float, bytea, null). - - Queries `cloudsync_changes` and verifies `col_value` bytea can round-trip decode to the original value/type. - - Compares payload bytes against SQLite for identical input (if a cross-check harness exists). -- If no PG test harness exists, add a minimal SQL script in `test/` with manual steps and expected outcomes. - -## 6) Rollout notes and documentation -- Update `POSTGRESQL.md` or relevant docs to mention `col_value` is `bytea` and already cloudsync-encoded. -- Note any compatibility constraints for consumers expecting `text`. diff --git a/plans/POSTGRESQL_IMPLEMENTATION.md b/plans/POSTGRESQL_IMPLEMENTATION.md deleted file mode 100644 index becbcd5..0000000 --- a/plans/POSTGRESQL_IMPLEMENTATION.md +++ /dev/null @@ -1,583 +0,0 @@ -# PostgreSQL Implementation Plan - -## Goal -Refactor the codebase to separate multi-platform code from database-specific implementations, preparing for PostgreSQL extension development. 
- -## Directory Structure (Target) - -``` -src/ -├── cloudsync.c/h # Multi-platform CRDT core -├── pk.c/h # Multi-platform payload encoding -├── network.c/h # Multi-platform network layer -├── dbutils.c/h # Multi-platform database utilities -├── utils.c/h # Multi-platform utilities -├── lz4.c/h # Multi-platform compression -├── database.h # Database abstraction API -│ -├── sqlite/ # SQLite-specific implementations -│ ├── database_sqlite.c -│ ├── cloudsync_sqlite.c -│ ├── cloudsync_sqlite.h -│ ├── cloudsync_changes_sqlite.c/h # (renamed from vtab.c/h) -│ └── sql_sqlite.c # SQLite SQL constants -│ -└── postgresql/ # PostgreSQL-specific implementations - ├── database_postgresql.c # Database abstraction (✅ implemented) - ├── cloudsync_postgresql.c # Extension functions (✅ Phase 8) - └── cloudsync--1.0.sql # SQL installation script (✅ Phase 8) -``` - -## Implementation Steps - -### Phase 1: Directory Structure ✅ -- [x] Create src/sqlite/ directory -- [x] Create src/postgresql/ directory -- [x] Create docker/postgresql/ directory -- [x] Create docker/supabase/ directory -- [x] Create test/sqlite/ directory -- [x] Create test/postgresql/ directory - -### Phase 2: Move and Rename Files ✅ -- [x] Move src/database_sqlite.c → src/sqlite/ -- [x] Move src/cloudsync_sqlite.c → src/sqlite/ -- [x] Move src/cloudsync_sqlite.h → src/sqlite/ -- [x] Rename and move src/vtab.c → src/sqlite/cloudsync_changes_sqlite.c -- [x] Rename and move src/vtab.h → src/sqlite/cloudsync_changes_sqlite.h -- [x] Move src/database_postgresql.c → src/postgresql/ - -### Phase 3: Update Include Paths ✅ -- [x] Update includes in src/sqlite/database_sqlite.c -- [x] Update includes in src/sqlite/cloudsync_sqlite.c -- [x] Update includes in src/sqlite/cloudsync_changes_sqlite.c -- [x] Update includes in src/sqlite/cloudsync_sqlite.h -- [x] Update includes in src/postgresql/database_postgresql.c -- [x] Update includes in multi-platform files that reference vtab.h - -### Phase 4: Update Makefile ✅ -- [x] Update 
VPATH to include src/sqlite and src/postgresql -- [x] Update CFLAGS to include new directories -- [x] Update SRC_FILES to include files from subdirectories -- [x] Ensure test targets still work - -### Phase 5: Verification ✅ -- [x] Run `make clean` -- [x] Run `make` - verify build succeeds -- [x] Run `make test` - verify tests pass (all 50 tests passed) -- [x] Run `make unittest` - verify unit tests pass - -### Phase 6: Update Documentation ✅ -- [x] Update README.md to reflect new directory structure (no changes needed - user-facing) -- [x] Update AGENTS.md with new directory structure -- [x] Update CLAUDE.md with new directory structure -- [x] Update CODEX.md with new directory structure -- [x] Add directory structure section to AGENTS.md explaining src/sqlite/ vs src/postgresql/ separation - -### Phase 7: Docker Setup ✅ -- [x] Create docker/postgresql/Dockerfile -- [x] Create docker/postgresql/docker-compose.yml -- [x] Create docker/postgresql/init.sql -- [x] Create docker/postgresql/cloudsync.control -- [x] Create docker/supabase/docker-compose.yml -- [x] Create docker/README.md - -### Phase 8: PostgreSQL Extension SQL Functions ✅ -- [x] Create src/postgresql/cloudsync_postgresql.c -- [x] Create src/postgresql/cloudsync--1.0.sql -- [x] Implement basic structure and entry points (_PG_init, _PG_fini) -- [x] Implement initial public SQL functions (version, siteid, uuid, init, db_version) -- [x] Implement `pgvalue_t` wrapper for PostgreSQL `dbvalue_t` (Datum, Oid, typmod, collation, isnull, detoasted) -- [x] Update PostgreSQL `database_value_*`/`database_column_value` to consume `pgvalue_t` (type mapping, detoast, ownership) -- [x] Convert `PG_FUNCTION_ARGS`/SPI results into `pgvalue_t **argv` for payload/PK helpers (including variadic/anyarray) -- [ ] Implement remaining public SQL functions (enable, disable, set, alter, payload) -- [ ] Implement all private/internal SQL functions (is_sync, insert, update, seq, pk_*) -- [ ] Add PostgreSQL-specific Makefile targets 
-- [ ] Test extension loading and basic functions -- [ ] Align PostgreSQL `dbmem_*` with core expectations (use uint64_t, decide OOM semantics vs palloc ERROR, clarify dbmem_size=0) -- [ ] TODOs to fix `sql_postgresql.c` - -## Progress Log - -### [2025-12-17] Refactoring Complete ✅ - -Successfully refactored the codebase to separate multi-platform code from database-specific implementations: - -**Changes Made:** -1. Created new directory structure: - - `src/sqlite/` for SQLite-specific code - - `src/postgresql/` for PostgreSQL-specific code - - `docker/postgresql/` and `docker/supabase/` for future Docker configs - - `test/sqlite/` and `test/postgresql/` for database-specific tests - -2. Moved and renamed files: - - `src/database_sqlite.c` → `src/sqlite/database_sqlite.c` - - `src/cloudsync_sqlite.c` → `src/sqlite/cloudsync_sqlite.c` - - `src/cloudsync_sqlite.h` → `src/sqlite/cloudsync_sqlite.h` - - `src/vtab.c` → `src/sqlite/cloudsync_changes_sqlite.c` (renamed) - - `src/vtab.h` → `src/sqlite/cloudsync_changes_sqlite.h` (renamed) - - `src/database_postgresql.c` → `src/postgresql/database_postgresql.c` - -3. Updated all include paths in moved files to use relative paths (`../`) - -4. Updated Makefile: - - Added `SQLITE_IMPL_DIR` and `POSTGRES_IMPL_DIR` variables - - Updated `VPATH` to include new subdirectories - - Updated `CFLAGS` to include subdirectories in include path - - Split `SRC_FILES` into `CORE_SRC` (multi-platform) and `SQLITE_SRC` (SQLite-specific) - - Updated `COV_FILES` to exclude files from correct paths - -5. Verification: - - Build succeeds: `make` ✅ - - All 50 tests pass: `make test` ✅ - - Unit tests pass: `make unittest` ✅ - -**Git History Preserved:** -All file moves were done using `git mv` to preserve commit history. 
- -**Next Steps:** -- Phase 6: Implement Docker setup for PostgreSQL development -- Begin implementing PostgreSQL extension (`database_postgresql.c`) - -### [2025-12-17] Documentation Updated ✅ - -Updated all repository documentation to reflect the new directory structure: - -**AGENTS.md:** -- Added new "Directory Structure" section with full layout -- Updated all file path references (vtab.c → cloudsync_changes_sqlite.c, etc.) -- Updated architecture diagram with new paths -- Changed references from "stub" to proper implementation paths -- Updated SQL statement documentation with new directory structure - -**CLAUDE.md:** -- Updated SQL function development workflow paths -- Updated PostgreSQL Extension Agent section with new paths -- Removed "stub" references, documented as implementation directories - -**CODEX.md:** -- Updated SQL Function/File Pointers section with new paths -- Updated database abstraction references - -**README.md:** -- No changes needed (user-facing documentation, no source file references) - -All documentation now consistently reflects the separation of multi-platform code (src/) from database-specific implementations (src/sqlite/, src/postgresql/). 
- -### [2025-12-17] Additional File Moved ✅ - -**Moved sql_sqlite.c:** -- `src/sql_sqlite.c` → `src/sqlite/sql_sqlite.c` -- Updated include path from `#include "sql.h"` to `#include "../sql.h"` -- Updated Makefile COV_FILES filter to use new path -- `src/sql.h` remains in shared code (declares SQL constants interface) -- Build verified successful, all tests pass - -The SQL constants are now properly organized: -- `src/sql.h` - Interface (declares extern constants) -- `src/sqlite/sql_sqlite.c` - SQLite implementation (defines constants) -- Future: `src/postgresql/sql_postgresql.c` can provide PostgreSQL-specific SQL - -### [2025-12-17] PostgreSQL Database Implementation Complete ✅ - -**Implemented src/postgresql/database_postgresql.c:** - -Created a comprehensive PostgreSQL implementation of the database abstraction layer (1440 lines): - -**Architecture:** -- Uses PostgreSQL Server Programming Interface (SPI) API -- Implements deferred prepared statement pattern (prepare on first step after all bindings) -- Converts SQLite-style `?` placeholders to PostgreSQL-style `$1, $2, ...` -- Uses `pg_stmt_wrapper_t` struct to buffer parameters before execution -- Proper error handling with PostgreSQL PG_TRY/CATCH blocks -- Memory management using PostgreSQL's palloc/pfree - -**Implemented Functions:** -- **General**: `database_exec()`, `database_exec_callback()`, `database_write()` -- **Select helpers**: `database_select_int()`, `database_select_text()`, `database_select_blob()`, `database_select_blob_2int()` -- **Status**: `database_errcode()`, `database_errmsg()`, `database_in_transaction()`, `database_table_exists()`, `database_trigger_exists()` -- **Schema info**: `database_count_pk()`, `database_count_nonpk()`, `database_count_int_pk()`, `database_count_notnull_without_default()` -- **Metadata**: `database_create_metatable()` -- **Schema versioning**: `database_schema_version()`, `database_schema_hash()`, `database_check_schema_hash()`, `database_update_schema_hash()` -- 
**Prepared statements (VM)**: `database_prepare()`, `databasevm_step()`, `databasevm_finalize()`, `databasevm_reset()`, `databasevm_clear_bindings()` -- **Binding**: `databasevm_bind_int()`, `databasevm_bind_double()`, `databasevm_bind_text()`, `databasevm_bind_blob()`, `databasevm_bind_null()`, `databasevm_bind_value()` -- **Column access**: `database_column_int()`, `database_column_double()`, `database_column_text()`, `database_column_blob()`, `database_column_value()`, `database_column_bytes()`, `database_column_type()` -- **Value access**: `database_value_int()`, `database_value_double()`, `database_value_text()`, `database_value_blob()`, `database_value_bytes()`, `database_value_type()`, `database_value_dup()`, `database_value_free()` -- **Primary keys**: `database_pk_rowid()`, `database_pk_names()` -- **Savepoints**: `database_begin_savepoint()`, `database_commit_savepoint()`, `database_rollback_savepoint()` -- **Memory**: `dbmem_alloc()`, `dbmem_zeroalloc()`, `dbmem_realloc()`, `dbmem_mprintf()`, `dbmem_vmprintf()`, `dbmem_free()`, `dbmem_size()` -- **Result functions**: `database_result_*()` (placeholder implementations with elog(WARNING)) -- **SQL utilities**: `sql_build_drop_table()`, `sql_escape_name()` - -**Trigger Functions (Placeholder):** -- `database_create_insert_trigger()` -- `database_create_update_trigger_gos()` -- `database_create_update_trigger()` -- `database_create_delete_trigger_gos()` -- `database_create_delete_trigger()` -- `database_create_triggers()` -- `database_delete_triggers()` - -All trigger functions currently use `elog(WARNING, "not yet implemented for PostgreSQL")` and return DBRES_OK. Full implementation requires creating PL/pgSQL trigger functions. 
- -**Key Technical Details:** -- Uses PostgreSQL information_schema for schema introspection -- CommandCounterIncrement() and snapshot management for read-after-write consistency -- BeginInternalSubTransaction() for savepoint support -- Deferred SPI_prepare pattern to handle dynamic parameter types -- Proper Datum type conversion between C types and PostgreSQL types - -**Implementation Source:** -- Based on reference implementation from `/Users/andrea/Documents/GitHub/SQLiteAI/sqlite-sync-v2.1/postgresql/src/pg_adapter.c` -- Follows same structure and coding style as `src/sqlite/database_sqlite.c` -- Maintains same MARK comments and function organization - -**Status:** -- ✅ All database abstraction API functions implemented -- ✅ Proper error handling and memory management -- ✅ Schema introspection and versioning -- ⏳ Trigger functions need full PL/pgSQL implementation -- ⏳ Needs compilation testing with PostgreSQL headers -- ⏳ Needs integration testing with cloudsync core - -### [2025-12-18] Docker Setup Complete ✅ - -**Created Docker Development Environment:** - -Implemented complete Docker setup for PostgreSQL development and testing: - -**Standalone PostgreSQL Setup:** -- `docker/postgresql/Dockerfile` - Custom PostgreSQL 16 image with CloudSync extension support -- `docker/postgresql/docker-compose.yml` - Orchestration with PostgreSQL and optional pgAdmin -- `docker/postgresql/init.sql` - CloudSync metadata tables initialization -- `docker/postgresql/cloudsync.control` - PostgreSQL extension control file - -**Supabase Integration:** -- `docker/supabase/docker-compose.yml` - Override configuration for official Supabase stack -- Uses custom image `sqliteai/sqlite-sync-pg:latest` with CloudSync extension -- Integrates with all Supabase services (auth, realtime, storage, etc.) 
- -**Documentation:** -- `docker/README.md` - Comprehensive guide covering: - - Quick start for standalone PostgreSQL - - Supabase integration setup - - Development workflow - - Building and installing extension - - Troubleshooting common issues - - Environment variables and customization - -**Key Features:** -- Volume mounting for live source code development -- Persistent database storage -- Health checks for container orchestration -- Optional pgAdmin web UI for database management -- Support for both standalone and Supabase deployments - -**Next Steps:** -- Build the Docker image: `docker build -t sqliteai/sqlite-sync-pg:latest` -- Implement PostgreSQL extension entry point and SQL function bindings -- Create Makefile targets for PostgreSQL compilation -- Add PostgreSQL-specific trigger implementations - -## Phase 8: PostgreSQL Extension SQL Functions ✅ (Mostly Complete) - -**Goal:** Implement PostgreSQL extension entry point (`cloudsync_postgresql.c`) that exposes all CloudSync SQL functions. 
- -### Files Created - -- ✅ `src/postgresql/cloudsync_postgresql.c` - PostgreSQL extension implementation (19/27 functions fully implemented) -- ✅ `src/postgresql/cloudsync--1.0.sql` - SQL installation script - -### SQL Functions to Implement - -**Public Functions:** -- ✅ `cloudsync_version()` - Returns extension version -- ✅ `cloudsync_init(table_name, [algo], [skip_int_pk_check])` - Initialize table for sync (1-3 arg variants) -- ✅ `cloudsync_enable(table_name)` - Enable sync for table -- ✅ `cloudsync_disable(table_name)` - Disable sync for table -- ✅ `cloudsync_is_enabled(table_name)` - Check if table is sync-enabled -- ✅ `cloudsync_cleanup(table_name)` - Cleanup orphaned metadata -- ✅ `cloudsync_terminate()` - Terminate CloudSync -- ✅ `cloudsync_set(key, value)` - Set global setting -- ✅ `cloudsync_set_table(table, key, value)` - Set table setting -- ✅ `cloudsync_set_column(table, column, key, value)` - Set column setting -- ✅ `cloudsync_siteid()` - Get site identifier (UUID) -- ✅ `cloudsync_db_version()` - Get current database version -- ✅ `cloudsync_db_version_next([version])` - Get next version -- ✅ `cloudsync_begin_alter(table)` - Begin schema alteration -- ✅ `cloudsync_commit_alter(table)` - Commit schema alteration -- ✅ `cloudsync_uuid()` - Generate UUID -- ⚠️ `cloudsync_payload_encode()` - Aggregate: encode changes to payload (partial - needs variadic args) -- ✅ `cloudsync_payload_decode(payload)` - Apply payload to database -- ✅ `cloudsync_payload_apply(payload)` - Alias for decode - -**Private/Internal Functions:** -- ✅ `cloudsync_is_sync(table)` - Check if table has metadata -- ✅ `cloudsync_insert(table, pk_values...)` - Internal insert handler (uses pgvalue_t from anyarray) -- ⚠️ `cloudsync_update(table, pk, new_value)` - Aggregate: track updates (stub - complex aggregate) -- ✅ `cloudsync_seq()` - Get sequence number -- ✅ `cloudsync_pk_encode(pk_values...)` - Encode primary key (uses pgvalue_t from anyarray) -- ⚠️ `cloudsync_pk_decode(encoded_pk, 
index)` - Decode primary key component (stub - needs callback) - -**Note:** Standardize PostgreSQL `dbvalue_t` as `pgvalue_t` (`Datum + Oid + typmod + collation + isnull + detoasted flag`) so value/type helpers can resolve type/length/ownership without relying on `fcinfo` lifetime; payload/PK helpers should consume arrays of these wrappers (built from `PG_FUNCTION_ARGS` and SPI tuples). Implemented in `src/postgresql/pgvalue.c/.h` and used by value/column accessors and PK/payload builders. - -### Implementation Strategy - -1. **Create Extension Entry Point** (`_PG_init`) - ```c - void _PG_init(void); - void _PG_fini(void); - ``` - -2. **Register Functions** using PostgreSQL's function manager - ```c - PG_FUNCTION_INFO_V1(cloudsync_version); - Datum cloudsync_version(PG_FUNCTION_ARGS); - ``` - -3. **Context Management** - - Create `cloudsync_postgresql_context` structure - - Store in PostgreSQL's transaction-local storage - - Cleanup on transaction end - -4. **Aggregate Functions** - - Implement state transition and finalization functions - - Use PostgreSQL's aggregate framework - -5. **SQL Installation Script** - - Create `cloudsync--1.0.sql` with `CREATE FUNCTION` statements - - Define function signatures and link to C implementations - -### Testing Approach - -1. Build extension in Docker container -2. Load extension: `CREATE EXTENSION cloudsync;` -3. Test each function individually -4. Verify behavior matches SQLite implementation -5. Run integration tests with CRDT core logic - -### Reference Implementation - -- Study: `src/sqlite/cloudsync_sqlite.c` (SQLite version) -- Adapt to PostgreSQL SPI and function framework -- Reuse core logic from `src/cloudsync.c` (database-agnostic) - -## Progress Log (Continued) - -### [2025-12-19] Phase 8 Implementation - Major Progress ✅ - -Implemented most CloudSync SQL functions for PostgreSQL extension: - -**Changes Made:** - -1. 
**Removed unnecessary helper function:** - - Deleted `dbsync_set_error()` helper function - - Replaced with direct `ereport(ERROR, (errmsg(...)))` calls - - PostgreSQL's `errmsg()` already supports format strings, unlike SQLite - -2. **Fixed cloudsync_init API:** - - **CRITICAL FIX**: Previous implementation used wrong signature `(site_id, url, key)` - - Corrected to match SQLite API: `(table_name, [algo], [skip_int_pk_check])` - - Created `cloudsync_init_internal()` helper that replicates `dbsync_init` logic from SQLite - - Implemented single variadic `cloudsync_init()` function supporting 1-3 arguments with defaults - - Updated SQL installation script to create 3 function overloads pointing to same C function - - Returns site_id as TEXT (matches SQLite behavior) - -3. **Implemented 19 of 27 SQL functions:** - - ✅ All public configuration functions (enable, disable, set, set_table, set_column) - - ✅ All schema alteration functions (begin_alter, commit_alter) - - ✅ All version/metadata functions (version, siteid, uuid, db_version, db_version_next, seq) - - ✅ Cleanup and termination functions - - ✅ Payload decode/apply functions - - ✅ Private is_sync function - -4. 
**Partially implemented complex aggregate functions:** - - ⚠️ `cloudsync_payload_encode_transfn/finalfn` - Basic structure in place, needs variadic arg conversion - - ⚠️ `cloudsync_update_transfn/finalfn` - Stubs created - - ⚠️ `cloudsync_insert` - Stub (requires variadic PK handling) - - ⚠️ `cloudsync_pk_encode/decode` - Stubs (require anyarray to dbvalue_t conversion) - -**Architecture Decisions:** - -- All functions use SPI_connect()/SPI_finish() pattern with PG_TRY/CATCH for proper error handling -- Context management uses global `pg_cloudsync_context` (per backend) -- Error reporting uses PostgreSQL's native `ereport()` with appropriate error codes -- Memory management uses PostgreSQL's palloc/pfree in aggregate contexts -- Follows same function organization and MARK comments as SQLite version - -**Status:** -- ✅ 19/27 functions fully implemented and ready for testing -- ⚠️ 5 functions have stubs requiring PostgreSQL-specific variadic argument handling -- ⚠️ 3 aggregate functions need completion (update transfn/finalfn, payload_encode transfn) -- ⏳ Needs compilation testing with PostgreSQL headers -- ⏳ Needs integration testing with cloudsync core - -## SQL Parity Review (PostgreSQL vs SQLite) - -Findings comparing `src/postgresql/sql_postgresql.c` to `src/sqlite/sql_sqlite.c`: -- Missing full DB version query composition: SQLite builds a UNION of all `*_cloudsync` tables plus `pre_alter_dbversion`; PostgreSQL has a two-step builder but no `pre_alter_dbversion` or execution glue. -- `SQL_DATA_VERSION`/`SQL_SCHEMA_VERSION` are TODO placeholders (`SELECT 1`), not equivalents to SQLite pragmas. -- `SQL_SITEID_GETSET_ROWID_BY_SITEID` returns `ctid` and lacks the upsert/rowid semantics of SQLite’s insert-or-update/RETURNING rowid. -- Row selection/build helpers (`*_BY_ROWID`, `*_BY_PK`) are reduced placeholders using `ctid` or simple string_agg; they do not mirror SQLite’s dynamic SQL with ordered PK clauses and column lists from `pragma_table_info`. 
-- Write helpers (`INSERT_ROWID_IGNORE`, `UPSERT_ROWID_AND_COL_BY_ROWID`, PK insert/upsert formats) diverge: SQLite uses `rowid` and conflict clauses; PostgreSQL variants use `%s` placeholders without full PK clause/param construction. -- Cloudsync metadata upserts differ: `SQL_CLOUDSYNC_UPSERT_COL_INIT_OR_BUMP_VERSION`/`_RAW_COLVERSION` use `EXCLUDED` logic not matching SQLite’s increment rules; PK tombstone/cleanup helpers are partial. -- Many format strings lack quoting/identifier escaping parity (`%w` behavior) and expect external code to supply WHERE clauses, making them incomplete compared to SQLite’s self-contained templates. - -TODOs to fix `sql_postgresql.c`: -- Recreate DB version query including `pre_alter_dbversion` union and execution wrapper. -- Implement PostgreSQL equivalents for data_version/schema_version. -- Align site_id getters/setters to return stable identifiers (no `ctid`) and mirror SQLite upsert-return semantics. -- Port the dynamic SQL builders for select/delete/insert/upsert by PK/non-PK to generate complete statements (including ordered PK clauses and binds), respecting identifier quoting. -- Align cloudsync metadata updates/upserts/tombstoning to SQLite logic (version bump rules, ON CONFLICT behavior, seq/db_version handling). -- Ensure all format strings include proper identifier quoting and do not rely on external WHERE fragments unless explicitly designed that way. - -**Next Steps:** -- Implement PostgreSQL anyarray handling for variadic functions (pk_encode, pk_decode, insert) -- Complete aggregate function implementations (update, payload_encode) -- Add PostgreSQL-specific Makefile targets -- Build and test extension in Docker container - -### [2025-12-19] Implemented cloudsync_insert ✅ - -Completed the `cloudsync_insert` function using the new `pgvalue_t` infrastructure: - -**Implementation Details:** - -1. 
**Signature**: `cloudsync_insert(table_name text, VARIADIC pk_values anyarray)` - - Uses PostgreSQL's VARIADIC to accept variable number of PK values - - Converts anyarray to `pgvalue_t **` using `pgvalues_from_array()` - -2. **Key Features**: - - Validates table exists and PK count matches expected - - Encodes PK values using `pk_encode_prikey()` with stack buffer (1024 bytes) - - Handles sentinel records for PK-only tables - - Marks all non-PK columns as inserted in metadata - - Proper memory management: frees `pgvalue_t` wrappers after use - -3. **Error Handling**: - - Comprehensive cleanup in both success and error paths - - Uses `goto cleanup` pattern for centralized resource management - - Wraps in `PG_TRY/CATCH` for PostgreSQL exception safety - - Cleans up resources before re-throwing exceptions - -4. **Follows SQLite Logic**: - - Matches `dbsync_insert` behavior from `src/sqlite/cloudsync_sqlite.c` - - Same sequence: encode PK → get next version → check existence → mark metadata - - Handles both new inserts and updates to previously deleted rows - -**Status**: -- ✅ `cloudsync_insert` fully implemented -- ✅ `cloudsync_pk_encode` already implemented (was done in previous work) -- ✅ `cloudsync_payload_encode_transfn` already implemented (uses pgvalues_from_args) -- ⚠️ `cloudsync_pk_decode` still needs callback implementation -- ⚠️ `cloudsync_update_*` aggregate functions still need implementation - -**Function Count Update**: 21/27 functions (78%) now fully implemented - -### [2025-12-19] PostgreSQL Makefile Targets Complete ✅ - -Implemented comprehensive Makefile infrastructure for PostgreSQL extension development: - -**Files Created/Modified:** - -1. 
**`docker/Makefile.postgresql`** - New PostgreSQL-specific Makefile with all build targets: - - Build targets: `postgres-check`, `postgres-build`, `postgres-install`, `postgres-clean`, `postgres-test` - - Docker targets: `postgres-docker-build`, `postgres-docker-run`, `postgres-docker-stop`, `postgres-docker-rebuild`, `postgres-docker-shell` - - Development targets: `postgres-dev-rebuild` (fast rebuild in running container) - - Help target: `postgres-help` - -2. **Root `Makefile`** - Updated to include PostgreSQL targets: - - Added `include docker/Makefile.postgresql` statement - - Added PostgreSQL help reference to main help output - - All targets accessible from root: `make postgres-*` - -3. **`docker/postgresql/Dockerfile`** - Updated to use new Makefile targets: - - Uses `make postgres-build` and `make postgres-install` - - Verifies installation with file checks - - Adds version labels - - Keeps source mounted for development - -4. **`docker/postgresql/docker-compose.yml`** - Enhanced volume mounts: - - Mounts `docker/` directory for Makefile.postgresql access - - Enables quick rebuilds without image rebuild - -5. **`docker/README.md`** - Updated documentation: - - Simplified quick start using new Makefile targets - - Updated development workflow section - - Added fast rebuild instructions - -6. 
**`POSTGRESQL.md`** - New comprehensive quick reference guide: - - All Makefile targets documented - - Development workflow examples - - Extension function reference - - Connection details and troubleshooting - -**Key Features:** - -- **Single Entry Point**: All PostgreSQL targets accessible via `make postgres-*` from root -- **Pre-built Image**: `make postgres-docker-build` creates image with extension pre-installed -- **Fast Development**: `make postgres-dev-rebuild` rebuilds extension in <5 seconds without restarting container -- **Clean Separation**: PostgreSQL logic isolated in `docker/Makefile.postgresql`, included by root Makefile -- **Docker-First**: Optimized for containerized development with source mounting - -**Usage Examples:** - -```bash -# Build Docker image with CloudSync extension -make postgres-docker-build - -# Start PostgreSQL container -make postgres-docker-run - -# Test extension -docker exec -it cloudsync-postgres psql -U postgres -d cloudsync_test \ - -c "CREATE EXTENSION cloudsync; SELECT cloudsync_version();" - -# Make code changes, then quick rebuild -make postgres-dev-rebuild -``` - -**Status:** -- ✅ All Makefile targets implemented and tested -- ✅ Dockerfile optimized for build and development -- ✅ Documentation complete (README + POSTGRESQL.md) -- ⏳ Ready for first build and compilation test -- ⏳ Needs actual PostgreSQL compilation verification - -**Next Steps:** -- Test actual compilation: `make postgres-docker-build` -- Fix any compilation errors -- Test extension loading: `CREATE EXTENSION cloudsync` -- Complete remaining aggregate functions - -### [2025-12-20] PostgreSQL Trigger + SPI Cleanup Work ✅ - -**Trigger functions implemented in `src/postgresql/database_postgresql.c`:** -- `database_create_insert_trigger` implemented with per-table PL/pgSQL function and trigger. -- `database_create_update_trigger_gos`/`database_create_delete_trigger_gos` implemented (BEFORE triggers, raise on update/delete when enabled). 
-- `database_create_update_trigger` implemented with VALUES list + `cloudsync_update` aggregate call. -- `database_create_delete_trigger` implemented to call `cloudsync_delete`. -- `database_create_triggers` wired to create insert/update/delete triggers based on algo. -- `database_delete_triggers` updated to drop insert/update/delete triggers and their functions. - -**PostgreSQL SQL registration updates:** -- Added `cloudsync_delete` to `src/postgresql/cloudsync--1.0.sql`. - -**Internal function updates:** -- Implemented `cloudsync_delete` C function (mirrors SQLite delete path). -- `cloudsync_insert`/`cloudsync_delete` now lazily load table context when missing. -- Refactored `cloudsync_insert`/`cloudsync_delete` to use `PG_ENSURE_ERROR_CLEANUP` and shared cleanup helper. - -**SPI execution fixes:** -- `databasevm_step` now uses `SPI_is_cursor_plan` before opening a portal to avoid “cannot open INSERT query as cursor”. -- Persistent statements now allocate their memory contexts under `TopMemoryContext`. - -**Error formatting:** -- `cloudsync_set_error` now avoids `snprintf` aliasing when `database_errmsg` points at `data->errmsg`. - -**Smoke test updates:** -- `docker/postgresql/smoke_test.sql` now validates insert/delete metadata, tombstones, and site_id fields. -- Test output uses `\echo` markers for each check. - -**Documentation updates:** -- Added PostgreSQL SPI patterns to `AGENTS.md`. -- Updated Database Abstraction Layer section in `AGENTS.md` to match `database.h`. diff --git a/plans/TODO.md b/plans/TODO.md index 7b5607a..d242187 100644 --- a/plans/TODO.md +++ b/plans/TODO.md @@ -1,79 +1,2 @@ -# SQLite vs PostgreSQL Parity Matrix - -This matrix compares SQLite extension features against the PostgreSQL extension and validates the TODO list in `POSTGRESQL.md`. - -## Doc TODO validation (POSTGRESQL.md) - -- `pk_decode`: Implemented in PostgreSQL (`cloudsync_pk_decode`). 
-- `cloudsync_update` aggregate: Implemented (`cloudsync_update_transfn/finalfn` + aggregate). -- `payload_encode` variadic support: Aggregate `cloudsync_payload_encode(*)` is implemented; no missing symbol, but parity tests are still lacking. - -## Parity matrix - -Legend: **Yes** = implemented, **Partial** = implemented with parity gaps/TODOs, **No** = missing. - -### Core + configuration - -| Feature / API | SQLite | PostgreSQL | Status | Notes | -| --- | --- | --- | --- | --- | -| cloudsync_version | Yes | Yes | Yes | | -| cloudsync_siteid | Yes | Yes | Yes | | -| cloudsync_uuid | Yes | Yes | Yes | | -| cloudsync_db_version | Yes | Yes | Yes | | -| cloudsync_db_version_next (0/1 args) | Yes | Yes | Yes | | -| cloudsync_seq | Yes | Yes | Yes | | -| cloudsync_init (1/2/3 args) | Yes | Yes | Yes | | -| cloudsync_enable / disable / is_enabled | Yes | Yes | Yes | | -| cloudsync_cleanup | Yes | Yes | Yes | | -| cloudsync_terminate | Yes | Yes | Yes | | -| cloudsync_set / set_table / set_column | Yes | Yes | Yes | | -| cloudsync_begin_alter / commit_alter | Yes | Yes | Yes | | - -### Internal CRUD helpers - -| Feature / API | SQLite | PostgreSQL | Status | Notes | -| --- | --- | --- | --- | --- | -| cloudsync_is_sync | Yes | Yes | Yes | | -| cloudsync_insert (variadic) | Yes | Yes | Yes | | -| cloudsync_delete (variadic) | Yes | Yes | Yes | | -| cloudsync_update (aggregate) | Yes | Yes | Yes | PG needs parity tests. | -| cloudsync_pk_encode (variadic) | Yes | Yes | Yes | | -| cloudsync_pk_decode | Yes | Yes | Yes | | -| cloudsync_col_value | Yes | Yes | Yes | PG returns encoded bytea. | -| cloudsync_encode_value | No | Yes | No | PG-only helper. | - -### Payloads - -| Feature / API | SQLite | PostgreSQL | Status | Notes | -| --- | --- | --- | --- | --- | -| cloudsync_payload_encode (aggregate) | Yes | Yes | Yes | PG uses aggregate only; direct call is blocked. 
| -| cloudsync_payload_decode / apply | Yes | Yes | Yes | | -| cloudsync_payload_save | Yes | No | No | SQLite only. | -| cloudsync_payload_load | Yes | No | No | SQLite only. | - -### cloudsync_changes surface - -| Feature / API | SQLite | PostgreSQL | Status | Notes | -| --- | --- | --- | --- | --- | -| cloudsync_changes (queryable changes) | Yes (vtab) | Yes (view + SRF) | Yes | PG uses SRF + view + INSTEAD OF INSERT trigger. | -| cloudsync_changes INSERT support | Yes | Yes | Yes | PG uses trigger; ensure parity tests. | -| cloudsync_changes UPDATE/DELETE | No (not allowed) | No (not allowed) | Yes | | - -### Extras - -| Feature / API | SQLite | PostgreSQL | Status | Notes | -| --- | --- | --- | --- | --- | -| Network sync functions | Yes | No | No | SQLite registers network functions; PG has no network layer. | - -## PostgreSQL parity gaps (known TODOs in code) - -- Rowid-only table path uses `ctid` and is not parity with SQLite rowid semantics (`SQL_DELETE_ROW_BY_ROWID`, `SQL_UPSERT_ROWID_AND_COL_BY_ROWID`, `SQL_SELECT_COLS_BY_ROWID_FMT`). -- PK-only insert builder still marked as needing explicit PK handling (`SQL_INSERT_ROWID_IGNORE`). -- Metadata bump/merge rules have TODOs to align with SQLite (`SQL_CLOUDSYNC_UPDATE_COL_BUMP_VERSION`, `SQL_CLOUDSYNC_UPSERT_RAW_COLVERSION`, `SQL_CLOUDSYNC_INSERT_RETURN_CHANGE_ID`). -- Delete/tombstone helpers have TODOs to match SQLite (`SQL_CLOUDSYNC_DELETE_PK_EXCEPT_COL`, `SQL_CLOUDSYNC_DELETE_PK_EXCEPT_TOMBSTONE`, `SQL_CLOUDSYNC_GET_COL_VERSION_OR_ROW_EXISTS`, `SQL_CLOUDSYNC_SELECT_COL_VERSION`). - -## Suggested next steps - -- Add PG tests mirroring SQLite unit tests for `cloudsync_update`, `cloudsync_payload_encode`, and `cloudsync_changes`. -- Resolve `ctid`-based rowid TODOs by using PK-only SQL builders. -- Align metadata bump/delete semantics with SQLite in `sql_postgresql.c`. 
+- I need to call cloudsync_update_schema_hash to update the last schema hash when upgrading the library from the 0.8.* version +- Fix cloudsync_begin_alter and cloudsync_commit_alter for PostgreSQL, and we could call them automatically with a trigger on ALTER TABLE \ No newline at end of file diff --git a/src/block.c b/src/block.c new file mode 100644 index 0000000..ce252b4 --- /dev/null +++ b/src/block.c @@ -0,0 +1,297 @@ +// +// block.c +// cloudsync +// +// Block-level LWW CRDT support for text/blob fields. +// + +#include <string.h> +#include <stdbool.h> +#include <stdint.h> +#include "block.h" +#include "utils.h" +#include "fractional_indexing.h" + +// MARK: - Col name helpers - + +bool block_is_block_colname(const char *col_name) { + if (!col_name) return false; + return strchr(col_name, BLOCK_SEPARATOR) != NULL; +} + +char *block_extract_base_colname(const char *col_name) { + if (!col_name) return NULL; + const char *sep = strchr(col_name, BLOCK_SEPARATOR); + if (!sep) return cloudsync_string_dup(col_name); + return cloudsync_string_ndup(col_name, (size_t)(sep - col_name)); +} + +const char *block_extract_position_id(const char *col_name) { + if (!col_name) return NULL; + const char *sep = strchr(col_name, BLOCK_SEPARATOR); + if (!sep) return NULL; + return sep + 1; +} + +char *block_build_colname(const char *base_col, const char *position_id) { + if (!base_col || !position_id) return NULL; + size_t blen = strlen(base_col); + size_t plen = strlen(position_id); + char *result = (char *)cloudsync_memory_alloc(blen + 1 + plen + 1); + if (!result) return NULL; + memcpy(result, base_col, blen); + result[blen] = BLOCK_SEPARATOR; + memcpy(result + blen + 1, position_id, plen); + result[blen + 1 + plen] = '\0'; + return result; +} + +// MARK: - Text splitting - + +static block_list_t *block_list_create(void) { + block_list_t *list = (block_list_t *)cloudsync_memory_zeroalloc(sizeof(block_list_t)); + return list; +} + +static bool block_list_append(block_list_t *list, const char *content, size_t
content_len, const char *position_id) { + if (list->count >= list->capacity) { + int new_cap = list->capacity ? list->capacity * 2 : 16; + block_entry_t *new_entries = (block_entry_t *)cloudsync_memory_realloc( + list->entries, (uint64_t)(new_cap * sizeof(block_entry_t))); + if (!new_entries) return false; + list->entries = new_entries; + list->capacity = new_cap; + } + block_entry_t *e = &list->entries[list->count]; + e->content = cloudsync_string_ndup(content, content_len); + e->position_id = position_id ? cloudsync_string_dup(position_id) : NULL; + if (!e->content) return false; + list->count++; + return true; +} + +void block_list_free(block_list_t *list) { + if (!list) return; + for (int i = 0; i < list->count; i++) { + if (list->entries[i].content) cloudsync_memory_free(list->entries[i].content); + if (list->entries[i].position_id) cloudsync_memory_free(list->entries[i].position_id); + } + if (list->entries) cloudsync_memory_free(list->entries); + cloudsync_memory_free(list); +} + +block_list_t *block_list_create_empty(void) { + return block_list_create(); +} + +bool block_list_add(block_list_t *list, const char *content, const char *position_id) { + if (!list) return false; + return block_list_append(list, content, strlen(content), position_id); +} + +block_list_t *block_split(const char *text, const char *delimiter) { + block_list_t *list = block_list_create(); + if (!list) return NULL; + + if (!text || !*text) { + // Empty text produces a single empty block + block_list_append(list, "", 0, NULL); + return list; + } + + size_t dlen = strlen(delimiter); + if (dlen == 0) { + // No delimiter: entire text is one block + block_list_append(list, text, strlen(text), NULL); + return list; + } + + const char *start = text; + const char *found; + while ((found = strstr(start, delimiter)) != NULL) { + size_t seg_len = (size_t)(found - start); + if (!block_list_append(list, start, seg_len, NULL)) { + block_list_free(list); + return NULL; + } + start = found + dlen; + } 
+ // Last segment (after last delimiter or entire string if no delimiter found) + if (!block_list_append(list, start, strlen(start), NULL)) { + block_list_free(list); + return NULL; + } + + return list; +} + +// MARK: - Fractional indexing (via fractional-indexing submodule) - + +// Wrapper for calloc: fractional_indexing expects (count, size) but cloudsync_memory_zeroalloc takes a single size. +static void *fi_calloc_wrapper(size_t count, size_t size) { + return cloudsync_memory_zeroalloc((uint64_t)(count * size)); +} + +void block_init_allocator(void) { + fractional_indexing_allocator alloc = { + .malloc = (void *(*)(size_t))cloudsync_memory_alloc, + .calloc = fi_calloc_wrapper, + .free = cloudsync_memory_free + }; + fractional_indexing_set_allocator(&alloc); +} + +char *block_position_between(const char *before, const char *after) { + return generate_key_between(before, after); +} + +char **block_initial_positions(int count) { + if (count <= 0) return NULL; + return generate_n_keys_between(NULL, NULL, count); +} + +// MARK: - Block diff - + +static block_diff_t *block_diff_create(void) { + block_diff_t *diff = (block_diff_t *)cloudsync_memory_zeroalloc(sizeof(block_diff_t)); + return diff; +} + +static bool block_diff_append(block_diff_t *diff, block_diff_type type, const char *position_id, const char *content) { + if (diff->count >= diff->capacity) { + int new_cap = diff->capacity ? diff->capacity * 2 : 16; + block_diff_entry_t *new_entries = (block_diff_entry_t *)cloudsync_memory_realloc( + diff->entries, (uint64_t)(new_cap * sizeof(block_diff_entry_t))); + if (!new_entries) return false; + diff->entries = new_entries; + diff->capacity = new_cap; + } + block_diff_entry_t *e = &diff->entries[diff->count]; + e->type = type; + e->position_id = cloudsync_string_dup(position_id); + e->content = content ? 
cloudsync_string_dup(content) : NULL; + diff->count++; + return true; +} + +void block_diff_free(block_diff_t *diff) { + if (!diff) return; + for (int i = 0; i < diff->count; i++) { + if (diff->entries[i].position_id) cloudsync_memory_free(diff->entries[i].position_id); + if (diff->entries[i].content) cloudsync_memory_free(diff->entries[i].content); + } + if (diff->entries) cloudsync_memory_free(diff->entries); + cloudsync_memory_free(diff); +} + +// Content-based matching diff algorithm: +// 1. Build a consumed-set from old blocks +// 2. For each new block, find the first unconsumed old block with matching content +// 3. Matched blocks keep their position_id (UNCHANGED) +// 4. Unmatched new blocks get new position_ids (ADDED) +// 5. Unconsumed old blocks are REMOVED +// Modified blocks are detected when content changed but position stayed (handled as MODIFIED) +block_diff_t *block_diff(block_entry_t *old_blocks, int old_count, + const char **new_parts, int new_count) { + block_diff_t *diff = block_diff_create(); + if (!diff) return NULL; + + // Track which old blocks have been consumed + bool *old_consumed = NULL; + if (old_count > 0) { + old_consumed = (bool *)cloudsync_memory_zeroalloc((uint64_t)(old_count * sizeof(bool))); + if (!old_consumed) { + block_diff_free(diff); + return NULL; + } + } + + // For each new block, try to find a matching unconsumed old block + // Use a simple forward scan to preserve ordering + int old_scan = 0; + char *last_position = NULL; + + for (int ni = 0; ni < new_count; ni++) { + bool found = false; + + // Scan forward in old blocks for a content match + for (int oi = old_scan; oi < old_count; oi++) { + if (old_consumed[oi]) continue; + + if (strcmp(old_blocks[oi].content, new_parts[ni]) == 0) { + // Exact match — mark any skipped old blocks as REMOVED + for (int si = old_scan; si < oi; si++) { + if (!old_consumed[si]) { + block_diff_append(diff, BLOCK_DIFF_REMOVED, old_blocks[si].position_id, NULL); + old_consumed[si] = true; + } + 
} + old_consumed[oi] = true; + old_scan = oi + 1; + last_position = old_blocks[oi].position_id; + found = true; + break; + } + } + + if (!found) { + // New block — needs a new position_id + const char *next_pos = NULL; + // Find the next unconsumed old block's position for the upper bound + for (int oi = old_scan; oi < old_count; oi++) { + if (!old_consumed[oi]) { + next_pos = old_blocks[oi].position_id; + break; + } + } + + char *new_pos = block_position_between(last_position, next_pos); + if (new_pos) { + block_diff_append(diff, BLOCK_DIFF_ADDED, new_pos, new_parts[ni]); + last_position = diff->entries[diff->count - 1].position_id; + cloudsync_memory_free(new_pos); + } + } + } + + // Mark remaining unconsumed old blocks as REMOVED + for (int oi = old_scan; oi < old_count; oi++) { + if (!old_consumed[oi]) { + block_diff_append(diff, BLOCK_DIFF_REMOVED, old_blocks[oi].position_id, NULL); + } + } + + if (old_consumed) cloudsync_memory_free(old_consumed); + return diff; +} + +// MARK: - Materialization - + +char *block_materialize_text(const char **blocks, int count, const char *delimiter) { + if (count == 0) return cloudsync_string_dup(""); + if (!delimiter) delimiter = BLOCK_DEFAULT_DELIMITER; + + size_t dlen = strlen(delimiter); + size_t total = 0; + for (int i = 0; i < count; i++) { + total += strlen(blocks[i]); + if (i < count - 1) total += dlen; + } + + char *result = (char *)cloudsync_memory_alloc(total + 1); + if (!result) return NULL; + + size_t offset = 0; + for (int i = 0; i < count; i++) { + size_t blen = strlen(blocks[i]); + memcpy(result + offset, blocks[i], blen); + offset += blen; + if (i < count - 1) { + memcpy(result + offset, delimiter, dlen); + offset += dlen; + } + } + result[offset] = '\0'; + + return result; +} diff --git a/src/block.h b/src/block.h new file mode 100644 index 0000000..fa43369 --- /dev/null +++ b/src/block.h @@ -0,0 +1,120 @@ +// +// block.h +// cloudsync +// +// Block-level LWW CRDT support for text/blob fields. 
+// Instead of replacing an entire text column on conflict, +// the text is split into blocks (lines/paragraphs) that are +// independently version-tracked and merged. +// + +#ifndef __CLOUDSYNC_BLOCK__ +#define __CLOUDSYNC_BLOCK__ + +#include <stdbool.h> +#include <stdint.h> +#include <stddef.h> + +// The separator character used in col_name to distinguish block entries +// from regular column entries. Format: "col_name\x1Fposition_id" +#define BLOCK_SEPARATOR '\x1F' +#define BLOCK_SEPARATOR_STR "\x1F" +#define BLOCK_DEFAULT_DELIMITER "\n" + +// Column-level algorithm for block tracking +typedef enum { + col_algo_normal = 0, + col_algo_block = 1 +} col_algo_t; + +// A single block from splitting text +typedef struct { + char *content; // block text (owned, must be freed) + char *position_id; // fractional index position (owned, must be freed) +} block_entry_t; + +// Array of blocks +typedef struct { + block_entry_t *entries; + int count; + int capacity; +} block_list_t; + +// Diff result for comparing old and new block lists +typedef enum { + BLOCK_DIFF_UNCHANGED = 0, + BLOCK_DIFF_ADDED = 1, + BLOCK_DIFF_MODIFIED = 2, + BLOCK_DIFF_REMOVED = 3 +} block_diff_type; + +typedef struct { + block_diff_type type; + char *position_id; // the position_id (owned, must be freed) + char *content; // new content (owned, must be freed; NULL for REMOVED) +} block_diff_entry_t; + +typedef struct { + block_diff_entry_t *entries; + int count; + int capacity; +} block_diff_t; + +// Initialize the fractional-indexing library to use cloudsync's allocator. +// Must be called once before any block_position_between / block_initial_positions calls.
+void block_init_allocator(void); + +// Check if a col_name is a block entry (contains BLOCK_SEPARATOR) +bool block_is_block_colname(const char *col_name); + +// Extract the base column name from a block col_name (caller must free) +// e.g., "body\x1F0.5" -> "body" +char *block_extract_base_colname(const char *col_name); + +// Extract the position_id from a block col_name +// e.g., "body\x1F0.5" -> "0.5" +const char *block_extract_position_id(const char *col_name); + +// Build a block col_name from base + position_id (caller must free) +// e.g., ("body", "0.5") -> "body\x1F0.5" +char *block_build_colname(const char *base_col, const char *position_id); + +// Split text into blocks using the given delimiter +block_list_t *block_split(const char *text, const char *delimiter); + +// Free a block list +void block_list_free(block_list_t *list); + +// Generate fractional index position IDs for N initial blocks +// Returns array of N strings (caller must free each + the array) +char **block_initial_positions(int count); + +// Generate a position ID that sorts between 'before' and 'after' +// Either can be NULL (meaning beginning/end of sequence) +// Caller must free the result +char *block_position_between(const char *before, const char *after); + +// Compute diff between old blocks (with position IDs) and new content blocks +// old_blocks: existing blocks from metadata (with position_ids) +// new_parts: new text split by delimiter (no position_ids yet) +// new_count: number of new parts +block_diff_t *block_diff(block_entry_t *old_blocks, int old_count, + const char **new_parts, int new_count); + +// Free a diff result +void block_diff_free(block_diff_t *diff); + +// Create an empty block list (for accumulating existing blocks) +block_list_t *block_list_create_empty(void); + +// Add a block entry to a list (content and position_id are copied) +bool block_list_add(block_list_t *list, const char *content, const char *position_id); + +// Concatenate block values with 
delimiter +// blocks: array of content strings (in position order) +// count: number of blocks +// delimiter: separator between blocks +// Returns allocated string (caller must free) +char *block_materialize_text(const char **blocks, int count, const char *delimiter); + +#endif diff --git a/src/cloudsync.c b/src/cloudsync.c index 12c0e90..b1bdbaa 100644 --- a/src/cloudsync.c +++ b/src/cloudsync.c @@ -22,6 +22,7 @@ #include "sql.h" #include "utils.h" #include "dbutils.h" +#include "block.h" #ifdef _WIN32 #include @@ -84,6 +85,37 @@ typedef enum { #define SYNCBIT_SET(_data) _data->insync = 1 #define SYNCBIT_RESET(_data) _data->insync = 0 +// MARK: - Deferred column-batch merge - + +typedef struct { + const char *col_name; // pointer into table_context->col_name[idx] (stable) + dbvalue_t *col_value; // duplicated via database_value_dup (owned) + int64_t col_version; + int64_t db_version; + uint8_t site_id[UUID_LEN]; + int site_id_len; + int64_t seq; +} merge_pending_entry; + +typedef struct { + cloudsync_table_context *table; + char *pk; // malloc'd copy, freed on flush + int pk_len; + int64_t cl; + bool sentinel_pending; + bool row_exists; // true when the PK already exists locally + int count; + int capacity; + merge_pending_entry *entries; + + // Statement cache — reuse the prepared statement when the column + // combination and row_exists flag match between consecutive PK flushes. 
+ dbvm_t *cached_vm; + bool cached_row_exists; + int cached_col_count; + const char **cached_col_names; // array of pointers into table_context (not owned) +} merge_pending_batch; + // MARK: - struct cloudsync_pk_decode_bind_context { @@ -142,6 +174,9 @@ struct cloudsync_context { int tables_cap; // capacity int skip_decode_idx; // -1 in sqlite, col_value index in postgresql + + // deferred column-batch merge (active during payload_apply) + merge_pending_batch *pending_batch; }; struct cloudsync_table_context { @@ -154,6 +189,14 @@ struct cloudsync_table_context { dbvm_t **col_merge_stmt; // array of merge insert stmt (indexed by col_name) dbvm_t **col_value_stmt; // array of column value stmt (indexed by col_name) int *col_id; // array of column id + col_algo_t *col_algo; // per-column algorithm (normal or block) + char **col_delimiter; // per-column delimiter for block splitting (NULL = default "\n") + bool has_block_cols; // quick check: does this table have any block columns? + dbvm_t *block_value_read_stmt; // SELECT col_value FROM blocks table + dbvm_t *block_value_write_stmt; // INSERT OR REPLACE into blocks table + dbvm_t *block_value_delete_stmt; // DELETE from blocks table + dbvm_t *block_list_stmt; // SELECT block entries for materialization + char *blocks_ref; // schema-qualified blocks table name int ncols; // number of non primary key cols int npks; // number of primary key cols bool enabled; // flag to check if a table is enabled or disabled @@ -224,7 +267,7 @@ bool force_uncompressed_blob = false; #endif // Internal prototypes -int local_mark_insert_or_update_meta (cloudsync_table_context *table, const char *pk, size_t pklen, const char *col_name, int64_t db_version, int seq); +int local_mark_insert_or_update_meta (cloudsync_table_context *table, const void *pk, size_t pklen, const char *col_name, int64_t db_version, int seq); // MARK: - CRDT algos - @@ -697,8 +740,23 @@ void table_free (cloudsync_table_context *table) { if (table->col_id) { 
cloudsync_memory_free(table->col_id); } + if (table->col_algo) { + cloudsync_memory_free(table->col_algo); + } + if (table->col_delimiter) { + for (int i=0; i<table->ncols; ++i) { + if (table->col_delimiter[i]) cloudsync_memory_free(table->col_delimiter[i]); + } + cloudsync_memory_free(table->col_delimiter); + } } - + + if (table->block_value_read_stmt) databasevm_finalize(table->block_value_read_stmt); + if (table->block_value_write_stmt) databasevm_finalize(table->block_value_write_stmt); + if (table->block_value_delete_stmt) databasevm_finalize(table->block_value_delete_stmt); + if (table->block_list_stmt) databasevm_finalize(table->block_list_stmt); + if (table->blocks_ref) cloudsync_memory_free(table->blocks_ref); + if (table->name) cloudsync_memory_free(table->name); if (table->schema) cloudsync_memory_free(table->schema); if (table->meta_ref) cloudsync_memory_free(table->meta_ref); @@ -1031,6 +1089,12 @@ bool table_add_to_context (cloudsync_context *data, table_algo algo, const char table->col_value_stmt = (dbvm_t **)cloudsync_memory_alloc((uint64_t)(sizeof(void *) * ncols)); if (!table->col_value_stmt) goto abort_add_table; + table->col_algo = (col_algo_t *)cloudsync_memory_zeroalloc((uint64_t)(sizeof(col_algo_t) * ncols)); + if (!table->col_algo) goto abort_add_table; + + table->col_delimiter = (char **)cloudsync_memory_zeroalloc((uint64_t)(sizeof(char *) * ncols)); + if (!table->col_delimiter) goto abort_add_table; + + // Pass empty string when schema is NULL; SQL will fall back to current_schema() const char *schema = table->schema ?
table->schema : ""; char *sql = cloudsync_memory_mprintf(SQL_PRAGMA_TABLEINFO_LIST_NONPK_NAME_CID, @@ -1203,7 +1267,242 @@ int merge_set_winner_clock (cloudsync_context *data, cloudsync_table_context *ta return rc; } -int merge_insert_col (cloudsync_context *data, cloudsync_table_context *table, const char *pk, int pklen, const char *col_name, dbvalue_t *col_value, int64_t col_version, int64_t db_version, const char *site_id, int site_len, int64_t seq, int64_t *rowid) { +// MARK: - Deferred column-batch merge functions - + +static int merge_pending_add (cloudsync_context *data, cloudsync_table_context *table, const char *pk, int pklen, const char *col_name, dbvalue_t *col_value, int64_t col_version, int64_t db_version, const char *site_id, int site_len, int64_t seq) { + merge_pending_batch *batch = data->pending_batch; + + // Store table and PK on first entry + if (batch->table == NULL) { + batch->table = table; + batch->pk = (char *)cloudsync_memory_alloc(pklen); + if (!batch->pk) return cloudsync_set_error(data, "merge_pending_add: out of memory for pk", DBRES_NOMEM); + memcpy(batch->pk, pk, pklen); + batch->pk_len = pklen; + } + + // Ensure capacity + if (batch->count >= batch->capacity) { + int new_cap = batch->capacity ? batch->capacity * 2 : 8; + merge_pending_entry *new_entries = (merge_pending_entry *)cloudsync_memory_realloc(batch->entries, new_cap * sizeof(merge_pending_entry)); + if (!new_entries) return cloudsync_set_error(data, "merge_pending_add: out of memory for entries", DBRES_NOMEM); + batch->entries = new_entries; + batch->capacity = new_cap; + } + + // Resolve col_name to a stable pointer from the table context + // (the incoming col_name may point to VM-owned memory that gets freed on reset) + int col_idx = -1; + table_column_lookup(table, col_name, true, &col_idx); + const char *stable_col_name = (col_idx >= 0) ? 
table_colname(table, col_idx) : NULL; + if (!stable_col_name) return cloudsync_set_error(data, "merge_pending_add: column not found in table context", DBRES_ERROR); + + merge_pending_entry *e = &batch->entries[batch->count]; + e->col_name = stable_col_name; + e->col_value = col_value ? (dbvalue_t *)database_value_dup(col_value) : NULL; + e->col_version = col_version; + e->db_version = db_version; + e->site_id_len = (site_len <= (int)sizeof(e->site_id)) ? site_len : (int)sizeof(e->site_id); + memcpy(e->site_id, site_id, e->site_id_len); + e->seq = seq; + + batch->count++; + return DBRES_OK; +} + +static void merge_pending_free_entries (merge_pending_batch *batch) { + if (batch->entries) { + for (int i = 0; i < batch->count; i++) { + if (batch->entries[i].col_value) { + database_value_free(batch->entries[i].col_value); + batch->entries[i].col_value = NULL; + } + } + } + if (batch->pk) { + cloudsync_memory_free(batch->pk); + batch->pk = NULL; + } + batch->table = NULL; + batch->pk_len = 0; + batch->cl = 0; + batch->sentinel_pending = false; + batch->row_exists = false; + batch->count = 0; +} + +static int merge_flush_pending (cloudsync_context *data) { + merge_pending_batch *batch = data->pending_batch; + if (!batch) return DBRES_OK; + + int rc = DBRES_OK; + bool flush_savepoint = false; + + // Nothing to write — handle sentinel-only case or skip + if (batch->count == 0 && !(batch->sentinel_pending && batch->table)) { + goto cleanup; + } + + // Wrap database operations in a savepoint so that on failure (e.g. RLS + // denial) the rollback properly releases all executor resources (open + // relations, snapshots, plan cache) acquired during the failed statement. 
+ flush_savepoint = (database_begin_savepoint(data, "merge_flush") == DBRES_OK); + + if (batch->count == 0) { + // Sentinel with no winning columns (PK-only row) + dbvm_t *vm = batch->table->real_merge_sentinel_stmt; + rc = pk_decode_prikey(batch->pk, (size_t)batch->pk_len, pk_decode_bind_callback, vm); + if (rc < 0) { + cloudsync_set_dberror(data); + dbvm_reset(vm); + goto cleanup; + } + SYNCBIT_SET(data); + rc = databasevm_step(vm); + dbvm_reset(vm); + SYNCBIT_RESET(data); + if (rc == DBRES_DONE) rc = DBRES_OK; + if (rc != DBRES_OK) { + cloudsync_set_dberror(data); + goto cleanup; + } + goto cleanup; + } + + // Check if cached prepared statement can be reused + cloudsync_table_context *table = batch->table; + dbvm_t *vm = NULL; + bool cache_hit = false; + + if (batch->cached_vm && + batch->cached_row_exists == batch->row_exists && + batch->cached_col_count == batch->count) { + cache_hit = true; + for (int i = 0; i < batch->count; i++) { + if (batch->cached_col_names[i] != batch->entries[i].col_name) { + cache_hit = false; + break; + } + } + } + + if (cache_hit) { + vm = batch->cached_vm; + dbvm_reset(vm); + } else { + // Invalidate old cache + if (batch->cached_vm) { + databasevm_finalize(batch->cached_vm); + batch->cached_vm = NULL; + } + + // Build multi-column SQL + const char **colnames = (const char **)cloudsync_memory_alloc(batch->count * sizeof(const char *)); + if (!colnames) { + rc = cloudsync_set_error(data, "merge_flush_pending: out of memory", DBRES_NOMEM); + goto cleanup; + } + for (int i = 0; i < batch->count; i++) { + colnames[i] = batch->entries[i].col_name; + } + + char *sql = batch->row_exists + ? 
sql_build_update_pk_and_multi_cols(data, table->name, colnames, batch->count, table->schema) + : sql_build_upsert_pk_and_multi_cols(data, table->name, colnames, batch->count, table->schema); + cloudsync_memory_free(colnames); + + if (!sql) { + rc = cloudsync_set_error(data, "merge_flush_pending: unable to build multi-column upsert SQL", DBRES_ERROR); + goto cleanup; + } + + rc = databasevm_prepare(data, sql, &vm, 0); + cloudsync_memory_free(sql); + if (rc != DBRES_OK) { + rc = cloudsync_set_error(data, "merge_flush_pending: unable to prepare statement", rc); + goto cleanup; + } + + // Update cache + batch->cached_vm = vm; + batch->cached_row_exists = batch->row_exists; + batch->cached_col_count = batch->count; + // Reallocate cached_col_names if needed + if (batch->cached_col_count > 0) { + const char **new_names = (const char **)cloudsync_memory_realloc( + batch->cached_col_names, batch->count * sizeof(const char *)); + if (new_names) { + for (int i = 0; i < batch->count; i++) { + new_names[i] = batch->entries[i].col_name; + } + batch->cached_col_names = new_names; + } + } + } + + // Bind PKs (positions 1..npks) + int npks = pk_decode_prikey(batch->pk, (size_t)batch->pk_len, pk_decode_bind_callback, vm); + if (npks < 0) { + cloudsync_set_dberror(data); + dbvm_reset(vm); + rc = DBRES_ERROR; + goto cleanup; + } + + // Bind column values (positions npks+1..npks+count) + for (int i = 0; i < batch->count; i++) { + merge_pending_entry *e = &batch->entries[i]; + int bind_idx = npks + 1 + i; + if (e->col_value) { + rc = databasevm_bind_value(vm, bind_idx, e->col_value); + } else { + rc = databasevm_bind_null(vm, bind_idx); + } + if (rc != DBRES_OK) { + cloudsync_set_dberror(data); + dbvm_reset(vm); + goto cleanup; + } + } + + // Execute with SYNCBIT and GOS handling + if (table->algo == table_algo_crdt_gos) table->enabled = 0; + SYNCBIT_SET(data); + rc = databasevm_step(vm); + dbvm_reset(vm); + SYNCBIT_RESET(data); + if (table->algo == table_algo_crdt_gos) table->enabled 
= 1; + + if (rc != DBRES_DONE) { + cloudsync_set_dberror(data); + goto cleanup; + } + rc = DBRES_OK; + + // Call merge_set_winner_clock for each buffered entry + int64_t rowid = 0; + for (int i = 0; i < batch->count; i++) { + merge_pending_entry *e = &batch->entries[i]; + int clock_rc = merge_set_winner_clock(data, table, batch->pk, batch->pk_len, + e->col_name, e->col_version, e->db_version, + (const char *)e->site_id, e->site_id_len, + e->seq, &rowid); + if (clock_rc != DBRES_OK) { + rc = clock_rc; + goto cleanup; + } + } + +cleanup: + merge_pending_free_entries(batch); + if (flush_savepoint) { + if (rc == DBRES_OK) database_commit_savepoint(data, "merge_flush"); + else database_rollback_savepoint(data, "merge_flush"); + } + return rc; +} + +int merge_insert_col (cloudsync_context *data, cloudsync_table_context *table, const void *pk, int pklen, const char *col_name, dbvalue_t *col_value, int64_t col_version, int64_t db_version, const char *site_id, int site_len, int64_t seq, int64_t *rowid) { int index; dbvm_t *vm = table_column_lookup(table, col_name, true, &index); if (vm == NULL) return cloudsync_set_error(data, "Unable to retrieve column merge precompiled statement in merge_insert_col", DBRES_MISUSE); @@ -1335,17 +1634,29 @@ int merge_did_cid_win (cloudsync_context *data, cloudsync_table_context *table, } // rc == DBRES_ROW and col_version == local_version, need to compare values - + // retrieve col_value precompiled statement - dbvm_t *vm = table_column_lookup(table, col_name, false, NULL); - if (!vm) return cloudsync_set_error(data, "Unable to retrieve column value precompiled statement in merge_did_cid_win", DBRES_ERROR); - - // bind primary key values - rc = pk_decode_prikey((char *)pk, (size_t)pklen, pk_decode_bind_callback, (void *)vm); - if (rc < 0) { - rc = cloudsync_set_dberror(data); - dbvm_reset(vm); - return rc; + bool is_block_col = block_is_block_colname(col_name) && table_has_block_cols(table); + dbvm_t *vm; + if (is_block_col) { + // Block 
column: read value from blocks table (pk + col_name bindings) + vm = table_block_value_read_stmt(table); + if (!vm) return cloudsync_set_error(data, "Unable to retrieve block value read statement in merge_did_cid_win", DBRES_ERROR); + rc = databasevm_bind_blob(vm, 1, (const void *)pk, pklen); + if (rc != DBRES_OK) { dbvm_reset(vm); return cloudsync_set_dberror(data); } + rc = databasevm_bind_text(vm, 2, col_name, -1); + if (rc != DBRES_OK) { dbvm_reset(vm); return cloudsync_set_dberror(data); } + } else { + vm = table_column_lookup(table, col_name, false, NULL); + if (!vm) return cloudsync_set_error(data, "Unable to retrieve column value precompiled statement in merge_did_cid_win", DBRES_ERROR); + + // bind primary key values + rc = pk_decode_prikey((char *)pk, (size_t)pklen, pk_decode_bind_callback, (void *)vm); + if (rc < 0) { + rc = cloudsync_set_dberror(data); + dbvm_reset(vm); + return rc; + } } // execute vm @@ -1386,7 +1697,7 @@ int merge_did_cid_win (cloudsync_context *data, cloudsync_table_context *table, rc = databasevm_step(vm); if (rc == DBRES_ROW) { - const void *local_site_id = database_column_blob(vm, 0); + const void *local_site_id = database_column_blob(vm, 0, NULL); if (!local_site_id) { dbvm_reset(vm); return cloudsync_set_error(data, "NULL site_id in cloudsync table, table is probably corrupted", DBRES_ERROR); @@ -1408,34 +1719,236 @@ int merge_did_cid_win (cloudsync_context *data, cloudsync_table_context *table, } int merge_sentinel_only_insert (cloudsync_context *data, cloudsync_table_context *table, const char *pk, int pklen, int64_t cl, int64_t db_version, const char *site_id, int site_len, int64_t seq, int64_t *rowid) { - + // reset return value *rowid = 0; - - // bind pk - dbvm_t *vm = table->real_merge_sentinel_stmt; - int rc = pk_decode_prikey((char *)pk, (size_t)pklen, pk_decode_bind_callback, vm); - if (rc < 0) { - rc = cloudsync_set_dberror(data); + + if (data->pending_batch == NULL) { + // Immediate mode: execute base table INSERT + 
dbvm_t *vm = table->real_merge_sentinel_stmt; + int rc = pk_decode_prikey((char *)pk, (size_t)pklen, pk_decode_bind_callback, vm); + if (rc < 0) { + rc = cloudsync_set_dberror(data); + dbvm_reset(vm); + return rc; + } + + SYNCBIT_SET(data); + rc = databasevm_step(vm); dbvm_reset(vm); - return rc; + SYNCBIT_RESET(data); + if (rc == DBRES_DONE) rc = DBRES_OK; + if (rc != DBRES_OK) { + cloudsync_set_dberror(data); + return rc; + } + } else { + // Batch mode: skip base table INSERT, the batch flush will create the row + merge_pending_batch *batch = data->pending_batch; + batch->sentinel_pending = true; + if (batch->table == NULL) { + batch->table = table; + batch->pk = (char *)cloudsync_memory_alloc(pklen); + if (!batch->pk) return cloudsync_set_error(data, "merge_sentinel_only_insert: out of memory for pk", DBRES_NOMEM); + memcpy(batch->pk, pk, pklen); + batch->pk_len = pklen; + } } - - // perform real operation and disable triggers - SYNCBIT_SET(data); + + // Metadata operations always execute regardless of batch mode + int rc = merge_zeroclock_on_resurrect(table, db_version, pk, pklen); + if (rc != DBRES_OK) return rc; + + return merge_set_winner_clock(data, table, pk, pklen, NULL, cl, db_version, site_id, site_len, seq, rowid); +} + +// MARK: - Block-level merge helpers - + +// Store a block value in the blocks table +static int block_store_value (cloudsync_context *data, cloudsync_table_context *table, const void *pk, int pklen, const char *block_colname, dbvalue_t *col_value) { + dbvm_t *vm = table->block_value_write_stmt; + if (!vm) return cloudsync_set_error(data, "block_store_value: blocks table not initialized", DBRES_MISUSE); + + int rc = databasevm_bind_blob(vm, 1, pk, pklen); + if (rc != DBRES_OK) goto cleanup; + rc = databasevm_bind_text(vm, 2, block_colname, -1); + if (rc != DBRES_OK) goto cleanup; + if (col_value) { + rc = databasevm_bind_value(vm, 3, col_value); + } else { + rc = databasevm_bind_null(vm, 3); + } + if (rc != DBRES_OK) goto cleanup; + rc 
= databasevm_step(vm); - dbvm_reset(vm); + if (rc == DBRES_DONE) rc = DBRES_OK; + +cleanup: + if (rc != DBRES_OK) cloudsync_set_dberror(data); + databasevm_reset(vm); + return rc; +} + +// Delete a block value from the blocks table +static int block_delete_value (cloudsync_context *data, cloudsync_table_context *table, const void *pk, int pklen, const char *block_colname) { + dbvm_t *vm = table->block_value_delete_stmt; + if (!vm) return cloudsync_set_error(data, "block_delete_value: blocks table not initialized", DBRES_MISUSE); + + int rc = databasevm_bind_blob(vm, 1, pk, pklen); + if (rc != DBRES_OK) goto cleanup; + rc = databasevm_bind_text(vm, 2, block_colname, -1); + if (rc != DBRES_OK) goto cleanup; + + rc = databasevm_step(vm); + if (rc == DBRES_DONE) rc = DBRES_OK; + +cleanup: + if (rc != DBRES_OK) cloudsync_set_dberror(data); + databasevm_reset(vm); + return rc; +} + +// Materialize all alive blocks for a base column into the base table +int block_materialize_column (cloudsync_context *data, cloudsync_table_context *table, const void *pk, int pklen, const char *base_col_name) { + if (!table->block_list_stmt) return cloudsync_set_error(data, "block_materialize_column: blocks table not initialized", DBRES_MISUSE); + + // Find column index and delimiter + int col_idx = -1; + for (int i = 0; i < table->ncols; i++) { + if (strcasecmp(table->col_name[i], base_col_name) == 0) { + col_idx = i; + break; + } + } + if (col_idx < 0) return cloudsync_set_error(data, "block_materialize_column: column not found", DBRES_ERROR); + const char *delimiter = table->col_delimiter[col_idx] ? 
table->col_delimiter[col_idx] : BLOCK_DEFAULT_DELIMITER; + + // Build the LIKE pattern for block col_names: "base_col\x1F%" + char *like_pattern = block_build_colname(base_col_name, "%"); + if (!like_pattern) return DBRES_NOMEM; + + // Query alive blocks from blocks table joined with metadata + // block_list_stmt: SELECT b.col_value FROM blocks b JOIN meta m + // ON b.pk = m.pk AND b.col_name = m.col_name + // WHERE b.pk = ? AND b.col_name LIKE ? AND m.col_version % 2 = 1 + // ORDER BY b.col_name + dbvm_t *vm = table->block_list_stmt; + int rc = databasevm_bind_blob(vm, 1, pk, pklen); + if (rc != DBRES_OK) { cloudsync_memory_free(like_pattern); databasevm_reset(vm); return rc; } + rc = databasevm_bind_text(vm, 2, like_pattern, -1); + if (rc != DBRES_OK) { cloudsync_memory_free(like_pattern); databasevm_reset(vm); return rc; } + // Bind pk again for the join condition (parameter 3) + rc = databasevm_bind_blob(vm, 3, pk, pklen); + if (rc != DBRES_OK) { cloudsync_memory_free(like_pattern); databasevm_reset(vm); return rc; } + rc = databasevm_bind_text(vm, 4, like_pattern, -1); + if (rc != DBRES_OK) { cloudsync_memory_free(like_pattern); databasevm_reset(vm); return rc; } + + // Collect block values + const char **block_values = NULL; + int block_count = 0; + int block_cap = 0; + + while ((rc = databasevm_step(vm)) == DBRES_ROW) { + const char *value = database_column_text(vm, 0); + if (block_count >= block_cap) { + int new_cap = block_cap ? block_cap * 2 : 16; + const char **new_arr = (const char **)cloudsync_memory_realloc((void *)block_values, (uint64_t)(new_cap * sizeof(char *))); + if (!new_arr) { rc = DBRES_NOMEM; break; } + block_values = new_arr; + block_cap = new_cap; + } + block_values[block_count] = value ? 
cloudsync_string_dup(value) : cloudsync_string_dup(""); + block_count++; + } + databasevm_reset(vm); + cloudsync_memory_free(like_pattern); + + if (rc != DBRES_DONE && rc != DBRES_OK && rc != DBRES_ROW) { + // Free collected values + for (int i = 0; i < block_count; i++) cloudsync_memory_free((void *)block_values[i]); + if (block_values) cloudsync_memory_free((void *)block_values); + return cloudsync_set_dberror(data); + } + + // Materialize text (NULL when no alive blocks) + char *text = (block_count > 0) ? block_materialize_text(block_values, block_count, delimiter) : NULL; + for (int i = 0; i < block_count; i++) cloudsync_memory_free((void *)block_values[i]); + if (block_values) cloudsync_memory_free((void *)block_values); + if (block_count > 0 && !text) return DBRES_NOMEM; + + // Update the base table column via the col_merge_stmt (with triggers disabled) + dbvm_t *merge_vm = table->col_merge_stmt[col_idx]; + if (!merge_vm) { cloudsync_memory_free(text); return DBRES_ERROR; } + + // Bind PKs + rc = pk_decode_prikey((char *)pk, (size_t)pklen, pk_decode_bind_callback, merge_vm); + if (rc < 0) { cloudsync_memory_free(text); databasevm_reset(merge_vm); return DBRES_ERROR; } + + // Bind the text value twice (INSERT value + ON CONFLICT UPDATE value) + int npks = table->npks; + if (text) { + rc = databasevm_bind_text(merge_vm, npks + 1, text, -1); + if (rc != DBRES_OK) { cloudsync_memory_free(text); databasevm_reset(merge_vm); return rc; } + rc = databasevm_bind_text(merge_vm, npks + 2, text, -1); + if (rc != DBRES_OK) { cloudsync_memory_free(text); databasevm_reset(merge_vm); return rc; } + } else { + rc = databasevm_bind_null(merge_vm, npks + 1); + if (rc != DBRES_OK) { databasevm_reset(merge_vm); return rc; } + rc = databasevm_bind_null(merge_vm, npks + 2); + if (rc != DBRES_OK) { databasevm_reset(merge_vm); return rc; } + } + + // Execute with triggers disabled + table->enabled = 0; + SYNCBIT_SET(data); + rc = databasevm_step(merge_vm); + 
databasevm_reset(merge_vm); SYNCBIT_RESET(data); + table->enabled = 1; + + cloudsync_memory_free(text); + if (rc == DBRES_DONE) rc = DBRES_OK; - if (rc != DBRES_OK) { - cloudsync_set_dberror(data); - return rc; + if (rc != DBRES_OK) return cloudsync_set_dberror(data); + return DBRES_OK; +} + +// Accessor for has_block_cols flag +bool table_has_block_cols (cloudsync_table_context *table) { + return table && table->has_block_cols; +} + +// Get block column algo for a given column index +col_algo_t table_col_algo (cloudsync_table_context *table, int index) { + if (!table || !table->col_algo || index < 0 || index >= table->ncols) return col_algo_normal; + return table->col_algo[index]; +} + +// Get block delimiter for a given column index +const char *table_col_delimiter (cloudsync_table_context *table, int index) { + if (!table || !table->col_delimiter || index < 0 || index >= table->ncols) return BLOCK_DEFAULT_DELIMITER; + return table->col_delimiter[index] ? table->col_delimiter[index] : BLOCK_DEFAULT_DELIMITER; +} + +// Block column struct accessors (for use outside cloudsync.c where struct is opaque) +dbvm_t *table_block_value_read_stmt (cloudsync_table_context *table) { return table ? table->block_value_read_stmt : NULL; } +dbvm_t *table_block_value_write_stmt (cloudsync_table_context *table) { return table ? table->block_value_write_stmt : NULL; } +dbvm_t *table_block_list_stmt (cloudsync_table_context *table) { return table ? table->block_list_stmt : NULL; } +const char *table_blocks_ref (cloudsync_table_context *table) { return table ? table->blocks_ref : NULL; } + +void table_set_col_delimiter (cloudsync_table_context *table, int col_idx, const char *delimiter) { + if (!table || !table->col_delimiter || col_idx < 0 || col_idx >= table->ncols) return; + if (table->col_delimiter[col_idx]) cloudsync_memory_free(table->col_delimiter[col_idx]); + table->col_delimiter[col_idx] = delimiter ? 
cloudsync_string_dup(delimiter) : NULL; +} + +// Find column index by name +int table_col_index (cloudsync_table_context *table, const char *col_name) { + if (!table || !col_name) return -1; + for (int i = 0; i < table->ncols; i++) { + if (strcasecmp(table->col_name[i], col_name) == 0) return i; } - - rc = merge_zeroclock_on_resurrect(table, db_version, pk, pklen); - if (rc != DBRES_OK) return rc; - - return merge_set_winner_clock(data, table, pk, pklen, NULL, cl, db_version, site_id, site_len, seq, rowid); + return -1; } int merge_insert (cloudsync_context *data, cloudsync_table_context *table, const char *insert_pk, int insert_pk_len, int64_t insert_cl, const char *insert_name, dbvalue_t *insert_value, int64_t insert_col_version, int64_t insert_db_version, const char *insert_site_id, int insert_site_id_len, int64_t insert_seq, int64_t *rowid) { @@ -1505,14 +2018,137 @@ int merge_insert (cloudsync_context *data, cloudsync_table_context *table, const // check if the incoming change wins and should be applied bool does_cid_win = ((needs_resurrect) || (!row_exists_locally) || (flag)); if (!does_cid_win) return DBRES_OK; - + + // Block-level LWW: if the incoming col_name is a block entry (contains \x1F), + // bypass the normal base-table write and instead store the value in the blocks table. + // The base table column will be materialized from all alive blocks. 
+ if (block_is_block_colname(insert_name) && table->has_block_cols) { + // Store or delete block value in blocks table depending on tombstone status + if (insert_col_version % 2 == 0) { + // Tombstone: remove from blocks table + rc = block_delete_value(data, table, insert_pk, insert_pk_len, insert_name); + } else { + rc = block_store_value(data, table, insert_pk, insert_pk_len, insert_name, insert_value); + } + if (rc != DBRES_OK) return cloudsync_set_error(data, "Unable to store/delete block value", rc); + + // Set winner clock in metadata + rc = merge_set_winner_clock(data, table, insert_pk, insert_pk_len, insert_name, + insert_col_version, insert_db_version, + insert_site_id, insert_site_id_len, insert_seq, rowid); + if (rc != DBRES_OK) return cloudsync_set_error(data, "Unable to set winner clock for block", rc); + + // Materialize the full column from blocks into the base table + char *base_col = block_extract_base_colname(insert_name); + if (base_col) { + rc = block_materialize_column(data, table, insert_pk, insert_pk_len, base_col); + cloudsync_memory_free(base_col); + if (rc != DBRES_OK) return cloudsync_set_error(data, "Unable to materialize block column", rc); + } + + return DBRES_OK; + } + // perform the final column insert or update if the incoming change wins - rc = merge_insert_col(data, table, insert_pk, insert_pk_len, insert_name, insert_value, insert_col_version, insert_db_version, insert_site_id, insert_site_id_len, insert_seq, rowid); - if (rc != DBRES_OK) cloudsync_set_error(data, "Unable to perform merge_insert_col", rc); - + if (data->pending_batch) { + // Propagate row_exists_locally to the batch on the first winning column. + // This lets merge_flush_pending choose UPDATE vs INSERT ON CONFLICT, + // which matters when RLS policies reference columns not in the payload. 
+ if (data->pending_batch->table == NULL) { + data->pending_batch->row_exists = row_exists_locally; + } + rc = merge_pending_add(data, table, insert_pk, insert_pk_len, insert_name, insert_value, insert_col_version, insert_db_version, insert_site_id, insert_site_id_len, insert_seq); + if (rc != DBRES_OK) cloudsync_set_error(data, "Unable to perform merge_pending_add", rc); + } else { + rc = merge_insert_col(data, table, insert_pk, insert_pk_len, insert_name, insert_value, insert_col_version, insert_db_version, insert_site_id, insert_site_id_len, insert_seq, rowid); + if (rc != DBRES_OK) cloudsync_set_error(data, "Unable to perform merge_insert_col", rc); + } + return rc; } +// MARK: - Block column setup - + +int cloudsync_setup_block_column (cloudsync_context *data, const char *table_name, const char *col_name, const char *delimiter) { + cloudsync_table_context *table = table_lookup(data, table_name); + if (!table) return cloudsync_set_error(data, "cloudsync_setup_block_column: table not found", DBRES_ERROR); + + // Find column index + int col_idx = table_col_index(table, col_name); + if (col_idx < 0) { + char buf[1024]; + snprintf(buf, sizeof(buf), "cloudsync_setup_block_column: column '%s' not found in table '%s'", col_name, table_name); + return cloudsync_set_error(data, buf, DBRES_ERROR); + } + + // Set column algo + table->col_algo[col_idx] = col_algo_block; + table->has_block_cols = true; + + // Set delimiter (can be NULL for default) + if (table->col_delimiter[col_idx]) { + cloudsync_memory_free(table->col_delimiter[col_idx]); + table->col_delimiter[col_idx] = NULL; + } + if (delimiter) { + table->col_delimiter[col_idx] = cloudsync_string_dup(delimiter); + } + + // Create blocks table if not already done + if (!table->blocks_ref) { + table->blocks_ref = database_build_blocks_ref(table->schema, table->name); + if (!table->blocks_ref) return DBRES_NOMEM; + + // CREATE TABLE IF NOT EXISTS + char *sql = cloudsync_memory_mprintf(SQL_BLOCKS_CREATE_TABLE, 
table->blocks_ref); + if (!sql) return DBRES_NOMEM; + + int rc = database_exec(data, sql); + cloudsync_memory_free(sql); + if (rc != DBRES_OK) return cloudsync_set_error(data, "Unable to create blocks table", rc); + + // Prepare block statements + // Write: upsert into blocks (pk, col_name, col_value) + sql = cloudsync_memory_mprintf(SQL_BLOCKS_UPSERT, table->blocks_ref); + if (!sql) return DBRES_NOMEM; + rc = databasevm_prepare(data, sql, (void **)&table->block_value_write_stmt, DBFLAG_PERSISTENT); + cloudsync_memory_free(sql); + if (rc != DBRES_OK) return rc; + + // Read: SELECT col_value FROM blocks WHERE pk = ? AND col_name = ? + sql = cloudsync_memory_mprintf(SQL_BLOCKS_SELECT, table->blocks_ref); + if (!sql) return DBRES_NOMEM; + rc = databasevm_prepare(data, sql, (void **)&table->block_value_read_stmt, DBFLAG_PERSISTENT); + cloudsync_memory_free(sql); + if (rc != DBRES_OK) return rc; + + // Delete: DELETE FROM blocks WHERE pk = ? AND col_name = ? + sql = cloudsync_memory_mprintf(SQL_BLOCKS_DELETE, table->blocks_ref); + if (!sql) return DBRES_NOMEM; + rc = databasevm_prepare(data, sql, (void **)&table->block_value_delete_stmt, DBFLAG_PERSISTENT); + cloudsync_memory_free(sql); + if (rc != DBRES_OK) return rc; + + // List alive blocks for materialization + sql = cloudsync_memory_mprintf(SQL_BLOCKS_LIST_ALIVE, table->blocks_ref, table->meta_ref); + if (!sql) return DBRES_NOMEM; + rc = databasevm_prepare(data, sql, (void **)&table->block_list_stmt, DBFLAG_PERSISTENT); + cloudsync_memory_free(sql); + if (rc != DBRES_OK) return rc; + } + + // Persist settings + int rc = dbutils_table_settings_set_key_value(data, table_name, col_name, "algo", "block"); + if (rc != DBRES_OK) return rc; + + if (delimiter) { + rc = dbutils_table_settings_set_key_value(data, table_name, col_name, "delimiter", delimiter); + if (rc != DBRES_OK) return rc; + } + + return DBRES_OK; +} + // MARK: - Private - bool cloudsync_config_exists (cloudsync_context *data) { @@ -1942,13 +2578,13 @@ int 
cloudsync_refill_metatable (cloudsync_context *data, const char *table_name) rc = databasevm_bind_text(vm, 1, col_name, -1); if (rc != DBRES_OK) goto finalize; - + while (1) { rc = databasevm_step(vm); if (rc == DBRES_ROW) { - const char *pk = (const char *)database_column_text(vm, 0); + size_t pklen = 0; + const void *pk = (const char *)database_column_blob(vm, 0, &pklen); if (!pk) { rc = DBRES_ERROR; break; } - size_t pklen = strlen(pk); rc = local_mark_insert_or_update_meta(table, pk, pklen, col_name, db_version, cloudsync_bumpseq(data)); } else if (rc == DBRES_DONE) { rc = DBRES_OK; @@ -1971,7 +2607,7 @@ int cloudsync_refill_metatable (cloudsync_context *data, const char *table_name) // MARK: - Local - -int local_update_sentinel (cloudsync_table_context *table, const char *pk, size_t pklen, int64_t db_version, int seq) { +int local_update_sentinel (cloudsync_table_context *table, const void *pk, size_t pklen, int64_t db_version, int seq) { dbvm_t *vm = table->meta_sentinel_update_stmt; if (!vm) return -1; @@ -1993,7 +2629,7 @@ int local_update_sentinel (cloudsync_table_context *table, const char *pk, size_ return rc; } -int local_mark_insert_sentinel_meta (cloudsync_table_context *table, const char *pk, size_t pklen, int64_t db_version, int seq) { +int local_mark_insert_sentinel_meta (cloudsync_table_context *table, const void *pk, size_t pklen, int64_t db_version, int seq) { dbvm_t *vm = table->meta_sentinel_insert_stmt; if (!vm) return -1; @@ -2021,7 +2657,7 @@ int local_mark_insert_sentinel_meta (cloudsync_table_context *table, const char return rc; } -int local_mark_insert_or_update_meta_impl (cloudsync_table_context *table, const char *pk, size_t pklen, const char *col_name, int col_version, int64_t db_version, int seq) { +int local_mark_insert_or_update_meta_impl (cloudsync_table_context *table, const void *pk, size_t pklen, const char *col_name, int col_version, int64_t db_version, int seq) { dbvm_t *vm = table->meta_row_insert_update_stmt; if (!vm) 
return -1; @@ -2056,15 +2692,24 @@ int local_mark_insert_or_update_meta_impl (cloudsync_table_context *table, const return rc; } -int local_mark_insert_or_update_meta (cloudsync_table_context *table, const char *pk, size_t pklen, const char *col_name, int64_t db_version, int seq) { +int local_mark_insert_or_update_meta (cloudsync_table_context *table, const void *pk, size_t pklen, const char *col_name, int64_t db_version, int seq) { return local_mark_insert_or_update_meta_impl(table, pk, pklen, col_name, 1, db_version, seq); } -int local_mark_delete_meta (cloudsync_table_context *table, const char *pk, size_t pklen, int64_t db_version, int seq) { +int local_mark_delete_block_meta (cloudsync_table_context *table, const void *pk, size_t pklen, const char *block_colname, int64_t db_version, int seq) { + // Mark a block as deleted by setting col_version = 2 (even = deleted) + return local_mark_insert_or_update_meta_impl(table, pk, pklen, block_colname, 2, db_version, seq); +} + +int block_delete_value_external (cloudsync_context *data, cloudsync_table_context *table, const void *pk, size_t pklen, const char *block_colname) { + return block_delete_value(data, table, pk, (int)pklen, block_colname); +} + +int local_mark_delete_meta (cloudsync_table_context *table, const void *pk, size_t pklen, int64_t db_version, int seq) { return local_mark_insert_or_update_meta_impl(table, pk, pklen, NULL, 2, db_version, seq); } -int local_drop_meta (cloudsync_table_context *table, const char *pk, size_t pklen) { +int local_drop_meta (cloudsync_table_context *table, const void *pk, size_t pklen) { dbvm_t *vm = table->meta_row_drop_stmt; if (!vm) return -1; @@ -2080,7 +2725,7 @@ int local_drop_meta (cloudsync_table_context *table, const char *pk, size_t pkle return rc; } -int local_update_move_meta (cloudsync_table_context *table, const char *pk, size_t pklen, const char *pk2, size_t pklen2, int64_t db_version) { +int local_update_move_meta (cloudsync_table_context *table, const void 
*pk, size_t pklen, const void *pk2, size_t pklen2, int64_t db_version) { /* * This function moves non-sentinel metadata entries from an old primary key (OLD.pk) * to a new primary key (NEW.pk) when a primary key change occurs. @@ -2431,78 +3076,108 @@ int cloudsync_payload_apply (cloudsync_context *data, const char *payload, int b uint16_t ncols = header.ncols; uint32_t nrows = header.nrows; int64_t last_payload_db_version = -1; - bool in_savepoint = false; int dbversion = dbutils_settings_get_int_value(data, CLOUDSYNC_KEY_CHECK_DBVERSION); int seq = dbutils_settings_get_int_value(data, CLOUDSYNC_KEY_CHECK_SEQ); cloudsync_pk_decode_bind_context decoded_context = {.vm = vm}; - void *payload_apply_xdata = NULL; - void *db = data->db; - cloudsync_payload_apply_callback_t payload_apply_callback = cloudsync_get_payload_apply_callback(db); - + + // Initialize deferred column-batch merge + merge_pending_batch batch = {0}; + data->pending_batch = &batch; + bool in_savepoint = false; + const void *last_pk = NULL; + int64_t last_pk_len = 0; + const char *last_tbl = NULL; + int64_t last_tbl_len = 0; + for (uint32_t i=0; iskip_decode_idx, cloudsync_payload_decode_callback, &decoded_context); if (res == -1) { + merge_flush_pending(data); + data->pending_batch = NULL; + if (batch.cached_vm) { databasevm_finalize(batch.cached_vm); batch.cached_vm = NULL; } + if (batch.cached_col_names) { cloudsync_memory_free(batch.cached_col_names); batch.cached_col_names = NULL; } + if (batch.entries) { cloudsync_memory_free(batch.entries); batch.entries = NULL; } if (in_savepoint) database_rollback_savepoint(data, "cloudsync_payload_apply"); rc = DBRES_ERROR; goto cleanup; } - // n is the pk_decode return value, I don't think I should assert here because in any case the next databasevm_step would fail - // assert(n == ncols); - - bool approved = true; - if (payload_apply_callback) approved = payload_apply_callback(&payload_apply_xdata, &decoded_context, db, data, 
CLOUDSYNC_PAYLOAD_APPLY_WILL_APPLY, DBRES_OK); - - // Apply consecutive rows with the same db_version inside a transaction if no - // transaction has already been opened. - // The user may have already opened a transaction before applying the payload, - // and the `payload_apply_callback` may have already opened a savepoint. - // Nested savepoints work, but overlapping savepoints could alter the expected behavior. - // This savepoint ensures that the db_version value remains consistent for all - // rows with the same original db_version in the payload. + // Detect PK/table/db_version boundary to flush pending batch + bool pk_changed = (last_pk != NULL && + (last_pk_len != decoded_context.pk_len || + memcmp(last_pk, decoded_context.pk, last_pk_len) != 0)); + bool tbl_changed = (last_tbl != NULL && + (last_tbl_len != decoded_context.tbl_len || + memcmp(last_tbl, decoded_context.tbl, last_tbl_len) != 0)); bool db_version_changed = (last_payload_db_version != decoded_context.db_version); - // Release existing savepoint if db_version changed + // Flush pending batch before any boundary change + if (pk_changed || tbl_changed || db_version_changed) { + int flush_rc = merge_flush_pending(data); + if (flush_rc != DBRES_OK) { + rc = flush_rc; + // continue processing remaining rows + } + } + + // Per-db_version savepoints group rows with the same source db_version + // into one transaction. In SQLite autocommit mode, the RELEASE triggers + // the commit hook which bumps data->db_version and resets seq, ensuring + // unique (db_version, seq) tuples across groups. In PostgreSQL SPI, + // database_in_transaction() is always true so this block is inactive — + // the inner per-PK savepoint in merge_flush_pending handles RLS instead. 
if (in_savepoint && db_version_changed) { rc = database_commit_savepoint(data, "cloudsync_payload_apply"); if (rc != DBRES_OK) { + merge_pending_free_entries(&batch); + data->pending_batch = NULL; cloudsync_set_error(data, "Error on cloudsync_payload_apply: unable to release a savepoint", rc); goto cleanup; } in_savepoint = false; } - // Start new savepoint if needed - bool in_transaction = database_in_transaction(data); - if (!in_transaction && db_version_changed) { + if (!in_savepoint && db_version_changed && !database_in_transaction(data)) { rc = database_begin_savepoint(data, "cloudsync_payload_apply"); if (rc != DBRES_OK) { + merge_pending_free_entries(&batch); + data->pending_batch = NULL; cloudsync_set_error(data, "Error on cloudsync_payload_apply: unable to start a transaction", rc); goto cleanup; } - last_payload_db_version = decoded_context.db_version; in_savepoint = true; } - - if (approved) { - rc = databasevm_step(vm); - if (rc != DBRES_DONE) { - // don't "break;", the error can be due to a RLS policy. - // in case of error we try to apply the following changes - // DEBUG_ALWAYS("cloudsync_payload_apply error on db_version %PRId64/%PRId64: (%d) %s\n", decoded_context.db_version, decoded_context.seq, rc, database_errmsg(data)); - } + + // Track db_version for batch-flush boundary detection + if (db_version_changed) { + last_payload_db_version = decoded_context.db_version; } - - if (payload_apply_callback) { - payload_apply_callback(&payload_apply_xdata, &decoded_context, db, data, CLOUDSYNC_PAYLOAD_APPLY_DID_APPLY, rc); + + // Update PK/table tracking + last_pk = decoded_context.pk; + last_pk_len = decoded_context.pk_len; + last_tbl = decoded_context.tbl; + last_tbl_len = decoded_context.tbl_len; + + rc = databasevm_step(vm); + if (rc != DBRES_DONE) { + // don't "break;", the error can be due to a RLS policy. 
+ // in case of error we try to apply the following changes } - + buffer += seek; buf_len -= seek; dbvm_reset(vm); } - + + // Final flush after loop + { + int flush_rc = merge_flush_pending(data); + if (flush_rc != DBRES_OK && rc == DBRES_OK) rc = flush_rc; + } + data->pending_batch = NULL; + if (in_savepoint) { int rc1 = database_commit_savepoint(data, "cloudsync_payload_apply"); if (rc1 != DBRES_OK) rc = rc1; @@ -2512,10 +3187,6 @@ int cloudsync_payload_apply (cloudsync_context *data, const char *payload, int b if (rc != DBRES_OK && rc != DBRES_DONE) { cloudsync_set_dberror(data); } - - if (payload_apply_callback) { - payload_apply_callback(&payload_apply_xdata, &decoded_context, db, data, CLOUDSYNC_PAYLOAD_APPLY_CLEANUP, rc); - } if (rc == DBRES_DONE) rc = DBRES_OK; if (rc == DBRES_OK) { @@ -2532,15 +3203,20 @@ int cloudsync_payload_apply (cloudsync_context *data, const char *payload, int b } cleanup: + // cleanup merge_pending_batch + if (batch.cached_vm) { databasevm_finalize(batch.cached_vm); batch.cached_vm = NULL; } + if (batch.cached_col_names) { cloudsync_memory_free(batch.cached_col_names); batch.cached_col_names = NULL; } + if (batch.entries) { cloudsync_memory_free(batch.entries); batch.entries = NULL; } + // cleanup vm if (vm) databasevm_finalize(vm); - + // cleanup memory if (clone) cloudsync_memory_free(clone); - + // error already saved in (save last error) if (rc != DBRES_OK) return rc; - + // return the number of processed rows if (pnrows) *pnrows = nrows; return DBRES_OK; @@ -2548,21 +3224,18 @@ int cloudsync_payload_apply (cloudsync_context *data, const char *payload, int b // MARK: - Payload load/store - -int cloudsync_payload_get (cloudsync_context *data, char **blob, int *blob_size, int *db_version, int *seq, int64_t *new_db_version, int64_t *new_seq) { +int cloudsync_payload_get (cloudsync_context *data, char **blob, int *blob_size, int *db_version, int64_t *new_db_version) { // retrieve current db_version and seq *db_version = 
dbutils_settings_get_int_value(data, CLOUDSYNC_KEY_SEND_DBVERSION); if (*db_version < 0) return DBRES_ERROR; - - *seq = dbutils_settings_get_int_value(data, CLOUDSYNC_KEY_SEND_SEQ); - if (*seq < 0) return DBRES_ERROR; // retrieve BLOB char sql[1024]; snprintf(sql, sizeof(sql), "WITH max_db_version AS (SELECT MAX(db_version) AS max_db_version FROM cloudsync_changes WHERE site_id=cloudsync_siteid()) " - "SELECT * FROM (SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) AS payload, max_db_version AS max_db_version, MAX(IIF(db_version = max_db_version, seq, 0)) FROM cloudsync_changes, max_db_version WHERE site_id=cloudsync_siteid() AND (db_version>%d OR (db_version=%d AND seq>%d))) WHERE payload IS NOT NULL", *db_version, *db_version, *seq); + "SELECT * FROM (SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) AS payload, max_db_version AS max_db_version FROM cloudsync_changes, max_db_version WHERE site_id=cloudsync_siteid() AND db_version>%d) WHERE payload IS NOT NULL", *db_version); int64_t len = 0; - int rc = database_select_blob_2int(data, sql, blob, &len, new_db_version, new_seq); + int rc = database_select_blob_int(data, sql, blob, &len, new_db_version); *blob_size = (int)len; if (rc != DBRES_OK) return rc; @@ -2580,12 +3253,11 @@ int cloudsync_payload_save (cloudsync_context *data, const char *payload_path, i // retrieve payload char *blob = NULL; - int blob_size = 0, db_version = 0, seq = 0; - int64_t new_db_version = 0, new_seq = 0; - int rc = cloudsync_payload_get(data, &blob, &blob_size, &db_version, &seq, &new_db_version, &new_seq); + int blob_size = 0, db_version = 0; + int64_t new_db_version = 0; + int rc = cloudsync_payload_get(data, &blob, &blob_size, &db_version, &new_db_version); if (rc != DBRES_OK) { if (db_version < 0) return cloudsync_set_error(data, "Unable to retrieve db_version", rc); - else if (seq < 0) return cloudsync_set_error(data, 
"Unable to retrieve seq", rc); return cloudsync_set_error(data, "Unable to retrieve changes in cloudsync_payload_save", rc); } @@ -2602,18 +3274,6 @@ int cloudsync_payload_save (cloudsync_context *data, const char *payload_path, i return cloudsync_set_error(data, "Unable to write payload to file path", DBRES_IOERR); } - // TODO: dbutils_settings_set_key_value remove context and return error here (in case of error) - // update db_version and seq - char buf[256]; - if (new_db_version != db_version) { - snprintf(buf, sizeof(buf), "%" PRId64, new_db_version); - dbutils_settings_set_key_value(data, CLOUDSYNC_KEY_SEND_DBVERSION, buf); - } - if (new_seq != seq) { - snprintf(buf, sizeof(buf), "%" PRId64, new_seq); - dbutils_settings_set_key_value(data, CLOUDSYNC_KEY_SEND_SEQ, buf); - } - // returns blob size if (size) *size = blob_size; return DBRES_OK; @@ -2678,6 +3338,7 @@ int cloudsync_table_sanity_check (cloudsync_context *data, const char *name, boo } // if user declared explicit primary key(s) then make sure they are all declared as NOT NULL + #if CLOUDSYNC_CHECK_NOTNULL_PRIKEYS if (npri_keys > 0) { int npri_keys_notnull = database_count_pk(data, name, true, cloudsync_schema(data)); if (npri_keys_notnull < 0) return cloudsync_set_dberror(data); @@ -2686,6 +3347,7 @@ int cloudsync_table_sanity_check (cloudsync_context *data, const char *name, boo return cloudsync_set_error(data, buffer, DBRES_ERROR); } } + #endif // check for columns declared as NOT NULL without a DEFAULT value. // Otherwise, col_merge_stmt would fail if changes to other columns are inserted first. 
diff --git a/src/cloudsync.h b/src/cloudsync.h index 84dfe4a..8673d5f 100644 --- a/src/cloudsync.h +++ b/src/cloudsync.h @@ -12,12 +12,13 @@ #include #include #include "database.h" +#include "block.h" #ifdef __cplusplus extern "C" { #endif -#define CLOUDSYNC_VERSION "0.9.112" +#define CLOUDSYNC_VERSION "0.9.200" #define CLOUDSYNC_MAX_TABLENAME_LEN 512 #define CLOUDSYNC_VALUE_NOTSET -1 @@ -28,12 +29,6 @@ extern "C" { #define CLOUDSYNC_CHANGES_NCOLS 9 -typedef enum { - CLOUDSYNC_PAYLOAD_APPLY_WILL_APPLY = 1, - CLOUDSYNC_PAYLOAD_APPLY_DID_APPLY = 2, - CLOUDSYNC_PAYLOAD_APPLY_CLEANUP = 3 -} CLOUDSYNC_PAYLOAD_APPLY_STEPS; - // CRDT Algos table_algo cloudsync_algo_from_name (const char *algo_name); const char *cloudsync_algo_name (table_algo algo); @@ -89,7 +84,7 @@ int cloudsync_payload_encode_step (cloudsync_payload_context *payload, clouds int cloudsync_payload_encode_final (cloudsync_payload_context *payload, cloudsync_context *data); char *cloudsync_payload_blob (cloudsync_payload_context *payload, int64_t *blob_size, int64_t *nrows); size_t cloudsync_payload_context_size (size_t *header_size); -int cloudsync_payload_get (cloudsync_context *data, char **blob, int *blob_size, int *db_version, int *seq, int64_t *new_db_version, int64_t *new_seq); +int cloudsync_payload_get (cloudsync_context *data, char **blob, int *blob_size, int *db_version, int64_t *new_db_version); int cloudsync_payload_save (cloudsync_context *data, const char *payload_path, int *blob_size); // available only on Desktop OS (no WASM, no mobile) // CloudSync table context @@ -109,16 +104,33 @@ const char *table_schema (cloudsync_table_context *table); int table_remove (cloudsync_context *data, cloudsync_table_context *table); void table_free (cloudsync_table_context *table); +// Block-level LWW support +bool table_has_block_cols (cloudsync_table_context *table); +col_algo_t table_col_algo (cloudsync_table_context *table, int index); +const char *table_col_delimiter (cloudsync_table_context *table, 
int index); +int table_col_index (cloudsync_table_context *table, const char *col_name); +int block_materialize_column (cloudsync_context *data, cloudsync_table_context *table, const void *pk, int pklen, const char *base_col_name); +int cloudsync_setup_block_column (cloudsync_context *data, const char *table_name, const char *col_name, const char *delimiter); + +// Block column accessors (avoids accessing opaque struct from outside cloudsync.c) +dbvm_t *table_block_value_read_stmt (cloudsync_table_context *table); +dbvm_t *table_block_value_write_stmt (cloudsync_table_context *table); +dbvm_t *table_block_list_stmt (cloudsync_table_context *table); +const char *table_blocks_ref (cloudsync_table_context *table); +void table_set_col_delimiter (cloudsync_table_context *table, int col_idx, const char *delimiter); + // local merge/apply -int local_mark_insert_sentinel_meta (cloudsync_table_context *table, const char *pk, size_t pklen, int64_t db_version, int seq); -int local_update_sentinel (cloudsync_table_context *table, const char *pk, size_t pklen, int64_t db_version, int seq); -int local_mark_insert_or_update_meta (cloudsync_table_context *table, const char *pk, size_t pklen, const char *col_name, int64_t db_version, int seq); -int local_mark_delete_meta (cloudsync_table_context *table, const char *pk, size_t pklen, int64_t db_version, int seq); -int local_drop_meta (cloudsync_table_context *table, const char *pk, size_t pklen); -int local_update_move_meta (cloudsync_table_context *table, const char *pk, size_t pklen, const char *pk2, size_t pklen2, int64_t db_version); +int local_mark_insert_sentinel_meta (cloudsync_table_context *table, const void *pk, size_t pklen, int64_t db_version, int seq); +int local_update_sentinel (cloudsync_table_context *table, const void *pk, size_t pklen, int64_t db_version, int seq); +int local_mark_insert_or_update_meta (cloudsync_table_context *table, const void *pk, size_t pklen, const char *col_name, int64_t db_version, int seq); 
+int local_mark_delete_meta (cloudsync_table_context *table, const void *pk, size_t pklen, int64_t db_version, int seq); +int local_mark_delete_block_meta (cloudsync_table_context *table, const void *pk, size_t pklen, const char *block_colname, int64_t db_version, int seq); +int block_delete_value_external (cloudsync_context *data, cloudsync_table_context *table, const void *pk, size_t pklen, const char *block_colname); +int local_drop_meta (cloudsync_table_context *table, const void *pk, size_t pklen); +int local_update_move_meta (cloudsync_table_context *table, const void *pk, size_t pklen, const void *pk2, size_t pklen2, int64_t db_version); // used by changes virtual table -int merge_insert_col (cloudsync_context *data, cloudsync_table_context *table, const char *pk, int pklen, const char *col_name, dbvalue_t *col_value, int64_t col_version, int64_t db_version, const char *site_id, int site_len, int64_t seq, int64_t *rowid); +int merge_insert_col (cloudsync_context *data, cloudsync_table_context *table, const void *pk, int pklen, const char *col_name, dbvalue_t *col_value, int64_t col_version, int64_t db_version, const char *site_id, int site_len, int64_t seq, int64_t *rowid); int merge_insert (cloudsync_context *data, cloudsync_table_context *table, const char *insert_pk, int insert_pk_len, int64_t insert_cl, const char *insert_name, dbvalue_t *insert_value, int64_t insert_col_version, int64_t insert_db_version, const char *insert_site_id, int insert_site_id_len, int64_t insert_seq, int64_t *rowid); // filter rewrite diff --git a/src/database.h b/src/database.h index f5324a3..56bb2d6 100644 --- a/src/database.h +++ b/src/database.h @@ -64,7 +64,7 @@ int database_exec_callback (cloudsync_context *data, const char *sql, database_ int database_select_int (cloudsync_context *data, const char *sql, int64_t *value); int database_select_text (cloudsync_context *data, const char *sql, char **value); int database_select_blob (cloudsync_context *data, const char *sql, 
char **value, int64_t *value_len); -int database_select_blob_2int (cloudsync_context *data, const char *sql, char **value, int64_t *value_len, int64_t *value2, int64_t *value3); +int database_select_blob_int (cloudsync_context *data, const char *sql, char **value, int64_t *value_len, int64_t *value2); int database_write (cloudsync_context *data, const char *sql, const char **values, DBTYPE types[], int lens[], int count); bool database_table_exists (cloudsync_context *data, const char *table_name, const char *schema); bool database_internal_table_exists (cloudsync_context *data, const char *name); @@ -119,7 +119,7 @@ void database_value_free (dbvalue_t *value); void *database_value_dup (dbvalue_t *value); // COLUMN -const void *database_column_blob (dbvm_t *vm, int index); +const void *database_column_blob (dbvm_t *vm, int index, size_t *len); double database_column_double (dbvm_t *vm, int index); int64_t database_column_int (dbvm_t *vm, int index); const char *database_column_text (dbvm_t *vm, int index); @@ -142,6 +142,8 @@ char *sql_build_select_nonpk_by_pk (cloudsync_context *data, const char *table_n char *sql_build_delete_by_pk (cloudsync_context *data, const char *table_name, const char *schema); char *sql_build_insert_pk_ignore (cloudsync_context *data, const char *table_name, const char *schema); char *sql_build_upsert_pk_and_col (cloudsync_context *data, const char *table_name, const char *colname, const char *schema); +char *sql_build_upsert_pk_and_multi_cols (cloudsync_context *data, const char *table_name, const char **colnames, int ncolnames, const char *schema); +char *sql_build_update_pk_and_multi_cols (cloudsync_context *data, const char *table_name, const char **colnames, int ncolnames, const char *schema); char *sql_build_select_cols_by_pk (cloudsync_context *data, const char *table_name, const char *colname, const char *schema); char *sql_build_rekey_pk_and_reset_version_except_col (cloudsync_context *data, const char *table_name, const char 
*except_col); char *sql_build_delete_cols_not_in_schema_query(const char *schema, const char *table_name, const char *meta_ref, const char *pkcol); @@ -153,11 +155,9 @@ char *sql_build_insert_missing_pks_query(const char *schema, const char *table_n char *database_table_schema(const char *table_name); char *database_build_meta_ref(const char *schema, const char *table_name); char *database_build_base_ref(const char *schema, const char *table_name); +char *database_build_blocks_ref(const char *schema, const char *table_name); -// USED ONLY by SQLite Cloud to implement RLS +// OPAQUE STRUCT used by pk_context functions typedef struct cloudsync_pk_decode_bind_context cloudsync_pk_decode_bind_context; -typedef bool (*cloudsync_payload_apply_callback_t)(void **xdata, cloudsync_pk_decode_bind_context *decoded_change, void *db, void *data, int step, int rc); -void cloudsync_set_payload_apply_callback(void *db, cloudsync_payload_apply_callback_t callback); -cloudsync_payload_apply_callback_t cloudsync_get_payload_apply_callback(void *db); #endif diff --git a/src/dbutils.c b/src/dbutils.c index 48fdb72..67bfeb8 100644 --- a/src/dbutils.c +++ b/src/dbutils.c @@ -357,19 +357,33 @@ int dbutils_settings_table_load_callback (void *xdata, int ncols, char **values, for (int i=0; i+3 + +#ifdef __cplusplus +extern "C" { +#endif + +#ifdef JSMN_STATIC +#define JSMN_API static +#else +#define JSMN_API extern +#endif + +/** + * JSON type identifier. 
Basic types are: + * o Object + * o Array + * o String + * o Other primitive: number, boolean (true/false) or null + */ +typedef enum { + JSMN_UNDEFINED = 0, + JSMN_OBJECT = 1 << 0, + JSMN_ARRAY = 1 << 1, + JSMN_STRING = 1 << 2, + JSMN_PRIMITIVE = 1 << 3 +} jsmntype_t; + +enum jsmnerr { + /* Not enough tokens were provided */ + JSMN_ERROR_NOMEM = -1, + /* Invalid character inside JSON string */ + JSMN_ERROR_INVAL = -2, + /* The string is not a full JSON packet, more bytes expected */ + JSMN_ERROR_PART = -3 +}; + +/** + * JSON token description. + * type type (object, array, string etc.) + * start start position in JSON data string + * end end position in JSON data string + */ +typedef struct jsmntok { + jsmntype_t type; + int start; + int end; + int size; +#ifdef JSMN_PARENT_LINKS + int parent; +#endif +} jsmntok_t; + +/** + * JSON parser. Contains an array of token blocks available. Also stores + * the string being parsed now and current position in that string. + */ +typedef struct jsmn_parser { + unsigned int pos; /* offset in the JSON string */ + unsigned int toknext; /* next token to allocate */ + int toksuper; /* superior token node, e.g. parent object or array */ +} jsmn_parser; + +/** + * Create JSON parser over an array of tokens + */ +JSMN_API void jsmn_init(jsmn_parser *parser); + +/** + * Run JSON parser. It parses a JSON data string into and array of tokens, each + * describing + * a single JSON object. + */ +JSMN_API int jsmn_parse(jsmn_parser *parser, const char *js, const size_t len, + jsmntok_t *tokens, const unsigned int num_tokens); + +#ifndef JSMN_HEADER +/** + * Allocates a fresh unused token from the token pool. 
+ */ +static jsmntok_t *jsmn_alloc_token(jsmn_parser *parser, jsmntok_t *tokens, + const size_t num_tokens) { + jsmntok_t *tok; + if (parser->toknext >= num_tokens) { + return NULL; + } + tok = &tokens[parser->toknext++]; + tok->start = tok->end = -1; + tok->size = 0; +#ifdef JSMN_PARENT_LINKS + tok->parent = -1; +#endif + return tok; +} + +/** + * Fills token type and boundaries. + */ +static void jsmn_fill_token(jsmntok_t *token, const jsmntype_t type, + const int start, const int end) { + token->type = type; + token->start = start; + token->end = end; + token->size = 0; +} + +/** + * Fills next available token with JSON primitive. + */ +static int jsmn_parse_primitive(jsmn_parser *parser, const char *js, + const size_t len, jsmntok_t *tokens, + const size_t num_tokens) { + jsmntok_t *token; + int start; + + start = parser->pos; + + for (; parser->pos < len && js[parser->pos] != '\0'; parser->pos++) { + switch (js[parser->pos]) { +#ifndef JSMN_STRICT + /* In strict mode primitive must be followed by "," or "}" or "]" */ + case ':': +#endif + case '\t': + case '\r': + case '\n': + case ' ': + case ',': + case ']': + case '}': + goto found; + default: + /* to quiet a warning from gcc*/ + break; + } + if (js[parser->pos] < 32 || js[parser->pos] >= 127) { + parser->pos = start; + return JSMN_ERROR_INVAL; + } + } +#ifdef JSMN_STRICT + /* In strict mode primitive must be followed by a comma/object/array */ + parser->pos = start; + return JSMN_ERROR_PART; +#endif + +found: + if (tokens == NULL) { + parser->pos--; + return 0; + } + token = jsmn_alloc_token(parser, tokens, num_tokens); + if (token == NULL) { + parser->pos = start; + return JSMN_ERROR_NOMEM; + } + jsmn_fill_token(token, JSMN_PRIMITIVE, start, parser->pos); +#ifdef JSMN_PARENT_LINKS + token->parent = parser->toksuper; +#endif + parser->pos--; + return 0; +} + +/** + * Fills next token with JSON string. 
+ */ +static int jsmn_parse_string(jsmn_parser *parser, const char *js, + const size_t len, jsmntok_t *tokens, + const size_t num_tokens) { + jsmntok_t *token; + + int start = parser->pos; + + /* Skip starting quote */ + parser->pos++; + + for (; parser->pos < len && js[parser->pos] != '\0'; parser->pos++) { + char c = js[parser->pos]; + + /* Quote: end of string */ + if (c == '\"') { + if (tokens == NULL) { + return 0; + } + token = jsmn_alloc_token(parser, tokens, num_tokens); + if (token == NULL) { + parser->pos = start; + return JSMN_ERROR_NOMEM; + } + jsmn_fill_token(token, JSMN_STRING, start + 1, parser->pos); +#ifdef JSMN_PARENT_LINKS + token->parent = parser->toksuper; +#endif + return 0; + } + + /* Backslash: Quoted symbol expected */ + if (c == '\\' && parser->pos + 1 < len) { + int i; + parser->pos++; + switch (js[parser->pos]) { + /* Allowed escaped symbols */ + case '\"': + case '/': + case '\\': + case 'b': + case 'f': + case 'r': + case 'n': + case 't': + break; + /* Allows escaped symbol \uXXXX */ + case 'u': + parser->pos++; + for (i = 0; i < 4 && parser->pos < len && js[parser->pos] != '\0'; + i++) { + /* If it isn't a hex character we have an error */ + if (!((js[parser->pos] >= 48 && js[parser->pos] <= 57) || /* 0-9 */ + (js[parser->pos] >= 65 && js[parser->pos] <= 70) || /* A-F */ + (js[parser->pos] >= 97 && js[parser->pos] <= 102))) { /* a-f */ + parser->pos = start; + return JSMN_ERROR_INVAL; + } + parser->pos++; + } + parser->pos--; + break; + /* Unexpected symbol */ + default: + parser->pos = start; + return JSMN_ERROR_INVAL; + } + } + } + parser->pos = start; + return JSMN_ERROR_PART; +} + +/** + * Parse JSON string and fill tokens. 
+ */ +JSMN_API int jsmn_parse(jsmn_parser *parser, const char *js, const size_t len, + jsmntok_t *tokens, const unsigned int num_tokens) { + int r; + int i; + jsmntok_t *token; + int count = parser->toknext; + + for (; parser->pos < len && js[parser->pos] != '\0'; parser->pos++) { + char c; + jsmntype_t type; + + c = js[parser->pos]; + switch (c) { + case '{': + case '[': + count++; + if (tokens == NULL) { + break; + } + token = jsmn_alloc_token(parser, tokens, num_tokens); + if (token == NULL) { + return JSMN_ERROR_NOMEM; + } + if (parser->toksuper != -1) { + jsmntok_t *t = &tokens[parser->toksuper]; +#ifdef JSMN_STRICT + /* In strict mode an object or array can't become a key */ + if (t->type == JSMN_OBJECT) { + return JSMN_ERROR_INVAL; + } +#endif + t->size++; +#ifdef JSMN_PARENT_LINKS + token->parent = parser->toksuper; +#endif + } + token->type = (c == '{' ? JSMN_OBJECT : JSMN_ARRAY); + token->start = parser->pos; + parser->toksuper = parser->toknext - 1; + break; + case '}': + case ']': + if (tokens == NULL) { + break; + } + type = (c == '}' ? 
JSMN_OBJECT : JSMN_ARRAY); +#ifdef JSMN_PARENT_LINKS + if (parser->toknext < 1) { + return JSMN_ERROR_INVAL; + } + token = &tokens[parser->toknext - 1]; + for (;;) { + if (token->start != -1 && token->end == -1) { + if (token->type != type) { + return JSMN_ERROR_INVAL; + } + token->end = parser->pos + 1; + parser->toksuper = token->parent; + break; + } + if (token->parent == -1) { + if (token->type != type || parser->toksuper == -1) { + return JSMN_ERROR_INVAL; + } + break; + } + token = &tokens[token->parent]; + } +#else + for (i = parser->toknext - 1; i >= 0; i--) { + token = &tokens[i]; + if (token->start != -1 && token->end == -1) { + if (token->type != type) { + return JSMN_ERROR_INVAL; + } + parser->toksuper = -1; + token->end = parser->pos + 1; + break; + } + } + /* Error if unmatched closing bracket */ + if (i == -1) { + return JSMN_ERROR_INVAL; + } + for (; i >= 0; i--) { + token = &tokens[i]; + if (token->start != -1 && token->end == -1) { + parser->toksuper = i; + break; + } + } +#endif + break; + case '\"': + r = jsmn_parse_string(parser, js, len, tokens, num_tokens); + if (r < 0) { + return r; + } + count++; + if (parser->toksuper != -1 && tokens != NULL) { + tokens[parser->toksuper].size++; + } + break; + case '\t': + case '\r': + case '\n': + case ' ': + break; + case ':': + parser->toksuper = parser->toknext - 1; + break; + case ',': + if (tokens != NULL && parser->toksuper != -1 && + tokens[parser->toksuper].type != JSMN_ARRAY && + tokens[parser->toksuper].type != JSMN_OBJECT) { +#ifdef JSMN_PARENT_LINKS + parser->toksuper = tokens[parser->toksuper].parent; +#else + for (i = parser->toknext - 1; i >= 0; i--) { + if (tokens[i].type == JSMN_ARRAY || tokens[i].type == JSMN_OBJECT) { + if (tokens[i].start != -1 && tokens[i].end == -1) { + parser->toksuper = i; + break; + } + } + } +#endif + } + break; +#ifdef JSMN_STRICT + /* In strict mode primitives are: numbers and booleans */ + case '-': + case '0': + case '1': + case '2': + case '3': + case '4': + 
case '5': + case '6': + case '7': + case '8': + case '9': + case 't': + case 'f': + case 'n': + /* And they must not be keys of the object */ + if (tokens != NULL && parser->toksuper != -1) { + const jsmntok_t *t = &tokens[parser->toksuper]; + if (t->type == JSMN_OBJECT || + (t->type == JSMN_STRING && t->size != 0)) { + return JSMN_ERROR_INVAL; + } + } +#else + /* In non-strict mode every unquoted value is a primitive */ + default: +#endif + r = jsmn_parse_primitive(parser, js, len, tokens, num_tokens); + if (r < 0) { + return r; + } + count++; + if (parser->toksuper != -1 && tokens != NULL) { + tokens[parser->toksuper].size++; + } + break; + +#ifdef JSMN_STRICT + /* Unexpected char in strict mode */ + default: + return JSMN_ERROR_INVAL; +#endif + } + } + + if (tokens != NULL) { + for (i = parser->toknext - 1; i >= 0; i--) { + /* Unmatched opened object or array */ + if (tokens[i].start != -1 && tokens[i].end == -1) { + return JSMN_ERROR_PART; + } + } + } + + return count; +} + +/** + * Creates a new parser based over a given buffer with an array of tokens + * available. 
+ */ +JSMN_API void jsmn_init(jsmn_parser *parser) { + parser->pos = 0; + parser->toknext = 0; + parser->toksuper = -1; +} + +#endif /* JSMN_HEADER */ + +#ifdef __cplusplus +} +#endif + +#endif /* JSMN_H */ diff --git a/src/network.c b/src/network/network.c similarity index 61% rename from src/network.c rename to src/network/network.c index f3133c5..6660005 100644 --- a/src/network.c +++ b/src/network/network.c @@ -9,13 +9,17 @@ #include #include +#include #include "network.h" -#include "utils.h" -#include "dbutils.h" -#include "cloudsync.h" +#include "../utils.h" +#include "../dbutils.h" +#include "../cloudsync.h" #include "network_private.h" +#define JSMN_STATIC +#include "jsmn.h" + #ifndef SQLITE_WASM_EXTRA_INIT #ifndef CLOUDSYNC_OMIT_CURL #include "curl/curl.h" @@ -47,9 +51,11 @@ SQLITE_EXTENSION_INIT3 struct network_data { char site_id[UUID_STR_MAXLEN]; char *authentication; // apikey or token + char *org_id; // organization ID for X-CloudSync-Org header char *check_endpoint; char *upload_endpoint; char *apply_endpoint; + char *status_endpoint; }; typedef struct { @@ -80,27 +86,34 @@ char *network_data_get_siteid (network_data *data) { return data->site_id; } -bool network_data_set_endpoints (network_data *data, char *auth, char *check, char *upload, char *apply) { +char *network_data_get_orgid (network_data *data) { + return data->org_id; +} + +bool network_data_set_endpoints (network_data *data, char *auth, char *check, char *upload, char *apply, char *status) { // sanity check if (!check || !upload) return false; - + // always free previous owned pointers if (data->authentication) cloudsync_memory_free(data->authentication); if (data->check_endpoint) cloudsync_memory_free(data->check_endpoint); if (data->upload_endpoint) cloudsync_memory_free(data->upload_endpoint); if (data->apply_endpoint) cloudsync_memory_free(data->apply_endpoint); + if (data->status_endpoint) cloudsync_memory_free(data->status_endpoint); // clear pointers data->authentication = NULL; 
data->check_endpoint = NULL; data->upload_endpoint = NULL; data->apply_endpoint = NULL; + data->status_endpoint = NULL; // make a copy of the new endpoints char *auth_copy = NULL; char *check_copy = NULL; char *upload_copy = NULL; char *apply_copy = NULL; + char *status_copy = NULL; // auth is optional if (auth) { @@ -109,34 +122,41 @@ bool network_data_set_endpoints (network_data *data, char *auth, char *check, ch } check_copy = cloudsync_string_dup(check); if (!check_copy) goto abort_endpoints; - + upload_copy = cloudsync_string_dup(upload); if (!upload_copy) goto abort_endpoints; - + apply_copy = cloudsync_string_dup(apply); if (!apply_copy) goto abort_endpoints; + status_copy = cloudsync_string_dup(status); + if (!status_copy) goto abort_endpoints; + data->authentication = auth_copy; data->check_endpoint = check_copy; data->upload_endpoint = upload_copy; data->apply_endpoint = apply_copy; + data->status_endpoint = status_copy; return true; - + abort_endpoints: if (auth_copy) cloudsync_memory_free(auth_copy); if (check_copy) cloudsync_memory_free(check_copy); if (upload_copy) cloudsync_memory_free(upload_copy); if (apply_copy) cloudsync_memory_free(apply_copy); + if (status_copy) cloudsync_memory_free(status_copy); return false; } void network_data_free (network_data *data) { if (!data) return; - + if (data->authentication) cloudsync_memory_free(data->authentication); + if (data->org_id) cloudsync_memory_free(data->org_id); if (data->check_endpoint) cloudsync_memory_free(data->check_endpoint); if (data->upload_endpoint) cloudsync_memory_free(data->upload_endpoint); if (data->apply_endpoint) cloudsync_memory_free(data->apply_endpoint); + if (data->status_endpoint) cloudsync_memory_free(data->status_endpoint); cloudsync_memory_free(data); } @@ -205,6 +225,14 @@ NETWORK_RESULT network_receive_buffer (network_data *data, const char *endpoint, headers = tmp; } + if (data->org_id) { + char org_header[512]; + snprintf(org_header, sizeof(org_header), "%s: %s", 
CLOUDSYNC_HEADER_ORG, data->org_id); + struct curl_slist *tmp = curl_slist_append(headers, org_header); + if (!tmp) {rc = CURLE_OUT_OF_MEMORY; goto cleanup;} + headers = tmp; + } + if (json_payload) { struct curl_slist *tmp = curl_slist_append(headers, "Content-Type: application/json"); if (!tmp) {rc = CURLE_OUT_OF_MEMORY; goto cleanup;} @@ -317,7 +345,15 @@ bool network_send_buffer (network_data *data, const char *endpoint, const char * if (!tmp) {rc = CURLE_OUT_OF_MEMORY; goto cleanup;} headers = tmp; } - + + if (data->org_id) { + char org_header[512]; + snprintf(org_header, sizeof(org_header), "%s: %s", CLOUDSYNC_HEADER_ORG, data->org_id); + struct curl_slist *tmp = curl_slist_append(headers, org_header); + if (!tmp) {rc = CURLE_OUT_OF_MEMORY; goto cleanup;} + headers = tmp; + } + // Set headers if needed (S3 pre-signed URLs usually do not require additional headers) tmp = curl_slist_append(headers, "Content-Type: application/octet-stream"); if (!tmp) {rc = CURLE_OUT_OF_MEMORY; goto cleanup;} @@ -414,6 +450,113 @@ char *network_authentication_token (const char *key, const char *value) { return buffer; } +// MARK: - JSON helpers (jsmn) - + +#define JSMN_MAX_TOKENS 64 + +static bool jsmn_token_eq(const char *json, const jsmntok_t *tok, const char *s) { + return (tok->type == JSMN_STRING && + (int)strlen(s) == tok->end - tok->start && + strncmp(json + tok->start, s, tok->end - tok->start) == 0); +} + +static int jsmn_find_key(const char *json, const jsmntok_t *tokens, int ntokens, const char *key) { + for (int i = 1; i + 1 < ntokens; i++) { + if (jsmn_token_eq(json, &tokens[i], key)) return i; + } + return -1; +} + +static char *json_unescape_string(const char *src, int len) { + char *out = cloudsync_memory_zeroalloc(len + 1); + if (!out) return NULL; + + int j = 0; + for (int i = 0; i < len; ) { + if (src[i] == '\\' && i + 1 < len) { + char c = src[i + 1]; + if (c == '"' || c == '\\' || c == '/') { out[j++] = c; i += 2; } + else if (c == 'n') { out[j++] = '\n'; i 
+= 2; } + else if (c == 'r') { out[j++] = '\r'; i += 2; } + else if (c == 't') { out[j++] = '\t'; i += 2; } + else if (c == 'b') { out[j++] = '\b'; i += 2; } + else if (c == 'f') { out[j++] = '\f'; i += 2; } + else if (c == 'u' && i + 5 < len) { + unsigned int cp = 0; + for (int k = 0; k < 4; k++) { + char h = src[i + 2 + k]; + cp <<= 4; + if (h >= '0' && h <= '9') cp |= h - '0'; + else if (h >= 'a' && h <= 'f') cp |= 10 + h - 'a'; + else if (h >= 'A' && h <= 'F') cp |= 10 + h - 'A'; + } + if (cp < 0x80) { out[j++] = (char)cp; } + else { out[j++] = '?'; } // non-ASCII: replace + i += 6; + } + else { out[j++] = src[i]; i++; } + } else { + out[j++] = src[i]; i++; + } + } + out[j] = '\0'; + return out; +} + +static char *json_extract_string(const char *json, size_t json_len, const char *key) { + if (!json || json_len == 0 || !key) return NULL; + + jsmn_parser parser; + jsmntok_t tokens[JSMN_MAX_TOKENS]; + jsmn_init(&parser); + int ntokens = jsmn_parse(&parser, json, json_len, tokens, JSMN_MAX_TOKENS); + if (ntokens < 1 || tokens[0].type != JSMN_OBJECT) return NULL; + + int i = jsmn_find_key(json, tokens, ntokens, key); + if (i < 0 || i + 1 >= ntokens) return NULL; + + jsmntok_t *val = &tokens[i + 1]; + if (val->type != JSMN_STRING) return NULL; + + return json_unescape_string(json + val->start, val->end - val->start); +} + +static int64_t json_extract_int(const char *json, size_t json_len, const char *key, int64_t default_value) { + if (!json || json_len == 0 || !key) return default_value; + + jsmn_parser parser; + jsmntok_t tokens[JSMN_MAX_TOKENS]; + jsmn_init(&parser); + int ntokens = jsmn_parse(&parser, json, json_len, tokens, JSMN_MAX_TOKENS); + if (ntokens < 1 || tokens[0].type != JSMN_OBJECT) return default_value; + + int i = jsmn_find_key(json, tokens, ntokens, key); + if (i < 0 || i + 1 >= ntokens) return default_value; + + jsmntok_t *val = &tokens[i + 1]; + if (val->type != JSMN_PRIMITIVE) return default_value; + + return strtoll(json + val->start, NULL, 10); +} + +static int
json_extract_array_size(const char *json, size_t json_len, const char *key) { + if (!json || json_len == 0 || !key) return -1; + + jsmn_parser parser; + jsmntok_t tokens[JSMN_MAX_TOKENS]; + jsmn_init(&parser); + int ntokens = jsmn_parse(&parser, json, json_len, tokens, JSMN_MAX_TOKENS); + if (ntokens < 1 || tokens[0].type != JSMN_OBJECT) return -1; + + int i = jsmn_find_key(json, tokens, ntokens, key); + if (i < 0 || i + 1 >= ntokens) return -1; + + jsmntok_t *val = &tokens[i + 1]; + if (val->type != JSMN_ARRAY) return -1; + + return val->size; +} + int network_extract_query_param (const char *query, const char *key, char *output, size_t output_size) { if (!query || !key || !output || output_size == 0) { return -1; // Invalid input @@ -457,161 +600,61 @@ int network_extract_query_param (const char *query, const char *key, char *outpu return -3; // Key not found } -#if !defined(CLOUDSYNC_OMIT_CURL) || defined(SQLITE_WASM_EXTRA_INIT) -bool network_compute_endpoints (sqlite3_context *context, network_data *data, const char *conn_string) { - // compute endpoints - bool result = false; - - char *scheme = NULL; - char *host = NULL; - char *port = NULL; - char *database = NULL; - char *query = NULL; - - char *authentication = NULL; - char *check_endpoint = NULL; - char *upload_endpoint = NULL; - char *apply_endpoint = NULL; - - char *conn_string_https = NULL; - - #ifndef SQLITE_WASM_EXTRA_INIT - CURLUcode rc = CURLUE_OUT_OF_MEMORY; - CURLU *url = curl_url(); - if (!url) goto finalize; - #endif - - conn_string_https = cloudsync_string_replace_prefix(conn_string, "sqlitecloud://", "https://"); - if (!conn_string_https) goto finalize; - - #ifndef SQLITE_WASM_EXTRA_INIT - // set URL: https://UUID.g5.sqlite.cloud:443/chinook.sqlite?apikey=hWDanFolRT9WDK0p54lufNrIyfgLZgtMw6tb6fbPmpo - rc = curl_url_set(url, CURLUPART_URL, conn_string_https, 0); - if (rc != CURLUE_OK) goto finalize; - - // https (MANDATORY) - rc = curl_url_get(url, CURLUPART_SCHEME, &scheme, 0); - if (rc != 
CURLUE_OK) goto finalize; - - // UUID.g5.sqlite.cloud (MANDATORY) - rc = curl_url_get(url, CURLUPART_HOST, &host, 0); - if (rc != CURLUE_OK) goto finalize; - - // 443 (OPTIONAL) - rc = curl_url_get(url, CURLUPART_PORT, &port, 0); - if (rc != CURLUE_OK && rc != CURLUE_NO_PORT) goto finalize; - char *port_or_default = port && strcmp(port, "8860") != 0 ? port : CLOUDSYNC_DEFAULT_ENDPOINT_PORT; - - // /chinook.sqlite (MANDATORY) - rc = curl_url_get(url, CURLUPART_PATH, &database, 0); - if (rc != CURLUE_OK) goto finalize; - - // apikey=hWDanFolRT9WDK0p54lufNrIyfgLZgtMw6tb6fbPmpo (OPTIONAL) - rc = curl_url_get(url, CURLUPART_QUERY, &query, 0); - if (rc != CURLUE_OK && rc != CURLUE_NO_QUERY) goto finalize; - #else - // Parse: scheme://host[:port]/path?query - const char *p = strstr(conn_string_https, "://"); - if (!p) goto finalize; - scheme = substr(conn_string_https, p); - p += 3; - const char *host_start = p; - const char *host_end = strpbrk(host_start, ":/?"); - if (!host_end) goto finalize; - host = substr(host_start, host_end); - p = host_end; - if (*p == ':') { - ++p; - const char *port_end = strpbrk(p, "/?"); - if (!port_end) goto finalize; - port = substr(p, port_end); - p = port_end; - } - if (*p == '/') { - const char *path_start = p; - const char *path_end = strchr(path_start, '?'); - if (!path_end) path_end = path_start + strlen(path_start); - database = substr(path_start, path_end); - p = path_end; - } - if (*p == '?') { - query = strdup(p); - } - if (!scheme || !host || !database) goto finalize; - char *port_or_default = port && strcmp(port, "8860") != 0 ? 
port : CLOUDSYNC_DEFAULT_ENDPOINT_PORT; - #endif - - if (query != NULL) { - char value[CLOUDSYNC_SESSION_TOKEN_MAXSIZE]; - if (!authentication && network_extract_query_param(query, "apikey", value, sizeof(value)) == 0) { - authentication = network_authentication_token("apikey", value); - } - if (!authentication && network_extract_query_param(query, "token", value, sizeof(value)) == 0) { - authentication = network_authentication_token("token", value); - } +static bool network_compute_endpoints_with_address (sqlite3_context *context, network_data *data, const char *address, const char *managedDatabaseId) { + if (!managedDatabaseId || managedDatabaseId[0] == '\0') { + sqlite3_result_error(context, "managedDatabaseId cannot be empty", -1); + sqlite3_result_error_code(context, SQLITE_ERROR); + return false; } - - size_t requested = strlen(scheme) + strlen(host) + strlen(port_or_default) + strlen(CLOUDSYNC_ENDPOINT_PREFIX) + strlen(database) + 64; - check_endpoint = (char *)cloudsync_memory_zeroalloc(requested); - upload_endpoint = (char *)cloudsync_memory_zeroalloc(requested); - apply_endpoint = (char *)cloudsync_memory_zeroalloc(requested); - - if ((!upload_endpoint) || (!check_endpoint) || (!apply_endpoint)) goto finalize; - - snprintf(check_endpoint, requested, "%s://%s:%s/%s%s/%s/%s", scheme, host, port_or_default, CLOUDSYNC_ENDPOINT_PREFIX, database, data->site_id, CLOUDSYNC_ENDPOINT_CHECK); - snprintf(upload_endpoint, requested, "%s://%s:%s/%s%s/%s/%s", scheme, host, port_or_default, CLOUDSYNC_ENDPOINT_PREFIX, database, data->site_id, CLOUDSYNC_ENDPOINT_UPLOAD); - snprintf(apply_endpoint, requested, "%s://%s:%s/%s%s/%s/%s", scheme, host, port_or_default, CLOUDSYNC_ENDPOINT_PREFIX, database, data->site_id, CLOUDSYNC_ENDPOINT_APPLY); - result = true; - -finalize: - if (result == false) { - // store proper result code/message - #ifndef SQLITE_WASM_EXTRA_INIT - if (rc != CURLUE_OK) sqlite3_result_error(context, curl_url_strerror(rc), -1); - 
sqlite3_result_error_code(context, (rc != CURLUE_OK) ? SQLITE_ERROR : SQLITE_NOMEM); - #else - sqlite3_result_error(context, "URL parse error", -1); + if (!address || address[0] == '\0') { + sqlite3_result_error(context, "address cannot be empty", -1); sqlite3_result_error_code(context, SQLITE_ERROR); - #endif - - // cleanup memory managed by the extension - if (authentication) cloudsync_memory_free(authentication); + return false; + } + + // build endpoints: {address}/v2/cloudsync/databases/{managedDatabaseId}/{siteId}/{action} + size_t requested = strlen(address) + 1 + + strlen(CLOUDSYNC_ENDPOINT_PREFIX) + 1 + strlen(managedDatabaseId) + 1 + + UUID_STR_MAXLEN + 1 + 16; + char *check_endpoint = (char *)cloudsync_memory_zeroalloc(requested); + char *upload_endpoint = (char *)cloudsync_memory_zeroalloc(requested); + char *apply_endpoint = (char *)cloudsync_memory_zeroalloc(requested); + char *status_endpoint = (char *)cloudsync_memory_zeroalloc(requested); + + if (!check_endpoint || !upload_endpoint || !apply_endpoint || !status_endpoint) { + sqlite3_result_error_code(context, SQLITE_NOMEM); if (check_endpoint) cloudsync_memory_free(check_endpoint); if (upload_endpoint) cloudsync_memory_free(upload_endpoint); if (apply_endpoint) cloudsync_memory_free(apply_endpoint); + if (status_endpoint) cloudsync_memory_free(status_endpoint); + return false; } - - if (result) { - if (authentication) { - if (data->authentication) cloudsync_memory_free(data->authentication); - data->authentication = authentication; - } - - if (data->check_endpoint) cloudsync_memory_free(data->check_endpoint); - data->check_endpoint = check_endpoint; - - if (data->upload_endpoint) cloudsync_memory_free(data->upload_endpoint); - data->upload_endpoint = upload_endpoint; - if (data->apply_endpoint) cloudsync_memory_free(data->apply_endpoint); - data->apply_endpoint = apply_endpoint; - } - - // cleanup memory - #ifndef SQLITE_WASM_EXTRA_INIT - if (url) curl_url_cleanup(url); - #endif - if (scheme) 
curl_free(scheme); - if (host) curl_free(host); - if (port) curl_free(port); - if (database) curl_free(database); - if (query) curl_free(query); - if (conn_string_https && conn_string_https != conn_string) cloudsync_memory_free(conn_string_https); - - return result; + // format: {address}/v2/cloudsync/databases/{managedDatabaseID}/{siteId}/{action} + snprintf(check_endpoint, requested, "%s/%s/%s/%s/%s", + address, CLOUDSYNC_ENDPOINT_PREFIX, managedDatabaseId, data->site_id, CLOUDSYNC_ENDPOINT_CHECK); + snprintf(upload_endpoint, requested, "%s/%s/%s/%s/%s", + address, CLOUDSYNC_ENDPOINT_PREFIX, managedDatabaseId, data->site_id, CLOUDSYNC_ENDPOINT_UPLOAD); + snprintf(apply_endpoint, requested, "%s/%s/%s/%s/%s", + address, CLOUDSYNC_ENDPOINT_PREFIX, managedDatabaseId, data->site_id, CLOUDSYNC_ENDPOINT_APPLY); + snprintf(status_endpoint, requested, "%s/%s/%s/%s/%s", + address, CLOUDSYNC_ENDPOINT_PREFIX, managedDatabaseId, data->site_id, CLOUDSYNC_ENDPOINT_STATUS); + + if (data->check_endpoint) cloudsync_memory_free(data->check_endpoint); + data->check_endpoint = check_endpoint; + + if (data->upload_endpoint) cloudsync_memory_free(data->upload_endpoint); + data->upload_endpoint = upload_endpoint; + + if (data->apply_endpoint) cloudsync_memory_free(data->apply_endpoint); + data->apply_endpoint = apply_endpoint; + + if (data->status_endpoint) cloudsync_memory_free(data->status_endpoint); + data->status_endpoint = status_endpoint; + + return true; } -#endif void network_result_to_sqlite_error (sqlite3_context *context, NETWORK_RESULT res, const char *default_error_message) { sqlite3_result_error(context, ((res.code == CLOUDSYNC_NETWORK_ERROR) && (res.buffer)) ? 
res.buffer : default_error_message, -1); @@ -630,58 +673,60 @@ network_data *cloudsync_network_data (sqlite3_context *context) { return netdata; } -void cloudsync_network_init (sqlite3_context *context, int argc, sqlite3_value **argv) { - DEBUG_FUNCTION("cloudsync_network_init"); - +static void cloudsync_network_init_internal (sqlite3_context *context, const char *address, const char *managedDatabaseId) { #ifndef CLOUDSYNC_OMIT_CURL curl_global_init(CURL_GLOBAL_ALL); #endif - - // no real network operations here - // just setup the network_data struct + cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); network_data *netdata = cloudsync_network_data(context); if (!netdata) goto abort_memory; - + // init context uint8_t *site_id = (uint8_t *)cloudsync_context_init(data); if (!site_id) goto abort_siteid; - + // save site_id string representation: 01957493c6c07e14803727e969f1d2cc cloudsync_uuid_v7_stringify(site_id, netdata->site_id, false); - - // connection string is something like: - // https://UUID.g5.sqlite.cloud:443/chinook.sqlite?apikey=hWDanFolRT9WDK0p54lufNrIyfgLZgtMw6tb6fbPmpo - // or https://UUID.g5.sqlite.cloud:443/chinook.sqlite - // apikey part is optional and can be replaced by a session token once client is authenticated - - const char *connection_param = (const char *)sqlite3_value_text(argv[0]); - + // compute endpoints - if (network_compute_endpoints(context, netdata, connection_param) == false) { - // error message/code already set inside network_compute_endpoints + // authentication can be set later via cloudsync_network_set_token/cloudsync_network_set_apikey + if (network_compute_endpoints_with_address(context, netdata, address, managedDatabaseId) == false) { goto abort_cleanup; } - + cloudsync_set_auxdata(data, netdata); sqlite3_result_int(context, SQLITE_OK); return; - + abort_memory: sqlite3_result_error(context, "Unable to allocate memory in cloudsync_network_init.", -1); sqlite3_result_error_code(context, 
SQLITE_NOMEM); goto abort_cleanup; - + abort_siteid: sqlite3_result_error(context, "Unable to compute/retrieve site_id.", -1); sqlite3_result_error_code(context, SQLITE_MISUSE); goto abort_cleanup; - + abort_cleanup: cloudsync_set_auxdata(data, NULL); network_data_free(netdata); } +void cloudsync_network_init (sqlite3_context *context, int argc, sqlite3_value **argv) { + DEBUG_FUNCTION("cloudsync_network_init"); + const char *managedDatabaseId = (const char *)sqlite3_value_text(argv[0]); + cloudsync_network_init_internal(context, CLOUDSYNC_DEFAULT_ADDRESS, managedDatabaseId); +} + +void cloudsync_network_init_custom (sqlite3_context *context, int argc, sqlite3_value **argv) { + DEBUG_FUNCTION("cloudsync_network_init_custom"); + const char *address = (const char *)sqlite3_value_text(argv[0]); + const char *managedDatabaseId = (const char *)sqlite3_value_text(argv[1]); + cloudsync_network_init_internal(context, address, managedDatabaseId); +} + void cloudsync_network_cleanup_internal (sqlite3_context *context) { cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); network_data *netdata = cloudsync_network_data(context); @@ -726,18 +771,58 @@ void cloudsync_network_set_token (sqlite3_context *context, int argc, sqlite3_va void cloudsync_network_set_apikey (sqlite3_context *context, int argc, sqlite3_value **argv) { DEBUG_FUNCTION("cloudsync_network_set_apikey"); - + const char *value = (const char *)sqlite3_value_text(argv[0]); bool result = cloudsync_network_set_authentication_token(context, value, false); (result) ? sqlite3_result_int(context, SQLITE_OK) : sqlite3_result_error_code(context, SQLITE_NOMEM); } +// Returns a malloc'd JSON array string like '["tasks","users"]', or NULL on error/no results. +// Caller must free with cloudsync_memory_free. 
+static char *network_get_affected_tables(sqlite3 *db, int64_t since_db_version) { + sqlite3_stmt *stmt = NULL; + int rc = sqlite3_prepare_v2(db, + "SELECT json_group_array(DISTINCT tbl) FROM cloudsync_changes WHERE db_version > ?", + -1, &stmt, NULL); + if (rc != SQLITE_OK) return NULL; + sqlite3_bind_int64(stmt, 1, since_db_version); + + char *result = NULL; + if (sqlite3_step(stmt) == SQLITE_ROW) { + const char *json = (const char *)sqlite3_column_text(stmt, 0); + if (json) result = cloudsync_string_dup(json); + } + sqlite3_finalize(stmt); + return result; +} + +// MARK: - Sync result + +typedef struct { + int64_t server_version; // lastOptimisticVersion + int64_t local_version; // new_db_version (max local) + const char *status; // computed status string + int rows_received; // rows from check + char *tables_json; // JSON array of affected table names, caller must cloudsync_memory_free +} sync_result; + +static const char *network_compute_status(int64_t last_optimistic, int64_t last_confirmed, + int gaps_size, int64_t local_version) { + if (last_optimistic < 0 || last_confirmed < 0) return "error"; + if (gaps_size > 0 || last_optimistic < local_version) return "out-of-sync"; + if (last_optimistic == last_confirmed) return "synced"; + return "syncing"; +} + // MARK: - void cloudsync_network_has_unsent_changes (sqlite3_context *context, int argc, sqlite3_value **argv) { sqlite3 *db = sqlite3_context_db_handle(context); cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); + network_data *netdata = (network_data *)cloudsync_auxdata(data); + if (!netdata) {sqlite3_result_error(context, "Unable to retrieve CloudSync network context.", -1); return;} + char *sql = "SELECT max(db_version) FROM cloudsync_changes WHERE site_id == (SELECT site_id FROM cloudsync_site_id WHERE rowid=0)"; int64_t last_local_change = 0; int rc = database_select_int(data, sql, &last_local_change); @@ -752,11 +837,23 @@ void cloudsync_network_has_unsent_changes 
(sqlite3_context *context, int argc, s return; } - int sent_db_version = dbutils_settings_get_int_value(data, CLOUDSYNC_KEY_SEND_DBVERSION); - sqlite3_result_int(context, (sent_db_version < last_local_change)); + NETWORK_RESULT res = network_receive_buffer(netdata, netdata->status_endpoint, netdata->authentication, true, false, NULL, CLOUDSYNC_HEADER_SQLITECLOUD); + + int64_t last_optimistic_version = -1; + + if (res.code == CLOUDSYNC_NETWORK_BUFFER && res.buffer) { + last_optimistic_version = json_extract_int(res.buffer, res.blen, "lastOptimisticVersion", -1); + } else if (res.code != CLOUDSYNC_NETWORK_OK) { + network_result_to_sqlite_error(context, res, "unable to retrieve current status from remote host."); + network_result_cleanup(&res); + return; + } + + network_result_cleanup(&res); + sqlite3_result_int(context, (last_optimistic_version >= 0 && last_optimistic_version < last_local_change)); } -int cloudsync_network_send_changes_internal (sqlite3_context *context, int argc, sqlite3_value **argv) { +int cloudsync_network_send_changes_internal (sqlite3_context *context, int argc, sqlite3_value **argv, sync_result *out) { DEBUG_FUNCTION("cloudsync_network_send_changes"); // retrieve global context @@ -767,72 +864,123 @@ int cloudsync_network_send_changes_internal (sqlite3_context *context, int argc, // retrieve payload char *blob = NULL; - int blob_size = 0, db_version = 0, seq = 0; - int64_t new_db_version = 0, new_seq = 0; - int rc = cloudsync_payload_get(data, &blob, &blob_size, &db_version, &seq, &new_db_version, &new_seq); + int blob_size = 0, db_version = 0; + int64_t new_db_version = 0; + int rc = cloudsync_payload_get(data, &blob, &blob_size, &db_version, &new_db_version); if (rc != SQLITE_OK) { if (db_version < 0) sqlite3_result_error(context, "Unable to retrieve db_version.", -1); - else if (seq < 0) sqlite3_result_error(context, "Unable to retrieve seq.", -1); else sqlite3_result_error(context, "Unable to retrieve changes in 
cloudsync_network_send_changes", -1); return rc; } - - // exit if there is no data to send - if (blob == NULL || blob_size == 0) return SQLITE_OK; - NETWORK_RESULT res = network_receive_buffer(netdata, netdata->upload_endpoint, netdata->authentication, true, false, NULL, CLOUDSYNC_HEADER_SQLITECLOUD); - if (res.code != CLOUDSYNC_NETWORK_BUFFER) { - cloudsync_memory_free(blob); - network_result_to_sqlite_error(context, res, "cloudsync_network_send_changes unable to receive upload URL"); - network_result_cleanup(&res); - return SQLITE_ERROR; + // Case 1: empty local db — no payload and no server state, skip network entirely + if ((blob == NULL || blob_size == 0) && db_version == 0) { + if (out) { + out->server_version = 0; + out->local_version = 0; + out->status = network_compute_status(0, 0, 0, 0); + } + return SQLITE_OK; } - - const char *s3_url = res.buffer; - bool sent = network_send_buffer(netdata, s3_url, NULL, blob, blob_size); - cloudsync_memory_free(blob); - if (sent == false) { - network_result_to_sqlite_error(context, res, "cloudsync_network_send_changes unable to upload BLOB changes to remote host."); + + NETWORK_RESULT res; + if (blob != NULL && blob_size > 0) { + // there is data to send + res = network_receive_buffer(netdata, netdata->upload_endpoint, netdata->authentication, true, false, NULL, CLOUDSYNC_HEADER_SQLITECLOUD); + if (res.code != CLOUDSYNC_NETWORK_BUFFER) { + cloudsync_memory_free(blob); + network_result_to_sqlite_error(context, res, "cloudsync_network_send_changes unable to receive upload URL"); + network_result_cleanup(&res); + return SQLITE_ERROR; + } + + char *s3_url = json_extract_string(res.buffer, res.blen, "url"); + if (!s3_url) { + cloudsync_memory_free(blob); + sqlite3_result_error(context, "cloudsync_network_send_changes: missing 'url' in upload response.", -1); + network_result_cleanup(&res); + return SQLITE_ERROR; + } + bool sent = network_send_buffer(netdata, s3_url, NULL, blob, blob_size); + cloudsync_memory_free(blob); + if 
(sent == false) { + cloudsync_memory_free(s3_url); + network_result_to_sqlite_error(context, res, "cloudsync_network_send_changes unable to upload BLOB changes to remote host."); + network_result_cleanup(&res); + return SQLITE_ERROR; + } + + int db_version_min = db_version+1; + int db_version_max = (int)new_db_version; + if (db_version_min > db_version_max) db_version_min = db_version_max; + char json_payload[4096]; + snprintf(json_payload, sizeof(json_payload), "{\"url\":\"%s\", \"dbVersionMin\":%d, \"dbVersionMax\":%d}", s3_url, db_version_min, db_version_max); + cloudsync_memory_free(s3_url); + + // free res network_result_cleanup(&res); - return SQLITE_ERROR; + + // notify remote host that we successfully uploaded changes + res = network_receive_buffer(netdata, netdata->apply_endpoint, netdata->authentication, true, true, json_payload, CLOUDSYNC_HEADER_SQLITECLOUD); + } else { + // there is no data to send: just query the status to update the db_version value in settings and report the status + new_db_version = db_version; + res = network_receive_buffer(netdata, netdata->status_endpoint, netdata->authentication, true, false, NULL, CLOUDSYNC_HEADER_SQLITECLOUD); + } - - char json_payload[2024]; - snprintf(json_payload, sizeof(json_payload), "{\"url\":\"%s\", \"dbVersionMin\":%d, \"dbVersionMax\":%lld}", s3_url, db_version, (long long)new_db_version); - - // free res - network_result_cleanup(&res); - - // notify remote host that we succesfully uploaded changes - res = network_receive_buffer(netdata, netdata->apply_endpoint, netdata->authentication, true, true, json_payload, CLOUDSYNC_HEADER_SQLITECLOUD); - if (res.code != CLOUDSYNC_NETWORK_OK) { + + int64_t last_optimistic_version = -1; + int64_t last_confirmed_version = -1; + int gaps_size = -1; + + if (res.code == CLOUDSYNC_NETWORK_BUFFER && res.buffer) { + last_optimistic_version = json_extract_int(res.buffer, res.blen, "lastOptimisticVersion", -1); + last_confirmed_version = json_extract_int(res.buffer,
res.blen, "lastConfirmedVersion", -1); + gaps_size = json_extract_array_size(res.buffer, res.blen, "gaps"); + if (gaps_size < 0) gaps_size = 0; + } else if (res.code != CLOUDSYNC_NETWORK_OK) { network_result_to_sqlite_error(context, res, "cloudsync_network_send_changes unable to notify BLOB upload to remote host."); network_result_cleanup(&res); return SQLITE_ERROR; } - - // update db_version and seq + + // update db_version in settings char buf[256]; - if (new_db_version != db_version) { + if (last_optimistic_version >= 0) { + if (last_optimistic_version != db_version) { + snprintf(buf, sizeof(buf), "%" PRId64, last_optimistic_version); + dbutils_settings_set_key_value(data, CLOUDSYNC_KEY_SEND_DBVERSION, buf); + } + } else if (new_db_version != db_version) { snprintf(buf, sizeof(buf), "%" PRId64, new_db_version); dbutils_settings_set_key_value(data, CLOUDSYNC_KEY_SEND_DBVERSION, buf); } - if (new_seq != seq) { - snprintf(buf, sizeof(buf), "%" PRId64, new_seq); - dbutils_settings_set_key_value(data, CLOUDSYNC_KEY_SEND_SEQ, buf); + + // populate sync result + if (out) { + out->server_version = last_optimistic_version; + out->local_version = new_db_version; + out->status = network_compute_status(last_optimistic_version, last_confirmed_version, gaps_size, new_db_version); } - + network_result_cleanup(&res); return SQLITE_OK; } void cloudsync_network_send_changes (sqlite3_context *context, int argc, sqlite3_value **argv) { DEBUG_FUNCTION("cloudsync_network_send_changes"); - - cloudsync_network_send_changes_internal(context, argc, argv); + + sync_result sr = {-1, 0, NULL, 0, NULL}; + int rc = cloudsync_network_send_changes_internal(context, argc, argv, &sr); + if (rc != SQLITE_OK) return; + + char buf[256]; + snprintf(buf, sizeof(buf), + "{\"send\":{\"status\":\"%s\",\"localVersion\":%" PRId64 ",\"serverVersion\":%" PRId64 "}}", + sr.status ? 
sr.status : "error", sr.local_version, sr.server_version); + sqlite3_result_text(context, buf, -1, SQLITE_TRANSIENT); } -int cloudsync_network_check_internal(sqlite3_context *context, int *pnrows) { +int cloudsync_network_check_internal(sqlite3_context *context, int *pnrows, sync_result *out) { cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); network_data *netdata = (network_data *)cloudsync_auxdata(data); if (!netdata) {sqlite3_result_error(context, "Unable to retrieve CloudSync network context.", -1); return -1;} @@ -843,37 +991,62 @@ int cloudsync_network_check_internal(sqlite3_context *context, int *pnrows) { int seq = dbutils_settings_get_int_value(data, CLOUDSYNC_KEY_CHECK_SEQ); if (seq<0) {sqlite3_result_error(context, "Unable to retrieve seq.", -1); return -1;} + // Capture local db_version before download so we can query cloudsync_changes afterwards + int64_t prev_dbv = cloudsync_dbversion(data); + char json_payload[2024]; snprintf(json_payload, sizeof(json_payload), "{\"dbVersion\":%lld, \"seq\":%d}", (long long)db_version, seq); - // http://uuid.g5.sqlite.cloud/v2/cloudsync/{dbname}/{site_id}/check NETWORK_RESULT result = network_receive_buffer(netdata, netdata->check_endpoint, netdata->authentication, true, true, json_payload, CLOUDSYNC_HEADER_SQLITECLOUD); int rc = SQLITE_OK; if (result.code == CLOUDSYNC_NETWORK_BUFFER) { - rc = network_download_changes(context, result.buffer, pnrows); + char *download_url = json_extract_string(result.buffer, result.blen, "url"); + if (!download_url) { + sqlite3_result_error(context, "cloudsync_network_check_changes: missing 'url' in check response.", -1); + network_result_cleanup(&result); + return SQLITE_ERROR; + } + rc = network_download_changes(context, download_url, pnrows); + cloudsync_memory_free(download_url); } else { rc = network_set_sqlite_result(context, &result); } - + + if (out && pnrows) out->rows_received = *pnrows; + + // Query cloudsync_changes for affected tables after 
successful download + if (out && rc == SQLITE_OK && pnrows && *pnrows > 0) { + sqlite3 *db = (sqlite3 *)cloudsync_db(data); + out->tables_json = network_get_affected_tables(db, prev_dbv); + } + network_result_cleanup(&result); return rc; } void cloudsync_network_sync (sqlite3_context *context, int wait_ms, int max_retries) { - int rc = cloudsync_network_send_changes_internal(context, 0, NULL); + sync_result sr = {-1, 0, NULL, 0, NULL}; + int rc = cloudsync_network_send_changes_internal(context, 0, NULL, &sr); if (rc != SQLITE_OK) return; - + int ntries = 0; int nrows = 0; while (ntries < max_retries) { if (ntries > 0) sqlite3_sleep(wait_ms); - rc = cloudsync_network_check_internal(context, &nrows); + if (sr.tables_json) { cloudsync_memory_free(sr.tables_json); sr.tables_json = NULL; } + rc = cloudsync_network_check_internal(context, &nrows, &sr); if (rc == SQLITE_OK && nrows > 0) break; ntries++; } - - sqlite3_result_error_code(context, (nrows == -1) ? SQLITE_ERROR : SQLITE_OK); - if (nrows >= 0) sqlite3_result_int(context, nrows); + if (rc != SQLITE_OK) { if (sr.tables_json) cloudsync_memory_free(sr.tables_json); return; } + + const char *tables = sr.tables_json ? sr.tables_json : "[]"; + char *buf = cloudsync_memory_mprintf( + "{\"send\":{\"status\":\"%s\",\"localVersion\":%" PRId64 ",\"serverVersion\":%" PRId64 "}," + "\"receive\":{\"rows\":%d,\"tables\":%s}}", + sr.status ? 
sr.status : "error", sr.local_version, sr.server_version, nrows, tables); + sqlite3_result_text(context, buf, -1, cloudsync_memory_free); + if (sr.tables_json) cloudsync_memory_free(sr.tables_json); } void cloudsync_network_sync0 (sqlite3_context *context, int argc, sqlite3_value **argv) { @@ -895,12 +1068,16 @@ void cloudsync_network_sync2 (sqlite3_context *context, int argc, sqlite3_value void cloudsync_network_check_changes (sqlite3_context *context, int argc, sqlite3_value **argv) { DEBUG_FUNCTION("cloudsync_network_check_changes"); - + + sync_result sr = {-1, 0, NULL, 0, NULL}; int nrows = 0; - cloudsync_network_check_internal(context, &nrows); - - // returns number of applied rows - sqlite3_result_int(context, nrows); + int rc = cloudsync_network_check_internal(context, &nrows, &sr); + if (rc != SQLITE_OK) { if (sr.tables_json) cloudsync_memory_free(sr.tables_json); return; } + + const char *tables = sr.tables_json ? sr.tables_json : "[]"; + char *buf = cloudsync_memory_mprintf("{\"receive\":{\"rows\":%d,\"tables\":%s}}", nrows, tables); + sqlite3_result_text(context, buf, -1, cloudsync_memory_free); + if (sr.tables_json) cloudsync_memory_free(sr.tables_json); } void cloudsync_network_reset_sync_version (sqlite3_context *context, int argc, sqlite3_value **argv) { @@ -1000,6 +1177,21 @@ void cloudsync_network_logout (sqlite3_context *context, int argc, sqlite3_value cloudsync_memory_free(errmsg); } +void cloudsync_network_status (sqlite3_context *context, int argc, sqlite3_value **argv) { + DEBUG_FUNCTION("cloudsync_network_status"); + + cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); + network_data *netdata = (network_data *)cloudsync_auxdata(data); + if (!netdata) { + sqlite3_result_error(context, "Unable to retrieve CloudSync network context.", -1); + return; + } + + NETWORK_RESULT res = network_receive_buffer(netdata, netdata->status_endpoint, netdata->authentication, true, false, NULL, CLOUDSYNC_HEADER_SQLITECLOUD); + 
+    network_set_sqlite_result(context, &res);
+    network_result_cleanup(&res);
+}
+
 // MARK: -
 
 int cloudsync_network_register (sqlite3 *db, char **pzErrMsg, void *ctx) {
@@ -1009,6 +1201,9 @@ int cloudsync_network_register (sqlite3 *db, char **pzErrMsg, void *ctx) {
     rc = sqlite3_create_function(db, "cloudsync_network_init", 1, DEFAULT_FLAGS, ctx, cloudsync_network_init, NULL, NULL);
     if (rc != SQLITE_OK) goto cleanup;
 
+    rc = sqlite3_create_function(db, "cloudsync_network_init_custom", 2, DEFAULT_FLAGS, ctx, cloudsync_network_init_custom, NULL, NULL);
+    if (rc != SQLITE_OK) return rc;
+
     rc = sqlite3_create_function(db, "cloudsync_network_cleanup", 0, DEFAULT_FLAGS, ctx, cloudsync_network_cleanup, NULL, NULL);
     if (rc != SQLITE_OK) return rc;
 
@@ -1038,7 +1233,10 @@ int cloudsync_network_register (sqlite3 *db, char **pzErrMsg, void *ctx) {
     rc = sqlite3_create_function(db, "cloudsync_network_logout", 0, DEFAULT_FLAGS, ctx, cloudsync_network_logout, NULL, NULL);
     if (rc != SQLITE_OK) return rc;
-    
+
+    rc = sqlite3_create_function(db, "cloudsync_network_status", 0, DEFAULT_FLAGS, ctx, cloudsync_network_status, NULL, NULL);
+    if (rc != SQLITE_OK) return rc;
+
 cleanup:
     if ((rc != SQLITE_OK) && (pzErrMsg)) {
         *pzErrMsg = sqlite3_mprintf("Error creating function in cloudsync_network_register: %s", sqlite3_errmsg(db));
diff --git a/src/network.h b/src/network/network.h
similarity index 92%
rename from src/network.h
rename to src/network/network.h
index 3b4db01..0c7e7de 100644
--- a/src/network.h
+++ b/src/network/network.h
@@ -8,7 +8,7 @@
 #ifndef __CLOUDSYNC_NETWORK__
 #define __CLOUDSYNC_NETWORK__
 
-#include "cloudsync.h"
+#include "../cloudsync.h"
 
 #ifndef SQLITE_CORE
 #include "sqlite3ext.h"
diff --git a/src/network.m b/src/network/network.m
similarity index 70%
rename from src/network.m
rename to src/network/network.m
index fa4c4ea..da2338c 100644
--- a/src/network.m
+++ b/src/network/network.m
@@ -13,60 +13,6 @@ void network_buffer_cleanup (void *xdata) {
     if (xdata) CFRelease(xdata);
 }
 
-bool network_compute_endpoints (sqlite3_context *context, network_data *data, const char *conn_string) {
-    NSString *conn = [NSString stringWithUTF8String:conn_string];
-    NSString *conn_string_https = nil;
-
-    if ([conn hasPrefix:@"sqlitecloud://"]) {
-        conn_string_https = [conn stringByReplacingCharactersInRange:NSMakeRange(0, [@"sqlitecloud://" length]) withString:@"https://"];
-    } else {
-        conn_string_https = conn;
-    }
-
-    NSURL *url = [NSURL URLWithString:conn_string_https];
-    if (!url) return false;
-
-    NSString *scheme = url.scheme;      // "https"
-    if (!scheme) return false;
-    NSString *host = url.host;          // "cn5xiooanz.global3.ryujaz.sqlite.cloud"
-    if (!host) return false;
-
-    NSString *port = url.port.stringValue;
-    NSString *database = url.path;      // "/chinook-cloudsync.sqlite"
-    if (!database) return false;
-
-    NSString *query = url.query;        // "apikey=hWDanFolRT9WDK0p54lufNrIyfgLZgtMw6tb6fbPmpo" (OPTIONAL)
-    NSString *authentication = nil;
-
-    if (query) {
-        NSURLComponents *components = [NSURLComponents componentsWithString:[@"http://dummy?" stringByAppendingString:query]];
-        NSArray *items = components.queryItems;
-        for (NSURLQueryItem *item in items) {
-            // build new token
-            // apikey: just write the key for retrocompatibility
-            // other keys, like token: add a prefix, i.e. token=
-
-            if ([item.name isEqualToString:@"apikey"]) {
-                authentication = item.value;
-                break;
-            }
-            if ([item.name isEqualToString:@"token"]) {
-                authentication = [NSString stringWithFormat:@"%@=%@", item.name, item.value];
-                break;
-            }
-        }
-    }
-
-    char *site_id = network_data_get_siteid(data);
-    char *port_or_default = (port && strcmp(port.UTF8String, "8860") != 0) ? (char *)port.UTF8String : CLOUDSYNC_DEFAULT_ENDPOINT_PORT;
-
-    NSString *check_endpoint = [NSString stringWithFormat:@"%s://%s:%s/%s%s/%s/%s", scheme.UTF8String, host.UTF8String, port_or_default, CLOUDSYNC_ENDPOINT_PREFIX, database.UTF8String, site_id, CLOUDSYNC_ENDPOINT_CHECK];
-    NSString *upload_endpoint = [NSString stringWithFormat:@"%s://%s:%s/%s%s/%s/%s", scheme.UTF8String, host.UTF8String, port_or_default, CLOUDSYNC_ENDPOINT_PREFIX, database.UTF8String, site_id, CLOUDSYNC_ENDPOINT_UPLOAD];
-    NSString *apply_endpoint = [NSString stringWithFormat:@"%s://%s:%s/%s%s/%s/%s", scheme.UTF8String, host.UTF8String, port_or_default, CLOUDSYNC_ENDPOINT_PREFIX, database.UTF8String, site_id, CLOUDSYNC_ENDPOINT_APPLY];
-
-    return network_data_set_endpoints(data, (char *)authentication.UTF8String, (char *)check_endpoint.UTF8String, (char *)upload_endpoint.UTF8String, (char *)apply_endpoint.UTF8String);
-}
-
 bool network_send_buffer(network_data *data, const char *endpoint, const char *authentication, const void *blob, int blob_size) {
     NSString *urlString = [NSString stringWithUTF8String:endpoint];
     NSURL *url = [NSURL URLWithString:urlString];
@@ -82,6 +28,11 @@ bool network_send_buffer(network_data *data, const char *a
         [request setValue:authString forHTTPHeaderField:@"Authorization"];
     }
 
+    char *org_id = network_data_get_orgid(data);
+    if (org_id) {
+        [request setValue:[NSString stringWithUTF8String:org_id] forHTTPHeaderField:@CLOUDSYNC_HEADER_ORG];
+    }
+
     NSData *bodyData = [NSData dataWithBytes:blob length:blob_size];
     [request setHTTPBody:bodyData];
 
@@ -135,6 +86,11 @@ NETWORK_RESULT network_receive_buffer(network_data *data, const char *endpoint,
         }
     }
 
+    char *org_id = network_data_get_orgid(data);
+    if (org_id) {
+        [request setValue:[NSString stringWithUTF8String:org_id] forHTTPHeaderField:@CLOUDSYNC_HEADER_ORG];
+    }
+
     if (authentication) {
         NSString *authString = [NSString stringWithFormat:@"Bearer %s", authentication];
         [request setValue:authString forHTTPHeaderField:@"Authorization"];
diff --git a/src/network_private.h b/src/network/network_private.h
similarity index 79%
rename from src/network_private.h
rename to src/network/network_private.h
index 7583b66..b042959 100644
--- a/src/network_private.h
+++ b/src/network/network_private.h
@@ -8,12 +8,14 @@
 #ifndef __CLOUDSYNC_NETWORK_PRIVATE__
 #define __CLOUDSYNC_NETWORK_PRIVATE__
 
-#define CLOUDSYNC_ENDPOINT_PREFIX       "v2/cloudsync"
+#define CLOUDSYNC_DEFAULT_ADDRESS       "https://cloudsync.sqlite.ai"
+#define CLOUDSYNC_ENDPOINT_PREFIX       "v2/cloudsync/databases"
 #define CLOUDSYNC_ENDPOINT_UPLOAD       "upload"
 #define CLOUDSYNC_ENDPOINT_CHECK        "check"
 #define CLOUDSYNC_ENDPOINT_APPLY        "apply"
-#define CLOUDSYNC_DEFAULT_ENDPOINT_PORT "443"
+#define CLOUDSYNC_ENDPOINT_STATUS       "status"
 #define CLOUDSYNC_HEADER_SQLITECLOUD    "Accept: sqlc/plain"
+#define CLOUDSYNC_HEADER_ORG            "X-CloudSync-Org"
 
 #define CLOUDSYNC_NETWORK_OK            1
 #define CLOUDSYNC_NETWORK_ERROR         2
@@ -30,9 +32,9 @@ typedef struct {
 } NETWORK_RESULT;
 
 char *network_data_get_siteid (network_data *data);
-bool network_data_set_endpoints (network_data *data, char *auth, char *check, char *upload, char *apply);
+char *network_data_get_orgid (network_data *data);
+bool network_data_set_endpoints (network_data *data, char *auth, char *check, char *upload, char *apply, char *status);
 
-bool network_compute_endpoints (sqlite3_context *context, network_data *data, const char *conn_string);
 bool network_send_buffer(network_data *data, const char *endpoint, const char *authentication, const void *blob, int blob_size);
 NETWORK_RESULT network_receive_buffer (network_data *data, const char *endpoint, const char *authentication, bool zero_terminated, bool is_post_request, char *json_payload, const char *custom_header);
diff --git a/src/pk.c b/src/pk.c
index cd7899b..97a6639 100644
--- a/src/pk.c
+++ b/src/pk.c
@@ -87,6 +87,8 @@
 #define DATABASE_TYPE_MAX_NEGATIVE_INTEGER  6   // was SQLITE_MAX_NEGATIVE_INTEGER
 #define DATABASE_TYPE_NEGATIVE_FLOAT        7   // was SQLITE_NEGATIVE_FLOAT
 
+char * const PRIKEY_NULL_CONSTRAINT_ERROR = "PRIKEY_NULL_CONSTRAINT_ERROR";
+
 // MARK: - Public Callbacks -
 
 int pk_decode_bind_callback (void *xdata, int index, int type, int64_t ival, double dval, char *pval) {
@@ -436,7 +438,14 @@ char *pk_encode (dbvalue_t **argv, int argc, char *b, bool is_prikey, size_t *bs
     if (!bsize) return NULL;
     // must fit in a single byte
     if (argc > 255) return NULL;
-    
+
+    // if schema does not enforce NOT NULL on primary keys, check at runtime
+    #ifndef CLOUDSYNC_CHECK_NOTNULL_PRIKEYS
+    for (int i = 0; i < argc; i++) {
+        if (database_value_type(argv[i]) == DBTYPE_NULL) return PRIKEY_NULL_CONSTRAINT_ERROR;
+    }
+    #endif
+
     // 1 is the number of items in the serialization
     // always 1 byte so max 255 primary keys, even if there is a hard SQLite limit of 128
     size_t blen_curr = *bsize;
diff --git a/src/pk.h b/src/pk.h
index 2571915..ea9a390 100644
--- a/src/pk.h
+++ b/src/pk.h
@@ -15,6 +15,8 @@
 
 typedef int (*pk_decode_callback) (void *xdata, int index, int type, int64_t ival, double dval, char *pval);
 
+extern char * const PRIKEY_NULL_CONSTRAINT_ERROR;
+
 char *pk_encode_prikey (dbvalue_t **argv, int argc, char *b, size_t *bsize);
 char *pk_encode_value (dbvalue_t *value, size_t *bsize);
 char *pk_encode (dbvalue_t **argv, int argc, char *b, bool is_prikey, size_t *bsize, int skip_idx);
diff --git a/src/postgresql/cloudsync--1.0.sql b/src/postgresql/cloudsync--1.0.sql
index bbd52c0..7d36f60 100644
--- a/src/postgresql/cloudsync--1.0.sql
+++ b/src/postgresql/cloudsync--1.0.sql
@@ -289,6 +289,16 @@
 RETURNS text
 AS 'MODULE_PATHNAME', 'pg_cloudsync_table_schema'
 LANGUAGE C VOLATILE;
 
+-- ============================================================================
+-- Block-level LWW Functions
+-- ============================================================================
+
+-- Materialize block-level column into base table
+CREATE OR REPLACE FUNCTION cloudsync_text_materialize(table_name text, col_name text, VARIADIC pk_values "any")
+RETURNS boolean
+AS 'MODULE_PATHNAME', 'cloudsync_text_materialize'
+LANGUAGE C VOLATILE;
+
 -- ============================================================================
 -- Type Casts
 -- ============================================================================
diff --git a/src/postgresql/cloudsync_postgresql.c b/src/postgresql/cloudsync_postgresql.c
index 09df63b..9e6cd85 100644
--- a/src/postgresql/cloudsync_postgresql.c
+++ b/src/postgresql/cloudsync_postgresql.c
@@ -32,6 +32,7 @@
 // CloudSync headers (after PostgreSQL headers)
 #include "../cloudsync.h"
+#include "../block.h"
 #include "../database.h"
 #include "../dbutils.h"
 #include "../pk.h"
@@ -129,6 +130,9 @@ void _PG_init (void) {
 
     // Initialize memory debugger (NOOP in production)
     cloudsync_memory_init(1);
+
+    // Set fractional-indexing allocator to use cloudsync memory
+    block_init_allocator();
 }
 
 void _PG_fini (void) {
@@ -597,7 +601,25 @@ Datum cloudsync_set_column (PG_FUNCTION_ARGS) {
 
     PG_TRY();
     {
-        dbutils_table_settings_set_key_value(data, tbl, col, key, value);
+        // Handle block column setup: cloudsync_set_column('tbl', 'col', 'algo', 'block')
+        if (key && value && strcmp(key, "algo") == 0 && strcmp(value, "block") == 0) {
+            int rc = cloudsync_setup_block_column(data, tbl, col, NULL);
+            if (rc != DBRES_OK) {
+                ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), errmsg("%s", cloudsync_errmsg(data))));
+            }
+        } else {
+            // Handle delimiter setting: cloudsync_set_column('tbl', 'col', 'delimiter', '\n\n')
+            if (key && strcmp(key, "delimiter") == 0) {
+                cloudsync_table_context *table = table_lookup(data, tbl);
+                if (table) {
+                    int col_idx = table_col_index(table, col);
+                    if (col_idx >= 0 && table_col_algo(table, col_idx) == col_algo_block) {
+                        table_set_col_delimiter(table, col_idx, value);
+                    }
+                }
+            }
+            dbutils_table_settings_set_key_value(data, tbl, col, key, value);
+        }
     }
     PG_CATCH();
     {
@@ -1120,9 +1142,13 @@ Datum cloudsync_pk_encode (PG_FUNCTION_ARGS) {
             errmsg("cloudsync_pk_encode requires at least one primary key value")));
     }
 
+    // Normalize all values to text for consistent PK encoding
+    // (PG triggers cast PK values to ::text; SQL callers must match)
+    pgvalues_normalize_to_text(argv, argc);
+
     size_t pklen = 0;
     char *encoded = pk_encode_prikey((dbvalue_t **)argv, argc, NULL, &pklen);
-    if (!encoded) {
+    if (!encoded || encoded == PRIKEY_NULL_CONSTRAINT_ERROR) {
         ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), errmsg("cloudsync_pk_encode failed to encode primary key")));
     }
 
@@ -1258,6 +1284,9 @@ Datum cloudsync_insert (PG_FUNCTION_ARGS) {
         // Extract PK values from VARIADIC "any" (args starting from index 1)
         cleanup.argv = pgvalues_from_args(fcinfo, 1, &cleanup.argc);
 
+        // Normalize PK values to text for consistent encoding
+        pgvalues_normalize_to_text(cleanup.argv, cleanup.argc);
+
         // Verify we have the correct number of PK columns
         int expected_pks = table_count_pks(table);
         if (cleanup.argc != expected_pks) {
@@ -1271,6 +1300,10 @@
         if (!cleanup.pk) {
             ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), errmsg("Not enough memory to encode the primary key(s)")));
         }
+        if (cleanup.pk == PRIKEY_NULL_CONSTRAINT_ERROR) {
+            cleanup.pk = NULL;
+            ereport(ERROR, (errcode(ERRCODE_NOT_NULL_VIOLATION), errmsg("Insert aborted because primary key in table %s contains NULL values", table_name)));
+        }
 
         // Compute the next database version for tracking changes
         int64_t db_version = cloudsync_dbversion_next(data, CLOUDSYNC_VALUE_NOTSET);
@@ -1291,8 +1324,56 @@
         if (rc == DBRES_OK) {
             // Process each non-primary key column for insert or update
             for (int i = 0; i < table_count_cols(table); i++) {
-                rc = local_mark_insert_or_update_meta(table, cleanup.pk, pklen, table_colname(table, i), db_version, cloudsync_bumpseq(data));
-                if (rc != DBRES_OK) break;
+                if (table_col_algo(table, i) == col_algo_block) {
+                    // Block column: read value from base table, split into blocks, store each block
+                    dbvm_t *val_vm = table_column_lookup(table, table_colname(table, i), false, NULL);
+                    if (!val_vm) { rc = DBRES_ERROR; break; }
+
+                    int bind_rc = pk_decode_prikey(cleanup.pk, pklen, pk_decode_bind_callback, (void *)val_vm);
+                    if (bind_rc < 0) { databasevm_reset(val_vm); rc = DBRES_ERROR; break; }
+
+                    int step_rc = databasevm_step(val_vm);
+                    if (step_rc == DBRES_ROW) {
+                        const char *text = database_column_text(val_vm, 0);
+                        const char *delim = table_col_delimiter(table, i);
+                        const char *col = table_colname(table, i);
+
+                        block_list_t *blocks = block_split(text ? text : "", delim);
+                        if (blocks) {
+                            char **positions = block_initial_positions(blocks->count);
+                            if (positions) {
+                                for (int b = 0; b < blocks->count; b++) {
+                                    char *block_cn = block_build_colname(col, positions[b]);
+                                    if (block_cn) {
+                                        rc = local_mark_insert_or_update_meta(table, cleanup.pk, pklen, block_cn, db_version, cloudsync_bumpseq(data));
+
+                                        // Store block value in blocks table
+                                        dbvm_t *wvm = table_block_value_write_stmt(table);
+                                        if (wvm && rc == DBRES_OK) {
+                                            databasevm_bind_blob(wvm, 1, cleanup.pk, (int)pklen);
+                                            databasevm_bind_text(wvm, 2, block_cn, -1);
+                                            databasevm_bind_text(wvm, 3, blocks->entries[b].content, -1);
+                                            databasevm_step(wvm);
+                                            databasevm_reset(wvm);
+                                        }
+
+                                        cloudsync_memory_free(block_cn);
+                                    }
+                                    cloudsync_memory_free(positions[b]);
+                                    if (rc != DBRES_OK) break;
+                                }
+                                cloudsync_memory_free(positions);
+                            }
+                            block_list_free(blocks);
+                        }
+                    }
+                    databasevm_reset(val_vm);
+                    if (step_rc == DBRES_ROW || step_rc == DBRES_DONE) { if (rc == DBRES_OK) continue; }
+                    if (rc != DBRES_OK) break;
+                } else {
+                    rc = local_mark_insert_or_update_meta(table, cleanup.pk, pklen, table_colname(table, i), db_version, cloudsync_bumpseq(data));
+                    if (rc != DBRES_OK) break;
+                }
             }
         }
 
@@ -1349,6 +1430,9 @@ Datum cloudsync_delete (PG_FUNCTION_ARGS) {
         // Extract PK values from VARIADIC "any" (args starting from index 1)
         cleanup.argv = pgvalues_from_args(fcinfo, 1, &cleanup.argc);
 
+        // Normalize PK values to text for consistent encoding
+        pgvalues_normalize_to_text(cleanup.argv, cleanup.argc);
+
         int expected_pks = table_count_pks(table);
         if (cleanup.argc != expected_pks) {
             ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("Expected %d primary key values, got %d", expected_pks, cleanup.argc)));
@@ -1360,6 +1444,10 @@
         if (!cleanup.pk) {
             ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), errmsg("Not enough memory to encode the primary key(s)")));
         }
+        if (cleanup.pk == PRIKEY_NULL_CONSTRAINT_ERROR) {
+            cleanup.pk = NULL;
+            ereport(ERROR, (errcode(ERRCODE_NOT_NULL_VIOLATION), errmsg("Delete aborted because primary key in table %s contains NULL values", table_name)));
+        }
 
         int64_t db_version = cloudsync_dbversion_next(data, CLOUDSYNC_VALUE_NOTSET);
 
@@ -1561,6 +1649,10 @@ Datum cloudsync_update_finalfn (PG_FUNCTION_ARGS) {
     if (!pk) {
         ereport(ERROR, (errcode(ERRCODE_OUT_OF_MEMORY), errmsg("Not enough memory to encode the primary key(s)")));
     }
+    if (pk == PRIKEY_NULL_CONSTRAINT_ERROR) {
+        pk = NULL;
+        ereport(ERROR, (errcode(ERRCODE_NOT_NULL_VIOLATION), errmsg("Update aborted because primary key in table %s contains NULL values", table_name)));
+    }
     if (prikey_changed) {
         oldpk = pk_encode_prikey((dbvalue_t **)payload->old_values, pk_count, buffer2, &oldpklen);
         if (!oldpk) {
@@ -1583,8 +1675,99 @@
         if (col_index >= payload->count) break;
 
         if (dbutils_value_compare((dbvalue_t *)payload->old_values[col_index], (dbvalue_t *)payload->new_values[col_index]) != 0) {
-            rc = local_mark_insert_or_update_meta(table, pk, pklen, table_colname(table, i), db_version, cloudsync_bumpseq(data));
-            if (rc != DBRES_OK) goto cleanup;
+            if (table_col_algo(table, i) == col_algo_block) {
+                // Block column: diff old and new text, emit per-block metadata changes
+                const char *new_text = (const char *)database_value_text(payload->new_values[col_index]);
+                const char *delim = table_col_delimiter(table, i);
+                const char *col = table_colname(table, i);
+
+                // Read existing blocks from blocks table
+                block_list_t *old_blocks = block_list_create_empty();
+                char *like_pattern = block_build_colname(col, "%");
+                if (like_pattern && old_blocks) {
+                    char *list_sql = cloudsync_memory_mprintf(
+                        "SELECT col_name, col_value FROM %s WHERE pk = $1 AND col_name LIKE $2 ORDER BY col_name COLLATE \"C\"",
+                        table_blocks_ref(table));
+                    if (list_sql) {
+                        dbvm_t *list_vm = NULL;
+                        if (databasevm_prepare(data, list_sql, &list_vm, 0) == DBRES_OK) {
+                            databasevm_bind_blob(list_vm, 1, pk, (int)pklen);
+                            databasevm_bind_text(list_vm, 2, like_pattern, -1);
+                            while (databasevm_step(list_vm) == DBRES_ROW) {
+                                const char *bcn = database_column_text(list_vm, 0);
+                                const char *bval = database_column_text(list_vm, 1);
+                                const char *pos = block_extract_position_id(bcn);
+                                if (pos && old_blocks) {
+                                    block_list_add(old_blocks, bval ? bval : "", pos);
+                                }
+                            }
+                            databasevm_finalize(list_vm);
+                        }
+                        cloudsync_memory_free(list_sql);
+                    }
+                }
+
+                // Split new text into parts (NULL text = all blocks removed)
+                block_list_t *new_blocks = new_text ? block_split(new_text, delim) : block_list_create_empty();
+                if (new_blocks && old_blocks) {
+                    // Build array of new content strings (NULL when count is 0)
+                    const char **new_parts = NULL;
+                    if (new_blocks->count > 0) {
+                        new_parts = (const char **)cloudsync_memory_alloc((uint64_t)(new_blocks->count * sizeof(char *)));
+                        if (new_parts) {
+                            for (int b = 0; b < new_blocks->count; b++) {
+                                new_parts[b] = new_blocks->entries[b].content;
+                            }
+                        }
+                    }
+
+                    if (new_parts || new_blocks->count == 0) {
+                        block_diff_t *diff = block_diff(old_blocks->entries, old_blocks->count, new_parts, new_blocks->count);
+                        if (diff) {
+                            for (int d = 0; d < diff->count; d++) {
+                                block_diff_entry_t *de = &diff->entries[d];
+                                char *block_cn = block_build_colname(col, de->position_id);
+                                if (!block_cn) continue;
+
+                                if (de->type == BLOCK_DIFF_ADDED || de->type == BLOCK_DIFF_MODIFIED) {
+                                    rc = local_mark_insert_or_update_meta(table, pk, pklen, block_cn, db_version, cloudsync_bumpseq(data));
+                                    // Store block value
+                                    if (rc == DBRES_OK && table_block_value_write_stmt(table)) {
+                                        dbvm_t *wvm = table_block_value_write_stmt(table);
+                                        databasevm_bind_blob(wvm, 1, pk, (int)pklen);
+                                        databasevm_bind_text(wvm, 2, block_cn, -1);
+                                        databasevm_bind_text(wvm, 3, de->content, -1);
+                                        databasevm_step(wvm);
+                                        databasevm_reset(wvm);
+                                    }
+                                } else if (de->type == BLOCK_DIFF_REMOVED) {
+                                    // Mark block as deleted in metadata (even col_version)
+                                    rc = local_mark_delete_block_meta(table, pk, pklen, block_cn, db_version, cloudsync_bumpseq(data));
+                                    // Remove from blocks table
+                                    if (rc == DBRES_OK) {
+                                        block_delete_value_external(data, table, pk, pklen, block_cn);
+                                    }
+                                }
+                                cloudsync_memory_free(block_cn);
+                                if (rc != DBRES_OK) break;
+                            }
+                            block_diff_free(diff);
+                        }
+                        if (new_parts) cloudsync_memory_free((void *)new_parts);
+                    }
+                }
+                if (new_blocks) block_list_free(new_blocks);
+                if (old_blocks) block_list_free(old_blocks);
+                if (like_pattern) cloudsync_memory_free(like_pattern);
+                if (rc != DBRES_OK) goto cleanup;
+            } else {
+                rc = local_mark_insert_or_update_meta(table, pk, pklen, table_colname(table, i), db_version, cloudsync_bumpseq(data));
+                if (rc != DBRES_OK) goto cleanup;
+            }
         }
     }
 
@@ -1945,7 +2128,42 @@ Datum cloudsync_col_value(PG_FUNCTION_ARGS) {
     if (!table) {
         ereport(ERROR, (errmsg("Unable to retrieve table name %s in cloudsync_col_value.", table_name)));
     }
-    
+
+    // Block column: if col_name contains \x1F, read from blocks table
+    if (block_is_block_colname(col_name) && table_has_block_cols(table)) {
+        dbvm_t *bvm = table_block_value_read_stmt(table);
+        if (!bvm) {
+            bytea *null_encoded = cloudsync_encode_null_value();
+            PG_RETURN_BYTEA_P(null_encoded);
+        }
+
+        bytea *encoded_pk_b = PG_GETARG_BYTEA_P(2);
+        size_t b_pk_len = (size_t)VARSIZE_ANY_EXHDR(encoded_pk_b);
+        int brc = databasevm_bind_blob(bvm, 1, VARDATA_ANY(encoded_pk_b), (uint64_t)b_pk_len);
+        if (brc != DBRES_OK) { databasevm_reset(bvm); ereport(ERROR, (errmsg("cloudsync_col_value block bind error"))); }
+        brc = databasevm_bind_text(bvm, 2, col_name, -1);
+        if (brc != DBRES_OK) { databasevm_reset(bvm); ereport(ERROR, (errmsg("cloudsync_col_value block bind error"))); }
+
+        brc = databasevm_step(bvm);
+        if (brc == DBRES_ROW) {
+            size_t blob_len = 0;
+            const void *blob = database_column_blob(bvm, 0, &blob_len);
+            bytea *result = NULL;
+            if (blob && blob_len > 0) {
+                result = (bytea *)palloc(VARHDRSZ + blob_len);
+                SET_VARSIZE(result, VARHDRSZ + blob_len);
+                memcpy(VARDATA(result), blob, blob_len);
+            }
+            databasevm_reset(bvm);
+            if (result) PG_RETURN_BYTEA_P(result);
+            PG_RETURN_NULL();
+        } else {
+            databasevm_reset(bvm);
+            bytea *null_encoded = cloudsync_encode_null_value();
+            PG_RETURN_BYTEA_P(null_encoded);
+        }
+    }
+
     // extract the right col_value vm associated to the column name
     dbvm_t *vm = table_column_lookup(table, col_name, false, NULL);
     if (!vm) {
@@ -1972,8 +2190,8 @@
         PG_RETURN_BYTEA_P(result);
     } else if (rc == DBRES_ROW) {
         // copy value before reset invalidates SPI tuple memory
-        const void *blob = database_column_blob(vm, 0);
-        int blob_len = database_column_bytes(vm, 0);
+        size_t blob_len = 0;
+        const void *blob = database_column_blob(vm, 0, &blob_len);
         bytea *result = NULL;
         if (blob && blob_len > 0) {
             result = (bytea *)palloc(VARHDRSZ + blob_len);
@@ -1990,6 +2208,73 @@
     PG_RETURN_NULL(); // unreachable, silences compiler
 }
 
+// MARK: - Block-level LWW -
+
+PG_FUNCTION_INFO_V1(cloudsync_text_materialize);
+Datum cloudsync_text_materialize (PG_FUNCTION_ARGS) {
+    if (PG_ARGISNULL(0) || PG_ARGISNULL(1)) {
+        ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+            errmsg("cloudsync_text_materialize: table_name and col_name cannot be NULL")));
+    }
+
+    const char *table_name = text_to_cstring(PG_GETARG_TEXT_PP(0));
+    const char *col_name = text_to_cstring(PG_GETARG_TEXT_PP(1));
+
+    cloudsync_context *data = get_cloudsync_context();
+    cloudsync_pg_cleanup_state cleanup = {0};
+
+    int spi_rc = SPI_connect();
+    if (spi_rc != SPI_OK_CONNECT) {
+        ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), errmsg("SPI_connect failed: %d", spi_rc)));
+    }
+    cleanup.spi_connected = true;
+
+    PG_ENSURE_ERROR_CLEANUP(cloudsync_pg_cleanup, PointerGetDatum(&cleanup));
+    {
+        cloudsync_table_context *table = table_lookup(data, table_name);
+        if (!table) {
+            ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("Unable to retrieve table name %s in cloudsync_text_materialize", table_name)));
+        }
+
+        int col_idx = table_col_index(table, col_name);
+        if (col_idx < 0 || table_col_algo(table, col_idx) != col_algo_block) {
+            ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("Column %s in table %s is not configured as block-level", col_name, table_name)));
+        }
+
+        // Extract PK values from VARIADIC "any" (args starting from index 2)
+        cleanup.argv = pgvalues_from_args(fcinfo, 2, &cleanup.argc);
+
+        // Normalize PK values to text for consistent encoding
+        pgvalues_normalize_to_text(cleanup.argv, cleanup.argc);
+
+        int expected_pks = table_count_pks(table);
+        if (cleanup.argc != expected_pks) {
+            ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("Expected %d primary key values, got %d", expected_pks, cleanup.argc)));
+        }
+
+        size_t pklen = sizeof(cleanup.pk_buffer);
+        cleanup.pk = pk_encode_prikey((dbvalue_t **)cleanup.argv, cleanup.argc, cleanup.pk_buffer, &pklen);
+        if (!cleanup.pk || cleanup.pk == PRIKEY_NULL_CONSTRAINT_ERROR) {
+            if (cleanup.pk == PRIKEY_NULL_CONSTRAINT_ERROR) cleanup.pk = NULL;
+            ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("Failed to encode primary key(s)")));
+        }
+
+        int rc = block_materialize_column(data, table, cleanup.pk, (int)pklen, col_name);
+        if (rc != DBRES_OK) {
+            ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR),
                errmsg("%s", cloudsync_errmsg(data))));
+        }
+    }
+    PG_END_ENSURE_ERROR_CLEANUP(cloudsync_pg_cleanup, PointerGetDatum(&cleanup));
+
+    cloudsync_pg_cleanup(0, PointerGetDatum(&cleanup));
+    PG_RETURN_BOOL(true);
+}
+
 // Track SRF execution state across calls
 typedef struct {
     Portal portal;
@@ -2137,6 +2422,20 @@ static char * build_union_sql (void) {
     }
     SPI_freetuptable(SPI_tuptable);
 
+    // Check if blocks table exists for this table
+    char blocks_tbl_name[1024];
+    snprintf(blocks_tbl_name, sizeof(blocks_tbl_name), "%s_cloudsync_blocks", base);
+    StringInfoData btq;
+    initStringInfo(&btq);
+    appendStringInfo(&btq,
+        "SELECT 1 FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace "
+        "WHERE c.relname = %s AND n.nspname = %s AND c.relkind = 'r'",
+        quote_literal_cstr(blocks_tbl_name), nsp_lit);
+    int btrc = SPI_execute(btq.data, true, 1);
+    bool has_blocks_table = (btrc == SPI_OK_SELECT && SPI_processed > 0);
+    if (SPI_tuptable) { SPI_freetuptable(SPI_tuptable); SPI_tuptable = NULL; }
+    pfree(btq.data);
+
     /* Collect all base-table columns to build CASE over t1.col_name */
     StringInfoData colq;
     initStringInfo(&colq);
@@ -2157,13 +2456,22 @@ static char * build_union_sql (void) {
         ereport(ERROR, (errmsg("cloudsync: unable to resolve columns for %s.%s", nsp, base)));
     }
     uint64 ncols = SPI_processed;
-    
+
     StringInfoData caseexpr;
     initStringInfo(&caseexpr);
     appendStringInfoString(&caseexpr,
         "CASE "
         "WHEN t1.col_name = '" CLOUDSYNC_TOMBSTONE_VALUE "' THEN " CLOUDSYNC_NULL_VALUE_BYTEA " "
         "WHEN b.ctid IS NULL THEN " CLOUDSYNC_RLS_RESTRICTED_VALUE_BYTEA " "
+    );
+    if (has_blocks_table) {
+        appendStringInfo(&caseexpr,
+            "WHEN t1.col_name LIKE '%%' || chr(31) || '%%' THEN "
+            "(SELECT cloudsync_encode_value(blk.col_value) FROM %s.\"%s_cloudsync_blocks\" blk "
+            "WHERE blk.pk = t1.pk AND blk.col_name = t1.col_name) ",
+            quote_identifier(nsp), base);
+    }
+    appendStringInfoString(&caseexpr,
         "ELSE CASE t1.col_name "
     );
diff --git a/src/postgresql/database_postgresql.c b/src/postgresql/database_postgresql.c
index f777166..3fc6310 100644
--- a/src/postgresql/database_postgresql.c
+++ b/src/postgresql/database_postgresql.c
@@ -68,6 +68,8 @@ typedef struct {
     // Params
     int nparams;
     Oid types[MAX_PARAMS];
+    Oid prepared_types[MAX_PARAMS];     // types used when plan was SPI_prepare'd
+    int prepared_nparams;               // nparams at prepare time
    Datum values[MAX_PARAMS];
     char nulls[MAX_PARAMS];
     bool executed_nonselect;            // non-select executed already
@@ -210,6 +212,129 @@ char *sql_build_upsert_pk_and_col (cloudsync_context *data, const char *table_na
     return (rc == DBRES_OK) ? query : NULL;
 }
 
+char *sql_build_upsert_pk_and_multi_cols (cloudsync_context *data, const char *table_name, const char **colnames, int ncolnames, const char *schema) {
+    if (ncolnames <= 0 || !colnames) return NULL;
+
+    char *qualified = database_build_base_ref(schema, table_name);
+    if (!qualified) return NULL;
+
+    // Build VALUES list for column names: ('col_a',1),('col_b',2)
+    // Column names are SQL literals here, so escape single quotes
+    size_t values_cap = (size_t)ncolnames * 128 + 1;
+    char *col_values = cloudsync_memory_alloc(values_cap);
+    if (!col_values) { cloudsync_memory_free(qualified); return NULL; }
+
+    size_t vpos = 0;
+    for (int i = 0; i < ncolnames; i++) {
+        char esc[1024];
+        sql_escape_literal(colnames[i], esc, sizeof(esc));
+        vpos += snprintf(col_values + vpos, values_cap - vpos, "%s('%s'::text,%d)", i > 0 ? "," : "", esc, i + 1);
+    }
+
+    // Build meta-query that generates the final INSERT...ON CONFLICT SQL with proper types
+    char *meta_sql = cloudsync_memory_mprintf(
+        "WITH tbl AS ("
+        "  SELECT to_regclass('%s') AS oid"
+        "), "
+        "pk AS ("
+        "  SELECT a.attname, k.ord, format_type(a.atttypid, a.atttypmod) AS coltype "
+        "  FROM pg_index x "
+        "  JOIN tbl t ON t.oid = x.indrelid "
+        "  JOIN LATERAL unnest(x.indkey) WITH ORDINALITY AS k(attnum, ord) ON true "
+        "  JOIN pg_attribute a ON a.attrelid = x.indrelid AND a.attnum = k.attnum "
+        "  WHERE x.indisprimary "
+        "  ORDER BY k.ord"
+        "), "
+        "pk_count AS (SELECT count(*) AS n FROM pk), "
+        "cols AS ("
+        "  SELECT u.colname, format_type(a.atttypid, a.atttypmod) AS coltype, u.ord "
+        "  FROM (VALUES %s) AS u(colname, ord) "
+        "  JOIN pg_attribute a ON a.attrelid = (SELECT oid FROM tbl) AND a.attname = u.colname "
+        "  WHERE a.attnum > 0 AND NOT a.attisdropped"
+        ") "
+        "SELECT "
+        "  'INSERT INTO ' || (SELECT (oid::regclass)::text FROM tbl)"
+        "  || ' (' || (SELECT string_agg(format('%%I', attname), ',' ORDER BY ord) FROM pk)"
+        "  || ',' || (SELECT string_agg(format('%%I', colname), ',' ORDER BY ord) FROM cols) || ')'"
+        "  || ' VALUES (' || (SELECT string_agg(format('$%%s::%%s', ord, coltype), ',' ORDER BY ord) FROM pk)"
+        "  || ',' || (SELECT string_agg(format('$%%s::%%s', (SELECT n FROM pk_count) + ord, coltype), ',' ORDER BY ord) FROM cols) || ')'"
+        "  || ' ON CONFLICT (' || (SELECT string_agg(format('%%I', attname), ',' ORDER BY ord) FROM pk) || ')'"
+        "  || ' DO UPDATE SET ' || (SELECT string_agg(format('%%I=EXCLUDED.%%I', colname, colname), ',' ORDER BY ord) FROM cols)"
+        "  || ';';",
+        qualified, col_values
+    );
+
+    cloudsync_memory_free(qualified);
+    cloudsync_memory_free(col_values);
+    if (!meta_sql) return NULL;
+
+    char *query = NULL;
+    int rc = database_select_text(data, meta_sql, &query);
+    cloudsync_memory_free(meta_sql);
+
+    return (rc == DBRES_OK) ? query : NULL;
+}
+
+char *sql_build_update_pk_and_multi_cols (cloudsync_context *data, const char *table_name, const char **colnames, int ncolnames, const char *schema) {
+    if (ncolnames <= 0 || !colnames) return NULL;
+
+    char *qualified = database_build_base_ref(schema, table_name);
+    if (!qualified) return NULL;
+
+    // Build VALUES list for column names: ('col_a',1),('col_b',2)
+    size_t values_cap = (size_t)ncolnames * 128 + 1;
+    char *col_values = cloudsync_memory_alloc(values_cap);
+    if (!col_values) { cloudsync_memory_free(qualified); return NULL; }
+
+    size_t vpos = 0;
+    for (int i = 0; i < ncolnames; i++) {
+        char esc[1024];
+        sql_escape_literal(colnames[i], esc, sizeof(esc));
+        vpos += snprintf(col_values + vpos, values_cap - vpos, "%s('%s'::text,%d)", i > 0 ? "," : "", esc, i + 1);
+    }
+
+    // Build meta-query that generates UPDATE ... SET col=$ WHERE pk=$
+    char *meta_sql = cloudsync_memory_mprintf(
+        "WITH tbl AS ("
+        "  SELECT to_regclass('%s') AS oid"
+        "), "
+        "pk AS ("
+        "  SELECT a.attname, k.ord, format_type(a.atttypid, a.atttypmod) AS coltype "
+        "  FROM pg_index x "
+        "  JOIN tbl t ON t.oid = x.indrelid "
+        "  JOIN LATERAL unnest(x.indkey) WITH ORDINALITY AS k(attnum, ord) ON true "
+        "  JOIN pg_attribute a ON a.attrelid = x.indrelid AND a.attnum = k.attnum "
+        "  WHERE x.indisprimary "
+        "  ORDER BY k.ord"
+        "), "
+        "pk_count AS (SELECT count(*) AS n FROM pk), "
+        "cols AS ("
+        "  SELECT u.colname, format_type(a.atttypid, a.atttypmod) AS coltype, u.ord "
+        "  FROM (VALUES %s) AS u(colname, ord) "
+        "  JOIN pg_attribute a ON a.attrelid = (SELECT oid FROM tbl) AND a.attname = u.colname "
+        "  WHERE a.attnum > 0 AND NOT a.attisdropped"
+        ") "
+        "SELECT "
+        "  'UPDATE ' || (SELECT (oid::regclass)::text FROM tbl)"
+        "  || ' SET ' || (SELECT string_agg(format('%%I=$%%s::%%s', colname, (SELECT n FROM pk_count) + ord, coltype), ',' ORDER BY ord) FROM cols)"
+        "  || ' WHERE ' || (SELECT string_agg(format('%%I=$%%s::%%s', attname, ord, coltype), ' AND ' ORDER BY ord) FROM pk)"
+        "  || ';';",
+        qualified, col_values
+    );
+
+    cloudsync_memory_free(qualified);
+    cloudsync_memory_free(col_values);
+    if (!meta_sql) return NULL;
+
+    char *query = NULL;
+    int rc = database_select_text(data, meta_sql, &query);
+    cloudsync_memory_free(meta_sql);
+
+    return (rc == DBRES_OK) ? query : NULL;
+}
+
 char *sql_build_select_cols_by_pk (cloudsync_context *data, const char *table_name, const char *colname, const char *schema) {
     UNUSED_PARAMETER(data);
     char *qualified = database_build_base_ref(schema, table_name);
@@ -310,6 +435,17 @@ char *database_build_base_ref (const char *schema, const char *table_name) {
     return cloudsync_memory_mprintf("\"%s\"", escaped_table);
 }
 
+char *database_build_blocks_ref (const char *schema, const char *table_name) {
+    char escaped_table[512];
+    sql_escape_identifier(table_name, escaped_table, sizeof(escaped_table));
+    if (schema) {
+        char escaped_schema[512];
+        sql_escape_identifier(schema, escaped_schema, sizeof(escaped_schema));
+        return cloudsync_memory_mprintf("\"%s\".\"%s_cloudsync_blocks\"", escaped_schema, escaped_table);
+    }
+    return cloudsync_memory_mprintf("\"%s_cloudsync_blocks\"", escaped_table);
+}
+
 // Schema-aware SQL builder for PostgreSQL: deletes columns not in schema or pkcol.
 // Schema parameter: pass empty string to fall back to current_schema() via SQL.
 char *sql_build_delete_cols_not_in_schema_query (const char *schema, const char *table_name, const char *meta_ref, const char *pkcol) {
@@ -581,27 +717,26 @@ int database_select1_value (cloudsync_context *data, const char *sql, char **ptr
     return rc;
 }
 
-int database_select3_values (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2, int64_t *value3) {
+int database_select2_values (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2) {
     cloudsync_reset_error(data);
 
     // init values
     *value = NULL;
     *value2 = 0;
-    *value3 = 0;
     *len = 0;
 
     int rc = SPI_execute(sql, true, 0);
     if (rc < 0) {
-        rc = cloudsync_set_error(data, "SPI_execute failed in database_select3_values", DBRES_ERROR);
+        rc = cloudsync_set_error(data, "SPI_execute failed in database_select2_values", DBRES_ERROR);
         goto cleanup;
     }
 
     if (!SPI_tuptable || !SPI_tuptable->tupdesc) {
-        rc = cloudsync_set_error(data, "No result table in database_select3_values", DBRES_ERROR);
+        rc = cloudsync_set_error(data, "No result table in database_select2_values", DBRES_ERROR);
         goto cleanup;
     }
 
-    if (SPI_tuptable->tupdesc->natts < 3) {
-        rc = cloudsync_set_error(data, "Result has fewer than 3 columns in database_select3_values", DBRES_ERROR);
+    if (SPI_tuptable->tupdesc->natts < 2) {
+        rc = cloudsync_set_error(data, "Result has fewer than 2 columns in database_select2_values", DBRES_ERROR);
         goto cleanup;
     }
 
     if (SPI_processed == 0) {
@@ -659,17 +794,6 @@ int database_select3_values (cloudsync_context *data, const char *sql, char **va
         }
     }
 
-    // Third column - int
-    Datum datum3 = SPI_getbinval(tuple, SPI_tuptable->tupdesc, 3, &isnull);
-    if (!isnull) {
-        Oid typeid = SPI_gettypeid(SPI_tuptable->tupdesc, 3);
-        if (typeid == INT8OID) {
-            *value3 = DatumGetInt64(datum3);
-        } else if (typeid == INT4OID) {
-            *value3 = (int64_t)DatumGetInt32(datum3);
-        }
-    }
-
     rc = DBRES_OK;
 
 cleanup:
@@ -998,8 +1122,8 @@ int database_select_blob (cloudsync_context *data, const char *sql, char
**value return database_select1_value(data, sql, value, len, DBTYPE_BLOB); } -int database_select_blob_2int (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2, int64_t *value3) { - return database_select3_values(data, sql, value, len, value2, value3); +int database_select_blob_int (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2) { + return database_select2_values(data, sql, value, len, value2); } int database_cleanup (cloudsync_context *data) { @@ -1203,7 +1327,7 @@ static int database_create_insert_trigger_internal (cloudsync_context *data, con char sql[2048]; snprintf(sql, sizeof(sql), - "SELECT string_agg('NEW.' || quote_ident(kcu.column_name), ',' ORDER BY kcu.ordinal_position) " + "SELECT string_agg('NEW.' || quote_ident(kcu.column_name) || '::text', ',' ORDER BY kcu.ordinal_position) " "FROM information_schema.table_constraints tc " "JOIN information_schema.key_column_usage kcu " " ON tc.constraint_name = kcu.constraint_name " @@ -1471,7 +1595,7 @@ static int database_create_delete_trigger_internal (cloudsync_context *data, con char sql[2048]; snprintf(sql, sizeof(sql), - "SELECT string_agg('OLD.' || quote_ident(kcu.column_name), ',' ORDER BY kcu.ordinal_position) " + "SELECT string_agg('OLD.' 
|| quote_ident(kcu.column_name) || '::text', ',' ORDER BY kcu.ordinal_position) " "FROM information_schema.table_constraints tc " "JOIN information_schema.key_column_usage kcu " " ON tc.constraint_name = kcu.constraint_name " @@ -1936,9 +2060,13 @@ int databasevm_step0 (pg_stmt_t *stmt) { ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), errmsg("Unable to prepare SQL statement"))); } - + SPI_keepplan(stmt->plan); stmt->plan_is_prepared = true; + + // Save the types used for this plan so we can detect type changes + memcpy(stmt->prepared_types, stmt->types, sizeof(Oid) * stmt->nparams); + stmt->prepared_nparams = stmt->nparams; } PG_CATCH(); { @@ -1975,6 +2103,26 @@ int databasevm_step (dbvm_t *vm) { cloudsync_context *data = stmt->data; cloudsync_reset_error(data); + // If plan is prepared but parameter types have changed since preparation, + // free the old plan and re-prepare with new types. This happens when the same + // prepared statement is reused with different PK encodings (e.g., integer vs text). + if (stmt->plan_is_prepared && stmt->plan) { + bool types_changed = (stmt->nparams != stmt->prepared_nparams); + if (!types_changed) { + for (int i = 0; i < stmt->nparams; i++) { + if (stmt->types[i] != stmt->prepared_types[i]) { + types_changed = true; + break; + } + } + } + if (types_changed) { + SPI_freeplan(stmt->plan); + stmt->plan = NULL; + stmt->plan_is_prepared = false; + } + } + if (!stmt->plan_is_prepared) { int rc = databasevm_step0(stmt); if (rc != DBRES_OK) return rc; @@ -2365,7 +2513,7 @@ Datum database_column_datum (dbvm_t *vm, int index) { return (isnull) ? 
(Datum)0 : d; } -const void *database_column_blob (dbvm_t *vm, int index) { +const void *database_column_blob (dbvm_t *vm, int index, size_t *len) { if (!vm) return NULL; pg_stmt_t *stmt = (pg_stmt_t*)vm; if (!stmt->last_tuptable || !stmt->current_tupdesc) return NULL; @@ -2387,16 +2535,17 @@ const void *database_column_blob (dbvm_t *vm, int index) { return NULL; } - Size len = VARSIZE(ba) - VARHDRSZ; - void *out = palloc(len); + Size blen = VARSIZE(ba) - VARHDRSZ; + void *out = palloc(blen); if (!out) { MemoryContextSwitchTo(old); return NULL; } - memcpy(out, VARDATA(ba), (size_t)len); + memcpy(out, VARDATA(ba), (size_t)blen); MemoryContextSwitchTo(old); + if (len) *len = (size_t)blen; return out; } @@ -2458,15 +2607,26 @@ const char *database_column_text (dbvm_t *vm, int index) { Datum d = get_datum(stmt, index, &isnull, &type); if (isnull) return NULL; - if (type != TEXTOID && type != VARCHAROID && type != BPCHAROID) - return NULL; // or convert via output function if you want - MemoryContext old = MemoryContextSwitchTo(stmt->row_mcxt); - text *t = DatumGetTextP(d); - int len = VARSIZE(t) - VARHDRSZ; - char *out = palloc(len + 1); - memcpy(out, VARDATA(t), len); - out[len] = 0; + char *out = NULL; + + if (type == BYTEAOID) { + bytea *b = DatumGetByteaP(d); + int len = VARSIZE(b) - VARHDRSZ; + out = palloc(len + 1); + memcpy(out, VARDATA(b), len); + out[len] = 0; + } else if (type == TEXTOID || type == VARCHAROID || type == BPCHAROID) { + text *t = DatumGetTextP(d); + int len = VARSIZE(t) - VARHDRSZ; + out = palloc(len + 1); + memcpy(out, VARDATA(t), len); + out[len] = 0; + } else { + MemoryContextSwitchTo(old); + return NULL; + } + MemoryContextSwitchTo(old); return out; @@ -2698,15 +2858,24 @@ void *database_value_dup (dbvalue_t *value) { if (!v) return NULL; pgvalue_t *copy = pgvalue_create(v->datum, v->typeid, v->typmod, v->collation, v->isnull); - if (v->detoasted && v->owned_detoast) { - Size len = VARSIZE_ANY(v->owned_detoast); + + // Deep-copy 
pass-by-reference (varlena) datum data into TopMemoryContext + // so the copy survives SPI_finish() which destroys the caller's SPI context. + bool is_varlena = (v->typeid == BYTEAOID) || pgvalue_is_text_type(v->typeid); + if (is_varlena && !v->isnull) { + void *src = v->owned_detoast ? v->owned_detoast : DatumGetPointer(v->datum); + Size len = VARSIZE_ANY(src); + MemoryContext old = MemoryContextSwitchTo(TopMemoryContext); copy->owned_detoast = palloc(len); - memcpy(copy->owned_detoast, v->owned_detoast, len); + MemoryContextSwitchTo(old); + memcpy(copy->owned_detoast, src, len); copy->datum = PointerGetDatum(copy->owned_detoast); copy->detoasted = true; } if (v->cstring) { + MemoryContext old = MemoryContextSwitchTo(TopMemoryContext); copy->cstring = pstrdup(v->cstring); + MemoryContextSwitchTo(old); copy->owns_cstring = true; } return (void*)copy; @@ -2744,7 +2913,7 @@ static int database_refresh_snapshot (void) { return DBRES_ERROR; } PG_END_TRY(); - + return DBRES_OK; } @@ -2772,6 +2941,7 @@ int database_begin_savepoint (cloudsync_context *data, const char *savepoint_nam int database_commit_savepoint (cloudsync_context *data, const char *savepoint_name) { cloudsync_reset_error(data); + if (GetCurrentTransactionNestLevel() <= 1) return DBRES_OK; int rc = DBRES_OK; MemoryContext oldcontext = CurrentMemoryContext; @@ -2796,6 +2966,7 @@ int database_commit_savepoint (cloudsync_context *data, const char *savepoint_na int database_rollback_savepoint (cloudsync_context *data, const char *savepoint_name) { cloudsync_reset_error(data); + if (GetCurrentTransactionNestLevel() <= 1) return DBRES_OK; int rc = DBRES_OK; MemoryContext oldcontext = CurrentMemoryContext; @@ -2902,14 +3073,4 @@ uint64_t dbmem_size (void *ptr) { return 0; } -// MARK: - CLOUDSYNC CALLBACK - -static cloudsync_payload_apply_callback_t payload_apply_callback = NULL; - -void cloudsync_set_payload_apply_callback(void *db, cloudsync_payload_apply_callback_t callback) { - payload_apply_callback = 
callback; -} - -cloudsync_payload_apply_callback_t cloudsync_get_payload_apply_callback(void *db) { - return payload_apply_callback; -} diff --git a/src/postgresql/pgvalue.c b/src/postgresql/pgvalue.c index 01d9cf6..69fd626 100644 --- a/src/postgresql/pgvalue.c +++ b/src/postgresql/pgvalue.c @@ -169,3 +169,30 @@ pgvalue_t **pgvalues_from_args(FunctionCallInfo fcinfo, int start_arg, int *out_ if (out_count) *out_count = count; return values; } + +void pgvalues_normalize_to_text(pgvalue_t **values, int count) { + // Convert all non-text pgvalues to text representation. + // This ensures PK encoding is consistent regardless of whether the caller + // passes native types (e.g., integer 1) or text representations (e.g., '1'). + // The UPDATE trigger casts all values to ::text, so INSERT trigger and + // SQL functions must do the same for PK encoding consistency. + if (!values) return; + + for (int i = 0; i < count; i++) { + pgvalue_t *v = values[i]; + if (!v || v->isnull) continue; + if (pgvalue_is_text_type(v->typeid)) continue; + + // Convert to text using the type's output function + const char *cstr = database_value_text((dbvalue_t *)v); + if (!cstr) continue; + + // Create a new text datum + text *t = cstring_to_text(cstr); + pgvalue_t *new_v = pgvalue_create(PointerGetDatum(t), TEXTOID, -1, v->collation, false); + if (new_v) { + pgvalue_free(v); + values[i] = new_v; + } + } +} diff --git a/src/postgresql/pgvalue.h b/src/postgresql/pgvalue.h index 51d4c0f..3fbd28b 100644 --- a/src/postgresql/pgvalue.h +++ b/src/postgresql/pgvalue.h @@ -39,5 +39,6 @@ bool pgvalue_is_text_type(Oid typeid); int pgvalue_dbtype(pgvalue_t *v); pgvalue_t **pgvalues_from_array(ArrayType *array, int *out_count); pgvalue_t **pgvalues_from_args(FunctionCallInfo fcinfo, int start_arg, int *out_count); +void pgvalues_normalize_to_text(pgvalue_t **values, int count); #endif // CLOUDSYNC_PGVALUE_H diff --git a/src/postgresql/sql_postgresql.c b/src/postgresql/sql_postgresql.c index 
3af2c8c..db9c2de 100644 --- a/src/postgresql/sql_postgresql.c +++ b/src/postgresql/sql_postgresql.c @@ -28,7 +28,7 @@ const char * const SQL_TABLE_SETTINGS_DELETE_ALL_FOR_TABLE = const char * const SQL_TABLE_SETTINGS_REPLACE = "INSERT INTO cloudsync_table_settings (tbl_name, col_name, key, value) VALUES ($1, $2, $3, $4) " - "ON CONFLICT (tbl_name, key) DO UPDATE SET col_name = EXCLUDED.col_name, value = EXCLUDED.value;"; + "ON CONFLICT (tbl_name, col_name, key) DO UPDATE SET value = EXCLUDED.value;"; const char * const SQL_TABLE_SETTINGS_DELETE_ONE = "DELETE FROM cloudsync_table_settings WHERE (tbl_name=$1 AND col_name=$2 AND key=$3);"; @@ -40,7 +40,7 @@ const char * const SQL_SETTINGS_LOAD_GLOBAL = "SELECT key, value FROM cloudsync_settings;"; const char * const SQL_SETTINGS_LOAD_TABLE = - "SELECT lower(tbl_name), lower(col_name), key, value FROM cloudsync_table_settings ORDER BY tbl_name;"; + "SELECT lower(tbl_name), lower(col_name), key, value FROM cloudsync_table_settings ORDER BY tbl_name, col_name;"; const char * const SQL_CREATE_SETTINGS_TABLE = "CREATE TABLE IF NOT EXISTS cloudsync_settings (key TEXT PRIMARY KEY NOT NULL, value TEXT);" @@ -75,7 +75,7 @@ const char * const SQL_INSERT_SITE_ID_ROWID = "INSERT INTO cloudsync_site_id (id, site_id) VALUES ($1, $2);"; const char * const SQL_CREATE_TABLE_SETTINGS_TABLE = - "CREATE TABLE IF NOT EXISTS cloudsync_table_settings (tbl_name TEXT NOT NULL, col_name TEXT NOT NULL, key TEXT, value TEXT, PRIMARY KEY(tbl_name,key));"; + "CREATE TABLE IF NOT EXISTS cloudsync_table_settings (tbl_name TEXT NOT NULL, col_name TEXT NOT NULL, key TEXT NOT NULL, value TEXT, PRIMARY KEY(tbl_name,col_name,key));"; const char * const SQL_CREATE_SCHEMA_VERSIONS_TABLE = "CREATE TABLE IF NOT EXISTS cloudsync_schema_versions (hash BIGINT PRIMARY KEY, seq INTEGER NOT NULL)"; @@ -408,3 +408,29 @@ const char * const SQL_CLOUDSYNC_SELECT_PKS_NOT_IN_SYNC_FOR_COL_FILTERED = "SELECT 1 FROM %s _cstemp2 " "WHERE _cstemp2.pk = _cstemp1.pk AND 
_cstemp2.col_name = $1" ");"; + +// MARK: Blocks (block-level LWW) + +const char * const SQL_BLOCKS_CREATE_TABLE = + "CREATE TABLE IF NOT EXISTS %s (" + "pk BYTEA NOT NULL, " + "col_name TEXT COLLATE \"C\" NOT NULL, " + "col_value TEXT, " + "PRIMARY KEY (pk, col_name))"; + +const char * const SQL_BLOCKS_UPSERT = + "INSERT INTO %s (pk, col_name, col_value) VALUES ($1, $2, $3) " + "ON CONFLICT (pk, col_name) DO UPDATE SET col_value = EXCLUDED.col_value"; + +const char * const SQL_BLOCKS_SELECT = + "SELECT col_value FROM %s WHERE pk = $1 AND col_name = $2"; + +const char * const SQL_BLOCKS_DELETE = + "DELETE FROM %s WHERE pk = $1 AND col_name = $2"; + +const char * const SQL_BLOCKS_LIST_ALIVE = + "SELECT b.col_value FROM %s b " + "JOIN %s m ON b.pk = m.pk AND b.col_name = m.col_name " + "WHERE b.pk = $1 AND b.col_name LIKE $2 " + "AND m.pk = $3 AND m.col_name LIKE $4 AND m.col_version %% 2 = 1 " + "ORDER BY b.col_name COLLATE \"C\""; diff --git a/src/sql.h b/src/sql.h index 7c14988..dfa394e 100644 --- a/src/sql.h +++ b/src/sql.h @@ -67,4 +67,11 @@ extern const char * const SQL_CLOUDSYNC_SELECT_PKS_NOT_IN_SYNC_FOR_COL; extern const char * const SQL_CLOUDSYNC_SELECT_PKS_NOT_IN_SYNC_FOR_COL_FILTERED; extern const char * const SQL_CHANGES_INSERT_ROW; +// BLOCKS (block-level LWW) +extern const char * const SQL_BLOCKS_CREATE_TABLE; +extern const char * const SQL_BLOCKS_UPSERT; +extern const char * const SQL_BLOCKS_SELECT; +extern const char * const SQL_BLOCKS_DELETE; +extern const char * const SQL_BLOCKS_LIST_ALIVE; + #endif diff --git a/src/sqlite/cloudsync_sqlite.c b/src/sqlite/cloudsync_sqlite.c index 08268b3..ebdd1cc 100644 --- a/src/sqlite/cloudsync_sqlite.c +++ b/src/sqlite/cloudsync_sqlite.c @@ -9,11 +9,12 @@ #include "cloudsync_changes_sqlite.h" #include "../pk.h" #include "../cloudsync.h" +#include "../block.h" #include "../database.h" #include "../dbutils.h" #ifndef CLOUDSYNC_OMIT_NETWORK -#include "../network.h" +#include "../network/network.h" #endif #ifndef 
SQLITE_CORE @@ -139,13 +140,34 @@ void dbsync_set (sqlite3_context *context, int argc, sqlite3_value **argv) { void dbsync_set_column (sqlite3_context *context, int argc, sqlite3_value **argv) { DEBUG_FUNCTION("cloudsync_set_column"); - + const char *tbl = (const char *)database_value_text(argv[0]); const char *col = (const char *)database_value_text(argv[1]); const char *key = (const char *)database_value_text(argv[2]); const char *value = (const char *)database_value_text(argv[3]); - + cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); + + // Handle block column setup: cloudsync_set_column('tbl', 'col', 'algo', 'block') + if (key && value && strcmp(key, "algo") == 0 && strcmp(value, "block") == 0) { + int rc = cloudsync_setup_block_column(data, tbl, col, NULL); + if (rc != DBRES_OK) { + sqlite3_result_error(context, cloudsync_errmsg(data), -1); + } + return; + } + + // Handle delimiter setting: cloudsync_set_column('tbl', 'col', 'delimiter', '\n\n') + if (key && strcmp(key, "delimiter") == 0) { + cloudsync_table_context *table = table_lookup(data, tbl); + if (table) { + int col_idx = table_col_index(table, col); + if (col_idx >= 0 && table_col_algo(table, col_idx) == col_algo_block) { + table_set_col_delimiter(table, col_idx, value); + } + } + } + dbutils_table_settings_set_key_value(data, tbl, col, key, value); } @@ -218,7 +240,7 @@ void dbsync_col_value (sqlite3_context *context, int argc, sqlite3_value **argv) sqlite3_result_null(context); return; } - + // lookup table const char *table_name = (const char *)database_value_text(argv[0]); cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); @@ -227,18 +249,42 @@ void dbsync_col_value (sqlite3_context *context, int argc, sqlite3_value **argv) dbsync_set_error(context, "Unable to retrieve table name %s in clousdsync_colvalue.", table_name); return; } - + + // Block column: if col_name contains \x1F, read from blocks table + if (block_is_block_colname(col_name) && 
table_has_block_cols(table)) { + dbvm_t *bvm = table_block_value_read_stmt(table); + if (!bvm) { + sqlite3_result_null(context); + return; + } + int rc = databasevm_bind_blob(bvm, 1, database_value_blob(argv[2]), database_value_bytes(argv[2])); + if (rc != DBRES_OK) { databasevm_reset(bvm); sqlite3_result_error(context, database_errmsg(data), -1); return; } + rc = databasevm_bind_text(bvm, 2, col_name, -1); + if (rc != DBRES_OK) { databasevm_reset(bvm); sqlite3_result_error(context, database_errmsg(data), -1); return; } + + rc = databasevm_step(bvm); + if (rc == SQLITE_ROW) { + sqlite3_result_value(context, database_column_value(bvm, 0)); + } else if (rc == SQLITE_DONE) { + sqlite3_result_null(context); + } else { + sqlite3_result_error(context, database_errmsg(data), -1); + } + databasevm_reset(bvm); + return; + } + // extract the right col_value vm associated to the column name sqlite3_stmt *vm = table_column_lookup(table, col_name, false, NULL); if (!vm) { sqlite3_result_error(context, "Unable to retrieve column value precompiled statement in clousdsync_colvalue.", -1); return; } - + // bind primary key values int rc = pk_decode_prikey((char *)database_value_blob(argv[2]), (size_t)database_value_bytes(argv[2]), pk_decode_bind_callback, (void *)vm); if (rc < 0) goto cleanup; - + // execute vm rc = databasevm_step(vm); if (rc == SQLITE_DONE) { @@ -249,7 +295,7 @@ void dbsync_col_value (sqlite3_context *context, int argc, sqlite3_value **argv) rc = SQLITE_OK; sqlite3_result_value(context, database_column_value(vm, 0)); } - + cleanup: if (rc != SQLITE_OK) { sqlite3_result_error(context, database_errmsg(data), -1); @@ -260,7 +306,7 @@ void dbsync_col_value (sqlite3_context *context, int argc, sqlite3_value **argv) void dbsync_pk_encode (sqlite3_context *context, int argc, sqlite3_value **argv) { size_t bsize = 0; char *buffer = pk_encode_prikey((dbvalue_t **)argv, argc, NULL, &bsize); - if (!buffer) { + if (!buffer || buffer == PRIKEY_NULL_CONSTRAINT_ERROR) { 
sqlite3_result_null(context); return; } @@ -347,6 +393,10 @@ void dbsync_insert (sqlite3_context *context, int argc, sqlite3_value **argv) { sqlite3_result_error(context, "Not enough memory to encode the primary key(s).", -1); return; } + if (pk == PRIKEY_NULL_CONSTRAINT_ERROR) { + dbsync_set_error(context, "Insert aborted because primary key in table %s contains NULL values.", table_name); + return; + } // compute the next database version for tracking changes int64_t db_version = cloudsync_dbversion_next(data, CLOUDSYNC_VALUE_NOTSET); @@ -368,11 +418,59 @@ void dbsync_insert (sqlite3_context *context, int argc, sqlite3_value **argv) { // process each non-primary key column for insert or update for (int i=0; icount); + if (positions) { + for (int b = 0; b < blocks->count; b++) { + char *block_cn = block_build_colname(col, positions[b]); + if (block_cn) { + rc = local_mark_insert_or_update_meta(table, pk, pklen, block_cn, db_version, cloudsync_bumpseq(data)); + + // Store block value in blocks table + dbvm_t *wvm = table_block_value_write_stmt(table); + if (wvm && rc == SQLITE_OK) { + databasevm_bind_blob(wvm, 1, pk, (int)pklen); + databasevm_bind_text(wvm, 2, block_cn, -1); + databasevm_bind_text(wvm, 3, blocks->entries[b].content, -1); + databasevm_step(wvm); + databasevm_reset(wvm); + } + + cloudsync_memory_free(block_cn); + } + cloudsync_memory_free(positions[b]); + if (rc != SQLITE_OK) break; + } + cloudsync_memory_free(positions); + } + block_list_free(blocks); + } + } + databasevm_reset((dbvm_t *)val_vm); + if (rc == DBRES_ROW || rc == DBRES_DONE) rc = SQLITE_OK; + if (rc != SQLITE_OK) goto cleanup; + } else { + // Regular column: mark as inserted or updated in the metadata + rc = local_mark_insert_or_update_meta(table, pk, pklen, table_colname(table, i), db_version, cloudsync_bumpseq(data)); + if (rc != SQLITE_OK) goto cleanup; + } } - + cleanup: if (rc != SQLITE_OK) sqlite3_result_error(context, database_errmsg(data), -1); // free memory if the primary key 
was dynamically allocated @@ -407,6 +505,11 @@ void dbsync_delete (sqlite3_context *context, int argc, sqlite3_value **argv) { return; } + if (pk == PRIKEY_NULL_CONSTRAINT_ERROR) { + dbsync_set_error(context, "Delete aborted because primary key in table %s contains NULL values.", table_name); + return; + } + // mark the row as deleted by inserting a delete sentinel into the metadata rc = local_mark_delete_meta(table, pk, pklen, db_version, cloudsync_bumpseq(data)); if (rc != SQLITE_OK) goto cleanup; @@ -542,6 +645,11 @@ void dbsync_update_final (sqlite3_context *context) { dbsync_update_payload_free(payload); return; } + if (pk == PRIKEY_NULL_CONSTRAINT_ERROR) { + dbsync_set_error(context, "Update aborted because primary key in table %s contains NULL values.", table_name); + dbsync_update_payload_free(payload); + return; + } if (prikey_changed) { // if the primary key has changed, we need to handle the row differently: @@ -551,6 +659,7 @@ void dbsync_update_final (sqlite3_context *context) { // encode the OLD primary key into a buffer oldpk = pk_encode_prikey((dbvalue_t **)payload->old_values, table_count_pks(table), buffer2, &oldpklen); if (!oldpk) { + // no check here about PRIKEY_NULL_CONSTRAINT_ERROR because by design oldpk cannot contain NULL values if (pk != buffer) cloudsync_memory_free(pk); sqlite3_result_error(context, "Not enough memory to encode the primary key(s).", -1); dbsync_update_payload_free(payload); @@ -581,10 +690,103 @@ void dbsync_update_final (sqlite3_context *context) { int col_index = table_count_pks(table) + i; // Regular columns start after primary keys if (dbutils_value_compare(payload->old_values[col_index], payload->new_values[col_index]) != 0) { - // if a column value has changed, mark it as updated in the metadata - // columns are in cid order - rc = local_mark_insert_or_update_meta(table, pk, pklen, table_colname(table, i), db_version, cloudsync_bumpseq(data)); - if (rc != SQLITE_OK) goto cleanup; + if (table_col_algo(table, i) == 
col_algo_block) { + // Block column: diff old and new text, emit per-block metadata changes + const char *new_text = (const char *)database_value_text(payload->new_values[col_index]); + const char *delim = table_col_delimiter(table, i); + const char *col = table_colname(table, i); + + // Read existing blocks from blocks table + block_list_t *old_blocks = block_list_create_empty(); + if (table_block_list_stmt(table)) { + char *like_pattern = block_build_colname(col, "%"); + if (like_pattern) { + // Query blocks table directly for existing block names and values + char *list_sql = cloudsync_memory_mprintf( + "SELECT col_name, col_value FROM %s WHERE pk = ?1 AND col_name LIKE ?2 ORDER BY col_name", + table_blocks_ref(table)); + if (list_sql) { + dbvm_t *list_vm = NULL; + if (databasevm_prepare(data, list_sql, &list_vm, 0) == DBRES_OK) { + databasevm_bind_blob(list_vm, 1, pk, (int)pklen); + databasevm_bind_text(list_vm, 2, like_pattern, -1); + while (databasevm_step(list_vm) == DBRES_ROW) { + const char *bcn = database_column_text(list_vm, 0); + const char *bval = database_column_text(list_vm, 1); + const char *pos = block_extract_position_id(bcn); + if (pos && old_blocks) { + block_list_add(old_blocks, bval ? bval : "", pos); + } + } + databasevm_finalize(list_vm); + } + cloudsync_memory_free(list_sql); + } + cloudsync_memory_free(like_pattern); + } + } + + // Split new text into parts (NULL text = all blocks removed) + block_list_t *new_blocks = new_text ? 
block_split(new_text, delim) : block_list_create_empty(); + if (new_blocks && old_blocks) { + // Build array of new content strings (NULL when count is 0) + const char **new_parts = NULL; + if (new_blocks->count > 0) { + new_parts = (const char **)cloudsync_memory_alloc( + (uint64_t)(new_blocks->count * sizeof(char *))); + if (new_parts) { + for (int b = 0; b < new_blocks->count; b++) { + new_parts[b] = new_blocks->entries[b].content; + } + } + } + + if (new_parts || new_blocks->count == 0) { + block_diff_t *diff = block_diff(old_blocks->entries, old_blocks->count, + new_parts, new_blocks->count); + if (diff) { + for (int d = 0; d < diff->count; d++) { + block_diff_entry_t *de = &diff->entries[d]; + char *block_cn = block_build_colname(col, de->position_id); + if (!block_cn) continue; + + if (de->type == BLOCK_DIFF_ADDED || de->type == BLOCK_DIFF_MODIFIED) { + rc = local_mark_insert_or_update_meta(table, pk, pklen, block_cn, + db_version, cloudsync_bumpseq(data)); + // Store block value + if (rc == SQLITE_OK && table_block_value_write_stmt(table)) { + dbvm_t *wvm = table_block_value_write_stmt(table); + databasevm_bind_blob(wvm, 1, pk, (int)pklen); + databasevm_bind_text(wvm, 2, block_cn, -1); + databasevm_bind_text(wvm, 3, de->content, -1); + databasevm_step(wvm); + databasevm_reset(wvm); + } + } else if (de->type == BLOCK_DIFF_REMOVED) { + // Mark block as deleted in metadata (even col_version) + rc = local_mark_delete_block_meta(table, pk, pklen, block_cn, + db_version, cloudsync_bumpseq(data)); + // Remove from blocks table + if (rc == SQLITE_OK) { + block_delete_value_external(data, table, pk, pklen, block_cn); + } + } + cloudsync_memory_free(block_cn); + if (rc != SQLITE_OK) break; + } + block_diff_free(diff); + } + if (new_parts) cloudsync_memory_free((void *)new_parts); + } + } + if (new_blocks) block_list_free(new_blocks); + if (old_blocks) block_list_free(old_blocks); + if (rc != SQLITE_OK) goto cleanup; + } else { + // Regular column: mark as updated in 
the metadata (columns are in cid order) + rc = local_mark_insert_or_update_meta(table, pk, pklen, table_colname(table, i), db_version, cloudsync_bumpseq(data)); + if (rc != SQLITE_OK) goto cleanup; + } } } @@ -955,6 +1157,62 @@ int dbsync_register_trigger_aggregate (sqlite3 *db, const char *name, void (*xst return dbsync_register_with_flags(db, name, NULL, xstep, xfinal, nargs, FLAGS_TRIGGER, pzErrMsg, ctx, ctx_free); } +// MARK: - Block-level LWW - + +void dbsync_text_materialize (sqlite3_context *context, int argc, sqlite3_value **argv) { + DEBUG_FUNCTION("cloudsync_text_materialize"); + + // argv[0] -> table name + // argv[1] -> column name + // argv[2..N] -> primary key values + + if (argc < 3) { + sqlite3_result_error(context, "cloudsync_text_materialize requires at least 3 arguments: table, column, pk...", -1); + return; + } + + const char *table_name = (const char *)database_value_text(argv[0]); + const char *col_name = (const char *)database_value_text(argv[1]); + cloudsync_context *data = (cloudsync_context *)sqlite3_user_data(context); + + cloudsync_table_context *table = table_lookup(data, table_name); + if (!table) { + dbsync_set_error(context, "Unable to retrieve table name %s in cloudsync_text_materialize.", table_name); + return; + } + + int col_idx = table_col_index(table, col_name); + if (col_idx < 0 || table_col_algo(table, col_idx) != col_algo_block) { + dbsync_set_error(context, "Column %s in table %s is not configured as block-level.", col_name, table_name); + return; + } + + // Encode primary keys + int npks = table_count_pks(table); + if (argc - 2 != npks) { + sqlite3_result_error(context, "Wrong number of primary key values for cloudsync_text_materialize.", -1); + return; + } + + char buffer[1024]; + size_t pklen = sizeof(buffer); + char *pk = pk_encode_prikey((dbvalue_t **)&argv[2], npks, buffer, &pklen); + if (!pk || pk == PRIKEY_NULL_CONSTRAINT_ERROR) { + sqlite3_result_error(context, "Failed to encode primary key(s).", -1); + return; + } 
+ + // Materialize the column + int rc = block_materialize_column(data, table, pk, (int)pklen, col_name); + if (rc != DBRES_OK) { + sqlite3_result_error(context, cloudsync_errmsg(data), -1); + } else { + sqlite3_result_int(context, 1); + } + + if (pk != buffer) cloudsync_memory_free(pk); +} + // MARK: - Row Filter - void dbsync_set_filter (sqlite3_context *context, int argc, sqlite3_value **argv) { @@ -1028,7 +1286,10 @@ int dbsync_register_functions (sqlite3 *db, char **pzErrMsg) { // init memory debugger (NOOP in production) cloudsync_memory_init(1); - + + // set fractional-indexing allocator to use cloudsync memory + block_init_allocator(); + // init context void *ctx = cloudsync_context_create(db); if (!ctx) { @@ -1154,6 +1415,9 @@ int dbsync_register_functions (sqlite3 *db, char **pzErrMsg) { rc = dbsync_register_function(db, "cloudsync_seq", dbsync_seq, 0, pzErrMsg, ctx, NULL); if (rc != SQLITE_OK) return rc; + rc = dbsync_register_function(db, "cloudsync_text_materialize", dbsync_text_materialize, -1, pzErrMsg, ctx, NULL); + if (rc != SQLITE_OK) return rc; + // NETWORK LAYER #ifndef CLOUDSYNC_OMIT_NETWORK rc = cloudsync_network_register(db, pzErrMsg, ctx); diff --git a/src/sqlite/database_sqlite.c b/src/sqlite/database_sqlite.c index 82433fe..b7864bb 100644 --- a/src/sqlite/database_sqlite.c +++ b/src/sqlite/database_sqlite.c @@ -25,8 +25,6 @@ SQLITE_EXTENSION_INIT3 #endif -#define CLOUDSYNC_PAYLOAD_APPLY_CALLBACK_KEY "cloudsync_payload_apply_callback" - // MARK: - SQL - char *sql_build_drop_table (const char *table_name, char *buffer, int bsize, bool is_meta) { @@ -151,6 +149,126 @@ char *sql_build_upsert_pk_and_col (cloudsync_context *data, const char *table_na return (rc == DBRES_OK) ? 
query : NULL; } +char *sql_build_upsert_pk_and_multi_cols (cloudsync_context *data, const char *table_name, const char **colnames, int ncolnames, const char *schema) { + UNUSED_PARAMETER(schema); + if (ncolnames <= 0 || !colnames) return NULL; + + // Get PK column names via pragma_table_info (same approach as database_pk_names) + char **pk_names = NULL; + int npks = 0; + int rc = database_pk_names(data, table_name, &pk_names, &npks); + if (rc != DBRES_OK || npks <= 0 || !pk_names) return NULL; + + // Build column list: "pk1","pk2","col_a","col_b" + char *col_list = cloudsync_memory_mprintf("\"%w\"", pk_names[0]); + if (!col_list) goto fail; + for (int i = 1; i < npks; i++) { + char *prev = col_list; + col_list = cloudsync_memory_mprintf("%s,\"%w\"", prev, pk_names[i]); + cloudsync_memory_free(prev); + if (!col_list) goto fail; + } + for (int i = 0; i < ncolnames; i++) { + char *prev = col_list; + col_list = cloudsync_memory_mprintf("%s,\"%w\"", prev, colnames[i]); + cloudsync_memory_free(prev); + if (!col_list) goto fail; + } + + // Build bind list: ?,?,?,? 
+ int total = npks + ncolnames; + char *binds = (char *)cloudsync_memory_alloc(total * 2); + if (!binds) { cloudsync_memory_free(col_list); goto fail; } + int pos = 0; + for (int i = 0; i < total; i++) { + if (i > 0) binds[pos++] = ','; + binds[pos++] = '?'; + } + binds[pos] = '\0'; + + // Build excluded set: "col_a"=EXCLUDED."col_a","col_b"=EXCLUDED."col_b" + char *excl = cloudsync_memory_mprintf("\"%w\"=EXCLUDED.\"%w\"", colnames[0], colnames[0]); + if (!excl) { cloudsync_memory_free(col_list); cloudsync_memory_free(binds); goto fail; } + for (int i = 1; i < ncolnames; i++) { + char *prev = excl; + excl = cloudsync_memory_mprintf("%s,\"%w\"=EXCLUDED.\"%w\"", prev, colnames[i], colnames[i]); + cloudsync_memory_free(prev); + if (!excl) { cloudsync_memory_free(col_list); cloudsync_memory_free(binds); goto fail; } + } + + // Assemble final SQL + char *sql = cloudsync_memory_mprintf( + "INSERT INTO \"%w\" (%s) VALUES (%s) ON CONFLICT DO UPDATE SET %s;", + table_name, col_list, binds, excl + ); + + cloudsync_memory_free(col_list); + cloudsync_memory_free(binds); + cloudsync_memory_free(excl); + for (int i = 0; i < npks; i++) cloudsync_memory_free(pk_names[i]); + cloudsync_memory_free(pk_names); + return sql; + +fail: + if (pk_names) { + for (int i = 0; i < npks; i++) cloudsync_memory_free(pk_names[i]); + cloudsync_memory_free(pk_names); + } + return NULL; +} + +char *sql_build_update_pk_and_multi_cols (cloudsync_context *data, const char *table_name, const char **colnames, int ncolnames, const char *schema) { + UNUSED_PARAMETER(schema); + if (ncolnames <= 0 || !colnames) return NULL; + + // Get PK column names + char **pk_names = NULL; + int npks = 0; + int rc = database_pk_names(data, table_name, &pk_names, &npks); + if (rc != DBRES_OK || npks <= 0 || !pk_names) return NULL; + + // Build SET clause: "col_a"=?npks+1,"col_b"=?npks+2 + // Uses numbered parameters to match merge_flush_pending bind order: + // positions 1..npks are PKs, npks+1..npks+ncolnames are column 
values. + char *set_clause = cloudsync_memory_mprintf("\"%w\"=?%d", colnames[0], npks + 1); + if (!set_clause) goto fail; + for (int i = 1; i < ncolnames; i++) { + char *prev = set_clause; + set_clause = cloudsync_memory_mprintf("%s,\"%w\"=?%d", prev, colnames[i], npks + 1 + i); + cloudsync_memory_free(prev); + if (!set_clause) goto fail; + } + + // Build WHERE clause: "pk1"=?1 AND "pk2"=?2 + char *where_clause = cloudsync_memory_mprintf("\"%w\"=?%d", pk_names[0], 1); + if (!where_clause) { cloudsync_memory_free(set_clause); goto fail; } + for (int i = 1; i < npks; i++) { + char *prev = where_clause; + where_clause = cloudsync_memory_mprintf("%s AND \"%w\"=?%d", prev, pk_names[i], 1 + i); + cloudsync_memory_free(prev); + if (!where_clause) { cloudsync_memory_free(set_clause); goto fail; } + } + + // Assemble: UPDATE "table" SET ... WHERE ... + char *sql = cloudsync_memory_mprintf( + "UPDATE \"%w\" SET %s WHERE %s;", + table_name, set_clause, where_clause + ); + + cloudsync_memory_free(set_clause); + cloudsync_memory_free(where_clause); + for (int i = 0; i < npks; i++) cloudsync_memory_free(pk_names[i]); + cloudsync_memory_free(pk_names); + return sql; + +fail: + if (pk_names) { + for (int i = 0; i < npks; i++) cloudsync_memory_free(pk_names[i]); + cloudsync_memory_free(pk_names); + } + return NULL; +} + char *sql_build_select_cols_by_pk (cloudsync_context *data, const char *table_name, const char *colname, const char *schema) { UNUSED_PARAMETER(schema); char *colnamequote = "\""; @@ -200,6 +318,11 @@ char *database_build_base_ref (const char *schema, const char *table_name) { return cloudsync_string_dup(table_name); } +char *database_build_blocks_ref (const char *schema, const char *table_name) { + // schema unused in SQLite + return cloudsync_memory_mprintf("%s_cloudsync_blocks", table_name); +} + // SQLite version: schema parameter unused (SQLite has no schemas). 
char *sql_build_delete_cols_not_in_schema_query (const char *schema, const char *table_name, const char *meta_ref, const char *pkcol) { UNUSED_PARAMETER(schema); @@ -322,21 +445,20 @@ static int database_select1_value (cloudsync_context *data, const char *sql, cha return rc; } -static int database_select3_values (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2, int64_t *value3) { +static int database_select2_values (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2) { sqlite3 *db = (sqlite3 *)cloudsync_db(data); // init values and sanity check expected_type *value = NULL; *value2 = 0; - *value3 = 0; *len = 0; sqlite3_stmt *vm = NULL; int rc = sqlite3_prepare_v2((sqlite3 *)db, sql, -1, &vm, NULL); if (rc != SQLITE_OK) goto cleanup_select; - // ensure at least one column - if (sqlite3_column_count(vm) < 3) {rc = SQLITE_MISMATCH; goto cleanup_select;} + // ensure column count + if (sqlite3_column_count(vm) < 2) {rc = SQLITE_MISMATCH; goto cleanup_select;} rc = sqlite3_step(vm); if (rc == SQLITE_DONE) {rc = SQLITE_OK; goto cleanup_select;} // no rows OK @@ -345,7 +467,6 @@ static int database_select3_values (cloudsync_context *data, const char *sql, ch // sanity check column types if (sqlite3_column_type(vm, 0) != SQLITE_BLOB) {rc = SQLITE_MISMATCH; goto cleanup_select;} if (sqlite3_column_type(vm, 1) != SQLITE_INTEGER) {rc = SQLITE_MISMATCH; goto cleanup_select;} - if (sqlite3_column_type(vm, 2) != SQLITE_INTEGER) {rc = SQLITE_MISMATCH; goto cleanup_select;} // 1st column is BLOB const void *blob = (const void *)sqlite3_column_blob(vm, 0); @@ -359,9 +480,8 @@ static int database_select3_values (cloudsync_context *data, const char *sql, ch *len = blob_len; } - // 2nd and 3rd columns are INTEGERS + // 2nd column is INTEGER *value2 = (int64_t)sqlite3_column_int64(vm, 1); - *value3 = (int64_t)sqlite3_column_int64(vm, 2); rc = SQLITE_OK; @@ -456,8 +576,8 @@ int database_select_blob (cloudsync_context 
*data, const char *sql, char **value return database_select1_value(data, sql, value, len, DBTYPE_BLOB); } -int database_select_blob_2int (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2, int64_t *value3) { - return database_select3_values(data, sql, value, len, value2, value3); +int database_select_blob_int (cloudsync_context *data, const char *sql, char **value, int64_t *len, int64_t *value2) { + return database_select2_values(data, sql, value, len, value2); } const char *database_errmsg (cloudsync_context *data) { @@ -1174,7 +1294,8 @@ void *database_value_dup (dbvalue_t *value) { // MARK: - COLUMN - -const void *database_column_blob (dbvm_t *vm, int index) { +const void *database_column_blob (dbvm_t *vm, int index, size_t *len) { + if (len) *len = sqlite3_column_bytes((sqlite3_stmt *)vm, index); return sqlite3_column_blob((sqlite3_stmt *)vm, index); } @@ -1263,14 +1384,4 @@ uint64_t dbmem_size (void *ptr) { return (uint64_t)sqlite3_msize(ptr); } -// MARK: - Used to implement Server Side RLS - -cloudsync_payload_apply_callback_t cloudsync_get_payload_apply_callback(void *db) { - return (sqlite3_libversion_number() >= 3044000) ? 
sqlite3_get_clientdata((sqlite3 *)db, CLOUDSYNC_PAYLOAD_APPLY_CALLBACK_KEY) : NULL; -} - -void cloudsync_set_payload_apply_callback(void *db, cloudsync_payload_apply_callback_t callback) { - if (sqlite3_libversion_number() >= 3044000) { - sqlite3_set_clientdata((sqlite3 *)db, CLOUDSYNC_PAYLOAD_APPLY_CALLBACK_KEY, (void*)callback, NULL); - } -} diff --git a/src/sqlite/sql_sqlite.c b/src/sqlite/sql_sqlite.c index 435111f..236a67b 100644 --- a/src/sqlite/sql_sqlite.c +++ b/src/sqlite/sql_sqlite.c @@ -37,7 +37,7 @@ const char * const SQL_SETTINGS_LOAD_GLOBAL = "SELECT key, value FROM cloudsync_settings;"; const char * const SQL_SETTINGS_LOAD_TABLE = - "SELECT lower(tbl_name), lower(col_name), key, value FROM cloudsync_table_settings ORDER BY tbl_name;"; + "SELECT lower(tbl_name), lower(col_name), key, value FROM cloudsync_table_settings ORDER BY tbl_name, col_name;"; const char * const SQL_CREATE_SETTINGS_TABLE = "CREATE TABLE IF NOT EXISTS cloudsync_settings (key TEXT PRIMARY KEY NOT NULL COLLATE NOCASE, value TEXT);"; @@ -276,3 +276,28 @@ const char * const SQL_CLOUDSYNC_SELECT_PKS_NOT_IN_SYNC_FOR_COL_FILTERED = const char * const SQL_CHANGES_INSERT_ROW = "INSERT INTO cloudsync_changes(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) " "VALUES (?,?,?,?,?,?,?,?,?);"; + +// MARK: Blocks (block-level LWW) + +const char * const SQL_BLOCKS_CREATE_TABLE = + "CREATE TABLE IF NOT EXISTS %s (" + "pk BLOB NOT NULL, " + "col_name TEXT NOT NULL, " + "col_value BLOB, " + "PRIMARY KEY (pk, col_name)) WITHOUT ROWID"; + +const char * const SQL_BLOCKS_UPSERT = + "INSERT OR REPLACE INTO %s (pk, col_name, col_value) VALUES (?1, ?2, ?3)"; + +const char * const SQL_BLOCKS_SELECT = + "SELECT col_value FROM %s WHERE pk = ?1 AND col_name = ?2"; + +const char * const SQL_BLOCKS_DELETE = + "DELETE FROM %s WHERE pk = ?1 AND col_name = ?2"; + +const char * const SQL_BLOCKS_LIST_ALIVE = + "SELECT b.col_value FROM %s b " + "JOIN %s m ON b.pk = m.pk AND b.col_name = 
m.col_name " + "WHERE b.pk = ?1 AND b.col_name LIKE ?2 " + "AND m.pk = ?3 AND m.col_name LIKE ?4 AND m.col_version %% 2 = 1 " + "ORDER BY b.col_name"; diff --git a/test/integration.c b/test/integration.c index 75a65e5..fb8334b 100644 --- a/test/integration.c +++ b/test/integration.c @@ -41,7 +41,7 @@ #define TERMINATE if (db) { db_exec(db, "SELECT cloudsync_terminate();"); } #define ABORT_TEST abort_test: ERROR_MSG TERMINATE if (db) sqlite3_close(db); return rc; -typedef enum { PRINT, NOPRINT, INTGR, GT0 } expected_type; +typedef enum { PRINT, NOPRINT, INTGR, GT0, STR } expected_type; typedef struct { expected_type type; @@ -87,6 +87,15 @@ static int callback(void *data, int argc, char **argv, char **names) { } else goto multiple_columns; break; + case STR: + if(argc == 1){ + if(!argv[0] || strcmp(argv[0], expect->value.s) != 0){ + printf("Error: expected from %s: \"%s\", got \"%s\"\n", names[0], expect->value.s, argv[0] ? argv[0] : "NULL"); + return SQLITE_ERROR; + } + } else goto multiple_columns; + break; + default: printf("Error: unknown expect type\n"); return SQLITE_ERROR; @@ -136,6 +145,16 @@ int db_expect_gt0 (sqlite3 *db, const char *sql) { return rc; } +int db_expect_str (sqlite3 *db, const char *sql, const char *expect) { + expected_t data; + data.type = STR; + data.value.s = expect; + + int rc = sqlite3_exec(db, sql, callback, &data, NULL); + if (rc != SQLITE_OK) printf("Error while executing %s: %s\n", sql, sqlite3_errmsg(db)); + return rc; +} + int open_load_ext(const char *db_path, sqlite3 **out_db) { sqlite3 *db = NULL; int rc = sqlite3_open(db_path, &db); @@ -205,15 +224,20 @@ int test_init (const char *db_path, int init) { rc = db_exec(db, "SELECT cloudsync_init('activities');"); RCHECK rc = db_exec(db, "SELECT cloudsync_init('workouts');"); RCHECK - // init network with connection string + apikey - char network_init[512]; + // init network with JSON connection string + char network_init[1024]; const char* conn_str = getenv("CONNECTION_STRING"); 
const char* apikey = getenv("APIKEY"); - if (!conn_str || !apikey) { - fprintf(stderr, "Error: CONNECTION_STRING or APIKEY not set.\n"); + const char* project_id = getenv("PROJECT_ID"); + const char* org_id = getenv("ORGANIZATION_ID"); + const char* database = getenv("DATABASE"); + if (!conn_str || !apikey || !project_id || !org_id || !database) { + fprintf(stderr, "Error: CONNECTION_STRING, APIKEY, PROJECT_ID, ORGANIZATION_ID, or DATABASE not set.\n"); exit(1); } - snprintf(network_init, sizeof(network_init), "SELECT cloudsync_network_init('%s?apikey=%s');", conn_str, apikey); + snprintf(network_init, sizeof(network_init), + "SELECT cloudsync_network_init('{\"address\":\"%s\",\"database\":\"%s\",\"projectID\":\"%s\",\"organizationID\":\"%s\",\"apikey\":\"%s\"}');", + conn_str, database, project_id, org_id, apikey); rc = db_exec(db, network_init); RCHECK rc = db_expect_int(db, "SELECT COUNT(*) as count FROM activities;", 0); RCHECK @@ -224,7 +248,7 @@ int test_init (const char *db_path, int init) { snprintf(sql, sizeof(sql), "INSERT INTO users (id, name) VALUES ('%s', '%s');", value, value); rc = db_exec(db, sql); RCHECK rc = db_expect_int(db, "SELECT COUNT(*) as count FROM users;", 1); RCHECK - rc = db_expect_gt0(db, "SELECT cloudsync_network_sync(250,10);"); RCHECK + rc = db_expect_gt0(db, "SELECT cloudsync_network_sync(250,10) ->> '$.receive.rows';"); RCHECK rc = db_expect_gt0(db, "SELECT COUNT(*) as count FROM users;"); RCHECK rc = db_expect_gt0(db, "SELECT COUNT(*) as count FROM activities;"); RCHECK rc = db_expect_int(db, "SELECT COUNT(*) as count FROM workouts;", 0); RCHECK @@ -275,15 +299,20 @@ int test_enable_disable(const char *db_path) { snprintf(sql, sizeof(sql), "INSERT INTO users (id, name) VALUES ('%s-should-sync', '%s-should-sync');", value, value); rc = db_exec(db, sql); RCHECK - // init network with connection string + apikey - char network_init[512]; + // init network with JSON connection string + char network_init[1024]; const char* conn_str = 
getenv("CONNECTION_STRING"); const char* apikey = getenv("APIKEY"); - if (!conn_str || !apikey) { - fprintf(stderr, "Error: CONNECTION_STRING or APIKEY not set.\n"); + const char* project_id = getenv("PROJECT_ID"); + const char* org_id = getenv("ORGANIZATION_ID"); + const char* database = getenv("DATABASE"); + if (!conn_str || !apikey || !project_id || !org_id || !database) { + fprintf(stderr, "Error: CONNECTION_STRING, APIKEY, PROJECT_ID, ORGANIZATION_ID, or DATABASE not set.\n"); exit(1); } - snprintf(network_init, sizeof(network_init), "SELECT cloudsync_network_init('%s?apikey=%s');", conn_str, apikey); + snprintf(network_init, sizeof(network_init), + "SELECT cloudsync_network_init('{\"address\":\"%s\",\"database\":\"%s\",\"projectID\":\"%s\",\"organizationID\":\"%s\",\"apikey\":\"%s\"}');", + conn_str, database, project_id, org_id, apikey); rc = db_exec(db, network_init); RCHECK rc = db_exec(db, "SELECT cloudsync_network_send_changes();"); RCHECK @@ -305,7 +334,7 @@ int test_enable_disable(const char *db_path) { // init network with connection string + apikey rc = db_exec(db2, network_init); RCHECK - rc = db_expect_gt0(db2, "SELECT cloudsync_network_sync(250,10);"); RCHECK + rc = db_expect_gt0(db2, "SELECT cloudsync_network_sync(250,10) ->> '$.receive.rows';"); RCHECK snprintf(sql, sizeof(sql), "SELECT COUNT(*) FROM users WHERE name='%s';", value); rc = db_expect_int(db2, sql, 0); RCHECK diff --git a/test/postgresql/27_rls_batch_merge.sql b/test/postgresql/27_rls_batch_merge.sql new file mode 100644 index 0000000..2ab51bf --- /dev/null +++ b/test/postgresql/27_rls_batch_merge.sql @@ -0,0 +1,356 @@ +-- 'RLS batch merge test' +-- Verifies that the deferred column-batch merge produces complete rows +-- that work correctly with PostgreSQL Row Level Security policies. +-- +-- Tests 1-3: cloudsync_payload_apply runs as superuser (service-role pattern). +-- RLS is enforced at the query layer when users access data. 
+-- +-- Tests 4-6: cloudsync_payload_apply runs as non-superuser (authenticated-role +-- pattern). RLS is enforced during the write itself. + +\set testid '27' +\ir helper_test_init.sql + +\set USER1 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa' +\set USER2 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb' + +-- ============================================================ +-- DB A: source database (no RLS) +-- ============================================================ +\connect postgres +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_test_27_a; +CREATE DATABASE cloudsync_test_27_a; + +\connect cloudsync_test_27_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; + +CREATE TABLE documents ( + id TEXT PRIMARY KEY NOT NULL, + user_id UUID, + title TEXT, + content TEXT +); +SELECT cloudsync_init('documents') AS _init_site_id_a \gset + +-- ============================================================ +-- DB B: target database (with RLS) +-- ============================================================ +\connect postgres +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_test_27_b; +CREATE DATABASE cloudsync_test_27_b; + +-- Create non-superuser role (ignore error if it already exists) +DO $$ BEGIN + IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = 'test_rls_user') THEN + CREATE ROLE test_rls_user LOGIN; + END IF; +END $$; + +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; + +CREATE TABLE documents ( + id TEXT PRIMARY KEY NOT NULL, + user_id UUID, + title TEXT, + content TEXT +); +SELECT cloudsync_init('documents') AS _init_site_id_b \gset + +-- Auth mock: auth.uid() reads from session variable app.current_user_id +CREATE SCHEMA IF NOT EXISTS auth; +CREATE OR REPLACE FUNCTION auth.uid() RETURNS UUID + LANGUAGE sql STABLE +AS $$ SELECT NULLIF(current_setting('app.current_user_id', true), '')::UUID; $$; + +-- Enable RLS +ALTER TABLE documents ENABLE ROW LEVEL SECURITY; + 
+CREATE POLICY "select_own" ON documents FOR SELECT + USING (auth.uid() = user_id); +CREATE POLICY "insert_own" ON documents FOR INSERT + WITH CHECK (auth.uid() = user_id); +CREATE POLICY "update_own" ON documents FOR UPDATE + USING (auth.uid() = user_id) + WITH CHECK (auth.uid() = user_id); +CREATE POLICY "delete_own" ON documents FOR DELETE + USING (auth.uid() = user_id); + +-- Grant permissions to test_rls_user +GRANT USAGE ON SCHEMA public TO test_rls_user; +GRANT ALL ON ALL TABLES IN SCHEMA public TO test_rls_user; +GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO test_rls_user; +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO test_rls_user; +GRANT USAGE ON SCHEMA auth TO test_rls_user; +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA auth TO test_rls_user; + +-- ============================================================ +-- Test 1: Batch merge produces complete row — user1 doc synced +-- ============================================================ +\connect cloudsync_test_27_a +\ir helper_psql_conn_setup.sql +INSERT INTO documents VALUES ('doc1', :'USER1'::UUID, 'Title 1', 'Content 1'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_1 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +-- Save high-water mark so subsequent encodes only pick up new changes +SELECT COALESCE(max(db_version), 0) AS max_dbv_1 FROM cloudsync_changes \gset + +-- Apply as superuser (service-role pattern) +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_hex_1', 'hex')) AS apply_1 \gset + +-- 1 row × 3 non-PK columns = 3 column-change entries +SELECT (:apply_1::int = 3) AS apply_1_ok \gset +\if :apply_1_ok +\echo [PASS] (:testid) RLS: apply returned :apply_1 +\else +\echo [FAIL] (:testid) RLS: apply returned :apply_1 (expected 3) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify complete row written (all columns 
present) +SELECT COUNT(*) AS doc1_count FROM documents WHERE id = 'doc1' AND title = 'Title 1' AND content = 'Content 1' AND user_id = :'USER1'::UUID \gset +SELECT (:doc1_count::int = 1) AS test1_ok \gset +\if :test1_ok +\echo [PASS] (:testid) RLS: batch merge writes complete row +\else +\echo [FAIL] (:testid) RLS: batch merge writes complete row — got :doc1_count matching rows +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: Sync user2 doc, then verify RLS hides it from user1 +-- ============================================================ +\connect cloudsync_test_27_a +\ir helper_psql_conn_setup.sql +INSERT INTO documents VALUES ('doc2', :'USER2'::UUID, 'Title 2', 'Content 2'); + +-- Encode only changes newer than test 1 (doc2 only) +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_2 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_1 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_2 FROM cloudsync_changes \gset + +-- Apply as superuser +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_hex_2', 'hex')) AS apply_2 \gset + +-- 1 row × 3 non-PK columns = 3 entries +SELECT (:apply_2::int = 3) AS apply_2_ok \gset +\if :apply_2_ok +\echo [PASS] (:testid) RLS: apply returned :apply_2 +\else +\echo [FAIL] (:testid) RLS: apply returned :apply_2 (expected 3) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify doc2 exists (superuser sees all) +SELECT COUNT(*) AS doc2_exists FROM documents WHERE id = 'doc2' \gset + +-- Now check as user1: RLS should hide doc2 (owned by user2) +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT COUNT(*) AS doc2_visible FROM documents WHERE id = 'doc2' \gset +RESET ROLE; + +SELECT (:doc2_exists::int = 1 AND :doc2_visible::int = 0) AS test2_ok \gset +\if 
:test2_ok +\echo [PASS] (:testid) RLS: user2 doc synced but hidden from user1 +\else +\echo [FAIL] (:testid) RLS: user2 doc synced but hidden from user1 — exists=:doc2_exists visible=:doc2_visible +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 3: Update doc1, verify user1 sees update via RLS +-- ============================================================ +\connect cloudsync_test_27_a +\ir helper_psql_conn_setup.sql +UPDATE documents SET title = 'Title 1 Updated' WHERE id = 'doc1'; + +-- Encode only changes newer than test 2 (doc1 update only) +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_3 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_2 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_3 FROM cloudsync_changes \gset + +-- Apply as superuser +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_hex_3', 'hex')) AS apply_3 \gset + +-- 1 row × 1 changed column (title) = 1 entry +SELECT (:apply_3::int = 1) AS apply_3_ok \gset +\if :apply_3_ok +\echo [PASS] (:testid) RLS: apply returned :apply_3 +\else +\echo [FAIL] (:testid) RLS: apply returned :apply_3 (expected 1) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify update applied (superuser check) +SELECT COUNT(*) AS doc1_updated FROM documents WHERE id = 'doc1' AND title = 'Title 1 Updated' \gset + +-- Verify user1 can see the updated row via RLS +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT COUNT(*) AS doc1_visible FROM documents WHERE id = 'doc1' AND title = 'Title 1 Updated' \gset +RESET ROLE; + +SELECT (:doc1_updated::int = 1 AND :doc1_visible::int = 1) AS test3_ok \gset +\if :test3_ok +\echo [PASS] (:testid) RLS: update synced and visible to owner +\else +\echo [FAIL] (:testid) RLS: update synced and visible to 
owner — updated=:doc1_updated visible=:doc1_visible +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 4: Authenticated insert allowed (own row) +-- cloudsync_payload_apply as non-superuser with matching user_id +-- ============================================================ +\connect cloudsync_test_27_a +\ir helper_psql_conn_setup.sql +INSERT INTO documents VALUES ('doc3', :'USER1'::UUID, 'Title 3', 'Content 3'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_4 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_3 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_4 FROM cloudsync_changes \gset + +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_4', 'hex')) AS apply_4 \gset +RESET ROLE; + +-- 1 row × 3 non-PK columns = 3 entries +SELECT (:apply_4::int = 3) AS apply_4_ok \gset +\if :apply_4_ok +\echo [PASS] (:testid) RLS auth: apply returned :apply_4 +\else +\echo [FAIL] (:testid) RLS auth: apply returned :apply_4 (expected 3) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify doc3 exists with all columns correct +SELECT COUNT(*) AS doc3_count FROM documents WHERE id = 'doc3' AND title = 'Title 3' AND content = 'Content 3' AND user_id = :'USER1'::UUID \gset +SELECT (:doc3_count::int = 1) AS test4_ok \gset +\if :test4_ok +\echo [PASS] (:testid) RLS auth: insert own row allowed +\else +\echo [FAIL] (:testid) RLS auth: insert own row allowed — got :doc3_count matching rows +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 5: Authenticated insert denied (other user's row) +-- cloudsync_payload_apply as non-superuser with mismatched user_id +-- 
============================================================ +\connect cloudsync_test_27_a +\ir helper_psql_conn_setup.sql +INSERT INTO documents VALUES ('doc4', :'USER2'::UUID, 'Title 4', 'Content 4'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_5 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_4 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_5 FROM cloudsync_changes \gset + +-- Apply as test_rls_user with USER1 identity — should be denied (doc4 owned by USER2) +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_5', 'hex')) AS apply_5 \gset + +-- Reconnect for clean state after expected RLS denial +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql + +-- 1 row × 3 non-PK columns = 3 entries (returned even if denied) +SELECT (:apply_5::int = 3) AS apply_5_ok \gset +\if :apply_5_ok +\echo [PASS] (:testid) RLS auth: denied apply returned :apply_5 +\else +\echo [FAIL] (:testid) RLS auth: denied apply returned :apply_5 (expected 3) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify doc4 does NOT exist (superuser check) +SELECT COUNT(*) AS doc4_count FROM documents WHERE id = 'doc4' \gset +SELECT (:doc4_count::int = 0) AS test5_ok \gset +\if :test5_ok +\echo [PASS] (:testid) RLS auth: insert other user row denied +\else +\echo [FAIL] (:testid) RLS auth: insert other user row denied — got :doc4_count rows (expected 0) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 6: Authenticated update allowed (own row) +-- cloudsync_payload_apply as non-superuser updating own row +-- ============================================================ +\connect cloudsync_test_27_a +\ir helper_psql_conn_setup.sql +UPDATE documents SET 
title = 'Title 3 Updated' WHERE id = 'doc3'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_6 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_5 \gset + +\connect cloudsync_test_27_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_6', 'hex')) AS apply_6 \gset +RESET ROLE; + +-- 1 row × 1 changed column (title) = 1 entry +SELECT (:apply_6::int = 1) AS apply_6_ok \gset +\if :apply_6_ok +\echo [PASS] (:testid) RLS auth: apply returned :apply_6 +\else +\echo [FAIL] (:testid) RLS auth: apply returned :apply_6 (expected 1) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify doc3 title was updated +SELECT COUNT(*) AS doc3_updated FROM documents WHERE id = 'doc3' AND title = 'Title 3 Updated' \gset +SELECT (:doc3_updated::int = 1) AS test6_ok \gset +\if :test6_ok +\echo [PASS] (:testid) RLS auth: update own row allowed +\else +\echo [FAIL] (:testid) RLS auth: update own row allowed — got :doc3_updated matching rows +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Cleanup +-- ============================================================ +\ir helper_test_cleanup.sql +\if :should_cleanup +DROP DATABASE IF EXISTS cloudsync_test_27_a; +DROP DATABASE IF EXISTS cloudsync_test_27_b; +DROP ROLE IF EXISTS test_rls_user; +\else +\echo [INFO] Cleanup skipped: test databases and role test_rls_user preserved for inspection
+\endif diff --git a/test/postgresql/28_db_version_tracking.sql b/test/postgresql/28_db_version_tracking.sql new file mode 100644 index 0000000..25255ee --- /dev/null +++ b/test/postgresql/28_db_version_tracking.sql @@ -0,0 +1,275 @@ +-- Test db_version/seq tracking in cloudsync_changes after payload apply +-- PostgreSQL equivalent of SQLite unit tests: +-- "Merge Test db_version 1" (do_test_merge_check_db_version) +-- "Merge Test db_version 2" (do_test_merge_check_db_version_2) + +\set testid '28' +\ir helper_test_init.sql + +-- ============================================================ +-- Setup: create databases A and B with the todo table +-- ============================================================ +\connect postgres +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_test_28_a; +DROP DATABASE IF EXISTS cloudsync_test_28_b; +CREATE DATABASE cloudsync_test_28_a; +CREATE DATABASE cloudsync_test_28_b; + +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +CREATE TABLE todo (id TEXT PRIMARY KEY NOT NULL, title TEXT, status TEXT); +SELECT cloudsync_init('todo', 'CLS', true) AS _init_a \gset + +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +CREATE TABLE todo (id TEXT PRIMARY KEY NOT NULL, title TEXT, status TEXT); +SELECT cloudsync_init('todo', 'CLS', true) AS _init_b \gset + +-- ============================================================ +-- Test 1: One-way merge (A -> B), mixed insert patterns +-- Mirrors do_test_merge_check_db_version from test/unit.c +-- ============================================================ + +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql + +-- Autocommit insert (db_version 1) +INSERT INTO todo VALUES ('ID1', 'Buy groceries', 'in_progress1'); + +-- Multi-row insert (db_version 2 — single statement) +INSERT INTO todo VALUES ('ID2', 'Buy bananas', 'in_progress2'), ('ID3', 'Buy vegetables', 
'in_progress3'); + +-- Autocommit insert (db_version 3) +INSERT INTO todo VALUES ('ID4', 'Buy apples', 'in_progress4'); + +-- Transaction with 3 inserts (db_version 4 — one transaction) +BEGIN; +INSERT INTO todo VALUES ('ID5', 'Buy oranges', 'in_progress5'); +INSERT INTO todo VALUES ('ID6', 'Buy lemons', 'in_progress6'); +INSERT INTO todo VALUES ('ID7', 'Buy pizza', 'in_progress7'); +COMMIT; + +-- Encode payload +SELECT CASE WHEN payload IS NULL OR octet_length(payload) = 0 + THEN '' + ELSE '\x' || encode(payload, 'hex') + END AS payload_a_t1, + (payload IS NOT NULL AND octet_length(payload) > 0) AS payload_a_t1_ok +FROM ( + SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) AS payload + FROM cloudsync_changes + WHERE site_id = cloudsync_siteid() +) AS p \gset + +-- Apply to B +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +\if :payload_a_t1_ok +SELECT cloudsync_payload_apply(decode(substr(:'payload_a_t1', 3), 'hex')) AS _apply_t1 \gset +\endif + +-- Verify data matches +SELECT md5(COALESCE(string_agg(id || ':' || COALESCE(title, '') || ':' || COALESCE(status, ''), ',' ORDER BY id), '')) AS hash_b_t1 +FROM todo \gset + +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql +SELECT md5(COALESCE(string_agg(id || ':' || COALESCE(title, '') || ':' || COALESCE(status, ''), ',' ORDER BY id), '')) AS hash_a_t1 +FROM todo \gset + +SELECT (:'hash_a_t1' = :'hash_b_t1') AS t1_data_ok \gset +\if :t1_data_ok +\echo [PASS] (:testid) db_version test 1: data roundtrip matches +\else +\echo [FAIL] (:testid) db_version test 1: data roundtrip mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify no repeated (db_version, seq) tuples on B +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +SELECT COUNT(*) AS dup_count_b_t1 +FROM ( + SELECT db_version, seq, COUNT(*) AS cnt + FROM cloudsync_changes + GROUP BY db_version, seq + HAVING COUNT(*) > 1 +) AS dups \gset + +SELECT 
(:dup_count_b_t1::int = 0) AS t1_no_dups_b \gset +\if :t1_no_dups_b +\echo [PASS] (:testid) db_version test 1: no duplicate (db_version, seq) on B +\else +\echo [FAIL] (:testid) db_version test 1: duplicate (db_version, seq) on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify row count +SELECT COUNT(*) AS row_count_b_t1 FROM todo \gset +SELECT (:row_count_b_t1::int = 7) AS t1_count_ok \gset +\if :t1_count_ok +\echo [PASS] (:testid) db_version test 1: row count correct (7) +\else +\echo [FAIL] (:testid) db_version test 1: expected 7 rows, got :row_count_b_t1 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: Bidirectional merge (A -> B, B -> A), mixed patterns +-- Mirrors do_test_merge_check_db_version_2 from test/unit.c +-- ============================================================ + +-- Reset: drop and recreate databases +\connect postgres +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_test_28_a; +DROP DATABASE IF EXISTS cloudsync_test_28_b; +CREATE DATABASE cloudsync_test_28_a; +CREATE DATABASE cloudsync_test_28_b; + +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +CREATE TABLE todo (id TEXT PRIMARY KEY NOT NULL, title TEXT, status TEXT); +SELECT cloudsync_init('todo', 'CLS', true) AS _init_a2 \gset + +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +CREATE TABLE todo (id TEXT PRIMARY KEY NOT NULL, title TEXT, status TEXT); +SELECT cloudsync_init('todo', 'CLS', true) AS _init_b2 \gset + +-- DB A: two autocommit inserts (db_version 1, 2) +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql +INSERT INTO todo VALUES ('ID1', 'Buy groceries', 'in_progress'); +INSERT INTO todo VALUES ('ID2', 'Foo', 'Bar'); + +-- DB B: two autocommit inserts + one transaction with 2 inserts +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +INSERT 
INTO todo VALUES ('ID3', 'Foo3', 'Bar3'); +INSERT INTO todo VALUES ('ID4', 'Foo4', 'Bar4'); +BEGIN; +INSERT INTO todo VALUES ('ID5', 'Foo5', 'Bar5'); +INSERT INTO todo VALUES ('ID6', 'Foo6', 'Bar6'); +COMMIT; + +-- Encode A's payload +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql +SELECT CASE WHEN payload IS NULL OR octet_length(payload) = 0 + THEN '' + ELSE '\x' || encode(payload, 'hex') + END AS payload_a_t2, + (payload IS NOT NULL AND octet_length(payload) > 0) AS payload_a_t2_ok +FROM ( + SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) AS payload + FROM cloudsync_changes + WHERE site_id = cloudsync_siteid() +) AS p \gset + +-- Encode B's payload +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +SELECT CASE WHEN payload IS NULL OR octet_length(payload) = 0 + THEN '' + ELSE '\x' || encode(payload, 'hex') + END AS payload_b_t2, + (payload IS NOT NULL AND octet_length(payload) > 0) AS payload_b_t2_ok +FROM ( + SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) AS payload + FROM cloudsync_changes + WHERE site_id = cloudsync_siteid() +) AS p \gset + +-- Apply A -> B +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +\if :payload_a_t2_ok +SELECT cloudsync_payload_apply(decode(substr(:'payload_a_t2', 3), 'hex')) AS _apply_a_to_b \gset +\endif + +-- Apply B -> A +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql +\if :payload_b_t2_ok +SELECT cloudsync_payload_apply(decode(substr(:'payload_b_t2', 3), 'hex')) AS _apply_b_to_a \gset +\endif + +-- Verify data matches between A and B +SELECT md5(COALESCE(string_agg(id || ':' || COALESCE(title, '') || ':' || COALESCE(status, ''), ',' ORDER BY id), '')) AS hash_a_t2 +FROM todo \gset + +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +SELECT md5(COALESCE(string_agg(id || ':' || COALESCE(title, '') || ':' || COALESCE(status, ''), ',' ORDER BY id), '')) AS 
hash_b_t2 +FROM todo \gset + +SELECT (:'hash_a_t2' = :'hash_b_t2') AS t2_data_ok \gset +\if :t2_data_ok +\echo [PASS] (:testid) db_version test 2: bidirectional data matches +\else +\echo [FAIL] (:testid) db_version test 2: bidirectional data mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify row count (6 rows: ID1-ID6) +SELECT COUNT(*) AS row_count_t2 FROM todo \gset +SELECT (:row_count_t2::int = 6) AS t2_count_ok \gset +\if :t2_count_ok +\echo [PASS] (:testid) db_version test 2: row count correct (6) +\else +\echo [FAIL] (:testid) db_version test 2: expected 6 rows, got :row_count_t2 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify no repeated (db_version, seq) tuples on A +\connect cloudsync_test_28_a +\ir helper_psql_conn_setup.sql +SELECT COUNT(*) AS dup_count_a_t2 +FROM ( + SELECT db_version, seq, COUNT(*) AS cnt + FROM cloudsync_changes + GROUP BY db_version, seq + HAVING COUNT(*) > 1 +) AS dups \gset + +SELECT (:dup_count_a_t2::int = 0) AS t2_no_dups_a \gset +\if :t2_no_dups_a +\echo [PASS] (:testid) db_version test 2: no duplicate (db_version, seq) on A +\else +\echo [FAIL] (:testid) db_version test 2: duplicate (db_version, seq) on A +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify no repeated (db_version, seq) tuples on B +\connect cloudsync_test_28_b +\ir helper_psql_conn_setup.sql +SELECT COUNT(*) AS dup_count_b_t2 +FROM ( + SELECT db_version, seq, COUNT(*) AS cnt + FROM cloudsync_changes + GROUP BY db_version, seq + HAVING COUNT(*) > 1 +) AS dups \gset + +SELECT (:dup_count_b_t2::int = 0) AS t2_no_dups_b \gset +\if :t2_no_dups_b +\echo [PASS] (:testid) db_version test 2: no duplicate (db_version, seq) on B +\else +\echo [FAIL] (:testid) db_version test 2: duplicate (db_version, seq) on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Cleanup +-- ============================================================ +\ir helper_test_cleanup.sql +\if 
:should_cleanup +DROP DATABASE IF EXISTS cloudsync_test_28_a; +DROP DATABASE IF EXISTS cloudsync_test_28_b; +\endif diff --git a/test/postgresql/29_rls_multicol.sql b/test/postgresql/29_rls_multicol.sql new file mode 100644 index 0000000..de8f304 --- /dev/null +++ b/test/postgresql/29_rls_multicol.sql @@ -0,0 +1,435 @@ +-- 'RLS multi-column batch merge test' +-- Extends test 27 with more column types (INTEGER, BOOLEAN) and additional +-- test cases: update-denied, mixed payloads (per-PK savepoint isolation), +-- and NULL handling. +-- +-- Tests 1-2: superuser (service-role pattern) +-- Tests 3-8: authenticated-role pattern + +\set testid '29' +\ir helper_test_init.sql + +\set USER1 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa' +\set USER2 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb' + +-- ============================================================ +-- DB A: source database (no RLS) +-- ============================================================ +\connect postgres +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_test_29_a; +CREATE DATABASE cloudsync_test_29_a; + +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; + +CREATE TABLE tasks ( + id TEXT PRIMARY KEY NOT NULL, + user_id UUID, + title TEXT, + description TEXT, + priority INTEGER, + is_complete BOOLEAN +); +SELECT cloudsync_init('tasks') AS _init_site_id_a \gset + +-- ============================================================ +-- DB B: target database (with RLS) +-- ============================================================ +\connect postgres +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_test_29_b; +CREATE DATABASE cloudsync_test_29_b; + +-- Create non-superuser role if it does not already exist +DO $$ BEGIN + IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = 'test_rls_user') THEN + CREATE ROLE test_rls_user LOGIN; + END IF; +END $$; + +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +CREATE 
EXTENSION IF NOT EXISTS cloudsync; + +CREATE TABLE tasks ( + id TEXT PRIMARY KEY NOT NULL, + user_id UUID, + title TEXT, + description TEXT, + priority INTEGER, + is_complete BOOLEAN +); +SELECT cloudsync_init('tasks') AS _init_site_id_b \gset + +-- Auth mock: auth.uid() reads from session variable app.current_user_id +CREATE SCHEMA IF NOT EXISTS auth; +CREATE OR REPLACE FUNCTION auth.uid() RETURNS UUID + LANGUAGE sql STABLE +AS $$ SELECT NULLIF(current_setting('app.current_user_id', true), '')::UUID; $$; + +-- Enable RLS +ALTER TABLE tasks ENABLE ROW LEVEL SECURITY; + +CREATE POLICY "select_own" ON tasks FOR SELECT + USING (auth.uid() = user_id); +CREATE POLICY "insert_own" ON tasks FOR INSERT + WITH CHECK (auth.uid() = user_id); +CREATE POLICY "update_own" ON tasks FOR UPDATE + USING (auth.uid() = user_id) + WITH CHECK (auth.uid() = user_id); +CREATE POLICY "delete_own" ON tasks FOR DELETE + USING (auth.uid() = user_id); + +-- Grant permissions to test_rls_user +GRANT USAGE ON SCHEMA public TO test_rls_user; +GRANT ALL ON ALL TABLES IN SCHEMA public TO test_rls_user; +GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO test_rls_user; +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA public TO test_rls_user; +GRANT USAGE ON SCHEMA auth TO test_rls_user; +GRANT EXECUTE ON ALL FUNCTIONS IN SCHEMA auth TO test_rls_user; + +-- ============================================================ +-- Test 1: Superuser multi-row insert with varied types +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +INSERT INTO tasks VALUES ('t1', :'USER1'::UUID, 'Task 1', 'Desc 1', 3, false); +INSERT INTO tasks VALUES ('t2', :'USER1'::UUID, 'Task 2', 'Desc 2', 1, true); +INSERT INTO tasks VALUES ('t3', :'USER2'::UUID, 'Task 3', 'Desc 3', 5, false); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_1 +FROM cloudsync_changes +WHERE site_id = 
cloudsync_siteid() \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_1 FROM cloudsync_changes \gset + +-- Apply as superuser +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_hex_1', 'hex')) AS apply_1 \gset + +-- 3 rows × 5 non-PK columns = 15 column-change entries +SELECT (:apply_1::int = 15) AS apply_1_ok \gset +\if :apply_1_ok +\echo [PASS] (:testid) RLS multicol: superuser multi-row apply returned :apply_1 +\else +\echo [FAIL] (:testid) RLS multicol: superuser multi-row apply returned :apply_1 (expected 15) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify all 3 rows with correct column values +SELECT COUNT(*) AS t1_ok FROM tasks WHERE id = 't1' AND user_id = :'USER1'::UUID AND title = 'Task 1' AND description = 'Desc 1' AND priority = 3 AND is_complete = false \gset +SELECT COUNT(*) AS t2_ok FROM tasks WHERE id = 't2' AND user_id = :'USER1'::UUID AND title = 'Task 2' AND description = 'Desc 2' AND priority = 1 AND is_complete = true \gset +SELECT COUNT(*) AS t3_ok FROM tasks WHERE id = 't3' AND user_id = :'USER2'::UUID AND title = 'Task 3' AND description = 'Desc 3' AND priority = 5 AND is_complete = false \gset +SELECT (:t1_ok::int = 1 AND :t2_ok::int = 1 AND :t3_ok::int = 1) AS test1_ok \gset +\if :test1_ok +\echo [PASS] (:testid) RLS multicol: superuser multi-row insert with varied types +\else +\echo [FAIL] (:testid) RLS multicol: superuser multi-row insert with varied types — t1=:t1_ok t2=:t2_ok t3=:t3_ok +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: Superuser multi-column partial update +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +UPDATE tasks SET title = 'Task 1 Updated', priority = 10, is_complete = true WHERE id = 't1'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, 
site_id, cl, seq), 'hex') AS payload_hex_2 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_1 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_2 FROM cloudsync_changes \gset + +-- Apply as superuser +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_hex_2', 'hex')) AS apply_2 \gset + +-- 1 row × 3 changed columns (title, priority, is_complete) = 3 entries +SELECT (:apply_2::int = 3) AS apply_2_ok \gset +\if :apply_2_ok +\echo [PASS] (:testid) RLS multicol: superuser partial update apply returned :apply_2 +\else +\echo [FAIL] (:testid) RLS multicol: superuser partial update apply returned :apply_2 (expected 3) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify updated columns changed and description preserved +SELECT COUNT(*) AS t1_updated FROM tasks WHERE id = 't1' AND title = 'Task 1 Updated' AND description = 'Desc 1' AND priority = 10 AND is_complete = true \gset +SELECT (:t1_updated::int = 1) AS test2_ok \gset +\if :test2_ok +\echo [PASS] (:testid) RLS multicol: superuser partial update preserves unchanged columns +\else +\echo [FAIL] (:testid) RLS multicol: superuser partial update preserves unchanged columns — got :t1_updated +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 3: Authenticated insert own row (all columns) +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +INSERT INTO tasks VALUES ('t4', :'USER1'::UUID, 'Task 4', 'Desc 4', 2, false); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_3 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_2 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_3 FROM cloudsync_changes \gset + +\connect cloudsync_test_29_b +\ir 
helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_3', 'hex')) AS apply_3 \gset +RESET ROLE; + +-- 1 row × 5 non-PK columns = 5 entries +SELECT (:apply_3::int = 5) AS apply_3_ok \gset +\if :apply_3_ok +\echo [PASS] (:testid) RLS multicol auth: insert own row apply returned :apply_3 +\else +\echo [FAIL] (:testid) RLS multicol auth: insert own row apply returned :apply_3 (expected 5) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify row exists with all columns correct +SELECT COUNT(*) AS t4_count FROM tasks WHERE id = 't4' AND user_id = :'USER1'::UUID AND title = 'Task 4' AND description = 'Desc 4' AND priority = 2 AND is_complete = false \gset +SELECT (:t4_count::int = 1) AS test3_ok \gset +\if :test3_ok +\echo [PASS] (:testid) RLS multicol auth: insert own row allowed +\else +\echo [FAIL] (:testid) RLS multicol auth: insert own row allowed — got :t4_count +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 4: Authenticated insert denied (other user's row) +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +INSERT INTO tasks VALUES ('t5', :'USER2'::UUID, 'Task 5', 'Desc 5', 7, true); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_4 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_3 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_4 FROM cloudsync_changes \gset + +-- Apply as test_rls_user with USER1 identity — should be denied (t5 owned by USER2) +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_4', 'hex')) AS apply_4 \gset + +-- Reconnect for clean state after 
expected RLS denial +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql + +-- 1 row × 5 columns = 5 entries in payload (returned even if denied) +SELECT (:apply_4::int = 5) AS apply_4_ok \gset +\if :apply_4_ok +\echo [PASS] (:testid) RLS multicol auth: denied insert apply returned :apply_4 +\else +\echo [FAIL] (:testid) RLS multicol auth: denied insert apply returned :apply_4 (expected 5) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify t5 does NOT exist (superuser check) +SELECT COUNT(*) AS t5_count FROM tasks WHERE id = 't5' \gset +SELECT (:t5_count::int = 0) AS test4_ok \gset +\if :test4_ok +\echo [PASS] (:testid) RLS multicol auth: insert other user row denied +\else +\echo [FAIL] (:testid) RLS multicol auth: insert other user row denied — got :t5_count rows (expected 0) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 5: Authenticated update own row (multiple columns) +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +UPDATE tasks SET title = 'Task 4 Updated', priority = 9 WHERE id = 't4'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_5 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_4 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_5 FROM cloudsync_changes \gset + +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_5', 'hex')) AS apply_5 \gset +RESET ROLE; + +-- 1 row × 2 changed columns (title, priority) = 2 entries +SELECT (:apply_5::int = 2) AS apply_5_ok \gset +\if :apply_5_ok +\echo [PASS] (:testid) RLS multicol auth: update own row apply returned :apply_5 +\else +\echo [FAIL] (:testid) RLS multicol auth: update own row apply 
returned :apply_5 (expected 2) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify both columns changed, others preserved +SELECT COUNT(*) AS t4_updated FROM tasks WHERE id = 't4' AND title = 'Task 4 Updated' AND description = 'Desc 4' AND priority = 9 AND is_complete = false \gset +SELECT (:t4_updated::int = 1) AS test5_ok \gset +\if :test5_ok +\echo [PASS] (:testid) RLS multicol auth: update own row allowed +\else +\echo [FAIL] (:testid) RLS multicol auth: update own row allowed — got :t4_updated +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 6: Authenticated update denied (other user's row) +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +-- t3 is owned by USER2, update it on A +UPDATE tasks SET title = 'Task 3 Hacked', priority = 99 WHERE id = 't3'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_6 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_5 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_6 FROM cloudsync_changes \gset + +-- Apply as test_rls_user with USER1 identity — should be denied (t3 owned by USER2) +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_6', 'hex')) AS apply_6 \gset + +-- Reconnect for clean state after expected RLS denial +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql + +-- 1 row × 2 changed columns (title, priority) = 2 entries in payload +SELECT (:apply_6::int = 2) AS apply_6_ok \gset +\if :apply_6_ok +\echo [PASS] (:testid) RLS multicol auth: denied update apply returned :apply_6 +\else +\echo [FAIL] (:testid) RLS multicol auth: denied update apply returned :apply_6 (expected 2) +SELECT (:fail::int + 
1) AS fail \gset +\endif + +-- Verify t3 still has original values (superuser check) +SELECT COUNT(*) AS t3_unchanged FROM tasks WHERE id = 't3' AND title = 'Task 3' AND priority = 5 \gset +SELECT (:t3_unchanged::int = 1) AS test6_ok \gset +\if :test6_ok +\echo [PASS] (:testid) RLS multicol auth: update other user row denied +\else +\echo [FAIL] (:testid) RLS multicol auth: update other user row denied — got :t3_unchanged (expected 1 unchanged) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 7: Mixed payload — own + other user's rows (per-PK savepoint) +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +INSERT INTO tasks VALUES ('t6', :'USER1'::UUID, 'Task 6', 'Desc 6', 4, false); +INSERT INTO tasks VALUES ('t7', :'USER2'::UUID, 'Task 7', 'Desc 7', 8, true); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_7 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_6 \gset + +SELECT COALESCE(max(db_version), 0) AS max_dbv_7 FROM cloudsync_changes \gset + +-- Apply as test_rls_user with USER1 identity +-- Per-PK savepoint: t6 (USER1) should succeed, t7 (USER2) should be denied +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_7', 'hex')) AS apply_7 \gset + +-- Reconnect for clean verification as superuser +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql + +-- 2 rows × 5 columns = 10 entries in payload +SELECT (:apply_7::int = 10) AS apply_7_ok \gset +\if :apply_7_ok +\echo [PASS] (:testid) RLS multicol auth: mixed payload apply returned :apply_7 +\else +\echo [FAIL] (:testid) RLS multicol auth: mixed payload apply returned :apply_7 (expected 10) +SELECT 
(:fail::int + 1) AS fail \gset +\endif + +-- t6 (own row) should exist, t7 (other's row) should NOT +SELECT COUNT(*) AS t6_exists FROM tasks WHERE id = 't6' AND user_id = :'USER1'::UUID AND title = 'Task 6' \gset +SELECT COUNT(*) AS t7_exists FROM tasks WHERE id = 't7' \gset +SELECT (:t6_exists::int = 1 AND :t7_exists::int = 0) AS test7_ok \gset +\if :test7_ok +\echo [PASS] (:testid) RLS multicol auth: mixed payload — per-PK savepoint isolation +\else +\echo [FAIL] (:testid) RLS multicol auth: mixed payload — t6=:t6_exists (expect 1) t7=:t7_exists (expect 0) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 8: NULL in non-ownership columns +-- ============================================================ +\connect cloudsync_test_29_a +\ir helper_psql_conn_setup.sql +INSERT INTO tasks VALUES ('t8', :'USER1'::UUID, 'Task 8', NULL, NULL, false); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_hex_8 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() + AND db_version > :max_dbv_7 \gset + +\connect cloudsync_test_29_b +\ir helper_psql_conn_setup.sql +SET app.current_user_id = :'USER1'; +SET ROLE test_rls_user; +SELECT cloudsync_payload_apply(decode(:'payload_hex_8', 'hex')) AS apply_8 \gset +RESET ROLE; + +-- 1 row × 5 non-PK columns = 5 entries +SELECT (:apply_8::int = 5) AS apply_8_ok \gset +\if :apply_8_ok +\echo [PASS] (:testid) RLS multicol auth: NULL columns apply returned :apply_8 +\else +\echo [FAIL] (:testid) RLS multicol auth: NULL columns apply returned :apply_8 (expected 5) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify NULLs preserved +SELECT COUNT(*) AS t8_count FROM tasks WHERE id = 't8' AND user_id = :'USER1'::UUID AND title = 'Task 8' AND description IS NULL AND priority IS NULL AND is_complete = false \gset +SELECT (:t8_count::int = 1) AS test8_ok \gset +\if :test8_ok +\echo 
[PASS] (:testid) RLS multicol auth: NULL in non-ownership columns preserved +\else +\echo [FAIL] (:testid) RLS multicol auth: NULL in non-ownership columns preserved — got :t8_count +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Cleanup +-- ============================================================ +\ir helper_test_cleanup.sql +\if :should_cleanup +DROP DATABASE IF EXISTS cloudsync_test_29_a; +DROP DATABASE IF EXISTS cloudsync_test_29_b; +\else +\echo [INFO] (:testid) cleanup skipped, test databases preserved for inspection +\endif diff --git a/test/postgresql/30_null_prikey_insert.sql b/test/postgresql/30_null_prikey_insert.sql new file mode 100644 index 0000000..c7dc675 --- /dev/null +++ b/test/postgresql/30_null_prikey_insert.sql @@ -0,0 +1,68 @@ +-- Test: NULL Primary Key Insert Rejection +-- Verifies that inserting a NULL primary key into a cloudsync-enabled table fails +-- and that the metatable only contains rows for valid inserts. + +\set testid '30' +\ir helper_test_init.sql + +\connect postgres +\ir helper_psql_conn_setup.sql + +-- Cleanup and create test database +DROP DATABASE IF EXISTS cloudsync_test_30; +CREATE DATABASE cloudsync_test_30; + +\connect cloudsync_test_30 +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; + +-- Create table with primary key and init cloudsync +CREATE TABLE t_null_pk ( + id TEXT NOT NULL PRIMARY KEY, + value TEXT +); + +SELECT cloudsync_init('t_null_pk', 'CLS', true) AS _init \gset + +-- Test 1: INSERT with NULL primary key should fail +DO $$ +BEGIN + INSERT INTO t_null_pk (id, value) VALUES (NULL, 'test'); + RAISE EXCEPTION 'INSERT with NULL PK should have failed'; +EXCEPTION WHEN not_null_violation THEN + -- Expected +END $$; + +SELECT (COUNT(*) = 0) AS null_pk_rejected FROM t_null_pk \gset +\if :null_pk_rejected +\echo [PASS] (:testid) NULL PK insert rejected +\else +\echo [FAIL] (:testid) NULL PK insert was not rejected +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Test 2: 
INSERT with valid (non-NULL) primary key should succeed +INSERT INTO t_null_pk (id, value) VALUES ('valid_id', 'test'); + +SELECT (COUNT(*) = 1) AS valid_insert_ok FROM t_null_pk WHERE id = 'valid_id' \gset +\if :valid_insert_ok +\echo [PASS] (:testid) Valid PK insert succeeded +\else +\echo [FAIL] (:testid) Valid PK insert failed +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Test 3: Metatable should have exactly 1 row (from the valid insert only) +SELECT (COUNT(*) = 1) AS meta_row_ok FROM t_null_pk_cloudsync \gset +\if :meta_row_ok +\echo [PASS] (:testid) Metatable has exactly 1 row +\else +\echo [FAIL] (:testid) Metatable row count mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Cleanup +\ir helper_test_cleanup.sql +\if :should_cleanup +DROP DATABASE IF EXISTS cloudsync_test_30; +\endif diff --git a/test/postgresql/31_alter_table_sync.sql b/test/postgresql/31_alter_table_sync.sql new file mode 100644 index 0000000..3508129 --- /dev/null +++ b/test/postgresql/31_alter_table_sync.sql @@ -0,0 +1,383 @@ +-- Alter Table Sync Test +-- Tests cloudsync_begin_alter and cloudsync_commit_alter functions. +-- Verifies that schema changes (add column) are handled correctly +-- and data syncs after alteration. 
+ +\set testid '31' +\ir helper_test_init.sql + +\connect postgres +\ir helper_psql_conn_setup.sql + +-- Cleanup and create test databases +DROP DATABASE IF EXISTS cloudsync_test_31a; +DROP DATABASE IF EXISTS cloudsync_test_31b; +CREATE DATABASE cloudsync_test_31a; +CREATE DATABASE cloudsync_test_31b; + +-- ============================================================================ +-- Setup Database A +-- ============================================================================ + +\connect cloudsync_test_31a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; + +CREATE TABLE products ( + id UUID PRIMARY KEY, + name TEXT NOT NULL DEFAULT '', + price DOUBLE PRECISION NOT NULL DEFAULT 0.0, + quantity INTEGER NOT NULL DEFAULT 0 +); + +SELECT cloudsync_init('products', 'CLS', false) AS _init_a \gset + +INSERT INTO products VALUES ('11111111-1111-1111-1111-111111111111', 'Product A1', 10.99, 100); +INSERT INTO products VALUES ('22222222-2222-2222-2222-222222222222', 'Product A2', 20.50, 200); + +-- ============================================================================ +-- Setup Database B with same schema +-- ============================================================================ + +\connect cloudsync_test_31b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; + +CREATE TABLE products ( + id UUID PRIMARY KEY, + name TEXT NOT NULL DEFAULT '', + price DOUBLE PRECISION NOT NULL DEFAULT 0.0, + quantity INTEGER NOT NULL DEFAULT 0 +); + +SELECT cloudsync_init('products', 'CLS', false) AS _init_b \gset + +INSERT INTO products VALUES ('33333333-3333-3333-3333-333333333333', 'Product B1', 30.00, 300); +INSERT INTO products VALUES ('44444444-4444-4444-4444-444444444444', 'Product B2', 40.75, 400); + +-- ============================================================================ +-- Initial Sync: A -> B and B -> A +-- ============================================================================ + +\echo [INFO] (:testid) 
=== Initial Sync Before ALTER === + +-- Encode payload from A +\connect cloudsync_test_31a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_init('products', 'CLS', false) AS _reinit \gset +SELECT encode( + cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), + 'hex' +) AS payload_a_hex +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +-- Apply A's payload to B, encode B's payload +\connect cloudsync_test_31b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_init('products', 'CLS', false) AS _reinit \gset +SELECT cloudsync_payload_apply(decode(:'payload_a_hex', 'hex')) AS apply_a_to_b \gset + +SELECT encode( + cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), + 'hex' +) AS payload_b_hex +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +-- Apply B's payload to A, verify initial sync +\connect cloudsync_test_31a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_init('products', 'CLS', false) AS _reinit \gset +SELECT cloudsync_payload_apply(decode(:'payload_b_hex', 'hex')) AS apply_b_to_a \gset + +SELECT COUNT(*) AS count_a_initial FROM products \gset + +\connect cloudsync_test_31b +\ir helper_psql_conn_setup.sql +SELECT COUNT(*) AS count_b_initial FROM products \gset + +SELECT (:count_a_initial = 4 AND :count_b_initial = 4) AS initial_sync_ok \gset +\if :initial_sync_ok +\echo [PASS] (:testid) Initial sync complete - both databases have 4 rows +\else +\echo [FAIL] (:testid) Initial sync failed - A: :count_a_initial, B: :count_b_initial +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================================ +-- ALTER TABLE on Database A (begin_alter + ALTER + commit_alter on SAME connection) +-- ============================================================================ + +\echo [INFO] (:testid) === ALTER TABLE on Database A === + +\connect cloudsync_test_31a +\ir 
helper_psql_conn_setup.sql +SELECT cloudsync_init('products', 'CLS', false) AS _reinit \gset + +SELECT cloudsync_begin_alter('products') AS begin_alter_a \gset +\if :begin_alter_a +\echo [PASS] (:testid) cloudsync_begin_alter succeeded on Database A +\else +\echo [FAIL] (:testid) cloudsync_begin_alter failed on Database A +SELECT (:fail::int + 1) AS fail \gset +\endif + +ALTER TABLE products ADD COLUMN description TEXT NOT NULL DEFAULT ''; + +SELECT cloudsync_commit_alter('products') AS commit_alter_a \gset +\if :commit_alter_a +\echo [PASS] (:testid) cloudsync_commit_alter succeeded on Database A +\else +\echo [FAIL] (:testid) cloudsync_commit_alter failed on Database A +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Insert and update post-ALTER data on A +INSERT INTO products (id, name, price, quantity, description) +VALUES ('55555555-5555-5555-5555-555555555555', 'New Product A', 55.55, 555, 'Added after alter on A'); + +UPDATE products SET description = 'Updated on A' WHERE id = '11111111-1111-1111-1111-111111111111'; +UPDATE products SET quantity = 150 WHERE id = '11111111-1111-1111-1111-111111111111'; + +-- Encode post-ALTER payload from A +SELECT encode( + cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), + 'hex' +) AS payload_a2_hex +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +SELECT (length(:'payload_a2_hex') > 0) AS payload_a2_created \gset +\if :payload_a2_created +\echo [PASS] (:testid) Post-alter payload encoded from Database A +\else +\echo [FAIL] (:testid) Post-alter payload empty from Database A +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================================ +-- ALTER TABLE on Database B (begin_alter + ALTER + commit_alter on SAME connection) +-- Apply A's payload, insert/update, encode B's payload +-- ============================================================================ + +\echo [INFO] (:testid) === 
ALTER TABLE on Database B === + +\connect cloudsync_test_31b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_init('products', 'CLS', false) AS _reinit \gset + +SELECT cloudsync_begin_alter('products') AS begin_alter_b \gset +\if :begin_alter_b +\echo [PASS] (:testid) cloudsync_begin_alter succeeded on Database B +\else +\echo [FAIL] (:testid) cloudsync_begin_alter failed on Database B +SELECT (:fail::int + 1) AS fail \gset +\endif + +ALTER TABLE products ADD COLUMN description TEXT NOT NULL DEFAULT ''; + +SELECT cloudsync_commit_alter('products') AS commit_alter_b \gset +\if :commit_alter_b +\echo [PASS] (:testid) cloudsync_commit_alter succeeded on Database B +\else +\echo [FAIL] (:testid) cloudsync_commit_alter failed on Database B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Insert and update post-ALTER data on B +INSERT INTO products (id, name, price, quantity, description) +VALUES ('66666666-6666-6666-6666-666666666666', 'New Product B', 66.66, 666, 'Added after alter on B'); + +UPDATE products SET description = 'Updated on B' WHERE id = '33333333-3333-3333-3333-333333333333'; +UPDATE products SET quantity = 350 WHERE id = '33333333-3333-3333-3333-333333333333'; + +-- Apply A's post-alter payload to B +SELECT cloudsync_payload_apply(decode(:'payload_a2_hex', 'hex')) AS apply_a2_to_b \gset + +SELECT (:apply_a2_to_b >= 0) AS apply_a2_ok \gset +\if :apply_a2_ok +\echo [PASS] (:testid) Post-alter payload from A applied to B +\else +\echo [FAIL] (:testid) Post-alter payload from A failed to apply to B: :apply_a2_to_b +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Encode post-ALTER payload from B +SELECT encode( + cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), + 'hex' +) AS payload_b2_hex +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +-- ============================================================================ +-- Apply B's payload to A, then verify final state +-- 
============================================================================ + +\echo [INFO] (:testid) === Apply B payload to A and verify === + +\connect cloudsync_test_31a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_init('products', 'CLS', false) AS _reinit \gset +SELECT cloudsync_payload_apply(decode(:'payload_b2_hex', 'hex')) AS apply_b2_to_a \gset + +SELECT (:apply_b2_to_a >= 0) AS apply_b2_ok \gset +\if :apply_b2_ok +\echo [PASS] (:testid) Post-alter payload from B applied to A +\else +\echo [FAIL] (:testid) Post-alter payload from B failed to apply to A: :apply_b2_to_a +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================================ +-- Verify final state +-- ============================================================================ + +\echo [INFO] (:testid) === Verify Final State === + +-- Compute hash of Database A +SELECT md5( + COALESCE( + string_agg( + id::text || ':' || + COALESCE(name, 'NULL') || ':' || + COALESCE(price::text, 'NULL') || ':' || + COALESCE(quantity::text, 'NULL') || ':' || + COALESCE(description, 'NULL'), + '|' ORDER BY id + ), + '' + ) +) AS hash_a_final FROM products \gset + +\echo [INFO] (:testid) Database A final hash: :hash_a_final + +-- Row count on A +SELECT COUNT(*) AS count_a_final FROM products \gset + +-- Verify new row from B exists in A +SELECT COUNT(*) = 1 AS new_row_b_ok +FROM products +WHERE id = '66666666-6666-6666-6666-666666666666' + AND name = 'New Product B' + AND price = 66.66 + AND quantity = 666 + AND description = 'Added after alter on B' \gset + +-- Verify updated row from B synced to A +SELECT COUNT(*) = 1 AS updated_row_b_ok +FROM products +WHERE id = '33333333-3333-3333-3333-333333333333' + AND description = 'Updated on B' + AND quantity = 350 \gset + +\connect cloudsync_test_31b +\ir helper_psql_conn_setup.sql + +-- Compute hash of Database B +SELECT md5( + COALESCE( + string_agg( + id::text || ':' || + COALESCE(name, 'NULL') || ':' 
|| + COALESCE(price::text, 'NULL') || ':' || + COALESCE(quantity::text, 'NULL') || ':' || + COALESCE(description, 'NULL'), + '|' ORDER BY id + ), + '' + ) +) AS hash_b_final FROM products \gset + +\echo [INFO] (:testid) Database B final hash: :hash_b_final + +-- Row count on B +SELECT COUNT(*) AS count_b_final FROM products \gset + +-- Verify new row from A exists in B +SELECT COUNT(*) = 1 AS new_row_a_ok +FROM products +WHERE id = '55555555-5555-5555-5555-555555555555' + AND name = 'New Product A' + AND price = 55.55 + AND quantity = 555 + AND description = 'Added after alter on A' \gset + +-- Verify updated row from A synced to B +SELECT COUNT(*) = 1 AS updated_row_a_ok +FROM products +WHERE id = '11111111-1111-1111-1111-111111111111' + AND description = 'Updated on A' + AND quantity = 150 \gset + +-- Verify new column exists +SELECT COUNT(*) = 1 AS description_column_exists +FROM information_schema.columns +WHERE table_name = 'products' AND column_name = 'description' \gset + +-- ============================================================================ +-- Report results +-- ============================================================================ + +-- Compare final hashes +SELECT (:'hash_a_final' = :'hash_b_final') AS final_hashes_match \gset +\if :final_hashes_match +\echo [PASS] (:testid) Final data integrity verified - hashes match after ALTER +\else +\echo [FAIL] (:testid) Final data integrity check failed - A: :hash_a_final, B: :hash_b_final +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (:count_a_final = 6 AND :count_b_final = 6) AS row_counts_ok \gset +\if :row_counts_ok +\echo [PASS] (:testid) Row counts match (6 rows each) +\else +\echo [FAIL] (:testid) Row counts mismatch - A: :count_a_final, B: :count_b_final +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :new_row_a_ok +\echo [PASS] (:testid) New row from A synced to B with new schema +\else +\echo [FAIL] (:testid) New row from A not found or incorrect in B +SELECT (:fail::int 
+ 1) AS fail \gset +\endif + +\if :new_row_b_ok +\echo [PASS] (:testid) New row from B synced to A with new schema +\else +\echo [FAIL] (:testid) New row from B not found or incorrect in A +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :updated_row_a_ok +\echo [PASS] (:testid) Updated row from A synced with new column values +\else +\echo [FAIL] (:testid) Updated row from A not synced correctly +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :updated_row_b_ok +\echo [PASS] (:testid) Updated row from B synced with new column values +\else +\echo [FAIL] (:testid) Updated row from B not synced correctly +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :description_column_exists +\echo [PASS] (:testid) Added column 'description' exists +\else +\echo [FAIL] (:testid) Added column 'description' not found +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================================ +-- Cleanup +-- ============================================================================ + +\ir helper_test_cleanup.sql +\if :should_cleanup +DROP DATABASE IF EXISTS cloudsync_test_31a; +DROP DATABASE IF EXISTS cloudsync_test_31b; +\endif diff --git a/test/postgresql/32_block_lww.sql b/test/postgresql/32_block_lww.sql new file mode 100644 index 0000000..00dbf37 --- /dev/null +++ b/test/postgresql/32_block_lww.sql @@ -0,0 +1,146 @@ +-- 'Block-level LWW test' + +\set testid '32' +\ir helper_test_init.sql + +\connect postgres +\ir helper_psql_conn_setup.sql + +DROP DATABASE IF EXISTS cloudsync_block_test_a; +CREATE DATABASE cloudsync_block_test_a; + +\connect cloudsync_block_test_a +\ir helper_psql_conn_setup.sql + +CREATE EXTENSION IF NOT EXISTS cloudsync; + +-- Create a table with a text column for block-level LWW +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); + +-- Initialize cloudsync for the table +SELECT cloudsync_init('docs', 'CLS', true) AS _init \gset + +-- Configure body 
column as block-level +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _setcol \gset + +-- Test 1: INSERT text, verify blocks table populated +INSERT INTO docs (id, body) VALUES ('doc1', 'line1 +line2 +line3'); + +-- Verify blocks table was created +SELECT EXISTS(SELECT 1 FROM information_schema.tables WHERE table_name = 'docs_cloudsync_blocks') AS blocks_table_exists \gset +\if :blocks_table_exists +\echo [PASS] (:testid) Blocks table created +\else +\echo [FAIL] (:testid) Blocks table not created +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify blocks have been stored (3 lines = 3 blocks) +SELECT count(*) AS block_count FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:block_count::int = 3) AS insert_blocks_ok \gset +\if :insert_blocks_ok +\echo [PASS] (:testid) Block insert: 3 blocks created +\else +\echo [FAIL] (:testid) Block insert: expected 3 blocks, got :block_count +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify metadata has block entries (col_name contains \x1F separator) +SELECT count(*) AS meta_block_count FROM docs_cloudsync WHERE col_name LIKE 'body' || chr(31) || '%' \gset +SELECT (:meta_block_count::int = 3) AS meta_blocks_ok \gset +\if :meta_blocks_ok +\echo [PASS] (:testid) Block metadata: 3 block entries in _cloudsync +\else +\echo [FAIL] (:testid) Block metadata: expected 3 entries, got :meta_block_count +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Test 2: UPDATE text (modify one line, add one line) +UPDATE docs SET body = 'line1 +line2_modified +line3 +line4' WHERE id = 'doc1'; + +-- Verify blocks updated (should now have 4 blocks) +SELECT count(*) AS block_count2 FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:block_count2::int = 4) AS update_blocks_ok \gset +\if :update_blocks_ok +\echo [PASS] (:testid) Block update: 4 blocks after update +\else +\echo [FAIL] (:testid) Block update: expected 4 blocks, got :block_count2 +SELECT 
(:fail::int + 1) AS fail \gset +\endif + +-- Test 3: Materialize and verify round-trip +SELECT cloudsync_text_materialize('docs', 'body', 'doc1') AS _mat \gset +SELECT body AS materialized_body FROM docs WHERE id = 'doc1' \gset + +SELECT (:'materialized_body' = 'line1 +line2_modified +line3 +line4') AS materialize_ok \gset +\if :materialize_ok +\echo [PASS] (:testid) Text materialize: reconstructed text matches +\else +\echo [FAIL] (:testid) Text materialize: text mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Test 4: Verify col_value works for block entries +SELECT count(*) AS col_value_count FROM docs_cloudsync +WHERE col_name LIKE 'body' || chr(31) || '%' +AND cloudsync_col_value('docs', col_name, pk) IS NOT NULL \gset +SELECT (:col_value_count::int > 0) AS col_value_ok \gset +\if :col_value_ok +\echo [PASS] (:testid) col_value works for block entries +\else +\echo [FAIL] (:testid) col_value returned NULL for block entries +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Test 5: Sync roundtrip - encode payload from db A before disconnecting +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS block_payload_hex +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +\connect postgres +\ir helper_psql_conn_setup.sql + +DROP DATABASE IF EXISTS cloudsync_block_test_b; +CREATE DATABASE cloudsync_block_test_b; +\connect cloudsync_block_test_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('docs', 'CLS', true) AS _init_b \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _setcol_b \gset + +SELECT cloudsync_payload_apply(decode(:'block_payload_hex', 'hex')) AS _apply_b \gset + +-- Materialize on db B +SELECT cloudsync_text_materialize('docs', 'body', 'doc1') AS _mat_b \gset +SELECT body AS body_b FROM docs 
WHERE id = 'doc1' \gset
+
+SELECT (:'body_b' = 'line1
+line2_modified
+line3
+line4') AS sync_ok \gset
+\if :sync_ok
+\echo [PASS] (:testid) Block sync roundtrip: text matches after apply + materialize
+\else
+\echo [FAIL] (:testid) Block sync roundtrip: text mismatch on db B
+SELECT (:fail::int + 1) AS fail \gset
+\endif
+
+-- Cleanup
+\ir helper_test_cleanup.sql
+\if :should_cleanup
+DROP DATABASE IF EXISTS cloudsync_block_test_a;
+DROP DATABASE IF EXISTS cloudsync_block_test_b;
+\else
+\echo [INFO] (:testid) Cleanup skipped: test databases left in place for inspection
+\endif
diff --git a/test/postgresql/33_block_lww_extended.sql b/test/postgresql/33_block_lww_extended.sql
new file mode 100644
index 0000000..6b11338
--- /dev/null
+++ b/test/postgresql/33_block_lww_extended.sql
@@ -0,0 +1,339 @@
+-- 'Block-level LWW extended tests: DELETE, empty text, multi-update, conflict, reinsert'
+
+\set testid '33'
+\ir helper_test_init.sql
+
+\connect postgres
+\ir helper_psql_conn_setup.sql
+
+DROP DATABASE IF EXISTS cloudsync_block_ext_a;
+DROP DATABASE IF EXISTS cloudsync_block_ext_b;
+CREATE DATABASE cloudsync_block_ext_a;
+CREATE DATABASE cloudsync_block_ext_b;
+
+-- ============================================================
+-- Setup db A
+-- ============================================================
+\connect cloudsync_block_ext_a
+\ir helper_psql_conn_setup.sql
+CREATE EXTENSION IF NOT EXISTS cloudsync;
+DROP TABLE IF EXISTS docs;
+CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT);
+SELECT cloudsync_init('docs', 'CLS', true) AS _init_a \gset
+SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _setcol_a \gset
+
+-- ============================================================
+-- Test 1: DELETE marks tombstone, block metadata dropped
+-- ============================================================
+INSERT INTO docs (id, body) VALUES ('doc1', 'line1
+line2
+line3');
+
+-- Verify 3 block metadata entries exist
+SELECT count(*) AS meta_before FROM docs_cloudsync WHERE col_name LIKE 'body' || chr(31) || '%' \gset
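+-- Illustrative note (not part of the assertions): as the LIKE pattern above implies,
+-- block metadata rows are identified by col_name = 'body' || chr(31) || <per-block suffix>,
+-- i.e. the column name joined to a block identifier by the ASCII unit separator (0x1F);
+-- the exact suffix format is an internal detail not asserted here. For debugging, the
+-- per-block rows could be listed with a query along these lines:
+--   SELECT col_name, col_version FROM docs_cloudsync
+--   WHERE col_name LIKE 'body' || chr(31) || '%' ORDER BY col_name;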
+SELECT (:meta_before::int = 3) AS meta_before_ok \gset +\if :meta_before_ok +\echo [PASS] (:testid) Delete pre-check: 3 block metadata entries +\else +\echo [FAIL] (:testid) Delete pre-check: expected 3 metadata, got :meta_before +SELECT (:fail::int + 1) AS fail \gset +\endif + +DELETE FROM docs WHERE id = 'doc1'; + +-- Tombstone should exist with even version (deleted) +SELECT count(*) AS tombstone_count FROM docs_cloudsync WHERE col_name = '__[RIP]__' AND col_version % 2 = 0 \gset +SELECT (:tombstone_count::int = 1) AS tombstone_ok \gset +\if :tombstone_ok +\echo [PASS] (:testid) Delete: tombstone exists with even version +\else +\echo [FAIL] (:testid) Delete: expected 1 tombstone, got :tombstone_count +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Block metadata should be dropped +SELECT count(*) AS meta_after FROM docs_cloudsync WHERE col_name LIKE 'body' || chr(31) || '%' \gset +SELECT (:meta_after::int = 0) AS meta_dropped_ok \gset +\if :meta_dropped_ok +\echo [PASS] (:testid) Delete: block metadata dropped +\else +\echo [FAIL] (:testid) Delete: expected 0 metadata after delete, got :meta_after +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Row should be gone from base table +SELECT count(*) AS row_after FROM docs WHERE id = 'doc1' \gset +SELECT (:row_after::int = 0) AS row_gone_ok \gset +\if :row_gone_ok +\echo [PASS] (:testid) Delete: row removed from base table +\else +\echo [FAIL] (:testid) Delete: row still in base table +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: Empty text creates single block +-- ============================================================ +INSERT INTO docs (id, body) VALUES ('doc_empty', ''); + +SELECT count(*) AS empty_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_empty') \gset +SELECT (:empty_blocks::int = 1) AS empty_block_ok \gset +\if :empty_block_ok +\echo [PASS] (:testid) Empty text: 1 block created +\else 
+\echo [FAIL] (:testid) Empty text: expected 1 block, got :empty_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Update from empty to multi-line +UPDATE docs SET body = 'NewLine1 +NewLine2' WHERE id = 'doc_empty'; + +SELECT count(*) AS updated_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_empty') \gset +SELECT (:updated_blocks::int = 2) AS update_from_empty_ok \gset +\if :update_from_empty_ok +\echo [PASS] (:testid) Empty text: 2 blocks after update +\else +\echo [FAIL] (:testid) Empty text: expected 2 blocks after update, got :updated_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 3: Multi-update block counts +-- ============================================================ +INSERT INTO docs (id, body) VALUES ('doc_multi', 'A +B +C'); + +-- Update 1: remove middle line +UPDATE docs SET body = 'A +C' WHERE id = 'doc_multi'; + +SELECT count(*) AS blocks1 FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_multi') \gset +SELECT (:blocks1::int = 2) AS multi1_ok \gset +\if :multi1_ok +\echo [PASS] (:testid) Multi-update: 2 blocks after removing middle +\else +\echo [FAIL] (:testid) Multi-update: expected 2, got :blocks1 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Update 2: add two lines +UPDATE docs SET body = 'A +X +C +Y' WHERE id = 'doc_multi'; + +SELECT count(*) AS blocks2 FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_multi') \gset +SELECT (:blocks2::int = 4) AS multi2_ok \gset +\if :multi2_ok +\echo [PASS] (:testid) Multi-update: 4 blocks after adding lines +\else +\echo [FAIL] (:testid) Multi-update: expected 4, got :blocks2 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Update 3: collapse to single line +UPDATE docs SET body = 'SINGLE' WHERE id = 'doc_multi'; + +SELECT count(*) AS blocks3 FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_multi') \gset +SELECT (:blocks3::int = 1) AS multi3_ok 
\gset +\if :multi3_ok +\echo [PASS] (:testid) Multi-update: 1 block after collapse +\else +\echo [FAIL] (:testid) Multi-update: expected 1, got :blocks3 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Materialize and verify +SELECT cloudsync_text_materialize('docs', 'body', 'doc_multi') AS _mat_multi \gset +SELECT body AS multi_body FROM docs WHERE id = 'doc_multi' \gset +SELECT (:'multi_body' = 'SINGLE') AS multi_mat_ok \gset +\if :multi_mat_ok +\echo [PASS] (:testid) Multi-update: materialize matches +\else +\echo [FAIL] (:testid) Multi-update: materialize mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 4: Two-database conflict on same block +-- ============================================================ + +-- Setup db B +\connect cloudsync_block_ext_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('docs', 'CLS', true) AS _init_b \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _setcol_b \gset + +-- Insert initial doc on db A +\connect cloudsync_block_ext_a +INSERT INTO docs (id, body) VALUES ('doc_conflict', 'Same +Middle +End'); + +-- Sync A -> B (round 1) +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_a_r1 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_ext_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_a_r1', 'hex')) AS _apply_b_r1 \gset + +-- Materialize on B to get body +SELECT cloudsync_text_materialize('docs', 'body', 'doc_conflict') AS _mat_b_init \gset + +-- Verify B has the initial doc +SELECT body AS body_b_init FROM docs WHERE id = 'doc_conflict' \gset +SELECT (:'body_b_init' = 'Same +Middle +End') AS init_sync_ok \gset +\if :init_sync_ok 
+\echo [PASS] (:testid) Conflict: initial sync to B matches +\else +\echo [FAIL] (:testid) Conflict: initial sync to B mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Site A edits first line +\connect cloudsync_block_ext_a +UPDATE docs SET body = 'SiteA +Middle +End' WHERE id = 'doc_conflict'; + +-- Site B edits first line (conflict!) +\connect cloudsync_block_ext_b +UPDATE docs SET body = 'SiteB +Middle +End' WHERE id = 'doc_conflict'; + +-- Collect payloads from both sites +\connect cloudsync_block_ext_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_a_r2 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_ext_b +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_b_r2 +FROM cloudsync_changes +WHERE site_id = cloudsync_siteid() \gset + +-- Apply A's changes to B +SELECT cloudsync_payload_apply(decode(:'payload_a_r2', 'hex')) AS _apply_b_r2 \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc_conflict') AS _mat_b_r2 \gset + +-- Apply B's changes to A +\connect cloudsync_block_ext_a +SELECT cloudsync_payload_apply(decode(:'payload_b_r2', 'hex')) AS _apply_a_r2 \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc_conflict') AS _mat_a_r2 \gset + +-- Both should converge +SELECT body AS body_a_final FROM docs WHERE id = 'doc_conflict' \gset + +\connect cloudsync_block_ext_b +SELECT body AS body_b_final FROM docs WHERE id = 'doc_conflict' \gset + +-- Bodies must match (convergence) +SELECT (:'body_a_final' = :'body_b_final') AS converge_ok \gset +\if :converge_ok +\echo [PASS] (:testid) Conflict: databases converge after sync +\else +\echo [FAIL] (:testid) Conflict: databases diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Unchanged lines must be preserved +SELECT (position('Middle' in :'body_a_final') > 0) AS has_middle 
\gset +\if :has_middle +\echo [PASS] (:testid) Conflict: unchanged line 'Middle' preserved +\else +\echo [FAIL] (:testid) Conflict: 'Middle' missing from result +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('End' in :'body_a_final') > 0) AS has_end \gset +\if :has_end +\echo [PASS] (:testid) Conflict: unchanged line 'End' preserved +\else +\echo [FAIL] (:testid) Conflict: 'End' missing from result +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- One of the conflicting edits must win +SELECT (position('SiteA' in :'body_a_final') > 0 OR position('SiteB' in :'body_a_final') > 0) AS has_winner \gset +\if :has_winner +\echo [PASS] (:testid) Conflict: one site edit won (LWW) +\else +\echo [FAIL] (:testid) Conflict: neither SiteA nor SiteB in result +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 5: DELETE then re-INSERT (reinsert) +-- ============================================================ +\connect cloudsync_block_ext_a + +INSERT INTO docs (id, body) VALUES ('doc_reinsert', 'Old1 +Old2'); +DELETE FROM docs WHERE id = 'doc_reinsert'; + +-- Block metadata should be dropped after delete +SELECT count(*) AS meta_reinsert_del FROM docs_cloudsync +WHERE pk = cloudsync_pk_encode('doc_reinsert') +AND col_name LIKE 'body' || chr(31) || '%' \gset +SELECT (:meta_reinsert_del::int = 0) AS reinsert_meta_del_ok \gset +\if :reinsert_meta_del_ok +\echo [PASS] (:testid) Reinsert: metadata dropped after delete +\else +\echo [FAIL] (:testid) Reinsert: expected 0 metadata, got :meta_reinsert_del +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Re-insert with new content +INSERT INTO docs (id, body) VALUES ('doc_reinsert', 'New1 +New2 +New3'); + +SELECT count(*) AS meta_reinsert_new FROM docs_cloudsync +WHERE pk = cloudsync_pk_encode('doc_reinsert') +AND col_name LIKE 'body' || chr(31) || '%' \gset +SELECT (:meta_reinsert_new::int = 3) AS reinsert_meta_ok \gset +\if :reinsert_meta_ok 
+\echo [PASS] (:testid) Reinsert: 3 block metadata after re-insert
+\else
+\echo [FAIL] (:testid) Reinsert: expected 3 metadata, got :meta_reinsert_new
+SELECT (:fail::int + 1) AS fail \gset
+\endif
+
+-- Sync to B and materialize
+SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_reinsert
+FROM cloudsync_changes
+WHERE site_id = cloudsync_siteid() \gset
+
+\connect cloudsync_block_ext_b
+SELECT cloudsync_payload_apply(decode(:'payload_reinsert', 'hex')) AS _apply_reinsert \gset
+SELECT cloudsync_text_materialize('docs', 'body', 'doc_reinsert') AS _mat_reinsert \gset
+SELECT body AS body_reinsert FROM docs WHERE id = 'doc_reinsert' \gset
+
+SELECT (:'body_reinsert' = 'New1
+New2
+New3') AS reinsert_sync_ok \gset
+\if :reinsert_sync_ok
+\echo [PASS] (:testid) Reinsert: sync roundtrip matches
+\else
+\echo [FAIL] (:testid) Reinsert: sync mismatch on db B
+SELECT (:fail::int + 1) AS fail \gset
+\endif
+
+-- Cleanup
+\ir helper_test_cleanup.sql
+\if :should_cleanup
+DROP DATABASE IF EXISTS cloudsync_block_ext_a;
+DROP DATABASE IF EXISTS cloudsync_block_ext_b;
+\else
+\echo [INFO] (:testid) Cleanup skipped: test databases left in place for inspection
+\endif diff --git a/test/postgresql/34_block_lww_advanced.sql b/test/postgresql/34_block_lww_advanced.sql new file mode 100644 index 0000000..ea40e8a --- /dev/null +++ b/test/postgresql/34_block_lww_advanced.sql @@ -0,0 +1,698 @@ +-- 'Block-level LWW advanced tests: noconflict, add+edit, three-way, mixed cols, NULL->text, interleaved, custom delimiter, large text, rapid updates' + +\set testid '34' +\ir helper_test_init.sql + +\connect postgres +\ir helper_psql_conn_setup.sql + +DROP DATABASE IF EXISTS cloudsync_block_adv_a; +DROP DATABASE IF EXISTS cloudsync_block_adv_b; +DROP DATABASE IF EXISTS cloudsync_block_adv_c; +CREATE DATABASE cloudsync_block_adv_a; +CREATE DATABASE cloudsync_block_adv_b; +CREATE DATABASE cloudsync_block_adv_c; + +-- ============================================================ +-- Test 1: Non-conflicting edits on different blocks +-- Site A edits line 1, Site B edits line 3 — BOTH should survive +-- ============================================================ +\connect cloudsync_block_adv_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('docs', 'CLS', true) AS _init_a \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _setcol_a \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('docs', 'CLS', true) AS _init_b \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _setcol_b \gset + +-- Insert initial on A +\connect cloudsync_block_adv_a +INSERT INTO docs (id, body) VALUES ('doc1', 'Line1 +Line2 +Line3'); + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_init +FROM cloudsync_changes +WHERE 
site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_init', 'hex')) AS _apply_init \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc1') AS _mat_init \gset + +-- Site A: edit first line +\connect cloudsync_block_adv_a +UPDATE docs SET body = 'EditedByA +Line2 +Line3' WHERE id = 'doc1'; + +-- Site B: edit third line (no conflict — different block) +\connect cloudsync_block_adv_b +UPDATE docs SET body = 'Line1 +Line2 +EditedByB' WHERE id = 'doc1'; + +-- Collect payloads +\connect cloudsync_block_adv_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +-- Apply A -> B, B -> A +SELECT cloudsync_payload_apply(decode(:'payload_a', 'hex')) AS _apply_ab \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc1') AS _mat_b \gset + +\connect cloudsync_block_adv_a +SELECT cloudsync_payload_apply(decode(:'payload_b', 'hex')) AS _apply_ba \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc1') AS _mat_a \gset + +-- Both should converge +SELECT body AS body_a FROM docs WHERE id = 'doc1' \gset +\connect cloudsync_block_adv_b +SELECT body AS body_b FROM docs WHERE id = 'doc1' \gset + +SELECT (:'body_a' = :'body_b') AS converge_ok \gset +\if :converge_ok +\echo [PASS] (:testid) NoConflict: databases converge +\else +\echo [FAIL] (:testid) NoConflict: databases diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Both edits should be preserved +SELECT (position('EditedByA' in :'body_a') > 0) AS has_a_edit \gset +\if :has_a_edit +\echo [PASS] (:testid) NoConflict: Site A edit 
preserved +\else +\echo [FAIL] (:testid) NoConflict: Site A edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('EditedByB' in :'body_a') > 0) AS has_b_edit \gset +\if :has_b_edit +\echo [PASS] (:testid) NoConflict: Site B edit preserved +\else +\echo [FAIL] (:testid) NoConflict: Site B edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('Line2' in :'body_a') > 0) AS has_middle \gset +\if :has_middle +\echo [PASS] (:testid) NoConflict: unchanged line preserved +\else +\echo [FAIL] (:testid) NoConflict: unchanged line missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: Concurrent add + edit +-- Site A adds a line, Site B modifies an existing line +-- ============================================================ +\connect cloudsync_block_adv_a +INSERT INTO docs (id, body) VALUES ('doc2', 'Alpha +Bravo'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_d2_init +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_d2_init', 'hex')) AS _apply_d2 \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc2') AS _mat_d2 \gset + +-- Site A: add a new line at end +\connect cloudsync_block_adv_a +UPDATE docs SET body = 'Alpha +Bravo +Charlie' WHERE id = 'doc2'; + +-- Site B: modify first line +\connect cloudsync_block_adv_b +UPDATE docs SET body = 'AlphaEdited +Bravo' WHERE id = 'doc2'; + +\connect cloudsync_block_adv_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_d2a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, 
col_version, db_version, site_id, cl, seq), 'hex') AS payload_d2b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +SELECT cloudsync_payload_apply(decode(:'payload_d2a', 'hex')) AS _apply_d2ab \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc2') AS _mat_d2b \gset + +\connect cloudsync_block_adv_a +SELECT cloudsync_payload_apply(decode(:'payload_d2b', 'hex')) AS _apply_d2ba \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc2') AS _mat_d2a \gset + +SELECT body AS body_d2a FROM docs WHERE id = 'doc2' \gset +\connect cloudsync_block_adv_b +SELECT body AS body_d2b FROM docs WHERE id = 'doc2' \gset + +SELECT (:'body_d2a' = :'body_d2b') AS d2_converge \gset +\if :d2_converge +\echo [PASS] (:testid) Add+Edit: databases converge +\else +\echo [FAIL] (:testid) Add+Edit: databases diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('Charlie' in :'body_d2a') > 0) AS has_charlie \gset +\if :has_charlie +\echo [PASS] (:testid) Add+Edit: added line Charlie preserved +\else +\echo [FAIL] (:testid) Add+Edit: added line Charlie missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('Bravo' in :'body_d2a') > 0) AS has_bravo \gset +\if :has_bravo +\echo [PASS] (:testid) Add+Edit: unchanged Bravo preserved +\else +\echo [FAIL] (:testid) Add+Edit: Bravo missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 3: Three-way sync — 3 databases, each edits a different line +-- ============================================================ +\connect cloudsync_block_adv_c +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('docs', 'CLS', true) AS _init_c \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _setcol_c \gset + +-- Insert initial on A +\connect cloudsync_block_adv_a +INSERT 
INTO docs (id, body) VALUES ('doc3', 'L1 +L2 +L3 +L4'); + +-- Sync A -> B, A -> C +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3init +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3init', 'hex')) AS _apply_3b \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc3') AS _mat_3b \gset + +\connect cloudsync_block_adv_c +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3init', 'hex')) AS _apply_3c \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc3') AS _mat_3c \gset + +-- A edits line 1 +\connect cloudsync_block_adv_a +UPDATE docs SET body = 'S0 +L2 +L3 +L4' WHERE id = 'doc3'; + +-- B edits line 2 +\connect cloudsync_block_adv_b +UPDATE docs SET body = 'L1 +S1 +L3 +L4' WHERE id = 'doc3'; + +-- C edits line 4 +\connect cloudsync_block_adv_c +UPDATE docs SET body = 'L1 +L2 +L3 +S2' WHERE id = 'doc3'; + +-- Collect all payloads +\connect cloudsync_block_adv_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_c +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3c +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +-- Full mesh apply: each site receives from the other two +\connect cloudsync_block_adv_a +SELECT cloudsync_payload_apply(decode(:'payload_3b', 'hex')) AS _3ab \gset +SELECT 
cloudsync_payload_apply(decode(:'payload_3c', 'hex')) AS _3ac \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc3') AS _mat_3a_final \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3a', 'hex')) AS _3ba \gset +SELECT cloudsync_payload_apply(decode(:'payload_3c', 'hex')) AS _3bc \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc3') AS _mat_3b_final \gset + +\connect cloudsync_block_adv_c +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3a', 'hex')) AS _3ca \gset +SELECT cloudsync_payload_apply(decode(:'payload_3b', 'hex')) AS _3cb \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc3') AS _mat_3c_final \gset + +-- All three should converge +\connect cloudsync_block_adv_a +SELECT body AS body_3a FROM docs WHERE id = 'doc3' \gset +\connect cloudsync_block_adv_b +SELECT body AS body_3b FROM docs WHERE id = 'doc3' \gset +\connect cloudsync_block_adv_c +SELECT body AS body_3c FROM docs WHERE id = 'doc3' \gset + +SELECT (:'body_3a' = :'body_3b' AND :'body_3b' = :'body_3c') AS three_converge \gset +\if :three_converge +\echo [PASS] (:testid) Three-way: all 3 databases converge +\else +\echo [FAIL] (:testid) Three-way: databases diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('S0' in :'body_3a') > 0) AS has_s0 \gset +\if :has_s0 +\echo [PASS] (:testid) Three-way: Site A edit preserved +\else +\echo [FAIL] (:testid) Three-way: Site A edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('S1' in :'body_3a') > 0) AS has_s1 \gset +\if :has_s1 +\echo [PASS] (:testid) Three-way: Site B edit preserved +\else +\echo [FAIL] (:testid) Three-way: Site B edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('S2' in :'body_3a') > 0) AS has_s2 \gset +\if :has_s2 +\echo [PASS] (:testid) Three-way: Site C edit preserved +\else +\echo [FAIL] (:testid) Three-way: Site C edit 
missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 4: Mixed block + normal columns +-- ============================================================ +\connect cloudsync_block_adv_a +DROP TABLE IF EXISTS notes; +CREATE TABLE notes (id TEXT PRIMARY KEY NOT NULL, body TEXT, title TEXT); +SELECT cloudsync_init('notes', 'CLS', true) AS _init_notes_a \gset +SELECT cloudsync_set_column('notes', 'body', 'algo', 'block') AS _setcol_notes_a \gset + +\connect cloudsync_block_adv_b +DROP TABLE IF EXISTS notes; +CREATE TABLE notes (id TEXT PRIMARY KEY NOT NULL, body TEXT, title TEXT); +SELECT cloudsync_init('notes', 'CLS', true) AS _init_notes_b \gset +SELECT cloudsync_set_column('notes', 'body', 'algo', 'block') AS _setcol_notes_b \gset + +\connect cloudsync_block_adv_a +INSERT INTO notes (id, body, title) VALUES ('n1', 'Line1 +Line2 +Line3', 'My Title'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_notes_init +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'notes' \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_notes_init', 'hex')) AS _apply_notes \gset +SELECT cloudsync_text_materialize('notes', 'body', 'n1') AS _mat_notes \gset + +-- A: edit block line 1 + title +\connect cloudsync_block_adv_a +UPDATE notes SET body = 'EditedLine1 +Line2 +Line3', title = 'Title From A' WHERE id = 'n1'; + +-- B: edit block line 3 + title (title conflicts via normal LWW) +\connect cloudsync_block_adv_b +UPDATE notes SET body = 'Line1 +Line2 +EditedLine3', title = 'Title From B' WHERE id = 'n1'; + +\connect cloudsync_block_adv_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_notes_a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 
'notes' \gset + +\connect cloudsync_block_adv_b +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_notes_b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'notes' \gset + +SELECT cloudsync_payload_apply(decode(:'payload_notes_a', 'hex')) AS _apply_notes_ab \gset +SELECT cloudsync_text_materialize('notes', 'body', 'n1') AS _mat_notes_b \gset + +\connect cloudsync_block_adv_a +SELECT cloudsync_payload_apply(decode(:'payload_notes_b', 'hex')) AS _apply_notes_ba \gset +SELECT cloudsync_text_materialize('notes', 'body', 'n1') AS _mat_notes_a \gset + +SELECT body AS notes_body_a FROM notes WHERE id = 'n1' \gset +SELECT title AS notes_title_a FROM notes WHERE id = 'n1' \gset +\connect cloudsync_block_adv_b +SELECT body AS notes_body_b FROM notes WHERE id = 'n1' \gset +SELECT title AS notes_title_b FROM notes WHERE id = 'n1' \gset + +SELECT (:'notes_body_a' = :'notes_body_b') AS mixed_body_ok \gset +\if :mixed_body_ok +\echo [PASS] (:testid) MixedCols: body converges +\else +\echo [FAIL] (:testid) MixedCols: body diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('EditedLine1' in :'notes_body_a') > 0 AND position('EditedLine3' in :'notes_body_a') > 0) AS both_edits \gset +\if :both_edits +\echo [PASS] (:testid) MixedCols: both block edits preserved +\else +\echo [FAIL] (:testid) MixedCols: block edits missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (:'notes_title_a' = :'notes_title_b') AS mixed_title_ok \gset +\if :mixed_title_ok +\echo [PASS] (:testid) MixedCols: title converges (normal LWW) +\else +\echo [FAIL] (:testid) MixedCols: title diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 5: NULL to text transition +-- ============================================================ +\connect cloudsync_block_adv_a +INSERT INTO docs (id, body) VALUES 
('doc_null', NULL); + +-- Verify 1 block for NULL +SELECT count(*) AS null_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_null') \gset +SELECT (:null_blocks::int = 1) AS null_block_ok \gset +\if :null_block_ok +\echo [PASS] (:testid) NULL->Text: 1 block for NULL body +\else +\echo [FAIL] (:testid) NULL->Text: expected 1 block, got :null_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Update to multi-line +UPDATE docs SET body = 'Hello +World +Foo' WHERE id = 'doc_null'; + +SELECT count(*) AS text_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_null') \gset +SELECT (:text_blocks::int = 3) AS text_block_ok \gset +\if :text_block_ok +\echo [PASS] (:testid) NULL->Text: 3 blocks after update +\else +\echo [FAIL] (:testid) NULL->Text: expected 3 blocks, got :text_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync and verify +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_null +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_null', 'hex')) AS _apply_null \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc_null') AS _mat_null \gset + +SELECT body AS body_null FROM docs WHERE id = 'doc_null' \gset +SELECT (:'body_null' = 'Hello +World +Foo') AS null_text_ok \gset +\if :null_text_ok +\echo [PASS] (:testid) NULL->Text: sync roundtrip matches +\else +\echo [FAIL] (:testid) NULL->Text: sync mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 6: Interleaved inserts — multiple rounds between existing lines +-- ============================================================ +\connect cloudsync_block_adv_a +INSERT INTO docs (id, body) VALUES ('doc_inter', 'A +B'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, 
col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_inter_init +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_inter_init', 'hex')) AS _apply_inter \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc_inter') AS _mat_inter \gset + +-- Round 1: A inserts between A and B +\connect cloudsync_block_adv_a +UPDATE docs SET body = 'A +C +B' WHERE id = 'doc_inter'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_inter_r1 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_inter_r1', 'hex')) AS _r1 \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc_inter') AS _mat_r1 \gset + +-- Round 2: B inserts between A and C +\connect cloudsync_block_adv_b +UPDATE docs SET body = 'A +D +C +B' WHERE id = 'doc_inter'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_inter_r2 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset +\connect cloudsync_block_adv_a +SELECT cloudsync_payload_apply(decode(:'payload_inter_r2', 'hex')) AS _r2 \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc_inter') AS _mat_r2 \gset + +-- Round 3: A inserts between D and C +\connect cloudsync_block_adv_a +UPDATE docs SET body = 'A +D +E +C +B' WHERE id = 'doc_inter'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_inter_r3 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_inter_r3', 'hex')) AS _r3 \gset +SELECT 
cloudsync_text_materialize('docs', 'body', 'doc_inter') AS _mat_r3 \gset + +\connect cloudsync_block_adv_a +SELECT body AS inter_body_a FROM docs WHERE id = 'doc_inter' \gset +\connect cloudsync_block_adv_b +SELECT body AS inter_body_b FROM docs WHERE id = 'doc_inter' \gset + +SELECT (:'inter_body_a' = :'inter_body_b') AS inter_converge \gset +\if :inter_converge +\echo [PASS] (:testid) Interleaved: databases converge +\else +\echo [FAIL] (:testid) Interleaved: databases diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT count(*) AS inter_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc_inter') \gset +SELECT (:inter_blocks::int = 5) AS inter_count_ok \gset +\if :inter_count_ok +\echo [PASS] (:testid) Interleaved: 5 blocks after 3 rounds +\else +\echo [FAIL] (:testid) Interleaved: expected 5 blocks, got :inter_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 7: Custom delimiter (paragraph separator: double newline) +-- ============================================================ +\connect cloudsync_block_adv_a +DROP TABLE IF EXISTS paragraphs; +CREATE TABLE paragraphs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('paragraphs', 'CLS', true) AS _init_para \gset +SELECT cloudsync_set_column('paragraphs', 'body', 'algo', 'block') AS _setcol_para \gset +SELECT cloudsync_set_column('paragraphs', 'body', 'delimiter', E'\n\n') AS _setdelim \gset + +\connect cloudsync_block_adv_b +DROP TABLE IF EXISTS paragraphs; +CREATE TABLE paragraphs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('paragraphs', 'CLS', true) AS _init_para_b \gset +SELECT cloudsync_set_column('paragraphs', 'body', 'algo', 'block') AS _setcol_para_b \gset +SELECT cloudsync_set_column('paragraphs', 'body', 'delimiter', E'\n\n') AS _setdelim_b \gset + +\connect cloudsync_block_adv_a +INSERT INTO paragraphs (id, body) VALUES ('p1', E'Para one 
line1\nline2\n\nPara two\n\nPara three'); + +-- Should produce 3 blocks (3 paragraphs) +SELECT count(*) AS para_blocks FROM paragraphs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('p1') \gset +SELECT (:para_blocks::int = 3) AS para_ok \gset +\if :para_ok +\echo [PASS] (:testid) CustomDelim: 3 paragraph blocks +\else +\echo [FAIL] (:testid) CustomDelim: expected 3 blocks, got :para_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync and verify roundtrip +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_para +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'paragraphs' \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_para', 'hex')) AS _apply_para \gset +SELECT cloudsync_text_materialize('paragraphs', 'body', 'p1') AS _mat_para \gset + +SELECT body AS para_body FROM paragraphs WHERE id = 'p1' \gset +SELECT (:'para_body' = E'Para one line1\nline2\n\nPara two\n\nPara three') AS para_roundtrip \gset +\if :para_roundtrip +\echo [PASS] (:testid) CustomDelim: sync roundtrip matches +\else +\echo [FAIL] (:testid) CustomDelim: sync mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 8: Large text — 200 lines +-- ============================================================ +\connect cloudsync_block_adv_a +\ir helper_psql_conn_setup.sql +INSERT INTO docs (id, body) +SELECT 'bigdoc', string_agg('Line ' || lpad(i::text, 3, '0') || ' content', E'\n' ORDER BY i) +FROM generate_series(0, 199) AS s(i); + +SELECT count(*) AS big_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('bigdoc') \gset +SELECT (:big_blocks::int = 200) AS big_ok \gset +\if :big_ok +\echo [PASS] (:testid) LargeText: 200 blocks created +\else +\echo [FAIL] (:testid) LargeText: expected 200 blocks, got :big_blocks +SELECT (:fail::int 
+ 1) AS fail \gset +\endif + +-- All positions unique +SELECT count(DISTINCT col_name) AS big_distinct FROM docs_cloudsync +WHERE col_name LIKE 'body' || chr(31) || '%' +AND pk = cloudsync_pk_encode('bigdoc') \gset +SELECT (:big_distinct::int = 200) AS big_unique \gset +\if :big_unique +\echo [PASS] (:testid) LargeText: 200 unique position IDs +\else +\echo [FAIL] (:testid) LargeText: expected 200 unique positions, got :big_distinct +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync and verify +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_big +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_big', 'hex')) AS _apply_big \gset +SELECT cloudsync_text_materialize('docs', 'body', 'bigdoc') AS _mat_big \gset + +SELECT body AS big_body_b FROM docs WHERE id = 'bigdoc' \gset +\connect cloudsync_block_adv_a +SELECT body AS big_body_a FROM docs WHERE id = 'bigdoc' \gset + +SELECT (:'big_body_a' = :'big_body_b') AS big_match \gset +\if :big_match +\echo [PASS] (:testid) LargeText: sync roundtrip matches +\else +\echo [FAIL] (:testid) LargeText: sync mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 9: Rapid sequential updates — 50 updates on same row +-- ============================================================ +\connect cloudsync_block_adv_a +\ir helper_psql_conn_setup.sql +INSERT INTO docs (id, body) VALUES ('rapid', 'Start'); + +DO $$ +DECLARE + i INT; + new_body TEXT := ''; +BEGIN + FOR i IN 0..49 LOOP + IF i > 0 THEN new_body := new_body || E'\n'; END IF; + new_body := new_body || 'Update' || i; + UPDATE docs SET body = new_body WHERE id = 'rapid'; + END LOOP; +END $$; + +SELECT count(*) AS rapid_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('rapid') 
\gset +SELECT (:rapid_blocks::int = 50) AS rapid_ok \gset +\if :rapid_ok +\echo [PASS] (:testid) RapidUpdates: 50 blocks after 50 updates +\else +\echo [FAIL] (:testid) RapidUpdates: expected 50 blocks, got :rapid_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync and verify +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_rapid +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_adv_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_rapid', 'hex')) AS _apply_rapid \gset +SELECT cloudsync_text_materialize('docs', 'body', 'rapid') AS _mat_rapid \gset + +SELECT body AS rapid_body_b FROM docs WHERE id = 'rapid' \gset +\connect cloudsync_block_adv_a +SELECT body AS rapid_body_a FROM docs WHERE id = 'rapid' \gset + +SELECT (:'rapid_body_a' = :'rapid_body_b') AS rapid_match \gset +\if :rapid_match +\echo [PASS] (:testid) RapidUpdates: sync roundtrip matches +\else +\echo [FAIL] (:testid) RapidUpdates: sync mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('Update0' in :'rapid_body_a') > 0) AS has_first \gset +\if :has_first +\echo [PASS] (:testid) RapidUpdates: first update present +\else +\echo [FAIL] (:testid) RapidUpdates: first update missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (position('Update49' in :'rapid_body_a') > 0) AS has_last \gset +\if :has_last +\echo [PASS] (:testid) RapidUpdates: last update present +\else +\echo [FAIL] (:testid) RapidUpdates: last update missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Cleanup +\ir helper_test_cleanup.sql +\if :should_cleanup +DROP DATABASE IF EXISTS cloudsync_block_adv_a; +DROP DATABASE IF EXISTS cloudsync_block_adv_b; +DROP DATABASE IF EXISTS cloudsync_block_adv_c; +\else +\echo [INFO] Cleanup skipped: test databases preserved for inspection 
+\endif diff --git a/test/postgresql/35_block_lww_edge_cases.sql b/test/postgresql/35_block_lww_edge_cases.sql new file mode 100644 index 0000000..4692994 --- /dev/null +++ b/test/postgresql/35_block_lww_edge_cases.sql @@ -0,0 +1,420 @@ +-- 'Block-level LWW edge cases: unicode, special chars, delete vs edit, two block cols, text->NULL, payload sync, idempotent, ordering' + +\set testid '35' +\ir helper_test_init.sql + +\connect postgres +\ir helper_psql_conn_setup.sql + +DROP DATABASE IF EXISTS cloudsync_block_edge_a; +DROP DATABASE IF EXISTS cloudsync_block_edge_b; +CREATE DATABASE cloudsync_block_edge_a; +CREATE DATABASE cloudsync_block_edge_b; + +-- ============================================================ +-- Test 1: Unicode / multibyte content (emoji, CJK, accented) +-- ============================================================ +\connect cloudsync_block_edge_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _sc \gset + +-- Insert unicode text on A +\connect cloudsync_block_edge_a +INSERT INTO docs (id, body) VALUES ('doc1', E'Hello \U0001F600\nBonjour caf\u00e9\n\u65e5\u672c\u8a9e\u30c6\u30b9\u30c8'); + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload1 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT 
cloudsync_payload_apply(decode(:'payload1', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc1') AS _mat \gset + +SELECT (body LIKE E'Hello %') AS unicode_ok FROM docs WHERE id = 'doc1' \gset +\if :unicode_ok +\echo [PASS] (:testid) Unicode: body starts with Hello +\else +\echo [FAIL] (:testid) Unicode: body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Check line count (3 lines = 2 newlines) +SELECT (length(body) - length(replace(body, E'\n', '')) = 2) AS unicode_lines FROM docs WHERE id = 'doc1' \gset +\if :unicode_lines +\echo [PASS] (:testid) Unicode: 3 lines present +\else +\echo [FAIL] (:testid) Unicode: wrong line count +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: Special characters (tabs, backslashes, quotes) +-- ============================================================ +\connect cloudsync_block_edge_a +INSERT INTO docs (id, body) VALUES ('doc2', E'line\twith\ttabs\nback\\\\slash\nO''Brien said "hi"'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload2 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc2') \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload2', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc2') AS _mat \gset + +SELECT (body LIKE E'%\t%') AS special_tabs FROM docs WHERE id = 'doc2' \gset +\if :special_tabs +\echo [PASS] (:testid) SpecialChars: tabs preserved +\else +\echo [FAIL] (:testid) SpecialChars: tabs lost +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (body LIKE '%Brien%') AS special_quotes FROM docs WHERE id = 'doc2' \gset +\if :special_quotes +\echo [PASS] (:testid) SpecialChars: quotes preserved +\else +\echo [FAIL] (:testid) SpecialChars: quotes lost +SELECT 
(:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 3: Delete vs edit — A deletes block 1, B edits block 2 +-- ============================================================ +\connect cloudsync_block_edge_a +INSERT INTO docs (id, body) VALUES ('doc3', E'Alpha\nBeta\nGamma'); + +-- Sync initial to B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload3i +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc3') \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload3i', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc3') AS _mat \gset + +-- A: remove first line +\connect cloudsync_block_edge_a +UPDATE docs SET body = E'Beta\nGamma' WHERE id = 'doc3'; + +-- B: edit second line +\connect cloudsync_block_edge_b +UPDATE docs SET body = E'Alpha\nBetaEdited\nGamma' WHERE id = 'doc3'; + +-- Sync A -> B +\connect cloudsync_block_edge_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload3a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc3') \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload3a', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc3') AS _mat \gset + +-- B should have: Alpha removed (A wins), BetaEdited kept (B's edit) +SELECT (body NOT LIKE '%Alpha%') AS dve_no_alpha FROM docs WHERE id = 'doc3' \gset +\if :dve_no_alpha +\echo [PASS] (:testid) DelVsEdit: Alpha removed +\else +\echo [FAIL] (:testid) DelVsEdit: Alpha still present +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (body LIKE '%BetaEdited%') AS dve_beta FROM docs WHERE id 
= 'doc3' \gset +\if :dve_beta +\echo [PASS] (:testid) DelVsEdit: BetaEdited present +\else +\echo [FAIL] (:testid) DelVsEdit: BetaEdited missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (body LIKE '%Gamma%') AS dve_gamma FROM docs WHERE id = 'doc3' \gset +\if :dve_gamma +\echo [PASS] (:testid) DelVsEdit: Gamma present +\else +\echo [FAIL] (:testid) DelVsEdit: Gamma missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 4: Two block columns on the same table (body + notes) +-- ============================================================ +\connect cloudsync_block_edge_a +DROP TABLE IF EXISTS articles; +CREATE TABLE articles (id TEXT PRIMARY KEY NOT NULL, body TEXT, notes TEXT); +SELECT cloudsync_init('articles', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('articles', 'body', 'algo', 'block') AS _sc1 \gset +SELECT cloudsync_set_column('articles', 'notes', 'algo', 'block') AS _sc2 \gset + +\connect cloudsync_block_edge_b +DROP TABLE IF EXISTS articles; +CREATE TABLE articles (id TEXT PRIMARY KEY NOT NULL, body TEXT, notes TEXT); +SELECT cloudsync_init('articles', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('articles', 'body', 'algo', 'block') AS _sc1 \gset +SELECT cloudsync_set_column('articles', 'notes', 'algo', 'block') AS _sc2 \gset + +-- Insert on A +\connect cloudsync_block_edge_a +INSERT INTO articles (id, body, notes) VALUES ('art1', E'Body line 1\nBody line 2', E'Note 1\nNote 2'); + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload4 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'articles' \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload4', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('articles', 'body', 'art1') AS _mb \gset +SELECT 
cloudsync_text_materialize('articles', 'notes', 'art1') AS _mn \gset + +SELECT (body = E'Body line 1\nBody line 2') AS twocol_body FROM articles WHERE id = 'art1' \gset +\if :twocol_body +\echo [PASS] (:testid) TwoBlockCols: body matches +\else +\echo [FAIL] (:testid) TwoBlockCols: body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (notes = E'Note 1\nNote 2') AS twocol_notes FROM articles WHERE id = 'art1' \gset +\if :twocol_notes +\echo [PASS] (:testid) TwoBlockCols: notes matches +\else +\echo [FAIL] (:testid) TwoBlockCols: notes mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Edit body on A, notes on B — then sync +\connect cloudsync_block_edge_a +UPDATE articles SET body = E'Body EDITED\nBody line 2' WHERE id = 'art1'; + +\connect cloudsync_block_edge_b +UPDATE articles SET notes = E'Note 1\nNote EDITED' WHERE id = 'art1'; + +-- Sync A -> B +\connect cloudsync_block_edge_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload4b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'articles' \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload4b', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('articles', 'body', 'art1') AS _mb \gset +SELECT cloudsync_text_materialize('articles', 'notes', 'art1') AS _mn \gset + +SELECT (body LIKE '%Body EDITED%') AS twocol_body_ed FROM articles WHERE id = 'art1' \gset +\if :twocol_body_ed +\echo [PASS] (:testid) TwoBlockCols: body edited +\else +\echo [FAIL] (:testid) TwoBlockCols: body edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (notes LIKE '%Note EDITED%') AS twocol_notes_ed FROM articles WHERE id = 'art1' \gset +\if :twocol_notes_ed +\echo [PASS] (:testid) TwoBlockCols: notes kept +\else +\echo [FAIL] (:testid) TwoBlockCols: notes edit lost +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- 
============================================================ +-- Test 5: Text -> NULL (update to NULL removes all blocks) +-- ============================================================ +\connect cloudsync_block_edge_a +INSERT INTO docs (id, body) VALUES ('doc5', E'Line1\nLine2\nLine3'); + +-- Verify blocks created +SELECT (count(*) = 3) AS blk_ok FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc5') \gset +\if :blk_ok +\echo [PASS] (:testid) TextToNull: 3 blocks created +\else +\echo [FAIL] (:testid) TextToNull: wrong initial block count +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Update to NULL +UPDATE docs SET body = NULL WHERE id = 'doc5'; + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload5 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc5') \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload5', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc5') AS _mat \gset + +SELECT (body IS NULL) AS null_remote FROM docs WHERE id = 'doc5' \gset +\if :null_remote +\echo [PASS] (:testid) TextToNull: body is NULL on remote +\else +\echo [FAIL] (:testid) TextToNull: body not NULL on remote +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 6: Payload-based sync with non-conflicting edits +-- ============================================================ +\connect cloudsync_block_edge_a +INSERT INTO docs (id, body) VALUES ('doc6', E'First\nSecond\nThird'); + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload6i +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc6') \gset + 
+\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload6i', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc6') AS _mat \gset + +-- A edits line 1 +\connect cloudsync_block_edge_a +UPDATE docs SET body = E'FirstEdited\nSecond\nThird' WHERE id = 'doc6'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload6a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc6') \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload6a', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc6') AS _mat \gset + +SELECT (body = E'FirstEdited\nSecond\nThird') AS payload_ok FROM docs WHERE id = 'doc6' \gset +\if :payload_ok +\echo [PASS] (:testid) PayloadSync: body matches +\else +\echo [FAIL] (:testid) PayloadSync: body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 7: Idempotent apply — same payload twice is a no-op +-- ============================================================ +\connect cloudsync_block_edge_a +INSERT INTO docs (id, body) VALUES ('doc7', E'AAA\nBBB\nCCC'); + +-- Sync initial +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload7i +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc7') \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload7i', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc7') AS _mat \gset + +-- A edits +\connect cloudsync_block_edge_a +UPDATE docs SET body = E'AAA-edited\nBBB\nCCC' WHERE id = 'doc7'; + +SELECT 
encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload7e +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc7') \gset + +-- Apply TWICE to B +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload7e', 'hex')) AS _app1 \gset +SELECT cloudsync_payload_apply(decode(:'payload7e', 'hex')) AS _app2 \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc7') AS _mat \gset + +SELECT (body LIKE '%AAA-edited%') AS idemp_ok FROM docs WHERE id = 'doc7' \gset +\if :idemp_ok +\echo [PASS] (:testid) Idempotent: body matches after double apply +\else +\echo [FAIL] (:testid) Idempotent: body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 8: Block position ordering — sequential inserts preserve order after sync +-- ============================================================ +\connect cloudsync_block_edge_a +INSERT INTO docs (id, body) VALUES ('doc8', E'Top\nBottom'); + +-- Sync initial to B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload8i +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' +AND pk = cloudsync_pk_encode('doc8') \gset + +\connect cloudsync_block_edge_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload8i', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'doc8') AS _mat \gset + +-- A: add two lines between Top and Bottom +\connect cloudsync_block_edge_a +UPDATE docs SET body = E'Top\nMiddle1\nMiddle2\nBottom' WHERE id = 'doc8'; + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload8a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() 
AND tbl = 'docs'
+AND pk = cloudsync_pk_encode('doc8') \gset
+
+\connect cloudsync_block_edge_b
+\ir helper_psql_conn_setup.sql
+SELECT cloudsync_payload_apply(decode(:'payload8a', 'hex')) AS _app \gset
+SELECT cloudsync_text_materialize('docs', 'body', 'doc8') AS _mat \gset
+
+SELECT (body LIKE 'Top%') AS ord_top FROM docs WHERE id = 'doc8' \gset
+\if :ord_top
+\echo [PASS] (:testid) Ordering: Top first
+\else
+\echo [FAIL] (:testid) Ordering: Top not first
+SELECT (:fail::int + 1) AS fail \gset
+\endif
+
+SELECT (body LIKE '%Bottom') AS ord_bottom FROM docs WHERE id = 'doc8' \gset
+\if :ord_bottom
+\echo [PASS] (:testid) Ordering: Bottom last
+\else
+\echo [FAIL] (:testid) Ordering: Bottom not last
+SELECT (:fail::int + 1) AS fail \gset
+\endif
+
+-- Middle1 should come before Middle2
+SELECT (position('Middle1' IN body) < position('Middle2' IN body)) AS ord_correct FROM docs WHERE id = 'doc8' \gset
+\if :ord_correct
+\echo [PASS] (:testid) Ordering: Middle1 before Middle2
+\else
+\echo [FAIL] (:testid) Ordering: wrong order
+SELECT (:fail::int + 1) AS fail \gset
+\endif
+
+SELECT (body = E'Top\nMiddle1\nMiddle2\nBottom') AS ord_exact FROM docs WHERE id = 'doc8' \gset
+\if :ord_exact
+\echo [PASS] (:testid) Ordering: exact match
+\else
+\echo [FAIL] (:testid) Ordering: content mismatch
+SELECT (:fail::int + 1) AS fail \gset
+\endif
+
+-- ============================================================
+-- Cleanup
+-- ============================================================
+\ir helper_test_cleanup.sql
+\if :should_cleanup
+\ir helper_psql_conn_setup.sql
+DROP DATABASE IF EXISTS cloudsync_block_edge_a;
+DROP DATABASE IF EXISTS cloudsync_block_edge_b;
+\else
+\echo [INFO] Cleanup skipped: cloudsync_block_edge_a and cloudsync_block_edge_b left in place for inspection
+\endif diff --git a/test/postgresql/36_block_lww_round3.sql b/test/postgresql/36_block_lww_round3.sql new file mode 100644 index 0000000..7156faf --- /dev/null +++ b/test/postgresql/36_block_lww_round3.sql @@ -0,0 +1,476 @@ +-- 'Block-level LWW round 3: composite PK, empty vs null, delete+reinsert, integer PK, multi-row, non-overlapping add, long line, whitespace' + +\set testid '36' +\ir helper_test_init.sql + +\connect postgres +\ir helper_psql_conn_setup.sql + +DROP DATABASE IF EXISTS cloudsync_block_r3_a; +DROP DATABASE IF EXISTS cloudsync_block_r3_b; +CREATE DATABASE cloudsync_block_r3_a; +CREATE DATABASE cloudsync_block_r3_b; + +-- ============================================================ +-- Test 1: Composite primary key (text + int) with block column +-- ============================================================ +\connect cloudsync_block_r3_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (owner TEXT NOT NULL, seq INTEGER NOT NULL, body TEXT, PRIMARY KEY(owner, seq)); +SELECT cloudsync_init('docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS docs; +CREATE TABLE docs (owner TEXT NOT NULL, seq INTEGER NOT NULL, body TEXT, PRIMARY KEY(owner, seq)); +SELECT cloudsync_init('docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('docs', 'body', 'algo', 'block') AS _sc \gset + +-- Insert on A +\connect cloudsync_block_r3_a +INSERT INTO docs (owner, seq, body) VALUES ('alice', 1, E'Line1\nLine2\nLine3'); + +SELECT count(*) AS cpk_blocks FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('alice', 1) \gset +SELECT (:'cpk_blocks'::int = 3) AS cpk_blk_ok \gset +\if :cpk_blk_ok +\echo [PASS] (:testid) CompositePK: 3 blocks created +\else +\echo [FAIL] (:testid) CompositePK: expected 3 
blocks, got :cpk_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload1 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload1', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'alice', 1) AS _mat \gset + +SELECT (body = E'Line1\nLine2\nLine3') AS cpk_body_ok FROM docs WHERE owner = 'alice' AND seq = 1 \gset +\if :cpk_body_ok +\echo [PASS] (:testid) CompositePK: body matches on B +\else +\echo [FAIL] (:testid) CompositePK: body mismatch on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Edit on B, sync back +UPDATE docs SET body = E'Line1\nEdited2\nLine3' WHERE owner = 'alice' AND seq = 1; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload1b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'docs' \gset + +\connect cloudsync_block_r3_a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload1b', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('docs', 'body', 'alice', 1) AS _mat \gset + +SELECT (body = E'Line1\nEdited2\nLine3') AS cpk_rev_ok FROM docs WHERE owner = 'alice' AND seq = 1 \gset +\if :cpk_rev_ok +\echo [PASS] (:testid) CompositePK: reverse sync body matches +\else +\echo [FAIL] (:testid) CompositePK: reverse sync body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: Empty string vs NULL +-- ============================================================ +\connect cloudsync_block_r3_a +DROP TABLE IF EXISTS edocs; +CREATE TABLE edocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('edocs', 'CLS', true) AS _init \gset 
+SELECT cloudsync_set_column('edocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_b +DROP TABLE IF EXISTS edocs; +CREATE TABLE edocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('edocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('edocs', 'body', 'algo', 'block') AS _sc \gset + +-- Insert empty string on A +\connect cloudsync_block_r3_a +INSERT INTO edocs (id, body) VALUES ('doc1', ''); + +SELECT count(*) AS evn_blocks FROM edocs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'evn_blocks'::int = 1) AS evn_blk_ok \gset +\if :evn_blk_ok +\echo [PASS] (:testid) EmptyVsNull: 1 block for empty string +\else +\echo [FAIL] (:testid) EmptyVsNull: expected 1 block, got :evn_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync to B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload2 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'edocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload2', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('edocs', 'body', 'doc1') AS _mat \gset + +SELECT (body IS NOT NULL AND body = '') AS evn_empty_ok FROM edocs WHERE id = 'doc1' \gset +\if :evn_empty_ok +\echo [PASS] (:testid) EmptyVsNull: body is empty string (not NULL) +\else +\echo [FAIL] (:testid) EmptyVsNull: body should be empty string +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 3: DELETE row then re-insert with different content +-- ============================================================ +\connect cloudsync_block_r3_a +DROP TABLE IF EXISTS rdocs; +CREATE TABLE rdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('rdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('rdocs', 'body', 'algo', 'block') 
AS _sc \gset + +\connect cloudsync_block_r3_b +DROP TABLE IF EXISTS rdocs; +CREATE TABLE rdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('rdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('rdocs', 'body', 'algo', 'block') AS _sc \gset + +-- Insert and sync +\connect cloudsync_block_r3_a +INSERT INTO rdocs (id, body) VALUES ('doc1', E'Old1\nOld2\nOld3'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload3i +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'rdocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload3i', 'hex')) AS _app \gset + +-- Delete on A +\connect cloudsync_block_r3_a +DELETE FROM rdocs WHERE id = 'doc1'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload3d +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'rdocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload3d', 'hex')) AS _app \gset + +SELECT (count(*) = 0) AS dr_deleted FROM rdocs WHERE id = 'doc1' \gset +\if :dr_deleted +\echo [PASS] (:testid) DelReinsert: row deleted on B +\else +\echo [FAIL] (:testid) DelReinsert: row not deleted on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Re-insert with different content on A +\connect cloudsync_block_r3_a +INSERT INTO rdocs (id, body) VALUES ('doc1', E'New1\nNew2'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload3r +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'rdocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload3r', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('rdocs', 
'body', 'doc1') AS _mat \gset + +SELECT (body = E'New1\nNew2') AS dr_body_ok FROM rdocs WHERE id = 'doc1' \gset +\if :dr_body_ok +\echo [PASS] (:testid) DelReinsert: body matches after re-insert +\else +\echo [FAIL] (:testid) DelReinsert: body mismatch after re-insert +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 4: INTEGER primary key with block column +-- ============================================================ +\connect cloudsync_block_r3_a +DROP TABLE IF EXISTS notes; +CREATE TABLE notes (id INTEGER PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('notes', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('notes', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_b +DROP TABLE IF EXISTS notes; +CREATE TABLE notes (id INTEGER PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('notes', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('notes', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_a +INSERT INTO notes (id, body) VALUES (42, E'First\nSecond\nThird'); + +SELECT count(*) AS ipk_blocks FROM notes_cloudsync_blocks WHERE pk = cloudsync_pk_encode(42) \gset +SELECT (:'ipk_blocks'::int = 3) AS ipk_blk_ok \gset +\if :ipk_blk_ok +\echo [PASS] (:testid) IntegerPK: 3 blocks created +\else +\echo [FAIL] (:testid) IntegerPK: expected 3 blocks, got :ipk_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload4 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'notes' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload4', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('notes', 'body', 42) AS _mat \gset + +SELECT (body = E'First\nSecond\nThird') AS ipk_body_ok FROM notes WHERE id = 42 \gset +\if 
:ipk_body_ok +\echo [PASS] (:testid) IntegerPK: body matches on B +\else +\echo [FAIL] (:testid) IntegerPK: body mismatch on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 5: Multiple rows with block columns in a single sync +-- ============================================================ +\connect cloudsync_block_r3_a +DROP TABLE IF EXISTS mdocs; +CREATE TABLE mdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('mdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('mdocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_b +DROP TABLE IF EXISTS mdocs; +CREATE TABLE mdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('mdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('mdocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_a +INSERT INTO mdocs (id, body) VALUES ('r1', E'R1-Line1\nR1-Line2'); +INSERT INTO mdocs (id, body) VALUES ('r2', E'R2-Alpha\nR2-Beta\nR2-Gamma'); +INSERT INTO mdocs (id, body) VALUES ('r3', 'R3-Only'); +UPDATE mdocs SET body = E'R1-Edited\nR1-Line2' WHERE id = 'r1'; +UPDATE mdocs SET body = 'R3-Changed' WHERE id = 'r3'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload5 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'mdocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload5', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('mdocs', 'body', 'r1') AS _m1 \gset +SELECT cloudsync_text_materialize('mdocs', 'body', 'r2') AS _m2 \gset +SELECT cloudsync_text_materialize('mdocs', 'body', 'r3') AS _m3 \gset + +SELECT (body = E'R1-Edited\nR1-Line2') AS mr_r1 FROM mdocs WHERE id = 'r1' \gset +\if :mr_r1 +\echo [PASS] (:testid) MultiRow: r1 matches +\else +\echo [FAIL] (:testid) MultiRow: r1 
mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (body = E'R2-Alpha\nR2-Beta\nR2-Gamma') AS mr_r2 FROM mdocs WHERE id = 'r2' \gset +\if :mr_r2 +\echo [PASS] (:testid) MultiRow: r2 matches +\else +\echo [FAIL] (:testid) MultiRow: r2 mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (body = 'R3-Changed') AS mr_r3 FROM mdocs WHERE id = 'r3' \gset +\if :mr_r3 +\echo [PASS] (:testid) MultiRow: r3 matches +\else +\echo [FAIL] (:testid) MultiRow: r3 mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 6: Concurrent add at non-overlapping positions (top vs bottom) +-- ============================================================ +\connect cloudsync_block_r3_a +DROP TABLE IF EXISTS ndocs; +CREATE TABLE ndocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('ndocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('ndocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_b +DROP TABLE IF EXISTS ndocs; +CREATE TABLE ndocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('ndocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('ndocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_a +INSERT INTO ndocs (id, body) VALUES ('doc1', E'A\nB\nC'); + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload6i +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'ndocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload6i', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('ndocs', 'body', 'doc1') AS _mat \gset + +-- A: add at top -> X A B C +\connect cloudsync_block_r3_a +UPDATE ndocs SET body = E'X\nA\nB\nC' WHERE id = 'doc1'; + +-- B: add at bottom -> A B C Y +\connect cloudsync_block_r3_b +UPDATE ndocs SET body = 
E'A\nB\nC\nY' WHERE id = 'doc1'; + +-- Sync A -> B +\connect cloudsync_block_r3_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload6a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'ndocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload6a', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('ndocs', 'body', 'doc1') AS _mat \gset + +SELECT (body LIKE '%X%') AS no_x FROM ndocs WHERE id = 'doc1' \gset +\if :no_x +\echo [PASS] (:testid) NonOverlap: X present +\else +\echo [FAIL] (:testid) NonOverlap: X missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (body LIKE '%Y%') AS no_y FROM ndocs WHERE id = 'doc1' \gset +\if :no_y +\echo [PASS] (:testid) NonOverlap: Y present +\else +\echo [FAIL] (:testid) NonOverlap: Y missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (body LIKE 'X%' OR body LIKE E'%\nX\n%') AS no_x_before FROM ndocs WHERE id = 'doc1' \gset +\if :no_x_before +\echo [PASS] (:testid) NonOverlap: X before A +\else +\echo [FAIL] (:testid) NonOverlap: X not before A +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 7: Very long single line (10K chars) +-- ============================================================ +\connect cloudsync_block_r3_a +DROP TABLE IF EXISTS ldocs; +CREATE TABLE ldocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('ldocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('ldocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_b +DROP TABLE IF EXISTS ldocs; +CREATE TABLE ldocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('ldocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('ldocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_a +INSERT INTO ldocs (id, 
body) VALUES ('doc1', repeat('ABCDEFGHIJ', 1000)); + +SELECT count(*) AS ll_blocks FROM ldocs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'ll_blocks'::int = 1) AS ll_blk_ok \gset +\if :ll_blk_ok +\echo [PASS] (:testid) LongLine: 1 block for 10K char line +\else +\echo [FAIL] (:testid) LongLine: expected 1 block, got :ll_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload7 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'ldocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload7', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('ldocs', 'body', 'doc1') AS _mat \gset + +SELECT (body = repeat('ABCDEFGHIJ', 1000)) AS ll_body_ok FROM ldocs WHERE id = 'doc1' \gset +\if :ll_body_ok +\echo [PASS] (:testid) LongLine: body matches on B +\else +\echo [FAIL] (:testid) LongLine: body mismatch on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 8: Whitespace and empty lines (delimiter edge cases) +-- ============================================================ +\connect cloudsync_block_r3_a +DROP TABLE IF EXISTS wdocs; +CREATE TABLE wdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('wdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('wdocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_b +DROP TABLE IF EXISTS wdocs; +CREATE TABLE wdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('wdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('wdocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r3_a +-- Text: "Line1\n\n spaces \n\t\ttabs\n\nLine6\n" = 7 blocks +INSERT INTO wdocs (id, body) VALUES ('doc1', E'Line1\n\n spaces 
\n\t\ttabs\n\nLine6\n'); + +SELECT count(*) AS ws_blocks FROM wdocs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'ws_blocks'::int = 7) AS ws_blk_ok \gset +\if :ws_blk_ok +\echo [PASS] (:testid) Whitespace: 7 blocks with empty/whitespace lines +\else +\echo [FAIL] (:testid) Whitespace: expected 7 blocks, got :ws_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload8 +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'wdocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload8', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('wdocs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'Line1\n\n spaces \n\t\ttabs\n\nLine6\n') AS ws_body_ok FROM wdocs WHERE id = 'doc1' \gset +\if :ws_body_ok +\echo [PASS] (:testid) Whitespace: body matches with whitespace preserved +\else +\echo [FAIL] (:testid) Whitespace: body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Edit: remove empty lines +\connect cloudsync_block_r3_a +UPDATE wdocs SET body = E'Line1\n spaces \n\t\ttabs\nLine6' WHERE id = 'doc1'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload8b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'wdocs' \gset + +\connect cloudsync_block_r3_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload8b', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('wdocs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'Line1\n spaces \n\t\ttabs\nLine6') AS ws_edit_ok FROM wdocs WHERE id = 'doc1' \gset +\if :ws_edit_ok +\echo [PASS] (:testid) Whitespace: edited body matches +\else +\echo [FAIL] (:testid) Whitespace: edited body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- 
============================================================
+-- Cleanup
+-- ============================================================
+\ir helper_test_cleanup.sql
+\if :should_cleanup
+\ir helper_psql_conn_setup.sql
+DROP DATABASE IF EXISTS cloudsync_block_r3_a;
+DROP DATABASE IF EXISTS cloudsync_block_r3_b;
+\else
+\echo [INFO] Cleanup skipped: cloudsync_block_r3_a and cloudsync_block_r3_b left in place for inspection
+\endif
diff --git a/test/postgresql/37_block_lww_round4.sql b/test/postgresql/37_block_lww_round4.sql
new file mode 100644
index 0000000..2b0c77b
--- /dev/null
+++ b/test/postgresql/37_block_lww_round4.sql
@@ -0,0 +1,500 @@
+-- 'Block-level LWW round 4: UUID PK, RLS+blocks, multi-table, 3-site convergence, custom delimiter sync, mixed column updates'
+
+\set testid '37'
+\ir helper_test_init.sql
+
+\connect postgres
+\ir helper_psql_conn_setup.sql
+
+DROP DATABASE IF EXISTS cloudsync_block_r4_a;
+DROP DATABASE IF EXISTS cloudsync_block_r4_b;
+DROP DATABASE IF EXISTS cloudsync_block_r4_c;
+DROP DATABASE IF EXISTS cloudsync_block_3s_a;
+DROP DATABASE IF EXISTS cloudsync_block_3s_b;
+DROP DATABASE IF EXISTS cloudsync_block_3s_c;
+CREATE DATABASE cloudsync_block_r4_a;
+CREATE DATABASE cloudsync_block_r4_b;
+CREATE DATABASE cloudsync_block_r4_c;
+
+-- ============================================================
+-- Test 1: UUID primary key with block column
+-- ============================================================
+\connect cloudsync_block_r4_a
+\ir helper_psql_conn_setup.sql
+CREATE EXTENSION IF NOT EXISTS cloudsync;
+DROP TABLE IF EXISTS uuid_docs;
+CREATE TABLE uuid_docs (id UUID PRIMARY KEY NOT NULL DEFAULT gen_random_uuid(), body TEXT);
+SELECT cloudsync_init('uuid_docs', 'CLS', true) AS _init \gset
+SELECT cloudsync_set_column('uuid_docs', 'body', 'algo', 'block') AS _sc \gset
+
+\connect cloudsync_block_r4_b
+\ir helper_psql_conn_setup.sql
+CREATE EXTENSION IF NOT EXISTS cloudsync;
+DROP TABLE IF EXISTS uuid_docs;
+CREATE TABLE uuid_docs (id UUID PRIMARY KEY NOT NULL DEFAULT gen_random_uuid(), body TEXT);
+SELECT 
cloudsync_init('uuid_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('uuid_docs', 'body', 'algo', 'block') AS _sc \gset + +-- Insert on A with explicit UUID +\connect cloudsync_block_r4_a +INSERT INTO uuid_docs (id, body) VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', E'UUID-Line1\nUUID-Line2\nUUID-Line3'); + +SELECT count(*) AS uuid_blocks FROM uuid_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11') \gset +SELECT (:'uuid_blocks'::int = 3) AS uuid_blk_ok \gset +\if :uuid_blk_ok +\echo [PASS] (:testid) UUID_PK: 3 blocks created +\else +\echo [FAIL] (:testid) UUID_PK: expected 3 blocks, got :uuid_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_uuid +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'uuid_docs' \gset + +\connect cloudsync_block_r4_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_uuid', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('uuid_docs', 'body', 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11') AS _mat \gset + +SELECT (body = E'UUID-Line1\nUUID-Line2\nUUID-Line3') AS uuid_body_ok FROM uuid_docs WHERE id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11' \gset +\if :uuid_body_ok +\echo [PASS] (:testid) UUID_PK: body matches on B +\else +\echo [FAIL] (:testid) UUID_PK: body mismatch on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Edit on B, reverse sync +\connect cloudsync_block_r4_b +UPDATE uuid_docs SET body = E'UUID-Line1\nUUID-Edited\nUUID-Line3' WHERE id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_uuid_r +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'uuid_docs' \gset + +\connect cloudsync_block_r4_a +\ir 
helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_uuid_r', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('uuid_docs', 'body', 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11') AS _mat \gset + +SELECT (body = E'UUID-Line1\nUUID-Edited\nUUID-Line3') AS uuid_rev_ok FROM uuid_docs WHERE id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11' \gset +\if :uuid_rev_ok +\echo [PASS] (:testid) UUID_PK: reverse sync matches +\else +\echo [FAIL] (:testid) UUID_PK: reverse sync mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 2: RLS filter + block columns +-- Only rows matching filter should have block tracking +-- ============================================================ +\connect cloudsync_block_r4_a +DROP TABLE IF EXISTS rls_docs; +CREATE TABLE rls_docs (id TEXT PRIMARY KEY NOT NULL, owner_id INTEGER, body TEXT); +SELECT cloudsync_init('rls_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('rls_docs', 'body', 'algo', 'block') AS _sc \gset +SELECT cloudsync_set_filter('rls_docs', 'owner_id = 1') AS _sf \gset + +\connect cloudsync_block_r4_b +DROP TABLE IF EXISTS rls_docs; +CREATE TABLE rls_docs (id TEXT PRIMARY KEY NOT NULL, owner_id INTEGER, body TEXT); +SELECT cloudsync_init('rls_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('rls_docs', 'body', 'algo', 'block') AS _sc \gset +SELECT cloudsync_set_filter('rls_docs', 'owner_id = 1') AS _sf \gset + +-- Insert matching row (owner_id=1) and non-matching row (owner_id=2) +\connect cloudsync_block_r4_a +INSERT INTO rls_docs (id, owner_id, body) VALUES ('match1', 1, E'Filtered-Line1\nFiltered-Line2'); +INSERT INTO rls_docs (id, owner_id, body) VALUES ('nomatch', 2, E'Hidden-Line1\nHidden-Line2'); + +-- Check: matching row has blocks, non-matching does not +SELECT count(*) AS rls_match_blocks FROM rls_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('match1') \gset +SELECT count(*) AS 
rls_nomatch_blocks FROM rls_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('nomatch') \gset + +SELECT (:'rls_match_blocks'::int = 2) AS rls_match_ok \gset +\if :rls_match_ok +\echo [PASS] (:testid) RLS+Blocks: matching row has 2 blocks +\else +\echo [FAIL] (:testid) RLS+Blocks: expected 2 blocks for matching row, got :rls_match_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (:'rls_nomatch_blocks'::int = 0) AS rls_nomatch_ok \gset +\if :rls_nomatch_ok +\echo [PASS] (:testid) RLS+Blocks: non-matching row has 0 blocks +\else +\echo [FAIL] (:testid) RLS+Blocks: expected 0 blocks for non-matching row, got :rls_nomatch_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync: only matching row should appear in changes +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_rls +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'rls_docs' \gset + +\connect cloudsync_block_r4_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_rls', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('rls_docs', 'body', 'match1') AS _mat \gset + +SELECT (body = E'Filtered-Line1\nFiltered-Line2') AS rls_sync_ok FROM rls_docs WHERE id = 'match1' \gset +\if :rls_sync_ok +\echo [PASS] (:testid) RLS+Blocks: matching row synced with correct body +\else +\echo [FAIL] (:testid) RLS+Blocks: matching row body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- non-matching row should NOT exist on B +SELECT (count(*) = 0) AS rls_norow_ok FROM rls_docs WHERE id = 'nomatch' \gset +\if :rls_norow_ok +\echo [PASS] (:testid) RLS+Blocks: non-matching row not synced +\else +\echo [FAIL] (:testid) RLS+Blocks: non-matching row should not exist on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 3: Multi-table blocks — two tables with block columns in same payload +-- 
============================================================ +\connect cloudsync_block_r4_a +DROP TABLE IF EXISTS articles; +DROP TABLE IF EXISTS comments; +CREATE TABLE articles (id TEXT PRIMARY KEY NOT NULL, content TEXT); +CREATE TABLE comments (id TEXT PRIMARY KEY NOT NULL, text_body TEXT); +SELECT cloudsync_init('articles', 'CLS', true) AS _init \gset +SELECT cloudsync_init('comments', 'CLS', true) AS _init2 \gset +SELECT cloudsync_set_column('articles', 'content', 'algo', 'block') AS _sc \gset +SELECT cloudsync_set_column('comments', 'text_body', 'algo', 'block') AS _sc2 \gset + +\connect cloudsync_block_r4_b +DROP TABLE IF EXISTS articles; +DROP TABLE IF EXISTS comments; +CREATE TABLE articles (id TEXT PRIMARY KEY NOT NULL, content TEXT); +CREATE TABLE comments (id TEXT PRIMARY KEY NOT NULL, text_body TEXT); +SELECT cloudsync_init('articles', 'CLS', true) AS _init \gset +SELECT cloudsync_init('comments', 'CLS', true) AS _init2 \gset +SELECT cloudsync_set_column('articles', 'content', 'algo', 'block') AS _sc \gset +SELECT cloudsync_set_column('comments', 'text_body', 'algo', 'block') AS _sc2 \gset + +\connect cloudsync_block_r4_a +INSERT INTO articles (id, content) VALUES ('art1', E'Para1\nPara2\nPara3'); +INSERT INTO comments (id, text_body) VALUES ('cmt1', E'Comment-Line1\nComment-Line2'); +UPDATE articles SET content = E'Para1-Edited\nPara2\nPara3' WHERE id = 'art1'; + +-- Single payload containing changes from both tables +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_mt +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() \gset + +\connect cloudsync_block_r4_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_mt', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('articles', 'content', 'art1') AS _m1 \gset +SELECT cloudsync_text_materialize('comments', 'text_body', 'cmt1') AS _m2 \gset + +SELECT (content = 
E'Para1-Edited\nPara2\nPara3') AS mt_art_ok FROM articles WHERE id = 'art1' \gset +\if :mt_art_ok +\echo [PASS] (:testid) MultiTable: articles content matches +\else +\echo [FAIL] (:testid) MultiTable: articles content mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (text_body = E'Comment-Line1\nComment-Line2') AS mt_cmt_ok FROM comments WHERE id = 'cmt1' \gset +\if :mt_cmt_ok +\echo [PASS] (:testid) MultiTable: comments text_body matches +\else +\echo [FAIL] (:testid) MultiTable: comments text_body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 4: Three-site convergence with block columns +-- All three sites make different edits, pairwise sync, verify convergence +-- Uses dedicated databases so all 3 have identical schema +-- ============================================================ +\connect postgres +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_block_3s_a; +DROP DATABASE IF EXISTS cloudsync_block_3s_b; +DROP DATABASE IF EXISTS cloudsync_block_3s_c; +CREATE DATABASE cloudsync_block_3s_a; +CREATE DATABASE cloudsync_block_3s_b; +CREATE DATABASE cloudsync_block_3s_c; + +\connect cloudsync_block_3s_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +CREATE TABLE tdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('tdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('tdocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_3s_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +CREATE TABLE tdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('tdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('tdocs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_3s_c +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +CREATE TABLE tdocs (id TEXT PRIMARY KEY NOT NULL, body TEXT); 
+SELECT cloudsync_init('tdocs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('tdocs', 'body', 'algo', 'block') AS _sc \gset + +-- Initial insert on A, sync to B and C +\connect cloudsync_block_3s_a +INSERT INTO tdocs (id, body) VALUES ('doc1', E'Line1\nLine2\nLine3\nLine4\nLine5'); + +-- Full changes from A (includes schema info) +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3s_init +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'tdocs' \gset + +\connect cloudsync_block_3s_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3s_init', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('tdocs', 'body', 'doc1') AS _mat \gset + +\connect cloudsync_block_3s_c +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3s_init', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('tdocs', 'body', 'doc1') AS _mat \gset + +-- Each site edits a DIFFERENT line (no conflicts) +-- A edits line 1 +\connect cloudsync_block_3s_a +UPDATE tdocs SET body = E'Line1-A\nLine2\nLine3\nLine4\nLine5' WHERE id = 'doc1'; + +-- B edits line 3 +\connect cloudsync_block_3s_b +UPDATE tdocs SET body = E'Line1\nLine2\nLine3-B\nLine4\nLine5' WHERE id = 'doc1'; + +-- C edits line 5 +\connect cloudsync_block_3s_c +UPDATE tdocs SET body = E'Line1\nLine2\nLine3\nLine4\nLine5-C' WHERE id = 'doc1'; + +-- Collect ALL changes from each site (not filtered by site_id) +-- This includes the schema info that recipients need +\connect cloudsync_block_3s_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3s_a +FROM cloudsync_changes WHERE tbl = 'tdocs' \gset + +\connect cloudsync_block_3s_b +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3s_b +FROM cloudsync_changes 
WHERE tbl = 'tdocs' \gset + +\connect cloudsync_block_3s_c +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_3s_c +FROM cloudsync_changes WHERE tbl = 'tdocs' \gset + +-- Apply all to A (B's and C's changes) +\connect cloudsync_block_3s_a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3s_b', 'hex')) AS _app_b \gset +SELECT cloudsync_payload_apply(decode(:'payload_3s_c', 'hex')) AS _app_c \gset +SELECT cloudsync_text_materialize('tdocs', 'body', 'doc1') AS _mat \gset + +-- Apply all to B (A's and C's changes) +\connect cloudsync_block_3s_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3s_a', 'hex')) AS _app_a \gset +SELECT cloudsync_payload_apply(decode(:'payload_3s_c', 'hex')) AS _app_c \gset +SELECT cloudsync_text_materialize('tdocs', 'body', 'doc1') AS _mat \gset + +-- Apply all to C (A's and B's changes) +\connect cloudsync_block_3s_c +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_3s_a', 'hex')) AS _app_a \gset +SELECT cloudsync_payload_apply(decode(:'payload_3s_b', 'hex')) AS _app_b \gset +SELECT cloudsync_text_materialize('tdocs', 'body', 'doc1') AS _mat \gset + +-- All three should converge +\connect cloudsync_block_3s_a +SELECT body AS body_a FROM tdocs WHERE id = 'doc1' \gset +\connect cloudsync_block_3s_b +SELECT body AS body_b FROM tdocs WHERE id = 'doc1' \gset +\connect cloudsync_block_3s_c +SELECT body AS body_c FROM tdocs WHERE id = 'doc1' \gset + +SELECT (:'body_a' = :'body_b') AS ab_match \gset +SELECT (:'body_b' = :'body_c') AS bc_match \gset + +\if :ab_match +\echo [PASS] (:testid) 3-Site: A and B converge +\else +\echo [FAIL] (:testid) 3-Site: A and B diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :bc_match +\echo [PASS] (:testid) 3-Site: B and C converge +\else +\echo [FAIL] (:testid) 3-Site: B and C diverged +SELECT (:fail::int + 1) AS fail \gset 
+\endif + +-- All edits should be present (non-conflicting) +SELECT (position('Line1-A' in :'body_a') > 0) AS has_a \gset +SELECT (position('Line3-B' in :'body_a') > 0) AS has_b \gset +SELECT (position('Line5-C' in :'body_a') > 0) AS has_c \gset + +\if :has_a +\echo [PASS] (:testid) 3-Site: Site A edit present +\else +\echo [FAIL] (:testid) 3-Site: Site A edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :has_b +\echo [PASS] (:testid) 3-Site: Site B edit present +\else +\echo [FAIL] (:testid) 3-Site: Site B edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :has_c +\echo [PASS] (:testid) 3-Site: Site C edit present +\else +\echo [FAIL] (:testid) 3-Site: Site C edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 5: Custom delimiter sync roundtrip +-- Uses paragraph delimiter (double newline), edits, syncs +-- ============================================================ +\connect cloudsync_block_r4_a +DROP TABLE IF EXISTS para_docs; +CREATE TABLE para_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('para_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('para_docs', 'body', 'algo', 'block') AS _sc \gset +SELECT cloudsync_set_column('para_docs', 'body', 'delimiter', E'\n\n') AS _sd \gset + +\connect cloudsync_block_r4_b +DROP TABLE IF EXISTS para_docs; +CREATE TABLE para_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('para_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('para_docs', 'body', 'algo', 'block') AS _sc \gset +SELECT cloudsync_set_column('para_docs', 'body', 'delimiter', E'\n\n') AS _sd \gset + +\connect cloudsync_block_r4_a +INSERT INTO para_docs (id, body) VALUES ('doc1', E'First paragraph.\n\nSecond paragraph.\n\nThird paragraph.'); + +SELECT count(*) AS pd_blocks FROM para_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT 
(:'pd_blocks'::int = 3) AS pd_blk_ok \gset +\if :pd_blk_ok +\echo [PASS] (:testid) CustomDelimSync: 3 paragraph blocks +\else +\echo [FAIL] (:testid) CustomDelimSync: expected 3 blocks, got :pd_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_pd +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'para_docs' \gset + +\connect cloudsync_block_r4_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_pd', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('para_docs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'First paragraph.\n\nSecond paragraph.\n\nThird paragraph.') AS pd_sync_ok FROM para_docs WHERE id = 'doc1' \gset +\if :pd_sync_ok +\echo [PASS] (:testid) CustomDelimSync: body matches on B +\else +\echo [FAIL] (:testid) CustomDelimSync: body mismatch on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Edit paragraph 2 on B, sync back +\connect cloudsync_block_r4_b +UPDATE para_docs SET body = E'First paragraph.\n\nEdited second paragraph.\n\nThird paragraph.' 
WHERE id = 'doc1'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_pd_r +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'para_docs' \gset + +\connect cloudsync_block_r4_a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_pd_r', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('para_docs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'First paragraph.\n\nEdited second paragraph.\n\nThird paragraph.') AS pd_rev_ok FROM para_docs WHERE id = 'doc1' \gset +\if :pd_rev_ok +\echo [PASS] (:testid) CustomDelimSync: reverse sync matches +\else +\echo [FAIL] (:testid) CustomDelimSync: reverse sync mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 6: Block column + regular LWW column — mixed update +-- Single UPDATE changes both block col and regular col +-- ============================================================ +\connect cloudsync_block_r4_a +DROP TABLE IF EXISTS mixed_docs; +CREATE TABLE mixed_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT, title TEXT); +SELECT cloudsync_init('mixed_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('mixed_docs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r4_b +DROP TABLE IF EXISTS mixed_docs; +CREATE TABLE mixed_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT, title TEXT); +SELECT cloudsync_init('mixed_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('mixed_docs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r4_a +INSERT INTO mixed_docs (id, body, title) VALUES ('doc1', E'Body-Line1\nBody-Line2', 'Original Title'); + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_mix_i +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 
'mixed_docs' \gset + +\connect cloudsync_block_r4_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_mix_i', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('mixed_docs', 'body', 'doc1') AS _mat \gset + +-- Update BOTH columns simultaneously on A +\connect cloudsync_block_r4_a +UPDATE mixed_docs SET body = E'Body-Edited1\nBody-Line2', title = 'New Title' WHERE id = 'doc1'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_mix_u +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'mixed_docs' \gset + +\connect cloudsync_block_r4_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_mix_u', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('mixed_docs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'Body-Edited1\nBody-Line2') AS mix_body_ok FROM mixed_docs WHERE id = 'doc1' \gset +\if :mix_body_ok +\echo [PASS] (:testid) MixedUpdate: block column body matches +\else +\echo [FAIL] (:testid) MixedUpdate: block column body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT (title = 'New Title') AS mix_title_ok FROM mixed_docs WHERE id = 'doc1' \gset +\if :mix_title_ok +\echo [PASS] (:testid) MixedUpdate: regular column title matches +\else +\echo [FAIL] (:testid) MixedUpdate: regular column title mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Cleanup +-- ============================================================ +\ir helper_test_cleanup.sql +\if :should_cleanup +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_block_r4_a; +DROP DATABASE IF EXISTS cloudsync_block_r4_b; +DROP DATABASE IF EXISTS cloudsync_block_r4_c; +DROP DATABASE IF EXISTS cloudsync_block_3s_a; +DROP DATABASE IF EXISTS cloudsync_block_3s_b; +DROP DATABASE IF EXISTS cloudsync_block_3s_c; +\else +\echo [INFO] Cleanup skipped: test databases preserved for inspection 
+\endif diff --git a/test/postgresql/38_block_lww_round5.sql b/test/postgresql/38_block_lww_round5.sql new file mode 100644 index 0000000..8e796f0 --- /dev/null +++ b/test/postgresql/38_block_lww_round5.sql @@ -0,0 +1,433 @@ +-- 'Block-level LWW round 5: large blocks, payload idempotency with composite PK, init with existing data, drop/re-add block config, delimiter-in-content' + +\set testid '38' +\ir helper_test_init.sql + +\connect postgres +\ir helper_psql_conn_setup.sql + +DROP DATABASE IF EXISTS cloudsync_block_r5_a; +DROP DATABASE IF EXISTS cloudsync_block_r5_b; +CREATE DATABASE cloudsync_block_r5_a; +CREATE DATABASE cloudsync_block_r5_b; + +-- ============================================================ +-- Test 7: Large number of blocks (200+ lines) +-- Verify diff and materialize work correctly at scale +-- ============================================================ +\connect cloudsync_block_r5_a +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS big_docs; +CREATE TABLE big_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('big_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('big_docs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r5_b +\ir helper_psql_conn_setup.sql +CREATE EXTENSION IF NOT EXISTS cloudsync; +DROP TABLE IF EXISTS big_docs; +CREATE TABLE big_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('big_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('big_docs', 'body', 'algo', 'block') AS _sc \gset + +-- Generate 250-line text +\connect cloudsync_block_r5_a +INSERT INTO big_docs (id, body) +SELECT 'doc1', string_agg('Line-' || gs::text, E'\n' ORDER BY gs) +FROM generate_series(1, 250) gs; + +SELECT count(*) AS big_blocks FROM big_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'big_blocks'::int = 250) AS big_blk_ok \gset +\if :big_blk_ok +\echo [PASS] (:testid) LargeBlocks: 250 
blocks created +\else +\echo [FAIL] (:testid) LargeBlocks: expected 250 blocks, got :big_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Edit a few lines scattered through the document +UPDATE big_docs SET body = ( + SELECT string_agg( + CASE + WHEN gs = 50 THEN 'EDITED-50' + WHEN gs = 150 THEN 'EDITED-150' + WHEN gs = 200 THEN 'EDITED-200' + ELSE 'Line-' || gs::text + END, + E'\n' ORDER BY gs + ) FROM generate_series(1, 250) gs +) WHERE id = 'doc1'; + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_big +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'big_docs' \gset + +\connect cloudsync_block_r5_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_big', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('big_docs', 'body', 'doc1') AS _mat \gset + +-- Verify edited lines are present +SELECT (position('EDITED-50' in body) > 0) AS big_e50 FROM big_docs WHERE id = 'doc1' \gset +SELECT (position('EDITED-150' in body) > 0) AS big_e150 FROM big_docs WHERE id = 'doc1' \gset +SELECT (position('EDITED-200' in body) > 0) AS big_e200 FROM big_docs WHERE id = 'doc1' \gset + +\if :big_e50 +\echo [PASS] (:testid) LargeBlocks: EDITED-50 present +\else +\echo [FAIL] (:testid) LargeBlocks: EDITED-50 missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :big_e150 +\echo [PASS] (:testid) LargeBlocks: EDITED-150 present +\else +\echo [FAIL] (:testid) LargeBlocks: EDITED-150 missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :big_e200 +\echo [PASS] (:testid) LargeBlocks: EDITED-200 present +\else +\echo [FAIL] (:testid) LargeBlocks: EDITED-200 missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Verify block count still 250 (edits don't change count) +SELECT count(*) AS big_blocks2 FROM big_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'big_blocks2'::int = 250) 
AS big_cnt_ok \gset +\if :big_cnt_ok +\echo [PASS] (:testid) LargeBlocks: block count stable after sync +\else +\echo [FAIL] (:testid) LargeBlocks: expected 250 blocks after sync, got :big_blocks2 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 8: Payload idempotency with composite PK +-- Apply same payload twice, verify no duplication or corruption +-- ============================================================ +\connect cloudsync_block_r5_a +DROP TABLE IF EXISTS idem_docs; +CREATE TABLE idem_docs (owner TEXT NOT NULL, seq INTEGER NOT NULL, body TEXT, PRIMARY KEY(owner, seq)); +SELECT cloudsync_init('idem_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('idem_docs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r5_b +DROP TABLE IF EXISTS idem_docs; +CREATE TABLE idem_docs (owner TEXT NOT NULL, seq INTEGER NOT NULL, body TEXT, PRIMARY KEY(owner, seq)); +SELECT cloudsync_init('idem_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('idem_docs', 'body', 'algo', 'block') AS _sc \gset + +\connect cloudsync_block_r5_a +INSERT INTO idem_docs (owner, seq, body) VALUES ('bob', 1, E'Idem-Line1\nIdem-Line2\nIdem-Line3'); +UPDATE idem_docs SET body = E'Idem-Line1\nIdem-Edited\nIdem-Line3' WHERE owner = 'bob' AND seq = 1; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_idem +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'idem_docs' \gset + +-- Apply on B — first time +\connect cloudsync_block_r5_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_idem', 'hex')) AS _app1 \gset +SELECT cloudsync_text_materialize('idem_docs', 'body', 'bob', 1) AS _mat1 \gset + +SELECT (body = E'Idem-Line1\nIdem-Edited\nIdem-Line3') AS idem1_ok FROM idem_docs WHERE owner = 'bob' AND seq = 1 \gset +\if :idem1_ok +\echo [PASS] (:testid) 
Idempotent: first apply correct +\else +\echo [FAIL] (:testid) Idempotent: first apply mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +SELECT count(*) AS idem_meta1 FROM idem_docs_cloudsync WHERE pk = cloudsync_pk_encode('bob', 1) \gset + +-- Apply SAME payload again — second time (idempotent) +SELECT cloudsync_payload_apply(decode(:'payload_idem', 'hex')) AS _app2 \gset +SELECT cloudsync_text_materialize('idem_docs', 'body', 'bob', 1) AS _mat2 \gset + +SELECT (body = E'Idem-Line1\nIdem-Edited\nIdem-Line3') AS idem2_ok FROM idem_docs WHERE owner = 'bob' AND seq = 1 \gset +\if :idem2_ok +\echo [PASS] (:testid) Idempotent: second apply still correct +\else +\echo [FAIL] (:testid) Idempotent: body corrupted after double apply +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Metadata count should not change +SELECT count(*) AS idem_meta2 FROM idem_docs_cloudsync WHERE pk = cloudsync_pk_encode('bob', 1) \gset +SELECT (:'idem_meta1' = :'idem_meta2') AS idem_meta_ok \gset +\if :idem_meta_ok +\echo [PASS] (:testid) Idempotent: metadata count unchanged after double apply +\else +\echo [FAIL] (:testid) Idempotent: metadata count changed (:idem_meta1 vs :idem_meta2) +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 9: Init with pre-existing data, then enable block column +-- Table has rows before cloudsync_set_column algo=block +-- ============================================================ +\connect cloudsync_block_r5_a +DROP TABLE IF EXISTS predata; +CREATE TABLE predata (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('predata', 'CLS', true) AS _init \gset + +-- Insert rows BEFORE enabling block algorithm +INSERT INTO predata (id, body) VALUES ('pre1', E'Pre-Line1\nPre-Line2'); +INSERT INTO predata (id, body) VALUES ('pre2', E'Pre-Alpha\nPre-Beta\nPre-Gamma'); + +-- Now enable block on the column +SELECT cloudsync_set_column('predata', 'body', 'algo', 'block') AS _sc 
\gset + +\connect cloudsync_block_r5_b +DROP TABLE IF EXISTS predata; +CREATE TABLE predata (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('predata', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('predata', 'body', 'algo', 'block') AS _sc \gset + +-- Update a pre-existing row on A to trigger block creation +\connect cloudsync_block_r5_a +UPDATE predata SET body = E'Pre-Line1\nPre-Edited2' WHERE id = 'pre1'; + +SELECT count(*) AS pre_blocks FROM predata_cloudsync_blocks WHERE pk = cloudsync_pk_encode('pre1') \gset +SELECT (:'pre_blocks'::int >= 2) AS pre_blk_ok \gset +\if :pre_blk_ok +\echo [PASS] (:testid) PreExisting: blocks created after update +\else +\echo [FAIL] (:testid) PreExisting: expected >= 2 blocks, got :pre_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync to B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_pre +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'predata' \gset + +\connect cloudsync_block_r5_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_pre', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('predata', 'body', 'pre1') AS _mat \gset + +SELECT (body = E'Pre-Line1\nPre-Edited2') AS pre_sync_ok FROM predata WHERE id = 'pre1' \gset +\if :pre_sync_ok +\echo [PASS] (:testid) PreExisting: synced body matches after late block enable +\else +\echo [FAIL] (:testid) PreExisting: synced body mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- pre2 should also sync (as regular LWW or with insert sentinel) +SELECT (count(*) = 1) AS pre2_exists FROM predata WHERE id = 'pre2' \gset +\if :pre2_exists +\echo [PASS] (:testid) PreExisting: pre2 row synced +\else +\echo [FAIL] (:testid) PreExisting: pre2 row missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 10: Remove block algo 
then re-add +-- ============================================================ +\connect cloudsync_block_r5_a +DROP TABLE IF EXISTS toggle_docs; +CREATE TABLE toggle_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('toggle_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('toggle_docs', 'body', 'algo', 'block') AS _sc1 \gset + +\connect cloudsync_block_r5_b +DROP TABLE IF EXISTS toggle_docs; +CREATE TABLE toggle_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('toggle_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('toggle_docs', 'body', 'algo', 'block') AS _sc1 \gset + +-- Insert with blocks on A +\connect cloudsync_block_r5_a +INSERT INTO toggle_docs (id, body) VALUES ('doc1', E'Toggle-Line1\nToggle-Line2'); + +SELECT count(*) AS tog_blocks1 FROM toggle_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'tog_blocks1'::int = 2) AS tog_blk1_ok \gset +\if :tog_blk1_ok +\echo [PASS] (:testid) Toggle: blocks created initially +\else +\echo [FAIL] (:testid) Toggle: expected 2 blocks initially, got :tog_blocks1 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Remove block algo (set to default LWW) +SELECT cloudsync_set_column('toggle_docs', 'body', 'algo', 'lww') AS _sc2 \gset + +-- Update while in LWW mode — should NOT create new blocks +UPDATE toggle_docs SET body = E'Toggle-LWW-Updated' WHERE id = 'doc1'; + +-- Re-enable block algo +SELECT cloudsync_set_column('toggle_docs', 'body', 'algo', 'block') AS _sc3 \gset + +-- Update with blocks re-enabled +UPDATE toggle_docs SET body = E'Toggle-Block-Again1\nToggle-Block-Again2\nToggle-Block-Again3' WHERE id = 'doc1'; + +SELECT count(*) AS tog_blocks2 FROM toggle_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'tog_blocks2'::int = 3) AS tog_blk2_ok \gset +\if :tog_blk2_ok +\echo [PASS] (:testid) Toggle: 3 blocks after re-enable +\else +\echo [FAIL] (:testid) Toggle: expected 3 blocks after 
re-enable, got :tog_blocks2 +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync to B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_tog +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'toggle_docs' \gset + +\connect cloudsync_block_r5_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_tog', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('toggle_docs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'Toggle-Block-Again1\nToggle-Block-Again2\nToggle-Block-Again3') AS tog_sync_ok FROM toggle_docs WHERE id = 'doc1' \gset +\if :tog_sync_ok +\echo [PASS] (:testid) Toggle: body matches after re-enable and sync +\else +\echo [FAIL] (:testid) Toggle: body mismatch after re-enable and sync +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Test 11: Text containing the delimiter character as content +-- Delimiter is a double newline; single newlines inside paragraphs are plain content, not block boundaries +-- ============================================================ +\connect cloudsync_block_r5_a +DROP TABLE IF EXISTS delim_docs; +CREATE TABLE delim_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('delim_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('delim_docs', 'body', 'algo', 'block') AS _sc \gset +-- Use paragraph delimiter (double newline) +SELECT cloudsync_set_column('delim_docs', 'body', 'delimiter', E'\n\n') AS _sd \gset + +\connect cloudsync_block_r5_b +DROP TABLE IF EXISTS delim_docs; +CREATE TABLE delim_docs (id TEXT PRIMARY KEY NOT NULL, body TEXT); +SELECT cloudsync_init('delim_docs', 'CLS', true) AS _init \gset +SELECT cloudsync_set_column('delim_docs', 'body', 'algo', 'block') AS _sc \gset +SELECT cloudsync_set_column('delim_docs', 'body', 'delimiter', E'\n\n') AS _sd \gset + +-- Content with single newlines inside 
paragraphs (not delimiters) +\connect cloudsync_block_r5_a +INSERT INTO delim_docs (id, body) VALUES ('doc1', E'Paragraph one\nstill paragraph one.\n\nParagraph two\nstill para two.\n\nParagraph three.'); + +-- Should be 3 blocks (split by double newline) +SELECT count(*) AS dc_blocks FROM delim_docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1') \gset +SELECT (:'dc_blocks'::int = 3) AS dc_blk_ok \gset +\if :dc_blk_ok +\echo [PASS] (:testid) DelimContent: 3 paragraph blocks (single newlines inside) +\else +\echo [FAIL] (:testid) DelimContent: expected 3 blocks, got :dc_blocks +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Sync A -> B +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_dc +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'delim_docs' \gset + +\connect cloudsync_block_r5_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_dc', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('delim_docs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'Paragraph one\nstill paragraph one.\n\nParagraph two\nstill para two.\n\nParagraph three.') AS dc_sync_ok FROM delim_docs WHERE id = 'doc1' \gset +\if :dc_sync_ok +\echo [PASS] (:testid) DelimContent: body matches on B (embedded newlines preserved) +\else +\echo [FAIL] (:testid) DelimContent: body mismatch on B +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Edit paragraph 2 on B (change only the second paragraph), sync back +\connect cloudsync_block_r5_b +UPDATE delim_docs SET body = E'Paragraph one\nstill paragraph one.\n\nEdited paragraph two.\n\nParagraph three.' 
WHERE id = 'doc1'; + +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_dc_r +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'delim_docs' \gset + +\connect cloudsync_block_r5_a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_dc_r', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('delim_docs', 'body', 'doc1') AS _mat \gset + +SELECT (body = E'Paragraph one\nstill paragraph one.\n\nEdited paragraph two.\n\nParagraph three.') AS dc_rev_ok FROM delim_docs WHERE id = 'doc1' \gset +\if :dc_rev_ok +\echo [PASS] (:testid) DelimContent: reverse sync matches (paragraph edit) +\else +\echo [FAIL] (:testid) DelimContent: reverse sync mismatch +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- Concurrent edit: A edits para 1, B edits para 3 +\connect cloudsync_block_r5_a +UPDATE delim_docs SET body = E'Edited para one by A.\n\nEdited paragraph two.\n\nParagraph three.' WHERE id = 'doc1'; + +\connect cloudsync_block_r5_b +UPDATE delim_docs SET body = E'Paragraph one\nstill paragraph one.\n\nEdited paragraph two.\n\nEdited para three by B.' 
WHERE id = 'doc1'; + +\connect cloudsync_block_r5_a +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_dc_a +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'delim_docs' \gset + +\connect cloudsync_block_r5_b +SELECT encode(cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq), 'hex') AS payload_dc_b +FROM cloudsync_changes WHERE site_id = cloudsync_siteid() AND tbl = 'delim_docs' \gset + +-- Apply cross +\connect cloudsync_block_r5_a +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_dc_b', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('delim_docs', 'body', 'doc1') AS _mat \gset + +\connect cloudsync_block_r5_b +\ir helper_psql_conn_setup.sql +SELECT cloudsync_payload_apply(decode(:'payload_dc_a', 'hex')) AS _app \gset +SELECT cloudsync_text_materialize('delim_docs', 'body', 'doc1') AS _mat \gset + +-- Both should converge and both edits should be present +\connect cloudsync_block_r5_a +SELECT md5(body) AS dc_md5_a FROM delim_docs WHERE id = 'doc1' \gset +\connect cloudsync_block_r5_b +SELECT md5(body) AS dc_md5_b FROM delim_docs WHERE id = 'doc1' \gset + +SELECT (:'dc_md5_a' = :'dc_md5_b') AS dc_converge \gset +\if :dc_converge +\echo [PASS] (:testid) DelimContent: concurrent paragraph edits converge +\else +\echo [FAIL] (:testid) DelimContent: concurrent paragraph edits diverged +SELECT (:fail::int + 1) AS fail \gset +\endif + +\connect cloudsync_block_r5_a +SELECT (position('Edited para one by A.' in body) > 0) AS dc_has_a FROM delim_docs WHERE id = 'doc1' \gset +SELECT (position('Edited para three by B.' 
in body) > 0) AS dc_has_b FROM delim_docs WHERE id = 'doc1' \gset + +\if :dc_has_a +\echo [PASS] (:testid) DelimContent: site A paragraph edit present +\else +\echo [FAIL] (:testid) DelimContent: site A paragraph edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +\if :dc_has_b +\echo [PASS] (:testid) DelimContent: site B paragraph edit present +\else +\echo [FAIL] (:testid) DelimContent: site B paragraph edit missing +SELECT (:fail::int + 1) AS fail \gset +\endif + +-- ============================================================ +-- Cleanup +-- ============================================================ +\ir helper_test_cleanup.sql +\if :should_cleanup +\ir helper_psql_conn_setup.sql +DROP DATABASE IF EXISTS cloudsync_block_r5_a; +DROP DATABASE IF EXISTS cloudsync_block_r5_b; +\else +\echo [INFO] Cleanup skipped: test databases preserved for inspection +\endif diff --git a/test/postgresql/full_test.sql b/test/postgresql/full_test.sql index 12f020f..b86ac24 100644 --- a/test/postgresql/full_test.sql +++ b/test/postgresql/full_test.sql @@ -34,6 +34,19 @@ \ir 24_nullable_types_roundtrip.sql \ir 25_boolean_type_issue.sql \ir 26_row_filter.sql +\ir 27_rls_batch_merge.sql +\ir 28_db_version_tracking.sql +\ir 29_rls_multicol.sql +\ir 30_null_prikey_insert.sql + +\ir 31_alter_table_sync.sql +\ir 32_block_lww.sql +\ir 33_block_lww_extended.sql +\ir 34_block_lww_advanced.sql +\ir 35_block_lww_edge_cases.sql +\ir 36_block_lww_round3.sql +\ir 37_block_lww_round4.sql +\ir 38_block_lww_round5.sql -- 'Test summary' \echo '\nTest summary:' diff --git a/test/unit.c b/test/unit.c index 80ac905..0487f9d 100644 --- a/test/unit.c +++ b/test/unit.c @@ -169,7 +169,7 @@ DATABASE_RESULT unit_exec (cloudsync_context *data, const char *sql, const char char *buffer = NULL; if (type == SQLITE_BLOB) { - const void *bvalue = database_column_blob(pstmt, i); + const void *bvalue = database_column_blob(pstmt, i, NULL); if (bvalue) { buffer = (char *)cloudsync_memory_alloc(len); if (!buffer) {rc = SQLITE_NOMEM; goto unitexec_finalize;} 
@@ -405,161 +405,6 @@ bool file_delete_internal (const char *path) { // MARK: - -#ifndef UNITTEST_OMIT_RLS_VALIDATION -typedef struct { - bool in_savepoint; - bool is_approved; - bool last_is_delete; - char *last_tbl; - void *last_pk; - int64_t last_pk_len; - int64_t last_db_version; -} unittest_payload_apply_rls_status; - -bool unittest_validate_changed_row(sqlite3 *db, cloudsync_context *data, char *tbl_name, void *pk, int64_t pklen) { - // verify row - bool ret = false; - bool vm_persistent; - sqlite3_stmt *vm = cloudsync_colvalue_stmt(data, tbl_name, &vm_persistent); - if (!vm) goto cleanup; - - // bind primary key values (the return code is the pk count) - int rc = pk_decode_prikey((char *)pk, (size_t)pklen, pk_decode_bind_callback, (void *)vm); - if (rc < 0) goto cleanup; - - // execute vm - rc = sqlite3_step(vm); - if (rc == SQLITE_DONE) { - rc = SQLITE_OK; - } else if (rc == SQLITE_ROW) { - rc = SQLITE_OK; - ret = true; - } - -cleanup: - if (vm_persistent) sqlite3_reset(vm); - else sqlite3_finalize(vm); - - return ret; -} - -int unittest_payload_apply_reset_transaction(sqlite3 *db, unittest_payload_apply_rls_status *s, bool create_new) { - int rc = SQLITE_OK; - - if (s->in_savepoint == true) { - if (s->is_approved) rc = sqlite3_exec(db, "RELEASE unittest_payload_apply_transaction", NULL, NULL, NULL); - else rc = sqlite3_exec(db, "ROLLBACK TO unittest_payload_apply_transaction; RELEASE unittest_payload_apply_transaction", NULL, NULL, NULL); - if (rc == SQLITE_OK) s->in_savepoint = false; - } - if (create_new) { - rc = sqlite3_exec(db, "SAVEPOINT unittest_payload_apply_transaction", NULL, NULL, NULL); - if (rc == SQLITE_OK) s->in_savepoint = true; - } - return rc; -} - -bool unittest_payload_apply_rls_callback(void **xdata, cloudsync_pk_decode_bind_context *d, void *_db, void *_data, int step, int rc) { - sqlite3 *db = (sqlite3 *)_db; - cloudsync_context *data = (cloudsync_context *)_data; - - bool is_approved = false; - unittest_payload_apply_rls_status *s; 
- if (*xdata) { - s = (unittest_payload_apply_rls_status *)*xdata; - } else { - s = cloudsync_memory_zeroalloc(sizeof(unittest_payload_apply_rls_status)); - s->is_approved = true; - *xdata = s; - } - - // extract context info - int64_t colname_len = 0; - char *colname = cloudsync_pk_context_colname(d, &colname_len); - - int64_t tbl_len = 0; - char *tbl = cloudsync_pk_context_tbl(d, &tbl_len); - - int64_t pk_len = 0; - void *pk = cloudsync_pk_context_pk(d, &pk_len); - - int64_t cl = cloudsync_pk_context_cl(d); - int64_t db_version = cloudsync_pk_context_dbversion(d); - - switch (step) { - case CLOUDSYNC_PAYLOAD_APPLY_WILL_APPLY: { - // if the tbl name or the prikey has changed, then verify if the row is valid - // must use strncmp because strings in xdata are not zero-terminated - bool tbl_changed = (s->last_tbl && (strlen(s->last_tbl) != (size_t)tbl_len || strncmp(s->last_tbl, tbl, (size_t)tbl_len) != 0)); - bool pk_changed = (s->last_pk && pk && cloudsync_blob_compare(s->last_pk, s->last_pk_len, pk, pk_len) != 0); - if (s->is_approved - && !s->last_is_delete - && (tbl_changed || pk_changed)) { - s->is_approved = unittest_validate_changed_row(db, data, s->last_tbl, s->last_pk, s->last_pk_len); - } - - s->last_is_delete = ((size_t)colname_len == strlen(CLOUDSYNC_TOMBSTONE_VALUE) && - strncmp(colname, CLOUDSYNC_TOMBSTONE_VALUE, (size_t)colname_len) == 0 - ) && cl % 2 == 0; - - // update the last_tbl value, if needed - if (!s->last_tbl || - !tbl || - (strlen(s->last_tbl) != (size_t)tbl_len) || - strncmp(s->last_tbl, tbl, (size_t)tbl_len) != 0) { - if (s->last_tbl) cloudsync_memory_free(s->last_tbl); - if (tbl && tbl_len > 0) s->last_tbl = cloudsync_string_ndup(tbl, tbl_len); - else s->last_tbl = NULL; - } - - // update the last_prikey and len values, if needed - if (!s->last_pk || !pk || cloudsync_blob_compare(s->last_pk, s->last_pk_len, pk, pk_len) != 0) { - if (s->last_pk) cloudsync_memory_free(s->last_pk); - if (pk && pk_len > 0) { - s->last_pk = 
cloudsync_memory_alloc(pk_len); - memcpy(s->last_pk, pk, pk_len); - s->last_pk_len = pk_len; - } else { - s->last_pk = NULL; - s->last_pk_len = 0; - } - } - - // commit the previous transaction, if any - // begin new transacion, if needed - if (s->last_db_version != db_version) { - rc = unittest_payload_apply_reset_transaction(db, s, true); - if (rc != SQLITE_OK) printf("unittest_payload_apply error in reset_transaction: (%d) %s\n", rc, sqlite3_errmsg(db)); - - // reset local variables - s->last_db_version = db_version; - s->is_approved = true; - } - - is_approved = s->is_approved; - break; - } - case CLOUDSYNC_PAYLOAD_APPLY_DID_APPLY: - is_approved = s->is_approved; - break; - case CLOUDSYNC_PAYLOAD_APPLY_CLEANUP: - if (s->is_approved && !s->last_is_delete) s->is_approved = unittest_validate_changed_row(db, data, s->last_tbl, s->last_pk, s->last_pk_len); - rc = unittest_payload_apply_reset_transaction(db, s, false); - if (s->last_tbl) cloudsync_memory_free(s->last_tbl); - if (s->last_pk) { - cloudsync_memory_free(s->last_pk); - s->last_pk_len = 0; - } - is_approved = s->is_approved; - - cloudsync_memory_free(s); - *xdata = NULL; - break; - } - - return is_approved; -} -#endif - // MARK: - #ifndef CLOUDSYNC_OMIT_PRINT_RESULT @@ -1716,7 +1561,7 @@ bool do_test_pk (sqlite3 *db, int ntest, bool print_result) { if (do_test_pk_single_value(db, SQLITE_INTEGER, -15592946911031981, 0, NULL, print_result) == false) goto finalize; if (do_test_pk_single_value(db, SQLITE_INTEGER, -922337203685477580, 0, NULL, print_result) == false) goto finalize; if (do_test_pk_single_value(db, SQLITE_FLOAT, 0, -9223372036854775.808, NULL, print_result) == false) goto finalize; - if (do_test_pk_single_value(db, SQLITE_NULL, 0, 0, NULL, print_result) == false) goto finalize; + // SQLITE_NULL is no longer valid for primary keys (runtime NULL check rejects it) if (do_test_pk_single_value(db, SQLITE_TEXT, 0, 0, "Hello World", print_result) == false) goto finalize; char blob[] = 
{0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16}; if (do_test_pk_single_value(db, SQLITE_BLOB, sizeof(blob), 0, blob, print_result) == false) goto finalize; @@ -1932,8 +1777,7 @@ bool do_test_dbutils (void) { // manually load extension sqlite3_cloudsync_init(db, NULL, NULL); - cloudsync_set_payload_apply_callback(db, unittest_payload_apply_rls_callback); - + // test context create and free data = cloudsync_context_create(db); if (!data) return false; @@ -2082,8 +1926,8 @@ bool do_test_dbutils (void) { char *site_id_blob; int64_t site_id_blob_size; - int64_t dbver1, seq1; - rc = database_select_blob_2int(data, "SELECT cloudsync_siteid(), cloudsync_db_version(), cloudsync_seq();", &site_id_blob, &site_id_blob_size, &dbver1, &seq1); + int64_t dbver1; + rc = database_select_blob_int(data, "SELECT cloudsync_siteid(), cloudsync_db_version();", &site_id_blob, &site_id_blob_size, &dbver1); if (rc != SQLITE_OK || site_id_blob == NULL ||dbver1 != db_version) goto finalize; cloudsync_memory_free(site_id_blob); @@ -2173,6 +2017,43 @@ bool do_test_error_cases (sqlite3 *db) { return true; } +bool do_test_null_prikey_insert (sqlite3 *db) { + // Create a table with a primary key that allows NULL (no NOT NULL constraint) + const char *sql = "CREATE TABLE IF NOT EXISTS t_null_pk (id TEXT PRIMARY KEY, value TEXT);" + "SELECT cloudsync_init('t_null_pk');"; + int rc = sqlite3_exec(db, sql, NULL, NULL, NULL); + if (rc != SQLITE_OK) return false; + + // Attempt to insert a row with NULL primary key — should fail + char *errmsg = NULL; + sql = "INSERT INTO t_null_pk (id, value) VALUES (NULL, 'test');"; + rc = sqlite3_exec(db, sql, NULL, NULL, &errmsg); + if (rc == SQLITE_OK) return false; // should have failed + if (!errmsg) return false; + + // Verify the error message matches the expected format + const char *expected = "Insert aborted because primary key in table t_null_pk contains NULL values."; + bool match = (strcmp(errmsg, expected) == 0); + sqlite3_free(errmsg); + if (!match) return 
false; + + // Verify that a non-NULL primary key insert succeeds + sql = "INSERT INTO t_null_pk (id, value) VALUES ('valid_id', 'test');"; + rc = sqlite3_exec(db, sql, NULL, NULL, NULL); + if (rc != SQLITE_OK) return false; + + // Verify the metatable has exactly 1 row (only the valid insert) + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db, "SELECT COUNT(*) FROM t_null_pk_cloudsync;", -1, &stmt, NULL); + if (rc != SQLITE_OK) return false; + if (sqlite3_step(stmt) != SQLITE_ROW) { sqlite3_finalize(stmt); return false; } + int count = sqlite3_column_int(stmt, 0); + sqlite3_finalize(stmt); + if (count != 1) return false; + + return true; +} + bool do_test_internal_functions (void) { sqlite3 *db = NULL; sqlite3_stmt *vm = NULL; @@ -2381,8 +2262,8 @@ bool do_test_pk_decode_count_from_buffer(void) { rc = sqlite3_cloudsync_init(db, NULL, NULL); if (rc != SQLITE_OK) goto cleanup; - // Encode multiple values - const char *sql = "SELECT cloudsync_pk_encode(123, 'text value', 3.14, X'DEADBEEF', NULL);"; + // Encode multiple values (no NULL — primary keys cannot contain NULL) + const char *sql = "SELECT cloudsync_pk_encode(123, 'text value', 3.14, X'DEADBEEF');"; rc = sqlite3_prepare_v2(db, sql, -1, &stmt, NULL); if (rc != SQLITE_OK) goto cleanup; @@ -2403,7 +2284,7 @@ bool do_test_pk_decode_count_from_buffer(void) { // The count is embedded in the first byte of the encoded pk size_t seek = 0; int n = pk_decode(buffer, (size_t)pklen, -1, &seek, -1, NULL, NULL); - if (n != 5) goto cleanup; // Should decode 5 values + if (n != 4) goto cleanup; // Should decode 4 values result = true; @@ -2849,8 +2730,8 @@ bool do_test_sql_pk_decode(void) { rc = sqlite3_cloudsync_init(db, NULL, NULL); if (rc != SQLITE_OK) goto cleanup; - // Create a primary key with multiple values - rc = sqlite3_prepare_v2(db, "SELECT cloudsync_pk_encode(123, 'hello', 3.14, X'DEADBEEF', NULL);", -1, &stmt, NULL); + // Create a primary key with multiple values (no NULL — primary keys cannot contain 
NULL) + rc = sqlite3_prepare_v2(db, "SELECT cloudsync_pk_encode(123, 'hello', 3.14, X'DEADBEEF');", -1, &stmt, NULL); if (rc != SQLITE_OK) goto cleanup; rc = sqlite3_step(stmt); @@ -2934,21 +2815,6 @@ bool do_test_sql_pk_decode(void) { sqlite3_finalize(stmt); stmt = NULL; - // Test cloudsync_pk_decode for NULL (index 5) - rc = sqlite3_prepare_v2(db, "SELECT cloudsync_pk_decode(?, 5);", -1, &stmt, NULL); - if (rc != SQLITE_OK) goto cleanup; - - rc = sqlite3_bind_blob(stmt, 1, pk_copy, pk_len, SQLITE_STATIC); - if (rc != SQLITE_OK) goto cleanup; - - rc = sqlite3_step(stmt); - if (rc != SQLITE_ROW) goto cleanup; - - if (sqlite3_column_type(stmt, 0) != SQLITE_NULL) goto cleanup; - - sqlite3_finalize(stmt); - stmt = NULL; - result = true; cleanup: @@ -3881,8 +3747,7 @@ sqlite3 *do_create_database (void) { // manually load extension sqlite3_cloudsync_init(db, NULL, NULL); - cloudsync_set_payload_apply_callback(db, unittest_payload_apply_rls_callback); - + return db; } @@ -3894,7 +3759,7 @@ void do_build_database_path (char buf[256], int i, time_t timestamp, int ntest) #endif } -sqlite3 *do_create_database_file_v2 (int i, time_t timestamp, int ntest, bool set_payload_apply_callback) { +sqlite3 *do_create_database_file_v2 (int i, time_t timestamp, int ntest) { sqlite3 *db = NULL; // open database in home dir @@ -3906,18 +3771,17 @@ sqlite3 *do_create_database_file_v2 (int i, time_t timestamp, int ntest, bool se sqlite3_close(db); return NULL; } - + sqlite3_exec(db, "PRAGMA journal_mode=WAL;", NULL, NULL, NULL); - + // manually load extension sqlite3_cloudsync_init(db, NULL, NULL); - if (set_payload_apply_callback) cloudsync_set_payload_apply_callback(db, unittest_payload_apply_rls_callback); return db; } sqlite3 *do_create_database_file (int i, time_t timestamp, int ntest) { - return do_create_database_file_v2(i, timestamp, ntest, false); + return do_create_database_file_v2(i, timestamp, ntest); } bool do_test_merge (int nclients, bool print_result, bool cleanup_databases) 
{ @@ -3939,7 +3803,7 @@ bool do_test_merge (int nclients, bool print_result, bool cleanup_databases) { time_t timestamp = time(NULL); int saved_counter = test_counter; for (int i=0; i<nclients; ++i) { + if (nclients >= MAX_SIMULATED_CLIENTS) { + nclients = MAX_SIMULATED_CLIENTS; + } else if (nclients < 2) { + nclients = 2; + } + + time_t timestamp = time(NULL); + int saved_counter = test_counter; + for (int i = 0; i < nclients; ++i) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (db[i] == NULL) return false; + + rc = sqlite3_exec(db[i], "CREATE TABLE tasks (id TEXT PRIMARY KEY NOT NULL, user_id TEXT, title TEXT, priority INTEGER);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('tasks');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + } + + // --- Phase 1: baseline sync (no triggers) --- + rc = sqlite3_exec(db[0], "INSERT INTO tasks VALUES ('t1', 'user1', 'Task 1', 3);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + rc = sqlite3_exec(db[0], "INSERT INTO tasks VALUES ('t2', 'user2', 'Task 2', 5);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + rc = sqlite3_exec(db[0], "INSERT INTO tasks VALUES ('t3', 'user1', 'Task 3', 1);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; - - // manually load extension - sqlite3_cloudsync_init(db, NULL, NULL); - cloudsync_set_payload_apply_callback(db, unittest_payload_apply_rls_callback); - printf("Testing CloudSync version %s\n", CLOUDSYNC_VERSION); - printf("=================================\n"); + if (do_merge_using_payload(db[0], db[1], only_locals, true) == false) goto finalize; - result += test_report("PK Test:", do_test_pk(db, 10000, print_result)); - result += test_report("UUID Test:", do_test_uuid(db, 1000, print_result)); - result += test_report("Comparison Test:", do_test_compare(db, print_result)); - result += test_report("RowID Test:", do_test_rowid(50000, print_result)); - result += test_report("Algo Names Test:", 
do_test_algo_names()); - result += test_report("DBUtils Test:", do_test_dbutils()); - result += test_report("Minor Test:", do_test_others(db)); - result += test_report("Test Error Cases:", do_test_error_cases(db)); - result += test_report("Test Single PK:", do_test_single_pk(print_result)); - - int test_mask = TEST_INSERT | TEST_UPDATE | TEST_DELETE; - int table_mask = TEST_PRIKEYS | TEST_NOCOLS; - #if !CLOUDSYNC_DISABLE_ROWIDONLY_TABLES - table_mask |= TEST_NOPRIKEYS; - #endif - - // test local changes - result += test_report("Local Test:", do_test_local(test_mask, table_mask, db, print_result)); - result += test_report("VTab Test: ", do_test_vtab(db)); - result += test_report("Functions Test:", do_test_functions(db, print_result)); - result += test_report("Functions Test (Int):", do_test_internal_functions()); - result += test_report("String Func Test:", do_test_string_replace_prefix()); - result += test_report("String Lowercase Test:", do_test_string_lowercase()); - result += test_report("Context Functions Test:", do_test_context_functions()); - result += test_report("PK Decode Count Test:", do_test_pk_decode_count_from_buffer()); - result += test_report("Error Handling Test:", do_test_error_handling()); - result += test_report("Terminate Test:", do_test_terminate()); - result += test_report("Hash Function Test:", do_test_hash_function()); - result += test_report("Blob Compare Test:", do_test_blob_compare()); - result += test_report("Blob Compare Large:", do_test_blob_compare_large_sizes()); - result += test_report("Deterministic Flags:", do_test_deterministic_flags()); - result += test_report("Schema Hash Roundtrip:", do_test_schema_hash_consistency()); - result += test_report("String Functions Test:", do_test_string_functions()); - result += test_report("UUID Functions Test:", do_test_uuid_functions()); - result += test_report("RowID Decode Test:", do_test_rowid_decode()); - result += test_report("SQL Schema Funcs Test:", do_test_sql_schema_functions()); - 
result += test_report("SQL PK Decode Test:", do_test_sql_pk_decode()); - result += test_report("PK Negative Values Test:", do_test_pk_negative_values()); - result += test_report("Settings Functions Test:", do_test_settings_functions()); - result += test_report("Sync/Enabled Funcs Test:", do_test_sync_enabled_functions()); - result += test_report("SQL UUID Func Test:", do_test_sql_uuid_function()); - result += test_report("PK Encode Edge Cases:", do_test_pk_encode_edge_cases()); - result += test_report("Col Value Func Test:", do_test_col_value_function()); - result += test_report("Is Sync Func Test:", do_test_is_sync_function()); - result += test_report("Insert/Update/Delete:", do_test_insert_update_delete_sql()); - result += test_report("Binary Comparison Test:", do_test_binary_comparison()); - result += test_report("PK Decode Malformed:", do_test_pk_decode_malformed()); - result += test_report("Test Many Columns:", do_test_many_columns(600, db)); - result += test_report("Payload Buffer Test (500KB):", do_test_payload_buffer(500 * 1024)); - result += test_report("Payload Buffer Test (600KB):", do_test_payload_buffer(600 * 1024)); - result += test_report("Payload Buffer Test (1MB):", do_test_payload_buffer(1024 * 1024)); - result += test_report("Payload Buffer Test (10MB):", do_test_payload_buffer(10 * 1024 * 1024)); + // Verify: B has 3 rows + { + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[1], "SELECT COUNT(*) FROM tasks;", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto finalize; + if (sqlite3_step(stmt) != SQLITE_ROW) { sqlite3_finalize(stmt); goto finalize; } + int count = sqlite3_column_int(stmt, 0); + sqlite3_finalize(stmt); + if (count != 3) { + printf("Phase 1: expected 3 rows, got %d\n", count); + goto finalize; + } + } - // close local database - close_db(db); - db = NULL; - - // simulate remote merge - result += test_report("Merge Test:", do_test_merge(3, print_result, cleanup_databases)); - result += test_report("Merge Test 2:", 
do_test_merge_2(3, TEST_PRIKEYS, print_result, cleanup_databases)); - result += test_report("Merge Test 3:", do_test_merge_2(3, TEST_NOCOLS, print_result, cleanup_databases)); - result += test_report("Merge Test 4:", do_test_merge_4(2, print_result, cleanup_databases)); - result += test_report("Merge Test 5:", do_test_merge_5(2, print_result, cleanup_databases, false)); - result += test_report("Merge Test db_version 1:", do_test_merge_check_db_version(2, print_result, cleanup_databases, true, false)); - result += test_report("Merge Test db_version 1-cb:", do_test_merge_check_db_version(2, print_result, cleanup_databases, true, true)); - result += test_report("Merge Test db_version 2:", do_test_merge_check_db_version_2(2, print_result, cleanup_databases, true, false)); - result += test_report("Merge Test db_version 2-cb:", do_test_merge_check_db_version_2(2, print_result, cleanup_databases, true, true)); - result += test_report("Merge Test Insert Changes", do_test_insert_cloudsync_changes(print_result, cleanup_databases)); - result += test_report("Merge Alter Schema 1:", do_test_merge_alter_schema_1(2, print_result, cleanup_databases, false)); - result += test_report("Merge Alter Schema 2:", do_test_merge_alter_schema_2(2, print_result, cleanup_databases, false)); - result += test_report("Merge Two Tables Test:", do_test_merge_two_tables(2, print_result, cleanup_databases)); - result += test_report("Merge Conflicting PKeys:", do_test_merge_conflicting_pkeys(2, print_result, cleanup_databases)); - result += test_report("Merge Large Dataset:", do_test_merge_large_dataset(3, print_result, cleanup_databases)); - result += test_report("Merge Nested Transactions:", do_test_merge_nested_transactions(2, print_result, cleanup_databases)); - result += test_report("Merge Three Way:", do_test_merge_three_way(3, print_result, cleanup_databases)); - result += test_report("Merge NULL Values:", do_test_merge_null_values(2, print_result, cleanup_databases)); - result += 
test_report("Merge BLOB Data:", do_test_merge_blob_data(2, print_result, cleanup_databases)); - result += test_report("Merge Mixed Operations:", do_test_merge_mixed_operations(2, print_result, cleanup_databases)); - result += test_report("Merge Hub-Spoke:", do_test_merge_hub_spoke(4, print_result, cleanup_databases)); - result += test_report("Merge Timestamp Precision:", do_test_merge_timestamp_precision(2, print_result, cleanup_databases)); - result += test_report("Merge Partial Failure:", do_test_merge_partial_failure(2, print_result, cleanup_databases)); - result += test_report("Merge Rollback Scenarios:", do_test_merge_rollback_scenarios(2, print_result, cleanup_databases)); - result += test_report("Merge Circular:", do_test_merge_circular(3, print_result, cleanup_databases)); - result += test_report("Merge Foreign Keys:", do_test_merge_foreign_keys(2, print_result, cleanup_databases)); - // Expected failure: TRIGGERs are not fully supported by this extension. - // result += test_report("Merge Triggers:", do_test_merge_triggers(2, print_result, cleanup_databases)); - result += test_report("Merge Index Consistency:", do_test_merge_index_consistency(2, print_result, cleanup_databases)); - result += test_report("Merge JSON Columns:", do_test_merge_json_columns(2, print_result, cleanup_databases)); - result += test_report("Merge Concurrent Attempts:", do_test_merge_concurrent_attempts(3, print_result, cleanup_databases)); - result += test_report("Merge Composite PK 10 Clients:", do_test_merge_composite_pk_10_clients(10, print_result, cleanup_databases)); - result += test_report("PriKey NULL Test:", do_test_prikey(2, print_result, cleanup_databases)); - result += test_report("Test Double Init:", do_test_double_init(2, cleanup_databases)); - - // test grow-only set - result += test_report("Test GrowOnlySet:", do_test_gos(6, print_result, cleanup_databases)); - result += test_report("Test Network Enc/Dec:", do_test_network_encode_decode(2, print_result, 
cleanup_databases, false)); - result += test_report("Test Network Enc/Dec 2:", do_test_network_encode_decode(2, print_result, cleanup_databases, true)); - result += test_report("Test Fill Initial Data:", do_test_fill_initial_data(3, print_result, cleanup_databases)); - result += test_report("Test Alter Table 1:", do_test_alter(3, 1, print_result, cleanup_databases)); - result += test_report("Test Alter Table 2:", do_test_alter(3, 2, print_result, cleanup_databases)); - result += test_report("Test Alter Table 3:", do_test_alter(3, 3, print_result, cleanup_databases)); + // --- Phase 2: INSERT denial with triggers on B --- + rc = sqlite3_exec(db[1], + "CREATE TRIGGER rls_deny_insert BEFORE INSERT ON tasks " + "FOR EACH ROW WHEN NEW.user_id != 'user1' " + "BEGIN SELECT RAISE(ABORT, 'row violates RLS policy'); END;", + NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; - // test row-level filter - result += test_report("Test Row Filter:", do_test_row_filter(2, print_result, cleanup_databases)); + rc = sqlite3_exec(db[1], + "CREATE TRIGGER rls_deny_update BEFORE UPDATE ON tasks " + "FOR EACH ROW WHEN NEW.user_id != 'user1' " + "BEGIN SELECT RAISE(ABORT, 'row violates RLS policy'); END;", + NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + + rc = sqlite3_exec(db[0], "INSERT INTO tasks VALUES ('t4', 'user1', 'Task 4', 2);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + rc = sqlite3_exec(db[0], "INSERT INTO tasks VALUES ('t5', 'user2', 'Task 5', 7);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + + // Merge with partial-failure tolerance: cloudsync_payload_decode returns error + // when any PK is denied, but allowed PKs are already committed via per-PK savepoints. + { + sqlite3_stmt *sel = NULL, *ins = NULL; + const char *sel_sql = only_locals + ? 
"SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) FROM cloudsync_changes WHERE site_id=cloudsync_siteid();" + : "SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) FROM cloudsync_changes;"; + rc = sqlite3_prepare_v2(db[0], sel_sql, -1, &sel, NULL); + if (rc != SQLITE_OK) { sqlite3_finalize(sel); goto finalize; } + rc = sqlite3_prepare_v2(db[1], "SELECT cloudsync_payload_decode(?);", -1, &ins, NULL); + if (rc != SQLITE_OK) { sqlite3_finalize(sel); sqlite3_finalize(ins); goto finalize; } + + while (sqlite3_step(sel) == SQLITE_ROW) { + sqlite3_value *v = sqlite3_column_value(sel, 0); + if (sqlite3_value_type(v) == SQLITE_NULL) continue; + sqlite3_bind_value(ins, 1, v); + sqlite3_step(ins); // partial failure expected — ignore rc + sqlite3_reset(ins); + } + sqlite3_finalize(sel); + sqlite3_finalize(ins); + } + + // Verify: t4 present (user1 → allowed) + { + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[1], "SELECT COUNT(*) FROM tasks WHERE id='t4';", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto finalize; + if (sqlite3_step(stmt) != SQLITE_ROW) { sqlite3_finalize(stmt); goto finalize; } + int count = sqlite3_column_int(stmt, 0); + sqlite3_finalize(stmt); + if (count != 1) { + printf("Phase 2: t4 expected 1 row, got %d\n", count); + goto finalize; + } + } + + // Verify: t5 absent (user2 → denied) + { + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[1], "SELECT COUNT(*) FROM tasks WHERE id='t5';", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto finalize; + if (sqlite3_step(stmt) != SQLITE_ROW) { sqlite3_finalize(stmt); goto finalize; } + int count = sqlite3_column_int(stmt, 0); + sqlite3_finalize(stmt); + if (count != 0) { + printf("Phase 2: t5 expected 0 rows, got %d\n", count); + goto finalize; + } + } + + // Verify: total 4 rows on B (t1, t2, t3 from phase 1 + t4) + { + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[1], "SELECT 
COUNT(*) FROM tasks;", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto finalize; + if (sqlite3_step(stmt) != SQLITE_ROW) { sqlite3_finalize(stmt); goto finalize; } + int count = sqlite3_column_int(stmt, 0); + sqlite3_finalize(stmt); + if (count != 4) { + printf("Phase 2: expected 4 total rows, got %d\n", count); + goto finalize; + } + } + + // --- Phase 3: UPDATE denial --- + rc = sqlite3_exec(db[0], "UPDATE tasks SET title='Task 1 Updated', priority=10 WHERE id='t1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + rc = sqlite3_exec(db[0], "UPDATE tasks SET title='Task 2 Hacked', priority=99 WHERE id='t2';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto finalize; + + // Merge with partial-failure tolerance (same pattern as phase 2) + { + sqlite3_stmt *sel = NULL, *ins = NULL; + const char *sel_sql = only_locals + ? "SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) FROM cloudsync_changes WHERE site_id=cloudsync_siteid();" + : "SELECT cloudsync_payload_encode(tbl, pk, col_name, col_value, col_version, db_version, site_id, cl, seq) FROM cloudsync_changes;"; + rc = sqlite3_prepare_v2(db[0], sel_sql, -1, &sel, NULL); + if (rc != SQLITE_OK) { sqlite3_finalize(sel); goto finalize; } + rc = sqlite3_prepare_v2(db[1], "SELECT cloudsync_payload_decode(?);", -1, &ins, NULL); + if (rc != SQLITE_OK) { sqlite3_finalize(sel); sqlite3_finalize(ins); goto finalize; } + + while (sqlite3_step(sel) == SQLITE_ROW) { + sqlite3_value *v = sqlite3_column_value(sel, 0); + if (sqlite3_value_type(v) == SQLITE_NULL) continue; + sqlite3_bind_value(ins, 1, v); + sqlite3_step(ins); // partial failure expected — ignore rc + sqlite3_reset(ins); + } + sqlite3_finalize(sel); + sqlite3_finalize(ins); + } + + // Verify: t1 updated (user1 → allowed) + { + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[1], "SELECT title, priority FROM tasks WHERE id='t1';", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto finalize; + if 
(sqlite3_step(stmt) != SQLITE_ROW) { sqlite3_finalize(stmt); goto finalize; } + const char *title = (const char *)sqlite3_column_text(stmt, 0); + int priority = sqlite3_column_int(stmt, 1); + char title_buf[128]; + snprintf(title_buf, sizeof(title_buf), "%s", title ? title : ""); + sqlite3_finalize(stmt); // invalidates the column_text pointer, so compare/print the copy + bool ok = (strcmp(title_buf, "Task 1 Updated") == 0) && (priority == 10); + if (!ok) { + printf("Phase 3: t1 update not applied (title='%s', priority=%d)\n", title_buf, priority); + goto finalize; + } + } + + // Verify: t2 unchanged (user2 → denied) + { + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[1], "SELECT title, priority FROM tasks WHERE id='t2';", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto finalize; + if (sqlite3_step(stmt) != SQLITE_ROW) { sqlite3_finalize(stmt); goto finalize; } + const char *title = (const char *)sqlite3_column_text(stmt, 0); + int priority = sqlite3_column_int(stmt, 1); + char title_buf[128]; + snprintf(title_buf, sizeof(title_buf), "%s", title ? title : ""); + sqlite3_finalize(stmt); // invalidates the column_text pointer, so compare/print the copy + bool ok = (strcmp(title_buf, "Task 2") == 0) && (priority == 5); + if (!ok) { + printf("Phase 3: t2 should be unchanged (title='%s', priority=%d)\n", title_buf, priority); + goto finalize; + } + } + + result = true; + rc = SQLITE_OK; + +finalize: + for (int i = 0; i < nclients; ++i) { + if (rc != SQLITE_OK && db[i] && (sqlite3_errcode(db[i]) != SQLITE_OK)) + printf("do_test_rls_trigger_denial error: %s\n", sqlite3_errmsg(db[i])); + if (db[i]) { + if (sqlite3_get_autocommit(db[i]) == 0) { + result = false; + printf("do_test_rls_trigger_denial error: db %d is in transaction\n", i); + } + int counter = close_db(db[i]); + if (counter > 0) { + result = false; + printf("do_test_rls_trigger_denial error: db %d has %d unterminated statements\n", i, counter); + } + } + if (cleanup_databases) { + char buf[256]; + do_build_database_path(buf, i, timestamp, saved_counter++); + file_delete_internal(buf); + } + } + return result; +} + +// MARK: - Block-level LWW Tests - + +static int64_t do_select_int(sqlite3 *db, const char *sql) { + sqlite3_stmt *stmt = NULL; + int64_t val = -1; + if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) == SQLITE_OK) { 
+ if (sqlite3_step(stmt) == SQLITE_ROW) { + val = sqlite3_column_int64(stmt, 0); + } + } + if (stmt) sqlite3_finalize(stmt); + return val; +} + +static char *do_select_text(sqlite3 *db, const char *sql) { + sqlite3_stmt *stmt = NULL; + char *val = NULL; + if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) == SQLITE_OK) { + if (sqlite3_step(stmt) == SQLITE_ROW) { + const char *t = (const char *)sqlite3_column_text(stmt, 0); + if (t) val = sqlite3_mprintf("%s", t); + } + } + if (stmt) sqlite3_finalize(stmt); + return val; +} + +bool do_test_block_lww_insert(int nclients, bool print_result, bool cleanup_databases) { + // Test: INSERT into a table with a block column properly splits text into blocks + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_insert: CREATE TABLE failed: %s\n", sqlite3_errmsg(db[i])); goto fail; } + + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_insert: cloudsync_init failed: %s\n", sqlite3_errmsg(db[i])); goto fail; } + + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_insert: set_column failed: %s\n", sqlite3_errmsg(db[i])); goto fail; } + } + + // Insert a document with 3 lines + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line 1\nLine 2\nLine 3');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_insert: INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Verify blocks were created in the blocks table + int64_t block_count = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = 
cloudsync_pk_encode('doc1');"); + if (block_count != 3) { + printf("block_insert: expected 3 blocks, got %" PRId64 "\n", block_count); + goto fail; + } + + // Verify metadata entries for blocks (col_name contains \x1F) + int64_t meta_count = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name LIKE 'body' || x'1f' || '%';"); + if (meta_count != 3) { + printf("block_insert: expected 3 block metadata entries, got %" PRId64 "\n", meta_count); + goto fail; + } + + // Verify no metadata entry for the whole 'body' column + int64_t whole_meta = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name = 'body';"); + if (whole_meta != 0) { + printf("block_insert: expected 0 whole-column metadata entries, got %" PRId64 "\n", whole_meta); + goto fail; + } + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +bool do_test_block_lww_update(int nclients, bool print_result, bool cleanup_databases) { + // Test: UPDATE on a block column performs block diff + sqlite3 *db[1] = {NULL}; + time_t timestamp = time(NULL); + int rc; + + db[0] = do_create_database_file(0, timestamp, test_counter++); + if (!db[0]) return false; + + rc = sqlite3_exec(db[0], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Insert initial text + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'AAA\nBBB\nCCC');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_update: INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + int64_t blocks_before = do_select_int(db[0], "SELECT count(*) FROM 
docs_cloudsync_blocks;"); + + // Update: change middle line and add a new line + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'AAA\nXXX\nCCC\nDDD' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_update: UPDATE failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + int64_t blocks_after = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks;"); + + // Should have 4 blocks after update (AAA, XXX, CCC, DDD) + if (blocks_after != 4) { + printf("block_update: expected 4 blocks after update, got %" PRId64 " (before: %" PRId64 ")\n", blocks_after, blocks_before); + goto fail; + } + + close_db(db[0]); + return true; + +fail: + if (db[0]) close_db(db[0]); + return false; +} + +bool do_test_block_lww_sync(int nclients, bool print_result, bool cleanup_databases) { + // Test: Two sites edit different blocks of the same document; after sync, both edits are preserved + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Site 0 inserts the initial document + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line A\nLine B\nLine C');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_sync: INSERT db[0] failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Sync initial state: db[0] -> db[1] so both have the same document + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("block_sync: initial merge 0->1 failed\n"); goto fail; } + + 
// Site 0: edit first line + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'EDITED A\nLine B\nLine C' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_sync: UPDATE db[0] failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Site 1: edit third line + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'Line A\nLine B\nEDITED C' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_sync: UPDATE db[1] failed: %s\n", sqlite3_errmsg(db[1])); goto fail; } + + // Sync: db[0] -> db[1] (send site 0's edits) + if (!do_merge_using_payload(db[0], db[1], true, true)) { printf("block_sync: merge 0->1 failed\n"); goto fail; } + // Sync: db[1] -> db[0] (send site 1's edits) + if (!do_merge_using_payload(db[1], db[0], true, true)) { printf("block_sync: merge 1->0 failed\n"); goto fail; } + + // Both databases should now have the merged result: "EDITED A\nLine B\nEDITED C" + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1) { + printf("block_sync: could not read body from one or both databases\n"); + ok = false; + } else if (strcmp(body0, body1) != 0) { + printf("block_sync: bodies don't match after sync:\n db[0]: %s\n db[1]: %s\n", body0, body1); + ok = false; + } else { + // Check that both edits were preserved + if (!strstr(body0, "EDITED A")) { + printf("block_sync: missing 'EDITED A' in result: %s\n", body0); + ok = false; + } + if (!strstr(body0, "EDITED C")) { + printf("block_sync: missing 'EDITED C' in result: %s\n", body0); + ok = false; + } + if (!strstr(body0, "Line B")) { + printf("block_sync: missing 'Line B' in result: %s\n", body0); + ok = false; + } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) 
close_db(db[i]); + return false; +} + +bool do_test_block_lww_delete(int nclients, bool print_result, bool cleanup_databases) { + // Test: DELETE on a row with block columns marks tombstone and block metadata is dropped + sqlite3 *db[1] = {NULL}; + time_t timestamp = time(NULL); + int rc; + + db[0] = do_create_database_file(0, timestamp, test_counter++); + if (!db[0]) return false; + + rc = sqlite3_exec(db[0], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Insert a document + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line A\nLine B\nLine C');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_delete: INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Verify blocks and metadata exist + int64_t blocks_before = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks;"); + if (blocks_before != 3) { + printf("block_delete: expected 3 blocks before delete, got %" PRId64 "\n", blocks_before); + goto fail; + } + int64_t meta_before = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name LIKE 'body' || x'1f' || '%';"); + if (meta_before != 3) { + printf("block_delete: expected 3 block metadata before delete, got %" PRId64 "\n", meta_before); + goto fail; + } + + // Delete the row + rc = sqlite3_exec(db[0], "DELETE FROM docs WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_delete: DELETE failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Verify metadata tombstone exists (delete sentinel) + int64_t tombstone = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name = '__[RIP]__' AND col_version % 2 = 0;"); + if 
(tombstone != 1) { + printf("block_delete: expected 1 delete tombstone, got %" PRId64 "\n", tombstone); + goto fail; + } + + // Verify block metadata was dropped (local_drop_meta removes non-tombstone metadata) + int64_t meta_after = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name LIKE 'body' || x'1f' || '%';"); + if (meta_after != 0) { + printf("block_delete: expected 0 block metadata after delete, got %" PRId64 "\n", meta_after); + goto fail; + } + + // Row should be gone from base table + int64_t row_count = do_select_int(db[0], "SELECT count(*) FROM docs WHERE id = 'doc1';"); + if (row_count != 0) { + printf("block_delete: row still in base table after delete\n"); + goto fail; + } + + close_db(db[0]); + return true; + +fail: + if (db[0]) close_db(db[0]); + return false; +} + +bool do_test_block_lww_materialize(int nclients, bool print_result, bool cleanup_databases) { + // Test: cloudsync_text_materialize reconstructs text from blocks after sync + // Sync to a second db where body column is empty, then materialize there + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert multi-line text on db[0] + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Alpha\nBravo\nCharlie\nDelta\nEcho');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_materialize: INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Sync to db[1] — body column on 
db[1] will be populated by payload_apply but + // materialize should reconstruct correctly from blocks + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("block_materialize: merge failed\n"); goto fail; } + + // Materialize on db[1] should reconstruct from blocks + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_materialize: materialize failed: %s\n", sqlite3_errmsg(db[1])); goto fail; } + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body) { + printf("block_materialize: body is NULL after materialize\n"); + goto fail; + } + if (strcmp(body, "Alpha\nBravo\nCharlie\nDelta\nEcho") != 0) { + printf("block_materialize: body mismatch: %s\n", body); + sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + // Also test materialize on db[0] (where body already matches) + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_materialize: materialize on db[0] failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body0 || strcmp(body0, "Alpha\nBravo\nCharlie\nDelta\nEcho") != 0) { + printf("block_materialize: body0 mismatch: %s\n", body0 ? 
body0 : "NULL"); + if (body0) sqlite3_free(body0); + goto fail; + } + sqlite3_free(body0); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +bool do_test_block_lww_empty_text(int nclients, bool print_result, bool cleanup_databases) { + // Test: INSERT with empty body creates a single empty block + sqlite3 *db[1] = {NULL}; + time_t timestamp = time(NULL); + int rc; + + db[0] = do_create_database_file(0, timestamp, test_counter++); + if (!db[0]) return false; + + rc = sqlite3_exec(db[0], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Insert empty text + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', '');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_empty: INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Should have exactly 1 block (empty content) + int64_t block_count = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks;"); + if (block_count != 1) { + printf("block_empty: expected 1 block for empty text, got %" PRId64 "\n", block_count); + goto fail; + } + + // Insert NULL text + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc2', NULL);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_empty: INSERT NULL failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // NULL body should also create 1 block (treated as empty) + int64_t null_blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc2');"); + if (null_blocks != 1) { + printf("block_empty: expected 1 block for NULL 
text, got %" PRId64 "\n", null_blocks); + goto fail; + } + + // Update from empty to multi-line + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'Line1\nLine2' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_empty: UPDATE from empty failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + int64_t updated_blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (updated_blocks != 2) { + printf("block_empty: expected 2 blocks after update from empty, got %" PRId64 "\n", updated_blocks); + goto fail; + } + + close_db(db[0]); + return true; + +fail: + if (db[0]) close_db(db[0]); + return false; +} + +bool do_test_block_lww_conflict(int nclients, bool print_result, bool cleanup_databases) { + // Test: Two sites edit the SAME line concurrently; LWW picks the later write + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Site 0 inserts initial document + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Same\nMiddle\nEnd');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_conflict: INSERT db[0] failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Sync initial state: db[0] -> db[1] + if (!do_merge_values(db[0], db[1], false)) { printf("block_conflict: initial merge failed\n"); goto fail; } + + // Site 0: edit first line + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 
'Site0\nMiddle\nEnd' WHERE id = 'doc1';", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_conflict: UPDATE db[0] failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+
+    // Site 1: also edit first line (conflict!)
+    rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'Site1\nMiddle\nEnd' WHERE id = 'doc1';", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_conflict: UPDATE db[1] failed: %s\n", sqlite3_errmsg(db[1])); goto fail; }
+
+    // Sync both ways using row-by-row merge
+    if (!do_merge_values(db[0], db[1], true)) { printf("block_conflict: merge 0->1 failed\n"); goto fail; }
+    if (!do_merge_values(db[1], db[0], true)) { printf("block_conflict: merge 1->0 failed\n"); goto fail; }
+
+    // Materialize on both databases to reconstruct body from blocks
+    rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_conflict: materialize db[0] failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+    rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_conflict: materialize db[1] failed: %s\n", sqlite3_errmsg(db[1])); goto fail; }
+
+    // Both databases should converge (same value)
+    char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';");
+    char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';");
+
+    bool ok = true;
+    if (!body0 || !body1) {
+        printf("block_conflict: could not read body from databases\n");
+        ok = false;
+    } else if (strcmp(body0, body1) != 0) {
+        printf("block_conflict: bodies don't match after sync:\n db[0]: %s\n db[1]: %s\n", body0, body1);
+        ok = false;
+    } else {
+        // Should contain either "Site0" or "Site1" (LWW picks one), plus unchanged lines
+        if (!strstr(body0, "Middle")) {
+            printf("block_conflict: missing 'Middle' in result: %s\n", body0);
+            ok = false;
+        }
+        if (!strstr(body0, "End")) {
+            printf("block_conflict: missing 'End' in result: %s\n", body0);
+            ok = 
false; + } + // One of the conflicting edits should win + if (!strstr(body0, "Site0") && !strstr(body0, "Site1")) { + printf("block_conflict: neither 'Site0' nor 'Site1' in result: %s\n", body0); + ok = false; + } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +bool do_test_block_lww_multi_update(int nclients, bool print_result, bool cleanup_databases) { + // Test: Multiple successive updates correctly maintain block state + sqlite3 *db[1] = {NULL}; + time_t timestamp = time(NULL); + int rc; + + db[0] = do_create_database_file(0, timestamp, test_counter++); + if (!db[0]) return false; + + rc = sqlite3_exec(db[0], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Insert initial text (3 lines) + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'A\nB\nC');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_multi: INSERT failed\n"); goto fail; } + + // Update 1: remove middle line (3 -> 2 blocks) + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'A\nC' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_multi: UPDATE 1 failed\n"); goto fail; } + + int64_t blocks1 = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks;"); + if (blocks1 != 2) { printf("block_multi: expected 2 blocks after update 1, got %" PRId64 "\n", blocks1); goto fail; } + + // Update 2: add two lines (2 -> 4 blocks) + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'A\nX\nC\nY' WHERE id = 'doc1';", NULL, NULL, 
NULL);
+    if (rc != SQLITE_OK) { printf("block_multi: UPDATE 2 failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+
+    int64_t blocks2 = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks;");
+    if (blocks2 != 4) { printf("block_multi: expected 4 blocks after update 2, got %" PRId64 "\n", blocks2); goto fail; }
+
+    // Update 3: change everything to a single line (4 -> 1 block)
+    rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'SINGLE' WHERE id = 'doc1';", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_multi: UPDATE 3 failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+
+    int64_t blocks3 = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks;");
+    if (blocks3 != 1) { printf("block_multi: expected 1 block after update 3, got %" PRId64 "\n", blocks3); goto fail; }
+
+    // Materialize and verify
+    rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_multi: materialize failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+
+    char *body = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';");
+    if (!body || strcmp(body, "SINGLE") != 0) {
+        printf("block_multi: expected 'SINGLE', got '%s'\n", body ? 
body : "NULL");
+        if (body) sqlite3_free(body);
+        goto fail;
+    }
+    sqlite3_free(body);
+
+    close_db(db[0]);
+    return true;
+
+fail:
+    if (db[0]) close_db(db[0]);
+    return false;
+}
+
+bool do_test_block_lww_reinsert(int nclients, bool print_result, bool cleanup_databases) {
+    // Test: DELETE then re-INSERT recreates blocks properly
+    sqlite3 *db[2] = {NULL, NULL};
+    time_t timestamp = time(NULL);
+    int rc;
+
+    for (int i = 0; i < 2; i++) {
+        db[i] = do_create_database_file(i, timestamp, test_counter++);
+        if (!db[i]) return false;
+
+        rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL);
+        if (rc != SQLITE_OK) goto fail;
+        rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL);
+        if (rc != SQLITE_OK) goto fail;
+        rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL);
+        if (rc != SQLITE_OK) goto fail;
+    }
+
+    // Insert, delete, then re-insert with different content
+    rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Old1\nOld2');", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_reinsert: initial INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+
+    rc = sqlite3_exec(db[0], "DELETE FROM docs WHERE id = 'doc1';", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_reinsert: DELETE failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+
+    // Block metadata should be dropped (blocks table entries are orphaned by design)
+    int64_t meta_after_del = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name LIKE 'body' || x'1f' || '%';");
+    if (meta_after_del != 0) {
+        printf("block_reinsert: expected 0 block metadata after delete, got %" PRId64 "\n", meta_after_del);
+        goto fail;
+    }
+
+    // Re-insert with new content
+    rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'New1\nNew2\nNew3');", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) { printf("block_reinsert: re-INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; }
+
+ // Check block metadata was recreated (3 new block entries) + int64_t meta_after_reinsert = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name LIKE 'body' || x'1f' || '%';"); + if (meta_after_reinsert != 3) { + printf("block_reinsert: expected 3 block metadata after re-insert, got %" PRId64 "\n", meta_after_reinsert); + goto fail; + } + + // Sync to db[1] and verify + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("block_reinsert: merge failed\n"); goto fail; } + + // Materialize on db[1] + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("block_reinsert: materialize on db[1] failed: %s\n", sqlite3_errmsg(db[1])); goto fail; } + + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body1 || strcmp(body1, "New1\nNew2\nNew3") != 0) { + printf("block_reinsert: body mismatch on db[1]: %s\n", body1 ? body1 : "NULL"); + if (body1) sqlite3_free(body1); + goto fail; + } + sqlite3_free(body1); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +bool do_test_block_lww_add_lines(int nclients, bool print_result, bool cleanup_databases) { + // Test: Both sites add lines at different positions; after sync, all lines are present + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if 
(rc != SQLITE_OK) goto fail; + } + + // Site 0 inserts initial doc + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line1\nLine2');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync initial: 0 -> 1 + if (!do_merge_using_payload(db[0], db[1], false, true)) goto fail; + + // Site 0: append a line at the end + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'Line1\nLine2\nAppended0' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: insert a line in the middle + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'Line1\nInserted1\nLine2' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync both ways + if (!do_merge_using_payload(db[0], db[1], true, true)) goto fail; + if (!do_merge_using_payload(db[1], db[0], true, true)) goto fail; + + // Both should converge + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1) { + printf("block_add_lines: could not read body\n"); + ok = false; + } else if (strcmp(body0, body1) != 0) { + printf("block_add_lines: bodies don't match:\n db[0]: %s\n db[1]: %s\n", body0, body1); + ok = false; + } else { + // All original and added lines should be present + if (!strstr(body0, "Line1")) { printf("block_add_lines: missing Line1\n"); ok = false; } + if (!strstr(body0, "Line2")) { printf("block_add_lines: missing Line2\n"); ok = false; } + if (!strstr(body0, "Appended0")) { printf("block_add_lines: missing Appended0\n"); ok = false; } + if (!strstr(body0, "Inserted1")) { printf("block_add_lines: missing Inserted1\n"); ok = false; } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 1: 
Non-conflicting edits on different blocks — both edits preserved +bool do_test_block_lww_noconflict(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Site 0 inserts initial document with 3 lines + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line1\nLine2\nLine3');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync initial: 0 -> 1 + if (!do_merge_values(db[0], db[1], false)) goto fail; + + // Site 0: edit first line only + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'EditedByA\nLine2\nLine3' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: edit third line only (no conflict — different block) + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'Line1\nLine2\nEditedByB' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync both ways + if (!do_merge_values(db[0], db[1], true)) goto fail; + if (!do_merge_values(db[1], db[0], true)) goto fail; + + // Materialize on both + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char 
*body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("noconflict: bodies diverged: [%s] vs [%s]\n", body0 ? body0 : "NULL", body1 ? body1 : "NULL"); + ok = false; + } else { + // BOTH edits should be preserved (this is the key value of block-level LWW) + if (!strstr(body0, "EditedByA")) { printf("noconflict: missing EditedByA\n"); ok = false; } + if (!strstr(body0, "Line2")) { printf("noconflict: missing Line2\n"); ok = false; } + if (!strstr(body0, "EditedByB")) { printf("noconflict: missing EditedByB\n"); ok = false; } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 2: Concurrent add + edit — Site A adds a line, Site B modifies an existing line +bool do_test_block_lww_add_and_edit(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Initial doc + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Alpha\nBravo');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_values(db[0], db[1], false)) goto fail; + + // Site 0: add a new line at the end + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 
'Alpha\nBravo\nCharlie' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: modify first line + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'AlphaEdited\nBravo' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync both ways + if (!do_merge_values(db[0], db[1], true)) goto fail; + if (!do_merge_values(db[1], db[0], true)) goto fail; + + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("add_and_edit: bodies diverged: [%s] vs [%s]\n", body0 ? body0 : "NULL", body1 ? 
body1 : "NULL"); + ok = false; + } else { + // The added line and the edit should both be present + if (!strstr(body0, "Charlie")) { printf("add_and_edit: missing Charlie (added line)\n"); ok = false; } + if (!strstr(body0, "Bravo")) { printf("add_and_edit: missing Bravo\n"); ok = false; } + // First line: either AlphaEdited wins (from site 1) or Alpha (from site 0) — depends on LWW + // But the added line Charlie must survive regardless + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 3: Three-way sync — 3 databases with overlapping edits converge +bool do_test_block_lww_three_way(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[3] = {NULL, NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 3; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Site 0 creates initial doc + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'L1\nL2\nL3\nL4');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync 0 -> 1, 0 -> 2 + if (!do_merge_values(db[0], db[1], false)) goto fail; + if (!do_merge_values(db[0], db[2], false)) goto fail; + + // Site 0: edit line 1 + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'S0\nL2\nL3\nL4' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: edit line 2 + rc = 
sqlite3_exec(db[1], "UPDATE docs SET body = 'L1\nS1\nL3\nL4' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 2: edit line 4 + rc = sqlite3_exec(db[2], "UPDATE docs SET body = 'L1\nL2\nL3\nS2' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Full mesh sync: each site sends to every other site + for (int src = 0; src < 3; src++) { + for (int dst = 0; dst < 3; dst++) { + if (src == dst) continue; + if (!do_merge_values(db[src], db[dst], true)) { printf("three_way: merge %d->%d failed\n", src, dst); goto fail; } + } + } + + // Materialize all + for (int i = 0; i < 3; i++) { + rc = sqlite3_exec(db[i], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("three_way: materialize db[%d] failed\n", i); goto fail; } + } + + // All three should converge + char *body[3]; + for (int i = 0; i < 3; i++) { + body[i] = do_select_text(db[i], "SELECT body FROM docs WHERE id = 'doc1';"); + } + + bool ok = true; + if (!body[0] || !body[1] || !body[2]) { printf("three_way: NULL body\n"); ok = false; } + else if (strcmp(body[0], body[1]) != 0 || strcmp(body[1], body[2]) != 0) { + printf("three_way: not converged:\n [0]: %s\n [1]: %s\n [2]: %s\n", body[0], body[1], body[2]); + ok = false; + } else { + // All three non-conflicting edits should be preserved + if (!strstr(body[0], "S0")) { printf("three_way: missing S0\n"); ok = false; } + if (!strstr(body[0], "S1")) { printf("three_way: missing S1\n"); ok = false; } + if (!strstr(body[0], "L3")) { printf("three_way: missing L3\n"); ok = false; } + if (!strstr(body[0], "S2")) { printf("three_way: missing S2\n"); ok = false; } + } + + for (int i = 0; i < 3; i++) { if (body[i]) sqlite3_free(body[i]); } + for (int i = 0; i < 3; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 3; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 4: Mixed block + normal columns — both 
work independently +bool do_test_block_lww_mixed_columns(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE notes (id TEXT NOT NULL PRIMARY KEY, body TEXT, title TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('notes');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + // body is block-level LWW, title is normal LWW + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('notes', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Site 0: insert row with multi-line body and title + rc = sqlite3_exec(db[0], "INSERT INTO notes (id, body, title) VALUES ('n1', 'Line1\nLine2\nLine3', 'My Title');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync 0 -> 1 + if (!do_merge_values(db[0], db[1], false)) goto fail; + + // Site 0: edit block column (body line 1) AND normal column (title) + rc = sqlite3_exec(db[0], "UPDATE notes SET body = 'EditedLine1\nLine2\nLine3', title = 'Title From A' WHERE id = 'n1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: edit a different block (body line 3) AND normal column (title — will conflict via LWW) + rc = sqlite3_exec(db[1], "UPDATE notes SET body = 'Line1\nLine2\nEditedLine3', title = 'Title From B' WHERE id = 'n1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync both ways + if (!do_merge_values(db[0], db[1], true)) goto fail; + if (!do_merge_values(db[1], db[0], true)) goto fail; + + // Materialize block column + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('notes', 'body', 'n1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('notes', 
'body', 'n1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body0 = do_select_text(db[0], "SELECT body FROM notes WHERE id = 'n1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM notes WHERE id = 'n1';"); + char *title0 = do_select_text(db[0], "SELECT title FROM notes WHERE id = 'n1';"); + char *title1 = do_select_text(db[1], "SELECT title FROM notes WHERE id = 'n1';"); + + bool ok = true; + + // Bodies should converge + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("mixed_columns: body diverged\n"); + ok = false; + } else { + // Both non-conflicting block edits should be preserved + if (!strstr(body0, "EditedLine1")) { printf("mixed_columns: missing EditedLine1\n"); ok = false; } + if (!strstr(body0, "Line2")) { printf("mixed_columns: missing Line2\n"); ok = false; } + if (!strstr(body0, "EditedLine3")) { printf("mixed_columns: missing EditedLine3\n"); ok = false; } + } + + // Titles should converge (normal LWW — one wins) + if (!title0 || !title1 || strcmp(title0, title1) != 0) { + printf("mixed_columns: title diverged: [%s] vs [%s]\n", title0 ? title0 : "NULL", title1 ? 
title1 : "NULL"); + ok = false; + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + if (title0) sqlite3_free(title0); + if (title1) sqlite3_free(title1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 5: NULL to text transition — INSERT with NULL body, then UPDATE to multi-line text +bool do_test_block_lww_null_to_text(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert with NULL body on site 0 + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', NULL);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("null_to_text: INSERT NULL failed\n"); goto fail; } + + // Sync to site 1 + if (!do_merge_values(db[0], db[1], false)) { printf("null_to_text: initial sync failed\n"); goto fail; } + + // Update to multi-line text on site 0 + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'Hello\nWorld\nFoo' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("null_to_text: UPDATE failed\n"); goto fail; } + + // Verify blocks created + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 3) { printf("null_to_text: expected 3 blocks, got %" PRId64 "\n", blocks); goto fail; } + + // Sync 
update to site 1 + if (!do_merge_values(db[0], db[1], true)) { printf("null_to_text: sync update failed\n"); goto fail; } + + // Materialize on site 1 + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("null_to_text: materialize failed\n"); goto fail; } + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body || strcmp(body, "Hello\nWorld\nFoo") != 0) { + printf("null_to_text: expected 'Hello\\nWorld\\nFoo', got '%s'\n", body ? body : "NULL"); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 6: Interleaved inserts — multiple rounds of inserting between existing lines +bool do_test_block_lww_interleaved(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Start with 2 lines + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'A\nB');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_values(db[0], db[1], false)) goto fail; + + // Round 1: Site 0 inserts between A and B + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'A\nC\nB' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) 
goto fail; + if (!do_merge_values(db[0], db[1], true)) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Round 2: Site 1 inserts between A and C + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'A\nD\nC\nB' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_values(db[1], db[0], true)) goto fail; + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Round 3: Site 0 inserts between D and C + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'A\nD\nE\nC\nB' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_values(db[0], db[1], true)) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Verify final state on both sites + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("interleaved: diverged: [%s] vs [%s]\n", body0 ? body0 : "NULL", body1 ? 
body1 : "NULL"); + ok = false; + } else { + // All 5 lines should be present + if (!strstr(body0, "A")) { printf("interleaved: missing A\n"); ok = false; } + if (!strstr(body0, "D")) { printf("interleaved: missing D\n"); ok = false; } + if (!strstr(body0, "E")) { printf("interleaved: missing E\n"); ok = false; } + if (!strstr(body0, "C")) { printf("interleaved: missing C\n"); ok = false; } + if (!strstr(body0, "B")) { printf("interleaved: missing B\n"); ok = false; } + + // Verify 5 blocks + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 5) { printf("interleaved: expected 5 blocks, got %" PRId64 "\n", blocks); ok = false; } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 7: Custom delimiter — paragraph separator instead of newline +bool do_test_block_lww_custom_delimiter(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + // Set custom delimiter: double newline (paragraph separator) + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'delimiter', '\n\n');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("custom_delim: set delimiter failed: %s\n", 
sqlite3_errmsg(db[i])); goto fail; } + } + + // Insert text with double-newline separated paragraphs + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Para one line1\nline2\n\nPara two\n\nPara three');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Should produce 3 blocks (3 paragraphs) + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 3) { printf("custom_delim: expected 3 blocks, got %" PRId64 "\n", blocks); goto fail; } + + // Sync and materialize + if (!do_merge_values(db[0], db[1], false)) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body || strcmp(body, "Para one line1\nline2\n\nPara two\n\nPara three") != 0) { + printf("custom_delim: mismatch: [%s]\n", body ? 
body : "NULL"); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 8: Large text — many lines to verify position ID distribution +bool do_test_block_lww_large_text(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Build a 200-line text + #define LARGE_NLINES 200 + char large_text[LARGE_NLINES * 20]; + int offset = 0; + for (int i = 0; i < LARGE_NLINES; i++) { + if (i > 0) large_text[offset++] = '\n'; + offset += snprintf(large_text + offset, sizeof(large_text) - offset, "Line %03d content", i); + } + + // Insert via prepared statement to avoid SQL escaping issues + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[0], "INSERT INTO docs (id, body) VALUES ('bigdoc', ?);", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto fail; + sqlite3_bind_text(stmt, 1, large_text, -1, SQLITE_STATIC); + rc = sqlite3_step(stmt); + sqlite3_finalize(stmt); + if (rc != SQLITE_DONE) { printf("large_text: INSERT failed\n"); goto fail; } + + // Verify block count + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('bigdoc');"); + if (blocks != LARGE_NLINES) { printf("large_text: expected %d blocks, got %" PRId64 
"\n", LARGE_NLINES, blocks); goto fail; } + + // Verify all position IDs are unique and ordered + int64_t distinct_positions = do_select_int(db[0], + "SELECT count(DISTINCT col_name) FROM docs_cloudsync WHERE col_name LIKE 'body' || x'1f' || '%';"); + if (distinct_positions != LARGE_NLINES) { + printf("large_text: expected %d distinct positions, got %" PRId64 "\n", LARGE_NLINES, distinct_positions); + goto fail; + } + + // Sync and materialize + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("large_text: sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'bigdoc');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("large_text: materialize failed\n"); goto fail; } + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'bigdoc';"); + if (!body || strcmp(body, large_text) != 0) { + printf("large_text: roundtrip mismatch\n"); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test 9: Rapid sequential updates — many updates on same row in quick succession +bool do_test_block_lww_rapid_updates(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert initial + rc = 
sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Start');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // 50 rapid updates, progressively adding lines + sqlite3_stmt *upd = NULL; + rc = sqlite3_prepare_v2(db[0], "UPDATE docs SET body = ? WHERE id = 'doc1';", -1, &upd, NULL); + if (rc != SQLITE_OK) goto fail; + + #define RAPID_ROUNDS 50 + char rapid_text[RAPID_ROUNDS * 20]; + int roff = 0; + for (int i = 0; i < RAPID_ROUNDS; i++) { + if (i > 0) rapid_text[roff++] = '\n'; + roff += snprintf(rapid_text + roff, sizeof(rapid_text) - roff, "Update%d", i); + + sqlite3_bind_text(upd, 1, rapid_text, roff, SQLITE_STATIC); + rc = sqlite3_step(upd); + if (rc != SQLITE_DONE) { printf("rapid: UPDATE %d failed\n", i); sqlite3_finalize(upd); goto fail; } + sqlite3_reset(upd); + } + sqlite3_finalize(upd); + + // Verify final block count matches line count + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != RAPID_ROUNDS) { + printf("rapid: expected %d blocks, got %" PRId64 "\n", RAPID_ROUNDS, blocks); + goto fail; + } + + // Sync and verify roundtrip + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("rapid: sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("rapid: materialize failed\n"); goto fail; } + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("rapid: roundtrip mismatch\n"); + ok = false; + } else { + // Check first and last lines + if (!strstr(body0, "Update0")) { printf("rapid: missing Update0\n"); ok = false; } + if (!strstr(body0, "Update49")) { printf("rapid: missing Update49\n"); ok = false; } + } + + if (body0) 
sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Unicode/multibyte content in blocks (emoji, CJK, accented chars) +bool do_test_block_lww_unicode(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert multi-line text with unicode content + const char *unicode_text = "Hello \xC3\xA9\xC3\xA0\xC3\xBC" "\n" // accented: éàü + "\xE4\xB8\xAD\xE6\x96\x87\xE6\xB5\x8B\xE8\xAF\x95" "\n" // CJK: 中文测试 + "\xF0\x9F\x98\x80\xF0\x9F\x8E\x89\xF0\x9F\x9A\x80"; // emoji: 😀🎉🚀 + + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', ?);", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto fail; + sqlite3_bind_text(stmt, 1, unicode_text, -1, SQLITE_STATIC); + rc = sqlite3_step(stmt); + sqlite3_finalize(stmt); + if (rc != SQLITE_DONE) goto fail; + + // Should have 3 blocks + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 3) { printf("unicode: expected 3 blocks, got %" PRId64 "\n", blocks); goto fail; } + + // Sync and materialize + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("unicode: sync failed\n"); goto fail; } + rc = 
sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("unicode: materialize failed\n"); goto fail; } + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body || strcmp(body, unicode_text) != 0) { + printf("unicode: roundtrip mismatch\n"); + if (body) sqlite3_free(body); + goto fail; + } + + // Update: edit the emoji line + const char *updated_text = "Hello \xC3\xA9\xC3\xA0\xC3\xBC" "\n" + "\xE4\xB8\xAD\xE6\x96\x87\xE6\xB5\x8B\xE8\xAF\x95" "\n" + "\xF0\x9F\x92\xAF\xF0\x9F\x94\xA5"; // changed emoji: 💯🔥 + sqlite3_free(body); + + stmt = NULL; + rc = sqlite3_prepare_v2(db[0], "UPDATE docs SET body = ? WHERE id = 'doc1';", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto fail; + sqlite3_bind_text(stmt, 1, updated_text, -1, SQLITE_STATIC); + rc = sqlite3_step(stmt); + sqlite3_finalize(stmt); + if (rc != SQLITE_DONE) goto fail; + + // Sync update + if (!do_merge_using_payload(db[0], db[1], true, true)) { printf("unicode: sync update failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body || strcmp(body, updated_text) != 0) { + printf("unicode: update roundtrip mismatch\n"); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Special characters (tabs, carriage returns, etc.) 
in blocks +bool do_test_block_lww_special_chars(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Text with tabs, carriage returns, and other special chars within lines + const char *special_text = "col1\tcol2\tcol3\n" // tabs within line + "line with\r\nembedded\n" // \r before \n delimiter + "back\\slash \"quotes\""; // backslash and quotes + + sqlite3_stmt *stmt = NULL; + rc = sqlite3_prepare_v2(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', ?);", -1, &stmt, NULL); + if (rc != SQLITE_OK) goto fail; + sqlite3_bind_text(stmt, 1, special_text, -1, SQLITE_STATIC); + rc = sqlite3_step(stmt); + sqlite3_finalize(stmt); + if (rc != SQLITE_DONE) goto fail; + + // Should split on \n: "col1\tcol2\tcol3", "line with\r", "embedded", "back\\slash \"quotes\"" + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 4) { printf("special: expected 4 blocks, got %" PRId64 "\n", blocks); goto fail; } + + // Sync and verify roundtrip + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("special: sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body || 
strcmp(body, special_text) != 0) {
+        printf("special: roundtrip mismatch\n");
+        if (body) sqlite3_free(body);
+        goto fail;
+    }
+    sqlite3_free(body);
+
+    for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; }
+    return true;
+
+fail:
+    for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]);
+    return false;
+}
+
+// Test: Concurrent row delete vs block edit
+// Site A deletes the row, Site B edits a line. After sync, both sites must converge
+// (the assertion accepts either outcome — row gone or row present — as long as they agree).
+bool do_test_block_lww_delete_vs_edit(int nclients, bool print_result, bool cleanup_databases) {
+    sqlite3 *db[2] = {NULL, NULL};
+    time_t timestamp = time(NULL);
+    int rc;
+
+    for (int i = 0; i < 2; i++) {
+        db[i] = do_create_database_file(i, timestamp, test_counter++);
+        if (!db[i]) return false;
+        rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL);
+        if (rc != SQLITE_OK) goto fail;
+        rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL);
+        if (rc != SQLITE_OK) goto fail;
+        rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL);
+        if (rc != SQLITE_OK) goto fail;
+    }
+
+    // Insert initial doc
+    rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line1\nLine2\nLine3');", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) goto fail;
+
+    // Sync to site 1
+    if (!do_merge_values(db[0], db[1], false)) goto fail;
+
+    // Site 0: DELETE the row
+    rc = sqlite3_exec(db[0], "DELETE FROM docs WHERE id = 'doc1';", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) goto fail;
+
+    // Site 1: Edit line 2
+    rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'Line1\nEdited\nLine3' WHERE id = 'doc1';", NULL, NULL, NULL);
+    if (rc != SQLITE_OK) goto fail;
+
+    // Sync both ways
+    if (!do_merge_values(db[0], db[1], true)) goto fail;
+    if (!do_merge_values(db[1], db[0], true)) goto fail;
+
+    // Both should converge: either row deleted or row exists with some content
+    int64_t rows0 = do_select_int(db[0], "SELECT 
count(*) FROM docs WHERE id = 'doc1';"); + int64_t rows1 = do_select_int(db[1], "SELECT count(*) FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (rows0 != rows1) { + printf("delete_vs_edit: row count diverged: db0=%" PRId64 " db1=%" PRId64 "\n", rows0, rows1); + ok = false; + } + + // If the row still exists, materialize and verify convergence + if (rows0 > 0 && rows1 > 0) { + sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (body0 && body1 && strcmp(body0, body1) != 0) { + printf("delete_vs_edit: bodies diverged\n"); + ok = false; + } + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + } + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Two block columns on same table +bool do_test_block_lww_two_block_cols(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT, notes TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'notes', 'algo', 'block');", NULL, NULL, NULL); + if (rc 
!= SQLITE_OK) goto fail; + } + + // Insert with both block columns + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body, notes) VALUES ('doc1', 'B1\nB2\nB3', 'N1\nN2');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("two_block_cols: INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Verify blocks created for both columns + int64_t body_blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name LIKE 'body' || x'1f' || '%';"); + int64_t notes_blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync WHERE col_name LIKE 'notes' || x'1f' || '%';"); + if (body_blocks != 3) { printf("two_block_cols: expected 3 body blocks, got %" PRId64 "\n", body_blocks); goto fail; } + if (notes_blocks != 2) { printf("two_block_cols: expected 2 notes blocks, got %" PRId64 "\n", notes_blocks); goto fail; } + + // Sync to site 1 + if (!do_merge_values(db[0], db[1], false)) goto fail; + + // Site 0: edit body line 1 + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'B1_edited\nB2\nB3' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: edit notes line 2 + rc = sqlite3_exec(db[1], "UPDATE docs SET notes = 'N1\nN2_edited' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync both ways + if (!do_merge_values(db[0], db[1], true)) goto fail; + if (!do_merge_values(db[1], db[0], true)) goto fail; + + // Materialize both columns on both sites + for (int i = 0; i < 2; i++) { + rc = sqlite3_exec(db[i], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("two_block_cols: materialize body db[%d] failed\n", i); goto fail; } + rc = sqlite3_exec(db[i], "SELECT cloudsync_text_materialize('docs', 'notes', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("two_block_cols: materialize notes db[%d] failed\n", i); goto fail; } + } + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); 
+ char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + char *notes0 = do_select_text(db[0], "SELECT notes FROM docs WHERE id = 'doc1';"); + char *notes1 = do_select_text(db[1], "SELECT notes FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("two_block_cols: body diverged\n"); ok = false; + } else if (!strstr(body0, "B1_edited")) { + printf("two_block_cols: body edit missing\n"); ok = false; + } + + if (!notes0 || !notes1 || strcmp(notes0, notes1) != 0) { + printf("two_block_cols: notes diverged\n"); ok = false; + } else if (!strstr(notes0, "N2_edited")) { + printf("two_block_cols: notes edit missing\n"); ok = false; + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + if (notes0) sqlite3_free(notes0); + if (notes1) sqlite3_free(notes1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Update text to NULL (text->NULL transition) +bool do_test_block_lww_text_to_null(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert multi-line text + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line1\nLine2\nLine3');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + int64_t blocks_before 
= do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks_before != 3) { printf("text_to_null: expected 3 blocks before, got %" PRId64 "\n", blocks_before); goto fail; } + + // Update to NULL + rc = sqlite3_exec(db[0], "UPDATE docs SET body = NULL WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("text_to_null: UPDATE to NULL failed\n"); goto fail; } + + // Verify body is NULL + int64_t is_null = do_select_int(db[0], "SELECT body IS NULL FROM docs WHERE id = 'doc1';"); + if (is_null != 1) { printf("text_to_null: body not NULL after update\n"); goto fail; } + + // Sync and verify + if (!do_merge_values(db[0], db[1], false)) { printf("text_to_null: sync failed\n"); goto fail; } + + // Materialize on site 1 + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + int64_t is_null_b = do_select_int(db[1], "SELECT body IS NULL FROM docs WHERE id = 'doc1';"); + if (is_null_b != 1) { printf("text_to_null: body not NULL on site 1 after sync\n"); goto fail; } + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Payload-based sync for block columns (vs row-by-row do_merge_values) +bool do_test_block_lww_payload_sync(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT 
cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert and first sync via payload + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Alpha\nBravo\nCharlie');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("payload_sync: initial merge failed\n"); goto fail; } + + // Edit on both sites + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'Alpha_A\nBravo\nCharlie' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'Alpha\nBravo\nCharlie_B' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync via payload both ways + if (!do_merge_using_payload(db[0], db[1], true, true)) { printf("payload_sync: merge 0->1 failed\n"); goto fail; } + if (!do_merge_using_payload(db[1], db[0], true, true)) { printf("payload_sync: merge 1->0 failed\n"); goto fail; } + + // Materialize + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("payload_sync: bodies diverged\n"); ok = false; + } else { + if (!strstr(body0, "Alpha_A")) { printf("payload_sync: missing Alpha_A\n"); ok = false; } + if (!strstr(body0, "Bravo")) { printf("payload_sync: missing Bravo\n"); ok = false; } + if (!strstr(body0, "Charlie_B")) { printf("payload_sync: missing Charlie_B\n"); ok = false; } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int 
i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Idempotent apply — applying the same payload twice is a no-op +bool do_test_block_lww_idempotent(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert and sync + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Line1\nLine2\nLine3');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_using_payload(db[0], db[1], false, true)) goto fail; + + // Edit on site 0 + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'Edited1\nLine2\nLine3' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Apply payload to site 1 TWICE + if (!do_merge_using_payload(db[0], db[1], true, true)) { printf("idempotent: first apply failed\n"); goto fail; } + if (!do_merge_using_payload(db[0], db[1], true, true)) { printf("idempotent: second apply failed\n"); goto fail; } + + // Materialize + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + bool ok = true; + if (!body || strcmp(body, "Edited1\nLine2\nLine3") != 0) { + printf("idempotent: body mismatch: [%s]\n", body ? 
body : "NULL"); + ok = false; + } + + // Verify block count is still 3 (no duplicates from double apply) + int64_t blocks = do_select_int(db[1], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 3) { printf("idempotent: expected 3 blocks, got %" PRId64 "\n", blocks); ok = false; } + + if (body) sqlite3_free(body); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Block position ordering — after edits, materialized text has correct line order +bool do_test_block_lww_ordering(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert initial doc: A B C D E + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'A\nB\nC\nD\nE');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_values(db[0], db[1], false)) goto fail; + + // Site 0: insert X between B and C, remove D -> A B X C E + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'A\nB\nX\nC\nE' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: insert Y between D and E -> A B C D Y E + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'A\nB\nC\nD\nY\nE' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync + if 
(!do_merge_values(db[0], db[1], true)) goto fail; + if (!do_merge_values(db[1], db[0], true)) goto fail; + + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("ordering: bodies diverged: [%s] vs [%s]\n", body0 ? body0 : "NULL", body1 ? body1 : "NULL"); + ok = false; + } else { + // Verify ordering: A must come before B, B before C, etc. + // All lines that survived should maintain relative order + const char *pA = strstr(body0, "A"); + const char *pB = strstr(body0, "B"); + const char *pC = strstr(body0, "C"); + const char *pE = strstr(body0, "E"); + + if (!pA || !pB || !pC || !pE) { + printf("ordering: missing original lines\n"); ok = false; + } else { + if (pA >= pB) { printf("ordering: A not before B\n"); ok = false; } + if (pB >= pC) { printf("ordering: B not before C\n"); ok = false; } + if (pC >= pE) { printf("ordering: C not before E\n"); ok = false; } + } + + // X (inserted between B and C) should appear between B and C + const char *pX = strstr(body0, "X"); + if (pX) { + if (pX <= pB || pX >= pC) { printf("ordering: X not between B and C\n"); ok = false; } + } + + // Y should appear somewhere after C + const char *pY = strstr(body0, "Y"); + if (pY) { + if (pY <= pC) { printf("ordering: Y not after C\n"); ok = false; } + } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Composite 
primary key (text + int) with block column +bool do_test_block_lww_composite_pk(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (owner TEXT NOT NULL, seq INTEGER NOT NULL, body TEXT, PRIMARY KEY(owner, seq));", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert on site 0 + rc = sqlite3_exec(db[0], "INSERT INTO docs (owner, seq, body) VALUES ('alice', 1, 'Line1\nLine2\nLine3');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("composite_pk: INSERT failed\n"); goto fail; } + + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('alice', 1);"); + if (blocks != 3) { printf("composite_pk: expected 3 blocks, got %" PRId64 "\n", blocks); goto fail; } + + // Sync to site 1 + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("composite_pk: sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'alice', 1);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("composite_pk: materialize failed: %s\n", sqlite3_errmsg(db[1])); goto fail; } + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE owner = 'alice' AND seq = 1;"); + if (!body || strcmp(body, "Line1\nLine2\nLine3") != 0) { + printf("composite_pk: body mismatch: [%s]\n", body ? 
body : "NULL"); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + // Edit on site 1, sync back + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'Line1\nEdited2\nLine3' WHERE owner = 'alice' AND seq = 1;", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_using_payload(db[1], db[0], true, true)) { printf("composite_pk: reverse sync failed\n"); goto fail; } + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'alice', 1);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE owner = 'alice' AND seq = 1;"); + if (!body0 || strcmp(body0, "Line1\nEdited2\nLine3") != 0) { + printf("composite_pk: reverse body mismatch: [%s]\n", body0 ? body0 : "NULL"); + if (body0) sqlite3_free(body0); + goto fail; + } + sqlite3_free(body0); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Empty string body (not NULL) — should produce 1 block with empty content +bool do_test_block_lww_empty_vs_null(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert empty string (NOT NULL) + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', '');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; 
+ + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 1) { printf("empty_vs_null: expected 1 block for empty string, got %" PRId64 "\n", blocks); goto fail; } + + // Insert NULL + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc2', NULL);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + int64_t blocks_null = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc2');"); + if (blocks_null != 1) { printf("empty_vs_null: expected 1 block for NULL, got %" PRId64 "\n", blocks_null); goto fail; } + + // Sync both to site 1 + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("empty_vs_null: sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc2');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // doc1 (empty string): body should be empty string, NOT NULL + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + int64_t is_null1 = do_select_int(db[1], "SELECT body IS NULL FROM docs WHERE id = 'doc1';"); + if (is_null1 != 0) { printf("empty_vs_null: doc1 body should NOT be NULL\n"); if (body1) sqlite3_free(body1); goto fail; } + if (!body1 || strcmp(body1, "") != 0) { printf("empty_vs_null: doc1 body should be empty, got [%s]\n", body1 ? 
body1 : "NULL"); if (body1) sqlite3_free(body1); goto fail; } + sqlite3_free(body1); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: DELETE row then re-insert with different block content (resurrection) +bool do_test_block_lww_delete_reinsert(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert and sync + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'Old1\nOld2\nOld3');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_using_payload(db[0], db[1], false, true)) goto fail; + + // Delete the row + rc = sqlite3_exec(db[0], "DELETE FROM docs WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("del_reinsert: DELETE failed\n"); goto fail; } + + // Sync delete + if (!do_merge_using_payload(db[0], db[1], true, true)) { printf("del_reinsert: delete sync failed\n"); goto fail; } + + // Verify row gone on site 1 + int64_t count = do_select_int(db[1], "SELECT count(*) FROM docs WHERE id = 'doc1';"); + if (count != 0) { printf("del_reinsert: row should be deleted on site 1, count=%" PRId64 "\n", count); goto fail; } + + // Re-insert with different content + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'New1\nNew2');", NULL, NULL, NULL); + if 
(rc != SQLITE_OK) { printf("del_reinsert: re-INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + // Sync re-insert + if (!do_merge_using_payload(db[0], db[1], true, true)) { printf("del_reinsert: reinsert sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body || strcmp(body, "New1\nNew2") != 0) { + printf("del_reinsert: body mismatch after reinsert: [%s]\n", body ? body : "NULL"); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: INTEGER primary key with block column +bool do_test_block_lww_integer_pk(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE notes (id INTEGER NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("int_pk: CREATE TABLE failed on %d: %s\n", i, sqlite3_errmsg(db[i])); goto fail; } + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('notes', 'CLS', 1);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("int_pk: init failed on %d: %s\n", i, sqlite3_errmsg(db[i])); goto fail; } + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('notes', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("int_pk: set_column failed on %d: %s\n", i, sqlite3_errmsg(db[i])); goto fail; } + } + + // Insert on site 0 + rc = sqlite3_exec(db[0], "INSERT INTO notes (id, body) VALUES (42, 'First\nSecond\nThird');", NULL, NULL, NULL); + if (rc 
!= SQLITE_OK) { printf("int_pk: INSERT failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM notes_cloudsync_blocks WHERE pk = cloudsync_pk_encode(42);"); + if (blocks != 3) { printf("int_pk: expected 3 blocks, got %" PRId64 "\n", blocks); goto fail; } + + // Sync to site 1 + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("int_pk: sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('notes', 'body', 42);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("int_pk: materialize failed: %s\n", sqlite3_errmsg(db[1])); goto fail; } + + // Debug: check row exists + int64_t row_count = do_select_int(db[1], "SELECT count(*) FROM notes WHERE id = 42;"); + if (row_count != 1) { printf("int_pk: row not found on site 1, count=%" PRId64 "\n", row_count); goto fail; } + + char *body = do_select_text(db[1], "SELECT body FROM notes WHERE id = 42;"); + if (!body || strcmp(body, "First\nSecond\nThird") != 0) { + printf("int_pk: body mismatch: [%s]\n", body ? body : "NULL"); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + // Edit and sync back + rc = sqlite3_exec(db[1], "UPDATE notes SET body = 'First\nEdited\nThird' WHERE id = 42;", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("int_pk: UPDATE failed: %s\n", sqlite3_errmsg(db[1])); goto fail; } + if (!do_merge_using_payload(db[1], db[0], true, true)) { printf("int_pk: reverse sync failed\n"); goto fail; } + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('notes', 'body', 42);", NULL, NULL, NULL); + if (rc != SQLITE_OK) { printf("int_pk: reverse mat failed: %s\n", sqlite3_errmsg(db[0])); goto fail; } + + char *body0 = do_select_text(db[0], "SELECT body FROM notes WHERE id = 42;"); + if (!body0 || strcmp(body0, "First\nEdited\nThird") != 0) { + printf("int_pk: reverse body mismatch: [%s]\n", body0 ? 
body0 : "NULL"); + if (body0) sqlite3_free(body0); + goto fail; + } + sqlite3_free(body0); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Multiple rows with block columns in a single sync +bool do_test_block_lww_multi_row(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert 3 rows + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('r1', 'R1-Line1\nR1-Line2');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('r2', 'R2-Alpha\nR2-Beta\nR2-Gamma');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('r3', 'R3-Only');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Edit r1 and r3 + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'R1-Edited\nR1-Line2' WHERE id = 'r1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'R3-Changed' WHERE id = 'r3';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Sync all in one payload + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("multi_row: sync failed\n"); goto fail; } + + // Materialize all + rc = sqlite3_exec(db[1], "SELECT 
cloudsync_text_materialize('docs', 'body', 'r1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'r2');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'r3');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + bool ok = true; + char *b1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'r1';"); + if (!b1 || strcmp(b1, "R1-Edited\nR1-Line2") != 0) { printf("multi_row: r1 mismatch [%s]\n", b1 ? b1 : "NULL"); ok = false; } + if (b1) sqlite3_free(b1); + + char *b2 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'r2';"); + if (!b2 || strcmp(b2, "R2-Alpha\nR2-Beta\nR2-Gamma") != 0) { printf("multi_row: r2 mismatch [%s]\n", b2 ? b2 : "NULL"); ok = false; } + if (b2) sqlite3_free(b2); + + char *b3 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'r3';"); + if (!b3 || strcmp(b3, "R3-Changed") != 0) { printf("multi_row: r3 mismatch [%s]\n", b3 ? 
b3 : "NULL"); ok = false; } + if (b3) sqlite3_free(b3); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Concurrent add at non-overlapping positions (top vs bottom) +bool do_test_block_lww_nonoverlap_add(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Insert initial: A B C + rc = sqlite3_exec(db[0], "INSERT INTO docs (id, body) VALUES ('doc1', 'A\nB\nC');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + if (!do_merge_values(db[0], db[1], false)) goto fail; + + // Site 0: add line at top -> X A B C + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'X\nA\nB\nC' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Site 1: add line at bottom -> A B C Y + rc = sqlite3_exec(db[1], "UPDATE docs SET body = 'A\nB\nC\nY' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + // Bidirectional sync + if (!do_merge_values(db[0], db[1], true)) goto fail; + if (!do_merge_values(db[1], db[0], true)) goto fail; + + rc = sqlite3_exec(db[0], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) 
goto fail; + + char *body0 = do_select_text(db[0], "SELECT body FROM docs WHERE id = 'doc1';"); + char *body1 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + + bool ok = true; + if (!body0 || !body1 || strcmp(body0, body1) != 0) { + printf("nonoverlap: bodies diverged: [%s] vs [%s]\n", body0 ? body0 : "NULL", body1 ? body1 : "NULL"); + ok = false; + } else { + // X should be present, Y should be present, original A B C should be present + if (!strstr(body0, "X")) { printf("nonoverlap: X missing\n"); ok = false; } + if (!strstr(body0, "Y")) { printf("nonoverlap: Y missing\n"); ok = false; } + if (!strstr(body0, "A")) { printf("nonoverlap: A missing\n"); ok = false; } + if (!strstr(body0, "B")) { printf("nonoverlap: B missing\n"); ok = false; } + if (!strstr(body0, "C")) { printf("nonoverlap: C missing\n"); ok = false; } + + // Order: X before A, Y after C + const char *pX = strstr(body0, "X"); + const char *pA = strstr(body0, "A"); + const char *pC = strstr(body0, "C"); + const char *pY = strstr(body0, "Y"); + if (pX && pA && pX >= pA) { printf("nonoverlap: X not before A\n"); ok = false; } + if (pC && pY && pY <= pC) { printf("nonoverlap: Y not after C\n"); ok = false; } + } + + if (body0) sqlite3_free(body0); + if (body1) sqlite3_free(body1); + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return ok; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Very long single line (10K chars, single block) +bool do_test_block_lww_long_line(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT 
cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Build a 10,000-char single line + { + char *long_line = (char *)malloc(10001); + if (!long_line) goto fail; + for (int i = 0; i < 10000; i++) long_line[i] = 'A' + (i % 26); + long_line[10000] = '\0'; + + char *sql = sqlite3_mprintf("INSERT INTO docs (id, body) VALUES ('doc1', '%q');", long_line); + rc = sqlite3_exec(db[0], sql, NULL, NULL, NULL); + sqlite3_free(sql); + + if (rc != SQLITE_OK) { printf("long_line: INSERT failed: %s\n", sqlite3_errmsg(db[0])); free(long_line); goto fail; } + + // Should have 1 block (no newlines) + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 1) { printf("long_line: expected 1 block, got %" PRId64 "\n", blocks); free(long_line); goto fail; } + + // Sync to site 1 + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("long_line: sync failed\n"); free(long_line); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) { free(long_line); goto fail; } + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + bool match = (body && strcmp(body, long_line) == 0); + if (!match) printf("long_line: body mismatch (len=%zu vs expected 10000)\n", body ? 
strlen(body) : 0); + if (body) sqlite3_free(body); + free(long_line); + if (!match) goto fail; + } + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +// Test: Whitespace and empty lines (delimiter edge cases) +bool do_test_block_lww_whitespace(int nclients, bool print_result, bool cleanup_databases) { + sqlite3 *db[2] = {NULL, NULL}; + time_t timestamp = time(NULL); + int rc; + + for (int i = 0; i < 2; i++) { + db[i] = do_create_database_file(i, timestamp, test_counter++); + if (!db[i]) return false; + rc = sqlite3_exec(db[i], "CREATE TABLE docs (id TEXT NOT NULL PRIMARY KEY, body TEXT);", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_init('docs');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + rc = sqlite3_exec(db[i], "SELECT cloudsync_set_column('docs', 'body', 'algo', 'block');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + } + + // Text with empty lines, whitespace-only lines, trailing newline + const char *text = "Line1\n\n spaces \n\t\ttabs\n\nLine6\n"; + char *sql = sqlite3_mprintf("INSERT INTO docs (id, body) VALUES ('doc1', '%q');", text); + rc = sqlite3_exec(db[0], sql, NULL, NULL, NULL); + sqlite3_free(sql); + if (rc != SQLITE_OK) { printf("whitespace: INSERT failed\n"); goto fail; } + + // Count blocks: "Line1", "", " spaces ", "\t\ttabs", "", "Line6", "" (trailing newline produces empty last block) + int64_t blocks = do_select_int(db[0], "SELECT count(*) FROM docs_cloudsync_blocks WHERE pk = cloudsync_pk_encode('doc1');"); + if (blocks != 7) { printf("whitespace: expected 7 blocks, got %" PRId64 "\n", blocks); goto fail; } + + // Sync to site 1 + if (!do_merge_using_payload(db[0], db[1], false, true)) { printf("whitespace: sync failed\n"); goto fail; } + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if 
(rc != SQLITE_OK) goto fail; + + char *body = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body || strcmp(body, text) != 0) { + printf("whitespace: body mismatch: [%s] vs [%s]\n", body ? body : "NULL", text); + if (body) sqlite3_free(body); + goto fail; + } + sqlite3_free(body); + + // Edit: remove empty lines -> "Line1\n spaces \n\t\ttabs\nLine6" + rc = sqlite3_exec(db[0], "UPDATE docs SET body = 'Line1\n spaces \n\t\ttabs\nLine6' WHERE id = 'doc1';", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + if (!do_merge_using_payload(db[0], db[1], true, true)) goto fail; + rc = sqlite3_exec(db[1], "SELECT cloudsync_text_materialize('docs', 'body', 'doc1');", NULL, NULL, NULL); + if (rc != SQLITE_OK) goto fail; + + char *body2 = do_select_text(db[1], "SELECT body FROM docs WHERE id = 'doc1';"); + if (!body2 || strcmp(body2, "Line1\n spaces \n\t\ttabs\nLine6") != 0) { + printf("whitespace: body2 mismatch: [%s]\n", body2 ? body2 : "NULL"); + if (body2) sqlite3_free(body2); + goto fail; + } + sqlite3_free(body2); + + for (int i = 0; i < 2; i++) { close_db(db[i]); db[i] = NULL; } + return true; + +fail: + for (int i = 0; i < 2; i++) if (db[i]) close_db(db[i]); + return false; +} + +int test_report(const char *description, bool result){ + printf("%-30s %s\n", description, (result) ? "OK" : "FAILED"); + return result ? 
0 : 1; +} + +int main(int argc, const char *argv[]) { + sqlite3 *db = NULL; + int result = 0; + bool print_result = false; + bool cleanup_databases = true; + + // test in an in-memory database + int rc = sqlite3_open(":memory:", &db); + if (rc != SQLITE_OK) goto finalize; + + // manually load extension + sqlite3_cloudsync_init(db, NULL, NULL); + + printf("Testing CloudSync version %s\n", CLOUDSYNC_VERSION); + printf("=================================\n"); + + result += test_report("PK Test:", do_test_pk(db, 10000, print_result)); + result += test_report("UUID Test:", do_test_uuid(db, 1000, print_result)); + result += test_report("Comparison Test:", do_test_compare(db, print_result)); + result += test_report("RowID Test:", do_test_rowid(50000, print_result)); + result += test_report("Algo Names Test:", do_test_algo_names()); + result += test_report("DBUtils Test:", do_test_dbutils()); + result += test_report("Minor Test:", do_test_others(db)); + result += test_report("Test Error Cases:", do_test_error_cases(db)); + result += test_report("Null PK Insert Test:", do_test_null_prikey_insert(db)); + result += test_report("Test Single PK:", do_test_single_pk(print_result)); + + int test_mask = TEST_INSERT | TEST_UPDATE | TEST_DELETE; + int table_mask = TEST_PRIKEYS | TEST_NOCOLS; + #if !CLOUDSYNC_DISABLE_ROWIDONLY_TABLES + table_mask |= TEST_NOPRIKEYS; + #endif + + // test local changes + result += test_report("Local Test:", do_test_local(test_mask, table_mask, db, print_result)); + result += test_report("VTab Test:", do_test_vtab(db)); + result += test_report("Functions Test:", do_test_functions(db, print_result)); + result += test_report("Functions Test (Int):", do_test_internal_functions()); + result += test_report("String Func Test:", do_test_string_replace_prefix()); + result += test_report("String Lowercase Test:", do_test_string_lowercase()); + result += test_report("Context Functions Test:", do_test_context_functions()); + result += test_report("PK Decode Count 
Test:", do_test_pk_decode_count_from_buffer()); + result += test_report("Error Handling Test:", do_test_error_handling()); + result += test_report("Terminate Test:", do_test_terminate()); + result += test_report("Hash Function Test:", do_test_hash_function()); + result += test_report("Blob Compare Test:", do_test_blob_compare()); + result += test_report("Blob Compare Large:", do_test_blob_compare_large_sizes()); + result += test_report("Deterministic Flags:", do_test_deterministic_flags()); + result += test_report("Schema Hash Roundtrip:", do_test_schema_hash_consistency()); + result += test_report("String Functions Test:", do_test_string_functions()); + result += test_report("UUID Functions Test:", do_test_uuid_functions()); + result += test_report("RowID Decode Test:", do_test_rowid_decode()); + result += test_report("SQL Schema Funcs Test:", do_test_sql_schema_functions()); + result += test_report("SQL PK Decode Test:", do_test_sql_pk_decode()); + result += test_report("PK Negative Values Test:", do_test_pk_negative_values()); + result += test_report("Settings Functions Test:", do_test_settings_functions()); + result += test_report("Sync/Enabled Funcs Test:", do_test_sync_enabled_functions()); + result += test_report("SQL UUID Func Test:", do_test_sql_uuid_function()); + result += test_report("PK Encode Edge Cases:", do_test_pk_encode_edge_cases()); + result += test_report("Col Value Func Test:", do_test_col_value_function()); + result += test_report("Is Sync Func Test:", do_test_is_sync_function()); + result += test_report("Insert/Update/Delete:", do_test_insert_update_delete_sql()); + result += test_report("Binary Comparison Test:", do_test_binary_comparison()); + result += test_report("PK Decode Malformed:", do_test_pk_decode_malformed()); + result += test_report("Test Many Columns:", do_test_many_columns(600, db)); + result += test_report("Payload Buffer Test (500KB):", do_test_payload_buffer(500 * 1024)); + result += test_report("Payload Buffer Test 
(600KB):", do_test_payload_buffer(600 * 1024)); + result += test_report("Payload Buffer Test (1MB):", do_test_payload_buffer(1024 * 1024)); + result += test_report("Payload Buffer Test (10MB):", do_test_payload_buffer(10 * 1024 * 1024)); + + // close local database + close_db(db); + db = NULL; + + // simulate remote merge + result += test_report("Merge Test:", do_test_merge(3, print_result, cleanup_databases)); + result += test_report("Merge Test 2:", do_test_merge_2(3, TEST_PRIKEYS, print_result, cleanup_databases)); + result += test_report("Merge Test 3:", do_test_merge_2(3, TEST_NOCOLS, print_result, cleanup_databases)); + result += test_report("Merge Test 4:", do_test_merge_4(2, print_result, cleanup_databases)); + result += test_report("Merge Test 5:", do_test_merge_5(2, print_result, cleanup_databases, false)); + result += test_report("Merge Test db_version 1:", do_test_merge_check_db_version(2, print_result, cleanup_databases, true)); + result += test_report("Merge Test db_version 2:", do_test_merge_check_db_version_2(2, print_result, cleanup_databases, true)); + result += test_report("Merge Test Insert Changes", do_test_insert_cloudsync_changes(print_result, cleanup_databases)); + result += test_report("Merge Alter Schema 1:", do_test_merge_alter_schema_1(2, print_result, cleanup_databases, false)); + result += test_report("Merge Alter Schema 2:", do_test_merge_alter_schema_2(2, print_result, cleanup_databases, false)); + result += test_report("Merge Two Tables Test:", do_test_merge_two_tables(2, print_result, cleanup_databases)); + result += test_report("Merge Conflicting PKeys:", do_test_merge_conflicting_pkeys(2, print_result, cleanup_databases)); + result += test_report("Merge Large Dataset:", do_test_merge_large_dataset(3, print_result, cleanup_databases)); + result += test_report("Merge Nested Transactions:", do_test_merge_nested_transactions(2, print_result, cleanup_databases)); + result += test_report("Merge Three Way:", do_test_merge_three_way(3, 
print_result, cleanup_databases)); + result += test_report("Merge NULL Values:", do_test_merge_null_values(2, print_result, cleanup_databases)); + result += test_report("Merge BLOB Data:", do_test_merge_blob_data(2, print_result, cleanup_databases)); + result += test_report("Merge Mixed Operations:", do_test_merge_mixed_operations(2, print_result, cleanup_databases)); + result += test_report("Merge Hub-Spoke:", do_test_merge_hub_spoke(4, print_result, cleanup_databases)); + result += test_report("Merge Timestamp Precision:", do_test_merge_timestamp_precision(2, print_result, cleanup_databases)); + result += test_report("Merge Partial Failure:", do_test_merge_partial_failure(2, print_result, cleanup_databases)); + result += test_report("Merge Rollback Scenarios:", do_test_merge_rollback_scenarios(2, print_result, cleanup_databases)); + result += test_report("Merge Circular:", do_test_merge_circular(3, print_result, cleanup_databases)); + result += test_report("Merge Foreign Keys:", do_test_merge_foreign_keys(2, print_result, cleanup_databases)); + // Expected failure: AFTER TRIGGERs are not fully supported by this extension. 
+ // result += test_report("Merge Triggers:", do_test_merge_triggers(2, print_result, cleanup_databases)); + result += test_report("Merge RLS Trigger Denial:", do_test_rls_trigger_denial(2, print_result, cleanup_databases, true)); + result += test_report("Merge Index Consistency:", do_test_merge_index_consistency(2, print_result, cleanup_databases)); + result += test_report("Merge JSON Columns:", do_test_merge_json_columns(2, print_result, cleanup_databases)); + result += test_report("Merge Concurrent Attempts:", do_test_merge_concurrent_attempts(3, print_result, cleanup_databases)); + result += test_report("Merge Composite PK 10 Clients:", do_test_merge_composite_pk_10_clients(10, print_result, cleanup_databases)); + result += test_report("PriKey NULL Test:", do_test_prikey(2, print_result, cleanup_databases)); + result += test_report("Test Double Init:", do_test_double_init(2, cleanup_databases)); + + // test grow-only set + result += test_report("Test GrowOnlySet:", do_test_gos(6, print_result, cleanup_databases)); + result += test_report("Test Network Enc/Dec:", do_test_network_encode_decode(2, print_result, cleanup_databases, false)); + result += test_report("Test Network Enc/Dec 2:", do_test_network_encode_decode(2, print_result, cleanup_databases, true)); + result += test_report("Test Fill Initial Data:", do_test_fill_initial_data(3, print_result, cleanup_databases)); + result += test_report("Test Alter Table 1:", do_test_alter(3, 1, print_result, cleanup_databases)); + result += test_report("Test Alter Table 2:", do_test_alter(3, 2, print_result, cleanup_databases)); + result += test_report("Test Alter Table 3:", do_test_alter(3, 3, print_result, cleanup_databases)); + + // test row-level filter + result += test_report("Test Row Filter:", do_test_row_filter(2, print_result, cleanup_databases)); + + // test block-level LWW + result += test_report("Test Block LWW Insert:", do_test_block_lww_insert(2, print_result, cleanup_databases)); + result += 
test_report("Test Block LWW Update:", do_test_block_lww_update(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Sync:", do_test_block_lww_sync(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Delete:", do_test_block_lww_delete(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Materialize:", do_test_block_lww_materialize(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Empty:", do_test_block_lww_empty_text(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Conflict:", do_test_block_lww_conflict(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Multi-Update:", do_test_block_lww_multi_update(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Reinsert:", do_test_block_lww_reinsert(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Add Lines:", do_test_block_lww_add_lines(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW NoConflict:", do_test_block_lww_noconflict(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Add+Edit:", do_test_block_lww_add_and_edit(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Three-Way:", do_test_block_lww_three_way(3, print_result, cleanup_databases)); + result += test_report("Test Block LWW MixedCols:", do_test_block_lww_mixed_columns(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW NULL->Text:", do_test_block_lww_null_to_text(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Interleave:", do_test_block_lww_interleaved(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW CustomDelim:", do_test_block_lww_custom_delimiter(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Large Text:", do_test_block_lww_large_text(2, print_result, 
cleanup_databases)); + result += test_report("Test Block LWW Rapid Upd:", do_test_block_lww_rapid_updates(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Unicode:", do_test_block_lww_unicode(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW SpecialChars:", do_test_block_lww_special_chars(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Del vs Edit:", do_test_block_lww_delete_vs_edit(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW TwoBlockCols:", do_test_block_lww_two_block_cols(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Text->NULL:", do_test_block_lww_text_to_null(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW PayloadSync:", do_test_block_lww_payload_sync(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Idempotent:", do_test_block_lww_idempotent(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW Ordering:", do_test_block_lww_ordering(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW CompositePK:", do_test_block_lww_composite_pk(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW EmptyVsNull:", do_test_block_lww_empty_vs_null(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW DelReinsert:", do_test_block_lww_delete_reinsert(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW IntegerPK:", do_test_block_lww_integer_pk(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW MultiRow:", do_test_block_lww_multi_row(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW NonOverlap:", do_test_block_lww_nonoverlap_add(2, print_result, cleanup_databases)); + result += test_report("Test Block LWW LongLine:", do_test_block_lww_long_line(2, print_result, cleanup_databases)); + result += 
test_report("Test Block LWW Whitespace:", do_test_block_lww_whitespace(2, print_result, cleanup_databases)); finalize: if (rc != SQLITE_OK) printf("%s (%d)\n", (db) ? sqlite3_errmsg(db) : "N/A", rc);