* Docker test infrastructure for cloudflare workers

Add Docker-based test infrastructure for running the cloudflare worker test suite in containers. Includes multi-shard test execution, Docker Compose services, and image build pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Patch vitest-pool-workers to handle SQLite WAL sidecar files

@cloudflare/vitest-pool-workers' isolated storage mechanism (pushStackedStorage / popStackedStorage) asserts that every file in a Durable Object namespace directory ends with ".sqlite". However, SQLite in WAL journal mode creates two transient sidecar files alongside each database: ".sqlite-shm" (shared memory index) and ".sqlite-wal" (write-ahead log).

When running in singleWorker mode, there is a race condition: abortAllDurableObjects() evicts all DOs but workerd may not have fully checkpointed the WAL before the Node.js side runs readdir(). If a .sqlite-shm or .sqlite-wal file is still on disk, the assertion fires and the test fails with "Isolated storage failed".

The fix handles each function differently:

- pushStackedStorage (snapshot before test): skip WAL files entirely. They are transient and SQLite regenerates them when reopening the DB, so they don't need to be part of the snapshot.
- popStackedStorage (restore after test): delete WAL files but skip the .sqlite assertion. This is critical - if we merely skipped them, the stale .sqlite-wal would be replayed by SQLite when the next test opens the restored .sqlite file, leaking the previous test's writes and breaking test isolation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Migrate comment search embeddings from Cohere to Voyage AI

The queue consumer (comments-search.ts) that indexes comments already used Voyage AI, but the search endpoint (api/lib/comments.ts) that embeds user queries still imported from ~/cohere. This caused test failures because the test mocks only intercept Voyage AI and Turbopuffer origins - Cohere API calls were blocked.
More importantly, this completes the Cohere→Voyage migration:

- Import embed/model/dimension from ~/voyage instead of ~/cohere
- Change inputType from "search_query" to "query" (Voyage API)
- Remove embeddingTypes param (Voyage doesn't use it)
- Use env.VOYAGE_API_KEY instead of env.COHERE_API_KEY

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Docker test infrastructure improvements

Improvements to the Docker-based test runner for cloudflare workers:

Dockerfile.base:
- Bump Node.js from 20.18.1 to 20.19.1 (required by Prisma 7.x)
- Include prisma.config.ts in the image (required by prisma migrate deploy in Prisma 7, which no longer supports --url flag or datasource.url in the schema)

m2m-api entrypoint:
- Remove duplicate @liveblocks/core from nested node_modules before starting the server to prevent "Multiple copies of Liveblocks" runtime error (root has 3.14.0 stable, shared/common has 3.14.0-rc1)

docker-compose.dev.yml:
- Add AWS_LB_M2M_API_URL_US_EAST_1 and EU_CENTRAL_1 env vars

vitest.config.base.ts:
- Add __DOCKER__ define (set from DOCKER=1 env var) for conditional test skipping in Docker
- Override M2M API URL bindings from env vars in Docker
- Configure outbound network access for workerd (0.0.0.0/0)

Test changes:
- Skip editThreadMetadata "metadata keys exceed limit" test in Docker (unreliably slow with 50 metadata keys in singleWorker mode)
- Use env-based M2M API URL in search and queue tests
- Increase waitUntil timeouts to 10s for Docker
- Use dynamic M2M URL in svix-utils

run-all-shards.sh / run-shard.sh:
- Improved shard orchestration and result parsing
- Resolve m2m-api container IP for workerd DNS (workerd can't resolve Docker service names)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix SQLite WAL cleanup in restore-sqlite-backup tests

The restore-sqlite-backup tests were failing because the restore endpoint forces a DO restart via ctx.waitUntil(blockConcurrencyWhile(throw ...)), which leaves .sqlite-shm WAL files
behind. The vitest-pool-workers isolated storage checker then fails with "Expected .sqlite, got .sqlite-shm".

- Add state.storage.sync() to runInSQLiveRoomDO (parity with runInLiveRoomDO) as a workaround for cloudflare/workers-sdk#11031
- Add syncSQLiveRoomDO() helper for tests that access SQLite DOs via HTTP APIs
- Call syncSQLiveRoomDO() at the end of both restore-sqlite-backup tests to checkpoint WAL before isolated storage cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Consolidate test/backend scripts into extensionless Node runners

Replace run-tests.sh and run-all-shards.sh with a unified `run-tests` Node script that handles both simple test runs and sharded orchestration. Add `run-backend` for starting the dev server with dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Document Docker image tagging strategy in CI workflow

Expand the determine-tags job comment to explain the exact tag derivation rules (PR, branch, version, SHA) and which images receive which tags.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Extract inline CI test env into .env.test.ci file

Move the inline .env.test creation from cloudflare-docker.yml into a committed .env.test.ci file. The CI step now just copies it over .env.test before starting services.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Rename INTERNAL_NETWORK env var to AIR_GAPPED

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Moves the resources around

* Address PR review feedback

- Drop bundle size analysis from build.js
- Make Docker CI workflow manual dispatch only (remove PR/push triggers)
- Remove test result publishing (merge-reports job)
- Revert non-Docker-related changes in setupServers.ts (startPostgres, setupPostgres, kill logic)
- Remove redundant comments from vitest.config.base.ts, revert timeout changes
- Remove @cloudflare/workers-types from root devDependencies
- Add lint compose service, simplify docker/lint script to use it

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Rebase onto main and update dependencies

- Regenerate package-lock.json after rebase
- Remove @liveblocks/core from root devDependencies
- Update @cloudflare/workers-types patch for 4.20260219.0

* Copy Dockerfiles into temp deploy folders for CDK

turbo prune doesn't include docker/ since it's outside the workspace. Copy the production Dockerfiles into the pruned output so CDK can find them at the new paths.

* Address PR review feedback: revert unrelated changes, simplify workflow

- Revert miniflare-utils.ts, comments.ts, aws-m2m-api-utils.ts, and restore-sqlite-backup.test.ts to main (changes addressed elsewhere)
- Revert test-cloudflare.yml to main (keep existing CI working)
- Remove push_images workflow parameter (always push)
- Add patches/* to base image hash in build-images script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update package-lock.json after rebase onto main

Regenerate lockfile to reflect zenrouter removal and decoders upgrade.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Restore apps/cloudflare/test/docker-compose.yml for existing CI

The test-cloudflare.yml workflow references this file at the original path. Restore it so the existing PR test workflow continues to work.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix HTTPS in tests: always enable TLS, remove stray externals

- Always include tlsOptions in outboundService network config, not just in Docker mode. Without tlsOptions, workerd can't make HTTPS requests, breaking Turbopuffer and Voyage AI calls in tests.
- Remove "crypto" and "canvas" from esbuild externals (incorrectly added during rebase conflict resolution).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix Cursor bot findings: globalSetup, singleWorker, workerd path, prisma

- Restore globalSetup falsy check for nosetup vitest config
- Respect options.singleWorker parameter alongside env var
- Remove hardcoded arm64 workerd path from docker-compose (let Dockerfile symlinks handle architecture detection)
- Revert prisma connection_limit change (not Docker-related)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Set MINIFLARE_WORKERD_PATH only on arm64 in test-entrypoint

On arm64, miniflare can't resolve the workerd binary without an explicit path. Set it conditionally in the test entrypoint rather than hardcoding arm64 in docker-compose.dev.yml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix Node.js version in m2m-api Dockerfile to 20.19.0

Match the version used in the rest of the repo (package.json engine constraint, base Dockerfile).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Restore rm -rf dist/ in mongo build, keep build:docker

The build script should clean dist/ before compiling (matching other shared packages). The build:docker script skips cleanup since the dist dir is fresh in Docker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add clean script to shared/mongo matching other shared packages

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Revert non-Docker changes in build.js

Keep only Docker-related changes (GIT_SHA env var fallback, conditional minify).
Revert cosmetic and unrelated changes (unused result variable, formatting, error logging).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Restore original build script for shared/postgres-prisma

The build script was changed from rsync --ignore-existing to cp -r, which overwrites tsc output with raw generated files. This likely caused the getRoomThreads OR operator test to fail (500 instead of 422). Only build:docker and clean should be added, not modifying the existing build script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix docker inspect IP resolution for multi-network containers

Use newline separator and head -1 to extract only the first IP address, preventing concatenation if the container is attached to multiple networks.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Align base Dockerfile Node version with production (20.19.0)

Match the node:20.19.0 used in production aws-m2m-api Dockerfile and the engine constraint in apps/cloudflare/package.json.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix minification default to preserve production behavior

Default to minify when NODE_ENV is unset (production deploys don't set it). Only disable minification when NODE_ENV is explicitly set to something other than "production" (e.g. test, development).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Enable globstar for recursive glob in build-images

Without shopt -s globstar, ** only matches one directory level in bash, so apps/aws-m2m-api/src/**/*.ts would miss nested files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove unused PRISMA_QUERY_ENGINE_LIBRARY and run-all-shards

- Remove PRISMA_QUERY_ENGINE_LIBRARY env var from docker-compose.dev.yml: Prisma 7.x with engineType="client" doesn't use the native query engine
- Remove run-all-shards script: duplicates run-tests shard orchestration
- Clean up references to run-all-shards in run-shard, run-tests, README

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Regenerate package-lock.json from main to fix zenrouter hoisting

The previous lockfile had @liveblocks/zenrouter duplicated as 4 separate copies (one per consuming package) instead of hoisted to root. This caused instanceof ValidationError checks to fail across package boundaries, resulting in 500 instead of 422 for invalid thread queries.

Regenerated by taking main's lockfile and running npm install to reconcile with branch package.json changes (scripts only, no dep changes).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix entrypoint debug logging, Docker tag whitespace, and lockfile

- Remove set -x from m2m-api entrypoint to avoid leaking env vars in logs
- Fix leading whitespace in Docker image tags in cloudflare-docker.yml
- Regenerate package-lock.json from main to fix zenrouter hoisting (4 separate copies → 1 hoisted copy, fixing instanceof checks)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Make Docker-specific vitest config conditional to fix workerd crash

Gate outboundService.network and miniflare log on DOCKER=1, restore hardcoded localhost:3003 bindings as defaults, remove minWorkers: 1, and restore env: {} from main.

* Fix lint errors: simplify Docker detection, fix import sort order

* Remove unnecessary IP resolution from run-shard, fix shard average

Docker DNS resolves service names within the compose network, so the manual IP resolution via docker inspect was unnecessary. The M2M API URLs are already set in docker-compose.dev.yml using the service name.

Also fix average-per-shard calculation to divide by actual results count instead of totalShards when sequential mode stops early.

* Remove misplaced syntax directives, fix cleanup path in run-shard-ci

The # syntax=docker/dockerfile:1 directive was on line 3 in all Dockerfiles, preceded by comments, so Docker ignored it. Since modern Docker defaults to BuildKit, just remove the no-op directives.

Fix cleanup function in run-shard-ci to use absolute path for the compose file, since the working directory changes later in the script.

* Add build:docker task to postgres-prisma turbo config

The build:docker script needs prisma:generate to have run first (it copies src/generated to dist), but only the build task had this dependency declared.

* Fix build:docker dependency to use ^build:docker not ^build

Aligns with root turbo.json which uses ^build:docker, ensuring upstream deps run their Docker-specific build (without rm -rf dist).

* Fix workflow_dispatch branch tag to use ref instead of event name

workflow_dispatch sets github.ref to refs/heads/<branch> but event_name is 'workflow_dispatch' not 'push', so the branch tag always fell through to 'main'. Check the ref pattern instead.

* Remove shared/zenrouter references from Docker files

zenrouter was moved to its own repo and is now consumed as an npm dependency (@liveblocks/zenrouter). The Dockerfile and compose files were still referencing shared/zenrouter which no longer exists.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Original commit: aabb73a94311f409e70ccb4265910b6cba6a828f
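
For readers following the vitest-pool-workers patch described in the commit message above, the per-file decision it makes can be sketched roughly as below. This is an illustrative sketch only, not the actual patch: `planPush`, `planPop`, `isWalSidecar`, and the `StorageAction` type are hypothetical names, and the real pushStackedStorage / popStackedStorage functions operate on directories on disk rather than on bare file names.

```typescript
// Hypothetical sketch of the WAL sidecar handling described above.
// None of these names come from @cloudflare/vitest-pool-workers.

type StorageAction = "keep" | "skip" | "delete";

// Sidecar files that SQLite creates next to a database in WAL journal mode.
const WAL_SIDECAR_SUFFIXES = [".sqlite-shm", ".sqlite-wal"];

function isWalSidecar(fileName: string): boolean {
  return WAL_SIDECAR_SUFFIXES.some((suffix) => fileName.endsWith(suffix));
}

// Shared assertion: anything that is not a sidecar must be a .sqlite file.
function assertSqlite(fileName: string): void {
  if (!fileName.endsWith(".sqlite")) {
    throw new Error(`Expected .sqlite, got ${fileName}`);
  }
}

// Snapshot before a test: sidecars are transient, so leave them out.
function planPush(fileName: string): StorageAction {
  if (isWalSidecar(fileName)) return "skip";
  assertSqlite(fileName);
  return "keep";
}

// Restore after a test: sidecars must be deleted, not merely skipped,
// so that a stale write-ahead log is not replayed into the restored database.
function planPop(fileName: string): StorageAction {
  if (isWalSidecar(fileName)) return "delete";
  assertSqlite(fileName);
  return "keep";
}
```

The asymmetry is the point of the sketch: ignoring a sidecar is safe when taking a snapshot, but on restore the same file has to be removed to keep tests isolated from one another.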
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.4)