feat: deploy projects asynchronously with live progress #1358
Merged
Switch `checkly deploy` from the synchronous POST /next-v2/projects/deploy to the
async POST /v1/projects/deploy, then poll GET /v1/projects/deployments/{id}/completion
to completion. This removes the API-gateway request-timeout ceiling that caused
large projects to fail with a 504 even though the deploy was still running.
- rest/projects.ts: deploy() submits the deployment and awaits completion
(looping the long-poll completion endpoint, which 408s while in progress);
dry runs still return the preview diff synchronously. Adds ProjectDeployment /
ProjectDeployFailedError types, getDeployment(), and awaitDeploymentCompletion()
with best-effort progress reporting.
- commands/deploy.ts: show a spinner with live progress during the deploy.
The deploy() return shape ({ data: { project, diff } }) is unchanged for callers.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
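For illustration, the submit-then-await loop described in this commit could look roughly like the sketch below. It assumes a hypothetical `pollOnce` callback that performs one long-poll request against the completion endpoint; `PollResponse`, `interpretPoll`, `awaitCompletion`, and the `maxPolls` budget are illustrative names, not the CLI's actual API.

```typescript
// Assumed shape of a finished deployment; the real ProjectDeployment type is not shown in this PR text.
interface Deployment { id: string, state: string }

// One long-poll request: the HTTP status, plus the body when the server sent one.
interface PollResponse { status: number, deployment?: Deployment }

// 408 means "still in progress, ask again"; 200 means a final state was reached.
function interpretPoll (status: number): 'pending' | 'done' {
  if (status === 408) return 'pending'
  if (status === 200) return 'done'
  throw new Error(`Unexpected status ${status} from the completion endpoint`)
}

// Repeats the long-poll until the deployment completes or the (hypothetical) poll budget runs out.
async function awaitCompletion (
  deploymentId: string,
  pollOnce: (id: string) => Promise<PollResponse>,
  maxPolls = 120,
): Promise<Deployment> {
  for (let i = 0; i < maxPolls; i++) {
    const res = await pollOnce(deploymentId)
    if (interpretPoll(res.status) === 'pending') continue
    if (!res.deployment) throw new Error('Completion response carried no deployment')
    return res.deployment
  }
  throw new Error(`Deployment ${deploymentId} did not complete within ${maxPolls} polls`)
}
```

Injecting `pollOnce` keeps the loop independent of the HTTP client, which makes the 408 path easy to exercise in unit tests.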
Replace the completion long-poll (which only refreshed progress ~every 30s) with
the backend's Server-Sent Events stream: deploy() now follows
GET /v1/projects/deployments/{id}/events, surfacing each `progress` frame to
onProgress for a smooth bar and resolving on the terminal `complete` frame. A
single authenticated GET — no client polling.
- streamDeploymentEvents() reconnects (bounded) when the stream drops before a
terminal frame, covering both a clean EOF and a socket error (ECONNRESET) — the
common mid-deploy interruption; the server is stateless so resuming needs no cursor.
- openEventStream() buffers an HTTP error body (the response is a stream, so the
interceptor can't classify it) and re-runs the classifier to surface the typed
error (NotFoundError, etc.).
- Drops the snapshot-on-408 progress hack. dryRun stays the synchronous preview.
Requires the backend SSE route to be deployed first.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
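A rough sketch of the stream-following idea in this commit, under stated assumptions: `open` is a hypothetical factory that issues the authenticated GET and yields decoded text chunks, and `parseSseFrames` and `followDeployment` are illustrative names rather than the functions in this PR. The parser is deliberately simplified and handles only the `event:` and `data:` fields.

```typescript
interface SseFrame { event: string, data: string }

// Splits a buffer into complete frames (terminated by a blank line) and the unconsumed remainder.
function parseSseFrames (buffer: string): { frames: SseFrame[], rest: string } {
  const blocks = buffer.replace(/\r\n/g, '\n').split('\n\n')
  const rest = blocks.pop() ?? '' // last piece is incomplete until its blank line arrives
  const frames: SseFrame[] = []
  for (const block of blocks) {
    let event = 'message'
    const data: string[] = []
    for (const line of block.split('\n')) {
      if (line.startsWith(':')) continue // comment line, often used as keep-alive
      if (line.startsWith('event:')) event = line.slice(6).trim()
      else if (line.startsWith('data:')) data.push(line.slice(5).replace(/^ /, ''))
    }
    if (data.length > 0) frames.push({ event, data: data.join('\n') })
  }
  return { frames, rest }
}

// Follows the stream to the terminal `complete` frame, reopening it a bounded number of times.
async function followDeployment (
  open: () => AsyncIterable<string>,
  onProgress: (data: string) => void,
  maxReconnects = 3,
): Promise<string> {
  for (let attempt = 0; attempt <= maxReconnects; attempt++) {
    let buffer = ''
    try {
      for await (const chunk of open()) {
        const parsed = parseSseFrames(buffer + chunk)
        buffer = parsed.rest
        for (const frame of parsed.frames) {
          if (frame.event === 'progress') onProgress(frame.data)
          if (frame.event === 'complete') return frame.data
        }
      }
      // Clean EOF before a terminal frame: fall through and reconnect.
    } catch {
      // Simplification: every error is treated as a dropped socket (e.g. ECONNRESET).
      // Real code would rethrow non-retryable errors such as a 404.
    }
  }
  throw new Error('Event stream ended without a terminal frame')
}
```

Because the server is described as stateless, the sketch reconnects without sending any cursor; each new stream simply picks up from the current state.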
Follow a deploy via /v1/projects/{logicalId}/deployments/{id}/events, threading the
project logicalId (URL-encoded) through the deployment-following calls.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
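As a small illustration of why the logicalId is URL-encoded, a path builder might look like the sketch below. `deploymentEventsPath` is a hypothetical helper name, and encoding the deployment id as well is an assumption made here for safety.

```typescript
// Builds the events path; encodeURIComponent keeps characters such as "/" or spaces
// in a logicalId from being interpreted as path separators.
function deploymentEventsPath (logicalId: string, deploymentId: string): string {
  const project = encodeURIComponent(logicalId)
  const deployment = encodeURIComponent(deploymentId) // assumption: ids are encoded too
  return `/v1/projects/${project}/deployments/${deployment}/events`
}
```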
…ight deploy [RED-644]
When the async deploy endpoint returns 409 (a deployment is already in progress for the project), the CLI previously errored out. With the new opt-in flag, the CLI instead cancels the in-flight deployment, waits for it to finish unwinding, and retries, so a new deploy can preempt an older one instead of waiting.
- New boolean flag --cancel-in-progress-deployment (default false). It is a dedicated flag, NOT --force (which only skips the confirmation prompt).
- On a 409 with the flag set, deploy() cancels the specific in-flight deployment (from the conflict's deploymentId), long-polls the completion endpoint to a final state, then retries. The retries are bounded to avoid an unbounded cancel war, and a vanished predecessor (404) is treated as "slot free".
- awaitDeploymentCompletion floors its poll cadence so a server returning 408 immediately can't become a tight request loop.
- A "Waiting for an in-progress deployment to finish…" status message is shown during the wait; a 409 without the flag now prints an actionable hint.
- rest/errors.ts: type the deploymentId carried on a 409 ErrorData.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
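The bounded cancel-and-retry behaviour and the poll-cadence floor from this commit can be sketched as follows. Everything here is illustrative: `SubmitResult`, `PreemptDeps`, `deployPreempting`, `pollDelayMs`, the attempt limit of 3, and the 1000 ms floor are assumptions, not values taken from the PR.

```typescript
// A submit either starts a deployment or reports the one already in flight (the 409 case).
type SubmitResult =
  | { kind: 'accepted', deploymentId: string }
  | { kind: 'conflict', inFlightDeploymentId: string }

interface PreemptDeps {
  submit: () => Promise<SubmitResult>
  // 'not_found' models a 404: the predecessor vanished, so the slot is already free.
  cancel: (deploymentId: string) => Promise<'cancelled' | 'not_found'>
  awaitFinalState: (deploymentId: string) => Promise<void>
}

// Cancels the in-flight deployment and retries, giving up after maxAttempts
// so two clients cannot keep cancelling each other forever.
async function deployPreempting (deps: PreemptDeps, maxAttempts = 3): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await deps.submit()
    if (res.kind === 'accepted') return res.deploymentId
    const outcome = await deps.cancel(res.inFlightDeploymentId)
    if (outcome === 'cancelled') await deps.awaitFinalState(res.inFlightDeploymentId)
  }
  throw new Error(`Another deployment was still in progress after ${maxAttempts} attempts`)
}

// How long to sleep before the next poll, given how long the last request took.
// A server that answers 408 instantly still gets at most one request per floorMs.
function pollDelayMs (elapsedMs: number, floorMs = 1000): number {
  return Math.max(0, floorMs - elapsedMs)
}
```

A genuine long-poll that held the connection for 30 seconds yields a delay of zero, so the floor only slows down responses that return too quickly.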
Mirror the API response shape: the deployment now carries cancelRequestedAt (null unless a cancellation has been requested).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…loy [RED-644]
Previously a 409 (a deployment is already in progress for the project) failed the deploy. Now `checkly deploy` waits for the in-progress deployment to finish and then deploys: the same flow as --cancel-in-progress-deployment minus the cancel step. The flag now means "cancel it instead of waiting".
- On a 409, deploy() resolves the conflict and retries: it long-polls the completion endpoint until the predecessor reaches a final state (optionally cancelling it first with the flag), and only then re-POSTs the deploy, once. The payload is never re-uploaded while the predecessor is still running.
- awaitDeploymentCompletion is a single long-poll; the wait/retry cadence lives in deploy()/resolveInProgressDeployment, bounded by an overall ~30-min deadline after which the 409 surfaces with a clear message.
- A predecessor whose worker died is finalized by the backend reaper, after which the completion poll returns and we deploy.
- Update the flag description and the conflict message to match.
Depends on the backend returning 409 immediately on a concurrent deploy (previously the request hung), without which neither the wait nor the cancel path could start.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
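The wait-by-default flow in this commit, with its overall deadline and single re-POST, might be sketched like this. The dependency names (`submit`, `cancel`, `pollCompletion`, `now`) and `deployResolvingConflict` are hypothetical, and the clock is injected only to make the deadline testable.

```typescript
type SubmitResult =
  | { kind: 'accepted', deploymentId: string }
  | { kind: 'conflict', inFlightDeploymentId: string }

interface ConflictDeps {
  submit: () => Promise<SubmitResult>
  cancel: (deploymentId: string) => Promise<void>
  // One long-poll: true once the deployment reached a final state, false on a 408.
  pollCompletion: (deploymentId: string) => Promise<boolean>
  now: () => number // milliseconds
}

async function deployResolvingConflict (
  deps: ConflictDeps,
  opts: { cancelInProgress: boolean, deadlineMs: number },
): Promise<string> {
  const first = await deps.submit()
  if (first.kind === 'accepted') return first.deploymentId

  // With the flag, cancel the predecessor first; without it, just wait for it.
  if (opts.cancelInProgress) await deps.cancel(first.inFlightDeploymentId)

  const deadline = deps.now() + opts.deadlineMs
  let finished = false
  while (!finished) {
    if (deps.now() >= deadline) {
      throw new Error('Another deployment is still in progress for this project')
    }
    finished = await deps.pollCompletion(first.inFlightDeploymentId)
  }

  // The payload is re-submitted exactly once, and only after the predecessor finished.
  const retry = await deps.submit()
  if (retry.kind === 'accepted') return retry.deploymentId
  throw new Error('Another deployment started before the retry could be accepted')
}
```

Keeping each `pollCompletion` call a single long-poll and putting the loop here matches the split described above, where the cadence lives in the caller rather than in the poll helper.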
A deployment that was cancelled surfaced as the generic "Your project could not be deployed. / The deployment did not complete successfully.", giving no hint that it was cancelled.
- Add ProjectDeployCancelledError; submitDeployment throws it when the deploy stream completes as CANCELLED (a reliable signal, independent of any backend error text), before the generic non-SUCCEEDED failure path.
- The deploy command reports it distinctly: title "Your deployment was cancelled." with the body "A newer deployment may have cancelled yours. Try deploying again if you still need to apply your changes." Still an error (❌, exit 1); only the messaging changes.
- Also reword the in-progress (409 past wait deadline) message so the body stands on its own instead of leaning on the title.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
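To illustrate the ordering this commit relies on (CANCELLED is checked before the generic non-SUCCEEDED path), here is a minimal sketch that maps a final state to the message the command would print. `FailureReport` and `failureReportFor` are hypothetical names; the message strings and the exit code are the ones quoted above, and the real code throws typed errors instead of returning a report.

```typescript
interface FailureReport { title: string, body: string, exitCode: number }

// Returns null for a successful deployment, otherwise what to show the user.
function failureReportFor (finalState: string): FailureReport | null {
  if (finalState === 'SUCCEEDED') return null

  // Checked first so a cancellation is never reported as a plain failure.
  if (finalState === 'CANCELLED') {
    return {
      title: 'Your deployment was cancelled.',
      body: 'A newer deployment may have cancelled yours. Try deploying again if you still need to apply your changes.',
      exitCode: 1,
    }
  }

  // Generic path for every other non-SUCCEEDED state.
  return {
    title: 'Your project could not be deployed.',
    body: 'The deployment did not complete successfully.',
    exitCode: 1,
  }
}
```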
…ages [RED-644]
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…ge [RED-644]
Matches the wait message; "the" implied a deployment the user hadn't been told about.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…tedAt
The async/sync deploy response's `result.project` was typed as the bare `Project` (name/logicalId/repoUrl). Add a `DeployedProject` type matching the API's deploy-result shape: the `id` it always returned plus the newly added camelCase `createdAt`/`updatedAt` timestamps.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
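A sketch of why this typing change is additive. The field types below are assumptions (the PR text names the fields but not their types, and whether `repoUrl` is optional is a guess); `describeProject` is a hypothetical consumer written against the bare `Project`.

```typescript
// Assumed field types; only the field names come from the commit message.
interface Project {
  name: string
  logicalId: string
  repoUrl?: string
}

// The deploy-result shape: everything in Project plus id and the two timestamps.
interface DeployedProject extends Project {
  id: string
  createdAt: string
  updatedAt: string
}

// An existing consumer typed against Project keeps compiling when handed a DeployedProject.
function describeProject (project: Project): string {
  return `${project.name} (${project.logicalId})`
}

const deployed: DeployedProject = {
  name: 'My Project',
  logicalId: 'my-project',
  id: 'example-id',
  createdAt: '2024-01-01T00:00:00.000Z',
  updatedAt: '2024-01-02T00:00:00.000Z',
}

describeProject(deployed)
```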
What
`checkly deploy` now runs the deployment asynchronously and shows live progress while it runs.
Previously, deploying a very large project could fail with a request timeout even though the deployment was still in progress. Deploys are no longer bound by that limit.
Details
- … `/v1/projects/{logicalId}/deployments/{id}`).
- `--preview` (dry run) is unchanged: it still returns the diff synchronously without starting a deployment.
- The `deploy()` result shape is unchanged for programmatic consumers.
- The `project` type now reflects the API's full shape: its `id` plus the new camelCase `createdAt`/`updatedAt` timestamps (type-only, additive; existing consumers unaffected).
Concurrent deployments
When another deployment is already in progress for the project (the API returns `409`), `checkly deploy` previously errored out. Now it waits for the in-progress deployment to finish and then deploys: no failure, no hung request.
- On a `409`, the CLI long-polls the deployment's completion endpoint until the in-progress one reaches a final state, then re-submits the deploy once (the payload is never re-uploaded while the predecessor is still running). It shows a "Waiting for an in-progress deployment to finish…" status meanwhile.
- `--cancel-in-progress-deployment` changes the default from wait to cancel: it cancels the in-flight deployment first (so a new deploy preempts an older one) and then deploys. It is a dedicated flag, not `--force` (which only skips the confirmation prompt).
- … `409` surfaces with an actionable message. A predecessor whose worker died is finalized by the backend reaper, after which the wait returns and the deploy proceeds.
Testing
Unit tests cover the deploy flow: progress reporting, successful completion, failed deployments, reconnect on interrupted progress, and the dry-run path — plus the cancellation flow (cancel + wait + retry, bounded retries, 408 long-poll, and the predecessor-already-gone path).
`tsc` and ESLint pass.
🤖 Generated with Claude Code
https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi