
feat: deploy projects asynchronously with live progress #1358

Merged
sorccu merged 10 commits into main from simo/red-644-async-project-deploy on Jun 24, 2026

@sorccu sorccu commented Jun 22, 2026

What

checkly deploy now runs the deployment asynchronously and shows live progress while it runs.

Previously, deploying a very large project could fail with a request timeout even though the deployment was still in progress. Deploys are no longer bound by that limit.

Details

  • Live progress percentage in the deploy spinner as the deployment proceeds.
  • Resilient to transient connection drops mid-deploy — the CLI automatically reconnects and keeps following progress.
  • Follows the deployment via its project-scoped endpoint (/v1/projects/{logicalId}/deployments/{id}).
  • --preview (dry run) is unchanged: it still returns the diff synchronously without starting a deployment.
  • The deploy() result shape is unchanged for programmatic consumers.
  • The deploy result's project type now reflects the API's full shape — its id plus the new camelCase createdAt/updatedAt timestamps (type-only, additive; existing consumers unaffected).

Concurrent deployments

When another deployment is already in progress for the project (the API returns 409), checkly deploy previously errored out. Now it waits for the in-progress deployment to finish and then deploys — no failure, no hung request.

  • On a 409, the CLI long-polls the completion endpoint of the in-progress deployment until it reaches a final state, then re-submits the deploy once. The payload is never re-uploaded while the predecessor is still running. Meanwhile, the CLI shows a "Waiting for an in-progress deployment to finish…" status.
  • New opt-in flag --cancel-in-progress-deployment changes the default from wait to cancel: it cancels the in-flight deployment first (so a new deploy preempts an older one) and then deploys. It is a dedicated flag, not --force (which only skips the confirmation prompt).
  • The wait is bounded by an overall deadline (~30 min); if it's exceeded the 409 surfaces with an actionable message. A predecessor whose worker died is finalized by the backend reaper, after which the wait returns and the deploy proceeds.
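
The wait-then-retry flow in the bullets above could be sketched roughly as follows. This is an illustration under stated assumptions, not the CLI's actual code: `SubmitResult`, `DeployApi`, `ConflictOptions`, and `deployResolvingConflict` are hypothetical names, the error messages are placeholders, and the real implementation may bound retries differently.

```typescript
// Hypothetical result of POSTing a deploy: accepted, or rejected with a 409.
type SubmitResult =
  | { kind: 'accepted', deploymentId: string }
  | { kind: 'conflict', deploymentId: string } // id of the in-progress deployment

// Hypothetical, minimal view of the API used by this sketch.
interface DeployApi {
  submit (): Promise<SubmitResult>
  cancel (deploymentId: string): Promise<void>
  // One long-poll: true once the deployment has reached a final state.
  isFinal (deploymentId: string): Promise<boolean>
}

interface ConflictOptions {
  cancelInProgress: boolean // mirrors --cancel-in-progress-deployment
  deadlineMs: number // overall wait budget (about 30 minutes per the description)
  now?: () => number // injectable clock, for tests
}

async function deployResolvingConflict (
  api: DeployApi,
  options: ConflictOptions,
): Promise<string> {
  const now = options.now ?? (() => Date.now())

  const first = await api.submit()
  if (first.kind === 'accepted') return first.deploymentId

  // Opt-in: preempt the older deployment instead of only waiting for it.
  if (options.cancelInProgress) await api.cancel(first.deploymentId)

  // Wait until the predecessor is final, giving up at the deadline.
  const deadline = now() + options.deadlineMs
  while (!(await api.isFinal(first.deploymentId))) {
    if (now() >= deadline) {
      throw new Error('Another deployment is still in progress.')
    }
  }

  // Re-submit exactly once, and only after the predecessor has finished.
  const second = await api.submit()
  if (second.kind === 'conflict') {
    throw new Error('Another deployment started before the retry.')
  }
  return second.deploymentId
}
```

Injecting the API and the clock keeps the control flow testable without a network; whether the real code is structured this way is not stated in this PR.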

Testing

Unit tests cover the deploy flow: progress reporting, successful completion, failed deployments, reconnect on interrupted progress, and the dry-run path — plus the cancellation flow (cancel + wait + retry, bounded retries, 408 long-poll, and the predecessor-already-gone path). tsc and ESLint pass.

Requires the corresponding Checkly API update (async deploy + cancellation) to be available.

🤖 Generated with Claude Code

https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi

@sorccu sorccu force-pushed the simo/red-644-async-project-deploy branch from 23c424b to 02969bc Compare June 23, 2026 02:07
sorccu and others added 10 commits June 25, 2026 02:52
Switch `checkly deploy` from the synchronous POST /next-v2/projects/deploy to the
async POST /v1/projects/deploy, then poll GET /v1/projects/deployments/{id}/completion
to completion. This removes the API-gateway request-timeout ceiling that caused
large projects to fail with a 504 even though the deploy was still running.

- rest/projects.ts: deploy() submits the deployment and awaits completion
  (looping the long-poll completion endpoint, which 408s while in progress);
  dry runs still return the preview diff synchronously. Adds ProjectDeployment /
  ProjectDeployFailedError types, getDeployment(), and awaitDeploymentCompletion()
  with best-effort progress reporting.
- commands/deploy.ts: show a spinner with live progress during the deploy.

The deploy() return shape ({ data: { project, diff } }) is unchanged for callers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
Replace the completion long-poll (which only refreshed progress ~every 30s) with
the backend's Server-Sent Events stream: deploy() now follows
GET /v1/projects/deployments/{id}/events, surfacing each `progress` frame to
onProgress for a smooth bar and resolving on the terminal `complete` frame. A
single authenticated GET — no client polling.

- streamDeploymentEvents() reconnects (bounded) when the stream drops before a
  terminal frame, covering both a clean EOF and a socket error (ECONNRESET) — the
  common mid-deploy interruption; the server is stateless so resuming needs no cursor.
- openEventStream() buffers an HTTP error body (the response is a stream, so the
  interceptor can't classify it) and re-runs the classifier to surface the typed
  error (NotFoundError, etc.).
- Drops the snapshot-on-408 progress hack. dryRun stays the synchronous preview.

Requires the backend SSE route to be deployed first.
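
To illustrate how `progress` and `complete` frames might be pulled out of the event stream, here is a minimal Server-Sent Events frame parser. It is a sketch, not the CLI's `streamDeploymentEvents()`: `SseFrame`, `parseSseFrames`, and `isTerminalFrame` are hypothetical names, the payload is kept as a raw string because its shape is not described here, and reconnection is left out.

```typescript
interface SseFrame {
  event: string
  data: string
}

// Splits buffered stream text into complete frames. An incomplete trailing
// frame is returned as `rest` so the caller can prepend it to the next chunk.
function parseSseFrames (buffer: string): { frames: SseFrame[], rest: string } {
  const blocks = buffer.replace(/\r\n/g, '\n').split('\n\n')
  const rest = blocks.pop() ?? ''
  const frames: SseFrame[] = []

  for (const block of blocks) {
    let event = 'message' // default event name in the SSE format
    const data: string[] = []

    for (const line of block.split('\n')) {
      if (line.startsWith(':')) continue // comment or keep-alive line
      const colon = line.indexOf(':')
      const field = colon === -1 ? line : line.slice(0, colon)
      let value = colon === -1 ? '' : line.slice(colon + 1)
      if (value.startsWith(' ')) value = value.slice(1) // one optional space

      if (field === 'event') event = value
      else if (field === 'data') data.push(value)
    }

    // Frames without any data line are not dispatched.
    if (data.length > 0) frames.push({ event, data: data.join('\n') })
  }

  return { frames, rest }
}

// The commit message names `complete` as the terminal frame.
function isTerminalFrame (frame: SseFrame): boolean {
  return frame.event === 'complete'
}
```

A caller would append each received chunk to `rest`, parse again, and stop following the stream once a terminal frame appears.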

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
Follow a deploy via /v1/projects/{logicalId}/deployments/{id}/events, threading the
project logicalId (URL-encoded) through the deployment-following calls.
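
As a small illustration of why the logicalId is URL-encoded, the path could be built like this. `deploymentEventsPath` is a hypothetical helper name, and encoding the deployment id as well is an assumption; the commit only says the logicalId is encoded.

```typescript
// Builds the project-scoped events path. A logicalId may contain characters
// such as "/" or spaces, which would otherwise change the path structure.
function deploymentEventsPath (logicalId: string, deploymentId: string): string {
  const project = encodeURIComponent(logicalId)
  const deployment = encodeURIComponent(deploymentId) // assumption: harmless for opaque ids
  return `/v1/projects/${project}/deployments/${deployment}/events`
}
```

For example, a logicalId of `my project/web` becomes `my%20project%2Fweb` in the path rather than introducing an extra path segment.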

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…ight deploy [RED-644]

When the async deploy endpoint returns 409 (a deployment is already in progress
for the project), the CLI previously errored out. With the new opt-in flag, the
CLI instead cancels the in-flight deployment, waits for it to finish unwinding,
and retries — so a new deploy can preempt an older one instead of waiting.

- New boolean flag --cancel-in-progress-deployment (default false). It is a
  dedicated flag, NOT --force (which only skips the confirmation prompt).
- On a 409 with the flag set, deploy() cancels the specific in-flight
  deployment (from the conflict's deploymentId), long-polls the completion
  endpoint to a final state, then retries — bounded to avoid an unbounded
  cancel war, and treating a vanished predecessor (404) as "slot free".
- awaitDeploymentCompletion floors its poll cadence so a server returning 408
  immediately can't become a tight request loop.
- A "Waiting for an in-progress deployment to finish…" status message is shown
  during the wait; a 409 without the flag now prints an actionable hint.
- rest/errors.ts: type the deploymentId carried on a 409 ErrorData.
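
The poll-cadence floor mentioned above can be sketched as follows: if a poll returns 408 sooner than a minimum interval, the loop sleeps for the remainder before polling again. The names `remainingDelay`, `pollUntilFinal`, and `pollOnce`, and the one-second floor, are assumptions for illustration; the actual value and structure are not stated in this commit.

```typescript
const MIN_POLL_INTERVAL_MS = 1000 // assumed floor, not taken from the CLI

// How long to wait so that consecutive polls start at least
// `minIntervalMs` apart. A long-poll that already took longer needs no wait.
function remainingDelay (startedAt: number, finishedAt: number, minIntervalMs: number): number {
  return Math.max(0, minIntervalMs - (finishedAt - startedAt))
}

const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms))

// `pollOnce` is a hypothetical stand-in for one request to the completion
// endpoint; it resolves to the HTTP status code of the response.
async function pollUntilFinal (
  pollOnce: () => Promise<number>,
  minIntervalMs: number = MIN_POLL_INTERVAL_MS,
): Promise<number> {
  for (;;) {
    const startedAt = Date.now()
    const status = await pollOnce()
    if (status !== 408) return status // 408 means "still in progress"
    await sleep(remainingDelay(startedAt, Date.now(), minIntervalMs))
  }
}
```

With this floor, a server that answers 408 immediately costs at most one request per interval instead of a tight loop.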

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
Mirror the API response shape: the deployment now carries cancelRequestedAt
(null unless a cancellation has been requested).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…loy [RED-644]

Previously a 409 (a deployment is already in progress for the project) failed the
deploy. Now `checkly deploy` waits for the in-progress deployment to finish and
then deploys — the same flow as --cancel-in-progress-deployment minus the cancel
step. The flag now means "cancel it instead of waiting".

- On a 409, deploy() resolves the conflict and retries: it long-polls the
  completion endpoint until the predecessor reaches a final state (optionally
  cancelling it first with the flag), and only then re-POSTs the deploy — once.
  The payload is never re-uploaded while the predecessor is still running.
- awaitDeploymentCompletion is a single long-poll; the wait/retry cadence lives
  in deploy()/resolveInProgressDeployment, bounded by an overall ~30-min deadline
  after which the 409 surfaces with a clear message.
- A predecessor whose worker died is finalized by the backend reaper, after
  which the completion poll returns and we deploy.
- Update the flag description and the conflict message to match.

Depends on the backend returning 409 immediately on a concurrent deploy
(previously the request hung), without which neither the wait nor the cancel
path could start.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
A deployment that was cancelled surfaced as the generic "Your project could not
be deployed. / The deployment did not complete successfully." — giving no hint
that it was cancelled.

- Add ProjectDeployCancelledError; submitDeployment throws it when the deploy
  stream completes as CANCELLED (a reliable signal, independent of any backend
  error text), before the generic non-SUCCEEDED failure path.
- The deploy command reports it distinctly: title "Your deployment was
  cancelled." with the body "A newer deployment may have cancelled yours. Try
  deploying again if you still need to apply your changes." Still an error
  (❌, exit 1) — only the messaging changes.
- Also reword the in-progress (409 past wait deadline) message so the body
  stands on its own instead of leaning on the title.
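
The distinction between a cancelled deploy and a generic failure can be sketched as a mapping from final status to user-facing message. This shows only the ordering of the checks and the quoted messages; `FinalStatus`, `FailureReport`, and `reportForStatus` are hypothetical names, the `FAILED` status value is an assumption, and the real code throws `ProjectDeployCancelledError` rather than returning a value.

```typescript
// 'CANCELLED' and 'SUCCEEDED' appear in the commit message; 'FAILED' is assumed.
type FinalStatus = 'SUCCEEDED' | 'FAILED' | 'CANCELLED'

interface FailureReport {
  title: string
  body: string
  exitCode: number
}

// Returns null on success. The CANCELLED check runs before the generic
// non-SUCCEEDED path so a cancellation is never reported as a plain failure.
function reportForStatus (status: FinalStatus): FailureReport | null {
  if (status === 'SUCCEEDED') return null

  if (status === 'CANCELLED') {
    return {
      title: 'Your deployment was cancelled.',
      body: 'A newer deployment may have cancelled yours. Try deploying again if you still need to apply your changes.',
      exitCode: 1, // still an error; only the messaging differs
    }
  }

  return {
    title: 'Your project could not be deployed.',
    body: 'The deployment did not complete successfully.',
    exitCode: 1, // assumed to match the cancelled case
  }
}
```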

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…ages [RED-644]

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…ge [RED-644]

Matches the wait message; "the" implied a deployment the user hadn't been told about.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
…tedAt

The async/sync deploy response's `result.project` was typed as the bare `Project`
(name/logicalId/repoUrl). Add a `DeployedProject` type matching the API's
deploy-result shape: the `id` it always returned plus the newly added camelCase
`createdAt`/`updatedAt` timestamps.
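
A rough sketch of the type relationship described above, showing why the change is additive. The field types (string ids, ISO-8601 timestamp strings, optional `repoUrl`) and the sample values are assumptions; only the field names come from this commit.

```typescript
// The bare project shape named in the commit message.
interface Project {
  name: string
  logicalId: string
  repoUrl?: string // optionality is an assumption
}

// Deploy-result shape: the bare project plus id and timestamps.
interface DeployedProject extends Project {
  id: string // assumed to be a string
  createdAt: string // assumed ISO-8601 string
  updatedAt: string // assumed ISO-8601 string
}

// An existing consumer written against the bare Project type.
function describeProject (project: Project): string {
  return `${project.name} (${project.logicalId})`
}

// Illustrative values only.
const deployed: DeployedProject = {
  name: 'Website',
  logicalId: 'website-prod',
  id: 'example-id',
  createdAt: '2026-06-24T17:52:00.000Z',
  updatedAt: '2026-06-24T17:52:00.000Z',
}

// Still type-checks: DeployedProject is assignable to Project.
const summary = describeProject(deployed)
```

Because the new type only adds fields, code that accepts a `Project` keeps compiling when handed a `DeployedProject`.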

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CjBjJjMihvJKv8PEzuCvKi
@sorccu sorccu force-pushed the simo/red-644-async-project-deploy branch from d68a098 to 5885fae Compare June 24, 2026 17:52
@sorccu sorccu merged commit fda0b51 into main Jun 24, 2026
13 checks passed
@sorccu sorccu deleted the simo/red-644-async-project-deploy branch June 24, 2026 18:05