Fly.io deploy manifest + CI auto-deploy (#38) #76
Append a `deploy` job to `.github/workflows/ci.yml` and land `fly.toml` at repo root. The job runs only on push to main, depends on test + security + image-scan, holds `contents: read`, and invokes `flyctl deploy --remote-only` with `FLY_API_TOKEN` from repo secrets. `fly.toml` declares TCP-passthrough on :80 and :443 (no Fly HTTP handlers — autocert stays the cert holder), mounts the autocert volume at /var/lib/relay/autocert, passes --domain/--cert-cache as literal argv (distroless has no shell), and pins the single-machine invariant via min_machines_running=1 + auto_start_machines=false + auto_stop_machines=off + immediate deploy strategy. `<REGION>` and `<DOMAIN>` ship as loud placeholders that the operator fills at bootstrap; the spec's first-deploy-fails-loud failure mode is the desired behaviour over a silently-misconfigured production relay. `setup-flyctl` is SHA-pinned (1.6) with a `# Tracks:` comment, matching the Trivy (#68) and govulncheck (#41) pin convention. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`docs/deploy.md`: bootstrap (apps create / IPv4 / volume / DNS / FLY_API_TOKEN), steady-state (PR → merge → CI deploys), and rollback (by image digest preferred, by release number as fallback). `docs/architecture.md` § Hosting: records the Fly.io + relay-terminated TLS + single-machine-via-fly.toml decision so the next agent reading the repo cold sees the call without scrolling #38. AC #5 permits architecture.md as the equivalent of PROJECT-MEMORY, which the architect role is not allowed to edit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Code Review: #38
Decision: PASS
Findings
Security review (security-sensitive label)
The architect's spec ( Re-walking the diff with security goggles:
The spec's [Threat model alignment] SHOULD FIX (update
Summary
Spec match is exact for all four files: The placeholder choice ( Comments throughout the manifest explain the why: the "no Fly handler" invariant, the distroless-argv constraint, the single-machine-not-enforced-by-platform reality, the
Per-ticket file `codebase/38.md` captures implementation summary, patterns established (action-pin convention extended to setup-flyctl; TCP-passthrough → real peer IP for #34's rate limiter; loud placeholders over silently-misconfigured real values), and lessons learned (Fly Apps v2 has no `max_machines` key — verify platform docs at impl time; distroless = no shell = no argv env expansion). New feature doc `features/fly-deploy.md` covers the manifest shape (TCP passthrough vs. Fly HTTP proxy, single-machine cap via four declarative knobs, dedicated-IPv4 requirement, loud placeholders), the CI deploy job's three-layer privilege model, and the rollback paths. Cross-link refresh in `features/docker-image.md`: #38's host-wiring deferrals now point at the landed fly-deploy feature doc. INDEX entries added for the new feature doc and `docs/deploy.md`; the System overview line gains a § Hosting pointer. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
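The "loud placeholders" pattern recorded here lends itself to a simple pre-deploy guard. The sketch below is hypothetical and not part of this PR (the PR relies on the first deploy failing loudly on its own); `find_placeholders` is an assumed helper name, and the double-underscore pattern is an assumption modelled on the `__REGION__` and `__DOMAIN__` tokens mentioned in the PR notes.

```python
import re

# Matches tokens shaped like __REGION__ or __DOMAIN__ (assumed convention).
PLACEHOLDER = re.compile(r"__[A-Z][A-Z0-9]*__")


def find_placeholders(text: str) -> list[tuple[int, str]]:
    """Return (line number, token) pairs for unfilled placeholders.

    Comments are skipped with a naive split on '#', so a '#' inside a quoted
    value would truncate that line; good enough for a sketch.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        code = line.split("#", 1)[0]
        for match in PLACEHOLDER.finditer(code):
            hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = (
        'primary_region = "__REGION__"\n'
        "# fill in __DOMAIN__ below\n"
        'cmd = ["--domain", "__DOMAIN__"]\n'
    )
    for lineno, token in find_placeholders(sample):
        print(f"fly.toml:{lineno}: unfilled placeholder {token}")
```

Skipping comments is a deliberate choice in this sketch, so that a header comment explaining the placeholders does not itself trip the guard.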
Summary
- `fly.toml` at repo root: TCP passthrough on `:80`/`:443` (no Fly HTTP/TLS handler; autocert stays the cert holder), volume mount at `/var/lib/relay/autocert`, `--domain`/`--cert-cache` as literal argv (distroless has no shell), single-machine cap via `min_machines_running=1` + `auto_start_machines=false` + `auto_stop_machines="off"` + `immediate` deploy strategy.
- `.github/workflows/ci.yml`: new `deploy` job, runs only on push to `main`, `needs: [test, security, image-scan]`, `permissions: contents: read`, `superfly/flyctl-actions/setup-flyctl` SHA-pinned to `1.6` with a `# Tracks:` comment matching the Trivy / govulncheck convention.
- `docs/deploy.md`: bootstrap (`flyctl apps create`, `flyctl ips allocate-v4`, `flyctl volumes create`, DNS, `FLY_API_TOKEN` secret), steady-state flow, rollback by image digest (preferred) and by release number.
- `docs/architecture.md` § Hosting: decision record for Fly.io + relay-terminated TLS + single-machine-via-fly.toml, per AC #5 (`architecture.md` is the permitted equivalent of PROJECT-MEMORY, which this role cannot edit).

Issue
Closes #38. Spec: `docs/specs/architecture/38-fly-deploy-manifest.md`.

Notes for review
- `<REGION>` and `<DOMAIN>` ship as loud placeholders (`__REGION__`, `__DOMAIN__`). The spec offered either real values or recognisable placeholders; without operator-supplied real values I picked the loud-placeholder path, so the first CI deploy fails with a clear "fill in fly.toml" signal rather than producing a silently-misconfigured production relay. `docs/deploy.md` step 1 calls this out.
- `flyctl validate fly.toml` was not run because `flyctl` is not installed in this environment. The first CI deploy on `main` is the smoke test; spec § Testing strategy lists this as expected. TOML parses cleanly via `tomllib`.
- `max_machines` is not a key in current Fly Apps v2; the single-machine cap is enforced via the four declarative knobs above, the in-binary `PYRYCODE_RELAY_SINGLE_INSTANCE` self-check (#65), and operator discipline at `flyctl scale count`. Documented in `fly.toml`'s header comment.
- The update to `docs/threat-model.md` § Deploy security (to reflect Fly-account hardening alongside VPS) is not included; it is flagged as a follow-up, not blocking.

Testing
- `go vet ./...` clean.
- `go test -race ./...` passes (no Go code changed).
- `go build ./cmd/pyrycode-relay` builds.
- `fly.toml` TOML-parses; CI workflow YAML-parses.

Architecture compliance
Matches the spec's Design § for all four files: `fly.toml` shape (TCP passthrough, no Fly handlers, mount path, argv form, single-machine knobs, `immediate` strategy), CI deploy job (branch gate, `needs:` chain, `permissions: contents: read`, SHA-pinned action with `# Tracks:` comment), `docs/deploy.md` three-section structure, and `docs/architecture.md` § Hosting placement between § Single-instance constraint (v1) and § Threat model.

🤖 Generated with Claude Code