feat(update): roll out automatic updates in staged batches via CDN manifest#691
liruifengv wants to merge 5 commits into
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: aee481f759
    try {
      const isInteractive =
        options.isTTY ?? (process.stdin.isTTY && process.stdout.isTTY);
      const deviceId = resolveUpdateDeviceId();
Preserve first-launch telemetry before bucketing
On a first-ever normal launch, handleMainCommand runs update preflight before runShell/runPrompt initialize telemetry, and this call now creates ~/.kimi-code/device_id without an onFirstLaunch callback. When telemetry bootstrap runs later it sees an existing id and never emits the first_launch event, so default users with auto-update enabled lose first-launch attribution just by passing through the rollout check.
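One possible way to keep both behaviors is sketched below: the update preflight may still create the id, but it records that the first-launch event has not been emitted yet, and telemetry bootstrap consumes that marker later. Everything here is hypothetical and for illustration only: `DeviceState`, `DeviceStateStore`, `resolveDeviceIdForUpdate`, `bootstrapTelemetry`, and the `firstLaunchPending` field are not taken from the PR, and the real storage layout of `~/.kimi-code/device_id` may differ.

```typescript
// Hypothetical persisted state; the real file may hold only the id.
interface DeviceState {
  deviceId: string;
  firstLaunchPending: boolean;
}

// Hypothetical storage abstraction so the sketch does not touch the disk.
interface DeviceStateStore {
  load(): DeviceState | null;
  save(state: DeviceState): void;
}

// Called by the update preflight: creates the id if needed, but leaves
// the first-launch event pending instead of silently dropping it.
function resolveDeviceIdForUpdate(
  store: DeviceStateStore,
  generate: () => string,
): string {
  const existing = store.load();
  if (existing !== null) return existing.deviceId;
  const created: DeviceState = { deviceId: generate(), firstLaunchPending: true };
  store.save(created);
  return created.deviceId;
}

// Called later by telemetry bootstrap: emits first_launch exactly once,
// regardless of which code path created the id.
function bootstrapTelemetry(
  store: DeviceStateStore,
  generate: () => string,
  emitFirstLaunch: (deviceId: string) => void,
): string {
  const state: DeviceState =
    store.load() ?? { deviceId: generate(), firstLaunchPending: true };
  if (state.firstLaunchPending) {
    emitFirstLaunch(state.deviceId);
    store.save({ ...state, firstLaunchPending: false });
  }
  return state.deviceId;
}
```

With this shape, the order in which preflight and telemetry run no longer matters for attribution; passing an `onFirstLaunch` callback into the existing resolver would be an equally reasonable alternative.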
Related Issue
No prior issue — internal release-engineering work; the problem is explained below.
Problem
Every release currently becomes visible to the entire fleet at once: clients check the plain-text /latest pointer on the CDN, so a bad release reaches 100% of auto-updating devices within hours. There is no way to limit the blast radius, observe a ramp, or stop a rollout without shipping another release.
What changed
Update checks now prefer a CDN latest.json manifest and stage the release across device batches:

    {
      "version": "0.15.0",
      "publishedAt": "2026-06-12T08:00:00Z",
      "rollout": [
        { "percent": 30, "delaySeconds": 0 },
        { "percent": 30, "delaySeconds": 43200 },
        { "percent": 40, "delaySeconds": 86400 }
      ]
    }

- Each device is assigned a bucket from sha256(deviceId:version) (reshuffled every release); batches claim bucket ranges in order. A device's batch delay gates all passive update surfaces (background auto-install, startup prompt, manual-command notice): before publishedAt + delaySeconds the device sees no update at all. Eligibility is a pure local-time check, so a cached manifest flips to eligible without any further network access.
- kimi upgrade is never gated. Manual upgrades always install the newest version.
- If latest.json is missing or malformed, the check falls back to the existing plain-text /latest and behaves exactly like today (fully rolled out). The manifest parser ignores unknown fields so future additions cannot brick shipped clients. The plain-text /latest keeps serving legacy clients and the install scripts unchanged.
- KIMI_CODE_EXPERIMENTAL_FLAG opts a device out of staging (always newest); KIMI_CODE_NO_AUTO_UPDATE still disables everything and takes precedence. Operationally, publishing rollout: [] releases to everyone immediately, and reverting version halts a ramp without downgrading devices that already updated (semver gt never downgrades).
- Each check records its decision (no-latest/not-newer/no-manifest/held/eligible/experimental, with bucket, delay, and eligibleAt) to <data>/updates/rollout.log (size-capped, best-effort), and update telemetry gains rollout_bucket/rollout_delay_seconds/rollout_from_manifest/rollout_bypassed properties for watching the ramp.
- The update cache gains a manifest field: legacy cache files keep parsing (manifest: null = pre-rollout behavior), and older clients reading the new cache fall back safely to an empty cache.

This approach needs no server-side logic: the CDN stays a static file host, the schedule is published once per release, and the ramp advances by time alone.
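The bucketing and time gate described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the names `rolloutBucket`, `delayForBucket`, `eligibleAt`, and `isEligible` are hypothetical, and the mapping from digest to bucket (first four bytes as an unsigned integer, modulo 100) is an assumption, since the description only says the bucket comes from sha256(deviceId:version).

```typescript
import { createHash } from "node:crypto";

interface RolloutBatch {
  percent: number;
  delaySeconds: number;
}

interface Manifest {
  version: string;
  publishedAt: string;
  rollout: RolloutBatch[];
}

// Assumption: bucket = first 4 digest bytes as uint32, modulo 100.
// Including the version in the hash input reshuffles devices every release.
function rolloutBucket(deviceId: string, version: string): number {
  const digest = createHash("sha256").update(`${deviceId}:${version}`).digest();
  return digest.readUInt32BE(0) % 100;
}

// Batches claim bucket ranges in order, e.g. 30/30/40 -> [0,30), [30,60), [60,100).
function delayForBucket(bucket: number, rollout: RolloutBatch[]): number {
  let upperBound = 0;
  for (const batch of rollout) {
    upperBound += batch.percent;
    if (bucket < upperBound) return batch.delaySeconds;
  }
  // An empty rollout releases to everyone immediately. Treating buckets that
  // no batch covers the same way is an assumption of this sketch.
  return 0;
}

function eligibleAt(manifest: Manifest, deviceId: string): Date {
  const bucket = rolloutBucket(deviceId, manifest.version);
  const delaySeconds = delayForBucket(bucket, manifest.rollout);
  return new Date(Date.parse(manifest.publishedAt) + delaySeconds * 1000);
}

// Pure local-time check: a cached manifest flips to eligible with no network access.
function isEligible(manifest: Manifest, deviceId: string, now: Date = new Date()): boolean {
  return now.getTime() >= eligibleAt(manifest, deviceId).getTime();
}
```

With the example manifest, buckets 0 to 29 are eligible at publishedAt, buckets 30 to 59 twelve hours later (43200 s), and buckets 60 to 99 after 24 hours (86400 s).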
Verified end-to-end against a real release: staged batch held → manual upgrade bypass → time-based flip → background native install of the published binary.
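The fallback behavior described under What changed (a missing or malformed latest.json falls back to the plain-text /latest, and unknown fields are ignored) could look roughly like this. It is a minimal sketch under assumptions: `parseManifest` and `resolveLatest` are hypothetical names, and the exact validation rules (for instance, treating a missing rollout array as malformed) are guesses rather than the PR's actual parser.

```typescript
interface RolloutBatch {
  percent: number;
  delaySeconds: number;
}

interface Manifest {
  version: string;
  publishedAt: string;
  rollout: RolloutBatch[];
}

// Returns null for anything malformed so the caller can fall back.
// Only known fields are copied, so unknown fields are ignored.
function parseManifest(text: string): Manifest | null {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const obj = raw as Record<string, unknown>;
  if (typeof obj.version !== "string" || typeof obj.publishedAt !== "string") return null;
  if (Number.isNaN(Date.parse(obj.publishedAt))) return null;
  if (!Array.isArray(obj.rollout)) return null; // assumption: rollout is required

  const rollout: RolloutBatch[] = [];
  for (const entry of obj.rollout) {
    if (typeof entry !== "object" || entry === null) return null;
    const { percent, delaySeconds } = entry as Record<string, unknown>;
    if (typeof percent !== "number" || typeof delaySeconds !== "number") return null;
    if (!Number.isFinite(percent) || !Number.isFinite(delaySeconds)) return null;
    if (percent < 0 || delaySeconds < 0) return null;
    rollout.push({ percent, delaySeconds });
  }
  return { version: obj.version, publishedAt: obj.publishedAt, rollout };
}

// manifest: null means pre-rollout behavior, i.e. fully rolled out.
function resolveLatest(
  manifestText: string | null,
  plainLatest: string | null,
): { version: string; manifest: Manifest | null } | null {
  const manifest = manifestText === null ? null : parseManifest(manifestText);
  if (manifest !== null) return { version: manifest.version, manifest };
  const version = plainLatest?.trim();
  return version ? { version, manifest: null } : null;
}
```

Returning `manifest: null` on the fallback path lines up with the cache convention in the description, where a null manifest stands for pre-rollout behavior.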
Checklist
- gen-changesets skill, or this PR needs no changeset.
- gen-docs skill, or this PR needs no doc update.