feat(update): roll out automatic updates in staged batches via CDN manifest #691

Open
liruifengv wants to merge 5 commits into main from auto-update

liruifengv (Collaborator) commented Jun 12, 2026

Related Issue

No prior issue — internal release-engineering work; the problem is explained below.

Problem

Every release currently becomes visible to the entire fleet at once: clients check the plain-text /latest pointer on the CDN, so a bad release reaches 100% of auto-updating devices within hours. There is no way to limit the blast radius, observe a ramp, or stop a rollout without shipping another release.

What changed

Update checks now prefer a CDN latest.json manifest and stage the release across device batches:

{
  "version": "0.15.0",
  "publishedAt": "2026-06-12T08:00:00Z",
  "rollout": [
    { "percent": 30, "delaySeconds": 0 },
    { "percent": 30, "delaySeconds": 43200 },
    { "percent": 40, "delaySeconds": 86400 }
  ]
}
  • Deterministic batching. Devices hash into buckets 0–99 via sha256(deviceId:version) (reshuffled every release); batches claim bucket ranges in order. A device's batch delay gates all passive update surfaces (background auto-install, startup prompt, manual-command notice): before publishedAt + delaySeconds the device sees no update at all. Eligibility is a pure local-time check, so a cached manifest flips to eligible without any further network access.
  • 24h hard ceiling. Delays are clamped to 86400s and buckets left uncovered by the plan fall into the slowest cohort, so every device sees a release at most 24h after publish no matter what the published plan says.
  • kimi upgrade is never gated. Manual upgrades always install the newest version.
  • Safe fallback. When latest.json is missing or malformed, the check falls back to the existing plain-text /latest and behaves exactly like today (fully rolled out). The manifest parser ignores unknown fields so future additions cannot brick shipped clients. The plain-text /latest keeps serving legacy clients and the install scripts unchanged.
  • Escape hatches. KIMI_CODE_EXPERIMENTAL_FLAG opts a device out of staging (always newest); KIMI_CODE_NO_AUTO_UPDATE still disables everything and takes precedence. Operationally, publishing rollout: [] releases to everyone immediately, and reverting version halts a ramp without downgrading devices that already updated (semver gt never downgrades).
  • Observability. Each version check appends one JSONL line (no-latest / not-newer / no-manifest / held / eligible / experimental, with bucket, delay, and eligibleAt) to <data>/updates/rollout.log (size-capped, best-effort), and update telemetry gains rollout_bucket / rollout_delay_seconds / rollout_from_manifest / rollout_bypassed properties for watching the ramp.
  • The local update cache gains an optional manifest field: legacy cache files keep parsing (manifest: null = pre-rollout behavior), and older clients reading the new cache fall back safely to an empty cache.
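
The batching and eligibility rules in the list above can be sketched as follows. This is a minimal illustration, not the PR's implementation. The names `bucketFor`, `delayForBucket`, `eligibleAtMs`, and `isEligible` are hypothetical. The description does not say how the sha256 digest is reduced to a bucket, so the sketch assumes the first four digest bytes modulo 100. It also assumes that "slowest cohort" means the largest clamped delay in the published plan, which is consistent with `rollout: []` releasing to everyone immediately.

```typescript
import { createHash } from "node:crypto";

interface RolloutBatch { percent: number; delaySeconds: number }
interface Manifest { version: string; publishedAt: string; rollout: RolloutBatch[] }

// 24h hard ceiling on any batch delay, as stated in the description.
const MAX_DELAY_SECONDS = 86_400;

// Assumption: the bucket is the first 4 digest bytes (big-endian) modulo 100.
// Hashing deviceId together with version reshuffles devices every release.
function bucketFor(deviceId: string, version: string): number {
  const digest = createHash("sha256").update(`${deviceId}:${version}`).digest();
  return digest.readUInt32BE(0) % 100;
}

// Batches claim bucket ranges in order: [0, p1), [p1, p1 + p2), ...
// Assumption: buckets not covered by the plan get the largest clamped delay.
function delayForBucket(bucket: number, rollout: RolloutBatch[]): number {
  let upper = 0;
  let slowest = 0;
  for (const batch of rollout) {
    const delay = Math.min(Math.max(batch.delaySeconds, 0), MAX_DELAY_SECONDS);
    slowest = Math.max(slowest, delay);
    upper += batch.percent;
    if (bucket < upper) return delay;
  }
  return slowest; // 0 when the plan is empty, i.e. fully rolled out
}

// Eligibility depends only on the cached manifest and the local clock.
function eligibleAtMs(manifest: Manifest, deviceId: string): number {
  const bucket = bucketFor(deviceId, manifest.version);
  const delay = delayForBucket(bucket, manifest.rollout);
  return Date.parse(manifest.publishedAt) + delay * 1000;
}

function isEligible(manifest: Manifest, deviceId: string, nowMs: number = Date.now()): boolean {
  return nowMs >= eligibleAtMs(manifest, deviceId);
}
```

Under these assumptions, the example manifest above makes buckets 0–29 eligible at publish time, buckets 30–59 twelve hours later, and buckets 60–99 after 24 hours.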

This approach needs no server-side logic — the CDN stays a static file host, the schedule is published once per release, and the ramp advances by time alone. Verified end-to-end against a real release: staged batch held → manual upgrade bypass → time-based flip → background native install of the published binary.
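
Since the CDN only serves static files, all validation has to happen on the client. Below is a minimal sketch of a tolerant manifest parser, assuming a hypothetical `parseManifest` helper and validation rules that the PR description does not spell out. It ignores unknown fields and returns `null` for anything malformed, so the caller can fall back to the plain-text /latest pointer.

```typescript
interface RolloutBatch { percent: number; delaySeconds: number }
interface UpdateManifest { version: string; publishedAt: string; rollout: RolloutBatch[] }

// Hypothetical helper: returns null instead of throwing so the caller can
// fall back to the legacy plain-text pointer.
function parseManifest(text: string): UpdateManifest | null {
  let raw: unknown;
  try {
    raw = JSON.parse(text);
  } catch {
    return null; // not JSON at all, e.g. a plain-text version string
  }
  if (typeof raw !== "object" || raw === null) return null;
  const obj = raw as Record<string, unknown>;

  // Only the known fields are read; anything else in the object is ignored.
  if (typeof obj.version !== "string") return null;
  if (typeof obj.publishedAt !== "string") return null;
  if (Number.isNaN(Date.parse(obj.publishedAt))) return null;
  if (!Array.isArray(obj.rollout)) return null;

  const rollout: RolloutBatch[] = [];
  for (const entry of obj.rollout) {
    if (typeof entry !== "object" || entry === null) return null;
    const { percent, delaySeconds } = entry as Record<string, unknown>;
    if (typeof percent !== "number" || typeof delaySeconds !== "number") return null;
    rollout.push({ percent, delaySeconds });
  }
  return { version: obj.version, publishedAt: obj.publishedAt, rollout };
}
```

Copying only the known fields into a fresh object keeps later manifest additions from reaching code that does not expect them.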

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update.

changeset-bot Bot commented Jun 12, 2026

⚠️ No Changeset found

Latest commit: 7602b43

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

pkg-pr-new Bot commented Jun 12, 2026

pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@7602b43
npx https://pkg.pr.new/@moonshot-ai/kimi-code@7602b43

commit: 7602b43

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aee481f759

try {
  const isInteractive =
    options.isTTY ?? (process.stdin.isTTY && process.stdout.isTTY);
  const deviceId = resolveUpdateDeviceId();

P2: Preserve first-launch telemetry before bucketing

On a first-ever normal launch, handleMainCommand runs update preflight before runShell/runPrompt initialize telemetry, and this call now creates ~/.kimi-code/device_id without an onFirstLaunch callback. When telemetry bootstrap runs later it sees an existing id and never emits the first_launch event, so default users with auto-update enabled lose first-launch attribution just by passing through the rollout check.
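
One possible way to address this is sketched below with hypothetical names (`DeviceState`, `resolveDeviceIdForUpdate`, `bootstrapTelemetry`) and an in-memory state object standing in for the on-disk ~/.kimi-code/device_id file. The update check records that it created the id, and telemetry bootstrap emits first_launch when it finds that marker. This is only an illustration of the ordering problem, not the fix adopted in the PR.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical stand-in for whatever is persisted next to the device id.
interface DeviceState {
  deviceId?: string;
  firstLaunchPending?: boolean;
}

// Called from the update preflight, which runs before telemetry is set up.
// If it has to create the id, it leaves a marker instead of losing the event.
function resolveDeviceIdForUpdate(state: DeviceState): string {
  if (state.deviceId === undefined) {
    state.deviceId = randomUUID();
    state.firstLaunchPending = true;
  }
  return state.deviceId;
}

// Called later, once telemetry is initialized. Emits first_launch either when
// it creates the id itself or when the update check created it earlier.
function bootstrapTelemetry(
  state: DeviceState,
  onFirstLaunch: (deviceId: string) => void,
): string {
  const existed = state.deviceId !== undefined;
  const id = state.deviceId ?? (state.deviceId = randomUUID());
  if (!existed || state.firstLaunchPending === true) {
    state.firstLaunchPending = false;
    onFirstLaunch(id);
  }
  return id;
}
```

An alternative with the same effect would be to pass an onFirstLaunch callback into the update-path id resolution and queue the event until telemetry is ready.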
