bug: update cs2 for demo renders, fail-fast when it cannot launch #26

Open
Flegma wants to merge 2 commits into main from fix/cs2-update-and-launch-failfast

Flegma commented Jun 2, 2026

Problem

CS2 demo renders could hang forever with no UI error. Two causes:

  1. Stale CS2 build. Demo renders skip the CS2 update because install_cs2_via_steamcmd skips steamcmd whenever CS2 is already installed, which keeps live spectate pinned to the game-server build. After a Valve patch the cached build goes stale, and the Steam client refuses to launch an out-of-date game ("An error occurred while launching this game: Update required"), so cs2 never starts.
  2. No-timeout launch wait. wait_for_cs2_process waits indefinitely on purpose, so that a cold shader compile is never killed. As a result, when cs2 cannot launch for any reason, the render sits at "waiting on cs2" forever and pins the only GPU node, with nothing shown in the UI.

Changes (src/lib/steam.sh)

1. Keep CS2 current for demo renders (install_cs2_via_steamcmd).
It previously skipped steamcmd entirely when CS2 was already installed (to pin live spectate to the game-server build). Now:

  • Live (DEMO_URL empty): unchanged, skip and stay pinned.
  • Demo (DEMO_URL set): run a fast steamcmd +app_update 730 (no validate, a no-op when already current) so CS2 is on the current build and launchable, then register_library.

Demos have no live server to match, so the pin does not apply to them. This re-introduces the "always update" approach tried in 47bb304 and reverted in b5c9fe6 (it broke live build-pinning), now gated to demos so the revert's reason no longer applies.
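
The gating above can be sketched as a small decision function. This is an illustration only, not the code in src/lib/steam.sh: the helper name cs2_install_action, its arguments, and the "install" branch for a machine with no cached CS2 are assumptions made for the example. Only the DEMO_URL check and the skip/update split come from the description.

```shell
# Hypothetical helper (not from the PR): decide what to do with the CS2 install.
#   $1 = value of DEMO_URL (empty for live spectate)
#   $2 = "1" if CS2 is already installed, anything else otherwise
# Prints one of: install, skip, update.
cs2_install_action() {
  local demo_url="$1" installed="$2"

  if [ "$installed" != "1" ]; then
    # Assumption: with nothing cached, steamcmd has to run in both modes.
    echo "install"
  elif [ -z "$demo_url" ]; then
    # Live: unchanged, skip steamcmd and stay pinned to the game-server build.
    echo "skip"
  else
    # Demo: no live server to match, so run the fast
    # "steamcmd +app_update 730" (no validate), then register_library.
    echo "update"
  fi
}
```

Under these assumptions, `cs2_install_action "$DEMO_URL" 1` prints `skip` for a live session and `update` for a demo render, so the live path never reaches steamcmd.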

2. Fail fast when cs2 cannot launch (wait_for_cs2_process).
The launch wait now bails when cs2 genuinely will not start. If no shader compile was ever seen, no game-file validation is running, and CS2_LAUNCH_TIMEOUT (default 360s) has elapsed, it calls die. die broadcasts status=error (with the reason) to every batch job, so the failure shows in the render queue UI, and then exits so the pod is reaped and the GPU node is freed.
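
To make the error path concrete, here is a rough sketch of how a die helper could fan an error out to every batch job before exiting. The real die and broadcast_batch_error live elsewhere in the repo and their signatures are not shown in this PR, so everything here is a stand-in: BATCH_JOB_IDS, report_status, and the output format are hypothetical.

```shell
# Hypothetical stand-in for whatever reports job status to the render queue.
report_status() {
  printf 'job=%s status=%s reason=%s\n' "$1" "$2" "$3"
}

# Sketch of broadcast_batch_error: mark every job in the batch as errored.
# BATCH_JOB_IDS is an assumed space-separated list of job ids.
broadcast_batch_error() {
  local reason="$1" job
  # Word splitting on BATCH_JOB_IDS is intentional here.
  for job in $BATCH_JOB_IDS; do
    report_status "$job" error "$reason"
  done
}

# Sketch of die: surface the reason on every job, then exit non-zero
# so the pod is reaped and the GPU node is released.
die() {
  broadcast_batch_error "$1"
  exit 1
}
```

Because die exits the shell, a caller that wants to inspect its output has to run it in a subshell, for example `( BATCH_JOB_IDS="job-1 job-2"; die "cs2 failed to launch" )`.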

The guards (shaders_seen=0, validating_active=0) ensure a long-but-legitimate cold shader compile or integrity check is never aborted. The 360s default leaves headroom for a slow Steam first boot, where cs2 can take about 2 minutes to spawn. The timeout is overridable via CS2_LAUNCH_TIMEOUT.
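
The three-part condition can be written as a single predicate. The sketch below is a minimal illustration, not the code in wait_for_cs2_process: the function name cs2_launch_timed_out and the idea of passing elapsed seconds as an argument are assumptions, as is treating "has elapsed" as greater-than-or-equal. The shaders_seen and validating_active flags and the CS2_LAUNCH_TIMEOUT default of 360 come from the description.

```shell
# Hypothetical predicate (not from the PR): should the launch wait give up?
#   $1 = shaders_seen       (1 once any shader compile has been observed)
#   $2 = validating_active  (1 while a game-file validation is running)
#   $3 = seconds elapsed since the wait started
# Returns 0 (true) only when all three fail-fast conditions hold.
cs2_launch_timed_out() {
  local shaders_seen="$1" validating_active="$2" elapsed="$3"
  local timeout="${CS2_LAUNCH_TIMEOUT:-360}"

  [ "$shaders_seen" -eq 0 ] &&
    [ "$validating_active" -eq 0 ] &&
    [ "$elapsed" -ge "$timeout" ]
}
```

With this shape, a compile that was seen even once (shaders_seen=1) disables the timeout for the rest of the wait, which matches the intent that a cold shader compile is never killed.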

Together: fix 1 makes demo renders launch in the common case, and fix 2 surfaces a clear UI error and frees the node if a render still cannot launch for any reason.

Validation

  • bash -n clean.
  • Reviewed: the live-pin path is byte-for-byte unchanged; DEMO_URL reliably distinguishes demo vs live in this env; die/broadcast_batch_error are in scope in the batch flow; the timeout cannot fire during a compile or validation and clears the slow first-boot; the pgrep early-return still takes precedence. The demo-gating directly addresses why 47bb304 was reverted.
  • Observed live: the stale-build case shows the Steam "Update required" dialog while the render sits at "waiting on cs2" indefinitely; this is the hang both changes resolve.

lukepolo and others added 2 commits June 1, 2026 21:17
Demo renders skipped the CS2 update to stay pinned to the game-server
build. After a Valve patch the cached build goes stale and Steam refuses
to launch it ("Update required"), so cs2 never starts and the render
hangs forever (the launch wait is deliberately no-timeout for cold
shader compiles).

- install_cs2_via_steamcmd: keep the live pin (DEMO_URL empty), but for
  demo renders run a fast app_update 730 (no validate) so CS2 is current
  and launchable. Re-introduces the reverted "always update" (47bb304),
  now gated to demos so it does not break live build-pinning.
- wait_for_cs2_process: if cs2 never starts and no shader compile or
  game-file validation is running, die after CS2_LAUNCH_TIMEOUT (360s),
  which marks every batch job errored in the UI and frees the node.

lukepolo force-pushed the main branch 6 times, most recently from 7e9f6f4 to 495754a, June 2, 2026 14:45