Add kernelCTF CVE-2026-31419_cos (fix CI hang) #398

Closed
winmin wants to merge 8 commits into google:master from winmin:fix-cve-2026-31419

winmin (Contributor) commented May 30, 2026

Summary

  • Based on PR #397 (Add kernelCTF CVE-2026-31419_cos) by @n132, which contains the CVE-2026-31419_cos exploit
  • Fixed the exploit's infinite loops, which caused CI vuln_verify to report "VM hanged before running exploit"
  • Added MAX_RACE_ATTEMPTS (2000), MAX_EXPLOIT_RETRIES (50), and EXPLOIT_TIMEOUT_SEC (120s) to ensure a clean exit on failure

Changes from original PR

  • exploit.c: Inner race while(1) → bounded loop (2000 iterations)
  • exploit.c: Outer retry while(1) → bounded loop (50 retries, 120s timeout)
  • Recompiled static binary with the fixes
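
As a rough illustration of the two bounded loops listed above (a sketch only, not the actual exploit.c code): the three constants and their values come from the summary, while `race_fn`, `run_race`, `run_exploit`, and the `attempt_counter` demo callback are hypothetical names introduced here. The sketch also assumes `time()` is good enough for the deadline; the real code may use a different clock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define MAX_RACE_ATTEMPTS   2000  /* inner race bound (value from the summary) */
#define MAX_EXPLOIT_RETRIES 50    /* outer retry bound (value from the summary) */
#define EXPLOIT_TIMEOUT_SEC 120   /* overall time budget (value from the summary) */

/* Hypothetical callback type: returns true when one race attempt wins. */
typedef bool (*race_fn)(void *ctx);

/* Inner loop: a bounded for-loop in place of while(1). */
static bool run_race(race_fn try_race_once, void *ctx)
{
    for (int i = 0; i < MAX_RACE_ATTEMPTS; i++) {
        if (try_race_once(ctx))
            return true;
    }
    return false;
}

/* Outer loop: bounded by both the retry count and the elapsed time.
 * Returns 0 on success and 1 on failure, so main() can exit cleanly
 * instead of spinning forever on a patched kernel. */
int run_exploit(race_fn try_race_once, void *ctx)
{
    time_t start = time(NULL);

    for (int retry = 0; retry < MAX_EXPLOIT_RETRIES; retry++) {
        if (difftime(time(NULL), start) >= EXPLOIT_TIMEOUT_SEC)
            break;
        if (run_race(try_race_once, ctx))
            return 0;
    }
    return 1;
}

/* Hypothetical stand-in for the real race: wins on call number win_at,
 * or never when win_at is 0. */
struct attempt_counter {
    long calls;
    long win_at;
};

bool count_and_maybe_win(void *ctx)
{
    struct attempt_counter *c = ctx;
    c->calls++;
    return c->win_at > 0 && c->calls == c->win_at;
}
```

With a callback that never wins, `run_exploit` gives up after at most 50 × 2000 attempts or 120 seconds, whichever comes first, and the non-zero return value lets CI tell a failed exploit apart from a hung VM.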

Test plan

  • CI vuln_verify should no longer hang on patched KASAN kernel (clean exit instead)
  • CI vuln_verify should no longer hang on COS kernel (clean exit or successful exploit)
  • Unpatched KASAN kernel should still trigger KASAN UAF

n132 and others added 8 commits May 29, 2026 22:05
The exploit's infinite retry loops caused CI verification to hang:
- main() while(1) loop retried forever on patched kernels where the
  race can never succeed
- exploit() inner race loop also had no bound

Add MAX_EXPLOIT_RETRIES (50), EXPLOIT_TIMEOUT_SEC (120s), and
MAX_RACE_ATTEMPTS (2000) so the exploit exits cleanly when it fails,
allowing the CI to distinguish "exploit failed" from "VM hung".

The original prefetch side-channel KASLR leak fails in CI's QEMU/KVM
environment, returning wrong kernel base addresses every time.

Replace with the approach from CVE-2025-21700 (adapted from IAIK prefetch
project) which uses:
- Asymmetric fences (mfence/lfence) for better timing precision
- prefetchnta + prefetcht2 combo instead of dual prefetcht0
- 16 iterations per probe (vs 12) for better noise averaging
- Boyer-Moore majority vote with 7 rounds and automatic retry
- Wider scan range (up to 0xffffffffD0000000)
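
The majority-vote step above could look roughly like the following. This is a minimal sketch of a textbook Boyer-Moore majority vote, not the code from either exploit: `majority_vote` and `VOTE_ROUNDS` are hypothetical names, and the per-round candidate addresses are assumed to come from a separate timing probe that is not shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VOTE_ROUNDS 7  /* number of probe rounds (value from the commit message) */

/* Boyer-Moore majority vote over n candidate kernel bases.
 * Returns true and stores the winner in *out only if one value
 * appears in more than half of the candidates. */
bool majority_vote(const uint64_t *cand, size_t n, uint64_t *out)
{
    uint64_t winner = 0;
    size_t count = 0;

    /* Pass 1: cancel out differing pairs; a true majority survives. */
    for (size_t i = 0; i < n; i++) {
        if (count == 0) {
            winner = cand[i];
            count = 1;
        } else if (cand[i] == winner) {
            count++;
        } else {
            count--;
        }
    }

    /* Pass 2: confirm the survivor really is a strict majority. */
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (cand[i] == winner)
            hits++;
    }
    if (hits * 2 <= n)
        return false;

    *out = winner;
    return true;
}
```

When `majority_vote` returns false, no candidate won more than half of the rounds; under this sketch the caller would rerun all `VOTE_ROUNDS` probes, which is one way to read "automatic retry".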

CI runners use AMD EPYC 7763, where prefetch timing is inverted:
mapped kernel pages show HIGHER latency (not lower, as on Intel).

Switch to the AMD-compatible technique from CVE-2025-39946:
- 2MB scan steps instead of 16MB (finer granularity for AMD)
- Sliding window of 11 consecutive entries to find the largest
  contiguous high-latency region (= mapped kernel text)
- 9 voting rounds with Boyer-Moore majority vote
- Same rdtsc/prefetch primitives (prefetchnta + prefetcht2)
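
One plausible reading of the sliding-window step, sketched under stated assumptions: the per-slot latencies are already measured (one value per 2MB step), and the "largest contiguous high-latency region" is taken to be the window of 11 consecutive slots with the highest summed latency. `best_window_start`, `slot_to_addr`, `WINDOW_SIZE`, and `SCAN_STEP` are hypothetical names, and the scan start address is left as a caller-supplied parameter because the commit message does not state it.

```c
#include <stddef.h>
#include <stdint.h>

#define WINDOW_SIZE 11            /* consecutive entries (from the commit message) */
#define SCAN_STEP   0x200000ULL   /* 2MB scan step (from the commit message) */

/* Returns the index of the first slot of the WINDOW_SIZE-wide window
 * with the highest total latency, or n if there are too few slots.
 * Ties keep the earliest window. */
size_t best_window_start(const uint64_t *lat, size_t n)
{
    if (n < WINDOW_SIZE)
        return n;

    /* Sum of the first window. */
    uint64_t sum = 0;
    for (size_t i = 0; i < WINDOW_SIZE; i++)
        sum += lat[i];

    uint64_t best = sum;
    size_t best_i = 0;

    /* Slide by one slot: add the entering entry, drop the leaving one. */
    for (size_t i = 1; i + WINDOW_SIZE <= n; i++) {
        sum = sum + lat[i + WINDOW_SIZE - 1] - lat[i - 1];
        if (sum > best) {
            best = sum;
            best_i = i;
        }
    }
    return best_i;
}

/* Converts a slot index back to an address, given the scan start. */
uint64_t slot_to_addr(uint64_t scan_start, size_t slot)
{
    return scan_start + (uint64_t)slot * SCAN_STEP;
}
```

Each voting round would produce one address this way, and the 9 resulting candidates would then go through the majority vote.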

The core_pattern flag-reading mechanism failed in CI because
crash(666, "/tmp/exp") couldn't find the binary at /tmp/exp.
Use /proc/self/exe, which always resolves to the current executable
regardless of the deployment path.
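
For illustration, resolving the running binary through /proc/self/exe might look like the sketch below. `self_exe_path` is a hypothetical helper, not a function from the exploit, and how the resolved path is fed into the core_pattern setup is not shown in this PR text, so that part is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolves the absolute path of the current executable (Linux only).
 * Returns the path length, or -1 on failure. readlink() does not
 * NUL-terminate its output, so the terminator is added here. */
ssize_t self_exe_path(char *buf, size_t size)
{
    if (size == 0)
        return -1;

    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;

    buf[n] = '\0';
    return n;
}
```

Because the link is resolved at run time, the result follows the binary wherever CI deploys it, unlike a hardcoded /tmp/exp.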

n132 (Contributor) commented May 30, 2026

LLM saved my life

n132 (Contributor) commented May 30, 2026

It seems our exploit didn't work on the remote, but the CI is broken.

winmin closed this May 31, 2026