Skip to content

fix(security): close bugs-rust-wont-catch Categories 1/2/3 + partial 4/5/6/8 (13 WIs)#346

Merged
githubrobbi merged 15 commits into
mainfrom
harden/bugs-rust-wont-catch
Jun 4, 2026
Merged

fix(security): close bugs-rust-wont-catch Categories 1/2/3 + partial 4/5/6/8 (13 WIs)#346
githubrobbi merged 15 commits into
mainfrom
harden/bugs-rust-wont-catch

Conversation

@githubrobbi

@githubrobbi githubrobbi commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

Summary

Implements the bugs-rust-wont-catch hardening plan. This is an honest partial landing: 13 of 20 work items — Phase A complete plus the surgical subset of B/C/E. The remaining 7 WIs are larger / cross-cutting / Windows-only and are tracked as follow-ups (see below). The plan's §2 "Definition of Done" is not yet met; this is reviewable, incremental progress.

Every commit is one WI, lint-fast green; the full pre-push gate (incl. cargo vet, vet-audit-discipline, and Windows-target lint-ci-windows) passes on the branch.

Landed (13 WIs)

Category WIs Result
1 TOCTOU 1.1, 1.2, 2.4 ✅ 100% — single-fd secure_remove; randomised create_new temps
2 Perms-after-create 2.1–2.4 ✅ 100% — create_new_secure_file/write_secret_file; born-0700 dir; keystore born-0600; atomic_write born-0600
3 Path identity 3.1 ✅ 100% — paths_identical (dev+ino / file-id) + tests
4 UTF-8 boundary 4.3 ✅, 4.4 RFC ✅ ~40% — strict-parse fail-closed at subprocess decision sites (closed a fail-open signature check); lossless-storage RFC landed
5 Panic=DoS 5.1 ~33% — [profile.dist] overflow-checks=true
6 Discarded errors 6.1, 6.2 ~66% — keepalive write failures logged; log-dir create failures surfaced (fixed a latent daemon panic-on-degrade)
8 Trust boundary 8.2 ~50% — shutdown-nonce security property documented
G Guard G.1 gate built (multi-line AUDIT-OK aware); not yet wired into the pipeline — still red on 36 byte-decode sites until 4.1/4.2

Two real bugs found & fixed along the way

  • Fail-open code-signature check (broker/verify): a lossy/mangled status slipped through the != "HashMismatch" accept path → now strict-decode + fail-closed.
  • Latent daemon crash (WI-6.2 test caught it): on an uncreatable log dir the code logged the error then still called the panicking rolling::never(...) → now degrades to stdout.

Documented deviations (no suppression hacks)

  • arithmetic_side_effects is not a workspace "warn" (the -D warnings gate would promote it to a hard error across ~1766 legitimate sites); WI-5.2 will enable it module-scoped on the parsers instead.
  • WI-4.2 reclassified as large (threads OsString through the uffs-client connect API + Windows CreateProcess builder + MCP wire field) — deferred with 4.1/5.2.
  • paths_identical Windows arm returns Unsupported (stable GetFileInformationByHandle FFI deferred until a caller needs it; the unstable windows_by_handle accessors are avoided).

Follow-ups (remaining 7 WIs — tracked in §1.1)

  • WI-4.1 instrumented UTF-16 decoder across ~22 sites (decoder foundation prototyped, stashed)
  • WI-4.2 OsString argv/IPC (large, API-coupled)
  • WI-5.2 parser checked_*/.get() hardening (hundreds of sites) → then raise the lint to deny
  • WI-5.3 malformed-input fuzz/regression corpus
  • WI-6.3 workspace-wide .ok()/let _ triage
  • WI-7.1 parity corpus (Windows-only)
  • WI-8.1 broker single-handle verify→grant (Windows-only)
  • Then: wire just audit-gate into the go lane (once the gate is green) — the final guardrail.

Notes

  • No new dependencies; Cargo.lock shows only the workspace version skew from the branch base predating the v0.5.110 ship — reconciles on merge.
  • LOG/<ts>_CHANGELOG_HEALING.md holds the per-WI healing record (gitignored, kept locally).

githubrobbi and others added 15 commits June 4, 2026 08:16
Companion audit + step-by-step WI runbook for closing the bug categories
from corrode.dev's "Bugs Rust Won't Catch" against the UFFS tree, with
regression guardrails. Spec for the harden/bugs-rust-wont-catch effort;
the work items (WI-2.1 … WI-3.1) land in subsequent commits, each
flipping its tracker row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add create_new_secure_file + write_secret_file to uffs-security::fs so
secret files are born with owner-only perms via O_CREAT|O_EXCL+mode (Unix)
/ create_new + owner-only ACL (Windows), closing the perms-after-create
window and refusing to follow a pre-planted symlink. Subsequent WIs route
key/atomic-write through these.

Tests: 0600 birth, rejects-existing, rejects-symlink (target untouched).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace create_dir_all + set_permissions(0700) with
DirBuilder::recursive(true).mode(0o700) on Unix, so the runtime/cache dir
(key, cache, socket, nonce) is never observable at the umask default
before being tightened. Windows arm keeps create_dir_all + owner-only ACL.

Tests: nested components each born 0700; idempotent re-create.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
file_based_key + dpapi_write_key now remove any stale key then
write_secret_file (born 0600 via O_CREAT|O_EXCL), replacing the
write-then-chmod pair that left the AES key / DPAPI blob briefly at the
umask default and could follow a symlink planted at the key path.

Test: remove-then-write pattern stays 0600 on first write and regen
(isolated temp path; never touches the shared real key.bin).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ut (WI-2.4, WI-1.2)

Both shared atomic_write and the daemon --out export wrote a predictable
<target>.uffs.tmp via File::create + chmod-after — a symlink-pluggable
temp name (TOCTOU) and a perms-after-create window. Both now use a
random-suffixed temp in the same dir, created via create_new_secure_file
(born 0600, refuses a pre-planted symlink), with error-cleanup.

Tests: atomic_write 0600 + 8-thread concurrent no-collision + overwrite;
daemon export ignores a symlink planted at the old predictable temp name.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the metadata(path) pre-stat + later open(path) with a single
open, reading the length from the open file (file.metadata()). The two
separate path resolutions were a TOCTOU window where the size used to
zero-fill and the bytes overwritten could target different objects.

Tests: zeroes-then-unlinks, absent-is-ok, symlink follow-target/
unlink-link semantics documented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ers (WI-5.1)

[profile.dist] now sets overflow-checks = true so shipped binaries (the
panic=abort daemon) catch integer wraps instead of silently corrupting.

The plan's workspace arithmetic_side_effects="warn" is not viable here:
the lint gate runs -D warnings, which would promote it to a hard error
across ~1766 legitimate sites well beyond the untrusted-input parsers.
Documented the decision to enable it module-scoped on the parser modules
in WI-5.2 instead (same Category-5 coverage, no false-positive failures).

(taplo realigned the lint-table comment column — whitespace only, no
lint values changed.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
scripts/ci/anti_pattern_gate.sh + `just audit-gate` forbid the audited
anti-patterns from returning to production code: silent lossy UTF decode,
predictable .uffs.tmp temp, set_permissions on secrets, raw key write,
discarded control writes. Deliberate exceptions use an inline
// AUDIT-OK(<category>): <reason> marker.

Committed but intentionally NOT yet wired into the go/lint-fast lane —
that lands in the final commit once WI-4.1/4.3/6.x clear the existing
hits, so it doesn't red-block intermediate WI commits. Verified it fails
on the pre-fix tree with the known from_utf16_lossy hits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The FNV-1a shutdown nonce is liveness/anti-cross-talk, NOT a
cryptographic authenticator — its confidentiality rests entirely on the
0700 runtime dir (WI-2.2). Document this at generate_nonce so the
property isn't silently over-trusted, and note the HMAC path if peer
authentication is ever required. No behaviour change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…(WI-4.3)

Trust/targeting decisions decoded child-process output with
from_utf8_lossy; the broker/verify code-signature checks even failed
OPEN on a mangled status. Convert decision sites to core::str::from_utf8
with fail-closed handling (broker verify_authenticode + client
verify_code_signature return false on invalid UTF-8; mcp ppid/name/
cmdline/etime parses return None). Log-only and per-line-robust sites
keep lossy decode with precise AUDIT-OK(bytes) markers.

Also harden anti_pattern_gate.sh to honor an AUDIT-OK marker anywhere in
the contiguous comment block above a match (not just the line directly
above), so multi-line justifications work.

uffs-broker is windows-only — its arm is verified on the Windows lane
(flagged in the healing log).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Both log-dir creates swallowed errors. Now each eprintln's the failure
once (tracing isn't up yet) under a scoped print_stderr expect.

The daemon path also had a latent crash: after logging the failure it
still called rolling::never(parent_dir,…), which PANICS when the parent
is absent — killing the detached daemon before IPC bind. Route a
create-dir failure to the stdout branch (try_init + return None) so it
actually degrades. A new test (init_tracing_degrades_when_log_dir_
uncreatable) exposed the panic and now locks in the degrade.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…WI-3.1)

Add paths_identical(first, second) comparing OS file identity (Unix
dev+ino, Windows volume-serial+file-index) not path strings, for scoping
decisions that need "same file". Audited §3: drive scoping already uses
the typed DriveLetter and path_contains is a deliberate substring filter,
so no string-compare safety bug existed.

The cumulative Phase-A + WI-3.1 additions pushed fs.rs over the 800-LOC
policy ceiling, so the test module moved to fs/tests.rs via #[path]
(repo-idiomatic, mirrors runtime_dir/tests.rs). fs.rs now 625 LOC.

Tests: hardlink/symlink-target identical; distinct files not identical.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add the WI-4.4 RFC for eliminating NTFS-name loss. Recommends Option B:
keep the UTF-8 name column for the common path (zero hot-path/formatter/
Polars-kernel change) + an optional WTF-8 sidecar keyed by record_index
for the rare lossy rows, behind a cache-format version bump. Covers
search/case-fold/output/migration/perf/acceptance.

Per the plan this WI ships the RFC only and remains 🟨 until maintainer
sign-off; WI-4.1 stays the in-place (measured, non-silent) mitigation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
keepalive_send_blocking dropped the write_all/flush results silently, so
a failed keepalive (daemon gone) was invisible. Chain the writes and log
at debug on error (daemon may be gone). Not propagated — the keepalive
task has no return channel and the next cycle self-corrects — but no
longer silent. Best-effort contract unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final-verification accounting: 13 of 20 WIs landed (Phase A complete +
surgical B/C/E). Category rollup now shows true coverage — 1/2/3 at 100%,
4/5/6/8 partial, 7 pending — and a status banner stating the §2
Definition of Done is not yet met. The anti-pattern gate is built but NOT
wired into the pipeline (still red on 36 byte-decode sites until WI-4.1/
4.2 land).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@githubrobbi githubrobbi changed the title harden: close 'Bugs Rust Won't Catch' Categories 1/2/3 + partial 4/5/6/8 (13 WIs) fix(security): close bugs-rust-wont-catch Categories 1/2/3 + partial 4/5/6/8 (13 WIs) Jun 4, 2026
@githubrobbi githubrobbi enabled auto-merge (squash) June 4, 2026 15:32
@githubrobbi githubrobbi merged commit 7c9fa8b into main Jun 4, 2026
23 checks passed
@githubrobbi githubrobbi deleted the harden/bugs-rust-wont-catch branch June 4, 2026 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant