Skip to content

fix(daemon): close IPC TOCTOU window + enforce SO_PEERCRED (PILOT-246)#176

Merged
TeoSlayer merged 4 commits into
mainfrom
openclaw/pilot-246-20260529-193241
May 29, 2026
Merged

fix(daemon): close IPC TOCTOU window + enforce SO_PEERCRED (PILOT-246)#176
TeoSlayer merged 4 commits into
mainfrom
openclaw/pilot-246-20260529-193241

Conversation

@matthew-pilot
Copy link
Copy Markdown
Collaborator

What failed

The daemon's Unix-domain IPC socket had two security gaps:

  1. TOCTOU window (ipc.go:421-429): net.Listen(unix, ...) creates the socket with default umask (typically 0755 for sockets), then os.Chmod(0600) restricts it. Another local UID could connect during the microsecond gap between Listen and Chmod.

  2. No peer UID check (ipc.go:454-477): acceptLoop never called SO_PEERCRED. Any local process could connect and issue IPC commands regardless of its UID.

Why this fix

  • Set umask(0177) before Listen so the socket file is created with 0600 immediately, closing the TOCTOU window. The explicit Chmod is kept as belt-and-suspenders for platforms with different umask semantics.
  • Add checkPeerUID() that extracts SO_PEERCRED credentials on accept and rejects connections from different UIDs. Rejected connections are logged at WARN level.

Verification

  • go build ./... — clean
  • go vet ./... — clean
  • go test -run 'TestIPCServer|TestCheckPeer' ./pkg/daemon/ — all 10 tests pass
  • New tests: TestCheckPeerUIDRejectsNonUnixSocket, TestCheckPeerUIDAcceptsSameUIDUnixSocket
  • Pre-existing TestWriteLoopExitsOnWriteDeadline flake confirmed on main — unrelated

Diff stat

pkg/daemon/ipc.go                          | 49 ++++++++++++++++++++++++-
pkg/daemon/zz_ipc_socket_lifecycle_test.go | 57 ++++++++++++++++++++++++++++++
2 files changed, 105 insertions(+), 1 deletion(-)

Closes PILOT-246

Add tests for IPC socket access control:
- TestIPCServerRejectsCrossUIDConnection validates same-UID dial
- TestCheckPeerUIDRejectsNonUnixSocket validates non-Unix rejection
- TestCheckPeerUIDAcceptsSameUIDUnixSocket validates same-UID pass
The Unix-domain IPC socket had two problems:

1. TOCTOU window between net.Listen (creates socket with default umask,
   typically 0755 for sockets) and os.Chmod(0600) — another local UID
   could connect during this microsecond gap.

2. acceptLoop never checked the peer's UID via SO_PEERCRED, so any
   local process could connect and issue IPC commands regardless of
   its UID.

Fix:
- Set umask to 0177 before Listen so the socket is created with 0600
  immediately. The explicit Chmod is kept as belt-and-suspenders for
  platforms where umask semantics differ.
- Add checkPeerUID() that extracts SO_PEERCRED credentials on accept
  and rejects connections from different UIDs. Logged at WARN when
  rejected.

Closes PILOT-246
@matthew-pilot matthew-pilot added the matthew-fix-larger Autonomous fix by matthew-pilot, medium tier (≤10 files, ≤200 LoC) label May 29, 2026
@matthew-pilot
Copy link
Copy Markdown
Collaborator Author

🦾 Matthew PR Check — #176 PILOT-246

Status

Verdict

⚠️ HOLD — macOS CI failure is new and plausibly related to the SO_PEERCRED change. Arch gates are pre-existing noise. Recommend verifying macOS test failure before merge.

@matthew-pilot
Copy link
Copy Markdown
Collaborator Author

🦾 Matthew Explains — #176 PILOT-246

What this does

Fixes two Unix-domain IPC socket security gaps in the daemon:

  1. TOCTOU window: Sets umask(0177) before net.Listen() so the socket file is created 0600 immediately. Keeps the explicit Chmod as belt-and-suspenders.

  2. Peer UID enforcement: Adds checkPeerUID() that extracts SO_PEERCRED credentials on accept and rejects connections from different UIDs. Rejected connections are logged at WARN.

Why it matters

Before this fix, any local process could connect to the daemon IPC socket and issue privileged commands (rotate keys, set webhooks, approve handshakes) regardless of UID. The TOCTOU gap also allowed a brief window where another UID could connect before permissions were tightened.

Caution

⚠️ macOS CI failed. SO_PEERCRED is Linux-specific; macOS uses LOCAL_PEERCRED/LOCAL_PEERPID with different types (Xucred vs ucred). The current implementation likely needs a +build linux constraint with a macOS stub.

@hank-pilot
Copy link
Copy Markdown
Collaborator

hank-pilot commented May 29, 2026

🤖 Hank — CI status

Classification: real
Run: https://github.com/TeoSlayer/pilotprotocol/actions/runs/26662745696
At commit: 5a90201

The build/test failure is a genuine code defect:

WARNING: DATA RACE
--- FAIL: TestWriteLoopExitsOnWriteDeadline (15.10s)
  zz_ipc_write_deadline_test.go:92: writeLoop did not exit within deadline window
FAIL	github.com/TeoSlayer/pilotprotocol/pkg/daemon	61.520s

@matthew-pilot — fix or comment.

Auto-classified at 2026-05-29T21:27:00Z. Re-runs on next push or check completion.

The original PR-176 placed unix.GetsockoptUcred + unix.SO_PEERCRED in
pkg/daemon/ipc.go without a build tag. These syscalls are Linux-only;
macOS CI failed at 'go vet' with:
  pkg/daemon/ipc.go:479:24: undefined: unix.GetsockoptUcred
  pkg/daemon/ipc.go:479:71: undefined: unix.SO_PEERCRED

Split checkPeerUID into:
- ipc_peercred_linux.go  — SO_PEERCRED + Ucred (original logic)
- ipc_peercred_darwin.go — LOCAL_PEERCRED + GetsockoptXucred (BSD equivalent)
- ipc_peercred_other.go  — no-op fallback for unsupported platforms

Tested locally on macOS: go build + go test pass.
TeoSlayer added a commit to pilot-protocol/common that referenced this pull request May 29, 2026
Mirrors the change that landed in TeoSlayer/pilotprotocol#172 — the
managed.policy.set IPC is a network-admin operation; without an
adminToken parameter, the gate that PR #172 added to the daemon side
has no caller-side support.

Wire format: [cmd][sub][action=0x01][netID(2)][tokenLen(2)][token...][policyJSON...]

Threat model alignment (per PILOT-347):
- Only the network administrators hold the admin token.
- managed.policy.set is correctly admin-gated.
- User-owned IPC ops (rotate_key, handshake approve/reject/revoke,
  set_webhook) belong on a SO_PEERCRED check, not admin-token —
  see TeoSlayer/pilotprotocol#176 for that pattern.

Empty adminToken is fine for solo-mode daemons (no AdminToken
configured); managed-mode daemons reject empty.

Tests updated to pass empty token.
go test ./driver/... PASS.

Co-authored-by: Teodor Calin <teodor@vulturelabs.io>
@TeoSlayer TeoSlayer merged commit d9826e8 into main May 29, 2026
4 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

matthew-fix-larger Autonomous fix by matthew-pilot, medium tier (≤10 files, ≤200 LoC)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants