Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 38 additions & 0 deletions docs/issues/cua-driver-v0-2-0-sync/plan.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
# Plan

## Source Review

- Compare upstream `trycua/cua` tags `cua-driver-v0.1.5` and
`cua-driver-v0.2.0`.
- Apply the Swift driver delta with a three-way merge against DeepChat's
maintained fork.
- Keep upstream Rust driver changes out of this sync.

## Implementation

- Merge upstream Swift runtime changes into
`plugins/cua/vendor/cua-driver/source`.
- Adapt new upstream TCC, doctor, and MCP daemon-proxy text and commands to
`DeepChat Computer Use.app` and `com.wefonk.deepchat.computeruse`.
- Preserve DeepChat-only CLI behavior: `deepchat-permission-probe`, nonblocking
MCP startup, and DeepChat-managed `update`.
- Update `plugins/cua/vendor/cua-driver/upstream.json` to `cua-driver-v0.2.0`.
- Leave packaged skills unchanged unless validation shows upstream skill content
changed in the Swift release.

## Validation

- Run `swift build --package-path plugins/cua/vendor/cua-driver/source --product cua-driver`.
- Run `pnpm run format`.
- Run `pnpm run i18n`.
- Run `pnpm run lint`.
- Run `git diff --check`.
- Run `pnpm run plugin:cua:build:mac:arm64`.
- Run `pnpm run plugin:validate -- --name cua --platform darwin --arch arm64`.

## Risk

The vendored driver is a local fork with DeepChat-specific TCC and packaging
behavior. A direct replacement with upstream source would risk regressing the
helper identity, permission flow, and plugin-managed update path, so the sync is
kept as an explicit fork merge.
41 changes: 41 additions & 0 deletions docs/issues/cua-driver-v0-2-0-sync/spec.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
# CUA Driver v0.2.0 Sync

## Problem

The bundled DeepChat Computer Use helper is based on upstream
`cua-driver-v0.1.5`. Upstream Swift CUA driver `cua-driver-v0.2.0` contains
macOS reliability fixes for focus suppression, screenshot capture fallback,
hidden app handling, side-effect detection, and MCP daemon proxying.

## User Story

As a DeepChat user using the bundled CUA plugin, I need the macOS helper to
include current upstream Swift driver fixes while continuing to use DeepChat's
helper app, TCC permissions, MCP registration, and plugin packaging.

## Acceptance Criteria

- Vendored upstream metadata records `cua-driver-v0.2.0` and commit
`d3f3b9325f49aa5302c15fb03f6b66bd1e688e27`.
- The local fork includes the upstream Swift driver runtime improvements from
`v0.1.5` through `v0.2.0`.
- DeepChat-specific behavior remains intact: `DeepChat Computer Use.app`,
bundle id `com.wefonk.deepchat.computeruse`, `deepchat-permission-probe`,
DeepChat-managed updates, and MCP-first plugin skills.
- The Rust `cua-driver-rs` runtime is not introduced in this sync.
- Validation covers Swift build, formatting, i18n, lint, diff checks, CUA
runtime build, and plugin validation where practical.

## Non-goals

- No migration to `cua-driver-rs`.
- No changes to the CUA plugin manifest, settings UI, MCP server id, or tool
policy.
- No adoption of upstream standalone installer behavior for DeepChat updates.

## Constraints

- Preserve DeepChat's local helper app identity for TCC attribution.
- Keep packaged `plugins/cua/skills/cua-driver` guidance MCP-first.
- Treat upstream standalone scripts as reference material unless required by
the bundled helper build.
13 changes: 13 additions & 0 deletions docs/issues/cua-driver-v0-2-0-sync/tasks.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# Tasks

- [x] Identify latest upstream Swift CUA driver release.
- [x] Confirm Rust `cua-driver-rs` remains out of scope.
- [x] Compare `cua-driver-v0.1.5` to `cua-driver-v0.2.0`.
- [x] Merge upstream Swift runtime changes into the DeepChat fork.
- [x] Preserve DeepChat helper app identity, permission probe, update policy,
and MCP-first behavior.
- [x] Update vendored upstream metadata.
- [x] Run Swift build validation.
- [x] Run formatting, i18n, lint, and diff checks.
- [x] Build the CUA plugin runtime.
- [x] Validate the CUA plugin package.
2 changes: 1 addition & 1 deletion plugins/cua/vendor/cua-driver/source/.bumpversion.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.1.5
current_version = 0.2.0
commit = True
tag = True
tag_name = cua-driver-v{new_version}
Expand Down
4 changes: 4 additions & 0 deletions plugins/cua/vendor/cua-driver/source/Package.swift
Original file line number Diff line number Diff line change
Expand Up @@ -39,5 +39,9 @@ let package = Package(
name: "ZoomMathTests",
dependencies: ["CuaDriverCore"]
),
.testTarget(
name: "FocusStealPreventerTests",
dependencies: ["CuaDriverCore"]
),
]
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
import Darwin
import Foundation

/// Shared "is this binary running from inside an installed DeepChat Computer Use.app
/// bundle?" heuristic used by both `ServeCommand` (for the
/// auto-relaunch-via-`open` path) and `MCPCommand` (for the daemon proxy
/// path). Resolves `Bundle.main.executablePath` (falling back to
/// `CommandLine.arguments.first`) through any symlinks via `realpath` and
/// checks whether the resolved path lives inside some
/// `DeepChat Computer Use.app/Contents/MacOS/` directory.
///
/// That's the "installed via install-local.sh / install.sh" shape —
/// `/usr/local/bin/cua-driver` is a symlink into
/// `/Applications/DeepChat Computer Use.app`, and `realpath` walks into the
/// bundle. Returns `false` for `swift run` /
/// raw `.build/<config>/cua-driver` dev invocations, which have no installed
/// bundle to relaunch into.
///
/// Subcommands may wrap this with additional gating (env vars, flags,
/// parent-pid checks, etc.) when their relaunch heuristics diverge.
func isExecutableInsideCuaDriverApp() -> Bool {
// Prefer Foundation's executablePath (stable, absolute).
// Fall back to argv[0] when unset, which realpath() still
// resolves via $PATH lookup at the shell level — good enough
// for the cases we care about.
let candidate = Bundle.main.executablePath
?? CommandLine.arguments.first
?? ""
guard !candidate.isEmpty else { return false }

var buffer = [CChar](repeating: 0, count: Int(PATH_MAX))
guard realpath(candidate, &buffer) != nil else { return false }
let resolved = String(cString: buffer)
return resolved.contains("/DeepChat Computer Use.app/Contents/MacOS/")
}
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@ struct CuaDriverCommand: AsyncParsableCommand {
UpdateCommand.self,
DiagnoseCommand.self,
DoctorCommand.self,
CleanupCommand.self,
DumpDocsCommand.self,
]
)
Expand Down Expand Up @@ -249,6 +250,7 @@ struct CuaDriverEntryPoint {
"update",
"diagnose",
"doctor",
"cleanup",
"dump-docs",
"help",
]
Expand Down Expand Up @@ -373,7 +375,23 @@ struct CuaDriverEntryPoint {
struct MCPCommand: ParsableCommand {
static let configuration = CommandConfiguration(
commandName: "mcp",
abstract: "Run the stdio MCP server."
abstract: "Run the stdio MCP server.",
discussion: """
When invoked from a shell or IDE terminal (Claude Code, Cursor, \
VS Code, Warp), macOS TCC attributes the process to the parent \
terminal — not to DeepChat Computer Use.app — so AX probes silently fail \
against the wrong bundle id. To sidestep this without breaking \
the stdio MCP transport, `mcp` detects the context, ensures a \
`cua-driver serve` daemon is running under LaunchServices \
(relaunching via `open -n -g -a "DeepChat Computer Use" --args serve` if not), \
and proxies every MCP tool call through the daemon's Unix \
socket. Tool semantics are identical to the in-process path. \
Pass `--no-daemon-relaunch` (or set CUA_DRIVER_MCP_NO_RELAUNCH=1) \
to force in-process execution — useful when the calling context \
already has the right TCC grants (e.g. spawned from \
DeepChat Computer Use.app directly), or for diagnosing \
in-process failures.
"""
)

@Flag(
Expand All @@ -387,7 +405,38 @@ struct MCPCommand: ParsableCommand {
)
var claudeCodeComputerUseCompat: Bool = false

@Flag(
name: .long,
help: """
Stay in the current process instead of auto-launching a daemon \
and proxying through its Unix socket when invoked from a shell \
without DeepChat Computer Use.app's TCC grants. Also toggleable via \
CUA_DRIVER_MCP_NO_RELAUNCH=1.
"""
)
var noDaemonRelaunch: Bool = false

@Option(
name: .long,
help: "Override the daemon Unix socket path used by the proxy fallback."
)
var socket: String?

func run() throws {
// TCC sidestep. Same heuristic the `serve` subcommand uses
// (shell-spawned bare binary that resolves into DeepChat Computer Use.app
// bundle), gated by an explicit env / flag opt-out. When the
// shell already has the right TCC context (e.g. DeepChat Computer Use.app
// launched us directly), this returns false and we stay
// in-process exactly like before. The proxy path is purely
// additive: it gives stdio MCP clients spawned from IDE
// terminals a correct TCC context without requiring an external
// bridge.
if shouldUseDaemonProxy() {
try runViaDaemonProxy()
return
}

// MCP stdio runs for the lifetime of the host process, so we
// bootstrap AppKit here — the agent cursor overlay (disabled
// by default, enabled via `set_agent_cursor_enabled`) needs a
Expand Down Expand Up @@ -421,6 +470,135 @@ struct MCPCommand: ParsableCommand {
}
}

extension MCPCommand {
/// Decide whether the current `mcp` invocation should auto-launch a
/// daemon and proxy every MCP tool call through its Unix socket.
/// Mirror of `ServeCommand.shouldRelaunchViaOpen()` — same heuristic,
/// same env override convention, separate flag so callers can opt
/// each surface in/out independently.
fileprivate func shouldUseDaemonProxy() -> Bool {
if noDaemonRelaunch { return false }
if isEnvTruthy(ProcessInfo.processInfo.environment["CUA_DRIVER_MCP_NO_RELAUNCH"]) {
return false
}
// When AppKit already attributes us to DeepChat Computer Use.app — either
// because LaunchServices spawned us, or the user invoked the
// bundle's main executable directly — `Bundle.main.bundlePath`
// ends in `.app`. Either case has the right TCC context.
if Bundle.main.bundlePath.hasSuffix(".app") { return false }
// The bare-binary path must resolve into an installed
// DeepChat Computer Use.app bundle, otherwise there's nothing for the
// daemon side to land in. Raw `swift run` dev invocations fail
// this check and stay in-process.
guard isExecutableInsideCuaDriverApp() else { return false }
// ppid == 1 means launchd already reparented us — we're
// post-LaunchServices and have the right TCC context.
if getppid() == 1 { return false }
return true
}

/// Ensure a `cua-driver serve` daemon is running under the right TCC
/// context, then run the MCP stdio server with `ListTools` /
/// `CallTool` handlers that forward every request through
/// `~/Library/Caches/cua-driver/cua-driver.sock`. Falls back to
/// in-process on launch failure with a diagnostic and a pointer at
/// the `--no-daemon-relaunch` escape hatch.
fileprivate func runViaDaemonProxy() throws {
let socketPath = socket ?? DaemonPaths.defaultSocketPath()

if !DaemonClient.isDaemonListening(socketPath: socketPath) {
FileHandle.standardError.write(
Data(
"cua-driver: mcp launched without DeepChat Computer Use.app's TCC grants; auto-launching the daemon via `open -n -g -a \"DeepChat Computer Use\" --args serve` and proxying MCP requests through it. Pass --no-daemon-relaunch to stay in-process.\n"
.utf8))
try launchDaemonViaOpen()
try waitForDaemon(socketPath: socketPath, timeout: 10.0)
Comment thread
zerob13 marked this conversation as resolved.
}

let serverName = claudeCodeComputerUseCompat ? "computer-use" : "cua-driver"
let compat = claudeCodeComputerUseCompat

// The MCP `Server` actor + `StdioTransport` use Swift
// concurrency, so we need a live async runtime. Reuse
// `AppKitBootstrap` for that — it's the same sync→async bridge
// the in-process path already takes, and the idle AppKit
// run-loop costs us nothing here (no AX work runs in this
// process). Critically we skip PermissionsGate entirely: the
// daemon owns TCC, and AX probes against this process would
// lie because we're attributed to the calling shell.
AppKitBootstrap.runBlockingAppKitWith {
let server = try await CuaDriverMCPServer.makeProxy(
serverName: serverName,
socketPath: socketPath,
claudeCodeComputerUseCompat: compat
)
let transport = StdioTransport()
try await server.start(transport: transport)
await server.waitUntilCompleted()
}
}

/// Spawn `/usr/bin/open -n -g -a "DeepChat Computer Use" --args serve`. Mirror of
/// `ServeCommand.relaunchViaOpen` minus the post-launch probe (we
/// poll separately via `waitForDaemon`, since the timeout there is
/// MCP-specific).
fileprivate func launchDaemonViaOpen() throws {
let process = Process()
process.executableURL = URL(fileURLWithPath: "/usr/bin/open")
// -n: force a new instance. DeepChat Computer Use.app may already be
// running from a previous `mcp` (different MCP client
// session); without -n, `open -a` would re-use it and
// drop our `--args serve`, leaving no daemon up.
// -g: keep the new instance backgrounded. DeepChat Computer Use.app is
// LSUIElement=true anyway, but this makes that explicit.
process.arguments = ["-n", "-g", "-a", "DeepChat Computer Use", "--args", "serve"]
process.standardOutput = FileHandle.nullDevice
process.standardError = FileHandle.nullDevice
do {
try process.run()
} catch {
FileHandle.standardError.write(
Data(
"cua-driver: failed to exec `/usr/bin/open`: \(error). Pass --no-daemon-relaunch to bypass.\n"
.utf8))
throw ExitCode(1)
}
process.waitUntilExit()
if process.terminationStatus != 0 {
FileHandle.standardError.write(
Data(
"cua-driver: `open -n -g -a \"DeepChat Computer Use\" --args serve` exited \(process.terminationStatus). Check that `/Applications/DeepChat Computer Use.app` is installed, or pass --no-daemon-relaunch to bypass.\n"
.utf8))
throw ExitCode(1)
}
}

/// Block (up to `timeout` seconds) until `socketPath` accepts a
/// protocol-speaking probe. Throws `ExitCode(1)` with a diagnostic
/// if the daemon never appears — usually means the user hasn't
/// granted Accessibility / Screen Recording to DeepChat Computer Use.app yet
/// and the daemon's PermissionsGate is waiting on a dialog.
fileprivate func waitForDaemon(socketPath: String, timeout: TimeInterval) throws {
let deadline = Date().addingTimeInterval(timeout)
while Date() < deadline {
if DaemonClient.isDaemonListening(socketPath: socketPath) {
return
}
usleep(100_000) // 100ms
}
FileHandle.standardError.write(
Data(
"cua-driver: daemon did not appear on \(socketPath) within \(Int(timeout))s. If this is the first launch, grant Accessibility + Screen Recording to DeepChat Computer Use.app in System Settings and retry. Pass --no-daemon-relaunch to stay in-process.\n"
.utf8))
throw ExitCode(1)
}

private func isEnvTruthy(_ value: String?) -> Bool {
guard let value = value?.lowercased() else { return false }
return ["1", "true", "yes", "on"].contains(value)
}
}

/// Bootstrap AppKit on the main thread so `AgentCursor` can draw its
/// overlay window + CA animations. The caller's async work runs on a
/// detached Task; the main thread blocks inside `NSApplication.run()`
Expand Down Expand Up @@ -502,7 +680,7 @@ struct UpdateCommand: AsyncParsableCommand {
}
}

/// `cua-driver doctor` — clean up stale install bits left from older versions.
/// `cua-driver cleanup` — clean up stale install bits left from older versions.
///
/// v0.0.5 and earlier installed a weekly LaunchAgent at
/// `~/Library/LaunchAgents/com.trycua.cua_driver_updater.plist` and a companion
Expand All @@ -514,9 +692,9 @@ struct UpdateCommand: AsyncParsableCommand {
/// update script. The plist lives under `$HOME` (no sudo). The companion
/// script under `/usr/local/bin` is root-owned, so we print the exact
/// `sudo rm` command for the user to run if it still exists.
struct DoctorCommand: ParsableCommand {
struct CleanupCommand: ParsableCommand {
static let configuration = CommandConfiguration(
commandName: "doctor",
commandName: "cleanup",
abstract: "Clean up stale install bits left from older cua-driver versions."
)

Expand Down
Loading