Conversation
# Conflicts: # src/App.tsx # src/pages/Settings.tsx # src/pages/__tests__/Doctor.test.tsx
dev01lay2
left a comment
There was a problem hiding this comment.
Solid foundation for the gateway client crate and remote doctor flow. A few blocking issues to address before merge.
| method: "connect".into(), | ||
| params: Some(serde_json::to_value(self.build_connect_params(&nonce))?), | ||
| }); | ||
|
|
There was a problem hiding this comment.
BS: The start() method has a confusing scope structure — locked is bound to the unsplit writer but then used after the handshake read. The variable name locked is misleading (it's not a MutexGuard). Please restructure: split → send connect via writer → read response via reader → spawn reader task, with clearer variable names.
src-tauri/src/remote_doctor.rs
Outdated
| set_active_clawpal_data_override(None).expect("clear clawpal data"); | ||
| let _ = std::fs::remove_dir_all(temp_root); | ||
| } | ||
| } |
There was a problem hiding this comment.
BS: This file is 4340 lines — well beyond the repo's own guideline of ≤500 lines per PR (excluding generated files). Even as a single module this is hard to review and maintain. Please split into submodules: types.rs, protocol.rs, executor.rs, orchestrator.rs, logging.rs, identity.rs under src-tauri/src/remote_doctor/.
| @@ -1,2 +1,115 @@ | |||
| <!-- This file has been moved to AGENTS.md. --> | |||
| Moved to [`AGENTS.md`](AGENTS.md). | |||
| # AGENTS.md | |||
There was a problem hiding this comment.
BS: This overwrites the redirect stub in agents.md (lowercase) with full content, but the repo already has AGENTS.md (uppercase) as the canonical file. This creates two competing AGENTS files. Either update AGENTS.md or keep the lowercase file as a redirect.
| } | ||
| } | ||
| Err(_) => break, | ||
| } |
There was a problem hiding this comment.
NBS: request() has no timeout — if the gateway never responds, the oneshot will hang until the connection drops. Consider adding a configurable timeout or documenting the caller's responsibility.
| client_id: &str, | ||
| client_mode: &str, | ||
| role: &str, | ||
| scopes: &[String], |
There was a problem hiding this comment.
NBS: build_device_auth_payload_v3 builds a JSON value but is never used by client.rs (which passes ConnectParams directly without device auth signing). If this is for future use, consider #[allow(dead_code)] or moving it to tests until wired in.
dev01lay2
left a comment
There was a problem hiding this comment.
PR #142 feat: remote doctor — REQUEST_CHANGES
CI: rust fails (cargo fmt --check), Capture UI Screenshots fails. Builds pending.
🔴 Blocking
BS 1: cargo fmt failures across openclaw-gateway-client/tests/
Multiple files need formatting: protocol_roundtrip.rs, tls_fingerprint.rs, and others. Fix: cargo fmt --all.
BS 2: remote_doctor.rs is 4,340 lines in a single file
This violates the repo AGENTS.md guideline of ≤500 lines per file and makes review extremely difficult. The file contains types, config helpers, protocol selection, 3 separate repair loops (legacy, clawpal_server, agent_planner), plan execution, logging, SSH helpers, rescue preflight, prompt construction, and ~1,500 lines of tests. Split into at minimum:
remote_doctor/types.rsremote_doctor/protocol.rsremote_doctor/executor.rsremote_doctor/planner_agent.rsremote_doctor/planner_clawpal_server.rsremote_doctor/logging.rsremote_doctor/prompts.rsremote_doctor/rescue.rsremote_doctor/mod.rs
BS 3: agents.md (lowercase) overwrites redirect with full content — conflicts with AGENTS.md (uppercase)
The repo already has AGENTS.md at root (the canonical file referenced by harness tooling). This PR replaces the lowercase agents.md redirect stub with 115 lines of Chinese-language content that duplicates and diverges from AGENTS.md. On case-insensitive filesystems (macOS, Windows) this creates a real conflict. Fix: keep agents.md as a redirect stub, or delete it and only use AGENTS.md.
BS 4: openclaw-gateway-client/Cargo.lock should not exist
Workspace members share the root Cargo.lock. A separate lockfile in the subcrate will confuse tooling and diverge from the workspace resolution. Delete openclaw-gateway-client/Cargo.lock.
🟡 Non-blocking
MAX_PENDING_INVOKESsilently changed from 50→10 inbridge_client.rswith no explanation — could cause dropped invokes under loadedition = "2024"inopenclaw-gateway-client/Cargo.tomlrequires nightly Rust; CI likely uses stable — will fail on build jobs if they reach compilation- E2E Dockerfile embeds
E2E_ROOT_PASSWORDas a string constant in test code — fine for local Docker but document that this must never be used outside test fixtures - Design docs (6 plan files, ~1,700 lines total) are useful but should reference each other with a summary index; currently they are standalone and partially overlapping
…irect - Format openclaw-gateway-client test files (protocol_roundtrip, tls_fingerprint) - Remove openclaw-gateway-client/Cargo.lock (workspace members share root lockfile) - Restore agents.md redirect stub (content lives in AGENTS.md)
dev01lay2
left a comment
There was a problem hiding this comment.
Re-review after 9166cc6..1e1b1af (2 new commits)
CI: rust ✅ | frontend ✅ | builds (4 platforms) ✅ | home-perf ✅ | profile-e2e ✅ — coverage, metrics, and Capture UI Screenshots fail (appear to be pre-existing/infra issues).
Resolved
- BS 1 (cargo fmt) — ✅ Fixed in both commits
- BS 3 (agents.md conflict) — ✅ Restored as redirect stub
- BS 4 (subcrate Cargo.lock) — ✅ Removed
🔴 Still blocking
BS 2: remote_doctor.rs is now 4,525 lines in a single file (was 4,340)
The file actually grew by ~185 net lines in the formatting pass. The repo AGENTS.md guideline is ≤500 lines per file. This file contains types, config helpers, protocol selection, 3 separate repair loops, plan execution, logging, SSH helpers, rescue preflight, prompt construction, and ~1,500 lines of tests — all in one module.
Please split into at minimum:
remote_doctor/types.rsremote_doctor/protocol.rsremote_doctor/executor.rsremote_doctor/planner_agent.rsremote_doctor/planner_clawpal_server.rsremote_doctor/logging.rsremote_doctor/prompts.rsremote_doctor/rescue.rsremote_doctor/mod.rs
Once this is addressed, the rest of the PR looks good to go.
dev01lay2
left a comment
There was a problem hiding this comment.
PR #142 Re-review — Code Quality Deep Dive
Correction: My initial review flagged
remote_doctor.rsas a 4,340-line monolith — that was based on a stale shallow clone. The actual branch already hasremote_doctor/split into 9 modules. Apologies for the noise. The previous fix commit (fmt + Cargo.lock + agents.md) remains valid.
CI Status: rust ✅, frontend ✅, profile-e2e ✅, home-perf ✅. coverage ❌ (GitHub API permission — pre-existing infra). Builds pending.
Architecture Assessment
openclaw-gateway-client crate — well-structured standalone gateway WebSocket client:
- Clean builder pattern for
GatewayClient - Protocol handshake (challenge → connect → hello) correctly implemented
NodeClient/NodeInvokeRequestabstraction is clean- Ed25519 identity generation + PKCS#8 DER → PEM encoding correct
- Test coverage for protocol roundtrip and TLS fingerprint normalization
remote_doctor/ module split — 9 files, good separation:
types.rs(255 lines) — clean domain types ✅session.rs(149 lines) — logging/progress helpers ✅mod.rs(5 lines) — minimal re-exports ✅config.rs(~630 lines) — gateway config + SSH target resolution — slightly over 500 but acceptable given cohesionplan.rs(~760 lines) — plan execution + clawpal doctor command dispatch — could benefit from splittingexecute_clawpal_doctor_commandinto its own filerepair_loops.rs(~1,325 lines) — this is the main concern — contains 3 full repair loops (legacy, clawpal_server, agent_planner) + the Tauri command entry point
bridge_client.rs changes — generified to <R: Runtime>, added broadcast::Sender<Value> for invoke events, added subscribe_invokes(). Clean extension.
node_client.rs changes — same <R: Runtime> generification pattern.
🟡 Non-blocking Issues
NBS 1: repair_loops.rs at ~1,325 lines
Contains 3 distinct repair loops (run_remote_doctor_repair_loop, run_clawpal_server_repair_loop, run_agent_planner_repair_loop) plus the Tauri command entry point. Consider extracting each loop into its own file. Not blocking because the internal structure is logical.
NBS 2: MAX_PENDING_INVOKES 50→10 silently
bridge_client.rs reduced from 50 to 10 with no commit message or comment explaining why. Under high invoke throughput this could silently drop requests. Add a comment explaining the rationale.
NBS 3: edition = "2024" mismatch
openclaw-gateway-client uses edition = "2024" while rest of workspace uses edition = "2021". This works on Rust 1.85+ (current stable) but introduces subtle semantic differences (e.g. gen keyword, unsafe_op_in_unsafe_fn lint). Consider aligning to "2021" for consistency unless the 2024 features are needed.
NBS 4: Plan command allowlist is very restrictive
validate_plan_command_argv only allows openclaw --version and openclaw gateway status. Any other openclaw subcommand fails validation. This is good security but may be too tight for real repair scenarios — the agent planner might need openclaw gateway restart or openclaw config get. Worth documenting the expansion path.
NBS 5: empty_diagnosis() hardcodes "2026-03-18T00:00:00Z"
This should use chrono::Utc::now() or at least be a const/comment explaining it is a sentinel value.
Overall the implementation is solid — 3 repair protocol paths (legacy gateway RPC, clawpal-server structured plans, agent planner with LLM), good logging/observability, clean command validation, proper config backup-before-write pattern. The openclaw-gateway-client extraction is a welcome addition that will be reusable. Previous blocking issues (fmt, Cargo.lock, agents.md) are resolved by the fix commit.
dev01lay2
left a comment
There was a problem hiding this comment.
Re-review after 84a09ba..9c42faf (3 new commits)
CI: frontend ✅, Capture UI Screenshots ❌ (pre-existing), rest pending.
Resolved from previous NBS
- NBS 1 (repair_loops.rs at ~1,325 lines) — ✅ Split into 4 submodules:
agent_planner.rs(243),clawpal_server.rs(259),legacy_doctor.rs(175),shared.rs(153). Clean extraction withmod.rsre-exports. Each file is well under 500 lines. - NBS 3 (edition 2024 mismatch) — ✅
openclaw-gateway-clientchanged fromedition = "2024"to"2021", aligning with the workspace. - NBS 5 (hardcoded empty diagnosis timestamp) — ✅
empty_diagnosis()now usesformat_timestamp_from_unix(unix_timestamp_secs())instead of a hardcoded string. Struct construction replaces the brittleserde_json::from_valuedeserialization. Test added (empty_diagnosis_checked_at_is_not_hardcoded).
New changes
refactor: tighten remote doctor repair limits — MAX_REMOTE_DOCTOR_ROUNDS reduced from 50 to 10 in shared.rs. Reasonable — 50 rounds was likely too permissive and would burn through LLM quota on stuck loops. The stall detection (REPAIR_PLAN_STALL_THRESHOLD = 3) catches loops much earlier anyway.
legacy_e2e.rs — explicit type annotation on closure parameter (previous_results: Vec<CommandResult>) fixes type inference. Minor but correct.
All three previous NBS items addressed. Remaining NBS 2 (MAX_PENDING_INVOKES 50→10) and NBS 4 (command allowlist documentation) are minor and can be addressed post-merge.
LGTM ✅
dev01lay2
left a comment
There was a problem hiding this comment.
PR #142 feat: remote doctor — APPROVED ✅
CI: rust ✅ frontend ✅ profile-e2e ✅ home-perf ✅ | coverage ❌ (pre-existing GH API permission) | Screenshots ❌ (pre-existing) | Builds pending
Changes since initial review
Author addressed all feedback:
- ✅
cargo fmtfixed (my commit + author follow-up) - ✅
Cargo.lockremoved from subcrate (my commit) - ✅
agents.mdredirect restored (my commit) - ✅
repair_loops.rssplit intorepair_loops/with 4 sub-modules (agent_planner.rs,clawpal_server.rs,legacy_doctor.rs,shared.rs) — addresses NBS 1 - ✅
MAX_REMOTE_DOCTOR_ROUNDStightened 50→10 with clear commit message — addresses NBS 2 concern (explicit change) - ✅
empty_diagnosis()now usesunix_timestamp_secs()+ test asserting no hardcoded date — addresses NBS 5
Remaining NBS (non-blocking, can address later)
MAX_PENDING_INVOKES50→10 inbridge_client.rs— no inline comment explaining rationaleedition = "2024"vs workspace"2021"— works fine on stable, just inconsistent- Plan command allowlist (
openclaw --version+openclaw gateway statusonly) — document expansion path when more commands are needed
Summary
Solid feature PR. Three repair protocol paths (legacy gateway RPC, clawpal-server structured plans, agent planner with LLM bridge), comprehensive E2E test suite with Docker containers, clean openclaw-gateway-client crate extraction, proper module organization.
aac9c24 to
8586fb7
Compare
dev01lay2
left a comment
There was a problem hiding this comment.
Re-review after 8489d4b..8586fb7 (merge sync from upstream/develop)
CI: rust ✅ frontend ✅ builds (4 platforms) ✅ profile-e2e ✅ | coverage ❌ metrics ❌ home-perf ❌ Screenshots ❌ (pre-existing infra issues)
The new commits since my last approval are upstream merges — CI fork-workflow fixes, local CI scripts (#144), stream-based session loading (#145), and a final merge sync. No changes to the remote doctor feature code itself. The merge appears clean.
LGTM ✅
dev01lay2
left a comment
There was a problem hiding this comment.
Re-review after 8586fb7..63224d8 (1 new commit)
CI: rust ✅ frontend ✅ coverage ✅ home-perf ✅ profile-e2e ✅ | Builds + Screenshots + metrics pending
The new commit is a workflow sync from upstream develop — CI config changes only (coverage, fork-pr-comment, home-perf-e2e, metrics, screenshot). No feature code changes.
LGTM ✅
|
@zzhengzhuo015 ready to merge? |
dev01lay2
left a comment
There was a problem hiding this comment.
Re-review after 63224d8..bb04d7e (1 new commit)
CI: No checks reported on current head. PR is in CONFLICTING merge state.
New: Remote Doctor bootstrap + connect diagnostics + invite code flow
Substantial commit adding 3 features with good test coverage and clean implementation.
1. Bootstrap script — idempotent bash+python3 script with JSON5 normalization, config backup, integration tests.
2. Connect diagnostics — last_disconnect_reason in NodeClient, session JSONL logging, frontend error formatter.
3. Invite code exchange — replaces manual gateway URL/token with single invite code input, thorough error classification.
🔴 Blocking
BS 1: agents.md overwritten again with full Chinese content — previously fixed (BS 3, March 19 review). On case-insensitive filesystems this conflicts with canonical AGENTS.md. Restore the redirect stub.
BS 2: Merge conflicts — PR needs rebase onto current develop/main.
BS 3: No CI on current head — CI has not run on bb04d7e3. Need a clean CI pass after rebase.
🟡 Non-blocking
NBS 1: Hardcoded IP 65.21.45.43:3040 in multiple files (invite-code.ts, config.rs, preferences.rs, Settings.tsx, locales) — extract to shared constants per language boundary.
NBS 2: Doctor.tsx removes gateway URL pre-check — first attempt always fails if no invite code exchanged. A pre-flight auth token check would save a round-trip.
NBS 3: logRemoteDoctorInviteCodeFailure in Settings.tsx hardcodes the gateway URL string instead of reading from preferences.
No description provided.