Connect plugin: initial implementation (POC)#1
Open
drewr wants to merge 133 commits into
Internal packages (stubbed):
- binary/discovery.go - binary discovery (next-to-self, then PATH)
- env/build.go - child environment builder
- output/convert.go - JSON/YAML conversion
- signals/forward.go - signal forwarding stub
- daemon/fork.go - daemon fork stub
- pidfile/pidfile.go - PID file read/write
- logfile/logfile.go - log path resolution
- svcconfig/config.go - tunnel config serialization
- svcunit/unit.go - service unit generation stubs

Test fakes:
- fake-datum-connect - emulates Rust binary with failure modes
- fake-credentials-helper - emulates auth token helper

Build scripts:
- scripts/build.sh - host-platform build
- scripts/release.sh - release packaging stub (Phase 7)
E2E tests validate:
- PLUG-01: --plugin-manifest emits valid JSON and exits 0
- PLUG-02: --plugin-manifest handled before cobra parses args
- PLUG-03: all 12 tunnel subcommands scaffolded
- PLUG-04: each stub prints 'not implemented' with target phase
- PLUG-05: test fakes functional (fake-datum-connect, fake-credentials-helper)
- PLUG-06: build and release scripts exist and are executable

Build script:
- Added --test flag to run E2E tests after build
- Usage: ./scripts/build.sh --test
…out message field - Rule 1 auto-fix: msg["message"].(string) panics on messages like heartbeat - Added safe type assertion with ok check - Added TestParseTypedMessage covering heartbeat without message field Deviation from plan: plan code had unsafe direct type assertion
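The fix described above is the comma-ok form of a Go type assertion. A minimal sketch follows, assuming each line from the child is a JSON object with a string `type` field; the `typedMessage` struct and the `parseTypedMessage` signature are illustrative, not the plugin's actual code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// typedMessage is a hypothetical shape for one line of the child's JSON output.
type typedMessage struct {
	Type    string
	Message string // stays empty when the payload has no "message" field
}

// parseTypedMessage decodes one JSON line without panicking on absent fields.
func parseTypedMessage(line []byte) (typedMessage, error) {
	var raw map[string]any
	if err := json.Unmarshal(line, &raw); err != nil {
		return typedMessage{}, fmt.Errorf("malformed JSON: %w", err)
	}
	typ, ok := raw["type"].(string)
	if !ok {
		return typedMessage{}, errors.New(`missing or non-string "type" field`)
	}
	// raw["message"].(string) would panic for a heartbeat, which carries no
	// message. The two-value form yields "" and false instead.
	msg, _ := raw["message"].(string)
	return typedMessage{Type: typ, Message: msg}, nil
}

func main() {
	m, err := parseTypedMessage([]byte(`{"type":"heartbeat"}`))
	fmt.Println(m.Type, m.Message == "", err) // heartbeat true <nil>
}
```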
…al handling
- Full listen command: spawns Rust binary, reads typed JSON messages
- Handles ready message (prints hostname in interactive mode, JSON in JSON mode)
- Blocks until SIGINT/SIGTERM, forwards signal to child, waits 30s grace period
- 60s startup timeout before ready message
- Malformed JSON treated as fatal error (Rust contract validation)
- stderr forwarded transparently to plugin stderr
- Exit code 64 for missing --endpoint (POSIX semantic rejection)
- E2E tests: interactive mode, JSON mode, missing endpoint validation
- tunnel/list: table/json/yaml output via exec.Run() with OutputModeTable/JSON/YAML
- tunnel/update: --id required, --label/--endpoint optional, JSON output mode
- tunnel/delete: --id required, JSON output with confirmation
- All commands: discover binary, acquire token, build env, delegate to Rust
- Exit code propagation: child non-zero -> os.Exit(child_code), Go errors -> POSIX codes
- Manual flag validation for POSIX exit code 64 (avoids cobra's default code 1)
- Update fake-datum-connect handleListen to emit typed JSON with type field - Fix TestPluginPassesContextToSubcommand to use fake binary - Fix TestCredentialsHelperTokenFlow to use fake binary
- Tests for Dir(), TunnelDir(), PidFilePath(), LogDir(), LogFilePath() - Cross-platform path conventions per PLATFORM-05 / CONTEXT.md
- Dir() returns cross-platform state directory path - TunnelDir() returns <state>/tunnels - PidFilePath(name) returns <tunnels>/<name>.pid - LogDir() returns log directory (macOS uses ~/Library/Logs/) - LogFilePath(name) returns <logdir>/<name>.log - Per PLATFORM-05 and Phase 5 CONTEXT.md conventions
…D format - Tests for Write/Read round-trip with goPID/rustPID - Tests for Exists, Parse, Remove, and missing file error - New API: Write(path, goPID, rustPID, startTime, binaryPath) - New API: Read returns *PidFile
- Write(path, goPID, rustPID, startTime, binaryPath) creates PID file - Read returns *PidFile with GoPID, RustPID, StartTime, BinaryPath - Parse for in-memory content (without disk read) - Exists, Remove helpers - 4-line format per DAEMON-02: go-pid, rust-pid, start-time, binary-path
- PIDAlive(0) returns false, PIDAlive(current) returns true - ListRunningTunnels scans PID files and returns running/zombie tunnels - Per DAEMON-08 stale PID detection requirements
- PIDAlive uses syscall.Kill(pid, 0) on Unix (signal 0 existence check)
- PIDAlive uses tasklist /FI on Windows
- ListRunningTunnels scans tunnels dir for .pid files
- RunningTunnel with status: Running/Degraded/Zombie
- computeTunnelStatus based on goPID + rustPID aliveness
- Per DAEMON-08 stale PID detection requirements

[Rule 1 - Bug] Fix: use syscall.Kill(pid, 0) instead of process.Signal(nil)
- process.Signal(nil) returns 'unsupported signal type' error on Linux

[Rule 1 - Bug] Fix: use os.Getpid() for both PIDs in RunningTunnel test
- Parent PID (Getppid()) may not be alive in test environment
- RunSupervisor starts Rust binary via exec, reads typed JSON, forwards to stdout - Writes PID file on start, removes on exit (defers cleanup) - Supports --log-file for Rust debug output - Falls back to state.TunnelDir() via DATUM_CONNECT_TUNNEL_DIR env var
- Daemonize uses os.StartProcess for cross-platform detached spawn - ForegroundArgs builds args for 'tunnel run --name N' - SelfExe returns currently running executable path
- listen --detach --name N spawns background daemon via daemon.Daemonize - listen without --detach behaves as before (foreground) - --log-file flag for Rust debug output
…point - tunnel run --name N calls daemon.RunSupervisor - Defaults --log-file to state.LogFilePath(name) - Internal entry point for daemon background process and service units
- FAKE_DUMMY_MODE=daemon-listen emits ready JSON then exits immediately - Simulates Rust binary when launched by daemon supervisor
- Lists tunnels from PID files with table output (NAME, PID, RUST, STATUS, UPTIME) - --json flag outputs JSON array - --prune flag removes stale (Zombie) PID files - Uses pidfile.ListRunningTunnels with tabwriter display
- Reads PID file, gets Rust PID, sends SIGTERM first (per CONTEXT.md) - Waits up to 30s for graceful shutdown - Sends SIGKILL after timeout - Cleans up PID file after Rust exits
- Reads tunnel log file from state directory - --follow / -f tails live content using os.File polling
- Reads PID file, checks both Go and Rust PIDs via PIDAlive - Reports Running, Stopped, Degraded, or Zombie status - Shows uptime, started time, binary path
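One plausible way to derive the four statuses from the two liveness checks is sketched below. The exact mapping is an assumption on my part (both alive: Running; exactly one alive: Degraded; PID file present but both dead: Zombie; no PID file: Stopped); the commit messages name the states but do not define them.

```go
package main

import "fmt"

// Status is a hypothetical enum for the states named in the commit.
type Status string

const (
	StatusRunning  Status = "Running"
	StatusDegraded Status = "Degraded"
	StatusZombie   Status = "Zombie"
	StatusStopped  Status = "Stopped"
)

// computeTunnelStatus combines PID file presence with the liveness of the Go
// supervisor and the Rust child. The mapping here is assumed, not confirmed.
func computeTunnelStatus(pidFileExists, goAlive, rustAlive bool) Status {
	switch {
	case !pidFileExists:
		return StatusStopped
	case goAlive && rustAlive:
		return StatusRunning
	case goAlive || rustAlive:
		return StatusDegraded
	default:
		return StatusZombie // stale PID file, candidate for pruning
	}
}

func main() {
	fmt.Println(computeTunnelStatus(true, true, true))    // Running
	fmt.Println(computeTunnelStatus(true, true, false))   // Degraded
	fmt.Println(computeTunnelStatus(true, false, false))  // Zombie
	fmt.Println(computeTunnelStatus(false, false, false)) // Stopped
}
```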
- TestPS_WithFakePIDFiles: ps shows tunnel with fake PID file - TestPS_JSONOutput: ps --json outputs valid JSON array - TestStatus_StoppedTunnel: status for nonexistent tunnel shows Stopped
- Add github.com/kardianos/service v1.2.4 to go.mod - Update go.sum with new checksums
… resolution - Add ConfigDir() using os.UserConfigDir() (XDG on Linux) - Add ConfigFilePath() for path resolution per tunnel - Enhance Save() with auto-CreatedAt and error wrapping - Enhance Load() with error wrapping - Add Exists() and Remove() functions - Add comprehensive tests for all exported functions
- TestServiceName: verify service unit naming convention - TestServiceArgs: verify tunnel run argument construction - TestServiceArgs_NoLabel: verify --label omitted when empty
…rvice - Export ServiceName, ServiceArgs, Install, Uninstall, Start, Stop, Status - Use kardianos/service with user-scoped systemd config - Unit config: network-online.target dependency, on-failure restart, 5s delay - ServiceArgs builds tunnel run CLI arguments from TunnelConfig - Resolve binary path via exec.LookPath
- TestInstall_RequiresName: install with no flags exits 64 - TestInstall_RequiresEndpoint: install without --endpoint shows error - TestInstall_RequiresSession: install without --session shows error
- Validates --name, --endpoint, --session are provided - Validates session via DATUM_CREDENTIALS_HELPER - Checks for duplicate config names - Saves config to svcconfig directory - Installs systemd unit via svcunit - Cleans up config on unit install failure
…_NAME env var - Add DATUM_CONNECT_TUNNEL_NAME to child env when name is known (detach mode) - For --endpoint-only path (name empty), env var is not set - Enables Rust binary to construct per-tunnel key path for listen --id and picker flows
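The conditional env var could be added along these lines. `buildChildEnv` and `lookupEnv` are illustrative helpers; only the variable name `DATUM_CONNECT_TUNNEL_NAME` and the "set only when the name is known" rule come from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// buildChildEnv copies base and appends DATUM_CONNECT_TUNNEL_NAME only when a
// tunnel name is known. With an empty name the variable is left unset rather
// than set to an empty string.
func buildChildEnv(base []string, tunnelName string) []string {
	env := append([]string(nil), base...) // copy so the caller's slice is untouched
	if tunnelName != "" {
		env = append(env, "DATUM_CONNECT_TUNNEL_NAME="+tunnelName)
	}
	return env
}

// lookupEnv finds key in a KEY=VALUE slice.
func lookupEnv(env []string, key string) (string, bool) {
	prefix := key + "="
	for _, kv := range env {
		if strings.HasPrefix(kv, prefix) {
			return strings.TrimPrefix(kv, prefix), true
		}
	}
	return "", false
}

func main() {
	base := []string{"PATH=/usr/bin"}
	v, ok := lookupEnv(buildChildEnv(base, "web"), "DATUM_CONNECT_TUNNEL_NAME")
	fmt.Println(v, ok) // web true
	_, ok = lookupEnv(buildChildEnv(base, ""), "DATUM_CONNECT_TUNNEL_NAME")
	fmt.Println(ok) // false
}
```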
…acy migration
- Add listen_key_for_tunnel(project_id, tunnel_name) method
- Migrates legacy flat key from repo root to per-tunnel path
- Generates fresh keys for new tunnels in fresh projects
- Adds #[instrument("repo", skip_all)] tracing attribute
- 4 new unit tests: fresh generation, legacy migration, stability, multi-tunnel isolation
- Add new_with_key(repo, secret_key) accepting a pre-generated SecretKey - Add private build_with_key() helper that skips disk-based key lookup - Endpoint ID matches the provided key's derived identity - Unit test verifies key is used directly without reading from disk
…ayout
- listen --endpoint generates key in memory, persists after tunnel creation
- listen --id reads per-tunnel key from disk using server-assigned tunnel name
- --id + --endpoint same as --id with endpoint validation
- picker reads per-tunnel key using picked tunnel's name
- new_with_key() constructor used for all listen paths
- LISTEN_KEY_FILE made public for bin crate access
- Added iroh and rand dependencies to bin Cargo.toml
- Add resolve_connector_class() to auto-detect available ConnectorClass
when hardcoded 'datum-connect' doesn't exist (falls back to first available)
- Add ConnectorClass CRD definition
- Fix --project flag position in listen command (before subcommand)
- Fix flake.nix empty string syntax ('' -> "")
- Reuse existing connector instead of delete-and-recreate to avoid the
NSO#209 race where the replicator mirrors upstream-status at Ready:False
and never re-mirrors when Ready flips to True. Patching connectionDetails
in-place keeps the same generation; the replicator re-mirrors on spec
changes. Matches how the desktop app (app/lib) handles reconnect.
- Start heartbeat before ensure_connector so the iroh relay URL is
populated before build_connection_details() runs. Without this the
connector status patch has no relay URL and Ready never becomes True.
- Add refresh_connection_details() call after progress completes (connector
is Ready:True) to trigger a second replicator reconcile with correct
state, causing EG to re-translate xDS so the extension server injects
the iroh cluster config. Workaround for NSO#209.
- Add ConnectorMetadataProgrammed field to TunnelSummary; treat absent
condition as true (extension-server mode does not set this condition).
Use condition_is_true_or_absent() across all TunnelSummary construction
sites and in TunnelProgress::all_ready().
- Patch HTTPProxy connector backend reference in set_enabled_project() so
a resumed tunnel with a reused connector correctly wires the proxy to
the new connector name.
- Drop verify_endpoints() proxy probe from the listen path; it blocked
tunnel_ready indefinitely when the edge returned 503 (extension server
not yet deployed). K8s progress steps completing is sufficient signal.
- Fix error display: eprintln!("{:#}") to show full error chain.
- Move step_started_at from RefCell to Arc<Mutex> for use in spawned task.
Relates to: datum-cloud/network-services-operator#209
- Add .goreleaser.yaml for cross-platform plugin binary builds
- Add .github/workflows/release.yml with Rust cross-compilation matrix
- Add .github/workflows/testing.yml for Go unit tests
- Add version ldflag override for release builds
- Rework README with component-focused development docs
0b1525c to
19a29bd
Compare
This was referenced Jun 20, 2026
Untrack connect-plugin/fake-datum-connect-test and connect-plugin/internal/daemon/fake-datum-connect from git; add both to .gitignore alongside the drewr-specific directory entry. Fix run_test.go to resolve the module root and binary output path using runtime.Caller and t.TempDir instead of hardcoded /home/drewr/... paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TestServiceArgs was asserting --endpoint, --session, and --yes flags that were removed in Phase 13. The three BuildConfig tests were asserting per-service DATUM_CONNECT_DIR injection that was also removed in Phase 13 (D-01). Replace with tests that document current behavior: --name only in service args, empty EnvVars when no credentials helper, and DATUM_CREDENTIALS_HELPER set only when a helper path is provided. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Forward was calling child.Wait() (os.Process.Wait) in an internal goroutine while Run() also called cmd.Wait() (exec.Cmd.Wait) on the same process. Whichever won the race could leave cmd.ProcessState nil or with exit code 0, causing TestRunWithNonZeroExit to flap. Fix by removing the internal Wait from Forward and requiring callers to pass a childExited channel that is closed after their cmd.Wait() returns. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
0xmc
reviewed
Jun 22, 2026
0xmc
left a comment
I made three commits fixing up some hardcoded paths, stale tests, and a race condition. Some end-to-end tests are still failing, which I'm going to look at now.
The connect-plugin go.mod has a replace directive pointing to ../../datumctl which resolves correctly in local development (sibling repos) but fails in CI where only the connect repo is checked out. Add a second checkout step to place datumctl at the expected relative path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fake-datum-connect arg parser was scanning only the first positional arg after stripping --json, so --project <value> shifted subcmd to "--project" and the binary exited 2 before the listen subcommand was reached. Replace with a flag-aware scan that skips known flags-with-values to find the first positional subcommand.

state.Dir() on macOS uses user.Current() and ignores XDG_STATE_HOME, so the ps tests' PID files were invisible to the plugin. Add a DATUM_STATE_DIR override (checked before platform logic) and update the ps tests to use it.

TestStatus_WithConfig called buildPlugin after overriding HOME, which caused go build to use configDir/go as GOPATH and put the module cache inside the temp dir. Read-only cache files then caused t.TempDir cleanup to fail. Fix: build the plugin binary before overriding HOME.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
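A flag-aware scan of the kind described for the fake binary could be sketched as follows. Only `--project` is named in the commit as a flag that takes a value; the `flagsWithValue` set and the `findSubcommand` name are otherwise hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// flagsWithValue lists flags that consume the following argument.
// Only --project comes from the commit message; extend as needed.
var flagsWithValue = map[string]bool{"--project": true}

// findSubcommand returns the first positional argument, skipping boolean
// flags and the values of known value-taking flags.
func findSubcommand(args []string) (string, bool) {
	for i := 0; i < len(args); i++ {
		a := args[i]
		switch {
		case flagsWithValue[a]:
			i++ // skip the flag's value as well
		case strings.HasPrefix(a, "-"):
			// boolean flag such as --json, or the --flag=value form
		default:
			return a, true
		}
	}
	return "", false
}

func main() {
	sub, ok := findSubcommand([]string{"--json", "--project", "p1", "listen", "--endpoint", "x"})
	fmt.Println(sub, ok) // listen true
}
```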
0xmc
reviewed
Jun 22, 2026
- cargo test --workspace
dir: '{{.ROOT_DIR}}/{{.RUST_PROJECT}}'

test:e2e:
test:e2e is a dead task — the script it references (../e2e-test.sh) has never existed, so the task has never been runnable.
actions/checkout requires path to be under $GITHUB_WORKSPACE, so ../datumctl was rejected. Check out datumctl-dep inside the workspace and rewrite the replace directive with go mod edit before any Go commands run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Initial implementation of the Datum Connect plugin for datumctl, as a proof of concept for enhancements#756. This code is largely unreviewed and has been developed incrementally with agents (Qwen, DeepSeek, and Claude) to tease out the functionality. This PR is safe to merge so that we can generate releases and wire them into the datumctl-plugin manifest. Until then, datumctl plugin install won't work and it'll be cumbersome to test.

Example:

Components
- … datum-connect tunnel agent) and a library crate (connect-lib) exposing shared types, Kube API client, DatumCloud API bindings, heartbeat agent, and tunnel service, reusable by other clients such as Datum Desktop
- … datumctl-connect) that datumctl execs as a plugin, spawning the Rust binary as a subprocess and relaying JSON lifecycle events

Key capabilities
- datumctl connect tunnel listen: spawn the tunnel agent, stream progress, handle Ctrl-C cleanup
- datumctl connect tunnel run: background daemon mode with PID tracking
- datumctl connect tunnel list|ps|status|stop|logs|delete|update: manage tunnels
- datumctl connect tunnel install|uninstall: systemd service installation

CI / Release
- … datumctl-connect for linux/darwin/windows × amd64/arm64
- datum-connect is cross-compiled via matrix runners and bundled alongside the Go binary in each release archive