Draft
17 commits
7d8c44c
X-Smart-Branch-Parent: master
robbycochran Mar 13, 2026
c3e8561
docs: merge doc/ into docs/ with corrected inaccuracies
robbycochran Mar 16, 2026
6b8926b
X-Smart-Branch-Parent: agent-doc
robbycochran Mar 17, 2026
0657912
feat: add devcontainer and Claude Code agent development environment
robbycochran Mar 18, 2026
ed0bdbb
docs: add Vertex AI setup and devcontainer build instructions to CLAU…
robbycochran Mar 18, 2026
c4ce7ba
feat: add launcher script with worktree isolation and GitHub PAT support
robbycochran Mar 18, 2026
097d280
feat: use official GitHub MCP server, fix run.sh, add bubblewrap
robbycochran Mar 18, 2026
acde695
refactor: convert skills to collector-dev plugin with scoped tool per…
robbycochran Mar 18, 2026
e4d5edd
feat: create branch and draft PR upfront, tighten iterate permissions
robbycochran Mar 18, 2026
58bbf24
feat: add watch-ci skill for CI monitoring loop
robbycochran Mar 18, 2026
1c23503
feat: add end-to-end task skill with CI monitoring loop
robbycochran Mar 18, 2026
74e2ec9
fix: load collector-dev plugin via --plugin-dir flag
robbycochran Mar 18, 2026
5472fbf
feat: stream agent activity to stdout in autonomous mode
robbycochran Mar 18, 2026
f311442
feat: add --local mode for debugging without worktree or PR
robbycochran Mar 18, 2026
282196e
feat: add --headless mode (worktree + stream-json, no PR)
robbycochran Mar 18, 2026
011bc6d
fix: initialize submodules in worktree after creation
robbycochran Mar 18, 2026
a586721
fix: only init required submodules, drop --recursive
robbycochran Mar 18, 2026
9 changes: 9 additions & 0 deletions .claude/plugins/collector-dev/.claude-plugin/plugin.json
@@ -0,0 +1,9 @@
{
  "name": "collector-dev",
  "description": "Collector development workflows — build, test, CI status, and PR management",
  "version": "1.0.0",
  "author": {
    "name": "RHACS Collector Team"
  },
  "repository": "https://github.com/stackrox/collector"
}
8 changes: 8 additions & 0 deletions .claude/plugins/collector-dev/.mcp.json
@@ -0,0 +1,8 @@
{
  "mcpServers": {
    "github": {
      "type": "http",
      "url": "https://api.githubcopilot.com/mcp/"
    }
  }
}
44 changes: 44 additions & 0 deletions .claude/plugins/collector-dev/skills/build/SKILL.md
@@ -0,0 +1,44 @@
---
name: build
description: Build collector binary with options (debug, asan, tsan, clean)
allowed-tools: Bash(cmake *), Bash(make *), Bash(nproc), Bash(git describe *), Bash(strip *), Read, Glob
---

# Build Collector

Build the collector binary. Supports optional arguments:
- `debug` — Debug build with symbols
- `asan` — AddressSanitizer build
- `tsan` — ThreadSanitizer build
- `clean` — Clean build directory first
- (no args) — Release build

## Steps

1. Determine build environment:
- If inside the devcontainer (check: `DEVCONTAINER=true` env var), run cmake directly.
- If on the host (macOS), use `make start-builder && make collector`.

2. If `clean` argument is provided, remove `cmake-build/` directory first.

3. Set build variables based on arguments (sanitizer options stay `OFF` unless selected):
- `debug`: `CMAKE_BUILD_TYPE=Debug`
- `asan`: `CMAKE_BUILD_TYPE=Debug`, `ADDRESS_SANITIZER=ON`
- `tsan`: `CMAKE_BUILD_TYPE=Debug`, `THREAD_SANITIZER=ON`
- default: `CMAKE_BUILD_TYPE=Release`

4. Run cmake configure (if `cmake-build/` doesn't exist or CMakeLists.txt changed). Unset variables fall back to the defaults above:
```bash
# Quote every value and supply explicit defaults so an unset variable
# never produces an empty -D option.
cmake -S . -B cmake-build \
  -DCMAKE_BUILD_TYPE="${CMAKE_BUILD_TYPE:-Release}" \
  -DADDRESS_SANITIZER="${ADDRESS_SANITIZER:-OFF}" \
  -DTHREAD_SANITIZER="${THREAD_SANITIZER:-OFF}" \
  -DCOLLECTOR_VERSION="$(git describe --tags --abbrev=10 --long)"
```

5. Run cmake build:
```bash
cmake --build cmake-build -- -j$(nproc)
```

6. Report result: success with binary size, or failure with the first error and its file:line.
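
The argument-to-variable mapping in step 3 could be sketched as a small shell helper. This is an illustration only, not part of the plugin: the function name `build_vars` and its one-line output format are hypothetical, and `OFF` as the value for an unselected sanitizer is an assumption. The `clean` argument is handled separately in step 2, so it is not covered here.

```bash
# Hypothetical helper: map a build argument to the cmake variables from step 3.
# Prints "CMAKE_BUILD_TYPE ADDRESS_SANITIZER THREAD_SANITIZER" on one line.
build_vars() {
  case "${1:-}" in
    debug) echo "Debug OFF OFF" ;;
    asan)  echo "Debug ON OFF" ;;
    tsan)  echo "Debug OFF ON" ;;
    "")    echo "Release OFF OFF" ;;   # no argument: release build
    *)     echo "unknown build argument: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown arguments up front means a typo such as `asna` fails before cmake is invoked, rather than silently producing a release build.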
34 changes: 34 additions & 0 deletions .claude/plugins/collector-dev/skills/ci-status/SKILL.md
@@ -0,0 +1,34 @@
---
name: ci-status
description: Check CI status on current PR, fetch failure logs, diagnose issues
allowed-tools: Bash(git branch *), Bash(git log *), mcp__github__search_pull_requests, mcp__github__pull_request_read, mcp__github__actions_list, mcp__github__actions_get, mcp__github__get_job_logs, Read
---

# CI Status

Check CI pipeline status for the current branch/PR and diagnose failures.

## Steps

1. Get the current branch name from git.

2. Use `mcp__github__search_pull_requests` to find an open PR for this branch
in `stackrox/collector`.

3. If a PR exists, use `mcp__github__pull_request_read` to get its check status.

4. Use `mcp__github__actions_list` to get workflow runs for the branch.

5. For any **failed runs**:
- Use `mcp__github__actions_get` to get the run details
- Use `mcp__github__get_job_logs` to fetch failure logs
- Identify which workflow failed (unit-tests, integration-tests, k8s-integration-tests, lint)
- For integration test failures, identify which VM type and test suite failed

6. **Diagnose** the failure:
- Unit test failure: show the failing assertion and relevant source file
- Integration test failure: distinguish infra issues (VM creation, timeout) from test failures
- Lint failure: show which files need formatting
- Build failure: show the compiler error with file:line

7. **Suggest next steps**: what code changes would fix the failure, or note if it's flaky/infra.
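
The infra-versus-test distinction in step 6 could be approximated with a log filter like the one below. This is a rough sketch under stated assumptions: `classify_failure` is a hypothetical name, and the patterns are illustrative guesses, not strings taken from real collector CI logs.

```bash
# Hypothetical triage helper: reads a job log on stdin and prints
# "infra" when it sees a pattern that suggests an infrastructure problem,
# otherwise "test". The patterns below are assumptions for illustration.
classify_failure() {
  if grep -qiE 'vm creation|timed out|connection (reset|refused)'; then
    echo "infra"
  else
    echo "test"
  fi
}
```

A filter like this is only a first pass; a log that matches none of the patterns still needs to be read before concluding that the code is at fault.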
42 changes: 42 additions & 0 deletions .claude/plugins/collector-dev/skills/iterate/SKILL.md
@@ -0,0 +1,42 @@
---
name: iterate
description: Full development cycle — build, unit test, format check, commit, push to existing branch
allowed-tools: Bash(cmake *), Bash(make *), Bash(ctest *), Bash(nproc), Bash(git *), Bash(clang-format *), Read, Write, Edit, Glob, Grep, mcp__github__pull_request_read, mcp__github__actions_list, mcp__github__actions_get, mcp__github__get_job_logs
---

# Iterate

Run the full development inner loop. The branch and PR already exist — just build, test, and push.
Stops at the first build or unit test failure; formatting issues are fixed automatically.

## Steps

1. **Build** the collector:
- Detect environment (devcontainer vs host)
- In devcontainer: `cmake -S . -B cmake-build -DCMAKE_BUILD_TYPE=Release -DCOLLECTOR_VERSION=$(git describe --tags --abbrev=10 --long) && cmake --build cmake-build -- -j$(nproc)`
- On host: `make collector`
- **Stop on failure** — report the compiler error with file:line.

2. **Unit test**:
- In devcontainer: `ctest --no-tests=error -V --test-dir cmake-build`
- On host: `make unittest`
- **Stop on failure** — report which test failed and the assertion.

3. **Format check** (C++ files changed in this branch only):
- Get changed C++ files: `git diff --name-only origin/master...HEAD | grep -E '\.(cpp|h)$'`
- Run: `clang-format --style=file -n --Werror <files>`
- If formatting issues found, auto-fix them: `clang-format --style=file -i <files>`
- Report what was fixed.

4. **Commit**:
- Stage changed files (source + any format fixes)
- Create a commit with a descriptive message summarizing the changes

5. **Push**:
- `git push` to the existing branch (branch and PR already created by run.sh)
- Do NOT create new branches or PRs

6. **Check CI**:
- Use `mcp__github__actions_list` to see if CI has started
- Report the PR URL and note that CI is running
- Use `/collector-dev:ci-status` for detailed CI results once checks complete
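
The format check in step 3 could be sketched as follows. The function names `filter_cpp_files` and `format_changed_files` are hypothetical, and the sketch assumes file paths without spaces. One detail worth noting: `grep` exits non-zero when no C++ files changed, so the filter swallows that status to avoid treating "nothing to format" as a failure.

```bash
# Hypothetical helper: keep only C++ sources and headers from a list of
# paths on stdin. Succeeds even when nothing matches.
filter_cpp_files() {
  grep -E '\.(cpp|h)$' || true
}

# Sketch of step 3: dry-run check first, auto-fix only if the check fails.
format_changed_files() {
  local files
  files=$(git diff --name-only origin/master...HEAD | filter_cpp_files)
  if [ -z "$files" ]; then
    return 0   # no C++ files changed on this branch
  fi
  # Word splitting of $files is intentional (assumes no spaces in paths).
  clang-format --style=file -n --Werror $files \
    || clang-format --style=file -i $files
}
```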
82 changes: 82 additions & 0 deletions .claude/plugins/collector-dev/skills/task/SKILL.md
@@ -0,0 +1,82 @@
---
name: task
description: End-to-end autonomous workflow — implement a task, push, monitor CI, fix failures until green
disable-model-invocation: true
allowed-tools: Bash(cmake *), Bash(make *), Bash(ctest *), Bash(nproc), Bash(git *), Bash(clang-format *), Bash(sleep *), Read, Write, Edit, Glob, Grep, Agent, mcp__github__pull_request_read, mcp__github__search_pull_requests, mcp__github__actions_list, mcp__github__actions_get, mcp__github__get_job_logs
---

# Task

Complete a development task end-to-end: implement, build, test, push, and monitor CI until all checks pass.

## Input

The task description is provided via `$ARGUMENTS` or in the initial prompt context (branch name, PR URL, task).

## Workflow

### Phase 1: Implement

1. Read and understand the task
2. Explore relevant code in the repository
3. Implement the changes
4. Build the collector:
- In devcontainer: `cmake -S . -B cmake-build -DCMAKE_BUILD_TYPE=Release -DCOLLECTOR_VERSION=$(git describe --tags --abbrev=10 --long) && cmake --build cmake-build -- -j$(nproc)`
- On host: `make collector`
- If build fails, fix and retry
5. Run unit tests:
- In devcontainer: `ctest --no-tests=error -V --test-dir cmake-build`
- On host: `make unittest`
- If tests fail, fix and retry
6. Format check:
- `git diff --name-only origin/master...HEAD | grep -E '\.(cpp|h)$'` to find changed files
- `clang-format --style=file -i <files>` to fix formatting
7. Commit and push:
- `git add` the changed files
- `git commit` with a descriptive message
- `git push`

### Phase 2: Monitor CI

After pushing, enter a monitoring loop. CI typically takes 30-90 minutes.

**Loop** (repeat until all checks pass or blocked):

1. Wait 10 minutes: `sleep 600`
2. Check CI status:
- Get current branch: `git branch --show-current`
- Use `mcp__github__search_pull_requests` to find the PR
- Use `mcp__github__actions_list` to get workflow runs
- Use `mcp__github__pull_request_read` for check status

3. Evaluate:

**All checks passed** → report success and stop

**Checks still running** → report progress ("X of Y complete"), continue loop

**Checks failed** →
- Use `mcp__github__actions_get` and `mcp__github__get_job_logs` to get failure logs
- Diagnose the failure:
- Build failure: read error, fix code
- Unit test failure: read assertion, fix code
- Lint failure: run clang-format
- Integration test infra flake (VM timeout, network): report as flake, continue loop
- Integration test real failure: analyze and fix code
- If fixable: fix → build → unit test → commit → push → continue loop
- If not fixable: report diagnosis and stop

4. Safety limits:
- Maximum 6 CI cycles (about 3 hours of monitoring)
- If exceeded, report status and stop

### Completion

End with a summary:
```
STATUS: PASSED | BLOCKED | TIMEOUT
Branch: claude/agent-xxx
PR: <url>
Cycles: N
Changes: list of files modified
```
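
The control flow of the Phase 2 loop, including the safety limit, could be sketched in shell as below. This is a simplified illustration: `monitor_ci`, `check_ci`, `MAX_CYCLES`, and `POLL_INTERVAL` are hypothetical names, `check_ci` is a stand-in the caller must define (printing `passed`, `running`, or `failed`), and each poll is counted as one cycle, which is an assumption about how cycles are counted. Unlike the skill, the sketch stops at the first failure instead of attempting a fix.

```bash
# Hypothetical sketch of the Phase 2 loop with a cycle cap.
# check_ci must be defined by the caller and print: passed | running | failed
monitor_ci() {
  local max_cycles=${MAX_CYCLES:-6}
  local interval=${POLL_INTERVAL:-600}
  local cycle=0 state
  while [ "$cycle" -lt "$max_cycles" ]; do
    cycle=$((cycle + 1))
    sleep "$interval"
    state=$(check_ci)
    case "$state" in
      passed) echo "STATUS: PASSED (cycles: $cycle)"; return 0 ;;
      failed) echo "STATUS: BLOCKED (cycles: $cycle)"; return 1 ;;
    esac
    # "running": fall through and poll again
  done
  echo "STATUS: TIMEOUT (cycles: $cycle)"
  return 2
}
```

Keeping the cap and the poll interval as separate values makes the total monitoring time easy to reason about: it is at most their product, plus whatever time the checks themselves take.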
54 changes: 54 additions & 0 deletions .claude/plugins/collector-dev/skills/watch-ci/SKILL.md
@@ -0,0 +1,54 @@
---
name: watch-ci
description: Check CI status and react to failures — diagnose, fix, rebuild, push. Designed to run in a loop.
allowed-tools: Bash(cmake *), Bash(make *), Bash(ctest *), Bash(nproc), Bash(git *), Bash(clang-format *), Read, Write, Edit, Glob, Grep, mcp__github__pull_request_read, mcp__github__search_pull_requests, mcp__github__actions_list, mcp__github__actions_get, mcp__github__get_job_logs
---

# Watch CI

Monitor CI for the current branch's PR and react to failures. Designed to be run
with `/loop 30m /collector-dev:watch-ci`.

## Steps

1. **Find the PR** for the current branch:
- Get branch name: `git branch --show-current`
- Use `mcp__github__search_pull_requests` to find the open PR in `stackrox/collector`
- If no PR found, report and stop

2. **Check CI status**:
- Use `mcp__github__pull_request_read` to get check status
- Use `mcp__github__actions_list` to get workflow runs

3. **Evaluate state and act**:

**If all checks pass:**
- Report: "All CI checks passed. PR is ready for review."
- Stop — no further action needed

**If checks are still running:**
- Report: "CI still running (X of Y checks complete). Will check again next loop."
- Stop — wait for next loop iteration

**If checks failed:**
- Use `mcp__github__actions_get` and `mcp__github__get_job_logs` to get failure details
- Identify the failure type:
- **Build failure**: read compiler error, find the file:line, fix the code
- **Unit test failure**: read the assertion, find the test and source, fix the code
- **Integration test failure**: determine if it's a real failure or infra flake
- If infra flake (VM creation timeout, network issue): report and skip
- If real test failure: analyze the test expectation vs actual, fix the code
- **Lint failure**: run `clang-format --style=file -i` on the affected files
- After fixing:
- Build: `cmake --build cmake-build -- -j$(nproc)`
- Unit test: `ctest --no-tests=error -V --test-dir cmake-build`
- If build+test pass: `git add`, `git commit`, `git push`
- Report what was fixed and that a new CI run should start
- If the failure can't be fixed automatically, report the diagnosis and stop

4. **Summary**: always end with a clear status line:
- `PASSED` — all checks green
- `PENDING` — checks still running, will retry
- `FIXED` — failure diagnosed and fix pushed, awaiting new CI run
- `FLAKE` — infra failure, not a code issue
- `BLOCKED` — failure requires human intervention
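
The three-way evaluation in step 3 could be reduced to a helper like this one. It is a sketch under stated assumptions: `summarize_checks` is a hypothetical name, and the input format (one conclusion per line, each `success`, `failure`, or `pending`) is invented for illustration and is not the output format of the GitHub MCP tools. `FAILED` here is an intermediate state that triggers diagnosis, not one of the final status lines listed above.

```bash
# Hypothetical helper: reduce check conclusions (one per line on stdin) to a
# single state. Any failure wins, then any pending check; an empty list is
# treated as PENDING because no check has reported yet.
summarize_checks() {
  local line seen=0 failed=0 pending=0
  while IFS= read -r line; do
    seen=1
    case "$line" in
      failure) failed=1 ;;
      pending) pending=1 ;;
    esac
  done
  if [ "$failed" -eq 1 ]; then
    echo "FAILED"
  elif [ "$seen" -eq 0 ] || [ "$pending" -eq 1 ]; then
    echo "PENDING"
  else
    echo "PASSED"
  fi
}
```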
12 changes: 12 additions & 0 deletions .claude/settings.json
@@ -0,0 +1,12 @@
{
  "permissions": {
    "deny": [
      "Read(.devcontainer/**)",
      "mcp__github__merge_pull_request",
      "mcp__github__delete_file",
      "mcp__github__fork_repository",
      "mcp__github__create_repository",
      "mcp__github__actions_run_trigger"
    ]
  }
}
80 changes: 80 additions & 0 deletions .devcontainer/Dockerfile
@@ -0,0 +1,80 @@
# Collector development container
# Based on the collector-builder image which has all C++ dependencies pre-installed.
# Adds Claude Code, Go, and developer tooling for agent-driven development.
#
# Build environment: CentOS Stream 10 with clang, llvm, cmake, grpc, protobuf,
# libbpf, bpftool, and all other collector dependencies.

ARG COLLECTOR_BUILDER_TAG=master
FROM quay.io/stackrox-io/collector-builder:${COLLECTOR_BUILDER_TAG}

# Install developer tooling not in the builder image
# Note: git, findutils, which, openssh-clients already in builder
# bubblewrap: Claude Code uses this for built-in command sandboxing
RUN dnf install -y \
        bubblewrap \
        jq \
        socat \
        zsh \
        procps-ng \
        sudo \
        python3-pip \
        iptables \
        ipset \
    && dnf clean all

# Determine architecture strings used by various download URLs
# uname -m gives aarch64 or x86_64
# Go uses arm64/amd64, ripgrep/fd use aarch64/x86_64
# ripgrep 14.1.1 ships its x86_64 Linux build as musl and its aarch64 build as gnu
RUN ARCH=$(uname -m) \
    && GOARCH=$([ "$ARCH" = "aarch64" ] && echo "arm64" || echo "amd64") \
    && RG_TARGET=$([ "$ARCH" = "aarch64" ] && echo "aarch64-unknown-linux-gnu" || echo "x86_64-unknown-linux-musl") \
    # Install Go
    && curl -fsSL "https://go.dev/dl/go1.23.6.linux-${GOARCH}.tar.gz" | tar -C /usr/local -xzf - \
    # Install ripgrep
    && curl -fsSL "https://github.com/BurntSushi/ripgrep/releases/download/14.1.1/ripgrep-14.1.1-${RG_TARGET}.tar.gz" \
       | tar -xzf - --strip-components=1 -C /usr/local/bin "ripgrep-14.1.1-${RG_TARGET}/rg" \
    # Install fd
    && curl -fsSL "https://github.com/sharkdp/fd/releases/download/v10.2.0/fd-v10.2.0-${ARCH}-unknown-linux-gnu.tar.gz" \
       | tar -xzf - --strip-components=1 -C /usr/local/bin "fd-v10.2.0-${ARCH}-unknown-linux-gnu/fd"

ENV PATH="/usr/local/go/bin:${PATH}"
ENV GOPATH="/home/dev/go"
ENV PATH="${GOPATH}/bin:${PATH}"

# Install Node.js (needed for Claude Code)
ARG NODE_VERSION=22
RUN curl -fsSL https://rpm.nodesource.com/setup_${NODE_VERSION}.x | bash - \
    && dnf install -y nodejs \
    && dnf clean all

# Install Claude Code
RUN npm install -g @anthropic-ai/claude-code

# Install gcloud CLI (for Vertex AI auth and GCP VM management)
RUN curl -fsSL https://sdk.cloud.google.com > /tmp/install-gcloud.sh \
    && bash /tmp/install-gcloud.sh --disable-prompts --install-dir=/opt \
    && rm /tmp/install-gcloud.sh
ENV PATH="/opt/google-cloud-sdk/bin:${PATH}"

# GitHub MCP server (used by Claude Code for GitHub operations): nothing is installed here.
# The collector-dev plugin configures it as a remote HTTP server in
# .claude/plugins/collector-dev/.mcp.json

# Create non-root dev user with passwordless sudo
RUN useradd -m -s /bin/zsh dev \
    && echo "dev ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers.d/dev

# Install ansible for VM-based testing (optional, lightweight)
RUN pip3 install ansible-core

# Firewall script for network isolation (optional, used with --dangerously-skip-permissions)
COPY init-firewall.sh /usr/local/bin/init-firewall.sh
RUN chmod +x /usr/local/bin/init-firewall.sh

USER dev
WORKDIR /workspace

# Persist shell history and Claude state across rebuilds (volumes in devcontainer.json)
ENV HISTFILE=/home/dev/.commandhistory/.zsh_history

ENV SHELL=/bin/zsh
ENV DEVCONTAINER=true