feat(skills): gate deploys on catalogue consistency and SkillSpector scan by DevelopmentCats · Pull Request #909 · coder/registry

DevelopmentCats · 2026-06-02T21:41:09Z

Summary

Adds a two-stage gate in front of every registry deploy so the agent-skills
catalogue cannot ship broken or unsafe content:

Catalogue consistency. Extends cmd/readmevalidation with a
READMEVALIDATION_ONLINE=1 mode that shallow-clones every declared
owner/repo@ref, verifies every sources[].skills slug exists upstream
as skills/<slug>/SKILL.md, validates each SKILL.md frontmatter
against the agentskills.io v0.2.0 specification (required name and
description), and rejects duplicate slugs across sources within the
same namespace. Strict gate, blocks on failure.
Security scan. Runs NVIDIA SkillSpector
(static-only, --no-llm) over each unique upstream source repo and
uploads SARIF to GitHub Code scanning. Warn-only gate for now, see
"Initial scan policy" below.

Sequential gating: offline catalogue checks → online catalogue checks →
SkillSpector. If anything fails (including the SkillSpector gate once it
is flipped to strict), the deploy's gcloud builds triggers run step
never fires and a single rolling tracker issue (label skills-gate) is
opened or updated by JasonEtco/create-an-issue, so repeat failures
update the same issue rather than spamming.

Initial scan policy: warn-only

SkillSpector flags 13 error-level findings against coder/skills@main
on the first run, all in the intentionally credential-handling setup
skill:

5x PE3 Credential Access in skills/setup/scripts/github-device-fetch.sh
1x PE3 Credential Access in skills/setup/scripts/github-device-poll.sh
1x E2 Env Variable Harvesting in skills/setup/references/coder-agents.md
2x PE3 Credential Access in test/claude.sh, test/codex.sh
4 others on the same setup files

These are real pattern matches but expected behavior for an installer
skill that orchestrates a GitHub device-flow login. Rather than
rubber-stamping them or holding this PR for upstream cleanup, the
severity step is shipped as continue-on-error: true. Findings still
flow to Code scanning so they show up as PR annotations and Security
tab alerts. A follow-up PR can:

Tune coder/skills (add SkillSpector ignore comments, move test
scripts out of the scan path, restructure the device-flow helpers).
Flip continue-on-error: false in both ci.yaml and
deploy-registry.yaml to make the gate strict.

What this solves

The registry-server build pipeline currently drops catalogue overrides
for upstream slugs it cannot find, silently. Upstream rename, deletion,
or branch-flip is invisible until a contributor notices a 404. This PR
makes the gap loud and blocks the deploy on the consistency check.

What this does NOT solve

Immutability. Sources are still @main. Pinning to SHAs or tags
depends on coder/registry-server PR chore: remove it wrappers from required variables tests #442 follow-up chore: add CI step for validating contributor READMEs #1 landing first
(the cloneSkillSourceRepo branch-only bug).
Race window. The gate scans upstream at CI time; the deploy clones
upstream at deploy time. A future follow-up can move the scan inside
the deploy job to close it further.

Files

Path	Change
`cmd/readmevalidation/coderskills.go`	New online verification path (`validateAllCoderSkillsOnline`, `fetchSkillsSourceContent`, `verifyReadmeAgainstSources`, `validateSkillMarkdownFrontmatter`). Factors `validateAllCoderSkills` to expose parsed readmes. Uses the system `git` binary via `exec.Command` to avoid pulling in a Go Git library.
`cmd/readmevalidation/main.go`	Runs the online check after the offline check when `READMEVALIDATION_ONLINE=1`.
`cmd/readmevalidation/readmefiles.go`	Adds `validationPhaseUpstream`.
`scripts/scan-skill-sources.sh`	Iterates skills READMEs, extracts unique `repo@ref` tuples, runs SkillSpector per source, writes SARIF. Treats missing-SARIF as the only true tool crash; severity gating lives in the workflow.
`.github/workflows/ci.yaml`	`dorny/paths-filter` on `validate-readme-files` + new `scan-skills` job, both skipped on PRs that do not touch skills paths. SkillSpector severity step is `continue-on-error: true`.
`.github/workflows/deploy-registry.yaml`	Refactored into `verify` → `deploy` + `open-issue-on-failure`. Same SkillSpector severity policy as `ci.yaml`.
`.github/ISSUE_TEMPLATE/skills-failure.md`	Single tracker template used by `JasonEtco/create-an-issue` with `update_existing: true`.

Verification

Local, before pushing, against current coder/skills@main:

$ go build ./cmd/readmevalidation
$ READMEVALIDATION_ONLINE=1 ./readmevalidation
...
[info]  verifying upstream skill sources  num_unique_sources=1
[info]  upstream skill source verification passed  num_files=1  num_sources=1

Failure path (injected nonexistent-slug into sources[].skills):

exit=1
[erro]  Error during "Upstream skill source verification" phase of README validation:
   - "registry/coder/skills/README.md": sources[0].skills["nonexistent-slug"]:
     slug not found upstream at coder/skills/skills/nonexistent-slug/SKILL.md
     (repo=coder/skills ref=main)

CI status on this PR: all seven checks green, including the new
Validate README files (online catalogue check), Scan skill sources with SkillSpector (warn-only gate, SARIF uploaded), and skillspector
(Code scanning ingest). Validate Terraform output flaked twice on
registry.terraform.io provider downloads (Error while installing coder/coder v2.18.0: ... failed to retrieve authentication checksums for provider); passed on the third retry.

Open follow-ups

Move SkillSpector from a git-pinned commit to a tagged release when
one is available. Currently pinned at NVIDIA/SkillSpector
2eb844780ab163f01468ecf142c40a2ec0fcaec0; SkillSpector is not on
PyPI yet.
Triage the warn-only SkillSpector findings in coder/skills@main,
then flip continue-on-error: false in both workflows.
Once cloneSkillSourceRepo in coder/registry-server accepts tags
and SHAs, switch sources from @main to immutable refs and add a
Renovate-style bumper.

Implementation plan and decision log

The full plan that drove this PR (option space, the choice to shell out
to git rather than vendor go-git, why we extend cmd/readmevalidation
instead of adding a new binary, severity policy reasoning) lives in
plans/skills-scan-pr-plan.md and is available on request.

This PR was created with help from Coder Agents.

Adds a READMEVALIDATION_ONLINE=1 mode to cmd/readmevalidation that shallow-clones every owner/repo@ref declared in registry/<namespace>/skills/README.md and verifies: - every catalogue override slug exists upstream as skills/<slug>/SKILL.md, - each upstream SKILL.md has the required name and description frontmatter per the agentskills.io v0.2.0 specification, - no slug is provided by more than one source within the same namespace. The mode is off by default so go run ./cmd/readmevalidation keeps working without network access. CI enables it when network is available. Lifts the shallow-clone shape from coder/registry-server/cmd/build/skills.go so the verifier matches what the deploy pipeline will actually ingest. Branch-only refs for now, matching that pipeline's current behavior.

New helper that walks every registry/<namespace>/skills/README.md, extracts each unique owner/repo@ref from the YAML frontmatter, and runs NVIDIA SkillSpector over the upstream GitHub URL. One SARIF file per source is written to ./sarif (configurable via the first positional argument). The script does not gate on findings. It exits non-zero only when skillspector itself crashes. Severity gating lives in the workflow so the deploy policy is visible alongside the deploy step.

Extends the existing validate-readme-files CI job with an online phase that runs cmd/readmevalidation with READMEVALIDATION_ONLINE=1 when the PR touches skills paths. Adds a scan-skills job that downloads and runs NVIDIA SkillSpector over every declared upstream source and uploads SARIF to GitHub Code scanning, failing on findings at the error level. Refactors deploy-registry into three jobs: - verify: runs the same online validator + SkillSpector before deploy. - deploy: now needs: verify, so gcloud builds triggers run only fires when the catalogue is consistent and the scan is clean. - open-issue-on-failure: posts to a single rolling tracker issue (label skills-gate) via JasonEtco/create-an-issue with update_existing, so repeat failures update the same issue rather than spamming. A dorny/paths-filter step drives both the online phase and the scan-skills job so PRs that do not touch skills pay no extra cost.

- Drop go-git/v5 dependency and use the system git binary via exec.Command. The library pulled go 1.25.0 transitively, breaking golangci-lint v2.1.6 which is built against go 1.24. - Apply prettier-plugin-sh formatting to scan-skill-sources.sh. - Disable actions/setup-go's default module cache in the deploy verify job to satisfy zizmor's cache-poisoning rule on jobs that hold security-events: write.

SkillSpector is not on PyPI yet, so the install must reference the NVIDIA/SkillSpector repository directly. Pinned at commit 2eb844780ab163f01468ecf142c40a2ec0fcaec0 to keep the policy reproducible. Will switch to a pip-published version when one ships. Also bumps actions/setup-python from v5.6.0 to v6.2.0 to get off the deprecated Node 20 action runtime, and pins the latest pip before installing skillspector so its hatchling build env has the recent packaging release that supports the SPDX license expression in its pyproject.toml.

SkillSpector exits non-zero when its scan finds anything to report, even though it still writes the SARIF file. Treating that as a crash short-circuits SARIF upload and prevents the workflow's severity jq step from making the policy decision. Inverts the check: scan_failed is now set only when the SARIF file is missing or empty, which is the genuine 'tool crashed' signal.

github-advanced-security · 2026-06-02T22:10:25Z

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

Will be removed once the SkillSpector severity policy is locked in.

SkillSpector flags 13 error-level findings against coder/skills@main, all in the intentionally credential-handling setup skill (GitHub device-flow scripts and references). These are real patterns matching SkillSpector's PE3 'Credential Access' and E2 'Env Variable Harvesting' rules, but they describe expected behavior for an installer skill rather than a vulnerability. Initial policy is warn-only: SARIF still flows to Code scanning so the findings are visible on PRs, the severity gate still runs and prints, but continue-on-error keeps the workflow green. A follow-up PR can either tune the rules in coder/skills, add SkillSpector suppressions, or flip continue-on-error to false to make the gate strict. Also drops the debug upload-artifact step now that the SARIF shape is known.

bpmct

how does this scanning help us if its at deploy time and there are dynamically changing repos?

DevelopmentCats · 2026-06-03T20:43:39Z

Member

Its probably not the most elegant solution, but the idea is that anytime we would deploy the registry it would run this skill verification on each skills repo and if it fails the deployment will just fail and create an issue. I was fiddling with some stuff while researching on skill scanning last night and how we would implement this, and this is a WIP currently

DevelopmentCats added 6 commits June 2, 2026 21:37

DevelopmentCats added 2 commits June 2, 2026 22:14

tmp(ci): upload SARIF as artifact for triage

8b33e21

Will be removed once the SkillSpector severity policy is locked in.

bpmct reviewed Jun 2, 2026

View reviewed changes

DevelopmentCats closed this Jun 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(skills): gate deploys on catalogue consistency and SkillSpector scan#909

feat(skills): gate deploys on catalogue consistency and SkillSpector scan#909
DevelopmentCats wants to merge 8 commits into
mainfrom
cat/skills-scan-gate

DevelopmentCats commented Jun 2, 2026 •

edited

Loading

Uh oh!

github-advanced-security AI commented Jun 2, 2026

Uh oh!

bpmct left a comment

Uh oh!

DevelopmentCats commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

DevelopmentCats commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Initial scan policy: warn-only

What this solves

What this does NOT solve

Files

Verification

Open follow-ups

Uh oh!

github-advanced-security AI commented Jun 2, 2026

What Enabling Code Scanning Means:

Uh oh!

bpmct left a comment

Choose a reason for hiding this comment

Uh oh!

DevelopmentCats commented Jun 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

DevelopmentCats commented Jun 2, 2026 •

edited

Loading