Skip to content

Add event-driven repo brancher controller#5266

Open
jmguzik wants to merge 1 commit into
openshift:mainfrom
jmguzik:manager_repo
Open

Add event-driven repo brancher controller#5266
jmguzik wants to merge 1 commit into
openshift:mainfrom
jmguzik:manager_repo

Conversation

@jmguzik

@jmguzik jmguzik commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

This introduces a long-running controller that replaces periodic repository branch forwarding. It derives default-branch and release-branch relationships from strictly validated configuration and ci-operator promotion data. It consumes signed push events through Prow Hook while retaining periodic reconciliation for missed deliveries. It performs non-forced GitHub ref updates with scoped retries, rate-limit handling, health checks, and metrics. Unit, concurrency, webhook, configuration, and GitHub client tests cover the controller behavior.

New repo-brancher-controller Component

This PR introduces a new long-running controller (repo-brancher-controller) that replaces the existing periodic batch-style branch forwarding mechanism for OpenShift CI repositories. The controller shifts from a time-based to an event-driven model.

What Changed

The controller derives branch forwarding relationships (which branches to update and where) from:

  • Ci-operator promotion configuration (ci-operator config directory)
  • Explicit forwarding rules defined in forwarding.yaml configuration files
  • These relationships include default-branch forwarding (e.g., forward main to release-* branches) and release-branch forwarding

How It Works

Event-driven processing: The controller consumes GitHub push webhooks through Prow's webhook infrastructure (repo-brancher-controller registers as an external Prow plugin). When a source branch is pushed, the controller immediately reconciles all configured target branches.

Safe, non-forced updates: The controller updates target Git refs via the GitHub API with force: false, ensuring it only performs fast-forward operations. This prevents overwriting work on target branches.

Resilience:

  • Periodic reconciliation runs as a fallback to catch any missed webhook deliveries
  • Scoped retry logic with different backoff strategies for rate-limiting vs permission errors
  • Health/readiness checks track configuration reloads, GitHub connectivity, and rate-limit status
  • Rate-limit handling respects GitHub's Retry-After and X-RateLimit-* headers

Configuration & Operation

Forwarding configuration: Define which source branches forward to which targets (e.g., release-4.14release-4.15) in structured YAML with validation for release version formats, duplicate targets, and org/repo ignore lists.

Authentication: Supports both direct GitHub token and GitHub App authentication modes.

Observability:

  • Prometheus metrics for reconciliation outcomes, GitHub API operations, webhook processing, and queue depth
  • Health endpoints for readiness gating
  • Structured logging of forwarding actions

Prow integration: The controller validates that it's registered as an external plugin in the Prow plugin configuration, ensuring webhook delivery is properly configured.

Testing

The implementation includes comprehensive test coverage: unit tests for configuration loading, controller reconciliation logic, concurrency safety, webhook handling, GitHub client operations, and plugin configuration validation.

This introduces a long-running controller that replaces periodic
repository branch forwarding. It derives default-branch and release-branch
relationships from strictly validated configuration and ci-operator promotion data.
It consumes signed push events through Prow Hook while retaining periodic
reconciliation for missed deliveries. It performs non-forced GitHub ref updates
with scoped retries, rate-limit handling, health checks, and metrics.
Unit, concurrency, webhook, configuration, and GitHub client tests cover the controller behavior.

Signed-off-by: Jakub Guzik <jguzik@redhat.com>
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

Adds cmd/repo-brancher-controller, a new event-driven controller that replaces repo-brancher. It reads ci-operator configs and a forwarding YAML to build a desired-state map of source→target branch pairs, processes GitHub push webhooks to fast-forward target refs via the GitHub API, and guards against missed webhooks with initial reconciliation and configurable full resyncs.

Changes

repo-brancher-controller

Layer / File(s) Summary
Forwarding config schema and validation
cmd/repo-brancher-controller/forwarding.go, cmd/repo-brancher-controller/forwarding_test.go
Defines forwardingConfig/defaultBranchForwarding/releaseBranchForwarding structs; loadForwardingConfig reads and strictly unmarshals YAML; validate enforces release format, non-empty/non-duplicate targets, source≠target, and ignore-list shape; isIgnored checks org and org/repo patterns. Tests cover loading, validation error cases, and ignore matching.
Desired state model and controller reconciliation
cmd/repo-brancher-controller/controller.go, cmd/repo-brancher-controller/controller_test.go
Defines repoKey, desiredState with per-key mutexes, replace/get/matching/ifTargetConfigured accessors; controller wraps a rate-limited workqueue with permanentError/rateLimitError/reconciliationErrors types; reconcile fetches source SHA and creates/updates target refs; retry schedules requeues with backoff. Tests cover ref create/update, deduplication, sibling error aggregation, removed-target guard, and replacement/mutation concurrency.
GitHub API client with rate limiting
cmd/repo-brancher-controller/github.go, cmd/repo-brancher-controller/github_test.go, cmd/repo-brancher-controller/auth.go
githubRefClient validates HTTPS endpoints, loads bearer tokens, implements getRef/createRef/updateRef (with concurrent-create recovery), waitForRateLimit/blockUntil, retryAt from response headers, and maps 401/403/422/429 to typed errors with health signaling. newAppTokenGenerator builds a GitHub App JWT client. Tests cover ref ops, 422 permanent classification, concurrent-create recovery, rate-limit parsing, and HTTPS enforcement.
Runtime health and Prometheus metrics
cmd/repo-brancher-controller/health.go, cmd/repo-brancher-controller/health_test.go, cmd/repo-brancher-controller/metrics.go
runtimeHealth tracks server/config/GitHub/rate-limit state atomically; ready(now) gates readiness on freshness and auth/availability flags. metrics.go declares and registers counter vectors, a histogram, and gauges for reconciliation, GitHub API requests, webhooks, queue depth, and timestamps. Tests verify readiness transitions and rate-limit pause capping.
ci-operator config → desired state
cmd/repo-brancher-controller/config.go, cmd/repo-brancher-controller/config_test.go
loadDesiredState walks a ci-operator config directory, filters by ignore lists and promotesOfficialImages, maps main/master and release-*/openshift-* branches to forwarding targets via promotion.DetermineReleaseBranch, and returns a map[repoKey]sets.Set[string]. Tests verify category-scoped ignore filtering and correct target sets.
Webhook HTTP handler
cmd/repo-brancher-controller/webhook.go, cmd/repo-brancher-controller/webhook_test.go
webhookHandler.ServeHTTP validates POST, required GitHub headers, JSON content-type, body size limit, and HMAC signature; dispatches push events by calling state.matching and controller.enqueue; ignores deletes and non-branch refs; emits webhooksTotal metrics and responds 202 Accepted. Tests cover push accept, tag/delete ignore, invalid HMAC (403), and oversized body (413).
External plugin registration validator
cmd/repo-brancher-controller/plugin_config.go, cmd/repo-brancher-controller/plugin_config_test.go
Defines pluginConfigFragment/externalPlugin YAML types; validateExternalPluginRegistrations reads per-org _pluginconfig.yaml files and verifies a repo-brancher-controller push-handling entry exists for every org in the desired state. Test covers valid fixtures and missing-registration failure.
Main entrypoint, wiring, and container image
cmd/repo-brancher-controller/main.go, images/repo-brancher-controller/Dockerfile, cmd/repo-brancher-controller/README.md
gatherOptions/validate parse and check all CLI flags (auth mode, pacing, TLS, timeouts). main wires HMAC agent, GitHub client, controller, periodic reload/resync loops, Prometheus/pprof/health servers, webhook server, and graceful shutdown. runPeriodic/runPeriodicJittered provide scheduling helpers. Dockerfile copies binary into UBI9 minimal image. README documents purpose, flags, and forwarding config schema.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)

Check name Status Explanation Resolution
Container-Privileges ❌ Error Dockerfile lacks USER directive; container runs as root (UID 0). This violates container security best practice and custom check requirement against running as root. Add USER directive to Dockerfile (e.g., USER 1001) to run container as non-root user, as already suggested in review comments.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Coverage For New Features ⚠️ Warning Three pure functions lack test coverage: newAppTokenGenerator (auth.go), (o *options) validate (main.go), and jitterDuration (controller.go). Per guidelines, "pure functions should always be tested." Add unit tests for pure functions: newAppTokenGenerator error/success paths, options.validate all validation checks, and jitterDuration edge cases and randomness bounds.
✅ Passed checks (14 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: introducing a new event-driven controller for repository branching, which is the primary focus of this large changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Go Error Handling ✅ Passed Error handling follows Go best practices: 20 proper error wrappings with %w, no panic() calls outside init, errors validated before use, and all underscore ignores have clear justification (only re...
Stable And Deterministic Test Names ✅ Passed All 20 test functions and 14 table-driven test cases use stable, static string literals with no dynamic values (timestamps, UUIDs, generated IDs, or interpolation). Test names are deterministic and...
Test Structure And Quality ✅ Passed No Ginkgo tests in this PR; all tests use standard Go testing package. Check requires review of Ginkgo test code, which is not applicable here.
Microshift Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. All tests use standard Go testing package (testing.T); Ginkgo is not imported. Check is not applicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed No Ginkgo e2e tests added; PR contains only Go unit tests. SNO compatibility check does not apply.
Topology-Aware Scheduling Compatibility ✅ Passed This PR adds controller source code and a Dockerfile but contains no Kubernetes deployment manifests, pod definitions, or scheduling constraints. Topology-awareness checks do not apply to controlle...
Ote Binary Stdout Contract ✅ Passed Controller properly configures logging: logrusutil.ComponentInit() called first in main() redirects logrus to stderr; no os.Stdout, fmt.Print, or direct log.Print calls in process-level code; fmt.F...
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR adds only standard Go unit tests (20 functions with *testing.T), not Ginkgo e2e tests. Check applies only to Ginkgo patterns (It, Describe, Context, When) which are absent.
No-Weak-Crypto ✅ Passed PR uses only standard and trusted crypto libraries (crypto/rsa, golang-jwt, Prow github); no weak algorithms (MD5/SHA1/DES/RC4/3DES/Blowfish/ECB), custom implementations, or non-constant-time secre...
No-Sensitive-Data-In-Logs ✅ Passed No sensitive data (passwords, tokens, API keys, secrets, or PII) found in logs. Token/credential errors are properly wrapped without exposure, HMAC secrets never logged, and API error messages safe...
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jmguzik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 23, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (7)
cmd/repo-brancher-controller/forwarding_test.go (1)

65-65: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Split assertion for better test failure messages.

The three-part boolean expression makes it unclear which specific condition failed. Consider separate assertions:

-	if !isIgnored(ignored, "all-org", "anything") || !isIgnored(ignored, "specific-org", "repo") || isIgnored(ignored, "specific-org", "other") {
-		t.Fatal("org and org/repo ignore matching is incorrect")
-	}
+	if !isIgnored(ignored, "all-org", "anything") {
+		t.Error("org-level ignore should match any repo in that org")
+	}
+	if !isIgnored(ignored, "specific-org", "repo") {
+		t.Error("org/repo ignore should match specific repository")
+	}
+	if isIgnored(ignored, "specific-org", "other") {
+		t.Error("org/repo ignore should not match different repo in same org")
+	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/forwarding_test.go` at line 65, The test
assertion at line 65 combines three separate boolean conditions in a single if
statement, making it unclear which condition failed when the test fails. Split
this compound assertion into three separate assertions, one for each condition
being tested: the first checking that isIgnored returns false for "all-org" with
"anything", the second checking that isIgnored returns false for "specific-org"
with "repo", and the third checking that isIgnored returns true for
"specific-org" with "other". This will provide clearer failure messages
indicating exactly which condition failed.
cmd/repo-brancher-controller/webhook.go (1)

21-78: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Consider adding logrus logging for webhook deliveries.

The handler relies entirely on metrics. Adding structured logging (e.g., logrus with delivery ID and outcome) would improve debuggability for webhook processing issues in production.

Example:

logrus.WithFields(logrus.Fields{
    "delivery": delivery,
    "event": eventType,
    "org": event.Repo.Owner.Login,
    "repo": event.Repo.Name,
    "branch": event.Branch(),
}).Info("webhook accepted")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/webhook.go` around lines 21 - 78, Add structured
logging using logrus throughout the ServeHTTP method to improve debuggability of
webhook processing. At the minimum, log the accepted webhook case after the
metrics increment with fields including the delivery ID, event type,
organization, repository name, and branch from the event payload. Consider
adding similar logging at other key decision points (ignored events, validation
failures) to provide visibility into webhook processing outcomes beyond just the
metrics that are currently recorded.
images/repo-brancher-controller/Dockerfile (2)

3-3: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Use COPY instead of ADD for regular files.

Static analysis (DL3020) flags the use of ADD for a regular file. ADD has implicit tar extraction and URL fetching behavior; COPY is explicit and safer for local files. As per path instructions, "COPY specific files, not entire context."

-ADD repo-brancher-controller /usr/bin/repo-brancher-controller
+COPY repo-brancher-controller /usr/bin/repo-brancher-controller

As per path instructions and static analysis, prefer COPY for copying files.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@images/repo-brancher-controller/Dockerfile` at line 3, The Dockerfile
instruction copying repo-brancher-controller to
/usr/bin/repo-brancher-controller uses ADD, which has implicit tar extraction
and URL fetching behavior that is unnecessary and unsafe for regular files.
Replace the ADD instruction with COPY instead, as COPY is explicit and safer for
copying local files. This change will align with best practices and satisfy the
DL3020 static analysis rule.

Sources: Path instructions, Linters/SAST tools


1-4: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Add HEALTHCHECK directive.

The Dockerfile lacks a HEALTHCHECK instruction. Per path instructions for container images, "HEALTHCHECK defined" is required. The controller exposes a health endpoint on --health-port (default from Prow instrumentation).

🩺 Proposed addition
 FROM registry.access.redhat.com/ubi9/ubi-minimal:latest

 COPY repo-brancher-controller /usr/bin/repo-brancher-controller
+HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
+  CMD ["/usr/bin/repo-brancher-controller", "--help"]
 ENTRYPOINT ["/usr/bin/repo-brancher-controller"]

Note: The actual health endpoint is served by Prow's health server on the instrumentation port. The above is a placeholder; the proper check would be curl http://localhost:<health-port>/healthz if curl is available in ubi-minimal, or an HTTP check via the orchestrator's native health probes.

As per path instructions, container images should define HEALTHCHECK.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@images/repo-brancher-controller/Dockerfile` around lines 1 - 4, The
Dockerfile is missing a required HEALTHCHECK directive. Add a HEALTHCHECK
instruction after the ENTRYPOINT directive that verifies the controller's health
by checking the health endpoint exposed on the instrumentation port (default
behavior from Prow). The health check should query the endpoint at
http://localhost:<health-port>/healthz, using an appropriate method for the
ubi-minimal base image (either via curl if available, or through a shell-based
HTTP check using standard tools available in the minimal image).

Source: Path instructions

cmd/repo-brancher-controller/main.go (2)

29-43: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Add documentation comment for exported type.

The options struct is exported but lacks a doc comment. As per path instructions, comment important exported types with their purpose.

📝 Suggested documentation
+// options holds all command-line configuration for repo-brancher-controller,
+// including paths to configs, GitHub authentication, server tuning, and timing parameters.
 type options struct {
 	instrumentation                                               flagutil.InstrumentationOptions
 	configDir, forwardingConfigPath, pluginConfigDir              string
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/main.go` around lines 29 - 43, The options
struct at the beginning of the file lacks a documentation comment above its
declaration. Add a doc comment directly above the type definition for options
that describes the struct's purpose and what configuration options it contains.
The comment should follow Go conventions by starting with the struct name
followed by a description of its role in the repo-brancher-controller
application.

Source: Path instructions


134-136: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Redundant HMAC file validation after secret agent initialization.

Line 134 reads the HMAC file to verify it's non-empty, but secret.Add(o.hmacPath) at line 131 already validates the file is readable and loads it into the agent. The additional read is defensive but unnecessary.

♻️ Optional simplification
 	if err := secret.Add(o.hmacPath); err != nil {
 		logrus.WithError(err).Fatal("start HMAC secret agent")
 	}
-	if raw, err := os.ReadFile(o.hmacPath); err != nil || len(raw) == 0 {
-		logrus.WithError(err).Fatal("webhook HMAC secret is missing or empty")
-	}

The secret agent already validates the file exists and is readable; an empty file would be caught when the HMAC signature validation fails on the first webhook.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/main.go` around lines 134 - 136, The redundant
validation of the HMAC file using os.ReadFile(o.hmacPath) at lines 134-136 can
be removed since the secret.Add(o.hmacPath) call at line 131 already validates
that the file is readable and loads it into the agent. Delete the if statement
that checks if raw is empty or if there's an error from os.ReadFile, as any
issues with the HMAC file (missing or empty) will be caught during actual
webhook HMAC signature validation at runtime.
cmd/repo-brancher-controller/health.go (1)

8-16: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add documentation comment for exported type.

runtimeHealth is an exported type but lacks a doc comment explaining its purpose and safe usage. As per path instructions, comment important exported functions (and types) with their purpose.

📝 Suggested documentation
+// runtimeHealth tracks atomic health state for server readiness, config freshness,
+// GitHub API availability, authentication validity, and rate-limit blocking.
+// All fields are safe for concurrent access.
 type runtimeHealth struct {
 	serverHealthy      atomic.Bool
 	lastConfigSuccess  atomic.Int64
 	lastGitHubSuccess  atomic.Int64
 	rateLimitedUntil   atomic.Int64
 	githubAuthHealthy  atomic.Bool
 	githubAvailable    atomic.Bool
 	maxConfigStaleness time.Duration
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/health.go` around lines 8 - 16, Add a
documentation comment for the runtimeHealth struct type that explains its
purpose and safe concurrent usage. Create a doc comment above the type
definition starting with "// runtimeHealth" followed by a description of what
the type does, noting that it manages the health status of runtime components
and how the atomic fields should be accessed safely. This documents the type's
role in the system and establishes expectations for how it should be used.

Source: Path instructions

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/repo-brancher-controller/auth.go`:
- Around line 15-37: In the newAppTokenGenerator function, add validation for
the apiEndpoint parameter using the same validation schema applied to
graphqlEndpoint. Parse apiEndpoint using url.Parse and check that it has an
https scheme, a non-empty host, no user information, no raw query parameters,
and no fragment. Insert this validation block before the apiEndpoint is passed
to the NewAppsAuthClientWithFields call, and return an appropriate error message
if any validation check fails. This ensures the API endpoint is validated at the
trust boundary where it is first used.

In `@cmd/repo-brancher-controller/health.go`:
- Around line 44-45: The lastConfigSuccess atomic value is loaded twice in the
condition check, which can cause race conditions where the value changes between
the two loads. Load h.lastConfigSuccess once at the beginning and store it in a
variable, then use that variable for both the zero-check and when constructing
the lastConfig timestamp using time.Unix. This ensures both operations use the
same consistent snapshot of the value.

In `@cmd/repo-brancher-controller/plugin_config.go`:
- Line 30: Add a documentation comment for the
validateExternalPluginRegistrations function that appears at line 30 in the
file. The doc comment should be placed directly above the function declaration
and should explain what the function does, describe the pluginConfigDir
parameter (its purpose and type), describe the desired parameter (the map
structure and its purpose), and document the error return value (when it might
be returned and what errors it represents).

In `@cmd/repo-brancher-controller/webhook.go`:
- Line 21: The exported method ServeHTTP on the webhookHandler type is missing a
doc comment. Add a doc comment directly above the ServeHTTP method that starts
with "ServeHTTP" and describes what the method does, what parameters it receives
(the http.ResponseWriter and *http.Request), and what behavior it implements
(such as handling webhook requests). Follow Go's doc comment conventions where
the first line starts with the function name followed by a description of its
purpose.

In `@images/repo-brancher-controller/Dockerfile`:
- Around line 1-4: The Dockerfile lacks a USER directive and will run as root by
default, violating security best practices. After the ADD instruction that
copies the repo-brancher-controller binary to /usr/bin/, add a USER directive to
run the container as a non-root user. Use UID 1001 which is the conventional
choice for UBI images and is within the supported arbitrary UID range
(1000-65533). This ensures the container runs under a non-privileged user
account while maintaining compatibility with the UBI minimal base image.

---

Nitpick comments:
In `@cmd/repo-brancher-controller/forwarding_test.go`:
- Line 65: The test assertion at line 65 combines three separate boolean
conditions in a single if statement, making it unclear which condition failed
when the test fails. Split this compound assertion into three separate
assertions, one for each condition being tested: the first checking that
isIgnored returns false for "all-org" with "anything", the second checking that
isIgnored returns false for "specific-org" with "repo", and the third checking
that isIgnored returns true for "specific-org" with "other". This will provide
clearer failure messages indicating exactly which condition failed.

In `@cmd/repo-brancher-controller/health.go`:
- Around line 8-16: Add a documentation comment for the runtimeHealth struct
type that explains its purpose and safe concurrent usage. Create a doc comment
above the type definition starting with "// runtimeHealth" followed by a
description of what the type does, noting that it manages the health status of
runtime components and how the atomic fields should be accessed safely. This
documents the type's role in the system and establishes expectations for how it
should be used.

In `@cmd/repo-brancher-controller/main.go`:
- Around line 29-43: The options struct at the beginning of the file lacks a
documentation comment above its declaration. Add a doc comment directly above
the type definition for options that describes the struct's purpose and what
configuration options it contains. The comment should follow Go conventions by
starting with the struct name followed by a description of its role in the
repo-brancher-controller application.
- Around line 134-136: The redundant validation of the HMAC file using
os.ReadFile(o.hmacPath) at lines 134-136 can be removed since the
secret.Add(o.hmacPath) call at line 131 already validates that the file is
readable and loads it into the agent. Delete the if statement that checks if raw
is empty or if there's an error from os.ReadFile, as any issues with the HMAC
file (missing or empty) will be caught during actual webhook HMAC signature
validation at runtime.

In `@cmd/repo-brancher-controller/webhook.go`:
- Around line 21-78: Add structured logging using logrus throughout the
ServeHTTP method to improve debuggability of webhook processing. At the minimum,
log the accepted webhook case after the metrics increment with fields including
the delivery ID, event type, organization, repository name, and branch from the
event payload. Consider adding similar logging at other key decision points
(ignored events, validation failures) to provide visibility into webhook
processing outcomes beyond just the metrics that are currently recorded.

In `@images/repo-brancher-controller/Dockerfile`:
- Line 3: The Dockerfile instruction copying repo-brancher-controller to
/usr/bin/repo-brancher-controller uses ADD, which has implicit tar extraction
and URL fetching behavior that is unnecessary and unsafe for regular files.
Replace the ADD instruction with COPY instead, as COPY is explicit and safer for
copying local files. This change will align with best practices and satisfy the
DL3020 static analysis rule.
- Around line 1-4: The Dockerfile is missing a required HEALTHCHECK directive.
Add a HEALTHCHECK instruction after the ENTRYPOINT directive that verifies the
controller's health by checking the health endpoint exposed on the
instrumentation port (default behavior from Prow). The health check should query
the endpoint at http://localhost:<health-port>/healthz, using an appropriate
method for the ubi-minimal base image (either via curl if available, or through
a shell-based HTTP check using standard tools available in the minimal image).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 61e81a0f-2537-4e0d-8c10-5695f5f3615e

📥 Commits

Reviewing files that changed from the base of the PR and between 7f8c694 and b455c87.

📒 Files selected for processing (19)
  • cmd/repo-brancher-controller/README.md
  • cmd/repo-brancher-controller/auth.go
  • cmd/repo-brancher-controller/config.go
  • cmd/repo-brancher-controller/config_test.go
  • cmd/repo-brancher-controller/controller.go
  • cmd/repo-brancher-controller/controller_test.go
  • cmd/repo-brancher-controller/forwarding.go
  • cmd/repo-brancher-controller/forwarding_test.go
  • cmd/repo-brancher-controller/github.go
  • cmd/repo-brancher-controller/github_test.go
  • cmd/repo-brancher-controller/health.go
  • cmd/repo-brancher-controller/health_test.go
  • cmd/repo-brancher-controller/main.go
  • cmd/repo-brancher-controller/metrics.go
  • cmd/repo-brancher-controller/plugin_config.go
  • cmd/repo-brancher-controller/plugin_config_test.go
  • cmd/repo-brancher-controller/webhook.go
  • cmd/repo-brancher-controller/webhook_test.go
  • images/repo-brancher-controller/Dockerfile
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift/release (manual)
  • openshift/ci-docs (manual)
  • openshift/release-controller (manual)
  • openshift/ci-chat-bot (manual)

Comment on lines +15 to +37
func newAppTokenGenerator(appID, privateKeyPath, apiEndpoint, graphqlEndpoint string) (github.TokenGenerator, error) {
parsedGraphQL, err := url.Parse(graphqlEndpoint)
if err != nil || parsedGraphQL.Scheme != "https" || parsedGraphQL.Host == "" || parsedGraphQL.User != nil || parsedGraphQL.RawQuery != "" || parsedGraphQL.Fragment != "" {
return nil, fmt.Errorf("GitHub GraphQL endpoint must be an absolute HTTPS URL without user information, query parameters, or a fragment")
}
keyGenerator, err := secret.AddWithParser(privateKeyPath, func(raw []byte) (*rsa.PrivateKey, error) {
key, err := jwt.ParseRSAPrivateKeyFromPEM(raw)
if err != nil {
return nil, fmt.Errorf("parse GitHub App private key: %w", err)
}
return key, nil
})
if err != nil {
return nil, fmt.Errorf("load GitHub App private key: %w", err)
}
tokenGenerator, _, _, err := github.NewAppsAuthClientWithFields(
logrus.Fields{"component": componentName}, secret.Censor,
appID, keyGenerator, graphqlEndpoint, apiEndpoint,
)
if err != nil {
return nil, fmt.Errorf("create GitHub App authentication client: %w", err)
}
return tokenGenerator, nil

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -nP -C3 'newAppTokenGenerator|apiEndpoint|githubEndpoint|graphqlEndpoint|Scheme != "https"' cmd/repo-brancher-controller/main.go

Repository: openshift/ci-tools

Length of output: 2475


🏁 Script executed:

#!/bin/bash
# Check newGitHubRefClient implementation
rg -n "func newGitHubRefClient" cmd/repo-brancher-controller/ -A 20

Repository: openshift/ci-tools

Length of output: 2000


🏁 Script executed:

#!/bin/bash
# Search for endpoint validation patterns in the auth module
rg -n "Scheme|apiEndpoint|Host ==" cmd/repo-brancher-controller/auth.go

Repository: openshift/ci-tools

Length of output: 510


🏁 Script executed:

#!/bin/bash
# Check if NewAppsAuthClientWithFields is from pkg/ and what it expects
rg -n "NewAppsAuthClientWithFields" cmd/ pkg/ -B 2 -A 2 | head -40

Repository: openshift/ci-tools

Length of output: 603


🏁 Script executed:

#!/bin/bash
# Find NewAppsAuthClientWithFields in github package
fd -t f "\.go$" pkg/github | xargs grep -l "NewAppsAuthClientWithFields" | head -5

Repository: openshift/ci-tools

Length of output: 156


🏁 Script executed:

#!/bin/bash
# Search for JWT/token exchange or installation token logic
rg -n "installation.*token|jwt|exchange|api.github.com" pkg/github --type go | head -30

Repository: openshift/ci-tools

Length of output: 500


🏁 Script executed:

#!/bin/bash
# Find where NewAppsAuthClientWithFields is defined
rg -n "func NewAppsAuthClientWithFields" pkg/ --type go -A 15

Repository: openshift/ci-tools

Length of output: 156


🏁 Script executed:

#!/bin/bash
# Check if it's from an external package by looking at imports
rg -n "github.com/shurcooL/githubv4" pkg/github -B 5 -A 5 | head -40

Repository: openshift/ci-tools

Length of output: 156


🏁 Script executed:

#!/bin/bash
# Look at the github package structure to understand what TokenGenerator does
ls -la pkg/github/
</a>

Repository: openshift/ci-tools

Length of output: 530


🏁 Script executed:

#!/bin/bash
# Check auth.go imports to see what github package provides NewAppsAuthClientWithFields
head -20 cmd/repo-brancher-controller/auth.go

Repository: openshift/ci-tools

Length of output: 932


🏁 Script executed:

#!/bin/bash
# Search for the actual function definition across all files
rg -rn "NewAppsAuthClientWithFields" . --type go | head -10

Repository: openshift/ci-tools

Length of output: 235


🏁 Script executed:

#!/bin/bash
# Look at go.mod to understand external dependencies
grep -i github go.mod | head -20

Repository: openshift/ci-tools

Length of output: 1096


🏁 Script executed:

#!/bin/bash
# Find how tokenGenerator is used
rg -n "tokenGenerator|withTokenGenerator" cmd/repo-brancher-controller/ -B 2 -A 5

Repository: openshift/ci-tools

Length of output: 2382


🏁 Script executed:

#!/bin/bash
# Check what withTokenGenerator does
rg -n "func withTokenGenerator" cmd/repo-brancher-controller/ -A 15

Repository: openshift/ci-tools

Length of output: 1386


🏁 Script executed:

#!/bin/bash
# Look for any GitHub App authentication or installation token generation patterns
rg -n "installation|app.*token|jwt" cmd/repo-brancher-controller/ -i

Repository: openshift/ci-tools

Length of output: 1229


Add validation for apiEndpoint in newAppTokenGenerator to match graphqlEndpoint validation.

graphqlEndpoint is validated for HTTPS/no-user-info/no-query/no-fragment at lines 17–18, but apiEndpoint is passed unchecked to NewAppsAuthClientWithFields at line 32. Although newGitHubRefClient validates apiEndpoint later, validation should occur at the trust boundary where the endpoint is first used. If Prow's NewAppsAuthClientWithFields exchanges the GitHub App JWT at apiEndpoint (to mint installation tokens), an unvalidated http:// or credential-bearing endpoint could leak the JWT. Apply the same validation schema to apiEndpoint as you do to graphqlEndpoint.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/auth.go` around lines 15 - 37, In the
newAppTokenGenerator function, add validation for the apiEndpoint parameter
using the same validation schema applied to graphqlEndpoint. Parse apiEndpoint
using url.Parse and check that it has an https scheme, a non-empty host, no user
information, no raw query parameters, and no fragment. Insert this validation
block before the apiEndpoint is passed to the NewAppsAuthClientWithFields call,
and return an appropriate error message if any validation check fails. This
ensures the API endpoint is validated at the trust boundary where it is first
used.

Comment on lines +44 to +45
lastConfig := time.Unix(h.lastConfigSuccess.Load(), 0)
if h.lastConfigSuccess.Load() == 0 || now.Sub(lastConfig) > h.maxConfigStaleness {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Redundant atomic load can cause transient readiness glitch.

lastConfigSuccess is loaded twice: once to construct lastConfig, then again for the zero-check. Between the two loads, another goroutine may update the value from 0 to a valid timestamp (or vice versa), causing the zero-check and staleness comparison to operate on inconsistent snapshots. This can produce transient false-negatives during the initialization window.

🔧 Proposed fix
-	lastConfig := time.Unix(h.lastConfigSuccess.Load(), 0)
-	if h.lastConfigSuccess.Load() == 0 || now.Sub(lastConfig) > h.maxConfigStaleness {
+	lastConfigUnix := h.lastConfigSuccess.Load()
+	if lastConfigUnix == 0 || now.Sub(time.Unix(lastConfigUnix, 0)) > h.maxConfigStaleness {
 		return false
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/health.go` around lines 44 - 45, The
lastConfigSuccess atomic value is loaded twice in the condition check, which can
cause race conditions where the value changes between the two loads. Load
h.lastConfigSuccess once at the beginning and store it in a variable, then use
that variable for both the zero-check and when constructing the lastConfig
timestamp using time.Unix. This ensures both operations use the same consistent
snapshot of the value.

Events []string `json:"events,omitempty"`
}

func validateExternalPluginRegistrations(pluginConfigDir string, desired map[repoKey]sets.Set[string]) error {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Add a doc comment for the exported function.

Per path instructions, exported functions should document their purpose, parameters, and return values.

📝 Suggested documentation
+// validateExternalPluginRegistrations verifies that each org in the desired state
+// has a repo-brancher-controller external plugin registered in its _pluginconfig.yaml
+// to handle push events at the expected endpoint.
 func validateExternalPluginRegistrations(pluginConfigDir string, desired map[repoKey]sets.Set[string]) error {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/plugin_config.go` at line 30, Add a
documentation comment for the validateExternalPluginRegistrations function that
appears at line 30 in the file. The doc comment should be placed directly above
the function declaration and should explain what the function does, describe the
pluginConfigDir parameter (its purpose and type), describe the desired parameter
(the map structure and its purpose), and document the error return value (when
it might be returned and what errors it represents).

Source: Path instructions

maxBodyBytes int64
}

func (h *webhookHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Add a doc comment for the exported ServeHTTP method.

Per path instructions, exported functions should document their purpose, parameters, and return values.

📝 Suggested documentation
+// ServeHTTP validates incoming GitHub webhook requests, verifies HMAC signatures,
+// filters push events for branch updates, and enqueues matching repository keys
+// for reconciliation.
 func (h *webhookHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (h *webhookHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
// ServeHTTP validates incoming GitHub webhook requests, verifies HMAC signatures,
// filters push events for branch updates, and enqueues matching repository keys
// for reconciliation.
func (h *webhookHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/repo-brancher-controller/webhook.go` at line 21, The exported method
ServeHTTP on the webhookHandler type is missing a doc comment. Add a doc comment
directly above the ServeHTTP method that starts with "ServeHTTP" and describes
what the method does, what parameters it receives (the http.ResponseWriter and
*http.Request), and what behavior it implements (such as handling webhook
requests). Follow Go's doc comment conventions where the first line starts with
the function name followed by a description of its purpose.

Source: Path instructions

Comment on lines +1 to +4
FROM registry.access.redhat.com/ubi9/ubi-minimal:latest

ADD repo-brancher-controller /usr/bin/repo-brancher-controller
ENTRYPOINT ["/usr/bin/repo-brancher-controller"]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security & Privacy | 🔴 Critical | ⚡ Quick win

Container runs as root; add USER directive.

The Dockerfile lacks a USER directive, causing the container to run as UID 0 (root). This violates container security best practices and the path instructions: "USER non-root; never run as root."

🔒 Proposed fix
 FROM registry.access.redhat.com/ubi9/ubi-minimal:latest

+RUN mkdir -p /usr/bin && chown -R 1001:0 /usr/bin
 COPY repo-brancher-controller /usr/bin/repo-brancher-controller
+USER 1001
 ENTRYPOINT ["/usr/bin/repo-brancher-controller"]

UBI images support arbitrary UIDs in the range 1000-65533; UID 1001 is conventional. Ensure the binary has execute permissions before COPY.

As per path instructions, container images must run as non-root.

🧰 Tools
🪛 Hadolint (2.14.0)

[error] 3-3: Use COPY instead of ADD for files and folders

(DL3020)

🪛 Trivy (0.69.3)

[error] 1-1: Image user should not be 'root'

Specify at least 1 USER command in Dockerfile with non-root user as argument

Rule: DS-0002

Learn more

(IaC/Dockerfile)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@images/repo-brancher-controller/Dockerfile` around lines 1 - 4, The
Dockerfile lacks a USER directive and will run as root by default, violating
security best practices. After the ADD instruction that copies the
repo-brancher-controller binary to /usr/bin/, add a USER directive to run the
container as a non-root user. Use UID 1001 which is the conventional choice for
UBI images and is within the supported arbitrary UID range (1000-65533). This
ensures the container runs under a non-privileged user account while maintaining
compatibility with the UBI minimal base image.

Source: Path instructions

@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

@jmguzik: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/integration b455c87 link true /test integration
ci/prow/lint b455c87 link true /test lint

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant