fix(private-org-sync): batch ls-remote calls per repo instead of per branch#5027
fix(private-org-sync): batch ls-remote calls per repo instead of per branch#5027petr-muller wants to merge 2 commits intoopenshift:mainfrom
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughRefactors sync control flow: adds Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Tools execution failed with the following error: Failed to run tools: 13 INTERNAL: Received RST_STREAM with code 2 (Internal server error) Comment |
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: petr-muller The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
48f29ee to
136df2b
Compare
|
/hold Still need to review the code properly and address CodeRabbit reviews |
There was a problem hiding this comment.
Pull request overview
Reduces git ls-remote network calls in private-org-sync by batching branch head discovery once per repository (instead of per-branch), and passing those results into per-branch mirroring.
Changes:
- Update
gitSyncer.mirror()to accept pre-fetched source/destination branch heads and a pre-built destination URL. - Move destination/source
ls-remote(and related URL construction) into the per-repo loop inmain(). - Refactor
TestMirrorto injectsrcHeads/dstHeadsand a destination URL rather than expectingls-remotecalls insidemirror().
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cmd/private-org-sync/main.go | Batches per-repo ls-remote calls and updates mirror() to consume pre-fetched heads/URL. |
| cmd/private-org-sync/main_test.go | Adjusts TestMirror expectations to reflect the refactor (no ls-remote inside mirror()). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
cmd/private-org-sync/main_test.go (1)
258-258: Minor: Error ignored in test setup.The error from
url.Parseis discarded. While this is fine for a hardcoded valid URL in tests, explicitly handling it would be more defensive.♻️ Optional improvement
- destUrl, _ := url.Parse(fmt.Sprintf("https://%s@github.com/%s/%s", token, destOrg, repo)) + destUrl, err := url.Parse(fmt.Sprintf("https://%s@github.com/%s/%s", token, destOrg, repo)) + if err != nil { + t.Fatalf("failed to parse destUrl: %v", err) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/private-org-sync/main_test.go` at line 258, The test currently ignores the error returned by url.Parse when building destUrl; update the test to capture the returned error (e.g., destUrl, err := url.Parse(...)) and assert or fail on it (using t.Fatalf/require.NoError) so any unexpected parse failure for the constructed URL using token, destOrg, and repo is handled instead of silently ignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@cmd/private-org-sync/main_test.go`:
- Line 258: The test currently ignores the error returned by url.Parse when
building destUrl; update the test to capture the returned error (e.g., destUrl,
err := url.Parse(...)) and assert or fail on it (using t.Fatalf/require.NoError)
so any unexpected parse failure for the constructed URL using token, destOrg,
and repo is handled instead of silently ignored.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e6ae7923-e687-4285-9e9d-d93e163d9f02
📒 Files selected for processing (2)
cmd/private-org-sync/main.gocmd/private-org-sync/main_test.go
|
/test images |
|
Pipeline controller notification No second-stage tests were triggered for this PR. This can happen when:
Use |
|
/test ci/prow/codegen ci/prow/format |
|
/test codegen format |
|
/test images |
|
@CodeRabbit help |
ChatThere are 3 ways to chat with CodeRabbit:
CodeRabbit commands
Other keywords and placeholders
CodeRabbit configuration file (
|
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
Pipeline controller notification No second-stage tests were triggered for this PR. This can happen when:
Use |
2 similar comments
|
Pipeline controller notification No second-stage tests were triggered for this PR. This can happen when:
Use |
|
Pipeline controller notification No second-stage tests were triggered for this PR. This can happen when:
Use |
…branch Previously, mirror() called ls-remote for both source and destination on every branch, resulting in 2*N network calls per repo where N is the number of branches. Since ls-remote --heads returns ALL branches at once, we can call it once per repo and pass the results to mirror(). This moves git init, remote setup, and ls-remote calls from mirror() into the main loop where repos are already grouped, reducing network calls from 2*N to 2 per repo. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract the per-repo sync logic from main() into a syncRepo method on gitSyncer. This makes the repo initialization, ls-remote batching, and branch mirroring flow independently testable. Add TestSyncRepo with 6 test cases covering: branches in sync, one branch needing sync, dst/src ls-remote failures (with and without failOnNonexistentDst), and init failure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
a39befe to
ba82144
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
cmd/private-org-sync/main.go:452
mirror()can now skip syncing whensrcHeadsanddstHeadsindicate the branch is in sync (srcCommitHash == dstCommitHash), but those heads are fetched once per repo insyncRepo(). If the source branch advances after the initialls-remote, this branch can be incorrectly treated as in-sync and never fetched/pushed. Consider re-validating the source (and/or destination) HEAD for the specific branch before returning early, or using a strategy that makes the early-exit decision based on up-to-date refs (e.g., fetch then compare).
dstCommitHash := dstHeads[dst.branch]
srcRemote := fmt.Sprintf("%s-%s", src.org, src.repo)
srcCommitHash, ok := srcHeads[src.branch]
if !ok {
logger.Error("Branch does not exist in source remote")
return fmt.Errorf("branch does not exist in source remote")
}
if srcCommitHash == dstCommitHash {
logger.Info("Branches are already in sync")
return nil
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| gitDir, err := g.makeGitDir(org, repo) | ||
| if err != nil { | ||
| for _, source := range branches { | ||
| errs = append(errs, fmt.Errorf("%s: %w", source.String(), err)) | ||
| } | ||
| return errs | ||
| } | ||
| logger := g.logger.WithFields(mirrorFields) | ||
| logger.Info("Syncing content between locations") | ||
|
|
||
| // We ls-remote destination first thing because when it does not exist | ||
| // we do not need to do any of the remaining operations. | ||
| logger.Debug("Determining HEAD of destination branch") | ||
| destUrlRaw := fmt.Sprintf("%s/%s/%s", g.prefix, dst.org, dst.repo) | ||
| if err := g.initRepo(gitDir, org, repo); err != nil { | ||
| for _, source := range branches { | ||
| errs = append(errs, fmt.Errorf("%s: %w", source.String(), err)) | ||
| } | ||
| return errs | ||
| } |
There was a problem hiding this comment.
syncRepo() runs makeGitDir() + initRepo() before checking destination accessibility via ls-remote. For the common “destination repo missing / no access and failOnNonexistentDst=false” case, this does unnecessary local work (and can leave behind initialized dirs when --git-dir is persistent). Consider doing the destination ls-remote first (it can run without a local repo) and only initializing the local repo if the destination check succeeds.
|
/test images |
|
Pipeline controller notification No second-stage tests were triggered for this PR. This can happen when:
Use |
|
@petr-muller: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Previously,
mirror()calledls-remotefor both source and destination on every branch, resulting in 2×N network calls per repo where N is the number of branches. Sincels-remote --headsreturns ALL branches at once, we can call it once per repo and pass the results tomirror().This moves git init, remote setup, and ls-remote calls from
mirror()into the main loop where repos are already grouped, reducing network calls from 2×N to 2 per repo.Stacked on #5023 (merged).