feat(benchmarks): add source/sync fields and benchmark repo sync #1234

Closed
christso wants to merge 0 commits into main from feat/1232-benchmark-repo-sync

@christso
Collaborator

Closes #1232

Summary

  • Extends BenchmarkEntry with optional source: { url, ref } and sync fields
  • interpolateEnv() runs on every benchmark entry at load time, matching how targets.yaml is handled
  • New benchmark-sync.ts module with three sync modes:
    • none (default): no-op, path used as-is
    • oneshot: shallow git clone --depth 1 --filter=blob:none (skipped when .git already exists, so repeated runs are idempotent)
    • continuous: spawns a git-sync sidecar; fails fast with an actionable error if git-sync is missing
  • agentv studio / agentv serve call syncBenchmarks() before starting the server
  • agentv install git-sync: downloads a pinned version into ~/.agentv/bin/ and verifies its SHA-256 checksum
  • agentv doctor: reports whether git-sync is present and prints an install hint when it is not
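
The three sync modes can be sketched roughly as follows. This is a minimal illustration, not the code in benchmark-sync.ts: the field names follow the summary above, while `oneshotCloneArgs`, the `hasGitDir` parameter, the error messages, and the use of `--branch` for `source.ref` are assumptions made for the example.

```typescript
// Hypothetical shapes: only the source/sync field names come from the PR summary.
type SyncMode = "none" | "oneshot" | "continuous";

interface BenchmarkSource {
  url: string;
  ref?: string;
}

interface BenchmarkEntry {
  id: string;
  name: string;
  path: string;
  source?: BenchmarkSource;
  sync?: string; // raw YAML value, after env interpolation
}

// Map the raw string to a known mode; a missing or empty value means "none".
// Throwing on unknown values is an assumption of this sketch.
function resolveSyncMode(raw: string | undefined): SyncMode {
  const value = (raw ?? "").trim();
  switch (value) {
    case "":
    case "none":
      return "none";
    case "oneshot":
      return "oneshot";
    case "continuous":
      return "continuous";
    default:
      throw new Error(`Unknown sync mode "${value}" (expected none, oneshot, or continuous)`);
  }
}

// Build the argument list for a shallow, blobless clone.
// Returns null when a .git directory already exists, which keeps reruns idempotent.
function oneshotCloneArgs(entry: BenchmarkEntry, hasGitDir: boolean): string[] | null {
  if (hasGitDir) return null;
  if (!entry.source) {
    throw new Error(`Benchmark "${entry.id}" has sync: oneshot but no source.url`);
  }
  const args = ["clone", "--depth", "1", "--filter=blob:none"];
  // Assumption: ref is a branch or tag name; git clone --branch does not accept a commit SHA.
  if (entry.source.ref) args.push("--branch", entry.source.ref);
  args.push(entry.source.url, entry.path);
  return args;
}
```

Keeping the argument construction separate from process spawning makes the idempotency rule easy to unit test without touching git.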

Before/after evidence

Before: No source/sync fields in benchmark entries; no sync capability; no agentv install/agentv doctor.

After:

# ~/.agentv/benchmarks.yaml
benchmarks:
  - id: eval-benchmarks
    name: Eval Benchmarks
    path: /srv/agentv/repo
    source:
      url: ${{ BENCHMARK_REPO_URL }}
      ref: ${{ BENCHMARK_REPO_REF:-main }}
    sync: ${{ AGENTV_SYNC:-none }}
    added_at: "2026-03-20T10:00:00Z"
    last_opened_at: "2026-03-30T14:00:00Z"

# CI/CD: fresh clone per run
AGENTV_SYNC=oneshot agentv eval my.eval.yaml

# Cloud server: continuous mirror
AGENTV_SYNC=continuous agentv studio

# Install git-sync for continuous mode
agentv install git-sync

# Check what's installed
agentv doctor

agentv doctor output (git-sync not installed):

agentv doctor

Local bin dir: /home/user/.agentv/bin

  ✗ git-sync  (not found)
      note:    Required for benchmark sync mode: continuous
      install: agentv install git-sync
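
A check that produces a report like the one above might look like this. It is a sketch under assumptions: it only looks in the local bin dir (the real checkGitSyncInstalled may also consult PATH), and `ToolStatus`, `formatStatus`, and the format of the "found" line are hypothetical.

```typescript
import { existsSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical result shape for a single tool check.
interface ToolStatus {
  name: string;
  installed: boolean;
  path: string;
}

// Look for the git-sync binary in the local bin dir (default: ~/.agentv/bin).
function checkGitSyncInstalled(binDir: string = join(homedir(), ".agentv", "bin")): ToolStatus {
  const path = join(binDir, "git-sync");
  return { name: "git-sync", installed: existsSync(path), path };
}

// Render the status as report lines; the "not found" layout mirrors the output shown above.
function formatStatus(status: ToolStatus): string[] {
  if (status.installed) {
    // Assumed format: the PR only shows the "not found" case.
    return [`  ✓ ${status.name}  (${status.path})`];
  }
  return [
    `  ✗ ${status.name}  (not found)`,
    "      note:    Required for benchmark sync mode: continuous",
    `      install: agentv install ${status.name}`,
  ];
}

console.log(formatStatus(checkGitSyncInstalled()).join("\n"));
```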

Test plan

  • Unit tests for resolveSyncMode, checkGitSyncInstalled, syncBenchmarkOneshot, syncBenchmarks
  • Registry tests for source/sync round-trip through YAML
  • Registry tests for interpolateEnv integration with ${{ BENCH_URL }}/${{ AGENTV_SYNC }}
  • Existing benchmark entries without source or sync continue to work unchanged
  • Full test suite passes (1766 core + 67 eval + 539 CLI)
  • TypeScript typecheck passes
  • Biome lint passes
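
For reference, the kind of substitution the interpolateEnv registry tests exercise can be sketched as below. This is a hypothetical re-implementation for illustration only: it handles just the plain `${{ NAME }}` form (the follow-up commit on this PR describes the `:-default` syntax as unsupported), and throwing on an unset variable is an assumption, not confirmed behavior.

```typescript
// Hypothetical stand-in for the project's interpolateEnv(); the real one may differ.
type Env = Record<string, string | undefined>;

// Matches ${{ NAME }} with optional whitespace inside the braces.
const PLACEHOLDER = /\$\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g;

function interpolateString(value: string, env: Env): string {
  return value.replace(PLACEHOLDER, (_match: string, name: string) => {
    const resolved = env[name];
    // Assumption: an unset variable is an error rather than an empty string.
    if (resolved === undefined) {
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}

// Walk strings, arrays, and plain objects; leave every other value untouched.
function interpolateEnv<T>(value: T, env: Env): T {
  if (typeof value === "string") {
    return interpolateString(value, env) as unknown as T;
  }
  if (Array.isArray(value)) {
    return value.map((item) => interpolateEnv(item, env)) as unknown as T;
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = interpolateEnv(inner, env);
    }
    return out as T;
  }
  return value;
}

const entry = interpolateEnv(
  { path: "/srv/agentv/repo", source: { url: "${{ BENCH_URL }}" }, sync: "${{ AGENTV_SYNC }}" },
  { BENCH_URL: "https://example.com/benchmarks.git", AGENTV_SYNC: "oneshot" },
);
console.log(entry.source.url, entry.sync);
```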

🤖 Generated with Claude Code

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 12, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 4ff31cf
Status: ✅  Deploy successful!
Preview URL: https://b7ad0258.agentv.pages.dev
Branch Preview URL: https://feat-1232-benchmark-repo-syn.agentv.pages.dev

christso added a commit that referenced this pull request May 12, 2026
- Move syncBenchmarks() inside try/catch in serve handler so git errors
  produce clean console.error output rather than unhandled rejections
- Remove unused 'os' import from doctor/index.ts
- Remove duplicate BenchmarkSourceYaml interface (use BenchmarkSource directly)
- Add explicit return type to startBenchmarkContinuous()
- Update benchmarks.ts header to remove unsupported :-default syntax examples
- Add test for continuous branch in syncBenchmarks
- Update deps.json: correct placeholder checksums and version (v4.3.0 format),
  add _note explaining TODOs must be filled before shipping
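
The checksum side of a pinned download can be sketched as follows. This is illustrative only: `sha256Hex` and `verifyChecksum` are hypothetical names, and how deps.json actually stores the version and checksum is not shown in this PR.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 digest of the downloaded bytes.
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Compare against the pinned checksum; throw before anything is installed on mismatch.
function verifyChecksum(data: string | Uint8Array, expectedHex: string): void {
  const actual = sha256Hex(data);
  const expected = expectedHex.trim().toLowerCase();
  if (actual !== expected) {
    throw new Error(`SHA-256 mismatch: expected ${expected}, got ${actual}`);
  }
}
```

Verifying before writing the binary into the bin dir means a placeholder or stale checksum fails loudly instead of installing an unverified file.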

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
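
The first change in the commit above, moving the sync call inside the handler's try/catch, can be illustrated with a small sketch. The `serve` function, its parameters, and the error prefix are hypothetical stand-ins, not the actual serve handler.

```typescript
// Hypothetical stand-ins for the real sync and server start functions.
type AsyncStep = () => Promise<void>;

// Run sync and server start inside one try/catch so a git failure is
// reported as a single clean error line instead of an unhandled rejection.
async function serve(
  syncBenchmarks: AsyncStep,
  startServer: AsyncStep,
  logError: (message: string) => void = console.error,
): Promise<number> {
  try {
    await syncBenchmarks();
    await startServer();
    return 0;
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    logError(`agentv serve failed: ${message}`);
    return 1; // non-zero exit code for the caller to pass on
  }
}
```

If the sync call sat outside the try block, a rejected promise from git would bypass the handler's error reporting entirely.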
christso closed this May 13, 2026
christso force-pushed the feat/1232-benchmark-repo-sync branch from 4ff31cf to 72630e0 May 13, 2026 01:29
Development

Successfully merging this pull request may close these issues.

feat: Eval benchmark repo sync to remote targets