Overview
Formalize the software development lifecycle for cluster changes using our existing tools.
Current State
- flux-local (✅ implemented) — Validates Kustomizations + renders Helm on PRs, posts diffs as comments
- vCluster dev (✅ deployed) — Sandbox for runtime testing
- Renovate (✅ running) — Automated dependency updates via PRs
- Lefthook (✅ implemented) — Pre-commit YAML formatting
- CODEOWNERS (✅ implemented) — Path-based review requirements
Proposed Pipeline
Stage 1: Local Development
- lefthook pre-commit hooks catch formatting issues
- task commands for common operations (volsync, reconcile)
Stage 2: Pull Request
- flux-local test validates all Kustomizations render correctly
- flux-local diff shows exactly what changes in the cluster
- CODEOWNERS requires review for critical paths
- New: Add kubeconform schema validation step
- New: Add policy checks (Kyverno dry-run or conftest)
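As a rough sketch of how the Stage 2 checks could be chained in CI, the snippet below builds the commands and stops at the first failure. The three path arguments are hypothetical and depend on the repo layout. It also assumes the rendered manifests have already been written to a directory, and that conftest is the policy tool chosen (Kyverno dry-run being the other option above). The flags shown should be verified against the installed tool versions.

```python
import subprocess
from typing import Callable, List, Sequence

Command = List[str]


def pr_check_commands(cluster_path: str, rendered_dir: str, policy_dir: str) -> List[Command]:
    """Return the Stage 2 checks as argv lists, in the order they should run.

    All three arguments are placeholders for paths in the real repository.
    """
    return [
        # Existing step: render Kustomizations and HelmReleases, fail on errors.
        ["flux-local", "test", "--enable-helm", "--path", cluster_path],
        # Proposed step: schema-validate the rendered manifests.
        ["kubeconform", "-strict", "-summary", "-ignore-missing-schemas", rendered_dir],
        # Proposed step: policy checks (conftest assumed here).
        ["conftest", "test", "--policy", policy_dir, rendered_dir],
    ]


def _run(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def run_checks(commands: List[Command], runner: Callable[[Sequence[str]], int] = _run) -> bool:
    """Run checks in order; stop and return False at the first non-zero exit."""
    for cmd in commands:
        if runner(cmd) != 0:
            print("FAILED: " + " ".join(cmd))
            return False
    return True
```

Failing fast keeps the PR feedback short: a render error from flux-local test makes the later schema and policy results meaningless, so they are skipped.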
Stage 3: vCluster Testing (for risky changes)
Not every PR needs vCluster — use it for:
- Gateway API migration (new networking stack)
- CRD changes or operator upgrades
- Path-based Flux scoping restructure
- Any change that could break cluster-wide resources
Workflow:
PR created → flux-local validates → reviewer approves →
deploy to vCluster → verify → merge to main → Flux reconciles prod
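The decision of whether a PR takes the vCluster detour could be automated from the changed file paths. The sketch below is one way to do that. The glob patterns are hypothetical stand-ins for the risky categories listed above, not the repository's actual layout.

```python
from fnmatch import fnmatchcase
from typing import List, Sequence

# Hypothetical patterns for the risky categories; adjust to the real repo layout.
RISKY_PATTERNS = (
    "*/gateway-api/*",  # Gateway API migration
    "*/crds/*",         # CRD changes
    "*/operators/*",    # operator upgrades
    "*/flux-system/*",  # Flux scoping restructure
)


def needs_vcluster(changed_paths: Sequence[str]) -> bool:
    """Return True if any changed path matches a risky pattern."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in RISKY_PATTERNS
    )


def pipeline_stages(changed_paths: Sequence[str]) -> List[str]:
    """Return the workflow steps for a PR, adding the vCluster steps only when needed."""
    stages = ["flux-local validates", "reviewer approves"]
    if needs_vcluster(changed_paths):
        stages += ["deploy to vCluster", "verify"]
    stages += ["merge to main", "Flux reconciles prod"]
    return stages
```

For example, `pipeline_stages(["kubernetes/apps/network/gateway-api/app/helmrelease.yaml"])` includes the two vCluster steps, while a change limited to an ordinary app's ConfigMap goes straight from review to merge.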
Stage 4: Production
- Flux reconciles from main automatically
- Alertmanager notifies on failures
- Tuppr handles Talos/K8s upgrades in maintenance windows
- VolSync backs up stateful data
Tasks
- task commands for vCluster test deployment
Key Insight
flux-local is the gatekeeper, vCluster is the proving ground. Most changes only need flux-local. vCluster is for the scary stuff — networking changes, operator upgrades, structural refactors. This keeps velocity high while managing risk.
Related Issues