feat: improve status on team operator resources#94
Conversation
|
Claude finished @ian-flores's task —— View job Review: feat: improve status on team operator resourcesOverall this is a well-structured change that follows Kubernetes conventions for conditions-based status reporting. The Critical:
|
Tested on ganso01-stagingDeployed adhoc image
|
|
Nice! It looks like the comment |
…ut data loss Extend the enable/disable/teardown pattern from Connect (PR #93) to Workbench, Package Manager, and Chronicle. - Add spec.{workbench,packageManager,chronicle}.enabled and .teardown fields to the Site CRD - Add Suspended field to Workbench, PackageManager, and Chronicle CRDs - Site controller uses three-way branching: enabled → reconcile, disabled → suspend (preserve data), teardown → delete CR - Product controllers skip serving resources when Suspended=true - Network policies cleaned up when products are disabled - Fix Workbench cleanupDeployedService (was a no-op) - Fix Chronicle CleanupChronicle (was a TODO no-op) - Add 9 tests (3 per product: disable-never-enabled, suspend, teardown) Closes #95
- Remove unused getChronicle test helper - Correct "CRD" to "CR" in cleanup function godoc comments - Fix double blank line in product-team-site-management.md
|
Let me wait to finish #99 and I'll address comprehensively for all |
# Conflicts: # internal/controller/core/connect.go
…into improve-status-fields # Conflicts: # internal/controller/core/chronicle_controller.go # internal/controller/core/package_manager.go # internal/controller/core/workbench.go
A disabled product (enabled: false) has no child CR, which caused aggregateChildStatus to set *Ready = false and block the site from reaching Ready. Check Enabled first for each product and treat disabled products as ready so they don't factor into the aggregate. Adds TestSiteReadyWithDisabledProducts to cover this case.
Changes: - Add `Enabled` check for Flightdeck in `aggregateChildStatus` (mirrors the existing Chronicle/Connect/Workbench/PackageManager pattern) so a disabled Flightdeck is treated as ready and doesn't permanently block `allReady` - Change `aggregateChildStatus` signature to return `error`; on non-`IsNotFound` API errors, log and return the error instead of silently setting the ready flag to false - Update `Reconcile` to capture the returned error from `aggregateChildStatus` and propagate it as the return value after status patch, triggering reconciler requeue with backoff on transient API/RBAC failures
Build succeeds. The test failures are pre-existing environment issues (kubebuilder's etcd binary not present in this sandbox), confirmed by the same failures occurring before my changes. Changes: - Fix `allReady` permanently false for Sites with optional Chronicle/Flightdeck: change guard condition from `Enabled != nil && !*Enabled` (only explicit false) to `Enabled == nil || !*Enabled` (nil or false) so these optional components don't block site readiness unless explicitly opted in via `Enabled: true` - Normalize `jsonPath` filter quoting from single to double quotes in Helm chart CRDs for connects, postgresdatabases, and sites - Normalize `jsonPath` filter quoting from single to double quotes in base CRDs for connects, postgresdatabases, and sites
JSONPath single-quote style fix regenerated by controller-gen.
Build passes. The only failing test (`TestSiteReconcileWithExperimental`) is pre-existing and unrelated to these changes. --- Changes: - Add comment to `WithStatusSubresource` block in `api/localtest/fake.go` explaining the purpose of the list and reminding future authors to add new `+kubebuilder:subresource:status` types - Add `api/localtest/fake_test.go` with `TestStatusUpdateOnlyMutatesStatus` that calls `Status().Update()` on a `Site`, confirms the status field is persisted, and confirms the spec was not mutated — documenting intent and guarding against regression if the builder pattern is accidentally reverted
- Add PatchErrorStatus helper to reduce error-path boilerplate across all controllers - Fix inconsistent JSONPath quoting in CRD printcolumn annotations (use single quotes) - Add comprehensive tests for internal/status package - Regenerate CRDs and Helm charts with updated annotations
The build succeeds. The test failures (`TestConnectReconciler_SAML*`) are due to missing `etcd` binary in this environment (`fork/exec /usr/local/kubebuilder/bin/etcd: no such file or directory`) — these are infrastructure-level integration tests that require a running control plane and are unrelated to my changes. All other tests pass. Changes: - Move `patchBase` creation and status mutations (`ObservedGeneration`, `SetProgressing`) to after the suspension guard so they are not silently discarded when `suspendDeployedService` is called (which does not apply the patch) - Remove extra blank line between suspension block and `secretKey` declaration (already resolved in the working tree; confirmed single blank line present)
The build passes cleanly. The test failures are all infrastructure-related (`/usr/local/kubebuilder/bin/etcd: no such file or directory`) — these are integration tests requiring a Kubernetes control plane that isn't available in this environment. They are pre-existing failures unrelated to the changes made. --- **Changes:** - Move suspension check before `patchBase` and status mutations in `ReconcileChronicle` so suspended resources don't have unpersisted status writes - Move suspension check before `patchBase` and status mutations in `ReconcilePackageManager` for the same reason - Move suspension check before `patchBase` and status mutations in `ReconcileWorkbench` for the same reason - Suspended resources no longer receive a misleading "Reconciling" progressing condition
Summary
CommonProductStatustype withConditions,ObservedGeneration, andVersionfields, embedded in all status structs including PostgresDatabaseinternal/statushelper package withSetReady(),SetProgressing(),IsReady(),ExtractVersion()and reason constants (ReasonReconcileComplete,ReasonDeploymentReady/NotReady,ReasonStatefulSetReady/NotReady,ReasonAllComponentsReady,ReasonComponentsNotReady,ReasonDatabaseReady)Progressingcondition correctly reflects rollout state:Trueduring active rollouts,FalsewithReasonReconcileCompletewhen doneConnectReady,WorkbenchReady, etc.) andOwns()watches on child CRsReady,Version, andAgeprintcolumns to all CRDs forkubectl getoutputStatus().Update()toStatus().Patch()withMergeFromto avoid conflictsReadybool field as+optionalfor patch compatibility; value now derived from theReadyconditionWithStatusSubresourcefor all CRD typesExtractVersionhandles registry ports (e.g.,localhost:5000/image:v1) and digest references (image:tag@sha256:...)Closes #25
Test plan
just buildpassesjust mgenerateregenerates CRDs and deepcopy without errorsjust helm-generatekeeps Helm chart in syncgo fmtproduces no changeskubectl get connectsshows READY/VERSION/AGE columns