Skip to content

Conversation

@chethanv28
Copy link
Collaborator

@chethanv28 chethanv28 commented Dec 29, 2025

What this PR does / why we need it:
There have ben issues reported where, after a namespace deletion operation, Supervisor PVCs remain in terminating state due to CnsNodeVmAttachment CR getting deleted quickly & the PVC protection finalizer removal step is skipped. Although the occurrence of this issue is sporadic, adding a cleanup routine to remove finalizers from stale Supervisor PVCs reclaims space.

Hence, this PR is adding a routine to Cleanup PVC finalizers added by cnsnodevmattachment on Supervisor PVCs, by periodically checking for such orphan Supervisor PVCs which remain in terminating state.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Testing done:
WCP Pre-checkin pipeline : https://jenkins-vcf-csifvt.devops.broadcom.net/job/wcp-instapp-e2e-pre-checkin/845/

<br> PR 3837<br>
Ran 13 of 2324 Specs
SUCCESS! -- 13 Passed | 0 Failed | 0 Pending | 2311 Skipped | 0 Flaked

VKS Pre-checkin pipeline: https://jenkins-vcf-csifvt.devops.broadcom.net/job/vks-instapp-e2e-pre-checkin/793/

<br> PR 3837<br>
Ran 5 of 2324 Specs
SUCCESS! -- 5 Passed | 0 Failed | 0 Pending | 2319 Skipped | 0 Flaked

Manual testing details:

Created a Supervisor PVC & added the cns pvc protection finalizer.
Deleted the PVC and verified it remains in terminating state.

kubectl get pvc -A
NAMESPACE             NAME              STATUS        VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  VOLUMEATTRIBUTESCLASS   AGE
test-gc-e2e-demo-ns   example-rwo-pvc   Terminating   pvc-d0d86cc3-aea5-4a0d-91b4-6262690a65ee   10Mi       RWO            vsan-default-storage-policy   <unset>                 6m31s

Wait for few minutes and verify that finalizer gets removed & PVC gets deleted successfully

kubectl logs vsphere-csi-controller-588b66f6bb-vpq76 -c vsphere-syncer -n vmware-system-csi | grep "successfully removed CNS PVC final"
{"level":"info","time":"2026-01-06T22:13:30.856345278Z","caller":"manager/cleanupcustomresources.go:330","msg":"successfully removed CNS PVC finalizer from orphaned PVC","TraceId":"6395196a-03cb-4493-94d8-6134ecd71fb9","name":"example-rwo-pvc","namespace":"test-gc-e2e-demo-ns"}

Special notes for your reviewer:

Release note:

Cleanup PVC finalizers added by cnsnodevmattachment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 29, 2025
@deepakkinni
Copy link
Collaborator

Triggering CSI-WCP Pre-checkin Pipeline for this PR... Job takes approximately an hour to complete
Jenkins Build #843

@deepakkinni
Copy link
Collaborator

SUCCESS --- Jenkins Build #845

@kubernetes-sigs kubernetes-sigs deleted a comment from deepakkinni Jan 7, 2026
@kubernetes-sigs kubernetes-sigs deleted a comment from deepakkinni Jan 7, 2026
@deepakkinni
Copy link
Collaborator

Triggering CSI-TKG Pre-checkin Pipeline for this PR... Job takes approximately an hour to complete
Jenkins Build #792

@deepakkinni
Copy link
Collaborator

SUCCESS --- Jenkins Build #793

@xing-yang
Copy link
Contributor

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chethanv28, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [chethanv28,xing-yang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

for namespace := range orphanPVCs {
nodeAttachList := nodeattachv1a1.CnsNodeVmAttachmentList{}
err = cnsClient.List(ctx, &nodeAttachList, ctrlclient.InNamespace(namespace))
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code skips cleanup for an entire namespace if ANY CnsNodeVmAttachment CR exists. This means truly orphaned PVCs will never be cleaned up as long as there's at least one CR in the namespace.

// map to hold all the PVCs that are orphaned by CnsNodeVmAttachment reconciler.
// structure of the map follows the logical grouping kubernetes uses.
// namespace -> pvc -> struct{}{}
orphanPVCs := make(map[string]map[string]struct{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using map[string]map[string]struct{} loses the PVC object reference, which could be useful for additional validation.

// All the PVCs that are remaining in the map are eligible orphans whose finalizer can be removed.
for namespace, pvcs := range orphanPVCs {
for name := range pvcs {
err := k8s.RemoveFinalizerFromPVC(ctx, c, name, namespace, cnsoperatortypes.CNSPvcFinalizer)
Copy link
Member

@divyenpatel divyenpatel Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to make sure this routine is not having race with clean up routine in batchattach

Copy link
Member

@divyenpatel divyenpatel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you address review comments?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants