fix(k8s): prevent goroutine leak and deadlock in pod watcher #3883

Open
vishwas-droid wants to merge 5 commits into knative:main from vishwas-droid:fix-k8s-watcher-leak

@vishwas-droid

Changes

  • 🐛 Fixed a critical goroutine leak inside runWithVolumeMounted by making the pod watcher loop context-aware (<-localCtx.Done()).
  • 🐛 Prevented potential deadlocks/infinite hangs by introducing an errCh to handle abrupt or unexpected watch channel closures from the Kubernetes API server.
  • 🧹 Cleaned up a double type assertion on event.Object to eliminate any sudden runtime panic risks when unexpected object states are encountered.

/kind bug

Release Note

Fixed a potential goroutine leak and deadlock scenario in the Kubernetes pod watcher inside `pkg/k8s`.

@knative-prow knative-prow Bot added the kind/bug Bugs label Jun 6, 2026
@linux-foundation-easycla

linux-foundation-easycla Bot commented Jun 6, 2026

CLA Signed
The committers listed above are authorized under a signed CLA.

@knative-prow knative-prow Bot requested review from dsimansk and jrangelramos June 6, 2026 16:02
@knative-prow

knative-prow Bot commented Jun 6, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: vishwas-droid
Once this PR has been reviewed and has the lgtm label, please assign gauron99 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow

knative-prow Bot commented Jun 6, 2026

Welcome @vishwas-droid! It looks like this is your first PR to knative/func 🎉

@knative-prow knative-prow Bot added size/M 🤖 PR changes 30-99 lines, ignoring generated files. needs-ok-to-test 🤖 Needs an org member to approve testing labels Jun 6, 2026
@knative-prow

knative-prow Bot commented Jun 6, 2026

Hi @vishwas-droid. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the Kubernetes pod watcher logic in pkg/k8s to avoid goroutine leaks and hangs while waiting for a pod’s container termination state during runWithVolumeMounted.

Changes:

  • Made the watcher goroutine context-aware so it can exit promptly on cancellation.
  • Added an error channel to surface unexpected watch channel closure from the API server.
  • Removed a redundant/dangerous double type assertion on event.Object to reduce panic risk.
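
To make the last point concrete, here is a small Go sketch of the comma-ok form of a type assertion, which reports a mismatch instead of panicking. The `Pod` and `Status` types and the `podFromObject` helper are hypothetical and only illustrate the idea; they are not code from this PR.

```go
package main

import "fmt"

// Pod and Status are hypothetical stand-ins for objects a watch might deliver.
type Pod struct{ Name string }
type Status struct{ Message string }

// podFromObject returns the pod and true only when obj really is a *Pod.
// An unchecked assertion, obj.(*Pod), would panic on any other type.
func podFromObject(obj any) (*Pod, bool) {
	pod, ok := obj.(*Pod)
	return pod, ok
}

func main() {
	objects := []any{&Pod{Name: "example"}, &Status{Message: "watch error"}, nil}
	for _, obj := range objects {
		if pod, ok := podFromObject(obj); ok {
			fmt.Println("pod:", pod.Name)
		} else {
			fmt.Printf("skipping unexpected object of type %T\n", obj)
		}
	}
}
```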

Comment thread pkg/k8s/persistent_volumes.go Outdated
Comment thread pkg/k8s/persistent_volumes.go
@matejvasek

Contributor

/ok-to-test

@knative-prow knative-prow Bot added ok-to-test 🤖 Non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test 🤖 Needs an org member to approve testing labels Jun 7, 2026
@vishwas-droid

Author

@matejvasek Addressed the Copilot suggestion. PTAL.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread pkg/k8s/persistent_volumes.go
Comment thread pkg/k8s/persistent_volumes.go Outdated
@matejvasek

Contributor

Please rebase on the main branch to fix the CI issues.

@vishwas-droid vishwas-droid force-pushed the fix-k8s-watcher-leak branch from 8b99da8 to ca8558b Compare June 9, 2026 00:07
@vishwas-droid

vishwas-droid commented Jun 9, 2026

Author

Done, rebased on main. Can you please rerun the CI?

@codecov

codecov Bot commented Jun 9, 2026

Codecov Report

❌ Patch coverage is 40.62500% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 53.88%. Comparing base (081d663) to head (69c239b).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/k8s/persistent_volumes.go | 40.62% | 17 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3883      +/-   ##
==========================================
+ Coverage   53.78%   53.88%   +0.10%     
==========================================
  Files         200      200              
  Lines       23652    23674      +22     
==========================================
+ Hits        12721    12757      +36     
+ Misses       9707     9683      -24     
- Partials     1224     1234      +10     
| Flag | Coverage Δ | |
|---|---|---|
| e2e | 33.46% <40.62%> (-0.01%) | ⬇️ |
| e2e go | 29.40% <50.00%> (+<0.01%) | ⬆️ |
| e2e node | 25.73% <50.00%> (+<0.01%) | ⬆️ |
| e2e python | 29.73% <50.00%> (?) | |
| e2e quarkus | 25.85% <50.00%> (?) | |
| e2e rust | 25.32% <50.00%> (+0.06%) | ⬆️ |
| e2e springboot | 23.99% <50.00%> (-0.02%) | ⬇️ |
| e2e typescript | 25.84% <50.00%> (+<0.01%) | ⬆️ |
| e2e-config-ci | 26.93% <0.00%> (-0.03%) | ⬇️ |
| integration | 15.68% <50.00%> (+0.02%) | ⬆️ |
| unit macos-14 | 42.80% <0.00%> (-0.04%) | ⬇️ |
| unit macos-latest | 42.80% <0.00%> (-0.04%) | ⬇️ |
| unit ubuntu-24.04-arm | 43.12% <0.00%> (-0.05%) | ⬇️ |
| unit ubuntu-latest | 43.66% <0.00%> (-0.04%) | ⬇️ |
| unit windows-latest | 42.87% <0.00%> (-0.04%) | ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

@matejvasek

Contributor

@vishwas-droid it appears that a test related to your change has failed. Also, have you addressed all of Copilot's concerns?

@vishwas-droid vishwas-droid force-pushed the fix-k8s-watcher-leak branch from ca8558b to 4e9c6e2 Compare June 9, 2026 15:25
@vishwas-droid

Author

@matejvasek Addressed the Copilot feedback, and the CI failure related to this PR has also been resolved.

@matejvasek

Contributor

/retest

@matejvasek

Contributor

You need to sign the CLA.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread pkg/k8s/persistent_volumes.go
Comment thread pkg/k8s/persistent_volumes.go Outdated
@vishwas-droid

Author

CLA signed, thanks.

Labels

kind/bug Bugs ok-to-test 🤖 Non-member PR verified by an org member that is safe to test. size/M 🤖 PR changes 30-99 lines, ignoring generated files.

3 participants