NIFI-15449 - NAR deletion blocks indefinitely when Python processor is initializing #10753

Merged
exceptionfactory merged 3 commits into apache:main from pvillard31:NIFI-15449 on Feb 6, 2026

@pvillard31
Contributor

Summary

NIFI-15449 - NAR deletion blocks indefinitely when Python processor is initializing

The issue is caused by:

  • Lock-order deadlock: StandardPythonBridge and StandardExtensionDiscoveringManager each wait for the lock the other thread already holds:
    • Thread A (NAR deletion): Acquires StandardExtensionDiscoveringManager lock -> waits for StandardPythonBridge lock
    • Thread B (Processor initialization): Acquires StandardPythonBridge lock -> calls getNarDirectories() which waits for StandardExtensionDiscoveringManager lock
  • Non-interruptible initialization: The Python process initialization (virtual environment creation, debugpy installation) runs in a tight loop with no mechanism to cancel or interrupt it when a shutdown is requested.
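
The lock-order inversion described above can be reproduced in isolation. The following is a minimal sketch, not NiFi code: the two `ReentrantLock` instances are hypothetical stand-ins for the two monitors, and `tryLock` with a timeout is used only so that the demonstration terminates instead of hanging the way the real threads do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustration only: the two ReentrantLocks are hypothetical stand-ins for the
// StandardExtensionDiscoveringManager and StandardPythonBridge monitors.
class LockOrderInversionDemo {

    /** Returns {narDeletionGotSecondLock, initializationGotSecondLock}. */
    static boolean[] run() {
        ReentrantLock managerLock = new ReentrantLock();
        ReentrantLock bridgeLock = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        boolean[] acquired = new boolean[2];

        // Thread A: manager lock first, then bridge lock.
        Thread narDeletion = new Thread(() ->
                acquired[0] = lockInOrder(managerLock, bridgeLock, bothHoldFirst, bothAttempted));
        // Thread B: bridge lock first, then manager lock (opposite order).
        Thread initialization = new Thread(() ->
                acquired[1] = lockInOrder(bridgeLock, managerLock, bothHoldFirst, bothAttempted));
        narDeletion.start();
        initialization.start();
        try {
            narDeletion.join();
            initialization.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired;
    }

    private static boolean lockInOrder(ReentrantLock first, ReentrantLock second,
                                       CountDownLatch bothHoldFirst, CountDownLatch bothAttempted) {
        first.lock();
        try {
            // Wait until both threads hold their first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // With plain lock() or synchronized this call would never return.
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            // Keep holding the first lock until the other thread has also given up.
            bothAttempted.countDown();
            bothAttempted.await();
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("NAR deletion acquired second lock: " + result[0]);
        System.out.println("Initialization acquired second lock: " + result[1]);
    }
}
```

Neither thread can obtain its second lock while the other holds it, which is the state the NAR deletion and processor initialization threads are stuck in when the locks are taken with `synchronized` and no timeout.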

Changes:

  • Break the deadlock: Pre-compute the NAR directories before acquiring the StandardPythonBridge synchronized block, ensuring locks are always acquired in a consistent order.
  • Make initialization interruptible:
    • Add periodic isShutdown() checks during venv creation and dependency installation loops in PythonProcess
    • Add a CANCELLED state to AsyncLoadedProcessor.LoadState and a cancelLoading() method
    • Implement cancellation support in StandardPythonProcessorBridge that sets a flag checked during initialization
    • Track Python processes from the moment they're created (before start() is called) so they can be properly shut down during NAR deletion
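
The two changes above can be sketched together as follows. This is a simplified illustration under stated assumptions, not the code in this pull request: `CancellableInitSketch`, `managerLock`, the step names, and the `afterEachStep` hook are hypothetical, and only `cancelLoading()`, `isShutdown()`, `getNarDirectories()`, and the `CANCELLED` state echo names from the summary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: class, field, and step names are illustrative and are
// not taken from the NiFi code base.
class CancellableInitSketch {
    enum LoadState { INITIALIZING, FINISHED, CANCELLED }

    // Stand-in for the extension manager's monitor.
    private final Object managerLock = new Object();
    private final List<String> completedSteps = new ArrayList<>();
    private volatile boolean shutdown = false;
    private volatile LoadState state = LoadState.INITIALIZING;

    // Stand-in for getNarDirectories(), which needs the manager lock.
    private List<String> getNarDirectories() {
        synchronized (managerLock) {
            return List.of("nar-a", "nar-b");
        }
    }

    // Deliberately not synchronized on the bridge: a cancel request must not
    // have to wait for the initialization it is trying to stop.
    void cancelLoading() {
        shutdown = true;
    }

    boolean isShutdown() {
        return shutdown;
    }

    List<String> completedSteps() {
        return completedSteps;
    }

    // afterEachStep is a hypothetical hook used here to simulate a shutdown
    // request arriving between two initialization steps.
    LoadState initialize(List<String> steps, Runnable afterEachStep) {
        // Take and release the manager lock BEFORE entering the bridge
        // monitor, so this thread never holds the bridge while waiting for it.
        final List<String> narDirectories = getNarDirectories();

        synchronized (this) { // stand-in for the bridge's synchronized block
            for (String step : steps) {
                // Periodic check: stop before starting the next long step.
                if (isShutdown()) {
                    state = LoadState.CANCELLED;
                    return state;
                }
                completedSteps.add(step + " using " + narDirectories);
                afterEachStep.run();
            }
            state = LoadState.FINISHED;
            return state;
        }
    }

    public static void main(String[] args) {
        CancellableInitSketch sketch = new CancellableInitSketch();
        List<String> steps = List.of("create-venv", "install-debugpy", "install-dependencies");
        // Request cancellation as soon as the first step completes.
        LoadState result = sketch.initialize(steps, sketch::cancelLoading);
        System.out.println(result + " after " + sketch.completedSteps().size() + " step(s)");
    }
}
```

In this sketch the shutdown flag is only observed between steps, so a step that is already running still completes; how fine-grained the checks need to be depends on how long each step can take.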

Tracking

Please complete the following tracking steps prior to pull request creation.

Issue Tracking

Pull Request Tracking

  • Pull Request title starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull Request commit message starts with Apache NiFi Jira issue number, such as NIFI-00000
  • Pull request contains commits signed with a registered key indicating Verified status

Pull Request Formatting

  • Pull Request based on current revision of the main branch
  • Pull Request refers to a feature branch with one commit containing changes

Verification

Please indicate the verification steps performed prior to pull request creation.

Build

  • Build completed using ./mvnw clean install -P contrib-check
    • JDK 21
    • JDK 25

Licensing

  • New dependencies are compatible with the Apache License 2.0 according to the License Policy
  • New dependencies are documented in applicable LICENSE and NOTICE files

Documentation

  • Documentation formatting appears as expected in rendered files

@pvillard31 added the python label (Pull requests that update Python code) on Jan 9, 2026
Contributor

@exceptionfactory exceptionfactory left a comment

Thanks for working on this issue @pvillard31. The general strategy looks good. I noted a few initial recommendations. I also recommend reviewing most of the new log and exception messages to include a relevant detail, such as the Process ID or component ID where applicable.

@pvillard31
Contributor Author

Thanks for the review @exceptionfactory - pushed a commit to address your comments

@pvillard31 added the bug label on Feb 4, 2026
Contributor

@exceptionfactory exceptionfactory left a comment

Thanks for the initial updates @pvillard31. On review, the overall approach looks good, but I noted quite a few logs and comments that seemed too verbose and not useful. After cleaning these up, this should be ready to go forward.

…s initializing

Signed-off-by: Pierre Villard <pierre.villard.fr@gmail.com>
@pvillard31
Contributor Author

pvillard31 commented Feb 5, 2026

Thanks for the extensive review @exceptionfactory, I really appreciate it. I added a lot of logs and comments while debugging the issue and tracking down the deadlock, especially as it was my first time really digging into this part of the code. I just pushed a commit to clean up everything. Thanks again.

Contributor

@exceptionfactory exceptionfactory left a comment

Thanks for working through the feedback @pvillard31, the latest version looks good! +1 merging

@exceptionfactory exceptionfactory merged commit 83259a1 into apache:main Feb 6, 2026
15 of 16 checks passed
yisun-anetac pushed a commit to Eng-Anetac/nifi that referenced this pull request Apr 4, 2026
…ssor is initializing (apache#10753)

Signed-off-by: David Handermann <exceptionfactory@apache.org>

Labels

bug, python (Pull requests that update Python code)
