chaubold approved these changes on Mar 19, 2026
9af8607 to 12b3548
…unication failures

Logs gateway lifecycle events (creation, closure) with thread and identity information to help diagnose the root cause of "Cannot obtain a new communication channel" errors. Tracks gateway ownership through object hash codes for correlation.

- DefaultPythonGateway.close(): Log PID and calling thread at INFO level
- PythonScriptingSession: Log gateway hash and thread at creation (INFO) and shutdown (ERROR)
- PythonGatewayTracker.clear(): Log process count and triggering thread at ERROR level
- QueuedPythonGatewayFactory: Log eviction count and thread at gate-close (WARN)
- PythonGatewayCreationGate: Include thread name in P2 phase event logs (INFO)
- PythonScriptNodeModel: Handle no-cause "Cannot obtain" variant with improved error message

When this error occurs again, correlating gateway hash and PID across log entries will reveal which code path triggered the unexpected shutdown.

AP-25563 (Investigate "Cannot obtain a new communication channel" Python failures)
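
The correlation step described above could be sketched roughly as follows. This is only an illustration: it assumes the PID appears in log lines as `PID=<number>`, as in the `Closing PythonGateway (PID=%s) from thread '%s'` format from this change. The class name `PidLogCorrelationSketch`, the method `groupByPid`, and the second and third sample lines are hypothetical and not part of the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the PR: groups log lines by the PID they mention. */
final class PidLogCorrelationSketch {

    // Assumption: PIDs are logged as "PID=<digits>".
    private static final Pattern PID = Pattern.compile("PID=(\\d+)");

    private PidLogCorrelationSketch() {
    }

    /** Returns the lines that mention a PID, grouped by PID in order of first appearance. */
    static Map<String, List<String>> groupByPid(final List<String> logLines) {
        final Map<String, List<String>> byPid = new LinkedHashMap<>();
        for (final String line : logLines) {
            final Matcher matcher = PID.matcher(line);
            if (matcher.find()) {
                byPid.computeIfAbsent(matcher.group(1), k -> new ArrayList<>()).add(line);
            }
        }
        return byPid;
    }

    public static void main(final String[] args) {
        // Sample lines; only the first one follows a format taken from the PR.
        final List<String> lines = List.of(
            "Closing PythonGateway (PID=4711) from thread 'worker-1'",
            "Some unrelated log entry",
            "Hypothetical entry mentioning PID=4711 again");
        groupByPid(lines).forEach((pid, entries) -> System.out.println(pid + " -> " + entries));
    }
}
```

Grouping by gateway hash would work the same way with a second pattern, provided the hash is logged in a consistent format.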
12b3548 to ad8b02f
Pull request overview
Adds diagnostic logging and improved error messaging across the KNIME Java↔Python (Py4J) gateway lifecycle to help trace intermittent “Cannot obtain a new communication channel” failures by correlating gateway identity, PID, and thread context.
Changes:
- Add structured lifecycle logs (thread, gateway hash, PID, eviction counts) across gateway creation/closure and installation gating.
- Improve handling of the no-cause “Cannot obtain a new communication channel” Py4J exception with a clearer KNIMEException message and resolutions.
- Enhance installation-phase logs and tracker diagnostics to better identify who/what triggered gateway shutdown/termination.
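
As a rough illustration of the lifecycle logging listed above, the sketch below logs the same identity hash, PID, and thread name on creation and on close so the two entries can be matched later. It uses `java.util.logging` rather than KNIME's logger, and the class `GatewayLifecycleLogSketch` and its `describe` helper are hypothetical stand-ins, not the classes changed in this PR.

```java
import java.util.logging.Logger;

/** Hypothetical stand-in for a gateway; not the KNIME implementation. */
final class GatewayLifecycleLogSketch implements AutoCloseable {

    private static final Logger LOGGER = Logger.getLogger(GatewayLifecycleLogSketch.class.getName());

    private final long m_pid;

    GatewayLifecycleLogSketch(final long pid) {
        m_pid = pid;
        LOGGER.info(describe("Created", this, pid));
    }

    /** Builds one log line with the object's identity hash, the PID, and the calling thread. */
    static String describe(final String event, final Object gateway, final long pid) {
        return String.format("%s gateway (hash=%s, PID=%d) from thread '%s'", event,
            Integer.toHexString(System.identityHashCode(gateway)), pid, Thread.currentThread().getName());
    }

    @Override
    public void close() {
        LOGGER.info(describe("Closing", this, m_pid));
    }

    public static void main(final String[] args) {
        // Assumption for the demo: the current JVM's PID stands in for the Python process PID.
        try (var gateway = new GatewayLifecycleLogSketch(ProcessHandle.current().pid())) {
            LOGGER.info("Gateway in use");
        }
    }
}
```

`System.identityHashCode` is used here because it stays the same for the lifetime of the object even if `hashCode()` is overridden. It is not guaranteed to be unique across objects, so it is best read together with the PID.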
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| org.knime.python3/src/main/java/org/knime/python3/QueuedPythonGatewayFactory.java | Logs the eviction count and thread when the gateway creation gate closes, before the queued gateways are evicted. |
| org.knime.python3/src/main/java/org/knime/python3/PythonGatewayTracker.java | Adds process count + triggering thread to the “aborting running Python processes” log entry. |
| org.knime.python3/src/main/java/org/knime/python3/PythonGatewayCreationGate.java | Adds thread name to P2 phase transition INFO logs controlling gateway creation blocking/unblocking. |
| org.knime.python3/src/main/java/org/knime/python3/DefaultPythonGateway.java | Logs PID + calling thread when closing a Python gateway. |
| org.knime.python3.scripting.nodes/src/main/java/org/knime/python3/scripting/nodes2/PythonScriptingSession.java | Logs gateway identity hash on creation; adds targeted ERROR diagnostics when CallbackClient is already shut down. |
| org.knime.python3.scripting.nodes/src/main/java/org/knime/python3/scripting/nodes2/PythonScriptNodeModel.java | Adds a dedicated handler branch for the no-cause “Cannot obtain…” error with a clearer user-facing message. |
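
The dedicated handler branch for the no-cause variant might look roughly like the sketch below. It is an assumption-laden illustration in plain Java: the class `NoCauseChannelErrorSketch`, both method names, and the wording of the user-facing message are hypothetical, and the actual change raises a KNIMEException with resolutions instead of returning a string.

```java
/** Hypothetical sketch of detecting the no-cause "Cannot obtain..." error; not the PR's code. */
final class NoCauseChannelErrorSketch {

    // Message fragment quoted in the PR description.
    static final String CHANNEL_ERROR = "Cannot obtain a new communication channel";

    private NoCauseChannelErrorSketch() {
    }

    /** True if the throwable carries the channel message and has no underlying cause. */
    static boolean isNoCauseChannelError(final Throwable t) {
        final String message = t.getMessage();
        return t.getCause() == null && message != null && message.contains(CHANNEL_ERROR);
    }

    /** Maps the throwable to a message for the user; the wording here is illustrative only. */
    static String toUserMessage(final Throwable t) {
        if (isNoCauseChannelError(t)) {
            return "The connection to the Python process was closed unexpectedly. "
                + "Try re-executing the node.";
        }
        return String.valueOf(t.getMessage());
    }

    public static void main(final String[] args) {
        final Throwable noCause = new IllegalStateException(CHANNEL_ERROR);
        final Throwable withCause = new IllegalStateException(CHANNEL_ERROR, new java.io.IOException("socket closed"));
        System.out.println(isNoCauseChannelError(noCause));
        System.out.println(isNoCauseChannelError(withCause));
        System.out.println(toUserMessage(noCause));
    }
}
```

Checking `getCause() == null` keeps this branch separate from the variant that wraps an underlying exception, which can be reported through its cause instead.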
Comment on lines +220 to +226

    var gatewaysToEvict = m_gateways.values().stream()//
        .flatMap(Collection::stream)//
        .collect(Collectors.toList());
    LOGGER.warnWithFormat(
        "PythonGatewayCreationGate closed: evicting %d queued gateways from thread '%s'",
        gatewaysToEvict.size(), Thread.currentThread().getName());
    evictGateways(gatewaysToEvict);


    @Override
    public void close() throws IOException {
        if (m_clientServer != null) {
            LOGGER.infoWithFormat("Closing PythonGateway (PID=%s) from thread '%s'", m_pid,

Comment on lines +185 to +186

    LOGGER.info("Blocking Python process startup during installation (thread='"
        + Thread.currentThread().getName() + "')");


        + "If this leads to failures in node execution, "
        + "please restart those nodes once the installation has finished");
    LOGGER.errorWithFormat(
        "Found running Python processes (%d). Aborting them to allow installation process. "


