Improve socket connection robustness in external_control.urscript#446

Open
srvald wants to merge 5 commits into UniversalRobots:master from srvald:master

@srvald srvald commented Feb 25, 2026

Summary
This PR strengthens the socket connection logic in external_control.urscript to make driver startup more reliable and fault‑tolerant.
Issue related: #439

What’s changed

Added a connect_socket_with_retry() helper with configurable attempts and delay.
Enforced the required connection order:

  • trajectory_socket
  • script_command_socket
  • reverse_socket (must be last)

Implemented a loop that retries only the sockets that are not yet connected (no redundant reconnects).
Added a popup notification when any connection attempt fails.
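
The retry and ordering behavior listed above can be modeled in a few lines. The following is a minimal Python sketch, not the URScript from this PR: `open_socket` stands in for URScript's `socket_open`, and the parameter names `max_attempts` and `delay`, the `connect_all` helper, and the choice to stop at the first failing socket are assumptions made for illustration.

```python
import time
from typing import Callable, Dict, List

# Required order from the PR description; reverse_socket must be last.
SOCKET_ORDER: List[str] = [
    "trajectory_socket",
    "script_command_socket",
    "reverse_socket",
]


def connect_socket_with_retry(open_socket: Callable[[str], bool], name: str,
                              max_attempts: int = 5, delay: float = 0.0) -> bool:
    """Try to open one socket up to max_attempts times (hypothetical signature)."""
    for attempt in range(max_attempts):
        if open_socket(name):
            return True
        if attempt < max_attempts - 1:
            time.sleep(delay)  # wait before the next attempt
    return False


def connect_all(open_socket: Callable[[str], bool], connected: Dict[str, bool],
                max_attempts: int = 5, delay: float = 0.0) -> bool:
    """One pass over the sockets in order, skipping those already connected.

    Assumption: the pass stops at the first failure, so a later socket
    is never opened before an earlier one is established.
    """
    for name in SOCKET_ORDER:
        if connected.get(name):
            continue  # no redundant reconnect
        connected[name] = connect_socket_with_retry(
            open_socket, name, max_attempts, delay)
        if not connected[name]:
            return False
    return True
```

In this model the caller would show the popup whenever `connect_all` returns `False` and call it again after the operator selects Continue; the `connected` dictionary carries the per-socket state between passes.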

Why

Ensures stable behavior during driver startup under transient network conditions.
Avoids unnecessary reconnection attempts once a socket is successfully established.
Prevents the program from running when the sockets have not been connected.

Testing
Manual tests were performed using netcat to simulate the three server endpoints:

  • When the first socket fails to connect after 5 attempts, a popup appears as expected.
  • Selecting Stop program terminates the script cleanly.
  • Selecting Continue, then enabling the first socket via netcat, results in a successful connection.
  • Upon the second popup (second socket still down), enabling that socket confirms the code does not attempt to reconnect the already‑connected first socket.
  • After enabling all three sockets, the loop exits as intended and the script prints the success messages indicating all sockets are connected.
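
The netcat setup used in these tests can also be approximated with plain TCP listeners. This is a minimal Python sketch under stated assumptions: `start_listeners` is a hypothetical helper, and the endpoint names and ports passed to it are whatever the script under test is configured with (none are taken from this PR).

```python
import socket
from typing import Dict


def start_listeners(ports: Dict[str, int],
                    host: str = "127.0.0.1") -> Dict[str, socket.socket]:
    """Open one listening TCP socket per endpoint name.

    Passing port 0 lets the OS pick a free port, which is handy for
    automated checks; manual tests would pass the configured ports.
    """
    listeners: Dict[str, socket.socket] = {}
    for name, port in ports.items():
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)  # accept queue of one pending connection is enough here
        listeners[name] = srv
    return listeners
```

Starting the listeners one at a time, rather than all at once, reproduces the staged scenario above in which each popup is followed by enabling the next endpoint.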

Notes

The logic explicitly prevents reverse control from opening before the other two sockets are established.
Popup messaging provides clear operator feedback without spamming unnecessary retries.


Note

Medium Risk
Changes startup connection behavior for the three control sockets and can block program start behind a popup/retry loop, so misconfiguration or edge cases could impact driver bring-up.

Overview
Improves resources/external_control.urscript startup robustness by replacing the three one-shot socket_open calls with a connect_socket_with_retry() helper and a loop that enforces ordered connection (trajectory_socket → script_command_socket → reverse_socket).

If any socket cannot be established after retries, the script now shows a blocking error popup with per-socket connection status and only retries the sockets that are still disconnected, preventing external control from running with missing connections.

Written by Cursor Bugbot for commit 553e389. This will update automatically on new commits.

@urfeex urfeex linked an issue Feb 26, 2026 that may be closed by this pull request

codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.32%. Comparing base (7b57b66) to head (553e389).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #446      +/-   ##
==========================================
- Coverage   76.33%   76.32%   -0.01%     
==========================================
  Files         104      104              
  Lines        5531     5529       -2     
  Branches      594      593       -1     
==========================================
- Hits         4222     4220       -2     
  Misses       1010     1010              
  Partials      299      299              
Flag Coverage Δ
start_ursim 83.57% <ø> (+0.96%) ⬆️
ur20-latest 68.70% <ø> (ø)
ur5-3.14.3 71.84% <ø> (+0.04%) ⬆️
ur5e-10.11.0 64.95% <ø> (+0.24%) ⬆️
ur5e-10.12.0 65.85% <ø> (-0.35%) ⬇️
ur5e-10.7.0 64.06% <ø> (-0.36%) ⬇️
ur5e-5.9.4 69.17% <ø> (-3.31%) ⬇️

@urfeex urfeex left a comment

Thank you for this well-documented change! I'll submit the proposed changes and test it in conjunction with the ROS driver and a looping UR program and then probably merge this.

@urfeex urfeex added the bugfix label Feb 26, 2026

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.


popup("Error connecting sockets", title="Socket error", blocking=True, error=True)

end

Stale socket flags after driver restart during popup

Medium Severity

The connection flags (traj_flag, com_flag) are permanently latched True once socket_open succeeds and are never re-evaluated. When the blocking popup is displayed because a later socket failed, the operator's natural recovery action is to restart the driver, which tears down all server-side sockets and invalidates already-established connections. After clicking Continue, only sockets with False flags are retried — stale sockets are skipped. The script then enters the control loop with a mix of dead and live socket connections, causing silent communication failures on trajectory_socket or script_command_socket operations.

@srvald I think that is actually a valid concern. Could you please close the sockets that have already been opened when showing the dialog?
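
One way to read this request, sketched in Python rather than URScript: before the popup is shown, every socket that was opened is closed and its flag cleared, so that all sockets are retried after Continue. The helper name `reset_connections` and the dictionary of flags are hypothetical; `close_socket` stands in for URScript's `socket_close`.

```python
from typing import Callable, Dict


def reset_connections(connected: Dict[str, bool],
                      close_socket: Callable[[str], None]) -> None:
    """Close every socket currently marked as connected and clear its flag.

    Intended to run right before the blocking popup, so no stale
    connection survives a driver restart (assumed usage).
    """
    for name, is_open in connected.items():
        if is_open:
            close_socket(name)  # stands in for socket_close(name)
            connected[name] = False
```

The trade-off is that sockets which were still healthy get reconnected after the popup, in exchange for never entering the control loop with a connection whose server side has gone away.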

@urfeex urfeex left a comment

I noticed during testing that, with a looping program, the script can reconnect to the socket before it has been properly shut down on the remote side. When that happens, it hangs in the first read again.

Adding a timeout to the first read would be the obvious way to solve this, but we added that on purpose.

Alternatively, a sleep at the end of the program would also help, but that doesn't feel really clean to me either.

srvald and others added 3 commits February 26, 2026 10:11
More documented error popup depending on the socket failing

Co-authored-by: Felix Exner <feex@universal-robots.com>
Remove textmsg as external control active message will appear after sockets have been connected successfully

Co-authored-by: Felix Exner <feex@universal-robots.com>
Development

Successfully merging this pull request may close these issues.

External Control program doesn't fail if opening the sockets fail

2 participants