Description
The Problem
Currently, the latency sorting in dzd_latency.rs ranks devices and tunnel endpoints by min_latency_ns alone. Because sort_by is a stable sort, two devices with exactly the same minimum latency keep their original order in the array, so the winner is decided by input position rather than by connection quality.
In physical networks, it is common for two distinct paths (e.g., devices in the same metro area or datacenter) to share the same minimum propagation delay but have vastly different congestion profiles:
- Path A: Min = 15ms, Avg = 80ms (High congestion / Jitter)
- Path B: Min = 15ms, Avg = 16ms (Clean / Idle)
Under the current logic, if Path A happens to come first in the input, the client connects to the highly congested device and ignores the clean connection sitting right next to it.
Proposed Solution
We should update the sorting closures in retrieve_latencies and select_tunnel_endpoint to compare min_latency_ns first and, when those values are equal, use avg_latency_ns as a secondary tie-breaker so that the lower average wins.
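A minimal sketch of what the updated comparison could look like. The LatencyRecord struct, the device_code field, the u64 field types, and the compare_latency and sort_latencies helpers are assumptions made for illustration. Only the min_latency_ns and avg_latency_ns names come from this issue, and the real closures in dzd_latency.rs may be shaped differently.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the record type sorted in dzd_latency.rs.
/// Field types are assumed; only the two latency field names come from the issue.
#[derive(Debug, Clone, PartialEq)]
struct LatencyRecord {
    device_code: String,
    min_latency_ns: u64,
    avg_latency_ns: u64,
}

/// Compares by minimum latency first; on a tie, the lower average latency wins.
fn compare_latency(a: &LatencyRecord, b: &LatencyRecord) -> Ordering {
    a.min_latency_ns
        .cmp(&b.min_latency_ns)
        .then_with(|| a.avg_latency_ns.cmp(&b.avg_latency_ns))
}

/// Sorts records in place, best candidate first.
fn sort_latencies(records: &mut [LatencyRecord]) {
    // sort_by is stable, so records equal on both keys still keep input order.
    records.sort_by(compare_latency);
}
```

Ordering::then_with only evaluates the average comparison when the minimums are equal, so the primary ranking by min_latency_ns is unchanged for every non-tied pair.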
Testing Strategy
Alongside the fix, we should add a dedicated unit test (e.g., test_retrieve_latencies_tiebreaker_prefers_lower_avg) that seeds two mock devices with identical minimum latencies but differing averages. The test should assert that the device with the lower average is ranked first, which verifies the tie-breaker and guards against regressions.
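One possible shape for such a test, as a sketch only. MockDevice and rank_devices are hypothetical stand-ins; the real test would call retrieve_latencies with whatever mocks the existing test suite already uses. The congested device is deliberately listed first so that a stable sort on min_latency_ns alone would fail the assertion.

```rust
/// Hypothetical mock; the real test would use the project's own device type.
#[derive(Debug, Clone)]
struct MockDevice {
    code: &'static str,
    min_latency_ns: u64,
    avg_latency_ns: u64,
}

/// Hypothetical helper standing in for the ordering retrieve_latencies would apply:
/// minimum latency first, average latency as the tie-breaker.
fn rank_devices(mut devices: Vec<MockDevice>) -> Vec<MockDevice> {
    devices.sort_by_key(|d| (d.min_latency_ns, d.avg_latency_ns));
    devices
}

#[test]
fn test_retrieve_latencies_tiebreaker_prefers_lower_avg() {
    // Same 15 ms minimum, very different averages; the congested device comes first.
    let devices = vec![
        MockDevice { code: "congested", min_latency_ns: 15_000_000, avg_latency_ns: 80_000_000 },
        MockDevice { code: "clean", min_latency_ns: 15_000_000, avg_latency_ns: 16_000_000 },
    ];

    let ranked = rank_devices(devices);

    assert_eq!(ranked[0].code, "clean");
    assert_eq!(ranked[1].code, "congested");
}
```

A matching test for select_tunnel_endpoint would follow the same pattern with two endpoints that tie on minimum latency.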