Bound shutdown after termination signal #6164
|
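If an unexpected Close() block is observed, we should track down exactly what is blocking Close. Taking too long is certainly a design problem. Trying to close asynchronously is not a good approach. |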
We have a similar situation to #5844, but this can happen on both the server and the client. There is another reason most people didn't notice this problem: port reuse is allowed, so the client can start another process without knowing that a zombie process exists. Since most people don't know about or don't use this feature, it would be better to disable it by default; if it had been disabled, this issue might have been found much sooner. Since we're closing the process anyway, I think this would be enough and GC will handle the dirty work. Yes, there may be a better approach, but it needs deep inspection and a very good understanding of the project structure, and it would take a lot of time and testing. |
|
https://github.com/XTLS/Xray-core/tree/close-trace |
I handled this before on the server side in my project, where I kill the zombie process manually with a double check. At the time we found no exact pattern for this, and there is no guarantee it will happen again. We were also seeing it mostly on the client side (usually mobile).
I tried this version on Windows but was unable to reproduce the issue.
|
This change prevents Xray from waiting indefinitely during graceful shutdown after receiving SIGTERM or interrupt.
Previously, shutdown depended on the deferred server.Close() call returning. If any feature Close() path blocked, the process could stay alive while listeners or active connections continued operating. This made stale Xray processes difficult to detect, especially when another process could bind the same port.
Now Xray runs server.Close() explicitly after receiving a termination signal and waits up to 10 seconds. If shutdown does not complete in time, the process exits with code 1 so supervisors can treat it as a failed shutdown and avoid leaving a functional stale process behind.
This issue was most visible on mobile clients. Some app traffic, such as Telegram proxy-style traffic, may continue using an already-established path even when the VPN/client UI appears disabled. If the old Xray process remains alive during a stuck shutdown, those connections can continue transferring traffic and updating usage counters, so the node looks stopped from the client side while traffic is still flowing server-side.