Poll for environment available in test process startup in GetProcessTmpDir tests#5782
Poll for environment available in test process startup in GetProcessTmpDir tests#5782hoyosjs wants to merge 1 commit intodotnet:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses a Linux timing race where /proc/{pid}/environ may be temporarily empty immediately after Process.Start, impacting TMPDIR resolution in GetProcessTmpDir-related tests.
Changes:
- Add a short polling loop in
GetProcessTmpDir_ChildProcess_ReadsTmpdirto wait for/proc/{pid}/environto become non-empty. - Add an early-return in
GetProcessTmpDirwhenenvironDatais empty, documenting the fork/exec timing window.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/tests/Microsoft.Diagnostics.NETCore.Client/PidIpcEndpointTests.cs | Adds polling for /proc/{pid}/environ population before collecting diagnostics / calling TMPDIR resolution. |
| src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcTransport.cs | Adds handling for empty environ reads in GetProcessTmpDir (documenting fork/exec race). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| bool environPopulated = false; | ||
| const long MAX_TRIES = 20; | ||
| const int DELAY_MS = 50; | ||
| for (int attempt = 0; attempt < MAX_TRIES && !environPopulated; attempt++) |
There was a problem hiding this comment.
Local const naming/type here is inconsistent with the rest of the test suite: other tests use camelCase locals for consts (e.g., retryCount, maxAttempts). Also MAX_TRIES is a long but is only used in an int loop counter. Consider switching to const int maxTries = 20; const int delayMs = 50; (or similar) for consistency and to avoid mixed integral types.
| // Wait for /proc/{pid}/environ to be populated. | ||
| // Between fork() and execve(), the environ file may be empty (0 bytes). | ||
| bool environPopulated = false; | ||
| const long MAX_TRIES = 20; | ||
| const int DELAY_MS = 50; | ||
| for (int attempt = 0; attempt < MAX_TRIES && !environPopulated; attempt++) | ||
| { | ||
| try | ||
| { | ||
| environPopulated = File.ReadAllBytes(environPath).Length > 0; | ||
| } | ||
| catch (IOException) | ||
| { | ||
| // File may be temporarily unavailable. | ||
| } | ||
|
|
||
| if (!environPopulated) | ||
| { | ||
| Thread.Sleep(DELAY_MS); | ||
| } | ||
| } |
There was a problem hiding this comment.
The poll result is computed but not used to ensure the subsequent GetProcessTmpDir call won’t still observe an empty /proc/{pid}/environ. If environPopulated remains false after the loop, the test can still flake (reading empty environ and falling back), defeating the purpose of the retry. Consider either (a) failing fast with the collected diagnostics when environPopulated is false, or (b) reusing the same retry loop around the GetProcessTmpDir call itself until it returns the expected TMPDIR (with a bounded timeout).
| environReadable = true; | ||
|
|
||
| if (environData.Length == 0) | ||
| { | ||
| // Empty environ can mean the process has no environment variables, or that we read | ||
| // between fork() and execve() before the kernel populates the environment. | ||
| return tmpDirPath; | ||
| } | ||
|
|
There was a problem hiding this comment.
GetProcessTmpDir now explicitly recognizes that an empty /proc/{pid}/environ can be a transient fork/exec window, but it still immediately returns the fallback temp path in that case. This can cause callers (e.g., GetDefaultAddress) to search the wrong directory when the target actually has TMPDIR set, producing a spurious connection failure. Consider adding a small bounded retry when environData.Length == 0 (similar to the test change), and only fall back after retries (optionally treating the final empty read as environReadable = false so the error message guidance remains accurate).
| environReadable = true; | |
| if (environData.Length == 0) | |
| { | |
| // Empty environ can mean the process has no environment variables, or that we read | |
| // between fork() and execve() before the kernel populates the environment. | |
| return tmpDirPath; | |
| } | |
| // Retry a few times if the environment is initially empty, to avoid the fork/exec window | |
| // where the kernel has not yet populated /proc/{pid}/environ. | |
| const int maxEmptyEnvironRetries = 3; | |
| const int emptyEnvironRetryDelayMilliseconds = 10; | |
| int remainingRetries = maxEmptyEnvironRetries; | |
| while (environData.Length == 0 && remainingRetries-- > 0) | |
| { | |
| // Empty environ can mean the process has no environment variables, or that we read | |
| // between fork() and execve() before the kernel populates the environment. | |
| Thread.Sleep(emptyEnvironRetryDelayMilliseconds); | |
| try | |
| { | |
| environData = File.ReadAllBytes(string.Format(_procEnvironPathFormat, hostPid)); | |
| } | |
| catch (DirectoryNotFoundException) | |
| { | |
| // The process may have exited between discovery and reading its environ file. | |
| return tmpDirPath; | |
| } | |
| catch (UnauthorizedAccessException) | |
| { | |
| // /proc/{pid}/environ is owner-readable only (0400); other users' environ is inaccessible. | |
| return tmpDirPath; | |
| } | |
| } | |
| if (environData.Length == 0) | |
| { | |
| // After retries, treat an empty environ as effectively unreadable so callers | |
| // can provide accurate guidance. | |
| environReadable = false; | |
| return tmpDirPath; | |
| } | |
| environReadable = true; |
There was a problem hiding this comment.
@copilot - There are several mutations that occur early in the process life-cycle before the connection can be successfully established. If we wanted to poll while establishing a connection it would be preferable to do it at the layer where we can tell if the connection was successful, not on each individual sub-calculation. We shouldn't add polling here and if we do add connection-level polling elsewhere it is out-of-scope for this PR. A comment at the method definition warning that the calculated TmpDir might change during process startup would be nice though.
There's a short period of time between
Process.Startreturning, and the pinvokes into fork and exec. During this period, the target might not have a populated environ procfs file. Loop with a sleep and try again.