Skip to content

Poll for environment available in test process startup in GetProcessTmpDir tests#5782

Open
hoyosjs wants to merge 1 commit intodotnet:mainfrom
hoyosjs:juhoyosa/test-failure
Open

Poll for environment available in test process startup in GetProcessTmpDir tests#5782
hoyosjs wants to merge 1 commit intodotnet:mainfrom
hoyosjs:juhoyosa/test-failure

Conversation

@hoyosjs
Copy link
Copy Markdown
Member

@hoyosjs hoyosjs commented Mar 28, 2026

There's a short period of time between Process.Start returning, and the pinvokes into fork and exec. During this period, the target might not have a populated environ procfs file. Loop with a sleep and try again.

@hoyosjs hoyosjs requested a review from a team as a code owner March 28, 2026 10:12
Copilot AI review requested due to automatic review settings March 28, 2026 10:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a Linux timing race where /proc/{pid}/environ may be temporarily empty immediately after Process.Start, impacting TMPDIR resolution in GetProcessTmpDir-related tests.

Changes:

  • Add a short polling loop in GetProcessTmpDir_ChildProcess_ReadsTmpdir to wait for /proc/{pid}/environ to become non-empty.
  • Add an early-return in GetProcessTmpDir when environData is empty, documenting the fork/exec timing window.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/tests/Microsoft.Diagnostics.NETCore.Client/PidIpcEndpointTests.cs Adds polling for /proc/{pid}/environ population before collecting diagnostics / calling TMPDIR resolution.
src/Microsoft.Diagnostics.NETCore.Client/DiagnosticsIpc/IpcTransport.cs Adds handling for empty environ reads in GetProcessTmpDir (documenting fork/exec race).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +165 to +168
bool environPopulated = false;
const long MAX_TRIES = 20;
const int DELAY_MS = 50;
for (int attempt = 0; attempt < MAX_TRIES && !environPopulated; attempt++)
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Local const naming/type here is inconsistent with the rest of the test suite: other tests use camelCase locals for consts (e.g., retryCount, maxAttempts). Also MAX_TRIES is a long but is only used in an int loop counter. Consider switching to const int maxTries = 20; const int delayMs = 50; (or similar) for consistency and to avoid mixed integral types.

Copilot uses AI. Check for mistakes.
Comment on lines +163 to +183
// Wait for /proc/{pid}/environ to be populated.
// Between fork() and execve(), the environ file may be empty (0 bytes).
bool environPopulated = false;
const long MAX_TRIES = 20;
const int DELAY_MS = 50;
for (int attempt = 0; attempt < MAX_TRIES && !environPopulated; attempt++)
{
try
{
environPopulated = File.ReadAllBytes(environPath).Length > 0;
}
catch (IOException)
{
// File may be temporarily unavailable.
}

if (!environPopulated)
{
Thread.Sleep(DELAY_MS);
}
}
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The poll result is computed but not used to ensure the subsequent GetProcessTmpDir call won’t still observe an empty /proc/{pid}/environ. If environPopulated remains false after the loop, the test can still flake (reading empty environ and falling back), defeating the purpose of the retry. Consider either (a) failing fast with the collected diagnostics when environPopulated is false, or (b) reusing the same retry loop around the GetProcessTmpDir call itself until it returns the expected TMPDIR (with a bounded timeout).

Copilot uses AI. Check for mistakes.
Comment on lines 447 to +455
environReadable = true;

if (environData.Length == 0)
{
// Empty environ can mean the process has no environment variables, or that we read
// between fork() and execve() before the kernel populates the environment.
return tmpDirPath;
}

Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetProcessTmpDir now explicitly recognizes that an empty /proc/{pid}/environ can be a transient fork/exec window, but it still immediately returns the fallback temp path in that case. This can cause callers (e.g., GetDefaultAddress) to search the wrong directory when the target actually has TMPDIR set, producing a spurious connection failure. Consider adding a small bounded retry when environData.Length == 0 (similar to the test change), and only fall back after retries (optionally treating the final empty read as environReadable = false so the error message guidance remains accurate).

Suggested change
environReadable = true;
if (environData.Length == 0)
{
// Empty environ can mean the process has no environment variables, or that we read
// between fork() and execve() before the kernel populates the environment.
return tmpDirPath;
}
// Retry a few times if the environment is initially empty, to avoid the fork/exec window
// where the kernel has not yet populated /proc/{pid}/environ.
const int maxEmptyEnvironRetries = 3;
const int emptyEnvironRetryDelayMilliseconds = 10;
int remainingRetries = maxEmptyEnvironRetries;
while (environData.Length == 0 && remainingRetries-- > 0)
{
// Empty environ can mean the process has no environment variables, or that we read
// between fork() and execve() before the kernel populates the environment.
Thread.Sleep(emptyEnvironRetryDelayMilliseconds);
try
{
environData = File.ReadAllBytes(string.Format(_procEnvironPathFormat, hostPid));
}
catch (DirectoryNotFoundException)
{
// The process may have exited between discovery and reading its environ file.
return tmpDirPath;
}
catch (UnauthorizedAccessException)
{
// /proc/{pid}/environ is owner-readable only (0400); other users' environ is inaccessible.
return tmpDirPath;
}
}
if (environData.Length == 0)
{
// After retries, treat an empty environ as effectively unreadable so callers
// can provide accurate guidance.
environReadable = false;
return tmpDirPath;
}
environReadable = true;

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot - There are several mutations that occur early in the process life-cycle before the connection can be successfully established. If we wanted to poll while establishing a connection it would be preferable to do it at the layer where we can tell if the connection was successful, not on each individual sub-calculation. We shouldn't add polling here and if we do add connection-level polling elsewhere it is out-of-scope for this PR. A comment at the method definition warning that the calculated TmpDir might change during process startup would be nice though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Tests] GetProcessTmpDir_ChildProcess_ReadsTmpdir fails with Assert.Equal() Failure

4 participants