phd-runner option to defer guest cleanup on failure by lifning · Pull Request #1088 · oxidecomputer/propolis

lifning · 2026-03-24T07:43:43Z

When `--manual-stop-on-failure` is passed, each propolis-server in a failed test case is left running (unless the test case had already shut it down explicitly before failing), and its address is echoed to the operator so that they can, for example, connect to its serial console to investigate whatever caused the failure. The test suite pauses until the instances left in this state are shut down manually, then continues running further tests (unless interrupted).

This can be materially more useful than reproducing a test failure by hand from a transcription of a phd-test's instance spec and steps. A manual reconstruction may diverge from the original on the way to the moment of failure, either because guest shell commands are invoked at human-scale timing or because of errors in transcribing the instance spec. (I also believe this might be a nice convenience to have in general, even absent those factors.)
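As a rough illustration of the behavior described above (not the code in this PR), the cleanup decision for a single guest can be sketched as a small pure function. Everything here is hypothetical: the `Cleanup` enum, `plan_cleanup`, its parameters, and the wording of the notice are made up for the example, and only the `--manual-stop-on-failure` semantics come from the description.

```rust
/// What to do with a test's guest once the test case has finished.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum Cleanup {
    /// The test already shut the guest down; nothing is left to do.
    AlreadyStopped,
    /// Tear the guest down immediately (the default behavior).
    StopNow,
    /// Leave propolis-server running and tell the operator where it is.
    LeaveRunning { notice: String },
}

/// Decide how to clean up one guest after its test case ends.
fn plan_cleanup(
    manual_stop_on_failure: bool,
    test_failed: bool,
    guest_running: bool,
    server_addr: &str,
) -> Cleanup {
    if !guest_running {
        // The test stopped the guest itself before finishing or failing.
        return Cleanup::AlreadyStopped;
    }
    if manual_stop_on_failure && test_failed {
        // Defer cleanup: the operator stops the guest when done debugging.
        return Cleanup::LeaveRunning {
            notice: format!(
                "test failed; propolis-server left running at {server_addr}, \
                 stop it manually to continue"
            ),
        };
    }
    Cleanup::StopNow
}
```

With the flag unset, or with a passing test, the sketch always falls through to `StopNow`, which matches the existing behavior the PR leaves unchanged.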


iximeow

whole smattering of comments. this is neat! and it's cool that the plumbing is not too difficult here, I was a little worried on your behalf at first :D

phd-tests/framework/src/test_vm/mod.rs

phd-tests/runner/src/execute.rs

iximeow · 2026-03-25T19:52:04Z

phd-tests/runner/src/execute.rs

```diff
         );

+        if let Some(tx) = success_tx {
+            let succeeded = !matches!(&test_outcome, TestOutcome::Failed(_));
```
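For readers without the full diff, here is a minimal standalone sketch of what the hunk above appears to do: reduce the test outcome to a bool and hand it to whoever is waiting on the other end. It assumes a `std::sync::mpsc` channel and a simplified `TestOutcome`; the real PR may use a different channel type, and the `report_success` helper and the variant payloads are assumptions for the example.

```rust
use std::sync::mpsc::Sender;

/// Stand-in for the runner's outcome type; the variant payloads are assumed.
#[derive(Debug)]
enum TestOutcome {
    Passed,
    Failed(Option<String>),
    Skipped(Option<String>),
}

/// Tell the guest-cleanup side whether the test succeeded, if anyone is listening.
fn report_success(success_tx: Option<Sender<bool>>, test_outcome: &TestOutcome) {
    if let Some(tx) = success_tx {
        // Anything other than an explicit failure counts as success,
        // so skipped tests do not leave guests behind.
        let succeeded = !matches!(test_outcome, TestOutcome::Failed(_));
        // The receiver may already be gone; in this sketch that is
        // deliberately ignored rather than treated as an error.
        let _ = tx.send(succeeded);
    }
}
```

Note that the receiver sees a plain `bool`, which is the ambiguity the review comment below is pointing at.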


(also, if you clone the outcome and send that along instead of just failed-or-not, it might be nice to have "test failed because ..." as part of the prelude to a particular VM's informational blurb? might get too wordy. also not really attached to this idea so much as it'd be nice for the receivers to avoid the "is `Option<bool>` the test status, or is that the bool, hmm" question)

lifning · 2026-03-26

in my experience the cause of the failure is usually right above this message in the log, so i didn't feel a particular pull to pass it along, but i definitely wouldn't be opposed if you can think of a case for it
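To make the alternative from this thread concrete, here is a hedged sketch of sending the outcome itself rather than a bool, so the receiving side can fold the failure reason into its message. This is not what the PR does. The simplified `TestOutcome`, the `failure_blurb` helper, and the message text are all assumptions, and `std::sync::mpsc` is used only to keep the example self-contained.

```rust
use std::sync::mpsc::Receiver;

/// Stand-in for the runner's outcome type; the variant payloads are assumed.
#[derive(Clone, Debug, PartialEq)]
enum TestOutcome {
    Passed,
    Failed(Option<String>),
    Skipped(Option<String>),
}

/// Wait for the test's outcome and, if it failed, build the text shown
/// before a VM's "still running" notice. Returns `None` when the test
/// did not fail or the sender went away without reporting anything.
fn failure_blurb(outcome_rx: &Receiver<TestOutcome>, server_addr: &str) -> Option<String> {
    match outcome_rx.recv() {
        Ok(TestOutcome::Failed(reason)) => {
            let why = reason.unwrap_or_else(|| "no reason given".to_string());
            Some(format!(
                "test failed ({why}); propolis-server still running at {server_addr}"
            ))
        }
        // Passed, skipped, or the sender was dropped without sending.
        Ok(_) | Err(_) => None,
    }
}
```

The trade-off is the one raised above: the typed outcome removes the `Option<bool>` guesswork, at the cost of repeating a reason that is usually already visible just above in the log.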

phd-tests/framework/src/test_vm/mod.rs

lifning requested a review from iximeow March 24, 2026 07:43

lifning force-pushed the lif/phd-plot-armor branch from 7be0c9e to d2b3f44 Compare March 25, 2026 04:40

iximeow reviewed Mar 25, 2026

PR feedback, thanks ixi

e2470bd

lifning wants to merge 2 commits into master from lif/phd-plot-armor