phd-runner option to defer guest cleanup on failure #1088
When `--manual-stop-on-failure` is passed, each propolis-server in a failed test case is left running (unless the test case had already shut it down explicitly before failing), and its address is echoed to the operator so that they can e.g. connect to its serial console to investigate or debug whatever may have caused the test failure. The test suite pauses until the instances left in this state are shut down manually, then continues running further tests (unless interrupted). Aside from convenience, this can be useful vs. reproducing test failures with manually-reconstructed scenarios via a transcription of a phd-test's instance spec and steps, which may behave differently due to human-scale timing of guest shell command invocations.
7be0c9e to d2b3f44
iximeow left a comment
whole smattering of comments. this is neat! and it's cool that the plumbing is not too difficult here, I was a little worried on your behalf at first :D
    );

    if let Some(tx) = success_tx {
        let succeeded = !matches!(&test_outcome, TestOutcome::Failed(_));
(also if you clone the outcome and send that along instead of just failed-or-not, might be nice to have "test failed because ..." as part of the prelude to a particular VM's informational blurb? might get too wordy. also not really attached to this idea as much as it'd be nice to distinguish the receivers versus the "is Option<bool> the test status, or is that the bool, hmm")
in my experience the cause of the failure is usually right above this message in the log, so i didn't feel a particular pull to pass it along, but i definitely wouldn't be opposed if you can think of a case for it
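for reference, a minimal sketch of the two options discussed in this thread, not the code in this PR. Assumptions: `TestOutcome` here is a hypothetical stand-in (only the `Failed(_)` variant is visible in the diff), `std::sync::mpsc` stands in for whatever channel the runner actually uses, and `report_bool`, `report_outcome`, and `blurb_prelude` are made-up names for illustration.

```rust
use std::sync::mpsc;

// Hypothetical stand-in for the runner's outcome type; only the `Failed(_)`
// variant is visible in the diff under review.
#[derive(Clone, Debug, PartialEq)]
enum TestOutcome {
    Passed,
    Failed(Option<String>),
}

// Option A (as in the diff): reduce the outcome to a bool before sending.
fn report_bool(tx: &mpsc::Sender<bool>, outcome: &TestOutcome) {
    let succeeded = !matches!(outcome, TestOutcome::Failed(_));
    // Ignore send errors: the receiver may already have gone away.
    let _ = tx.send(succeeded);
}

// Option B (the suggestion): send a clone of the whole outcome, so the
// channel's item type says what it carries and the reason travels with it.
fn report_outcome(tx: &mpsc::Sender<TestOutcome>, outcome: &TestOutcome) {
    let _ = tx.send(outcome.clone());
}

// Receiver side for option B: build the optional prelude for a VM's blurb.
fn blurb_prelude(outcome: &TestOutcome) -> Option<String> {
    match outcome {
        TestOutcome::Failed(Some(why)) => Some(format!("test failed because {why}")),
        TestOutcome::Failed(None) => Some("test failed".to_string()),
        TestOutcome::Passed => None,
    }
}
```

option B costs one clone per test and makes the receiver's type self-describing; option A keeps the message small, which is reasonable if the failure cause is already right above it in the log.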
When `--manual-stop-on-failure` is passed, each propolis-server in failed test cases is left running (if it hadn't been shut down by the test case prior to failure explicitly), and its address is echoed to the operator such that they can e.g. connect to its serial console to investigate or debug whatever may have caused the test failure. The test suite pauses until the instances left in this state are shut down manually, then continues running further tests (unless interrupted).
This can be materially useful vs. reproducing test failures with manually-reconstructed scenarios via a transcription of a phd-test's instance spec and steps, which may result in unintended differences along the path to the moment of failure due to human-scale timing of guest shell command invocations, or possible errors in transcription of the instance spec. (I also believe this might be a nice convenience to have in general, even absent those factors.)
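The "echo the address, then pause until shut down manually" behavior described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the runner's implementation: `wait_for_manual_stop` and the `is_running` callback are hypothetical names, and simple polling stands in for however the runner actually learns that a server has stopped.

```rust
use std::net::SocketAddr;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: announce each server left running, then block until
/// `is_running` reports that all of them have been shut down.
/// Returns the number of polling rounds that were needed.
fn wait_for_manual_stop<F>(
    mut pending: Vec<SocketAddr>,
    mut is_running: F,
    poll_interval: Duration,
) -> usize
where
    F: FnMut(&SocketAddr) -> bool,
{
    // Echo every address so the operator knows where to connect.
    for addr in &pending {
        println!("test failed; propolis-server left running at {addr}");
    }

    let mut rounds = 0;
    while !pending.is_empty() {
        rounds += 1;
        // Keep only the servers that are still up.
        pending.retain(|addr| is_running(addr));
        if !pending.is_empty() {
            thread::sleep(poll_interval);
        }
    }
    rounds
}
```

Once the function returns, the suite would move on to the next test case; a test that shut its server down before failing would simply contribute nothing to `pending`.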