virtio-nic: in-repo tests and a better device state FSM#1064
Conversation
VirtIO 1.2 states clearly that a driver MUST NOT clear bits from device_status (except, implicitly, by writing RESET aka 0 to device_status to.. do a device reset). Since this is the only way to clear a bit, bits that *are* set can be set at most once. Our old FSM accepted whatever status a guest happened to provide and stored that; a guest could intentionally or unintentionally clear NEEDS_RESET even, though it would be very much in error at that point. Instead of this very lax treatment of device_status, be more clear that FEATURES_OK can be toggled to set only once (which fixes a bug on its own), and if a guest tries clearing a status bit by any means other than reset, set NEEDS_RESET and demand a reset outright. Along with this, `viona.rs` gets a very simple driver to exercise some usage patterns we've seen in starting up virtio NIC devices. `basic_operation_multiqueue` fails without the device state FSM improvements, a reflection of the bug mentioned above; since FEATURES_OK is set before DEVICE_OK, the final status write including DEVICE_OK would cause a second attempt to apply the negotiated features. We would then try to set up all the virtqueues again, which fails because the virtqueues were just taken out of the RESET state when the driver initialized them, and we come to rest with the NIC in NEEDS_RESET. Linux (and BSDs, and illumos) clearly do not care about this, as such instances proceed to network seemingly without issue. Windows seems to have a periodic task that notices the NEEDS_RESET bit and obliges. It resets the NIC, which fails to ever come up correctly, and dutifully stops networking (#1048). Commenting out parts of prior patches (957f5c4, 1df49a4, 45af0f7) also cause these new tests to fail; they are collectively as necessary as it seemed for correct operation in the face of resets. Finally, one of the new tests verifies that a device in NEEDS_RESET cannot have the bit inappropriately cleared without reset. This tripped over a deadlock if a virtqueue is set with an inappropriate size. That is fixed in this patch as well.
|
I need to test the actual device state FSM change across a gamut of guests still, and I'm not totally sure how I want to handle communicating the test vnic to these tests (hardcoded device name probably is not it), hence the draft. |
lib/propolis/src/hw/virtio/viona.rs
Outdated
| /// Finally, we'll actually create and destroy some vnics so not only do we need | ||
| /// `dladm`, we need a recent enough viona and everything.. |
There was a problem hiding this comment.
well, not yet, but I want to. I was hesitant to automatically deleting vnics (such as in cleanup) because if your host OS is subject to https://www.illumos.org/issues/17853 then deleting the last vnic will panic your machine. to be clear, this is very fixed in illumos and helios, my OS is just on old bits and I haven't updated yet.
There was a problem hiding this comment.
ok! test does this now! these don't run unless VIONA_TEST_NIC is set, which I'll figure out in CI next. ideally maybe we'd dladm show-link or something to find a link we should use, and this won't be so optional.
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| crate::lifecycle::Lifecycle::reset(test_ctx.dev.as_ref()); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); |
There was a problem hiding this comment.
aesthetically I just want to say I was surprised and experienced an emotion (not sure if good) when I realized all these lines are the same length (except reset())
lib/propolis/src/hw/virtio/viona.rs
Outdated
| test_device_status_writes, | ||
| basic_operation_modern, | ||
| basic_operation_multiqueue, |
There was a problem hiding this comment.
also this is most obviously missing a basic_operation_legacy, which should be straightforward to add from here.
| // For simplicity, shrink `queue_size` small enough that it fits in | ||
| // one page. There are a few additional items for the various parts | ||
| // of virtquues in addition to just an array of 16-byte elements, so | ||
| // we use the next smaller power of two so we round up to one page | ||
| // in the end. |
There was a problem hiding this comment.
the relevant structures below, which I should note here, are struct virtq_desc, struct virtq_avail, and struct virtq_used. I'm kinda just.. not initializing the structures much at all, because just setting valid addresses and sizes up is Good Enough for my purposes today.
I think we'd just interpret the addresses here as the corresponding struct from the VirtIO spec, but I'm not totally sure and these are already useful tests.
|
as-is if a test fails this will leave a stray vnic hanging around; you'll need to and a note on testing: in addition to the tests here, I've given this a try on Windows Server 2022, Ubuntu 22.04, 24.04, Debian 13, FreeBSD 14.1, and OmniOS r151052b, though I did not do an exhaustive mix of boot order configs along the way. guest NICs consistently worked (light validation: ping, dig, curl when available) and continued working across two reboots. |
hawkw
left a comment
There was a problem hiding this comment.
i will admit to just kinda having skimmed some of the tests, so, i'm taking your word on that part, but overall, this looks good. i had some thoughts about the current factoring of apply_status(), but feel free to ignore me if you dislike them...
lib/propolis/src/hw/virtio/viona.rs
Outdated
| /// | ||
| /// Finally, we'll actually create and destroy some vnics so not only do we need | ||
| /// `dladm`, we need a recent enough viona and everything.. | ||
| #[cfg(all(test, target_os = "illumos"))] |
There was a problem hiding this comment.
turbo nitpick: does this module require illumos-specific libraries to compile, or just to run it? if it doesn't, i think it might perhaps be nicer to stick
#[cfg_attr(not(target_os = "illumos"), ignore)]on the actual test functions, so that they're still compiled on e.g. linux, and just don't actually run...?
There was a problem hiding this comment.
yoooo yeah that's a lot better
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| crate::lifecycle::Lifecycle::reset(test_ctx.dev.as_ref()); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); | ||
| let mut driver = test_ctx.create_driver(); | ||
| driver.modern_device_init(expected_feats); |
| testcase!(test_device_status_writes), | ||
| testcase!(basic_operation_modern), | ||
| testcase!(basic_operation_multiqueue), | ||
| testcase!(multiqueue_to_singlequeue_to_multiqueue), |
There was a problem hiding this comment.
it's a bit of a shame that these test cases are all run inside of one big tokio::test, as you can't request one of them by name through nextest (or run with no_fail_fast to discover if multiple ones are broken. i get why you did it this way, so that each one is surrounded by the test setup/teardown stuff and checking the env var and so on.
I kinda wonder if there's a way to achieve this where you could still have multiple test cases , where instead of the macro, you have a
async fn run_testcase(test_fn: impl FnOnce(&TestCtx) {
// read env var, get nic etc
create_vnic(&underlying_nic, TEST_VNIC).await;
let test_ctx = create_test_ctx(TEST_VNIC);
(test_fn)(&test_ctx);
drop(test_ctx);
delete_vnic(TEST_VNIC).await;
}There was a problem hiding this comment.
OH NO NEVER MIND THEY ALL USE THE SAME UNDERLYING NIC SO THEY ALSO HAVE TO BE RUN SEQUENTIALLY okay yeah the current thing is fine
There was a problem hiding this comment.
they should be able to all use the same nic! in the same way a bunch of propolises can have vnics on top of cxgbe0 or whatever. the more immediate issue is that TEST_VNIC is shared so they're all actually using a vnic of the same name, but unique instances of viona.
hmm. i suppose now it's a little more clear to me how we'd have distinct test functions so i'll give this a try. before it seemed like it'd be terrible to read, but.. hmm
Co-authored-by: Eliza Weisman <eliza@elizas.website>
| "Test wants more features than the device offers? \n\ | ||
| Device offers: {:#08x}\n\ | ||
| Test wants: {:#08x}\n\ | ||
| Device lacks: {:#08x}\n", |
VirtIO 1.2 states clearly that a driver MUST NOT clear bits from device_status (except, implicitly, by writing RESET aka 0 to device_status to.. do a device reset). Since this is the only way to clear a bit, bits that are set can be set at most once. Our old FSM accepted whatever status a guest happened to provide and stored that; a guest could intentionally or unintentionally clear NEEDS_RESET even, though it would be very much in error at that point.
Instead of this very lax treatment of device_status, be more clear that FEATURES_OK can be toggled to set only once (which fixes a bug on its own), and if a guest tries clearing a status bit by any means other than reset, set NEEDS_RESET and demand a reset outright.
Along with this,
viona.rsgets a very simple driver to exercise some usage patterns we've seen in starting up virtio NIC devices.basic_operation_multiqueuefails without the device state FSM improvements, a reflection of the bug mentioned above; since FEATURES_OK is set before DEVICE_OK, the final status write including DEVICE_OK would cause a second attempt to apply the negotiated features. We would then try to set up all the virtqueues again, which fails because the virtqueues were just taken out of the RESET state when the driver initialized them, and we come to rest with the NIC in NEEDS_RESET.Linux (and BSDs, and illumos) clearly do not care about this, as such instances proceed to network seemingly without issue. Windows seems to have a periodic task that notices the NEEDS_RESET bit and obliges. It resets the NIC, which fails to ever come up correctly, and dutifully stops networking (#1048).
Commenting out parts of prior patches (957f5c4, 1df49a4, 45af0f7) also cause these new tests to fail; they are collectively as necessary as it seemed for correct operation in the face of resets.
Finally, one of the new tests verifies that a device in NEEDS_RESET cannot have the bit inappropriately cleared without reset. This tripped over a deadlock if a virtqueue is set with an inappropriate size. That is fixed in this patch as well.