[uss_qualifier/scenarios/utm] Cleanup extents of op intents actually created; Extend max test run duration; Introduce max scenario execution duration for 'Verify clear area'#1395
Conversation
badbdaa to
ecc53e2
Compare
BenjaminPelletier
left a comment
There was a problem hiding this comment.
I do think evaluating the expected extents at scenario start time is a good idea when deciding the area to check is clear. But, I think we definitely want to fix the known-incorrect assumption that the maximum test run length is 45 minutes -- given an observation of ~2 hours, probably going up to 6 hours or so seems reasonable, and this would happen to fix the problem also.
I think we should also be sure that our cleanup area is correct, and therefore extend the anticipated extents by the actual extents during the test scenario when we know the intent extents are going to be used for cleanup (we could also consider starting with empty extents after ensuring the area is clear since that would allow us to more tightly wrap the extents actually used).
I think my preference would be to keep the fail-fast feature on flight intent validation (at the beginning of the test run rather than beginning of the test scenario), but if we did call validate_flight_intent_templates at the start of the scenario, we'd need to fix its implementation so it was conceptually valid to call it at that time. Currently, validate_flight_intent_templates seems to assume it is being called at the beginning of the test run because it uses MAX_TEST_RUN_DURATION.
There is some duplicated computation if we validate at scenario instantiation then estimate extents at scenario start, but I think we almost always prefer conceptual clarity over performance or code-length optimization.
|
|
||
| def _setup(self): | ||
| try: | ||
| self.intents_extents = validate_flight_intent_templates( |
There was a problem hiding this comment.
I think this will take care of all current issues with this scenario, but I think it would be a good idea to future-proof even more by aligning what we're doing with the concepts we're trying to implement. The main reasons we want intents_extents are 1) verify area is clear before beginning scenario and 2) clear area of anything from the test scenario afterward.
For 1, we need to make a best guess for the 4D extent of the test scenario. validate_flight_intent_templates does this using an assumption for the maximum test run length. The reason it uses test run length is because the original primary purpose of validate_flight_intent_templates was to make sure the test design was valid and fail fast (at the beginning of the test run) if it wasn't. Note that not performing validate_flight_intent_templates at scenario instantiation time will remove this fail-fast feature. We could keep the feature by running it twice -- once at instantiation to fail fast if the test design was bad, and once at scenario start to determine the maximum extents during the scenario. But note that calling validate_flight_intent_templates at scenario start isn't what it's currently designed to do -- it has a constant for assumed maximum test run length (45 minutes) which we would actually be using as maximum test scenario length. The root cause of the problem is that a maximum test run time of 45 minutes is no longer a valid assumption (the problematic test run was probably up around 2 hours). By moving validate_flight_intent_templates execution to the start of the scenario, we're effectively fixing the bad 45 minute test run assumption by changing the assumption to a 45 minute test scenario assumption. I think that's probably a pretty good new assumption. But I think we should recognize what we're doing and comment on it since I think it's fairly unobvious (or better yet, make it more obvious without comments).
For 2 though, I think we're continuing to make another assumption similar to the original one. When we clean up, we want to clean up the 4D extents that were actually used. For 1, we couldn't know those 4D extents exactly. But for 2, everything is now complete so we no longer need to guess what 4D extents need to be cleaned -- we should simply clean the union/superset of all 4D volumes used in the test. If our new original assumption (<45 minute test scenario time) holds, then intents_extents is good enough for this. But if the test scenario ever took longer than 45 minutes for some reason, we would potentially miss some of the 4D extents actually used during the test.
I don't think this is particularly likely, but we also didn't think a test run longer than 45 minutes was particularly likely either. I think we can protect against these surprises by making sure it's clear what we're doing and why. It seems like what we really want to do is validate the flight intents at instantiation, calculate their maximum predicted extent at the beginning of the test scenario, and then clear (at least) the actual extent used during the test scenario during cleanup. It seems like the motivation to simply move the flight validation is because validate_flight_intent_templates combines the two distinct activities of validation and extent estimation into one function for code brevity and/or performance. I think that is probably the true root of this problem -- because it wasn't obvious that validate_flight_intent_templates was also functioning as estimate_max_extents_during_test_run, it was less obvious that the intents extent used to clean up at the end was just an estimate of the max extents (not necessarily matching or even encompassing the actual extents used during the test, which is what we actually want during cleanup).
ecc53e2 to
b5a6f20
Compare
|
@BenjaminPelletier updated the PR following your comments.
Also, I realize this PR could have been split in smaller chunks. Please LMK if it worth splitting it now. |
b5a6f20 to
d8b6310
Compare
…ntents extents in run rather than __init__
d8b6310 to
b4988e4
Compare
BenjaminPelletier
left a comment
There was a problem hiding this comment.
Also, I realize this PR could have been split in smaller chunks. Please LMK if it worth splitting it now.
I appreciate the focus on small PRs, but +104/-53 lines on pretty closely-related concepts seems comfortably within expectations to me. And, I think contributors should generally get the benefit of the doubt when their single-concept initial PR turns out to have a bit more in it after updates from review comments. Certainly there are times where splitting is justified, but we also want to be practical given that splitting isn't zero cost. But again, focusing on small PRs is very appreciated :)
| self.begin_test_step("Clear operational intents created by virtual USS") | ||
| self._clear_op_intents() | ||
| self.end_test_step() |
There was a problem hiding this comment.
Why remove this operation? The general airspace-clearing should remove all op intents created by a USS (including mock_uss), but uss_qualifier also creates some op intents directly. This is the "virtual USS", and uss_qualifier can't respond to a general airspace-clearing request. So, this intent/expectation here is that uss_qualifier removes any of the op intents it owns but somehow accidentally left behind. That still seems applicable even if we now remove them successfully more often at the end of the scenario (when it works correctly).
This should fix #1394.
This has an impact in the 3 scenarios that use the return value of
validate_flight_intent_templatesin order to determine the extents in which to search for op intents to be cleaned up:This also removes a step of clearing operational intents in setup of scenario execution that should not be needed.