
[netrid] Check response validity when calling all_flights #1388

Merged
BenjaminPelletier merged 1 commit into interuss:main from Orbitalize:fix_1386 on Mar 18, 2026

Conversation

@the-glu
Contributor

@the-glu the-glu commented Mar 10, 2026

Fix #1386 by checking all_flights responses for errors before doing anything else.

A few points:

  • Most steps already had an ISA query check, but is it really an 'ISA query' since it's also querying flights?
  • Some checks expect the query to fail (e.g. the too-large-area checks), so the new checks are not done every time. I noticed that some Misbehavior checks won't work, as they expect the initial query to pass, no? (Here there will be no flights if the area is too large.) (But that was the previous behavior.)
  • diagonal_km is computed twice to avoid returning two values, but we could change that.

Member

@BenjaminPelletier BenjaminPelletier left a comment


Some checks expect the query to fail (e.g. the too-large-area checks), so the new checks are not done every time. I noticed that some Misbehavior checks won't work, as they expect the initial query to pass, no? (Here there will be no flights if the area is too large.) (But that was the previous behavior.)

Yes, making sure our documentation is correct and actual actions match documentation are both important, and I think there is probably some existing tech debt that will need to be resolved as part of the main issue. I think the core scenario does work currently though; here is the sequence view artifact from the latest CI run.

Note though that there seem to be 6 queries for the test step, but only 1 check for the whole test step. That's a strong code smell as I would expect there to be something we would want to check regarding the reasonableness of the response for every query we make. Of course, this is existing tech debt and not caused by this PR, so any fixes could be in a separate PR.

diagonal_km is computed twice to avoid returning two values, but we could change that.

I think the strong presumption to favor clarity over maximum performance holds here and computing twice is no problem if that makes the code easier to read, understand, maintain, etc.

@the-glu
Contributor Author

the-glu commented Mar 11, 2026

Yes, making sure our documentation is correct and actual actions match documentation are both important, and I think there is probably some existing tech debt that will need to be resolved as part of the main issue. I think the core scenario does work currently though; here is the sequence view artifact from the latest CI run.

I think the issue is that, in misbehavior, the test should not try to do authentication tests with a too-large area (nor multiple ones, since they're the same), because it will fail directly at the initial step, therefore not finding any flights to test.

[screenshot]

Commenting out lines 100/101 and doing the check on a `self._rid_version.max_details_diagonal_km * 1000 - 100` area seems to do it more cleanly:

[screenshot]

(Print-based debugging also shows a similar result; however, I see the 3 passes of polling (too large without flights to test, clustered OK, and details OK), but I don't understand why the report shows only 2 "Initial queries" (and one check); the two OK passes seem to be merged.)

Note though that there seem to be 6 queries for the test step, but only 1 check for the whole test step. That's a strong code smell as I would expect there to be something we would want to check regarding the reasonableness of the response for every query we make. Of course, this is existing tech debt and not caused by this PR, so any fixes could be in a separate PR.

Yes, with the new explicit check it will be better (see previous screenshots), so we're probably in better shape now.

@the-glu
Contributor Author

the-glu commented Mar 11, 2026

(Print-based debugging also shows a similar result; however, I see the 3 passes of polling (too large without flights to test, clustered OK, and details OK), but I don't understand why the report shows only 2 "Initial queries" (and one check); the two OK passes seem to be merged.)

Answering myself:

Because the poller stops at the first function returning a result.

That seems wrong; I would just switch to a fixed `self._rid_version.max_diagonal_km * 1000 - 100` area for all tests, no?

The goal is to evaluate authentication, not whether the initial query fails because it's too large; the dedicated function that tests the search area already restricts itself to that case for this reason.

(I can fix it there on the side or in another PR.)
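The poller behavior described above can be sketched as follows; `poll_first` and the string results are hypothetical illustrations, not the actual monitoring code:

```python
from typing import Callable, Optional, Sequence

def poll_first(attempts: Sequence[Callable[[], Optional[str]]]) -> Optional[str]:
    """Invoke each attempt in order, stopping at the first non-None result."""
    for attempt in attempts:
        result = attempt()
        if result is not None:
            # Remaining attempts (e.g. a details query) are never run.
            return result
    return None

# The "clustered" attempt succeeds, so the "details" attempt is skipped.
print(poll_first([lambda: None, lambda: "clustered", lambda: "details"]))  # → clustered
```

This is why only the first passing configuration shows up in the report: later attempts never execute once one returns a result.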

@BenjaminPelletier
Member

I think the issue is that, in misbehavior, the test should not try to do authentication tests with a too-large area (nor multiple ones, since they're the same), because it will fail directly at the initial step, therefore not finding any flights to test.

The authentication step still seems valid with area-too-large. The correct response to a properly-authenticated "GET /flights" request is 413 (area too large). When that same "GET /flights" request is made without authentication, the correct response is 401 (413 reveals some information to an unauthenticated user).

It does seem like good coverage would probably (also) include a correct response of 200 to compare against the correct unauthenticated response of 401, but the current test scenario documentation doesn't specifically call for a 200 or 413 to compare against 401, so comparing 413 to 401 does seem to accurately fulfill/implement the test scenario documentation.
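The expected status codes discussed above can be expressed as a small sketch; `check_auth_behavior` is a hypothetical illustration, not part of the uss_qualifier API:

```python
def check_auth_behavior(authenticated_status: int, unauthenticated_status: int) -> list[str]:
    """Compare GET /flights responses for a too-large area, per the discussion above."""
    findings = []
    if authenticated_status != 413:
        # A properly-authenticated query over a too-large area should be rejected as such.
        findings.append("expected 413 (area too large) for the authenticated query")
    if unauthenticated_status != 401:
        # Anything other than 401 may leak information to an unauthenticated caller.
        findings.append("expected 401 for the unauthenticated query")
    return findings

print(check_auth_behavior(413, 401))  # → []
print(check_auth_behavior(200, 401))  # → ['expected 413 (area too large) for the authenticated query']
```

Comparing 413 (authenticated) against 401 (unauthenticated) is what the current scenario documentation appears to call for; a 200-vs-401 comparison would be an additional, not yet documented, case.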

Note though that there seem to be 6 queries for the test step, but only 1 check for the whole test step. That's a strong code smell as I would expect there to be something we would want to check regarding the reasonableness of the response for every query we make. Of course, this is existing tech debt and not caused by this PR, so any fixes could be in a separate PR.

Yes, with the new explicit check it will be better (see previous screenshots), so we're probably in better shape now.

Yes, I think a separate, explicit check for the DSS query is an improvement and could be sufficient for a single PR. But, to fully fix the scenario, I think the DSS query check should point to the DSS query it's checking (specify query_timestamp) and each query should have at least 1 check. A query with no check means "the result of this query could be literally anything and there would be nothing of concern to report" (in that case, why make the query at all?) In the new screenshot, the USS /flights query in e29 appears to have no checks. So, it could return literally anything (413: error, 200: flights, 200: cake recipe) and we wouldn't be justified reporting any concerns. If that's actually true, we probably shouldn't make the query. In the more likely event that's not true, we should document the check we'd perform -- presumably we expect that query to work properly, which means if it didn't return 413 we'd be concerned per NET0710,1.

@the-glu
Contributor Author

the-glu commented Mar 16, 2026

The authentication step still seems valid with area-too-large. The correct response to a properly-authenticated "GET /flights" request is 413 (area too large). When that same "GET /flights" request is made without authentication, the correct response is 401 (413 reveals some information to an unauthenticated user).
It does seem like good coverage would probably (also) include a correct response of 200 to compare against the correct unauthenticated response of 401, but the current test scenario documentation doesn't specifically call for a 200 or 413 to compare against 401, so comparing 413 to 401 does seem to accurately fulfill/implement the test scenario documentation.

It does validate that a properly-authenticated "GET /flights" request returns 413; however, it doesn't validate that a badly-authenticated "GET /flights" request returns 401 (in the authentication step).

I do find the current code confusing. We do have the following documentation (which seems fine, apart from 'Initial queries', which needs to be fixed):

  • Invalid search area test step [A]
    • Initial queries check [A.1]
    • Area too large check [A.2]
  • Unauthenticated requests test step [B]
    • Initial queries check [B.1]
    • Missing credentials check [B.2]
  • Incorrectly authenticated requests test step [C]
    • Initial queries check [C.1]
    • Invalid credentials check [C.2]

However, if I understand it correctly, the code is doing this (assuming the perfect case):

  • Start Invalid search area test step [A]
    • Perform an invalid query (area too large) [A.1]
    • Validate query / area too large result [A.2]
  • Start Unauthenticated requests test step [B]
    • **Perform an invalid query (area too large) [B.1]**
      • Don't check for missing credentials, as query was invalid
    • **Perform a clustered query [B.1]**
      • Check again with missing credentials [B.2]
    • **Don't perform a details query, as previous check was successful.**
  • Start Incorrectly authenticated requests test step [C]
    • **Perform an invalid query (area too large) [C.1]**
      • Don't check for wrong credentials, as query was invalid
    • **Perform a clustered query [C.1]**
      • Check again with wrong credentials [C.2]
    • **Don't perform a details query, as previous check was successful.**

Aren't the steps in bold unneeded?
There is no need to do area-too-large again in parts B and C, and no need to distinguish between clustered and details queries, especially if the latter is never run.
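The B/C flow described above can be sketched as follows; `run_auth_step` and the query names are hypothetical illustrations, not the actual uss_qualifier code:

```python
def run_auth_step(queries: list[str], check_credentials) -> list[str]:
    """Walk a step's queries, skipping the credential check on invalid queries
    and stopping after the first successful check, per the list above."""
    checks_done = []
    for query in queries:
        if query == "area_too_large":
            # Query is (expectedly) invalid: no credential check is performed.
            continue
        checks_done.append(check_credentials(query))
        # Previous check succeeded: remaining queries (details) are never performed.
        break
    return checks_done

done = run_auth_step(
    ["area_too_large", "clustered", "details"],
    lambda q: f"credentials checked on {q} query",
)
print(done)  # → ['credentials checked on clustered query']
```

As sketched, the area-too-large query contributes nothing to the step's checks, and the details query is unreachable.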

Member

@BenjaminPelletier BenjaminPelletier left a comment


however, it doesn't validate that a badly-authenticated "GET /flights" request returns 401 (in the authentication step).

Are you sure?

I do find the current code confusing.

I haven't looked at it in detail, but I would not find that surprising at all. Improvements welcome before, during, and after current PR :)

Don't check for missing credentials, as query was invalid

Yeah, that doesn't seem to match documentation/intent.

Don't perform a details query as previous check was successful

Not sure I follow, but actions should definitely match documentation and documentation should clearly communicate what we expect to happen. If either of those things aren't true, we should definitely fix.

Aren't the step in bold not needed?

It seems useful to have wider coverage for unauthenticated and incorrectly-authenticated queries. But we certainly shouldn't do a bunch of stuff during the unauthenticated/incorrectly-authenticated test steps that don't lead to checks against unauthenticated/incorrectly-authenticated requirements, and that seems to be what we do today (e.g., invalid area query in B.1 doesn't lead to any unauthenticated checks).

I'm currently agnostic as to exactly the level of appropriate coverage. Generally, I would expect a positive and negative result for unauthenticated (correctly authenticating is accepted, not authenticating is rejected). But there are an ~infinite number of conditions we could test regarding unauthenticated queries and I haven't formed an opinion on which ones should be included (there is definitely no objective to be exhaustive because that would be effectively impossible). Candidates:

  • Nominal GET /flights
  • GET /flights with area too large
  • GET /flights with various individual arguments invalid
  • All of the above when no flights would be present
  • The first three when some flights would be present and details would be returned
  • The first three when some flights would be present and clusters would be returned

...though presumably nominal GET /flights is more valuable than GET /flights with area too large, despite what we may currently do.

```python
def success(self) -> bool:
    return not self.errors

def perform_checks(self, scenario: GenericTestScenario, dss_participant_id: str):
```
Member


This file is in monitorlib which should be a use-agnostic library of monitoring tools. uss_qualifier is a specific application, so code specific to that application should live in the uss_qualifier folder rather than monitorlib. Basically, we should be able to pull monitorlib out into its own repo which could then be a subrepo to all the other applications in monitoring. Or put another way, any monitoring content outside the monitorlib folder should be able to import anything in the monitorlib folder without the risk of a circular dependency. So, this code should go somewhere in the uss_qualifier folder structure.

Also, "perform_checks" seems to imply a standard set of checks that will be reused in multiple places. If that's the case, it seems like a good fit for a test step fragment which documents these shared/repeated checks, perhaps along with the information-collection action that enables them (if that action is always taken prior to performing these checks). The code to actually perform this test step fragment is usually maintained alongside the documentation for that test step fragment -- e.g., see test_steps.py along with validate*.md.

Contributor Author


I didn't go for a test fragment, as it's shared code (fetching / perform_checks), but the code doesn't always perform the same calls/checks (due to the get_details flag). It's not a check that is optionally done; it's never done, so it didn't match the conditions.

Should we still use a fragment? Maybe two, for both cases?

Member


I'm not sure I follow -- if the content of this function is only used once, it seems like it should be located directly next to the one place it's called. If the content is reused more than once, that seems like a candidate for a test step fragment.

the code doesn't always perform the same calls/checks (due to the get_details flag). It's not a check that is optionally done; it's never done, so it didn't match the conditions.

Yeah, that sounds like two test step fragments (two different .md files) implemented by the same function with a boolean flag. But, if a fragment is never re-used (e.g., we only ever ask for details in one situation), then that probably shouldn't be a fragment.

I think this fragment/no-fragment difficulty might reveal a little bit of a code smell -- if we have something like:

```python
def do_stuff(with_thing_b: bool):
    do_thing_a()
    if with_thing_b:
        do_thing_b()
```

...we probably want to drop do_stuff and just use do_thing_a and do_thing_b directly. It would be hard to make a test step fragment for do_stuff, but I'd guess it would be fairly easy to make a test step fragment for do_thing_a and do_thing_b.
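A minimal sketch of that refactor, using the placeholder names from the pseudocode above (not the real scenario code):

```python
def do_thing_a() -> str:
    # Placeholder for the always-performed step, with its own test step fragment.
    return "thing A done"

def do_thing_b() -> str:
    # Placeholder for the conditionally-performed step, with its own fragment.
    return "thing B done"

# Call site that previously used do_stuff with the flag set:
results_both = [do_thing_a(), do_thing_b()]

# Call site that previously used do_stuff with the flag unset:
results_a_only = [do_thing_a()]

print(results_both)  # → ['thing A done', 'thing B done']
```

Each call site then documents exactly the steps it performs, which maps cleanly onto one test step fragment per function.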

Contributor Author


Yes, that's exactly what all_flights is doing, sadly.

But I'm not sure rewriting it is worth it, as it is nicely integrated and compact as is.

I will do a global cleanup pass on the PR based on the discussions, which should make things better, and we can iterate on that.

@the-glu
Contributor Author

the-glu commented Mar 17, 2026

however, it doesn't validate that a badly-authenticated "GET /flights" request returns 401 (in the authentication step).

Are you sure?

Yes: this won't return any flights if the area is too large, so the code won't go into the loop.

(...) Candidates:

I will do a fixing pass in the second PR, to limit the scope of this one.

@the-glu
Contributor Author

the-glu commented Mar 18, 2026

@BenjaminPelletier The PR is ready for review, with everything in place.

I also created #1396, stacked on that one, for the Unauthenticated / Incorrectly authenticated requests issue

@BenjaminPelletier BenjaminPelletier merged commit 31f047a into interuss:main Mar 18, 2026
22 checks passed


Development

Successfully merging this pull request may close these issues.

Check response validity separate from flight matching in RID tests
