
[netrid] Check response validity when calling all_flights #1388

Merged
BenjaminPelletier merged 1 commit into interuss:main from Orbitalize:fix_1386 on Mar 18, 2026

Conversation

@the-glu
Contributor

@the-glu the-glu commented Mar 10, 2026

Fix #1386 by checking all_flights responses for errors before doing anything else.

A few points:

  • Most steps already had an ISA query check, but is it really an 'ISA query' since it's also querying flights?
  • Some checks expect the query to fail (e.g. the too-large-area checks), so the new checks are not done every time. I noticed that some Misbehavior checks won't work, as they expect the initial query to pass, no? (Here there will be no flights if the area is too large.) (But that was the previous behavior.)
  • diagonal_km is computed twice to avoid returning two values, but we could change that.

Member

@BenjaminPelletier BenjaminPelletier left a comment


Some checks expect the query to fail (e.g. the too-large-area checks), so the new checks are not done every time. I noticed that some Misbehavior checks won't work, as they expect the initial query to pass, no? (Here there will be no flights if the area is too large.) (But that was the previous behavior.)

Yes, making sure our documentation is correct and actual actions match documentation are both important, and I think there is probably some existing tech debt that will need to be resolved as part of the main issue. I think the core scenario does work currently though; here is the sequence view artifact from the latest CI run.

Note though that there seem to be 6 queries for the test step, but only 1 check for the whole test step. That's a strong code smell as I would expect there to be something we would want to check regarding the reasonableness of the response for every query we make. Of course, this is existing tech debt and not caused by this PR, so any fixes could be in a separate PR.

diagonal_km is computed twice to avoid returning two values, but we could change that.

I think the strong presumption to favor clarity over maximum performance holds here and computing twice is no problem if that makes the code easier to read, understand, maintain, etc.

@the-glu
Contributor Author

the-glu commented Mar 11, 2026

Yes, making sure our documentation is correct and actual actions match documentation are both important, and I think there is probably some existing tech debt that will need to be resolved as part of the main issue. I think the core scenario does work currently though; here is the sequence view artifact from the latest CI run.

I think the issue is that, in misbehavior, the test should not try to do authentication tests with a too-large area (nor multiple ones, since they're the same), because it will fail directly at the initial step, therefore not finding any flights to test.

[screenshot]

Commenting out lines 100/101 and doing the check on a `self._rid_version.max_details_diagonal_km * 1000 - 100` area seems to do it more cleanly:

[screenshot]

(Print-based debugging also shows a similar result; however, I see the 3 passes of polling (too large without flights to test, clustered OK, and details OK), but I don't understand why the report shows only 2 "Initial queries" (and one check); the two OK passes seem to be merged.)

Note though that there seem to be 6 queries for the test step, but only 1 check for the whole test step. That's a strong code smell as I would expect there to be something we would want to check regarding the reasonableness of the response for every query we make. Of course, this is existing tech debt and not caused by this PR, so any fixes could be in a separate PR.

Yes, with the new explicit check it will be better (see previous screenshots), so we're probably in better shape now.

@the-glu
Contributor Author

the-glu commented Mar 11, 2026

(Print-based debugging also shows a similar result; however, I see the 3 passes of polling (too large without flights to test, clustered OK, and details OK), but I don't understand why the report shows only 2 "Initial queries" (and one check); the two OK passes seem to be merged.)

Answering myself:

Because the poller stops at the first function returning a result.

That seems wrong; I would just switch to a fixed `self._rid_version.max_diagonal_km * 1000 - 100` area for all tests, no?

The goal is to evaluate authentication, not whether the initial query fails because it's too large; the dedicated function that tests the search area already restricts itself to that case for this reason.

(I can fix it there on the side or in another PR.)
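The poller behavior described above can be sketched as follows; `poll_first` and the string results are hypothetical illustrations, not the actual monitoring code:

```python
from typing import Callable, Optional, Sequence

def poll_first(attempts: Sequence[Callable[[], Optional[str]]]) -> Optional[str]:
    """Invoke each attempt in order, stopping at the first non-None result."""
    for attempt in attempts:
        result = attempt()
        if result is not None:
            # Remaining attempts (e.g. a details query) are never run.
            return result
    return None

# The "clustered" attempt succeeds, so the "details" attempt is skipped.
print(poll_first([lambda: None, lambda: "clustered", lambda: "details"]))  # → clustered
```

This is why only the first passing configuration shows up in the report: later attempts never execute once one returns a result.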

@BenjaminPelletier
Member

I think the issue is that, in misbehavior, the test should not try to do authentication tests with a too-large area (nor multiple ones, since they're the same), because it will fail directly at the initial step, therefore not finding any flights to test.

The authentication step still seems valid with area-too-large. The correct response to a properly-authenticated "GET /flights" request is 413 (area too large). When that same "GET /flights" request is made without authentication, the correct response is 401 (413 reveals some information to an unauthenticated user).

It does seem like good coverage would probably (also) include a correct response of 200 to compare against the correct unauthenticated response of 401, but the current test scenario documentation doesn't specifically call for a 200 or 413 to compare against 401, so comparing 413 to 401 does seem to accurately fulfill/implement the test scenario documentation.
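The expected status codes discussed above can be expressed as a small sketch; `check_auth_behavior` is a hypothetical illustration, not part of the uss_qualifier API:

```python
def check_auth_behavior(authenticated_status: int, unauthenticated_status: int) -> list[str]:
    """Compare GET /flights responses for a too-large area, per the discussion above."""
    findings = []
    if authenticated_status != 413:
        # A properly-authenticated query over a too-large area should be rejected as such.
        findings.append("expected 413 (area too large) for the authenticated query")
    if unauthenticated_status != 401:
        # Anything other than 401 may leak information to an unauthenticated caller.
        findings.append("expected 401 for the unauthenticated query")
    return findings

print(check_auth_behavior(413, 401))  # → []
print(check_auth_behavior(200, 401))  # → ['expected 413 (area too large) for the authenticated query']
```

Comparing 413 (authenticated) against 401 (unauthenticated) is what the current scenario documentation appears to call for; a 200-vs-401 comparison would be an additional, not yet documented, case.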

Note though that there seem to be 6 queries for the test step, but only 1 check for the whole test step. That's a strong code smell as I would expect there to be something we would want to check regarding the reasonableness of the response for every query we make. Of course, this is existing tech debt and not caused by this PR, so any fixes could be in a separate PR.

Yes, with the new explicit check it will be better (see previous screenshots), so we're probably in better shape now.

Yes, I think a separate, explicit check for the DSS query is an improvement and could be sufficient for a single PR. But, to fully fix the scenario, I think the DSS query check should point to the DSS query it's checking (specify query_timestamp) and each query should have at least 1 check. A query with no check means "the result of this query could be literally anything and there would be nothing of concern to report" (in that case, why make the query at all?) In the new screenshot, the USS /flights query in e29 appears to have no checks. So, it could return literally anything (413: error, 200: flights, 200: cake recipe) and we wouldn't be justified reporting any concerns. If that's actually true, we probably shouldn't make the query. In the more likely event that's not true, we should document the check we'd perform -- presumably we expect that query to work properly, which means if it didn't return 413 we'd be concerned per NET0710,1.

@the-glu
Contributor Author

the-glu commented Mar 16, 2026

The authentication step still seems valid with area-too-large. The correct response to a properly-authenticated "GET /flights" request is 413 (area too large). When that same "GET /flights" request is made without authentication, the correct response is 401 (413 reveals some information to an unauthenticated user).
It does seem like good coverage would probably (also) include a correct response of 200 to compare against the correct unauthenticated response of 401, but the current test scenario documentation doesn't specifically call for a 200 or 413 to compare against 401, so comparing 413 to 401 does seem to accurately fulfill/implement the test scenario documentation.

It does validate that a properly-authenticated "GET /flights" request returns 413; however, it doesn't validate that a badly-authenticated "GET /flights" request returns 401 (in the authentication step).

I do find the current code confusing. We do have the following documentation (which seems fine, apart from 'Initial queries', which needs to be fixed):

  • Invalid search area test step [A]
    • Initial queries check [A.1]
    • Area too large check [A.2]
  • Unauthenticated requests test step [B]
    • Initial queries check [B.1]
    • Missing credentials check [B.2]
  • Incorrectly authenticated requests test step [C]
    • Initial queries check [C.1]
    • Invalid credentials check [C.2]

However, if I understand it correctly, the code is doing this (assuming the perfect case):

  • Start Invalid search area test step [A]
    • Perform an invalid query (area too large) [A.1]
    • Validate query / area too large result [A.2]
  • Start Unauthenticated requests test step [B]
    • **Perform an invalid query (area too large) [B.1]**
      • Don't check for missing credentials, as query was invalid
    • **Perform a clustered query [B.1]**
      • Check again with missing credentials [B.2]
    • **Don't perform a details query, as previous check was successful.**
  • Start Incorrectly authenticated requests test step [C]
    • **Perform an invalid query (area too large) [C.1]**
      • Don't check for wrong credentials, as query was invalid
    • **Perform a clustered query [C.1]**
      • Check again with wrong credentials [C.2]
    • **Don't perform a details query, as previous check was successful.**

Aren't the steps in bold unneeded?
There is no need to do area-too-large again in parts B and C, and no need to distinguish between clustered and details queries, especially if the latter is never run.
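The B/C flow described above can be sketched as follows; `run_auth_step` and the query names are hypothetical illustrations, not the actual uss_qualifier code:

```python
def run_auth_step(queries: list[str], check_credentials) -> list[str]:
    """Walk a step's queries, skipping the credential check on invalid queries
    and stopping after the first successful check, per the list above."""
    checks_done = []
    for query in queries:
        if query == "area_too_large":
            # Query is (expectedly) invalid: no credential check is performed.
            continue
        checks_done.append(check_credentials(query))
        # Previous check succeeded: remaining queries (details) are never performed.
        break
    return checks_done

done = run_auth_step(
    ["area_too_large", "clustered", "details"],
    lambda q: f"credentials checked on {q} query",
)
print(done)  # → ['credentials checked on clustered query']
```

As sketched, the area-too-large query contributes nothing to the step's checks, and the details query is unreachable.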

Member

@BenjaminPelletier BenjaminPelletier left a comment


however, it doesn't validate that a badly-authenticated "GET /flights" request returns 401 (in the authentication step).

Are you sure?

I do find the current code confusing.

I haven't looked at it in detail, but I would not find that surprising at all. Improvements welcome before, during, and after current PR :)

Don't check for missing credentials, as query was invalid

Yeah, that doesn't seem to match documentation/intent.

Don't perform a details query as previous check was successful

Not sure I follow, but actions should definitely match documentation and documentation should clearly communicate what we expect to happen. If either of those things aren't true, we should definitely fix.

Aren't the step in bold not needed?

It seems useful to have wider coverage for unauthenticated and incorrectly-authenticated queries. But we certainly shouldn't do a bunch of stuff during the unauthenticated/incorrectly-authenticated test steps that don't lead to checks against unauthenticated/incorrectly-authenticated requirements, and that seems to be what we do today (e.g., invalid area query in B.1 doesn't lead to any unauthenticated checks).

I'm currently agnostic as to exactly the level of appropriate coverage. Generally, I would expect a positive and negative result for unauthenticated (correctly authenticating is accepted, not authenticating is rejected). But there are an ~infinite number of conditions we could test regarding unauthenticated queries and I haven't formed an opinion on which ones should be included (there is definitely no objective to be exhaustive because that would be effectively impossible). Candidates:

  • Nominal GET /flights
  • GET /flights with area too large
  • GET /flights with various individual arguments invalid
  • All of the above when no flights would be present
  • The first three when some flights would be present and details would be returned
  • The first three when some flights would be present and clusters would be returned

...though presumably nominal GET /flights is more valuable than GET /flights with area too large, despite what we may currently do.

```python
def success(self) -> bool:
    return not self.errors

def perform_checks(self, scenario: GenericTestScenario, dss_participant_id: str):
```
Member


This file is in monitorlib which should be a use-agnostic library of monitoring tools. uss_qualifier is a specific application, so code specific to that application should live in the uss_qualifier folder rather than monitorlib. Basically, we should be able to pull monitorlib out into its own repo which could then be a subrepo to all the other applications in monitoring. Or put another way, any monitoring content outside the monitorlib folder should be able to import anything in the monitorlib folder without the risk of a circular dependency. So, this code should go somewhere in the uss_qualifier folder structure.

Also, "perform_checks" seems to imply a standard set of checks that will be reused in multiple places. If that's the case, it seems like a good fit for a test step fragment which documents these shared/repeated checks, perhaps along with the information-collection action that enables them (if that action is always taken prior to performing these checks). The code to actually perform this test step fragment is usually maintained alongside the documentation for that test step fragment -- e.g., see test_steps.py along with validate*.md.

Contributor Author


I didn't go for a test fragment, as it's shared code (fetching / perform_checks), but the code doesn't always perform the same calls/checks (due to the get_details flag). It's not a check that is optionally done; it's never done, so it didn't match the conditions.

Should we still use a fragment? Maybe two, for both cases?

Member


I'm not sure I follow -- if the content of this function is only used once, it seems like it should be located directly next to the one place it's called. If the content is reused more than once, that seems like a candidate for a test step fragment.

the code doesn't always perform the same calls/checks (due to the get_details flag). It's not a check that is optionally done; it's never done, so it didn't match the conditions.

Yeah, that sounds like two test step fragments (two different .md files) implemented by the same function with a boolean flag. But, if a fragment is never re-used (e.g., we only ever ask for details in one situation), then that probably shouldn't be a fragment.

I think this fragment/no-fragment difficulty might reveal a little bit of a code smell -- if we have something like:

```python
def do_stuff(with_thing_b: bool):
    do_thing_a()
    if with_thing_b:
        do_thing_b()
```

...we probably want to drop do_stuff and just use do_thing_a and do_thing_b directly. It would be hard to make a test step fragment for do_stuff, but I'd guess it would be fairly easy to make a test step fragment for do_thing_a and do_thing_b.
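A minimal sketch of that refactor, using the placeholder names from the pseudocode above (not the real scenario code):

```python
def do_thing_a() -> str:
    # Placeholder for the always-performed step, with its own test step fragment.
    return "thing A done"

def do_thing_b() -> str:
    # Placeholder for the conditionally-performed step, with its own fragment.
    return "thing B done"

# Call site that previously used do_stuff with the flag set:
results_both = [do_thing_a(), do_thing_b()]

# Call site that previously used do_stuff with the flag unset:
results_a_only = [do_thing_a()]

print(results_both)  # → ['thing A done', 'thing B done']
```

Each call site then documents exactly the steps it performs, which maps cleanly onto one test step fragment per function.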

Contributor Author


Yes, that's exactly what all_flights is doing, sadly.

But I'm not sure rewriting it is worth it, as it is nicely integrated and compact as is.

I will do a global cleanup pass on the PR based on the discussions, which should make things better, and we can iterate on that.

@the-glu
Contributor Author

the-glu commented Mar 17, 2026

however, it doesn't validate that a badly-authenticated "GET /flights" request returns 401 (in the authentication step).

Are you sure?

Yes: this won't return any flights if the area is too large, so the code won't go into the loop.

(...) Candidates:

I will do a fixing pass in the second PR, to limit the scope of this one.

@the-glu
Contributor Author

the-glu commented Mar 18, 2026

@BenjaminPelletier The PR is ready for review, with everything in place.

I also created #1396, stacked on that one, for the Unauthenticated / Incorrectly authenticated requests issue

@BenjaminPelletier BenjaminPelletier merged commit 31f047a into interuss:main Mar 18, 2026
22 checks passed


Development

Successfully merging this pull request may close these issues.

Check response validity separate from flight matching in RID tests
