Skip to content

feat(filter): support nested-namespace map_entity_field; expose evaluator labels (AIRCORE-389)#196

Open
maxdubrinsky wants to merge 2 commits into
mainfrom
aircore-389-labels-prefix/md
Open

feat(filter): support nested-namespace map_entity_field; expose evaluator labels (AIRCORE-389)#196
maxdubrinsky wants to merge 2 commits into
mainfrom
aircore-389-labels-prefix/md

Conversation

@maxdubrinsky
Copy link
Copy Markdown
Contributor

@maxdubrinsky maxdubrinsky commented Jun 4, 2026

`filter[data.labels.*]` worked but was an implementation detail that leaked storage layout to users. This adds a `namespace=True` flag to `map_entity_field` so a filter model can declare a named prefix — e.g. `labels` → `data.labels` — that the parser rewrites transparently. Evaluator benchmarks and metrics now expose `labels` as such a namespace, making `filter[labels.eval_category]=agentic` the canonical form.

Docs updated to show only the short path.

Summary by CodeRabbit

  • New Features

    • Added support for filtering benchmarks and metrics by entity labels using simplified bracket notation syntax (e.g., filter[labels.eval_category]).
  • Documentation

    • Updated filtering guidance and benchmark management documentation to reflect the new label filter syntax approach across all relevant guides.

…ator labels (AIRCORE-389)

Extend map_entity_field with a `namespace=True` flavor so a filter model can map
a nested data blob (not just a scalar) to its entity-store path. Sub-paths under
a declared namespace (`labels.<key>`) are validated and rewritten to
`data.labels.<key>` generically: the shared filter parser learns "this field is a
namespace" from the model's annotation and holds no service-specific knowledge.

Evaluator BenchmarksListFilter and MetricsListFilter declare `labels` as such a
namespace, so users can write `filter[labels.eval_category]=agentic` instead of
the storage-qualified `filter[data.labels.eval_category]` (which still works).

- nmp_common entities.values: map_entity_field(path, namespace=True);
  _get_entity_namespace_map(); prefix rewrite in _apply_field_map; namespace
  fields excluded from the scalar field map and folded into translate_operation.
- nemo_platform_plugin api.parsed_filter: _validate_fields accepts sub-paths of
  model-declared namespaces; make_filter_dep threads the namespace set through.
- evaluator: BenchmarksListFilter / MetricsListFilter declare a `labels` namespace.
- docs updated to the short labels.* path; data.labels.* documented as the
  explicit fallback for any non-namespace nested field.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
The long form is an implementation detail; the short labels.<key> path
is what users should use. No pre-existing docs mentioned the long form,
so there's nothing to stay consistent with.

Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>
@maxdubrinsky maxdubrinsky requested review from a team as code owners June 4, 2026 22:19
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 4, 2026

Documentation preview is ready

Preview: https://nvidia-nemo.github.io/nemo-platform/pr-preview/pr-196/pr-196/

Built from 9979f7f in workflow run.

This preview is deployed from this PR branch, updates when docs changes are pushed, and will be removed when the PR closes.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

This PR adds namespace-aware field mapping to enable cleaner label filtering syntax (filter[labels.eval_harness] instead of filter[data.labels.eval_harness]). Core infrastructure supports prefix rewriting; schemas wire label filters; tests validate end-to-end; docs updated.

Changes

Label filtering with bracket notation

Layer / File(s) Summary
Namespace field mapping and validation
packages/nmp_common/src/nmp/common/entities/values.py, packages/nemo_platform_plugin/src/nemo_platform_plugin/api/parsed_filter.py, packages/nmp_common/tests/api/test_parsed_filter.py
EntityFieldMapping and map_entity_field accept namespace=True flag. _map_field resolves exact matches or rewrites namespace prefixes (e.g., labels.xdata.labels.x). _validate_fields allows dotted sub-paths only when the top-level segment is a declared namespace. Filter.translate_operation computes and applies both exact and namespace mappings. Tests verify prefix rewriting, logical operator handling, and validation of invalid dotted paths.
Label filter fields in schema
services/evaluator/src/nmp/evaluator/api/v2/benchmarks/schemas/benchmarks.py, services/evaluator/src/nmp/evaluator/api/v2/metrics/schemas/metrics_resp.py
BenchmarksListFilter and MetricsListFilter each add optional labels field mapped to "data.labels" with namespace=True, enabling bracket-notation filtering by individual label sub-paths.
Integration test
services/evaluator/tests/nmp/evaluator/api/v2/benchmarks/test_benchmarks_filter.py
Validates filter[labels.label1]=value1 returns matching benchmark with 200 response.
Documentation
docs/evaluator/benchmarks/agentic.md, docs/evaluator/benchmarks/discover-industry-benchmarks.md, docs/evaluator/benchmarks/manage-benchmarks.md, docs/get-started/concepts/filtering.md
Updated examples and guidance to use filter[labels.<label>] bracket notation.

Suggested reviewers

  • mckornfield
  • matthewgrossman
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 41.38% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title accurately describes the main change: adding namespace support to map_entity_field and exposing labels filtering in evaluator APIs.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch aircore-389-labels-prefix/md

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/evaluator/benchmarks/agentic.md`:
- Line 36: Add a short note to the evaluator benchmark docs explaining that both
query key forms are accepted: the new compact form filter[labels.eval_harness]
and the legacy form filter[data.labels.eval_harness]; update the text near the
example containing workspace="system",
extra_query={"filter[labels.eval_harness]": "bfcl"} to mention backward
compatibility (referencing the behavior validated by test_parsed_filter.py) so
users know existing code using filter[data.labels.eval_harness] will still work.

In `@docs/evaluator/benchmarks/discover-industry-benchmarks.md`:
- Line 32: Document that both filter[labels.eval_category] and
filter[data.labels.eval_category] are accepted for backward compatibility;
update the text around extra_query to explicitly mention both keys as valid
equivalents and clarify they map to the same label field so users won't be
confused when either form appears in examples or requests.

In `@docs/evaluator/benchmarks/manage-benchmarks.md`:
- Line 106: Update the documentation sentence that lists filterable fields to
also mention the explicit fallback form: append a short clause stating "The
fully-qualified filter[data.labels.<key>] form also works" so readers know they
can use filter[data.labels.<key>] in addition to filter[labels.<key>] (refer to
the existing line that mentions filter[labels.eval_category] and add the new
clause immediately after it).

In `@docs/get-started/concepts/filtering.md`:
- Around line 146-147: Add a backward-compatibility note to the filtering
reference: mention that both the new query form
"?filter[labels.eval_category]=agentic" and the legacy field path form
"data.labels.*" are accepted (as verified by test_parsed_filter.py lines around
test_parsed_filter.py:550-555), and give a short example showing both syntaxes
so readers know they can use either "filter[labels.eval_category]=..." or the
older "data.labels.eval_category=...".
- Around line 145-147: The fenced code block containing the example query string
"?filter[labels.eval_category]=agentic" is missing a language specifier; update
that fenced block (the triple-backtick block around
"?filter[labels.eval_category]=agentic") to include an appropriate language tag
such as http or text (e.g., ```http) or convert it to a plain indented/text
block so Markdownlint no longer flags it.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bc7f1da0-82b6-4f5b-8daf-584f99e45e2c

📥 Commits

Reviewing files that changed from the base of the PR and between 3b9bab0 and 9979f7f.

📒 Files selected for processing (10)
  • docs/evaluator/benchmarks/agentic.md
  • docs/evaluator/benchmarks/discover-industry-benchmarks.md
  • docs/evaluator/benchmarks/manage-benchmarks.md
  • docs/get-started/concepts/filtering.md
  • packages/nemo_platform_plugin/src/nemo_platform_plugin/api/parsed_filter.py
  • packages/nmp_common/src/nmp/common/entities/values.py
  • packages/nmp_common/tests/api/test_parsed_filter.py
  • services/evaluator/src/nmp/evaluator/api/v2/benchmarks/schemas/benchmarks.py
  • services/evaluator/src/nmp/evaluator/api/v2/metrics/schemas/metrics_resp.py
  • services/evaluator/tests/nmp/evaluator/api/v2/benchmarks/test_benchmarks_filter.py

```python
bfcl_system_benchmarks = client.evaluation.benchmarks.list(
workspace="system", extra_query={"filter[data.labels.eval_harness]": "bfcl"}
workspace="system", extra_query={"filter[labels.eval_harness]": "bfcl"}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document backward compatibility.

The old filter[data.labels.eval_harness] form still works (per test_parsed_filter.py:555). Add a note that both forms are valid to avoid confusion for users with existing code.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/evaluator/benchmarks/agentic.md` at line 36, Add a short note to the
evaluator benchmark docs explaining that both query key forms are accepted: the
new compact form filter[labels.eval_harness] and the legacy form
filter[data.labels.eval_harness]; update the text near the example containing
workspace="system", extra_query={"filter[labels.eval_harness]": "bfcl"} to
mention backward compatibility (referencing the behavior validated by
test_parsed_filter.py) so users know existing code using
filter[data.labels.eval_harness] will still work.

filtered_benchmarks = client.evaluation.benchmarks.list(
workspace="system",
extra_query={"filter[data.labels.eval_category]": "advanced_reasoning"},
extra_query={"filter[labels.eval_category]": "advanced_reasoning"},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document backward compatibility.

Both filter[labels.eval_category] and filter[data.labels.eval_category] are valid. Mention this to avoid confusion.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/evaluator/benchmarks/discover-industry-benchmarks.md` at line 32,
Document that both filter[labels.eval_category] and
filter[data.labels.eval_category] are accepted for backward compatibility;
update the text around extra_query to explicitly mention both keys as valid
equivalents and clarify they map to the same label field so users won't be
confused when either form appears in examples or requests.

## Filter

Filter benchmarks using the `filter` parameter. Filterable fields are `name`, `description`, `project`, `created_at`, and `updated_at`. Nested label fields (e.g., `data.labels.eval_category`) can be filtered via `extra_query` with bracket notation (see above).
Filter benchmarks using the `filter` parameter. Filterable fields are `name`, `description`, `project`, `created_at`, and `updated_at`. Individual labels can be filtered via `extra_query` with bracket notation (e.g. `filter[labels.eval_category]`; see above).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the fallback form.

PR description states data.labels.* should be documented as "explicit fallback." Add: "The fully-qualified filter[data.labels.<key>] form also works."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/evaluator/benchmarks/manage-benchmarks.md` at line 106, Update the
documentation sentence that lists filterable fields to also mention the explicit
fallback form: append a short clause stating "The fully-qualified
filter[data.labels.<key>] form also works" so readers know they can use
filter[data.labels.<key>] in addition to filter[labels.<key>] (refer to the
existing line that mentions filter[labels.eval_category] and add the new clause
immediately after it).

Comment on lines 145 to 147
```
?filter[data.labels.eval_category]=agentic
?filter[labels.eval_category]=agentic
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix missing language specifier.

Add language identifier to the fenced code block (line 145) or use plain text block. Markdownlint flagged this.

Proposed fix
-```
+```http
 ?filter[labels.eval_category]=agentic
</details>

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>

[warning] 145-145: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @docs/get-started/concepts/filtering.md around lines 145 - 147, The fenced
code block containing the example query string
"?filter[labels.eval_category]=agentic" is missing a language specifier; update
that fenced block (the triple-backtick block around
"?filter[labels.eval_category]=agentic") to include an appropriate language tag
such as http or text (e.g., ```http) or convert it to a plain indented/text
block so Markdownlint no longer flags it.


</details>

<!-- fingerprinting:phantom:triton:puma -->

<!-- This is an auto-generated comment by CodeRabbit -->

Comment on lines +146 to 147
?filter[labels.eval_category]=agentic
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add backward compatibility note.

Context snippet test_parsed_filter.py:550-555 confirms data.labels.* form still works. Document both forms here since this is the central filtering reference.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/get-started/concepts/filtering.md` around lines 146 - 147, Add a
backward-compatibility note to the filtering reference: mention that both the
new query form "?filter[labels.eval_category]=agentic" and the legacy field path
form "data.labels.*" are accepted (as verified by test_parsed_filter.py lines
around test_parsed_filter.py:550-555), and give a short example showing both
syntaxes so readers know they can use either "filter[labels.eval_category]=..."
or the older "data.labels.eval_category=...".

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jun 4, 2026

Suite Lines Covered Line Rate Branch Rate
Unit Tests 18749/24796 75.6% 62.2%
Integration Tests 12019/23560 51.0% 26.2%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant