feat(filter): support nested-namespace map_entity_field; expose evaluator labels (AIRCORE-389) by maxdubrinsky · Pull Request #196 · NVIDIA-NeMo/nemo-platform

maxdubrinsky · 2026-06-04T22:19:19Z

`filter[data.labels.*]` worked but was an implementation detail that leaked storage layout to users. This adds a `namespace=True` flag to `map_entity_field` so a filter model can declare a named prefix — e.g. `labels` → `data.labels` — that the parser rewrites transparently. Evaluator benchmarks and metrics now expose `labels` as such a namespace, making `filter[labels.eval_category]=agentic` the canonical form.

Docs updated to show only the short path.

Summary by CodeRabbit

New Features
- Added support for filtering benchmarks and metrics by entity labels using simplified bracket notation syntax (e.g., filter[labels.eval_category]).
Documentation
- Updated filtering guidance and benchmark management documentation to reflect the new label filter syntax approach across all relevant guides.

…ator labels (AIRCORE-389) Extend map_entity_field with a `namespace=True` flavor so a filter model can map a nested data blob (not just a scalar) to its entity-store path. Sub-paths under a declared namespace (`labels.<key>`) are validated and rewritten to `data.labels.<key>` generically: the shared filter parser learns "this field is a namespace" from the model's annotation and holds no service-specific knowledge. Evaluator BenchmarksListFilter and MetricsListFilter declare `labels` as such a namespace, so users can write `filter[labels.eval_category]=agentic` instead of the storage-qualified `filter[data.labels.eval_category]` (which still works). - nmp_common entities.values: map_entity_field(path, namespace=True); _get_entity_namespace_map(); prefix rewrite in _apply_field_map; namespace fields excluded from the scalar field map and folded into translate_operation. - nemo_platform_plugin api.parsed_filter: _validate_fields accepts sub-paths of model-declared namespaces; make_filter_dep threads the namespace set through. - evaluator: BenchmarksListFilter / MetricsListFilter declare a `labels` namespace. - docs updated to the short labels.* path; data.labels.* documented as the explicit fallback for any non-namespace nested field. Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>

The long form is an implementation detail; the short labels.<key> path is what users should use. No pre-existing docs mentioned the long form, so there's nothing to stay consistent with. Signed-off-by: Max Dubrinsky <mdubrinsky@nvidia.com>

github-actions · 2026-06-04T22:21:07Z

Documentation preview is ready

Preview: https://nvidia-nemo.github.io/nemo-platform/pr-preview/pr-196/pr-196/

Built from 9979f7f in workflow run.

This preview is deployed from this PR branch, updates when docs changes are pushed, and will be removed when the PR closes.

coderabbitai · 2026-06-04T22:25:17Z

📝 Walkthrough

Walkthrough

This PR adds namespace-aware field mapping to enable cleaner label filtering syntax (filter[labels.eval_harness] instead of filter[data.labels.eval_harness]). Core infrastructure supports prefix rewriting; schemas wire label filters; tests validate end-to-end; docs updated.

Changes

Label filtering with bracket notation

Layer / File(s)	Summary
Namespace field mapping and validation `packages/nmp_common/src/nmp/common/entities/values.py`, `packages/nemo_platform_plugin/src/nemo_platform_plugin/api/parsed_filter.py`, `packages/nmp_common/tests/api/test_parsed_filter.py`	`EntityFieldMapping` and `map_entity_field` accept `namespace=True` flag. `_map_field` resolves exact matches or rewrites namespace prefixes (e.g., `labels.x` → `data.labels.x`). `_validate_fields` allows dotted sub-paths only when the top-level segment is a declared namespace. `Filter.translate_operation` computes and applies both exact and namespace mappings. Tests verify prefix rewriting, logical operator handling, and validation of invalid dotted paths.
Label filter fields in schema `services/evaluator/src/nmp/evaluator/api/v2/benchmarks/schemas/benchmarks.py`, `services/evaluator/src/nmp/evaluator/api/v2/metrics/schemas/metrics_resp.py`	`BenchmarksListFilter` and `MetricsListFilter` each add optional `labels` field mapped to `"data.labels"` with `namespace=True`, enabling bracket-notation filtering by individual label sub-paths.
Integration test `services/evaluator/tests/nmp/evaluator/api/v2/benchmarks/test_benchmarks_filter.py`	Validates `filter[labels.label1]=value1` returns matching benchmark with 200 response.
Documentation `docs/evaluator/benchmarks/agentic.md`, `docs/evaluator/benchmarks/discover-industry-benchmarks.md`, `docs/evaluator/benchmarks/manage-benchmarks.md`, `docs/get-started/concepts/filtering.md`	Updated examples and guidance to use `filter[labels.<label>]` bracket notation.

Suggested reviewers

mckornfield
matthewgrossman

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 41.38% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	Title accurately describes the main change: adding namespace support to map_entity_field and exposing labels filtering in evaluator APIs.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch aircore-389-labels-prefix/md

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/evaluator/benchmarks/agentic.md`:
- Line 36: Add a short note to the evaluator benchmark docs explaining that both
query key forms are accepted: the new compact form filter[labels.eval_harness]
and the legacy form filter[data.labels.eval_harness]; update the text near the
example containing workspace="system",
extra_query={"filter[labels.eval_harness]": "bfcl"} to mention backward
compatibility (referencing the behavior validated by test_parsed_filter.py) so
users know existing code using filter[data.labels.eval_harness] will still work.

In `@docs/evaluator/benchmarks/discover-industry-benchmarks.md`:
- Line 32: Document that both filter[labels.eval_category] and
filter[data.labels.eval_category] are accepted for backward compatibility;
update the text around extra_query to explicitly mention both keys as valid
equivalents and clarify they map to the same label field so users won't be
confused when either form appears in examples or requests.

In `@docs/evaluator/benchmarks/manage-benchmarks.md`:
- Line 106: Update the documentation sentence that lists filterable fields to
also mention the explicit fallback form: append a short clause stating "The
fully-qualified filter[data.labels.<key>] form also works" so readers know they
can use filter[data.labels.<key>] in addition to filter[labels.<key>] (refer to
the existing line that mentions filter[labels.eval_category] and add the new
clause immediately after it).

In `@docs/get-started/concepts/filtering.md`:
- Around line 146-147: Add a backward-compatibility note to the filtering
reference: mention that both the new query form
"?filter[labels.eval_category]=agentic" and the legacy field path form
"data.labels.*" are accepted (as verified by test_parsed_filter.py lines around
test_parsed_filter.py:550-555), and give a short example showing both syntaxes
so readers know they can use either "filter[labels.eval_category]=..." or the
older "data.labels.eval_category=...".
- Around line 145-147: The fenced code block containing the example query string
"?filter[labels.eval_category]=agentic" is missing a language specifier; update
that fenced block (the triple-backtick block around
"?filter[labels.eval_category]=agentic") to include an appropriate language tag
such as http or text (e.g., ```http) or convert it to a plain indented/text
block so Markdownlint no longer flags it.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: bc7f1da0-82b6-4f5b-8daf-584f99e45e2c

📥 Commits

Reviewing files that changed from the base of the PR and between 3b9bab0 and 9979f7f.

📒 Files selected for processing (10)

docs/evaluator/benchmarks/agentic.md
docs/evaluator/benchmarks/discover-industry-benchmarks.md
docs/evaluator/benchmarks/manage-benchmarks.md
docs/get-started/concepts/filtering.md
packages/nemo_platform_plugin/src/nemo_platform_plugin/api/parsed_filter.py
packages/nmp_common/src/nmp/common/entities/values.py
packages/nmp_common/tests/api/test_parsed_filter.py
services/evaluator/src/nmp/evaluator/api/v2/benchmarks/schemas/benchmarks.py
services/evaluator/src/nmp/evaluator/api/v2/metrics/schemas/metrics_resp.py
services/evaluator/tests/nmp/evaluator/api/v2/benchmarks/test_benchmarks_filter.py

coderabbitai · 2026-06-04T22:25:20Z

 ```python
 bfcl_system_benchmarks = client.evaluation.benchmarks.list(
-    workspace="system", extra_query={"filter[data.labels.eval_harness]": "bfcl"}
+    workspace="system", extra_query={"filter[labels.eval_harness]": "bfcl"}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document backward compatibility.

The old filter[data.labels.eval_harness] form still works (per test_parsed_filter.py:555). Add a note that both forms are valid to avoid confusion for users with existing code.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/evaluator/benchmarks/agentic.md` at line 36, Add a short note to the evaluator benchmark docs explaining that both query key forms are accepted: the new compact form filter[labels.eval_harness] and the legacy form filter[data.labels.eval_harness]; update the text near the example containing workspace="system", extra_query={"filter[labels.eval_harness]": "bfcl"} to mention backward compatibility (referencing the behavior validated by test_parsed_filter.py) so users know existing code using filter[data.labels.eval_harness] will still work.

coderabbitai · 2026-06-04T22:25:20Z

 filtered_benchmarks = client.evaluation.benchmarks.list(
    workspace="system",
-    extra_query={"filter[data.labels.eval_category]": "advanced_reasoning"},
+    extra_query={"filter[labels.eval_category]": "advanced_reasoning"},


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document backward compatibility.

Both filter[labels.eval_category] and filter[data.labels.eval_category] are valid. Mention this to avoid confusion.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/evaluator/benchmarks/discover-industry-benchmarks.md` at line 32, Document that both filter[labels.eval_category] and filter[data.labels.eval_category] are accepted for backward compatibility; update the text around extra_query to explicitly mention both keys as valid equivalents and clarify they map to the same label field so users won't be confused when either form appears in examples or requests.

coderabbitai · 2026-06-04T22:25:20Z

 ## Filter

-Filter benchmarks using the `filter` parameter. Filterable fields are `name`, `description`, `project`, `created_at`, and `updated_at`. Nested label fields (e.g., `data.labels.eval_category`) can be filtered via `extra_query` with bracket notation (see above).
+Filter benchmarks using the `filter` parameter. Filterable fields are `name`, `description`, `project`, `created_at`, and `updated_at`. Individual labels can be filtered via `extra_query` with bracket notation (e.g. `filter[labels.eval_category]`; see above).


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the fallback form.

PR description states data.labels.* should be documented as "explicit fallback." Add: "The fully-qualified filter[data.labels.<key>] form also works."

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/evaluator/benchmarks/manage-benchmarks.md` at line 106, Update the documentation sentence that lists filterable fields to also mention the explicit fallback form: append a short clause stating "The fully-qualified filter[data.labels.<key>] form also works" so readers know they can use filter[data.labels.<key>] in addition to filter[labels.<key>] (refer to the existing line that mentions filter[labels.eval_category] and add the new clause immediately after it).

coderabbitai · 2026-06-04T22:25:20Z

 ```
-?filter[data.labels.eval_category]=agentic
+?filter[labels.eval_category]=agentic
 ```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix missing language specifier.

Add language identifier to the fenced code block (line 145) or use plain text block. Markdownlint flagged this.

Proposed fix

-``` +```http ?filter[labels.eval_category]=agentic

</details> <details> <summary>🧰 Tools</summary> <details> <summary>🪛 markdownlint-cli2 (0.22.1)</summary> [warning] 145-145: Fenced code blocks should have a language specified (MD040, fenced-code-language) </details> </details> <details> <summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @docs/get-started/concepts/filtering.md around lines 145 - 147, The fenced
code block containing the example query string
"?filter[labels.eval_category]=agentic" is missing a language specifier; update
that fenced block (the triple-backtick block around
"?filter[labels.eval_category]=agentic") to include an appropriate language tag
such as http or text (e.g., ```http) or convert it to a plain indented/text
block so Markdownlint no longer flags it.

</details>  

coderabbitai · 2026-06-04T22:25:21Z

+?filter[labels.eval_category]=agentic
 ```


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add backward compatibility note.

Context snippet test_parsed_filter.py:550-555 confirms data.labels.* form still works. Document both forms here since this is the central filtering reference.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/get-started/concepts/filtering.md` around lines 146 - 147, Add a backward-compatibility note to the filtering reference: mention that both the new query form "?filter[labels.eval_category]=agentic" and the legacy field path form "data.labels.*" are accepted (as verified by test_parsed_filter.py lines around test_parsed_filter.py:550-555), and give a short example showing both syntaxes so readers know they can use either "filter[labels.eval_category]=..." or the older "data.labels.eval_category=...".

github-actions · 2026-06-04T22:31:25Z

Suite	Lines Covered	Line Rate	Branch Rate
Unit Tests	18749/24796	75.6%	62.2%
Integration Tests	12019/23560	51.0%	26.2%

maxdubrinsky added 2 commits June 4, 2026 14:33

maxdubrinsky requested review from a team as code owners June 4, 2026 22:19

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(filter): support nested-namespace map_entity_field; expose evaluator labels (AIRCORE-389)#196

feat(filter): support nested-namespace map_entity_field; expose evaluator labels (AIRCORE-389)#196
maxdubrinsky wants to merge 2 commits into
mainfrom
aircore-389-labels-prefix/md

maxdubrinsky commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

github-actions Bot commented Jun 4, 2026

Uh oh!

coderabbitai Bot commented Jun 4, 2026

Walkthrough

Changes

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

coderabbitai Bot Jun 4, 2026

Uh oh!

github-actions Bot commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

maxdubrinsky commented Jun 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

github-actions Bot commented Jun 4, 2026

Documentation preview is ready

Uh oh!

coderabbitai Bot commented Jun 4, 2026

Walkthrough

Changes

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

maxdubrinsky commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading