Skip to content

feat(discovery): selector-driven discover() and discover_labels()#28

Merged
soupat merged 8 commits into
mainfrom
updated-hierarchical-discovery
May 19, 2026
Merged

feat(discovery): selector-driven discover() and discover_labels()#28
soupat merged 8 commits into
mainfrom
updated-hierarchical-discovery

Conversation

@soupat
Copy link
Copy Markdown
Collaborator

@soupat soupat commented May 10, 2026

Summary

Replace the legacy hierarchical discovery trio (describe_fleet, list_devices, get_device_functions) with a single selector-driven pair (discover, discover_labels) plus a label foundation. One grammar covers device-only, device.function, device.event, function-only, and event-only queries; same string drives every discovery and operation tool.

Three additive layers:

  1. Labels foundation. Optional labels: dict[str, str | list[str]] field on DeviceCapabilities, FunctionDef, and EventDef, populated via class-level DeviceDriver.labels = {...} or @rpc(labels=...) / @emit(labels=...) decorator kwargs.
  2. Selector DSL. Pure-Python parser at device_connect_edge.selector mapping a structured string onto a Selector dataclass. Supports key:value, key:[v1,v2] (OR within key), key:pattern* (anchored glob), k1:v1,k2:v2 (AND across keys), and bare-string id/name match.
  3. discover() and discover_labels() tools. Selector-driven discovery with stable pagination envelope ({scope, matched, returned, offset, next_offset, results, label_histogram}) and a label_histogram so callers can choose how to narrow next without a second call. Errors returned as data with structured {code, message} for the five failure modes.

flatten_device mirrors the legacy DeviceStatus.location into labels["location"] when capabilities don't declare one, so existing drivers populating only the heartbeat field remain discoverable.

The legacy trio remains for one release as advisory-deprecated wrappers (each emits a DeprecationWarning pointing at the equivalent discover() invocation). All first-party adapters (Claude Agent SDK, Strands, LangChain, the in-tree StrandsOpenAIDeviceConnectAgent) migrated to discover / discover_labels so they don't trigger the warning.

What's new vs main

  • New tools: discover(selector, offset, limit), discover_labels(key, offset, limit)
  • New module: device_connect_edge.selector (parser + matcher, dependency-free stdlib only)
  • Labels field on FunctionDef / EventDef / DeviceCapabilities
  • @rpc(labels=...) / @emit(labels=...) decorator kwargs
  • flatten_device legacy-location mirror
  • Adapters migrated; integration tests added; ADR added at docs/adr/0001-selector-driven-discovery.md

Backwards compatibility

  • describe_fleet / list_devices / get_device_functions still work — they emit a DeprecationWarning pointing to the equivalent discover() call. Existing tests in test_tools_hierarchical.py continue to pass against them.
  • discover_devices() (the long-deprecated flat-roster tool) also gains a DeprecationWarning for parity.
  • Drivers do not need to declare labels to be discoverable; discover("device(*)") returns everything, and the legacy-location mirror keeps location-based queries working for drivers that only set DeviceStatus.location.

Test plan

  • 912 unit tests pass (495 edge + 159 agent-tools + 258 server)
  • 91 integration tests pass on NATS backend (tests/tests/test_tools_selector.py adds 22 new tests covering all five scope shapes, label filters, OR-within-key, AND-across-keys, pagination, error envelope, and discover_labels per-axis + per-key forms)
  • No existing integration test broken
  • CI integration tests on Zenoh backend (skipped locally)

Commits

feat(types): add labels to capabilities, functions, and events
feat(selector): add selector DSL parser and matcher
feat(discovery): selector-driven discover and discover_labels
feat(discovery): structure discover/discover_labels error envelope

@kavya-chennoju
Copy link
Copy Markdown
Collaborator

Case sensitivity in selector matching

KeyFilter.matches (and Filter.matches for the bare-string axis) uses fnmatch.fnmatchcase, which is case-sensitive. Concretely:

discover("device(category:Camera)")     # 0 matches
discover("device(category:camera)")     # N matches

Users coming from Kubernetes labels, AWS tags, etc. won't expect this. Two reasonable resolutions:

  1. Document it. Add a callout to docs/discovery.md and the discover() docstring stating that selector matching is case-sensitive, and recommend lowercase label keys/values as the convention.

  2. Switch to case-insensitive. Lower-case both sides at compare time (fnmatch.fnmatchcase(actual.lower(), pattern.lower())). Cheap, removes the footgun, no impact on existing test fixtures since they use lowercase already.

I'd lean (2) — labels are metadata, not identifiers; case-sensitive matching is rarely what callers want. Happy to put up the patch if useful.

Minor: also worth aligning the glob-detection check ("*" in pattern or "?" in pattern) with the matcher's actual fnmatch capability — fnmatch supports [abc] ranges too, so currently category:[abc]rgb would be treated as a literal-string match rather than a glob.

kavya-chennoju
kavya-chennoju previously approved these changes May 12, 2026
@atsyplikhin
Copy link
Copy Markdown
Collaborator

Three follow-ups from the discovery design doc worth flagging before downstream callers grow against the current shape.

Long-tail cropping is missing in the multi-axis form

Spec §3 "Scale handling" calls out:

Long tail cropped at top 20 per key with "... and N more" hint

The current implementation returns the full sorted value map in the multi-axis form (discover_labels() with no key). For a 1247-device fleet with hundreds of unique locations, that response gets large. The per-key paginated form (discover_labels(key="device.location")) is the escape hatch, but the multi-axis form was meant to be self-limiting. Either truncate to top-N per key with a "... and N more" hint that nudges the agent toward the per-key form, or update the spec to drop the cropping requirement and route long tails through pagination only.

Per-key vs multi-axis response shapes diverge sharply

  • discover_labels(){total_devices, total_functions, total_events, device_keys, function_keys, event_keys}; values live under device_keys["location"]["values"].
  • discover_labels(key="device.location"){axis, key, matched, returned, offset, next_offset, values, axis_total, multivalued?}; values live at values.

An LLM (and the spec) have to know which shape to inspect. Not a bug, but the spec is light on the per-key shape; pinning the contract now is cheaper than after callers grow against it.

Server-side push-down deferred

discover() loads the full fleet via conn.list_devices() and filters in-process (tools.py:258-270). Fine for v1 -- the spec doesn't require server-side filtering -- but the 10K-device worked example will need push-down before it scales. Worth flagging as a known v1 limit so the operations layer can plan around it.


Aside on @kavya-chennoju's case-sensitivity question: both Kubernetes labels (keys + values) and AWS resource tags (keys + values, across most services, with an IAM policy condition exception) are case-sensitive. The labeling systems the spec invokes are case-sensitive by precedent, which argues for documenting + recommending lowercase as a convention rather than lowercasing at compare time.

soupat added a commit that referenced this pull request May 19, 2026
…nning

Five follow-ups from the PR #28 review thread.

- Long-tail cropping in multi-axis discover_labels(): each key's value
  list is now capped at LABEL_VALUES_TOP_N (env, default 20) and emits
  a sibling `more: <int>` field carrying the dropped count. Per-key
  form (discover_labels(key=...)) is paginated and untouched. The same
  truncation contract applies to discover()'s label_histogram. Matches
  the discovery design doc §3 "Scale handling".

- Selector matcher glob heuristic recognises fnmatch's full meta set
  (`*`, `?`, `[`); previously `[abc]rgb`-style patterns were silently
  treated as literal. Reachable end-to-end via the name axis
  (`function([sg]et_*)`); the value axis still rejects brackets at
  parse time. Extracted into `_is_glob()` so KeyFilter.matches and
  Filter.matches share one source of truth.

- Selector matching documented as case-sensitive on both keys and
  values, following K8s label-selector and AWS resource-tag
  precedent. Convention is lowercase. Note added to discover() and
  discover_labels() docstrings and a callout in docs/discovery.md.

- docs/discovery.md "Response envelopes" restructured to pin three
  shapes explicitly (discover, discover_labels multi-axis,
  discover_labels per-key); field-by-field contract cross-checked
  against the test assertions and the v4 spec.

- "Known limits" section added flagging client-side filtering as a
  v1 limit; ops layer should plan registry push-down before fleet
  growth past ~1K devices.

Tests: 60/60 selector unit, 41/41 agent-tools unit (5 new in
TestLongTailTruncation), 56/56 integration on NATS + Zenoh (12 new
exercising bracket-glob end-to-end, long-tail truncation,
per-key non-truncation, label_histogram truncation, case-sensitivity,
and response-shape pinning).
soupat added a commit that referenced this pull request May 19, 2026
JSON-RPC params may be omitted, null, an object, or an array. The
previous chained ``payload.get("params", {}).get(...)`` only treated
the omitted case as default — an explicit ``"params": null`` returned
``None``, then ``None.get(...)`` crashed. Same for array / scalar
params.

Hypothesis fuzz test found this on the PR #28 CI run with the
falsifying example ``{"params": None}``. Normalize raw_params to a
dict before the device_id lookup so the function delivers on its
"return a dict" contract for any well-formed JSON object input.

Adds deterministic regression tests in
test_connection_unit.py::TestParseEventPayloadNonDictParams covering
the four non-dict shapes (null, array, scalar, missing) plus the
happy path.
soupat added 8 commits May 18, 2026 21:10
Add an optional labels: dict[str, str | list[str]] field on
DeviceCapabilities, FunctionDef, and EventDef. Drivers populate them
either via class-level DeviceDriver.labels = {...} (device metadata)
or @rpc(labels=...) / @emit(labels=...) decorator kwargs. List values
express composite identity (a device that is both camera and inference).

These labels are the foundation for selector-based discovery and
operations: the discover/invoke/broadcast tools filter on them.
Add a pure-Python parser at device_connect_edge.selector that maps a
structured selector string onto a parsed Selector dataclass with five
scope shapes:

    device(<filters>)
    device(<filters>).function(<filters>)
    device(<filters>).event(<filters>)
    function(<filters>)
    event(<filters>)

Inside (...): key:value, key:[v1,v2] (OR within key), key:pattern*
(anchored glob), k1:v1,k2:v2 (AND across keys), bare-string id/name
match, or * to match all.

Parse errors carry source + caret position for diagnostics. The matcher
is dependency-free (stdlib only) and applies vacuous-True semantics on
unset axes so callers can iterate without scope branching.
Add two new agent tools that replace the hierarchical trio:

- discover(selector, offset, limit) resolves a selector to matched
  devices, function tuples, or event tuples. Adaptive response shape:
  small result sets include full schemas inline; large sets paginate
  with name-and-labels summaries (DC_FUNCTION_THRESHOLD=20).
- discover_labels(key, offset, limit) returns the label vocabulary,
  per axis (no key) or paginated values for one key.

Response envelope: {scope, matched, returned, offset, next_offset,
results, label_histogram}. The label_histogram describes the matched
set (pre-pagination) so callers can choose how to narrow next without
a second call. On the device axis, multi-valued keys also expose
unique_devices for cardinality.

flatten_device now mirrors the legacy DeviceStatus.location into
labels["location"] when capabilities.labels does not declare one, so
drivers populating only the heartbeat field remain discoverable via
selector queries on location.

Migrate first-party adapters (Claude Agent SDK, Strands, LangChain,
the in-tree StrandsOpenAIDeviceConnectAgent) to discover/discover_labels.
The legacy describe_fleet/list_devices/get_device_functions trio
remains for one release as advisory-deprecated wrappers; each call
emits a DeprecationWarning pointing to the equivalent discover()
invocation.

Test drivers carry category, direction, modality, and safety labels so
integration tests can exercise the full selector grammar end-to-end.
Errors returned by discover() and discover_labels() are now structured
{"code": ..., "message": ...} dicts rather than free-form strings. This
lets callers branch on the code programmatically while still surfacing
the message to logs or end users.

Codes emitted:
  - invalid_selector         selector is not a string
  - selector_parse_error     selector is a string but malformed
  - connection_error         registry / messaging backend unavailable
  - key_not_axis_qualified   discover_labels key missing axis prefix
  - unknown_axis             discover_labels axis not in
                             {device, function, event}
…ools

The doc is a developer guide rather than a decision record: drop the
"ADR 0001:" framing, status line, and motivation paragraph. Trim the
content to the discovery surface that ships with this PR (labels,
selector grammar, discover, discover_labels, response envelope, error
codes) so worked examples are runnable today.
Trim docs/discovery.md to the discovery surface that ships in this PR
(labels, selector grammar, discover, discover_labels, response envelope,
error codes). Drop the ADR framing (status line, summary/motivation),
the "Operations" section listing tools that have not landed yet, the
CLI section, and worked examples that called those tools, so the guide
matches what a developer can actually run today.
…nning

Five follow-ups from the PR #28 review thread.

- Long-tail cropping in multi-axis discover_labels(): each key's value
  list is now capped at LABEL_VALUES_TOP_N (env, default 20) and emits
  a sibling `more: <int>` field carrying the dropped count. Per-key
  form (discover_labels(key=...)) is paginated and untouched. The same
  truncation contract applies to discover()'s label_histogram. Matches
  the discovery design doc §3 "Scale handling".

- Selector matcher glob heuristic recognises fnmatch's full meta set
  (`*`, `?`, `[`); previously `[abc]rgb`-style patterns were silently
  treated as literal. Reachable end-to-end via the name axis
  (`function([sg]et_*)`); the value axis still rejects brackets at
  parse time. Extracted into `_is_glob()` so KeyFilter.matches and
  Filter.matches share one source of truth.

- Selector matching documented as case-sensitive on both keys and
  values, following K8s label-selector and AWS resource-tag
  precedent. Convention is lowercase. Note added to discover() and
  discover_labels() docstrings and a callout in docs/discovery.md.

- docs/discovery.md "Response envelopes" restructured to pin three
  shapes explicitly (discover, discover_labels multi-axis,
  discover_labels per-key); field-by-field contract cross-checked
  against the test assertions and the v4 spec.

- "Known limits" section added flagging client-side filtering as a
  v1 limit; ops layer should plan registry push-down before fleet
  growth past ~1K devices.

Tests: 60/60 selector unit, 41/41 agent-tools unit (5 new in
TestLongTailTruncation), 56/56 integration on NATS + Zenoh (12 new
exercising bracket-glob end-to-end, long-tail truncation,
per-key non-truncation, label_histogram truncation, case-sensitivity,
and response-shape pinning).
JSON-RPC params may be omitted, null, an object, or an array. The
previous chained ``payload.get("params", {}).get(...)`` only treated
the omitted case as default — an explicit ``"params": null`` returned
``None``, then ``None.get(...)`` crashed. Same for array / scalar
params.

Hypothesis fuzz test found this on the PR #28 CI run with the
falsifying example ``{"params": None}``. Normalize raw_params to a
dict before the device_id lookup so the function delivers on its
"return a dict" contract for any well-formed JSON object input.

Adds deterministic regression tests in
test_connection_unit.py::TestParseEventPayloadNonDictParams covering
the four non-dict shapes (null, array, scalar, missing) plus the
happy path.
@soupat soupat force-pushed the updated-hierarchical-discovery branch from 9419e8b to 3ad2eec Compare May 19, 2026 04:10
Copy link
Copy Markdown
Collaborator

@atsyplikhin atsyplikhin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. All three blocking items from my prior review are resolved:

  • Long-tail cropping via LABEL_VALUES_TOP_N + sibling more field, applied symmetrically to discover_labels() and discover()'s label_histogram.
  • Response envelopes pinned in docs/discovery.md §"Response envelopes" (three shapes documented).
  • Server-side push-down flagged in §"Known limits" as a v1 limit.

Bonus fixes picked up beyond what I asked for: bracket-glob recognition via _is_glob() closes Kavya's open question, case-sensitivity documented with K8s/AWS precedent, contract pinning cross-checked against tests.

Test coverage is solid (5 new unit + 12 new integration). One minor doc nit for a follow-up: the v4 spec example still shows "...": "... and 12 more" as an inline pseudo-entry; the shipped shape is a sibling "more": N field, which is cleaner. Worth a spec update at some point but not blocking.

@soupat soupat merged commit 86dfd38 into main May 19, 2026
9 checks passed
soupat added a commit that referenced this pull request May 19, 2026
…re, CLI (#29)

## Summary

Stacked on top of PR #28. Completes the discovery design's operations
layer (phases 4-6 of the spec) and the optional CEL `where` predicate.
With this PR merged, the discovery API ships end-to-end: labels,
selectors, discovery, invocation, async fan-out, and operator CLI.

This PR targets `updated-hierarchical-discovery` so the diff shows only
the operations work. GitHub auto-retargets to `main` once #28 merges.

## What's new

- **`invoke(selector, params, llm_reasoning)`** — selector-driven sync
  call. Selector must resolve to one (device, function) tuple; zero or
  multiple matches return structured errors (`no_match`,
  `ambiguous_match`). `invoke_device` becomes advisory-deprecated.
- **`invoke_many(selector, params, timeout, max_concurrency)`** — sync
  parallel fan-out. Partial-failure semantics: per-target results and
  errors are returned even if some fail. Per-target 30s timeout default.
- **`broadcast(selector, params, where=, bindings=, fire_at=,
on_late=)`**
  — async fan-out. Returns `correlation_id` immediately; replies stream
on
`device-connect.<zone>.<device_id>.event.async_reply.<correlation_id>`.
- **CEL `where` predicate** evaluated at each candidate device against
  `{identity, labels, status, bindings}`. Self-deselection is silent;
  compile-validated at the dispatcher before publication. Optional
  dependency: `pip install device-connect-agent-tools[predicate]`.
- **Synchronized fan-out** via `fire_at` (wall-clock epoch seconds) +
  `on_late` (`skip`|`fire`). Each device holds the message and fires
  from its own clock at the deadline. NTP-typical spread is 5-10 ms.
- **`subscribe(selector)`** returns a `Subscription` handle for live
  events. Two selector forms: `"correlation:<id>"` for broadcast
  replies, or event-scoped selectors (`event(<name>)` /
  `device(...).event(<name>)`). Supports `read()`, `iter(timeout)`,
  context-manager `with`, and the standard `for msg in sub:` protocol.
- **`await_replies(correlation_id, timeout, until)`** — sync helper for
  the common broadcast-then-collect pattern.
- **`devctl discover` / `discover-labels`** — selector-driven discovery
  on the CLI. Historical `devctl discover` (mDNS scan) renamed to
  `devctl mdns-scan` with `scan` alias.
- **`statectl invoke` / `invoke-many` / `broadcast` / `subscribe` /
  `await`** — operator-facing wrappers for the new tools. JSON-shaped
  `--param k=v`, structured exit codes for pipelines.
- **`docs/discovery.md`** extended with operations, edge-side `where`,
  synchronized fan-out, worked examples, and CLI reference.

## Commits

```
fix(broadcast):  robustness pass on edge handler, subscribe, and CLI
docs:            extend discovery guide for operations, where, and CLI
feat(cli):       selector-driven verbs in devctl and statectl
feat(broadcast): async fan-out with correlation, fire_at, and subscribe
feat(predicate): add CEL where evaluator with optional [predicate] extra
feat(invoke):    selector-based invoke and invoke_many with sync fan-out
```

The final commit is the result of running the same three reviews used
on PR #28 (`code-review-and-quality`, `code-simplification`,
`api-and-interface-design`) on this branch and addressing 11 findings:
race fix in `Subscription.read`, broadcast handler no longer blocks the
subscription callback on `fire_at`, wire-format rename from
`target_device_ids` to `targets`, `__iter__` protocol on Subscription,
CLI SIGINT handling, structured exit codes, `identity.device_id` in
predicate context, ASCII compliance.

## Backwards compatibility

- `invoke_device(device_id, function, ...)` still works, now emits a
`DeprecationWarning` pointing at
`invoke('device(<id>).function(<name>)', params)`.
- All first-party adapters (Claude Agent SDK, Strands, LangChain, the
  in-tree `StrandsOpenAIDeviceConnectAgent`) migrated to `invoke` /
  `invoke_many`; `invoke_device` is removed from their tool surfaces.
- `invoke_device_with_fallback` is kept unchanged — no selector
  equivalent exists for "try a list of devices in order".
- The historical `devctl discover` mDNS-scan verb is renamed to
  `devctl mdns-scan` (with `scan` alias). Scripts that called
  `devctl discover` for the mDNS path should switch to `devctl scan`.

## Test plan

- [x] 1031 unit tests pass (511 edge + 206 agent-tools + 314 server)
- [x] 105 integration tests pass on NATS, zero failures. New
      `tests/tests/test_tools_invoke.py` (9 tests) and
      `tests/tests/test_tools_broadcast.py` (9 tests) exercise:
  - single-target invoke against camera / sensor / robot
  - invoke ambiguous-match, no-match, invalid-scope, parse-error
  - invoke_many full fan-out, partial failure, zero candidates,
    function-only selector
  - end-to-end broadcast + reply on `correlation_id`
  - CEL where filters at the edge (location, bindings allowlist)
  - `fire_at` synchronization (spread under 0.5s with on_late=skip),
    late-arrival drop with on_late=skip
  - `await_replies(until=K)` early return
  - `subscribe("correlation:<id>")` streaming
  - `subscribe(event(<name>))` live event capture
  - `for msg in sub:` iter protocol end-to-end
- [ ] Zenoh integration tests skipped locally; CI will exercise them
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants