Skip to content

feat(OBE-9898): always require a token; read site_version claim#110

Open
saurabhchauhan-s1 wants to merge 1 commit into
masterfrom
feat-obe-9898-site-version-from-token
Open

feat(OBE-9898): always require a token; read site_version claim#110
saurabhchauhan-s1 wants to merge 1 commit into
masterfrom
feat-obe-9898-site-version-from-token

Conversation

@saurabhchauhan-s1

Copy link
Copy Markdown

Summary

The vector source's auth previously had a require_token config. When set to false it allowed token-less requests through (a staged-migration escape hatch for older agents). The problem: that bypass applied to every sink — including upgraded sites that should never be able to skip auth — making it a blunt, unsafe instrument.

This PR (OBE-9898):

  • Removes require_token entirely. A token is now always required: Auth::authenticate rejects a missing authorization header unconditionally. This closes the "upgraded sites skipping auth" gap.
  • Reads the per-token site_version claim that the manager's auth-service stamps onto every site token (OBE-9896), instead of relying on a global flag. It's surfaced for telemetry / future per-version policy via a small pure extract_site_version helper. Tokens minted before the claim existed are tolerated (the claim is optional).

The site_version claim name is the cross-repo contract with the manager's SiteVersionClaim — both sides pin the literal "site_version" with a unit test so a rename on either side fails its own suite rather than silently breaking compatibility.

Files: src/sources/util/jwt_auth.rs (config + auth logic + tests), src/sources/vector/mod.rs (integration tests + comments), src/sinks/vector/config.rs and src/test_util/jwt_auth.rs (comment / helper cleanup).

Vector configuration

A vector source with auth — note there is no longer a require_token key:

[sources.in]
type = "vector"
address = "0.0.0.0:6000"

[sources.in.auth]
pub_key.type  = "inline"
pub_key.value = "<PEM>"
membership_claim = "site_ids"

Any config that still sets require_token = ... will now fail to parse (deny_unknown_fields).

How did you test this PR?

Unit tests added/updated in src/sources/util/jwt_auth.rs:

  • missing_authorization_is_rejected, valid_token_is_accepted, invalid_token_is_rejected — a token is always required.
  • site_version_claim_name_is_stable — pins the claim name to "site_version" (cross-repo contract).
  • extract_site_version_reads_the_claim / _absent_is_none / _wrong_type_is_none — read behavior incl. tolerating absent/odd claims.
  • authenticate_surfaces_site_version_when_present — end-to-end on a real signed token.
  • Legacy require_token push/healthcheck integration tests in src/sources/vector/mod.rs were removed or consolidated to the "token always required" posture.

⚠️ Reviewer note: I was unable to compile/run the suite locally — this repo builds via the flake.nix nix shell and nix isn't available on my machine; a direct cargo build fails on the transitive metrics dependency (unrelated to this change). The Rust changes were verified by inspection; please rely on CI (make fmt / make check-clippy / make test) to validate the build.

Change Type

  • Bug fix
  • New feature
  • Dependencies
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Removing require_token under #[serde(deny_unknown_fields)] means any config TOML that sets it will now fail to parse. The manager (the only generator of these source configs) never emitted that field, so generated configs are unaffected.

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

(No changelog fragment added yet — happy to add one, or a maintainer can apply the label.)

References

🤖 Generated with Claude Code

The `vector` source's auth had a `require_token` config that, when set to
false, let token-less requests through for staged migration. That bypass
applied to *every* sink — including upgraded sites that should never skip
auth — so it was a blunt, unsafe instrument.

This removes `require_token` entirely: a token is now always required
(`authenticate` rejects a missing `authorization` header unconditionally),
which closes the "upgraded sites skipping auth" gap. Instead of a global
flag, the source now reads the per-token `site_version` claim that the
manager stamps onto every site token (OBE-9896) and surfaces it for
telemetry / future per-version policy. Older tokens that predate the claim
are tolerated (the claim is optional).

Notes:
- `AuthConfig` uses `#[serde(deny_unknown_fields)]`, so this is a breaking
  change: any config TOML that *sets* `require_token` will now fail to
  parse. The manager never emitted that field, so generated configs are
  unaffected.
- The `site_version` claim name is the cross-repo contract with the
  manager's `SiteVersionClaim`; both sides pin the literal "site_version"
  with a unit test so a rename can't silently break compatibility.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant