Add outbound token framework with pluggable token sources

## Summary

Add a first-class outbound token framework for RAS services.

RAS services frequently need to call other systems. Sometimes the downstream system is a third-party API such as GitHub, Google, Slack, or a customer system. Sometimes it is another internal RAS service. Today each project tends to hand-roll token acquisition, grant/refresh-token storage, access-token caching, bearer-header attachment, host validation, and realistic fakes.

This issue should define the reusable outbound token/client framework with pluggable token sources. OAuth2/OIDC should be one token source for third-party APIs. RAS-internal JWT issuance should be another token source for internal service-to-service calls, backed by #13.

Related:

- #13 defines the RAS-native authorization control plane and internal service token issuer.
- The optional auth gateway/proxy issue will define the browser-facing multi-service token exchanger that consumes these token primitives.
- #15 defines the topology macro that can generate service graph, gateway profile, and authz policy artifacts consumed by #13/#14.

## Problem

Developers using RAS frequently need to call external APIs, internal services, or customer systems.

Today each project tends to hand-roll:

- where refresh tokens/grants come from and how they are stored
- how access tokens are cached and refreshed
- how internal service tokens are requested
- how bearer tokens are attached to outbound requests
- how refresh-token rotation is persisted
- how to validate that tokens are only sent to the right downstream hosts
- how to test integrations without hitting real providers
- how to exercise the actual service code path in tests instead of replacing everything with mocks

Important terminology:

- A refresh token is a stored **grant**. RAS should not magically know it; the application must provide a consent flow, seeded grant, or custom grant store. RAS should then use that grant to acquire and refresh access tokens.
- Internal service-to-service tokens should not require Auth0/Entra service app setup. For internal services, #13 should provide the RAS-owned service registry, authorization decision, and JWT issuer. This issue should consume that issuer through a token source.

## Proposed Direction

Add a new integration-auth crate family, likely under `crates/integration/`:

- `ras-integration-core`
  - `TokenSource` abstraction
  - token request/lease types
  - token manager
  - grant-store traits
  - access-token cache traits
  - in-memory stores for tests/dev
  - reqwest/header authorization helpers
  - redacted secret wrappers
  - capability-scoped integration clients so handlers cannot request arbitrary integrations/scopes/audiences
- `ras-integration-oauth2`
  - generic OAuth2/OIDC token source
  - authorization-code + PKCE user grants
  - refresh-token flow
  - client-credentials service tokens for third-party providers that support it
  - strict state/callback validation for user consent flows
- `ras-integration-ras`
  - `RasInternalTokenSource`
  - requests internal service tokens from the RAS authorization/token issuer tracked in #13
  - no Auth0/Entra app/client setup for internal services
- `ras-integration-test`
  - in-process fake OAuth/token/API provider
  - fake `TokenSource` implementations
  - test helpers for consent, refresh, invalid scope, revoked grant, provider failure, internal token issuance failure, and downstream bearer verification

Keep this separate from existing inbound auth. Existing RAS sessions identify the logged-in application user. The authorization/control-plane work in #13 decides whether a user/service/application may obtain a downstream token. This issue handles how that token is acquired, cached, refreshed, and attached.

For browser traffic that fans out to multiple backend services, token narrowing should be handled by the optional auth gateway/proxy rather than by every backend. This issue should provide reusable token and attachment primitives that the gateway can use, but deploying a gateway must remain opt-in.

For multi-service systems, topology-generated artifacts from #15 can provide the route/audience/service graph policy that determines which tokens may be requested or attached.

## Core API Sketch

Generic token source:

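```rust
use async_trait::async_trait;

/// A pluggable source of outbound tokens (OAuth2, RAS-internal, and so on).
#[async_trait]
trait TokenSource {
    async fn issue_token(&self, request: TokenRequest) -> Result<TokenLease, TokenError>;
}
```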

Token request model:

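```rust
// Field types are still a sketch; owned strings are assumed here.
let request = TokenRequest {
    integration_id: "google-calendar".into(),
    subject: TokenSubject::User { user_id: "alice".into() },
    scopes: vec!["calendar.readonly".into()],
    audience: None,
    force_refresh: false,
};
```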

Planned token sources:

```text
OAuth2TokenSource
  -> external OAuth/OIDC providers

RasInternalTokenSource
  -> RAS-issued JWTs for internal service-to-service calls, backed by #13

Future optional sources
  -> static token/API key/legacy adapters
```

Ordinary service code should prefer capability-scoped clients over constructing raw `TokenRequest` values:

```rust
let google = AuthorizedHttpClient::for_user(
    reqwest::Client::new(),
    token_manager.clone(),
    "google-calendar",
    user.user_id.clone(),
    ["calendar.readonly"],
);

google.get(calendar_url).send().await?;
```

Internal service usage should look similar:

```rust
let invoice_api = AuthorizedHttpClient::for_service(
    reqwest::Client::new(),
    token_manager.clone(),
    "invoice-service",
    ["invoice:write"],
);

invoice_api.post(invoice_url).json(&request).send().await?;
```

For the internal case:

```text
billing-service
  -> asks token manager for token to call invoice-service
  -> RasInternalTokenSource asks RAS issuer from #13
  -> RAS checks local service registration/grants/audience
  -> RAS issues signed JWT
  -> AuthorizedHttpClient attaches the JWT
  -> invoice-service validates via RAS JWKS
```

For a browser frontend behind the optional auth gateway/proxy:

```text
browser sends RAS web session cookie/JWT
  -> gateway validates RAS session locally
  -> gateway maps route to target audience
  -> gateway narrows permissions to that audience
  -> gateway mints/caches short-lived service-specific token
  -> backend receives only single-audience token
```

Backends should not need to parse permissions for unrelated audiences.
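
As a rough illustration of the narrowing step, the sketch below drops every permission that does not belong to the target audience. `SessionPermissions`, `BackendClaims`, and `narrow_to_audience` are hypothetical names invented for this example, and the assumption that a session stores permissions grouped by audience is not something this issue specifies.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical view of a validated RAS web session:
/// permissions grouped by the audience they apply to.
pub struct SessionPermissions {
    pub subject: String,
    pub by_audience: BTreeMap<String, BTreeSet<String>>,
}

/// Hypothetical claims carried by the short-lived token a backend receives.
#[derive(Debug, PartialEq, Eq)]
pub struct BackendClaims {
    pub subject: String,
    pub audience: String,
    pub permissions: BTreeSet<String>,
}

/// Narrow a session to a single audience. Permissions for every other
/// audience are dropped; an audience with no entry yields an empty set.
pub fn narrow_to_audience(session: &SessionPermissions, audience: &str) -> BackendClaims {
    BackendClaims {
        subject: session.subject.clone(),
        audience: audience.to_string(),
        permissions: session
            .by_audience
            .get(audience)
            .cloned()
            .unwrap_or_default(),
    }
}
```

Whether an unknown audience should produce an empty permission set or a hard error is a policy choice for the gateway issue; the sketch picks the empty set only to stay short.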

Internal service requests should distinguish principal mode instead of treating every call as the same subject:

- service-as-service: the caller requests a token for its own service identity
- user-delegated: the caller requests a downstream token on behalf of a RAS-authenticated user, constrained by both caller/service policy and the user's delegated permissions
- application/service-account: the caller uses a non-human principal with explicit RAS grants

The v1 implementation may focus on service-as-service, but `TokenSubject`, cache keys, and `RasInternalTokenSource` should leave room for delegated calls without introducing a second token model later.
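
One way to leave that room is to make the principal mode part of the subject type itself, so that it flows into cache keys automatically. The sketch below is only an assumption about shape: `InternalPrincipal` and `InternalCacheKey` are hypothetical names, and the real `TokenSubject` may carry different fields.

```rust
/// Hypothetical principal modes for internal token requests.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum InternalPrincipal {
    /// The caller requests a token for its own service identity.
    Service { service_id: String },
    /// The caller acts on behalf of a RAS-authenticated user.
    UserDelegated { service_id: String, user_id: String },
    /// A non-human principal with explicit RAS grants.
    ServiceAccount { account_id: String },
}

/// Hypothetical cache key. Because the principal enum is part of the key,
/// a service-as-service token and a delegated token never compare equal.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct InternalCacheKey {
    pub principal: InternalPrincipal,
    pub audience: String,
    /// Canonical form: sorted and de-duplicated.
    pub scopes: Vec<String>,
}

impl InternalCacheKey {
    pub fn new(principal: InternalPrincipal, audience: &str, scopes: &[&str]) -> Self {
        let mut scopes: Vec<String> = scopes.iter().map(|s| s.to_string()).collect();
        scopes.sort();
        scopes.dedup();
        Self {
            principal,
            audience: audience.to_string(),
            scopes,
        }
    }
}
```

With this shape, a v1 that only issues `Service` tokens can add delegated issuance later without changing how keys are built or compared.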

Grant injection paths for external OAuth providers:

```rust
grant_store.put_user_grant(UserGrant {
    integration_id: "google-calendar".into(),
    user_id: "alice".into(),
    refresh_token: SecretString::new(refresh_token),
    scopes: vec!["calendar.readonly".into()],
}).await?;
```

Applications may acquire grants through:

- OAuth callback capture
- manual/admin seeding
- migration from an existing DB
- custom `GrantStore` backed by Vault, KMS, SQL, Redis, etc.
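
For tests and local development, an in-memory store could look roughly like the sketch below. It is std-only and synchronous for brevity, whereas the real trait would presumably be async. `Redacted` is a hypothetical stand-in for whatever secret wrapper the crate ends up using, and the method names other than `put_user_grant` are assumptions.

```rust
use std::collections::HashMap;
use std::fmt;
use std::sync::Mutex;

/// Hypothetical redacted wrapper: `Debug` never prints the inner value.
#[derive(Clone)]
pub struct Redacted(String);

impl Redacted {
    pub fn new(value: impl Into<String>) -> Self {
        Self(value.into())
    }
    /// Explicit, greppable access to the raw secret.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("[REDACTED]")
    }
}

#[derive(Debug, Clone)]
pub struct UserGrant {
    pub integration_id: String,
    pub user_id: String,
    pub refresh_token: Redacted,
    pub scopes: Vec<String>,
}

/// Synchronous sketch of the grant-store trait.
pub trait GrantStore {
    fn put_user_grant(&self, grant: UserGrant);
    fn get_user_grant(&self, integration_id: &str, user_id: &str) -> Option<UserGrant>;
    /// Returns true if a grant existed and was removed.
    fn revoke(&self, integration_id: &str, user_id: &str) -> bool;
}

/// In-memory store keyed by (integration_id, user_id).
#[derive(Default)]
pub struct InMemoryGrantStore {
    grants: Mutex<HashMap<(String, String), UserGrant>>,
}

impl GrantStore for InMemoryGrantStore {
    fn put_user_grant(&self, grant: UserGrant) {
        let key = (grant.integration_id.clone(), grant.user_id.clone());
        self.grants.lock().unwrap().insert(key, grant);
    }

    fn get_user_grant(&self, integration_id: &str, user_id: &str) -> Option<UserGrant> {
        let key = (integration_id.to_string(), user_id.to_string());
        self.grants.lock().unwrap().get(&key).cloned()
    }

    fn revoke(&self, integration_id: &str, user_id: &str) -> bool {
        let key = (integration_id.to_string(), user_id.to_string());
        self.grants.lock().unwrap().remove(&key).is_some()
    }
}
```

A production key would likely also include tenant/customer context, as the cache-collision requirement elsewhere in this issue suggests; it is left out here to keep the example small.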

## Security And Threat Model Requirements

This layer must not become a broad token vending machine. The default ergonomic path should inject capability-scoped clients, not a raw global token manager. A handler should receive something like a preconfigured Google Calendar read client or invoice-service write client, not the ability to request any integration ID, scope, audience, or subject.
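
A minimal sketch of what "capability-scoped" could mean at the type level follows. All names (`IntegrationCapability`, `AuthenticatedUser`, `from_session`, `request_for`) are hypothetical, and the `TokenRequest` here is a simplified stand-in for the request model sketched earlier. The point is only that the integration ID and scopes are fixed when the application is wired, not chosen by the handler.

```rust
/// Simplified, hypothetical request shape for this example.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TokenRequest {
    pub integration_id: String,
    pub user_id: String,
    pub scopes: Vec<String>,
}

/// Hypothetical proof that a user came from a validated RAS session.
/// In a real crate the constructor would be restricted to the session layer.
pub struct AuthenticatedUser {
    user_id: String,
}

impl AuthenticatedUser {
    pub fn from_session(user_id: &str) -> Self {
        Self { user_id: user_id.to_string() }
    }
}

/// A capability fixed at wiring time. Fields are private, so handler code
/// outside this module cannot change the integration or scopes.
pub struct IntegrationCapability {
    integration_id: String,
    scopes: Vec<String>,
}

impl IntegrationCapability {
    /// Intended to be called from application setup, not from handlers.
    pub fn new(integration_id: &str, scopes: &[&str]) -> Self {
        Self {
            integration_id: integration_id.to_string(),
            scopes: scopes.iter().map(|s| s.to_string()).collect(),
        }
    }

    /// The only input a handler supplies is the already-authenticated user.
    pub fn request_for(&self, user: &AuthenticatedUser) -> TokenRequest {
        TokenRequest {
            integration_id: self.integration_id.clone(),
            user_id: user.user_id.clone(),
            scopes: self.scopes.clone(),
        }
    }
}
```

A handler that receives only an `IntegrationCapability` has no API surface for naming a different integration, scope, or arbitrary subject, which is the confused-deputy property the requirements below ask for.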

Primary assets:

- refresh tokens/grants
- OAuth client secrets
- RAS internal service tokens
- access tokens
- user-to-provider account links
- integration configuration, allowed hosts, allowed audiences, and allowed scopes

Primary trust boundaries:

- browser OAuth callback into service
- RAS session identity into integration grant lookup
- grant store into token manager
- RAS internal token issuer from #13 into `RasInternalTokenSource`
- token manager into outbound HTTP
- fake providers into test confidence

The implementation must address:

- Confused-deputy risk: handlers must not build arbitrary `TokenRequest` values from user-controlled input.
- Policy-bound token minting: `RasInternalTokenSource` must call #13 for authorization and must not duplicate that decision with, or bypass it through, partial local policy checks.
- Internal token requests must carry enough subject/principal mode information for #13 to distinguish service-as-service, user-delegated, and application/service-account issuance.
- OAuth CSRF/account-linking risk: consent `state` must bind to the RAS user/session, integration ID, provider/issuer, redirect URI, requested scopes/audience, and PKCE verifier.
- Secret persistence risk: `GrantStore` is a security boundary, and refresh grants must not be accidentally logged, debug-printed, or serialized.
- Bearer-token exfiltration risk: managed bearer tokens must only be attached to configured integration hosts/base URLs, with redirect behavior explicitly controlled.
- Cross-tenant/cache-collision risk: cache and grant keys must include tenant/customer context where applicable, subject, provider/client identity, audience/resource, canonical scopes, and grant/config version.
- Cache keys must also include token family/type and principal mode so external OAuth tokens, internal service tokens, delegated internal tokens, and gateway-derived tokens cannot collide.
- Token-family confusion risk: external OAuth tokens, internal RAS JWTs, static/legacy tokens, and future token kinds must not share ambiguous cache keys or host/audience attachment rules.
- Scope escalation risk: user token requests must be subset-checked against the stored grant scopes; broader scopes require a new consent flow.
- Internal audience escalation risk: internal service token requests must be checked by #13 against local service grants and allowed audiences before issuance.
- Internal token leakage risk: internal RAS JWTs must never be attached to third-party hosts or unregistered internal service hosts.
- Backend domain privacy risk: backend services should receive only their own audience-specific permissions, not the caller's full cross-service permission map, when the optional gateway mode is used.
- Stale authorization risk: cached internal RAS JWTs must be short-lived and must not outlive the revocation assumptions defined by #13.
- Refresh-token rotation risk: rotated refresh tokens must be persisted reliably, with save failures surfaced, so that a successful token response does not silently lose the new grant.
- Unsafe retry risk: automatic refresh-and-retry must not replay non-idempotent requests unless explicitly opted in.
- Revocation risk: users/admins must be able to disconnect or revoke stored external integration grants; internal service grant revocation is owned by #13.
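
The scope-escalation requirement above can be illustrated with a small subset check. This is a sketch under assumptions: `check_scopes` is a hypothetical helper, and the `missing_scopes` field on `ConsentRequired` is not specified by this issue.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
pub enum TokenError {
    /// The stored grant does not cover the request; a new consent flow is needed.
    ConsentRequired { missing_scopes: Vec<String> },
}

/// Returns Ok only when every requested scope is present in the stored grant.
/// Missing scopes are reported in request order.
pub fn check_scopes(requested: &[&str], granted: &[&str]) -> Result<(), TokenError> {
    let granted: BTreeSet<&str> = granted.iter().copied().collect();
    let missing: Vec<String> = requested
        .iter()
        .copied()
        .filter(|scope| !granted.contains(scope))
        .map(str::to_string)
        .collect();
    if missing.is_empty() {
        Ok(())
    } else {
        Err(TokenError::ConsentRequired { missing_scopes: missing })
    }
}
```

An empty grant is treated the same way as a partial one: every requested scope is reported as missing, so the caller never falls back silently.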

## Acceptance Criteria

- A generic `TokenSource` abstraction exists and can be used by the token manager.
- OAuth2/OIDC is implemented as one `TokenSource`.
- RAS-internal JWT issuance is represented as a separate `RasInternalTokenSource` backed by #13.
- `RasInternalTokenSource` never mints or returns a token without a successful authorization/issuance response from #13.
- A service can request an access token for a service identity using OAuth2 client credentials when configured for an external provider.
- A service can request an access token for a RAS-authenticated user using a stored OAuth refresh grant.
- A service can request an internal RAS-issued JWT for an internal downstream service through `RasInternalTokenSource`.
- Internal token request types distinguish service-as-service, user-delegated, and application/service-account principals, even if v1 only fully implements service-as-service issuance.
- Token primitives support an optional gateway mode that derives single-audience backend tokens from a RAS web session without requiring a RAS authority call per proxied request.
- Missing user grants return a typed `ConsentRequired` error rather than falling back silently.
- Ordinary handler code can use capability-scoped clients without being able to request arbitrary integration IDs, scopes, audiences, or subjects.
- Integration configuration declares allowed scopes/audiences and allowed outbound hosts/base URLs; requests outside those bounds fail closed.
- OAuth consent state is opaque, single-use, expiring, and bound to the initiating RAS user/session, integration, provider/issuer, redirect URI, scopes/audience, and PKCE verifier.
- User token requests are subset-checked against stored grant scopes; broader scopes require a new consent flow.
- Access tokens are cached by token source/family, integration, tenant/customer context where applicable, subject, provider/client identity, audience/resource, canonicalized scopes, and grant/config version.
- Internal token cache keys include principal mode and target audience/service.
- Internal RAS JWTs and external OAuth tokens cannot collide in cache keys and cannot be attached through the wrong host/audience policy.
- Tokens refresh before expiry using configurable clock skew.
- Refresh-token rotation is persisted back to the configured `GrantStore`; save failures are surfaced and covered by tests.
- Concurrent refreshes for the same token key are de-duplicated.
- Secrets are redacted in `Debug`, errors, and logs, and refresh-token types do not accidentally serialize through serde.
- Existing generated-client `set_bearer_token` behavior remains compatible.
- A reqwest/header helper can attach managed bearer tokens to outbound HTTP requests only after integration host validation.
- Internal RAS JWTs are only attached to registered/allowed internal service hosts and are never attached to third-party hosts.
- Gateway-derived backend tokens contain only the target audience and target-audience permissions.
- Redirect behavior is explicit; bearer tokens are not forwarded to unvalidated redirect targets.
- No automatic replay of unsafe HTTP methods after refresh/401 unless the caller explicitly opts in.
- In-memory grant/cache stores are available for unit tests and local examples.
- Production persistence is provided by traits, not hardcoded to a specific database or secret manager.
- Users/admins can revoke or disconnect stored external integration grants, and future token requests fail after revocation.
- Internal token cache behavior respects #13 revocation assumptions through short TTLs, grant/config versioning, or explicit invalidation hooks.
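
To make the host-validation criteria concrete, the sketch below refuses to produce an `Authorization` header value unless the target host is on the integration's allowlist. `HostPolicy` and `AttachError` are hypothetical names. The example takes an already-extracted host string; a real helper would presumably parse the request URL first.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
pub enum AttachError {
    HostNotAllowed(String),
}

/// Hypothetical per-integration allowlist of outbound hosts.
pub struct HostPolicy {
    /// Stored lowercase so comparisons are case-insensitive.
    allowed_hosts: BTreeSet<String>,
}

impl HostPolicy {
    pub fn new(hosts: &[&str]) -> Self {
        Self {
            allowed_hosts: hosts.iter().map(|h| h.to_ascii_lowercase()).collect(),
        }
    }

    /// Fail closed: only exact host matches pass. No suffix or wildcard matching.
    pub fn check(&self, host: &str) -> Result<(), AttachError> {
        let host = host.to_ascii_lowercase();
        if self.allowed_hosts.contains(&host) {
            Ok(())
        } else {
            Err(AttachError::HostNotAllowed(host))
        }
    }

    /// Builds the header value only after the host has been validated.
    pub fn authorization_header(&self, host: &str, token: &str) -> Result<String, AttachError> {
        self.check(host)?;
        Ok(format!("Bearer {}", token))
    }
}
```

The same `check` would need to run against every redirect target before a token is forwarded, which is why the criteria ask for redirect behavior to be explicit rather than delegated to the HTTP client's defaults.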

## Test Plan

- Unit tests for cache hit/miss, expiry, early refresh, forced refresh, refresh-token rotation, and concurrent refresh.
- Token-source abstraction tests using fake OAuth2, fake RAS-internal, and static test sources.
- OAuth2 fake-provider tests for authorization-code, refresh-token, and client-credentials flows.
- RAS-internal token source tests proving issue #13 authorization failures surface correctly.
- RAS-internal token source tests proving it does not issue tokens from local request data without #13 approval.
- RAS-internal token source tests proving service-as-service and delegated-principal request shapes produce distinct cache keys and authorization requests.
- Failure tests for invalid scope, revoked refresh token, provider unavailable, malformed response, missing grant, wrong user/session state, wrong provider/issuer, wrong redirect URI, wrong PKCE verifier, unallowed outbound host, and unallowed internal audience.
- Integration-style service test where a RAS handler calls a fake downstream API through `AuthorizedHttpClient`; assert the fake API receives the expected bearer token.
- Tests ensuring token and refresh-token values are never printed through debug/error paths.
- Tests proving non-idempotent outbound requests are not automatically replayed after token refresh unless explicitly configured.
- Tests proving a handler cannot request undeclared scopes or integrations through the default/capability-scoped API.
- Tests proving external OAuth tokens and internal RAS JWTs use distinct cache keys and cannot be sent to each other's host classes.
- Tests proving gateway-style token narrowing produces single-audience backend tokens and omits unrelated audience permissions.
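
The expiry and early-refresh cases are easiest to test when the clock is passed in rather than read inside the lease. The sketch below assumes `TokenLease` carries an `expires_at` instant, which this issue does not specify, and `needs_refresh` is a hypothetical method name.

```rust
use std::time::{Duration, Instant};

/// Simplified lease for this example; the real type would also hold the token.
pub struct TokenLease {
    pub expires_at: Instant,
}

impl TokenLease {
    /// True once the remaining lifetime at `now` is at most `skew`,
    /// including when the lease has already expired.
    pub fn needs_refresh(&self, now: Instant, skew: Duration) -> bool {
        // saturating_duration_since returns zero if `now` is past expiry.
        self.expires_at.saturating_duration_since(now) <= skew
    }
}
```

A unit test can then advance `now` by fixed offsets instead of sleeping, covering "fresh", "inside the skew window", and "already expired" deterministically.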

## Non-Goals For v1

- Do not replace inbound `AuthProvider`, `SessionService`, or existing identity crates.
- Do not implement the full RAS authorization control plane in this issue; that is tracked by #13.
- Do not implement the full reverse proxy/gateway in this issue; that should be tracked separately.
- Do not ship SQL/Redis/Vault persistence adapters in the first version.
- Do not build provider-specific SDKs for Google/Auth0/Azure/GitHub in v1.
- Do not implement non-OAuth signing schemes such as HMAC/API-key rotation in v1.

## Notes

The key design goal is to let service code say “give me a token for this integration/user/scope” and keep token authorization, acquisition, caching, refresh, bearer attachment, and realistic fakes behind reusable RAS abstractions.

For internal services, the external IdP should not be involved. The internal path should be RAS-native:

```text
RAS authorization/control plane (#13)
  -> RasInternalTokenSource
  -> TokenManager / AuthorizedHttpClient
  -> internal service-to-service request
```

