Feature: clickhouse-client --login auto-discovery of OAuth parameters over the native port

## Summary

Let `clickhouse-client --login` **auto-discover** the OAuth 2.0 server parameters (`issuer`/`oauth-url`, `client_id`, `audience`, supported flows) instead of requiring the user to pass `--oauth-url`/`--oauth-client-id`/`--oauth-audience` or to ship a credentials JSON file.

The catch is that `clickhouse-client` is configured with only the **native** port (`9000`/`9440`) and speaks the binary protocol there, while OAuth discovery is an HTTP concept whose natural home is the HTTP port (`8123`/`8443`) — a different port the client was never told about, and one that is often **not exposed at all** in hardened deployments. So the design has to bridge "client only knows the native port" to "discovery is HTTP-shaped".

The good news: the native TCP port **already** answers HTTP requests with a fallback message (the "Port 9000 is for clickhouse-client" text), and the server **already** holds every value the client needs in `<token_processors>`. This proposal wires those together.

## Motivation

Today, to log in with OAuth a user must already know — and pass on the command line — the very things the server could tell them:

```
clickhouse-client --host ch.example.com --secure \
  --login=browser \
  --oauth-url=https://issuer.example.com \
  --oauth-client-id=abcd...apps \
  --oauth-audience=https://ch.example.com
```

These values are not secrets and the server already knows them (they live in `<token_processors>` for token *verification*). Forcing every user to copy them into flags (or distribute an `oauth_client.json`) is friction and a source of drift: the client-side values and the server-side `expected_issuer`/`introspection_client_id`/`expected_audience` can disagree. The goal is:

```
clickhouse-client --host ch.example.com --secure --login
```

…and the client discovers the rest from the server it is already connecting to.

## The core constraint

- `clickhouse-client` knows only the **native** port and speaks the binary protocol there.
- Discovery is HTTP. The **HTTP port** (`8123`/`8443`) is a *different* port the client was not given, and is **frequently firewalled off** — many secure deployments expose only `9440`.

⇒ A design that says "just fetch `http://host:8123/.well-known/…`" has two holes: the client doesn't know the HTTP port, and it may be unreachable. This pushes the discovery channel onto the **native port the client already has**.

## What already exists (so this is small)

| Building block | Where | Note |
|---|---|---|
| HTTP-on-native-port fallback | `src/Server/TCPHandler.cpp:1829` `formatHTTPErrorResponseWhenUserIsConnectedToWrongPort`, fired in `receiveHello` (`src/Server/TCPHandler.cpp:1886`) when the first byte is `'G'`/`'P'` | Takes `config` + `is_secure`; **already reads `tcp_port`/`http_port`** and tells the user the HTTP port. Can also see `<token_processors>`. Works on `9440` (TLS terminates, bytes reach `receiveHello`). |
| Client OAuth flags + flows | `programs/client/Client.cpp:760` (`--login`, `--oauth-url`, `--oauth-client-id`, `--oauth-audience`, `--oauth-credentials`); flows in `src/Client/OAuthFlowRunner.cpp` (device + auth-code/PKCE) | Discovery only needs to **populate three of these values** (`--oauth-url`, `--oauth-client-id`, `--oauth-audience`) before the flow starts. Endpoints (`auth_uri`/`token_uri`/`device_auth_uri`) come from the IdP's own OIDC discovery against `issuer`. |
| Server OAuth config (source of truth) | `<token_processors>`, `src/Access/TokenProcessorsParse.cpp` | Holds `expected_issuer`, `introspection_client_id`, `expected_audience`, discovery endpoint. No `/.well-known` is served today. |
| Public-config GET endpoint (sibling) | Companion proposal in #1930 | Emits the **public subset** of `token_processors` for a browser SPA — essentially the same discovery document the CLI needs. Same registry, same security model. |

## Proposed design

### One discovery document (public subset of a `token_processors` entry — never secrets)

```json
{
  "issuer": "https://issuer.example.com",
  "client_id": "abcd...apps",
  "audience": "https://ch.example.com",
  "scopes": ["openid", "profile"],
  "flows": ["browser", "device"]
}
```

The client then runs **standard OIDC discovery** against `issuer` (`/.well-known/openid-configuration`) to resolve `authorization_endpoint`/`token_endpoint`/`device_authorization_endpoint`. This keeps the ClickHouse document minimal and reuses the IdP's own well-known, mapping cleanly onto the existing client flags (`--oauth-url` ← `issuer`, `--oauth-client-id` ← `client_id`, `--oauth-audience` ← `audience`).
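As a rough illustration of that mapping, the sketch below parses the discovery document, maps its fields onto the existing flags, and derives the IdP's OIDC discovery URL from `issuer`. The helper names (`to_client_flags`, `oidc_configuration_url`) are hypothetical and do not exist in the ClickHouse tree; treating `audience` as optional is also an assumption.

```python
import json


def to_client_flags(discovery_json: str) -> dict:
    """Map the discovery document onto the existing client flags (hypothetical helper)."""
    doc = json.loads(discovery_json)
    return {
        "--oauth-url": doc["issuer"],
        "--oauth-client-id": doc["client_id"],
        # Assumption: the server may omit audience, so it is read leniently.
        "--oauth-audience": doc.get("audience"),
    }


def oidc_configuration_url(issuer: str) -> str:
    """Standard OIDC discovery location for a given issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


example = (
    '{"issuer": "https://issuer.example.com", "client_id": "abcd...apps", '
    '"audience": "https://ch.example.com", "scopes": ["openid", "profile"], '
    '"flows": ["browser", "device"]}'
)
flags = to_client_flags(example)
print(flags["--oauth-url"])
print(oidc_configuration_url(flags["--oauth-url"]))
```

The endpoints themselves are deliberately not part of this mapping: they come from the IdP's own document at the derived URL.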

### Transport: serve it on the native port (primary), HTTP port (optional)

**Primary — native-port HTTP fallback (works everywhere the client can already reach).** Extend `receiveHello`'s HTTP path so that, when at least one `token_processors` entry is advertised for login, a request for a well-known path returns `200 OK` + the JSON above instead of the fixed `400`. The client, given `--login` with no explicit oauth flags, opens a socket to the **same native port it is already configured for** and sends a one-line HTTP `GET`:

```
GET /.well-known/clickhouse-oauth HTTP/1.0\r\n\r\n
```

- Reuses the existing hook; the function already has `config` and `is_secure`.
- Survives **`9440`-only** deployments (TLS terminates first, then the HTTP bytes reach `receiveHello` — the same path that produces today's message over `https://…:9440`).
- Keep today's human-readable `400` text for any non-discovery path so a mistaken `curl` still gets the helpful message.
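The client half of the primary transport could look roughly like the sketch below: send the one-line `GET` to the native port, read until the server closes the connection, and accept the body only when the status is `200` and it parses as a JSON object. This is a minimal sketch, not the client's actual code (which is C++); `probe`, `build_probe`, and `parse_probe_response` are hypothetical names, and the 5-second timeout is an arbitrary assumption.

```python
import json
import socket
import ssl

WELL_KNOWN_PATH = "/.well-known/clickhouse-oauth"


def build_probe() -> bytes:
    """The one-line HTTP/1.0 request from the proposal."""
    return f"GET {WELL_KNOWN_PATH} HTTP/1.0\r\n\r\n".encode("ascii")


def parse_probe_response(raw: bytes):
    """Return the discovery document, or None for the 400 fallback or any junk."""
    head, sep, body = raw.partition(b"\r\n\r\n")
    status_parts = head.split(b"\r\n", 1)[0].split()
    if not sep or len(status_parts) < 2 or status_parts[1] != b"200":
        return None
    try:
        doc = json.loads(body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    return doc if isinstance(doc, dict) else None


def probe(host: str, port: int, secure: bool, timeout: float = 5.0):
    """Fetch the document from the native port (TLS first when secure=True)."""
    raw_sock = socket.create_connection((host, port), timeout=timeout)
    if secure:
        conn = ssl.create_default_context().wrap_socket(raw_sock, server_hostname=host)
    else:
        conn = raw_sock
    with conn:
        conn.sendall(build_probe())
        chunks = []
        while True:  # HTTP/1.0 without Content-Length: the body ends at EOF
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return parse_probe_response(b"".join(chunks))
```

Returning `None` for anything other than a well-formed `200` means an older server that still answers with the fixed `400` is indistinguishable from "OAuth not advertised", which is the behavior the fallback path needs.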

**Optional — HTTP-port handler.** The same generator can also answer on the HTTP port as a normal handler (this is the GET companion in #1930) for SPAs and standards alignment. One source of truth, two transports.

### Client side

When `--login` is given and `--oauth-url`/`--oauth-client-id` are **absent**, probe the native port for the discovery document before starting the flow. Explicit flags and `--oauth-credentials` always override discovery. If discovery fails or OAuth isn't advertised, fall back to today's behavior (Cloud auto-login path / require flags) with a clear message.
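One way to express that precedence is sketched below. The text above only says that explicit flags and `--oauth-credentials` win over discovery; ranking flags above the credentials file, and merging per field rather than per source, are assumptions of this sketch. `resolve_oauth_params` is a hypothetical name.

```python
REQUIRED = ("issuer", "client_id")
FIELDS = ("issuer", "client_id", "audience")


def resolve_oauth_params(flags: dict, credentials: dict, discovered):
    """Merge sources so earlier ones win: flags, then credentials file, then discovery.

    Returns None when issuer or client_id is still unknown, which signals
    the caller to fall back to today's behavior.
    """
    resolved = {}
    for source in (flags, credentials, discovered or {}):
        for key in FIELDS:
            value = source.get(key)
            if value and key not in resolved:
                resolved[key] = value
    if any(key not in resolved for key in REQUIRED):
        return None
    return resolved


# Discovery fills in everything when nothing was passed explicitly.
print(resolve_oauth_params({}, {}, {"issuer": "https://issuer.example.com", "client_id": "abcd...apps"}))
```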

## Implementation steps

1. **Discovery generator**: a small function that maps advertised `token_processors` entries to the public JSON subset (whitelist below). Reused by both transports.
2. **Native-port transport**: in `TCPHandler.cpp`, when the first byte is `'G'`/`'P'`, read the request line; if the path matches the well-known path *and* OAuth is advertised, write `HTTP/1.0 200 OK\r\nContent-Type: application/json\r\n\r\n` + the document; otherwise emit today's `400` text unchanged.
3. **HTTP-port transport (optional)**: register the same generator as an `<http_handlers>` handler / well-known route (converges with #1930).
4. **Client discovery**: in the `--login` path (`programs/client/Client.cpp` / `src/Client/OAuth*`), if oauth flags are unset, fetch the document from the native port, run IdP OIDC discovery against `issuer`, then proceed with the existing browser/device flow.
5. **Docs**: document `--login` auto-discovery and the per-processor opt-in.
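The branch described in step 2 can be modelled in a few lines. The sketch below is Python for brevity (the real change would live in `TCPHandler.cpp`), and it makes two assumptions not stated above: only `GET` is treated as a discovery request, and `FALLBACK_400` is a placeholder for the text today's `formatHTTPErrorResponseWhenUserIsConnectedToWrongPort` produces. `respond` is a hypothetical name.

```python
import json

WELL_KNOWN_PATH = "/.well-known/clickhouse-oauth"
# Placeholder only: stands in for the existing wrong-port message, which stays unchanged.
FALLBACK_400 = b"HTTP/1.0 400 Bad Request\r\n\r\n(existing wrong-port message)\r\n"


def respond(request: bytes, document):
    """Return 200 + JSON for the well-known path when a document is advertised, else the 400."""
    request_line = request.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split(" ")
    # Assumption: only GET counts as a discovery request.
    is_discovery = len(parts) >= 2 and parts[0] == "GET" and parts[1] == WELL_KNOWN_PATH
    if is_discovery and document is not None:
        body = json.dumps(document).encode("utf-8")
        return b"HTTP/1.0 200 OK\r\nContent-Type: application/json\r\n\r\n" + body
    return FALLBACK_400


print(respond(b"GET / HTTP/1.1\r\nHost: ch.example.com\r\n\r\n", {"issuer": "x"}) == FALLBACK_400)
```

Passing `document=None` models a server with no advertised processor: every request, including the well-known path, keeps getting today's response.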

## Security considerations

- The document is served **pre-auth** and is **public** by nature (`issuer`/`client_id`/`audience` already appear in every authorize URL).
- **Strict field whitelist, never a dump.** Emit only `issuer`/`client_id`/`audience`/`scopes`/`flows`. It must never emit `introspection_client_secret`, `static_key`, `private_key`, or any JWKS private material. The whitelist *is* the security boundary.
- **Per-processor opt-in** (e.g. `<advertise_for_login>true</advertise_for_login>`) so only intended IdPs are advertised over this pre-auth channel.
- Do **not** alter the existing human-readable `400` for ordinary mistaken connections — only add a `200` branch for the explicit well-known path.
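A whitelist-based generator in the spirit of these rules might look like the sketch below. It copies only named fields out of a processor entry and returns nothing unless the entry opted in. The processor is modelled as a plain dict; pairing `introspection_client_id` with `client_id` follows the pairing used in the Motivation section, while the `scopes` and `flows` config keys are hypothetical, since the proposal does not say where those values come from.

```python
# Config key -> public document field. Anything not listed here is never emitted.
PUBLIC_FIELDS = {
    "expected_issuer": "issuer",
    "introspection_client_id": "client_id",
    "expected_audience": "audience",
    "scopes": "scopes",  # hypothetical config key
    "flows": "flows",    # hypothetical config key
}


def public_document(processor: dict):
    """Build the pre-auth discovery document, or None if the entry did not opt in."""
    if processor.get("advertise_for_login") is not True:
        return None
    return {
        public: processor[private]
        for private, public in PUBLIC_FIELDS.items()
        if private in processor
    }


entry = {
    "advertise_for_login": True,
    "expected_issuer": "https://issuer.example.com",
    "introspection_client_id": "abcd...apps",
    "introspection_client_secret": "must-never-leak",
}
print(public_document(entry))
```

Because the output is built from the whitelist rather than filtered from the input, a secret field added to the config later stays out of the document by default.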

## Alternatives considered

- **Extend the binary server `Hello`** with a protocol-versioned OAuth-metadata field (client does a pre-auth probe handshake). The most principled, fully port-agnostic, machine-readable option — but it needs a protocol revision + negotiation and more code on both sides. Good long-term direction; out of scope for v1.
- **Two-hop via the HTTP port** (parse the fallback message to learn `http_port`, then fetch `/.well-known/clickhouse-oauth` there). Standards-shaped and independently useful, but **breaks when the HTTP port isn't exposed** (common), relies on parsing a human-readable message, and adds a round trip. The optional HTTP-port handler above covers the SPA case without making the CLI depend on it.
- **Status quo** — require `--oauth-url`/`--oauth-client-id`/`--oauth-audience` or `--oauth-credentials`, or the hardcoded Cloud auto-login path. No discovery.

## Relationship to #1930

#1930 proposes server-side OAuth handlers for a browser SPA: a `POST /oauth/token` code-exchange (secret stays server-side) and a `GET` public-config endpoint sourced from `token_processors`. This issue is the **CLI counterpart**: the same public-config document, but reachable over the **native port** so `clickhouse-client --login` can discover it without knowing or reaching the HTTP port. Shared registry (`<token_processors>`), shared whitelist/security model — likely the same generator behind both transports.

---
_Drafted with [Claude Code](https://claude.com/claude-code) against the Antalya tree; file/line references from a read-through of `src/Server/TCPHandler.cpp`, `programs/client/Client.cpp`, `src/Client/OAuthFlowRunner.cpp`, and `src/Access/TokenProcessorsParse.cpp`._

