12 changes: 12 additions & 0 deletions CLAUDE.md
@@ -157,3 +157,15 @@ When making changes to the codebase:
## PR Reviews

CodeRabbit reviews PRs automatically, but it has an hourly quota and runs out of org credits. If a PR shows a "Review limit reached" / "out of usage credits" message instead of an actual review, run the `/review` skill locally against the PR to get review feedback without waiting for the quota to refill.

## PR Title and Description Maintenance

When pushing additional commits to an existing PR, check whether the title and description still describe the change accurately. They often go stale during review iterations: a flag gets renamed, an API gets reshaped, an extra fix lands, etc. The PR description is what shows up in the squash-merge commit, so a stale title/body means a misleading entry in `git log` forever.

Update them with `gh pr edit <num> --title "..." --body "..."` whenever the scope shifts. Specifically watch for:

- Flags, file names, or public APIs renamed in later commits but still referenced by their old name in the PR body.
- Bullet points in the "Summary" section that describe behavior the latest commits have changed or removed.
- The test-plan checklist getting out of date as new tests are added.
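
One way to catch the first case mechanically is to scan the PR body for identifiers that later commits renamed. The sketch below is a hypothetical helper, not part of this repository: `stale_references` and the `renames` mapping are names invented for illustration, and the body text is assumed to come from something like `gh pr view <num> --json body`.

```python
import re


def stale_references(body: str, renames: dict[str, str]) -> list[str]:
    """Return the old names from `renames` that still appear in the PR body."""
    stale = []
    for old in renames:
        # Reject matches embedded in a longer identifier (letters, digits, _ or -).
        pattern = rf"(?<![\w-]){re.escape(old)}(?![\w-])"
        if re.search(pattern, body):
            stale.append(old)
    return stale


# Hypothetical example: a flag renamed during review iterations.
body = "Adds `--cluster-root` so each relay can find its peers."
renames = {"--cluster-root": "--cluster-connect"}
print(stale_references(body, renames))
```

A non-empty result is only a prompt to reread the description; whether the old name should stay (for example in a migration note) is still a judgment call.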

When you edit a PR description you authored, keep the `(Written by Claude)` marker so reviewers still know the body wasn't human-authored.
102 changes: 37 additions & 65 deletions doc/bin/relay/cluster.md
@@ -5,96 +5,68 @@ description: Run multiple moq-relay instances across multiple hosts/regions

# Clustering

Multiple relay instances can cluster for geographic distribution and improved latency.
Relays can be joined together to proxy announcements and subscriptions between each other. A viewer talks to whichever relay is closest; if the broadcast they want lives somewhere else in the cluster, the local relay fetches it from a neighbor and caches it.

## Overview
A broadcast carries a small hop list as it travels. Each relay it passes through adds itself to the list, which is how loops are caught and how the network picks the shortest path when there's more than one.
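
The hop-list idea can be sketched in a few lines. This is a toy model under stated assumptions, not moq-relay's implementation: the `Announcement` and `Relay` classes, the `receive` method, and the rule "keep the announcement with the fewest hops" are all invented here to illustrate loop detection and shortest-path selection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Announcement:
    """A broadcast name plus the relays it has passed through (hypothetical shape)."""
    broadcast: str
    hops: tuple = ()


class Relay:
    def __init__(self, name: str) -> None:
        self.name = name
        # Best (shortest hop list) announcement seen so far, per broadcast.
        self.routes: dict = {}

    def receive(self, ann: Announcement) -> Optional[Announcement]:
        """Return the announcement to forward onward, or None if it is dropped."""
        # Loop check: our own name in the list means this copy already passed through us.
        if self.name in ann.hops:
            return None
        # Shortest path: accept only a strictly shorter hop list than the one we hold.
        best = self.routes.get(ann.broadcast)
        if best is not None and len(best.hops) <= len(ann.hops):
            return None
        self.routes[ann.broadcast] = ann
        # Add ourselves to the list before passing the announcement on.
        return Announcement(ann.broadcast, ann.hops + (self.name,))


us_east = Relay("us-east")
us_west = Relay("us-west")

origin = Announcement("demo", hops=("eu-west",))
via_east = us_east.receive(origin)      # forwarded with us-east appended
via_west = us_west.receive(via_east)    # forwarded with us-west appended
looped = us_east.receive(via_west)      # dropped: us-east is already in the hop list
direct = us_west.receive(origin)        # accepted: one hop beats the two-hop route
```

Comparing by hop count is the simplest possible metric; a real relay could weigh other factors, which this sketch does not attempt to model.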

`moq-relay` uses a simple clustering scheme:
## Topology

1. **Root node** - A single relay (can serve public traffic) that tracks cluster membership
2. **Other nodes** - Accept internet traffic and consult the root for routing
Each relay lists the peers it wants to dial in `cluster.connect`. That's it; the topology is whatever you draw with those links.

When a relay publishes a broadcast, it advertises its `node` address to other relays via the root.
A simple chain works well when one region is the source and others are caches. In the diagram, each arrow points from a relay to the peer it dials:

## Configuration
```text
eu-west <--- us-east <--- us-west
```

```toml
# us-east.toml
[cluster]
root = "https://root-relay.example.com" # Root node
node = "https://us-east.relay.example.com" # This node's address
```

### Cluster Arguments
connect = ["eu-west.example.com:4443"]

- `--cluster-root <HOST>` - Hostname/IP of the root node (omit to make this node the root)
- `--cluster-node <HOST>` - Hostname/IP of this instance (needs valid TLS cert)

## How It Works
# us-west.toml
[cluster]
connect = ["us-east.example.com:4443"]
```

1. Each relay connects to the root node on startup
2. When a publisher connects to any relay, that relay announces the broadcast
3. The root node tracks which relay has which broadcasts
4. When a subscriber connects, the relay queries the root to find the broadcast
5. Relays connect to each other to forward traffic
A publisher on `eu-west` reaches a viewer on `us-west` through `us-east`. If a second `us-west` viewer subscribes to the same broadcast, `us-east` already has it cached, so only one fetch crosses the Atlantic. A full mesh (every relay dialing every other) would skip the cache entirely and waste an outbound link per pair.

## Benefits
Pick the shape that matches your traffic. Linear chains are great for fanout; small N-way meshes are fine when latency matters more than dedup; mixed shapes work too.

- **Lower latency** - Users connect to nearest relay
- **Higher availability** - Redundancy across regions
- **Geographic distribution** - Serve global audiences
## Auto-discovery

## Example Topology
Listing every peer by hand can get tedious in larger clusters. Set `cluster.mesh` to this relay's own URL and connected peers will discover and dial it back automatically:

```text
┌─────────────┐
│ Root Node │
│ (US-C) │
└──────┬──────┘
┌───────────────┼───────────────┐
│ │ │
┌──────┴──────┐ ┌──────┴──────┐ ┌──────┴──────┐
│ US-East │ │ EU-West │ │ Asia-SE │
│ Relay │ │ Relay │ │ Relay │
└─────────────┘ └─────────────┘ └─────────────┘
```toml
[cluster]
connect = ["us-east.example.com:4443"]
mesh = "us-west.example.com:4443"
```

## Peer Authentication

Cluster peers must authenticate to each other. Two options:

### JWT token

Each leaf reads a JWT from `cluster.token` (see [Authentication](/bin/relay/auth))
and presents it to the root on connect. The token must grant cluster privileges
and full publish/subscribe access.

### mTLS (recommended for new deployments)
Each node with `mesh` set creates a broadcast carrying its address, which other nodes pick up; this address exchange is the gossip. `connect` is optional once gossip is running, but you still need at least one connection somewhere (either you dial a peer or a peer dials you) for the advertisement to flow.

Configure the root with `tls.root` pointing at the CA that signed the leaf
certificates. Leaves connect with a client certificate signed by that CA —
no JWT needed. The leaf's cluster node name is taken from the first DNS SAN on
its certificate, so identity is bound to the cert rather than self-declared.
A relay with `mesh` set and no `connect` is a passive rendezvous: it sits and waits for inbound connections, then helps everyone else find each other.
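
To make the rendezvous behavior concrete, here is a toy simulation of discovery settling. It is a sketch under assumptions, not the relay's actual mechanism: the `discover` and `reachable` functions and the `{"connect": [...], "mesh": bool}` shape are invented for illustration, advertisements are assumed to reach every relay joined by some path of links, and every link is assumed to carry advertisements in both directions.

```python
from collections import deque


def reachable(start: str, links: set) -> set:
    """Every relay that `start` can reach over the current links (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for link in links:
            if current in link:
                for other in link - {current}:
                    if other not in seen:
                        seen.add(other)
                        queue.append(other)
    return seen


def discover(relays: dict) -> set:
    """Return the set of links that exist once discovery settles.

    `relays` maps an address to {"connect": [...], "mesh": bool} (toy config shape).
    """
    # Start with the hand-configured dials.
    links = {
        frozenset((name, peer))
        for name, cfg in relays.items()
        for peer in cfg["connect"]
    }
    changed = True
    while changed:
        changed = False
        for name, cfg in relays.items():
            if not cfg["mesh"]:
                continue  # without mesh, a relay only dials its connect list
            for peer in reachable(name, links) - {name}:
                link = frozenset((name, peer))
                # Only relays with mesh set advertise an address to dial.
                if relays.get(peer, {}).get("mesh") and link not in links:
                    links.add(link)
                    changed = True
    return links


# A passive rendezvous plus two relays that only know about the rendezvous.
relays = {
    "rendezvous:4443": {"connect": [], "mesh": True},
    "us-east:4443": {"connect": ["rendezvous:4443"], "mesh": True},
    "us-west:4443": {"connect": ["rendezvous:4443"], "mesh": True},
    "lonely:4443": {"connect": [], "mesh": True},  # no connection in or out
}
links = discover(relays)
```

In this model `us-east` and `us-west` end up linked directly even though neither lists the other, while `lonely` stays isolated because its advertisement has no path to flow along.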

See [Authentication → mTLS Peer Authentication](/bin/relay/auth#mtls-peer-authentication)
for details.
## Authentication

## Current Limitations
Cluster peers must authenticate to each other:

- **Mesh topology** - All relays connect to all others
- **Not optimized for large clusters** - 3-5 nodes recommended
- **Single root node** - Future: multi-root for redundancy
- **mTLS** (recommended). Set `tls.root` to the CA that signed the cluster certificates. Inbound connections presenting a valid client cert are granted full access; outbound dials use `client.tls.cert` / `client.tls.key`.
- **JWT**. Each relay reads a token from `cluster.token` and presents it on outbound dials. The token needs broad enough scope to cover whatever paths the cluster carries.

## Production Example
See [Authentication](/bin/relay/auth) for the full setup.

The public CDN at `cdn.moq.dev` uses this clustering approach:
## Migration from older configs

- `usc.cdn.moq.dev` - US Central (root)
- `euc.cdn.moq.dev` - EU Central
- `sea.cdn.moq.dev` - Southeast Asia
`cluster.root` and `cluster.node` were both removed. If a config still sets either option, the relay errors at startup with a message pointing at the replacements:

Clients use GeoDNS to connect to the nearest relay automatically.
| Old | New |
|---|---|
| `root = "rendezvous:4443"` + `node = "us-east:4443"` | `connect = ["rendezvous:4443"]` + `mesh = "us-east:4443"` |
| `root = "rendezvous:4443"` only | `mesh = "rendezvous:4443"` (passive rendezvous) |
| `node = "us-east:4443"` | `mesh = "us-east:4443"` |
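
The table can also be read as a small mapping function. The sketch below follows the three rows literally; `migrate_cluster` is a hypothetical helper that works on an already-parsed `[cluster]` table, and it is not something the relay does for you (the relay only reports the error).

```python
def migrate_cluster(old: dict) -> dict:
    """Translate an old-style [cluster] table into the new keys, row by row."""
    root = old.get("root")
    node = old.get("node")
    # Carry over everything that was not removed (for example `token`).
    new = {key: value for key, value in old.items() if key not in ("root", "node")}
    if root and node:
        # Row 1: dial the old root, advertise the old node address.
        new["connect"] = [root]
        new["mesh"] = node
    elif root:
        # Row 2: `root` only becomes a passive rendezvous.
        new["mesh"] = root
    elif node:
        # Row 3: `node` only becomes the advertised address.
        new["mesh"] = node
    return new


print(migrate_cluster({"root": "rendezvous:4443", "node": "us-east:4443"}))
```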

## Next Steps
## Next steps

- Deploy to [Production](/bin/relay/prod)
- Set up [Authentication](/bin/relay/auth)
16 changes: 8 additions & 8 deletions doc/bin/relay/config.md
@@ -120,18 +120,18 @@ Clustering configuration for multi-relay deployments.

```toml
[cluster]
# Address of the root relay to connect to
# Omit this to make this relay the root
connect = "root.relay.example.com:4443"
# Peers this relay dials. The topology is whatever you draw with these links.
connect = ["us-east.example.com:4443"]

# JWT token file for cluster authentication
token = "cluster.jwt"
# Optional. Set to this relay's own URL to advertise it so other peers find
# you automatically.
mesh = "us-west.example.com:4443"

# This relay's address, as reachable by other cluster nodes
node = "leaf1.relay.example.com:4443"
# JWT used for outbound cluster dials (alternative to mTLS).
token = "cluster.jwt"
```

See [Clustering](/bin/relay/cluster) for deployment patterns.
See [Clustering](/bin/relay/cluster) for topology choices and the trade-off between hand-listed peers and gossip.

### \[client]

18 changes: 5 additions & 13 deletions rs/moq-relay/README.md
@@ -58,22 +58,14 @@ HTTPS is currently not supported.

## Clustering

In order to scale MoQ, you will eventually need to run multiple moq-relay instances potentially in different regions.
This is called *clustering*, where the goal is that a user connects to the closest relay and they magically form a mesh behind the scenes.
Relays can be joined together to proxy announcements and subscriptions. A viewer talks to whichever relay is closest; if the broadcast they want lives elsewhere in the cluster, the local relay fetches it from a neighbor and caches it. Hop tracking on every broadcast keeps loops out and picks the shortest path when there's more than one.

**moq-relay** uses a simple clustering scheme using moq-lite.
This is both dog-fooding and a surprisingly useful way to distribute live metadata at scale.
- `--cluster-connect <peer-url>` lists the peers this relay dials. Repeatable; defines the topology by hand. A simple chain like `eu-west <- us-east <- us-west` lets `us-east` cache and dedup the transatlantic fetches that fan out to many `us-west` viewers.
- `--cluster-mesh <self-url>` is optional. When set, this relay advertises its own URL to connected peers and dials any peers it learns about, so larger clusters don't need each node hand-configured. You still need at least one connection (in or out) so the advertisement has a path to flow. A relay with `--cluster-mesh` set and no `--cluster-connect` is a passive rendezvous.

We currently use a single "root" node that is used to discover members of the cluster and what broadcasts they offer.
This is a normal moq-relay instance, potentially serving public traffic, unaware of the fact that it's in charge of other relays.
`--cluster-root` and `--cluster-node` from earlier versions were removed. The relay errors at startup if either is set and points at the replacements.

The other moq-relay instances accept internet traffic and consult the root for routing.
They can then advertise their internal ip/hostname to other instances when publishing a broadcast.

Cluster arguments:

- `--cluster-root <HOST>`: The hostname/ip of the root node. If missing, this node is a root.
- `--cluster-node <HOST>`: The hostname/ip of this instance. There needs to be a corresponding valid TLS certificate, potentially self-signed. If missing, published broadcasts will only be available on this specific relay.
See [doc/bin/relay/cluster.md](https://github.com/moq-dev/moq/blob/main/doc/bin/relay/cluster.md) for the full walkthrough, including topology trade-offs and authentication.

## Authentication
