4 changes: 2 additions & 2 deletions docs/auth-simple-operations.md
@@ -36,7 +36,7 @@ The config is re-read on each request, so changes take effect immediately withou
}
```

-> **Note:** Only `osImages` is required. Add `gatewayAppId` after deploying the Gateway. Add `apps` entries as you deploy applications.
+> **Note:** `osImages` is always required. For KMS authorization, you must also populate `kms.mrAggregated`; if it is left empty, auth-simple denies all KMS boots. Add `gatewayAppId` after deploying the Gateway. Add `apps` entries as you deploy applications.

---

@@ -240,7 +240,7 @@ The `mrAggregated` is sent by the booting KMS in its auth request. To get this v
KMS boot auth request: { osImageHash: '0x...', mrAggregated: '0x...', ... }
```

-2. **Initial setup**: Leave `kms.mrAggregated` empty for the first KMS (empty array allows any). After it boots, check the logs and add the value.
+2. **Initial setup**: capture the first KMS measurement with `Onboard.GetAttestationInfo` or from auth logs, then add it to `kms.mrAggregated` before bootstrap. An empty array now denies all KMS boots.

### Add to Config

20 changes: 16 additions & 4 deletions docs/deployment.md
@@ -98,13 +98,16 @@ Start in separate terminals:
For production, deploy KMS and Gateway as CVMs with hardware-rooted security. Production deployments require:
- KMS running in a CVM (not on the host)
- Auth server for authorization (webhook mode)
+- KMS measurements allowlisted before bootstrap / onboarding / trusted RPCs can succeed

+If you skip the KMS allowlist step, the VM may boot and the onboard UI may still appear, but the KMS will reject bootstrap, onboarding, or later trusted RPCs with authorization errors.

### Production Checklist

**Required:**

1. Set up TDX host with dstack-vmm
-2. Deploy KMS as CVM (with auth server)
+2. Deploy KMS as CVM (with auth server, capture its attestation info, and allowlist the KMS `mrAggregated` before bootstrap)
3. Deploy Gateway as CVM

**Optional Add-ons:**
@@ -197,11 +200,16 @@ Create `auth-config.json` for initial KMS deployment:
```json
{
"osImages": ["0x<os-image-hash>"],
-"kms": { "allowAnyDevice": true },
+"kms": {
+"mrAggregated": ["0x<kms-mr-aggregated>"],
+"allowAnyDevice": true
+},
"apps": {}
}
```

+> **Important:** `auth-simple` now treats an empty `kms.mrAggregated` allowlist as deny-all for KMS. Capture the current KMS measurement with `Onboard.GetAttestationInfo` and add it before bootstrap.

Run auth-simple:

```bash
@@ -460,7 +468,6 @@ Additional KMS instances can onboard from an existing KMS to share the same root
[core.onboard]
enabled = true
auto_bootstrap_domain = "" # Empty = onboard mode
-quote_enabled = true # Require TDX attestation
address = "0.0.0.0"
port = 9203 # HTTP port for onboard UI
```
@@ -480,7 +487,12 @@ curl http://<new-kms>:9203/finish
# Restart KMS - it will now serve as a full KMS with shared keys
```

-> **Note:** For KMS onboarding with `quote_enabled = true`, add the KMS mrAggregated hash to your auth server's `kms.mrAggregated` whitelist.
+> **Note:** KMS onboarding requires attested KMS instances, and both sides must already be authorized. Add the relevant KMS `mrAggregated` hashes to your auth backend first:
+>
+> - the destination KMS must allow the source KMS
+> - the source KMS must allow the destination KMS
+>
+> If you skip this, `Onboard.Onboard` or later trusted RPCs will fail with KMS authorization errors.

---

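The onboarding note in the `docs/deployment.md` diff above requires authorization in both directions. As a minimal sketch of that rule (not part of this PR), the check below reports which direction is still missing. The `KmsPeer` shape, the function names, and the peer labels are all hypothetical; `allowlist` stands for the `kms.mrAggregated` entries of whichever auth backend each KMS consults.

```typescript
// Hypothetical model of one KMS and the allowlist its auth backend holds.
interface KmsPeer {
  name: string;         // label used only in messages
  mrAggregated: string; // this KMS's own measurement
  allowlist: string[];  // kms.mrAggregated entries this KMS's auth backend accepts
}

// True when `owner`'s auth backend lists `other`'s measurement (case-insensitive).
function allows(owner: KmsPeer, other: KmsPeer): boolean {
  const wanted = other.mrAggregated.toLowerCase();
  return owner.allowlist.some((entry) => entry.toLowerCase() === wanted);
}

// Returns one message per missing direction; an empty array means both sides are authorized.
function missingOnboardAuthorizations(source: KmsPeer, destination: KmsPeer): string[] {
  const missing: string[] = [];
  if (!allows(destination, source)) {
    missing.push(`${destination.name} does not allow ${source.name}`);
  }
  if (!allows(source, destination)) {
    missing.push(`${source.name} does not allow ${destination.name}`);
  }
  return missing;
}

const sourceKms: KmsPeer = { name: 'kms-a', mrAggregated: '0xaaa', allowlist: ['0xaaa'] };
const destinationKms: KmsPeer = { name: 'kms-b', mrAggregated: '0xbbb', allowlist: ['0xaaa', '0xbbb'] };
console.log(missingOnboardAuthorizations(sourceKms, destinationKms));
// [ 'kms-a does not allow kms-b' ]
```

When both instances share one auth backend, the two allowlists are the same array and a single config update covers both directions.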
2 changes: 0 additions & 2 deletions docs/tutorials/kms-build-configuration.md
@@ -207,7 +207,6 @@ url = "http://127.0.0.1:9200"
[core.onboard]
enabled = true
auto_bootstrap_domain = ""
-quote_enabled = true
address = "0.0.0.0"
port = 9100
EOF
@@ -495,7 +494,6 @@ enabled = true
# Empty domain = manual bootstrap mode (ensures bootstrap-info.json is written)
auto_bootstrap_domain = ""
# Enable TDX quotes - works because KMS runs in CVM
-quote_enabled = true
address = "0.0.0.0"
port = 9100
EOF
32 changes: 30 additions & 2 deletions docs/tutorials/kms-cvm-deployment.md
@@ -188,7 +188,6 @@ configs:
[core.onboard]
enabled = true
auto_bootstrap_domain = ""
-quote_enabled = true
address = "0.0.0.0"
port = 9100
EOF
@@ -314,20 +313,49 @@ Onboarding
```

> **Important:** KMS is now in onboard mode — a plain HTTP server waiting for bootstrap. It will **not** serve TLS or respond to `KMS.GetMeta` until you complete the next step.
+>
+> **Critical prerequisite:** before bootstrap can succeed, the KMS must already be authorized by your auth backend.
+>
+> - For `auth-simple`, add the KMS `mrAggregated` to `kms.mrAggregated`
+> - For `auth-eth`, add the KMS `mrAggregated` on-chain with `addKmsAggregatedMr(...)`
+>
+> You can fetch the value before bootstrap with:
+>
+> ```bash
+> curl -s -X POST \
+> -H "Content-Type: application/json" \
+> -d '{}' \
+> "http://localhost:9100/prpc/Onboard.GetAttestationInfo?json" | jq .
+> ```
+>
+> If you skip this step, `Onboard.Bootstrap` will fail with a KMS authorization error and the KMS will not enter normal service.
+>
+> **Pre-bootstrap checklist:**
+>
+> 1. `Onboard.GetAttestationInfo` returns the current KMS measurement
+> 2. that `mrAggregated` has been allowlisted in your auth backend
+> 3. the auth backend is reachable from the KMS CVM
+> 4. you are still calling the onboard HTTP endpoint, not the post-bootstrap TLS endpoint

### Step 6: Bootstrap KMS

With KMS in onboard mode, trigger key generation by calling the Bootstrap RPC endpoint. This generates root keys, a TDX attestation quote, and writes `bootstrap-info.json`:

```bash
+# Inspect the KMS measurement before bootstrap
+curl -s -X POST \
+-H "Content-Type: application/json" \
+-d '{}' \
+"http://localhost:9100/prpc/Onboard.GetAttestationInfo?json" | jq .
+
# Replace kms.yourdomain.com with your actual KMS domain
curl -s -X POST \
-H "Content-Type: application/json" \
-d '{"domain":"kms.yourdomain.com"}' \
"http://localhost:9100/prpc/Onboard.Bootstrap?json" | tee ~/kms-deploy/bootstrap-info.json | jq .
```

-> **Note:** This uses plain `http://` — KMS is still in onboard mode (no TLS yet). The `tee` command saves the response to `bootstrap-info.json` while also displaying it. You'll need this file later to register KMS on-chain.
+> **Note:** This uses plain `http://` — KMS is still in onboard mode (no TLS yet). The `tee` command saves the response to `bootstrap-info.json` while also displaying it. You'll need this file later to register KMS on-chain. If this call fails with a KMS authorization error, allowlist the `mrAggregated` value first and retry.

Expected response:

6 changes: 2 additions & 4 deletions docs/tutorials/troubleshooting-kms-deployment.md
@@ -241,7 +241,7 @@ export DSTACK_VMM_AUTH_PASSWORD=$(cat ~/.dstack/secrets/vmm-auth-token)
"quote": null
```

-This indicates quote_enabled might be false, guest-agent issues, or **SGX not properly configured**:
+This indicates guest-agent issues, simulator misconfiguration, or **SGX not properly configured**:

```bash
# Check CVM logs for TDX-related errors (replace VM_ID with actual ID from lsvm)
@@ -259,9 +259,7 @@ curl -s -H "Authorization: Bearer $(cat ~/.dstack/secrets/vmm-auth-token)" \

2. **SGX Auto MP Registration not enabled** - Without this BIOS setting, your platform isn't registered with Intel's PCS, and attestation quotes cannot be verified. Re-enter BIOS and enable "SGX Auto MP Registration".

-3. **quote_enabled is false** - Verify your `kms.toml` has `quote_enabled = true` in the `[core.onboard]` section.
-
-4. **Guest-agent not running** - The `/var/run/dstack.sock` socket must exist inside the CVM.
+3. **Guest-agent / simulator not running** - The KMS must be able to reach a working dstack guest agent endpoint. In a real CVM, `/var/run/dstack.sock` must exist. For local development, start `sdk/simulator` first.

### CVM Fails with "QGS error code: 0x12001"

9 changes: 5 additions & 4 deletions kms/auth-simple/README.md
@@ -20,12 +20,13 @@ bun install

Create `auth-config.json` (see `auth-config.example.json`).

-For initial KMS deployment, you only need the OS image hash:
+For KMS deployment, you must allowlist both the OS image hash and the KMS `mrAggregated` value:

```json
{
"osImages": ["0x0b327bcd642788b0517de3ff46d31ebd3847b6c64ea40bacde268bb9f1c8ec83"],
"kms": {
+"mrAggregated": ["0x<kms-mr-aggregated>"],
"allowAnyDevice": true
},
"apps": {}
@@ -39,7 +40,7 @@ Add more fields as you deploy Gateway and apps:
"osImages": ["0x..."],
"gatewayAppId": "0x...",
"kms": {
-"mrAggregated": [],
+"mrAggregated": ["0x..."],
"devices": [],
"allowAnyDevice": true
},
@@ -59,7 +60,7 @@ Add more fields as you deploy Gateway and apps:
|-------|----------|-------------|
| `osImages` | Yes | Allowed OS image hashes (from `digest.txt`) |
| `gatewayAppId` | No | Gateway app ID (add after Gateway deployment) |
-| `kms.mrAggregated` | No | Allowed KMS aggregated MR values |
+| `kms.mrAggregated` | Yes for KMS authorization | Allowed KMS aggregated MR values. An empty array denies all KMS boots. |
| `kms.devices` | No | Allowed KMS device IDs |
| `kms.allowAnyDevice` | No | If true, skip device ID check for KMS |
| `apps.<appId>.composeHashes` | No | Allowed compose hashes for this app |
@@ -160,7 +161,7 @@ KMS boot authorization.

1. `tcbStatus` must be "UpToDate"
2. `osImageHash` must be in `osImages` array
-3. `mrAggregated` must be in `kms.mrAggregated` (if non-empty)
+3. `mrAggregated` must be in `kms.mrAggregated`
4. `deviceId` must be in `kms.devices` (unless `allowAnyDevice` is true)

### App Boot Validation
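The four KMS boot validation rules in the README diff above can be read as an ordered sequence of deny checks. The following is a sketch under stated assumptions, not the code in `index.ts`: the interfaces, `isListed`, `checkKmsBoot`, and every reason string except `'aggregated MR not allowed'` are hypothetical, and `normalizeHex` is a stand-in because its real implementation is not shown in this diff.

```typescript
// Field names follow the README table; the interfaces themselves are illustrative.
interface KmsBootInfo {
  tcbStatus: string;
  osImageHash: string;
  mrAggregated: string;
  deviceId: string;
}

interface KmsAuthConfig {
  osImages: string[];
  kms: { mrAggregated: string[]; devices?: string[]; allowAnyDevice?: boolean };
}

interface BootDecision {
  isAllowed: boolean;
  reason: string;
}

// Hypothetical stand-in: lowercases and ensures a 0x prefix.
function normalizeHex(value: string): string {
  const lower = value.toLowerCase();
  return lower.startsWith('0x') ? lower : `0x${lower}`;
}

function isListed(allowlist: string[], value: string): boolean {
  return allowlist.map(normalizeHex).includes(normalizeHex(value));
}

function checkKmsBoot(info: KmsBootInfo, config: KmsAuthConfig): BootDecision {
  const deny = (reason: string): BootDecision => ({ isAllowed: false, reason });

  // Rule 1: TCB status must be exactly "UpToDate".
  if (info.tcbStatus !== 'UpToDate') return deny('TCB status is not up to date');
  // Rule 2: the OS image hash must be allowlisted.
  if (!isListed(config.osImages, info.osImageHash)) return deny('OS image not allowed');
  // Rule 3: no "if non-empty" exception any more, so an empty list matches nothing.
  if (!isListed(config.kms.mrAggregated, info.mrAggregated)) return deny('aggregated MR not allowed');
  // Rule 4: the device check is skipped only when allowAnyDevice is true.
  if (!config.kms.allowAnyDevice && !isListed(config.kms.devices ?? [], info.deviceId)) {
    return deny('KMS device not allowed');
  }
  return { isAllowed: true, reason: '' };
}

const exampleInfo: KmsBootInfo = {
  tcbStatus: 'UpToDate',
  osImageHash: '0x01',
  mrAggregated: '0xabc123',
  deviceId: '0xdev',
};
const emptyAllowlist: KmsAuthConfig = { osImages: ['0x01'], kms: { mrAggregated: [], allowAnyDevice: true } };
console.log(checkKmsBoot(exampleInfo, emptyAllowlist));
// { isAllowed: false, reason: 'aggregated MR not allowed' }
```

Ordering the checks this way means the first failing rule determines the reason, which matches the new test's expectation that the reason mentions the MR when only the allowlist is empty.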
4 changes: 3 additions & 1 deletion kms/auth-simple/auth-config.example.json
@@ -3,7 +3,9 @@
"0x0b327bcd642788b0517de3ff46d31ebd3847b6c64ea40bacde268bb9f1c8ec83"
],
"kms": {
-"mrAggregated": [],
+"mrAggregated": [
+"0x<kms-mr-aggregated>"
+],
"devices": [],
"allowAnyDevice": true
},
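Because the example config above now ships with a `0x<kms-mr-aggregated>` placeholder, and an empty allowlist denies every KMS boot, a small preflight check before starting auth-simple may catch both mistakes early. This is a hypothetical helper, not something the PR adds: `findKmsAllowlistProblems` and `AuthConfigLike` are invented names, and the "0x-prefixed hex" rule is an assumed convention that may be stricter than what the service itself accepts.

```typescript
// Only the part of auth-config.json that this check needs.
interface AuthConfigLike {
  kms?: { mrAggregated?: string[] };
}

// Returns human-readable problems; an empty array means the allowlist looks usable.
function findKmsAllowlistProblems(config: AuthConfigLike): string[] {
  const entries = config.kms?.mrAggregated ?? [];
  if (entries.length === 0) {
    return ['kms.mrAggregated is empty, so every KMS boot will be denied'];
  }
  const problems: string[] = [];
  for (const entry of entries) {
    if (entry.includes('<')) {
      // Catches template values such as "0x<kms-mr-aggregated>".
      problems.push(`placeholder not replaced: ${entry}`);
    } else if (!/^0x[0-9a-fA-F]+$/.test(entry)) {
      // Assumed convention: measurements are written as 0x-prefixed hex.
      problems.push(`not a 0x-prefixed hex string: ${entry}`);
    }
  }
  return problems;
}

console.log(findKmsAllowlistProblems({ kms: { mrAggregated: ['0x<kms-mr-aggregated>'] } }));
// [ 'placeholder not replaced: 0x<kms-mr-aggregated>' ]
```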
4 changes: 2 additions & 2 deletions kms/auth-simple/bun.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

31 changes: 29 additions & 2 deletions kms/auth-simple/index.test.ts
@@ -75,7 +75,10 @@ describe('auth-simple', () => {
writeTestConfig({
gatewayAppId: '0xgateway',
osImages: ['0x1fbb0cf9cc6cfbf23d6b779776fabad2c5403d643badb9e5e238615e4960a78a'],
-kms: { allowAnyDevice: true }
+kms: {
+mrAggregated: ['0xabc123'],
+allowAnyDevice: true
+}
});

const res = await app.fetch(new Request('http://localhost/bootAuth/kms', {
@@ -93,7 +96,10 @@ describe('auth-simple', () => {
writeTestConfig({
gatewayAppId: '0xgateway',
osImages: ['0xdifferentimage'],
-kms: { allowAnyDevice: true }
+kms: {
+mrAggregated: ['0xabc123'],
+allowAnyDevice: true
+}
});

const res = await app.fetch(new Request('http://localhost/bootAuth/kms', {
@@ -128,6 +134,27 @@ describe('auth-simple', () => {
expect(json.reason).toContain('MR');
});

+it('rejects KMS boot when the allowlist is empty', async () => {
+writeTestConfig({
+gatewayAppId: '0xgateway',
+osImages: ['0x1fbb0cf9cc6cfbf23d6b779776fabad2c5403d643badb9e5e238615e4960a78a'],
+kms: {
+mrAggregated: [],
+allowAnyDevice: true
+}
+});
+
+const res = await app.fetch(new Request('http://localhost/bootAuth/kms', {
+method: 'POST',
+headers: { 'Content-Type': 'application/json' },
+body: JSON.stringify(baseBootInfo)
+}));
+const json = await res.json();
+
+expect(json.isAllowed).toBe(false);
+expect(json.reason).toContain('MR');
+});
+
it('allows KMS boot with allowAnyDevice', async () => {
writeTestConfig({
gatewayAppId: '0xgateway',
2 changes: 1 addition & 1 deletion kms/auth-simple/index.ts
@@ -122,7 +122,7 @@ class ConfigBackend {

// check aggregated MR
const allowedMrs = config.kms.mrAggregated.map(normalizeHex);
-if (allowedMrs.length > 0 && !allowedMrs.includes(mrAggregated)) {
+if (!allowedMrs.includes(mrAggregated)) {
return {
isAllowed: false,
reason: 'aggregated MR not allowed',
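The one-line change in `index.ts` above flips the empty-allowlist case from allow-all to deny-all. To make the difference concrete, here is a minimal side-by-side sketch of the two predicates. The function names `mrAllowedBefore` and `mrAllowedAfter` are hypothetical, and `normalizeHex` is a stand-in (lowercase plus `0x` prefix) since the real helper is not shown in this diff; the sketch also normalizes the incoming value itself, which the real code may do elsewhere.

```typescript
// Hypothetical stand-in for the real normalizeHex in index.ts.
function normalizeHex(value: string): string {
  const lower = value.toLowerCase();
  return lower.startsWith('0x') ? lower : `0x${lower}`;
}

// Old condition: deny only when the list is non-empty AND the value is missing.
function mrAllowedBefore(allowlist: string[], mrAggregated: string): boolean {
  const allowedMrs = allowlist.map(normalizeHex);
  const denied = allowedMrs.length > 0 && !allowedMrs.includes(normalizeHex(mrAggregated));
  return !denied;
}

// New condition: deny whenever the value is missing, including from an empty list.
function mrAllowedAfter(allowlist: string[], mrAggregated: string): boolean {
  const allowedMrs = allowlist.map(normalizeHex);
  return allowedMrs.includes(normalizeHex(mrAggregated));
}

console.log(mrAllowedBefore([], '0xabc123'));          // true  (empty list allowed any KMS)
console.log(mrAllowedAfter([], '0xabc123'));           // false (empty list now denies all)
console.log(mrAllowedAfter(['0xABC123'], '0xabc123')); // true  (listed value still passes)
```

For non-empty allowlists the two predicates agree, so only deployments that relied on an empty `kms.mrAggregated` see a behavior change.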
1 change: 0 additions & 1 deletion kms/dstack-app/compose-dev.yaml
@@ -80,6 +80,5 @@ configs:
[core.onboard]
enabled = true
auto_bootstrap_domain = ""
-quote_enabled = true
address = "0.0.0.0"
port = 8000
1 change: 0 additions & 1 deletion kms/dstack-app/compose-simple.yaml
@@ -56,6 +56,5 @@ configs:
[core.onboard]
enabled = true
auto_bootstrap_domain = ""
-quote_enabled = true
address = "0.0.0.0"
port = 8000
1 change: 0 additions & 1 deletion kms/kms.toml
@@ -45,6 +45,5 @@ gateway_app_id = "any"
[core.onboard]
enabled = true
auto_bootstrap_domain = ""
-quote_enabled = true
address = "0.0.0.0"
port = 8000
1 change: 0 additions & 1 deletion kms/src/config.rs
@@ -118,6 +118,5 @@ pub(crate) struct Dev {
#[derive(Debug, Clone, Deserialize)]
pub(crate) struct OnboardConfig {
pub enabled: bool,
-pub quote_enabled: bool,
pub auto_bootstrap_domain: String,
}
7 changes: 1 addition & 6 deletions kms/src/main_service.rs
@@ -99,9 +99,6 @@ struct BootConfig {

impl RpcHandler {
async fn ensure_self_allowed(&self) -> Result<()> {
-if !self.state.config.onboard.quote_enabled {
-return Ok(());
-}
let boot_info = self
.state
.self_boot_info
@@ -355,9 +352,7 @@ impl KmsRpc for RpcHandler {
self.ensure_self_allowed()
.await
.context("KMS self authorization failed")?;
-if self.state.config.onboard.quote_enabled {
-let _info = self.ensure_kms_allowed(&request.vm_config).await?;
-}
+let _info = self.ensure_kms_allowed(&request.vm_config).await?;
Ok(KmsKeyResponse {
temp_ca_key: self.state.inner.temp_ca_key.clone(),
keys: vec![KmsKeys {