11 changes: 11 additions & 0 deletions Cargo.lock

14 changes: 13 additions & 1 deletion Cargo.toml
@@ -26,11 +26,15 @@ path = "src/main.rs"
name = "witness"
path = "src/bin/witness.rs"

[[bin]]
name = "siglog-import"
path = "src/bin/import.rs"

[dependencies]
# Web framework
axum = "0.8"
tokio = { version = "1", features = ["full"] }
tower-http = { version = "0.6", features = ["cors", "trace"] }
tower-http = { version = "0.6", features = ["cors", "trace", "timeout"] }
tower = "0.5"
tower_governor = "0.8"

@@ -79,6 +83,14 @@ indicatif = "0.18.3"
# Optimization
smallvec = "1.13"

# WAL entry checksums
crc32fast = "1"

[profile.release]
# Size/index arithmetic guards the Merkle tree and witness rollback
# protection; wrap-on-overflow must never silently corrupt those checks.
overflow-checks = true

[dev-dependencies]
tempfile = "3"
portpicker = "0.1"
66 changes: 57 additions & 9 deletions README.md
@@ -84,12 +84,18 @@ cargo build --release
| `S3_REGION` | S3 region | `auto` |
| `API_KEY` | Bearer token required for `/add` writes | Required unless `ALLOW_PUBLIC_WRITES=true` |
| `ALLOW_PUBLIC_WRITES` | Allow unauthenticated `/add` writes for local development | `false` |
| `EXTERNAL_WITNESSES` | External witnesses to collect cosignatures from, comma-separated. Format: `name=url=vkey` — the note-format verification key is required and cosignatures are verified against it before a checkpoint is published | - |
| `WITNESS_QUORUM` | Minimum number of external witness cosignatures required to publish a checkpoint | All configured witnesses |
| `WITNESS_KEYS` | In-process witness private keys for local development (comma-separated) | - |
| `VINDEX_SNAPSHOT_INTERVAL` | Entries between vindex snapshots. Each snapshot persists the full index and truncates the WAL, bounding WAL growth and startup replay time (0 disables) | `100000` |
| `RATE_LIMIT_PER_SECOND` | Requests per second allowed per client IP | `100` |
| `RATE_LIMIT_BURST_SIZE` | Burst capacity per client IP | `200` |
| `CHECKPOINT_INTERVAL` | Checkpoint frequency (seconds) | `1` |
| `BATCH_MAX_SIZE` | Max entries per batch | `256` |
| `BATCH_MAX_AGE_MS` | Max batch age (ms) | `1000` |
| `VINDEX_ENABLED` | Enable verifiable index | `false` |
| `VINDEX_KEY_FIELD` | JSON field for key extraction | `name` |
| `VINDEX_WAL_PATH` | WAL path for persistent vindex recovery | Required when enabling vindex on a non-empty log |
| `VINDEX_WAL_PATH` | WAL path for persistent vindex state (snapshot is stored alongside as `<path>.snapshot`). If the on-disk state is missing, corrupted, or behind the database, the vindex is automatically rebuilt from the log's entry bundles | Recommended when enabling vindex |
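
The `name=url=vkey` format for `EXTERNAL_WITNESSES` and the "all configured witnesses" default for `WITNESS_QUORUM` can be illustrated with a small parser. This is a minimal sketch, not the project's actual parsing code: `WitnessSpec`, `parse_external_witnesses`, and `effective_quorum` are hypothetical names, and the sketch assumes the witness name and URL contain no `=` themselves.

```rust
/// One configured external witness (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
struct WitnessSpec {
    name: String,
    url: String,
    vkey: String,
}

/// Parse a comma-separated list of `name=url=vkey` items.
/// Assumption: neither the name nor the URL contains '='.
fn parse_external_witnesses(raw: &str) -> Result<Vec<WitnessSpec>, String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(|item| {
            // splitn(3) leaves any '=' base64 padding inside the vkey untouched.
            let mut parts = item.splitn(3, '=');
            match (parts.next(), parts.next(), parts.next()) {
                (Some(name), Some(url), Some(vkey))
                    if !name.is_empty() && !url.is_empty() && !vkey.is_empty() =>
                {
                    Ok(WitnessSpec {
                        name: name.to_string(),
                        url: url.to_string(),
                        vkey: vkey.to_string(),
                    })
                }
                // A missing vkey is an error: the verification key is required.
                _ => Err(format!("expected name=url=vkey, got {item:?}")),
            }
        })
        .collect()
}

/// When no quorum is configured, every configured witness must cosign.
fn effective_quorum(configured: Option<usize>, witnesses: usize) -> usize {
    configured.unwrap_or(witnesses)
}
```

Splitting into at most three parts keeps the verification key whole even when its base64 portion ends in `=` padding.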

#### Witness Server (`witness`)

@@ -206,16 +212,18 @@ A witness independently verifies and co-signs transparency log checkpoints. Runn

#### POST /add-checkpoint

Request body:
```json
{
"checkpoint": "log.example.com\n123\nROOTHASH...\n\n- log.example.com SIGNATURE...",
"proof": ["HASH1...", "HASH2..."],
"old_size": 100
}
Request body (text/plain, per [c2sp.org/tlog-witness](https://c2sp.org/tlog-witness)):
```text
old <size>
<base64 consistency proof hash>
...

<checkpoint text with log signature>
```

Response (on success): The witness's cosignature line.
Response (on success): the witness's [cosignature/v1](https://c2sp.org/tlog-cosignature)
line — a timestamped Ed25519 signature whose key ID is computed with the
cosignature/v1 algorithm byte (0x04).
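
To make the text/plain request layout concrete, the following sketch assembles a body from its three parts: the `old <size>` line, the base64 consistency proof hashes, and the signed checkpoint. `build_add_checkpoint_body` is a hypothetical helper, not part of this repository; it assumes the proof hashes are already base64-encoded and the checkpoint text is already signed by the log.

```rust
/// Assemble a `POST /add-checkpoint` body in the text/plain layout described
/// in the README. Hypothetical helper for illustration only.
///
/// * `old_size`   - tree size the witness last cosigned
/// * `proof`      - consistency proof hashes, already base64-encoded
/// * `checkpoint` - checkpoint text including the log signature
fn build_add_checkpoint_body(old_size: u64, proof: &[&str], checkpoint: &str) -> String {
    let mut body = format!("old {old_size}\n");
    for hash in proof {
        body.push_str(hash);
        body.push('\n');
    }
    // A blank line separates the proof from the checkpoint.
    body.push('\n');
    body.push_str(checkpoint);
    body
}
```

With an empty proof the body is just the `old` line, a blank line, and the checkpoint.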

## API Reference

@@ -268,6 +276,46 @@ curl http://localhost:8080/tile/0/000
curl http://localhost:8080/tile/entries/000
```

## Bulk import (backfill)

Bootstrapping a log with existing data should not go through `POST /add` —
the incremental path acknowledges entries batch-by-batch and rewrites each
partial tile up to 256 times. `siglog-import` builds the tree in one pass
with concurrent uploads and produces a byte-identical tree to what
incremental integration would create (~5,000 entries/s locally vs ~36/s
over HTTP).
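
The "up to 256 times" figure can be reproduced with a rough model that counts how often level-0 tiles are written when entries arrive in batches. This is a simplified sketch under stated assumptions, not the importer's logic: it assumes 256-entry tiles, assumes each batch rewrites every level-0 tile it touches, and ignores higher tree levels and entry bundles. `tile_writes` is a hypothetical name.

```rust
/// Entries per full tile (assumed here to match the 256 quoted above).
const TILE_WIDTH: u64 = 256;

/// Count level-0 tile writes when `total` entries are integrated in batches
/// of `batch` entries. Simplified model: each batch writes every level-0
/// tile it touches once; higher levels are ignored.
fn tile_writes(total: u64, batch: u64) -> u64 {
    assert!(batch > 0, "batch size must be positive");
    let mut writes = 0;
    let mut start = 0;
    while start < total {
        let end = (start + batch).min(total);
        // Tiles touched by entries start..end (end is exclusive).
        writes += (end - 1) / TILE_WIDTH - start / TILE_WIDTH + 1;
        start = end;
    }
    writes
}
```

Under this model, integrating 256 entries one at a time writes the same tile 256 times, while a single pass over all entries writes each tile once.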

```bash
# 1. Convert conda repodata to normalized JSONL (one file per subdir)
conda-log-ingest --file linux-64/repodata.json --subdir linux-64 \
--jsonl-out linux-64.jsonl

# 2. Import into an EMPTY log (server must not be running)
siglog-import \
--origin conda.prefix.dev \
--database-url 'sqlite:/data/siglog.db?mode=rwc' \
--storage-backend s3 \
--jsonl noarch.jsonl --jsonl linux-64.jsonl \
--epoch-note "conda-forge bootstrap $(date -u +%F), repodata sha256 ..." \
--vindex-wal-path /data/vindex.wal

# 3. Start the server; it continues incrementally from the imported state.
```

The import writes the database state only after every tile and bundle is
durably uploaded, so an interrupted run can be retried. Pass `--resume` to
skip objects that already exist; the retry must use the same `--chunk-size`
and the same input files. On Fly.io, run the import on a one-off machine that
holds the data volume:

```bash
fly machine destroy <server-machine> --force # volume must be free
fly machine run <image> --volume siglog_data:/data --entrypoint sleep -- infinity
fly ssh sftp shell # upload the .jsonl files to /data/
fly ssh console -C "siglog-import --origin ... --jsonl /data/noarch.jsonl ..."
fly machine destroy <import-machine> --force
fly deploy # recreate the server on the imported state
```

## Deployment

### Fly.io
45 changes: 44 additions & 1 deletion crates/conda-monitor/src/bin/ingest.rs
@@ -41,6 +41,16 @@ struct Args {
#[arg(long)]
dry_run: bool,

/// Write normalized entries as JSONL to this file instead of submitting
/// them over HTTP. Feed the output to `siglog-import` for bulk
/// bootstrapping.
#[arg(long)]
jsonl_out: Option<String>,

/// API key for authenticating write requests (Bearer token)
#[arg(long, env = "API_KEY")]
api_key: Option<String>,

/// Number of entries to process (for testing)
#[arg(long)]
limit: Option<usize>,
@@ -115,6 +125,12 @@ fn main() -> anyhow::Result<()> {
let mut error_count = 0;
let mut indices: HashMap<String, u64> = HashMap::new();

let mut jsonl_writer: Option<std::io::BufWriter<std::fs::File>> = args
.jsonl_out
.as_ref()
.map(|path| std::fs::File::create(path).map(std::io::BufWriter::new))
.transpose()?;

for (filename, entry) in all_packages.into_iter().take(total) {
pb.set_message(filename.to_string());

@@ -128,6 +144,15 @@ fn main() -> anyhow::Result<()> {

let json_bytes = normalized.to_normalized_json();

if let Some(writer) = &mut jsonl_writer {
use std::io::Write;
writer.write_all(&json_bytes)?;
writer.write_all(b"\n")?;
success_count += 1;
pb.inc(1);
continue;
}

if args.dry_run {
// Just print first few for verification
if success_count < 3 {
@@ -143,7 +168,11 @@ fn main() -> anyhow::Result<()> {

// Submit to log
let add_url = format!("{}/add", args.log_url.trim_end_matches('/'));
match client.post(&add_url).body(json_bytes.clone()).send() {
let mut request = client.post(&add_url).body(json_bytes.clone());
if let Some(key) = &args.api_key {
request = request.header("Authorization", format!("Bearer {}", key));
}
match request.send() {
Ok(resp) => {
if resp.status().is_success() {
if let Ok(text) = resp.text() {
@@ -172,6 +201,20 @@ fn main() -> anyhow::Result<()> {

pb.finish_with_message("Done!");

if let Some(mut writer) = jsonl_writer {
use std::io::Write;
writer.flush()?;
println!("\n=== JSONL Export ===");
println!("Subdir: {}", args.subdir);
println!("Entries written: {}", success_count);
println!("Skipped: {}", skip_count);
println!(
"Output: {} (feed to siglog-import for bulk bootstrap)",
args.jsonl_out.as_deref().unwrap_or_default()
);
return Ok(());
}

println!("\n=== Ingestion Summary ===");
println!("Subdir: {}", args.subdir);
println!("Submitted: {}", success_count);
7 changes: 5 additions & 2 deletions docker/Dockerfile.local
@@ -32,8 +32,11 @@ COPY --from=builder /app/target/release/siglog /usr/local/bin/siglog
COPY --from=builder /app/target/release/witness /usr/local/bin/witness
COPY --from=builder /app/target/release/conda-monitor /usr/local/bin/conda-monitor

# Create data directories
RUN mkdir -p /data
# Run as a non-root user with a writable data directory
RUN useradd --system --uid 10001 --create-home --home-dir /data app \
&& mkdir -p /data \
&& chown -R app:app /data
USER app

WORKDIR /data

Expand Down
15 changes: 10 additions & 5 deletions docker/Dockerfile.server
@@ -14,8 +14,8 @@ COPY Cargo.toml Cargo.lock ./
COPY src ./src
COPY crates ./crates

# Build release binary
RUN cargo build --release --bin siglog
# Build release binaries (server + bulk importer)
RUN cargo build --release --bin siglog --bin siglog-import

# Runtime stage
FROM debian:bookworm-slim
@@ -24,18 +24,23 @@ RUN apt-get update && apt-get install -y \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*

# Copy binary from builder
# Copy binaries from builder
COPY --from=builder /app/target/release/siglog /usr/local/bin/siglog
COPY --from=builder /app/target/release/siglog-import /usr/local/bin/siglog-import

# Create data directory
RUN mkdir -p /data
# Run as a non-root user with a writable data directory
RUN useradd --system --uid 10001 --create-home --home-dir /data siglog \
&& mkdir -p /data \
&& chown -R siglog:siglog /data
USER siglog

# Default environment variables
ENV LISTEN_ADDR=0.0.0.0:8080
ENV DATABASE_URL=sqlite:/data/tessera.db?mode=rwc
ENV STORAGE_BACKEND=fs
ENV FS_ROOT=/data/tiles

VOLUME /data
EXPOSE 8080

ENTRYPOINT ["siglog"]
8 changes: 6 additions & 2 deletions docker/Dockerfile.witness
@@ -27,13 +27,17 @@ RUN apt-get update && apt-get install -y \
# Copy binary from builder
COPY --from=builder /app/target/release/witness /usr/local/bin/witness

# Create data directory
RUN mkdir -p /data
# Run as a non-root user with a writable data directory
RUN useradd --system --uid 10001 --create-home --home-dir /data witness \
&& mkdir -p /data \
&& chown -R witness:witness /data
USER witness

# Default environment variables
ENV LISTEN_ADDR=0.0.0.0:8081
ENV DATABASE_URL=sqlite:/data/witness.db?mode=rwc

VOLUME /data
EXPOSE 8081

ENTRYPOINT ["witness"]
20 changes: 15 additions & 5 deletions docker/docker-compose.yml
@@ -2,7 +2,14 @@
#
# Setup:
# Create a .env file with LOG_PRIVATE_KEY, LOG_PUBLIC_KEY,
# WITNESS_PRIVATE_KEY, and MONITOR_PRIVATE_KEY.
# WITNESS_PRIVATE_KEY, WITNESS_PUBLIC_KEY, MONITOR_PRIVATE_KEY,
# and MONITOR_PUBLIC_KEY.
#
# Generate the witness key with name "local.dev/witness" and the monitor
# key with name "local.dev/monitor" (see README "Key Format"): the name
# embedded in each key must match the witness name configured in
# EXTERNAL_WITNESSES below, and the public keys are pinned so the log can
# verify cosignatures.
#
# Usage:
# docker compose -f docker/docker-compose.yml build
@@ -31,17 +38,19 @@ services:
- --storage-backend=fs
- --fs-root=/data/tiles
- --origin=local.dev/log
- --private-key=${LOG_PRIVATE_KEY}
- --listen=0.0.0.0:8080
- --checkpoint-interval=1
- --batch-max-size=256
- --batch-max-age-ms=500
- --allow-public-writes
- --vindex-enabled
- --vindex-key-field=name
- --external-witnesses=witness=http://witness:8080,monitor=http://monitor:8080
- --vindex-wal-path=/data/vindex.wal
environment:
RUST_LOG: info,siglog=debug
# Secrets via environment, not argv (argv is visible in docker inspect / ps)
LOG_PRIVATE_KEY: ${LOG_PRIVATE_KEY}
EXTERNAL_WITNESSES: "local.dev/witness=http://witness:8080=${WITNESS_PUBLIC_KEY},local.dev/monitor=http://monitor:8080=${MONITOR_PUBLIC_KEY}"
ports:
- "8080:8080"
volumes:
@@ -60,11 +69,11 @@ services:
command:
- witness
- --database-url=sqlite:/data/witness.db?mode=rwc
- --private-key=${WITNESS_PRIVATE_KEY}
- --listen=0.0.0.0:8080
- --log=local.dev/log=${LOG_PUBLIC_KEY}
environment:
RUST_LOG: info,witness=debug,siglog=debug
WITNESS_PRIVATE_KEY: ${WITNESS_PRIVATE_KEY}
ports:
- "8081:8080"
volumes:
@@ -112,11 +121,12 @@ services:
command:
- conda-monitor
- --database-url=sqlite:/data/monitor.db?mode=rwc
- --private-key=${MONITOR_PRIVATE_KEY}
- --listen=0.0.0.0:8080
- --log=local.dev/log=${LOG_PUBLIC_KEY}=http://log:8080
environment:
RUST_LOG: info,conda_monitor=debug,siglog=debug
# conda-monitor reads its signing key from WITNESS_PRIVATE_KEY
WITNESS_PRIVATE_KEY: ${MONITOR_PRIVATE_KEY}
ports:
- "8082:8080"
volumes: