This document defines the internal architecture and invariants that all contributors + agents must follow.
smoo consists of two halves:
- Host (desktop/CLI/web): provides block data to be read/written.
- Gadget (device with UDC): exposes a synthetic block device via ublk, serviced entirely over USB using FunctionFS.
USB interface has 4 endpoints:
| Endpoint | Direction | Purpose |
|---|---|---|
| Bulk OUT | host → gadget | read payloads |
| Bulk IN | gadget → host | write payloads |
| Interrupt OUT | host → gadget | control: Response |
| Interrupt IN | gadget → host | control: Request |
Control messages:
- ≤ 1024 bytes
- fixed-size LE structs keyed by `(export_id, request_id)`
- pipelined; multiple requests may be in flight per export
- Responses may return out-of-order; request_id used for matching
Bulk transfers:
- carry payload bytes
- must be block-aligned
Request flow:
- ublk on gadget emits a command
- gadget → host: send `Request` on interrupt IN
- host dispatches to `BlockSource`
- host performs bulk transfers as needed
- host → gadget: send `Response` on interrupt OUT
- gadget completes ublk request
Invariant: each ublk request maps to one logical Request/Response pair.
Gadgets MAY replay a Request after a link/session reset; the wire may see
duplicates, but ublk completes exactly once. Hosts/gadgets SHOULD keep queues
full: multiple outstanding requests per export and across exports are expected
(bounded by queue depth), using (export_id, request_id) as the uniqueness key
while in flight.
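
The exactly-once rule above can be sketched as a small in-flight table. This is a minimal illustration, not the project's implementation: the `InFlight` type, its method names, and the use of a `u16` ublk tag as the stored value are all assumptions.

```rust
use std::collections::HashMap;

/// Uniqueness key while a request is in flight: (export_id, request_id).
pub type Key = (u32, u32);

/// Hypothetical tracker mapping an in-flight key to the ublk tag to complete.
#[derive(Default)]
pub struct InFlight {
    pending: HashMap<Key, u16>,
}

impl InFlight {
    /// Register a new request. Returns false if the key is already in flight.
    pub fn begin(&mut self, key: Key, ublk_tag: u16) -> bool {
        if self.pending.contains_key(&key) {
            return false;
        }
        self.pending.insert(key, ublk_tag);
        true
    }

    /// Match a Response by key. The first match yields the tag to complete;
    /// duplicates and unknown keys yield None, so ublk completes at most once.
    pub fn complete(&mut self, key: Key) -> Option<u16> {
        self.pending.remove(&key)
    }

    /// Number of requests currently in flight.
    pub fn len(&self) -> usize {
        self.pending.len()
    }
}
```

Because `complete` removes the entry, a Response duplicated on the wire after a replay finds nothing to complete and is dropped.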
The gadget:
- registers a ublk queue with:
  - logical block size (must match `BlockSource`)
  - queue depth (configurable later)
- maps each ublk command → protocol `Request`
- completes requests deterministically
Error handling:
- transport failures or link loss → keep ublk I/O outstanding; park in-flight requests and replay when the link/session returns (no timeouts)
- export removal or shutdown → complete outstanding I/O with `errno`
- fatal errors → gadget tears down ublk cleanly
Protocol handshake (Ident):
- a setup IN message (from gadget -> host)
- fixed-size, LE
- fields:
  - protocol version (major + minor)
Control-plane (Request / Response):
- fixed-size, LE
- fields:
  - export_id
  - op: read/write/flush/discard
  - request_id (unique per export while in flight)
  - LBA
  - byte length (block-aligned)
  - flags (future)
- MUST fit in one interrupt transfer
- Responses carry the same `(export_id, request_id)` and MAY arrive out-of-order
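
A fixed-size little-endian encoding of these fields might look like the sketch below. The source only names the fields: the field widths, the numeric op codes, the padding, and the 28-byte total are assumptions made for illustration.

```rust
/// Hypothetical op codes; the numeric values are assumed, not specified.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Read = 0,
    Write = 1,
    Flush = 2,
    Discard = 3,
}

/// Illustrative Request record (all field widths are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Request {
    pub export_id: u32,
    pub request_id: u32,
    pub op: Op,
    pub lba: u64,
    pub byte_len: u32,
    pub flags: u32,
}

/// Assumed wire size; well under the 1024-byte control message limit.
pub const REQUEST_LEN: usize = 28;
const _: () = assert!(REQUEST_LEN <= 1024);

fn read_u32(buf: &[u8], at: usize) -> u32 {
    let mut b = [0u8; 4];
    b.copy_from_slice(&buf[at..at + 4]);
    u32::from_le_bytes(b)
}

fn read_u64(buf: &[u8], at: usize) -> u64 {
    let mut b = [0u8; 8];
    b.copy_from_slice(&buf[at..at + 8]);
    u64::from_le_bytes(b)
}

impl Request {
    /// Serialize to a fixed-size little-endian buffer.
    pub fn encode(&self) -> [u8; REQUEST_LEN] {
        let mut buf = [0u8; REQUEST_LEN];
        buf[0..4].copy_from_slice(&self.export_id.to_le_bytes());
        buf[4..8].copy_from_slice(&self.request_id.to_le_bytes());
        buf[8] = self.op as u8; // bytes 9..12 stay zero as padding
        buf[12..20].copy_from_slice(&self.lba.to_le_bytes());
        buf[20..24].copy_from_slice(&self.byte_len.to_le_bytes());
        buf[24..28].copy_from_slice(&self.flags.to_le_bytes());
        buf
    }

    /// Parse a buffer; rejects wrong lengths and unknown op codes.
    pub fn decode(buf: &[u8]) -> Option<Request> {
        if buf.len() != REQUEST_LEN {
            return None;
        }
        let op = match buf[8] {
            0 => Op::Read,
            1 => Op::Write,
            2 => Op::Flush,
            3 => Op::Discard,
            _ => return None,
        };
        Some(Request {
            export_id: read_u32(buf, 0),
            request_id: read_u32(buf, 4),
            op,
            lba: read_u64(buf, 12),
            byte_len: read_u32(buf, 20),
            flags: read_u32(buf, 24),
        })
    }
}
```

Rejecting any buffer whose length differs from `REQUEST_LEN` keeps the record fixed-size, and the compile-time assertion guards the control message size limit.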
Data (bulk):
- read payloads: host → gadget (bulk OUT)
- write payloads: gadget → host (bulk IN)
- MUST send exactly the payload size described in Request
- Bulk ordering follows interrupt serialization per direction, filtered to payload-bearing messages. For gadget → host, bulk IN payloads must appear in the same order as their corresponding Requests were written to interrupt IN. For host → gadget, bulk OUT payloads must appear in the same order as their corresponding Responses were written to interrupt OUT.
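
One way to picture the ordering rule is a per-direction FIFO of payload-bearing control messages. The sketch below is an assumption-laden illustration (`BulkOrder`, `OrderError`, and the method names are hypothetical): each control message written to the interrupt endpoint is recorded, and every bulk payload must match the head of the queue.

```rust
use std::collections::VecDeque;

#[derive(Debug, PartialEq, Eq)]
pub enum OrderError {
    /// A bulk payload arrived with no payload-bearing message pending.
    Unexpected,
    /// The payload size differs from the size described in the control message.
    WrongLength { expected: usize, got: usize },
}

/// Hypothetical tracker for one direction (e.g. interrupt IN + bulk IN).
#[derive(Default)]
pub struct BulkOrder {
    expected: VecDeque<((u32, u32), usize)>,
}

impl BulkOrder {
    /// Record a control message in the order it was written.
    /// Messages without a payload are filtered out.
    pub fn on_control(&mut self, key: (u32, u32), payload_len: usize) {
        if payload_len > 0 {
            self.expected.push_back((key, payload_len));
        }
    }

    /// Account for the next bulk payload; returns the key that owns it.
    pub fn on_bulk(&mut self, got: usize) -> Result<(u32, u32), OrderError> {
        let &(key, expected) = self.expected.front().ok_or(OrderError::Unexpected)?;
        if expected != got {
            return Err(OrderError::WrongLength { expected, got });
        }
        self.expected.pop_front();
        Ok(key)
    }
}
```

A flush or discard, which carries no payload, never enters the queue, which is what "filtered to payload-bearing messages" amounts to in this model.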
If FunctionFS DMA-BUF support exists, the gadget:
- allocates dma-buf buffers from system dma-heap
- attaches them to FunctionFS' bulk endpoint file descriptors (`FUNCTIONFS_DMABUF_ATTACH` ioctl)
- initiates read/write transfers using these buffers (`FUNCTIONFS_DMABUF_TRANSFER` ioctl)
- exports the dma-buf's sync fence after the transfer (`DMA_BUF_IOCTL_EXPORT_SYNC_FILE`) and polls the fence to detect completion
Properties:
- gracefully falls back if system dma-heap not present, or buffer attachments to FunctionFS endpoints fail
- nearly zero-copy
- lower CPU load
- higher throughput
If DMA-BUF fast path is unavailable:
- gadget uses classic `read()`/`write()` on bulk ep fds
- incurs at most one extra copy
- MUST preserve identical semantics to DMA-BUF mode
Responsible for shuttling control + payload data.
Requirements:
- MUST correlate interrupt + bulk transfers by `(export_id, request_id)`; do not drop or duplicate
- MUST allow pipelining (multiple outstanding Requests per export); Responses may be delivered out-of-order
- MUST preserve bulk ordering as defined in the USB protocol section above
- MUST be cancellation-safe
- MUST be async-first (Tokio)
- MAY block internally if safe
- Each `read_bulk`/`write_bulk` MUST correspond to one payload for one `(export_id, request_id)` pair
Implementations:
- `smoo-host-transport-rusb`
- `smoo-host-transport-webusb`
Backs actual storage.
Requirements:
- MUST expose `block_size()`
- MUST match gadget ublk block size
- MUST support async `read()`/`write()` of block-aligned regions
- SHOULD avoid copies
- MAY wrap:
  - files
  - raw devices
  - future WebUSB fetch backends
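
To make the alignment requirement concrete, here is a minimal in-memory backend. It is a sketch under stated assumptions: `MemBlockSource` and `BlockError` are hypothetical names, and the methods are synchronous for brevity, whereas the real `BlockSource` is required to be async.

```rust
use std::ops::Range;

#[derive(Debug, PartialEq, Eq)]
pub enum BlockError {
    /// Length is not a multiple of the block size.
    Unaligned,
    /// Region extends past the end of the backing store.
    OutOfRange,
}

/// Hypothetical in-memory backend (synchronous stand-in for an async BlockSource).
pub struct MemBlockSource {
    block_size: u32,
    data: Vec<u8>,
}

impl MemBlockSource {
    pub fn new(block_size: u32, blocks: usize) -> Self {
        assert!(block_size > 0, "block size must be non-zero");
        Self { block_size, data: vec![0; block_size as usize * blocks] }
    }

    /// Logical block size; must match the gadget's ublk block size.
    pub fn block_size(&self) -> u32 {
        self.block_size
    }

    /// Translate (lba, len) into a byte range, enforcing alignment and bounds.
    fn range(&self, lba: u64, len: usize) -> Result<Range<usize>, BlockError> {
        let bs = self.block_size as usize;
        if len % bs != 0 {
            return Err(BlockError::Unaligned);
        }
        let start = (lba as usize).checked_mul(bs).ok_or(BlockError::OutOfRange)?;
        let end = start.checked_add(len).ok_or(BlockError::OutOfRange)?;
        if end > self.data.len() {
            return Err(BlockError::OutOfRange);
        }
        Ok(start..end)
    }

    pub fn read(&self, lba: u64, buf: &mut [u8]) -> Result<(), BlockError> {
        let r = self.range(lba, buf.len())?;
        buf.copy_from_slice(&self.data[r]);
        Ok(())
    }

    pub fn write(&mut self, lba: u64, buf: &[u8]) -> Result<(), BlockError> {
        let r = self.range(lba, buf.len())?;
        self.data[r].copy_from_slice(buf);
        Ok(())
    }
}
```

Addressing by LBA means the start offset is always block-aligned, so only the length needs an explicit alignment check.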
Two vendor control requests:
- IDENT (IN): idempotent, side‑effect‑free. Returns protocol version and capability flags.
- CONFIG_EXPORTS (OUT): authoritative replace of the complete export set. Payload describes all exports for this host session.
Each export entry includes:
- `export_id` (u32)
- `block_size`
- `size_bytes`
- flags (optional)
Gadget maps export_id → ublk device. CONFIG_EXPORTS creates/removes ublk devices to match payload. Successful CONFIG_EXPORTS MUST update the state file.
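
The "authoritative replace" semantics can be sketched as a diff between the current export set and the CONFIG_EXPORTS payload. The `Geometry` and `Plan` types and the `reconcile` function are hypothetical, and the sketch assumes that an export whose geometry changed is rebuilt by removing and recreating its ublk device.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Geometry {
    pub block_size: u32,
    pub size_bytes: u64,
}

/// Export ids grouped by the action needed on their ublk device.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Plan {
    pub create: Vec<u32>,
    pub remove: Vec<u32>,
    pub keep: Vec<u32>,
}

/// Compute the device changes needed so that `current` matches `desired`.
pub fn reconcile(
    current: &BTreeMap<u32, Geometry>,
    desired: &BTreeMap<u32, Geometry>,
) -> Plan {
    let mut plan = Plan::default();
    for (id, geom) in desired {
        match current.get(id) {
            // Unchanged geometry: keep the existing device.
            Some(existing) if existing == geom => plan.keep.push(*id),
            // Changed geometry: rebuild (remove, then create).
            Some(_) => {
                plan.remove.push(*id);
                plan.create.push(*id);
            }
            None => plan.create.push(*id),
        }
    }
    // Anything absent from the payload is removed.
    for id in current.keys() {
        if !desired.contains_key(id) {
            plan.remove.push(*id);
        }
    }
    plan.remove.sort_unstable();
    plan
}
```

Applying the plan and then persisting the new export set would correspond to the state file update required after a successful CONFIG_EXPORTS.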
On restart:
- If state file exists → RECOVERING: reattach ublk devices. If any fail, delete state file → COLD.
- If no state file → COLD.
- Recovery MUST NOT remove/modify ublk devices until complete.
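
The restart rules reduce to a small decision function. In this sketch the `Startup` enum, `decide_startup`, and the representation of the state file as a list of export ids are assumptions; `reattach` stands in for whatever reattaches a single ublk device.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Startup {
    /// Start from scratch; optionally delete a stale state file first.
    Cold { delete_state_file: bool },
    /// All devices listed in the state file were reattached.
    Recovered { export_ids: Vec<u32> },
}

/// `saved` is the export list read from the state file, if one exists.
/// `reattach` returns true when the ublk device for an export was reattached.
pub fn decide_startup(
    saved: Option<&[u32]>,
    mut reattach: impl FnMut(u32) -> bool,
) -> Startup {
    match saved {
        None => Startup::Cold { delete_state_file: false },
        Some(ids) => {
            // RECOVERING: only attempt reattachment here; devices are not
            // removed or modified until recovery has finished.
            if ids.iter().all(|&id| reattach(id)) {
                Startup::Recovered { export_ids: ids.to_vec() }
            } else {
                Startup::Cold { delete_state_file: true }
            }
        }
    }
}
```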
Host restart = new session:
- Host re-issues IDENT + CONFIG_EXPORTS.
- Gadget treats this as a session boundary for the data plane: forget on-wire in-flight requests/responses and replay any outstanding ublk I/O when the link returns.
- Gadget only rebuilds ublk devices when the export list/geometry changes; otherwise keep existing devices and update the state file to match the new CONFIG_EXPORTS payload.
- Requests are never timed out by the gadget.
- Link loss or data-plane I/O errors cause the gadget to drop transport state, park in-flight ublk requests, and wait for the link to recover.
- Once the host re-establishes the session (IDENT/CONFIG_EXPORTS) and the link is Online, parked requests are replayed with the same `(export_id, request_id)`.
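
Parking and replay can be modelled as a two-state slot per outstanding request. This is an illustrative sketch: the `Outstanding` type and its methods are hypothetical, and ignoring Responses for parked requests is an assumption derived from "forget on-wire in-flight requests/responses".

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Slot {
    OnWire,
    Parked,
}

/// Hypothetical set of outstanding ublk requests keyed by (export_id, request_id).
#[derive(Default)]
pub struct Outstanding {
    reqs: BTreeMap<(u32, u32), Slot>,
}

impl Outstanding {
    pub fn submit(&mut self, key: (u32, u32)) {
        self.reqs.insert(key, Slot::OnWire);
    }

    /// Link loss: nothing is completed or timed out, everything is parked.
    pub fn on_link_lost(&mut self) {
        for slot in self.reqs.values_mut() {
            *slot = Slot::Parked;
        }
    }

    /// Session re-established: returns the keys to resend, unchanged.
    pub fn on_link_online(&mut self) -> Vec<(u32, u32)> {
        let mut replay = Vec::new();
        for (key, slot) in self.reqs.iter_mut() {
            if *slot == Slot::Parked {
                *slot = Slot::OnWire;
                replay.push(*key);
            }
        }
        replay
    }

    /// Returns true if the Response completes a request that is on the wire.
    pub fn on_response(&mut self, key: (u32, u32)) -> bool {
        if self.reqs.get(&key) == Some(&Slot::OnWire) {
            self.reqs.remove(&key);
            true
        } else {
            false
        }
    }

    pub fn is_empty(&self) -> bool {
        self.reqs.is_empty()
    }
}
```

The ublk I/O stays outstanding for the whole parked interval; only a Response received after the replay completes it.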
Gadget MUST drain ep0 events continuously:
- BIND/UNBIND
- ENABLE/DISABLE
- SUSPEND/RESUME
- SETUP (IDENT/CONFIG_EXPORTS)
Failure to service ep0 promptly leads to EP0 STALL + possible gadget reset.
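
For reference, decoding one ep0 event record could look like the sketch below. The 12-byte layout (8-byte setup packet, one type byte, 3 bytes of padding) and the numeric event codes are assumed from the Linux FunctionFS UAPI header (`struct usb_functionfs_event`) and should be verified against the target kernel; the Rust type and function names are hypothetical.

```rust
/// USB control request header carried by SETUP events.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SetupPacket {
    pub request_type: u8,
    pub request: u8,
    pub value: u16,
    pub index: u16,
    pub length: u16,
}

impl SetupPacket {
    /// Bit 7 of bmRequestType set means the data phase is device-to-host (IN).
    pub fn is_in(&self) -> bool {
        (self.request_type & 0x80) != 0
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ep0Event {
    Bind,
    Unbind,
    Enable,
    Disable,
    Setup(SetupPacket),
    Suspend,
    Resume,
}

/// Assumed size of one event record read from ep0.
pub const EVENT_LEN: usize = 12;

/// Decode one raw event; unknown type codes yield None.
pub fn parse_event(raw: &[u8; EVENT_LEN]) -> Option<Ep0Event> {
    let le16 = |at: usize| u16::from_le_bytes([raw[at], raw[at + 1]]);
    Some(match raw[8] {
        0 => Ep0Event::Bind,
        1 => Ep0Event::Unbind,
        2 => Ep0Event::Enable,
        3 => Ep0Event::Disable,
        4 => Ep0Event::Setup(SetupPacket {
            request_type: raw[0],
            request: raw[1],
            value: le16(2),
            index: le16(4),
            length: le16(6),
        }),
        5 => Ep0Event::Suspend,
        6 => Ep0Event::Resume,
        _ => return None,
    })
}
```

A drain loop would read records of `EVENT_LEN` bytes from the ep0 file descriptor without pausing, dispatching SETUP packets to the IDENT and CONFIG_EXPORTS handlers.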
- Toolchain: Rust stable (MSRV 1.88)
- Logging: `tracing`
- Unit tests: `cargo test --workspace --locked` (uses mock transports).
- Integration tests: `crates/smoo-test-harness/` drives a real `smoo-gadget` ↔ `smoo-host` session over `dummy_hcd` loopback with per-scenario `usbmon` capture (`.pcapng` + logs are written under `target/integration-artifacts/<test>/`).

Local quick-start:
```sh
cargo xtask check-test-infra   # diagnostic
cargo xtask test-infra-setup   # one-time per boot (sudo)
cargo xtask integration        # runs stable privileged scenarios
```

VM integration flow:
```sh
cargo xtask vm-image build      # build target/vm-images/smoo-integration-vm.qcow2
cargo xtask vm-image download   # pull the input-hashed GHCR artifact with oras
cargo xtask vm-image ref        # print the deterministic GHCR ref
cargo xtask vm-integration
```

The privileged `smoo-test-harness` scenarios are `#[ignore]` so ordinary `cargo test --workspace` and package checks can run without dummy_hcd/ublk; use `cargo xtask integration` or `cargo xtask vm-integration` to run the stable scenario set. `shutdown_host_loss` is intentionally excluded from the stable set while it remains a known-broken reproducer.

`vm-integration` builds the smoo CLIs and `smoo-test-harness` test binaries on the host, boots a disposable Fedora guest with QEMU/KVM, copies only those binaries plus `tools/wireshark/smoo.lua` into the guest, probes the guest for `dummy_hcd`, `ublk_drv`, `usbmon`, FunctionFS/configfs, `fio`, `dumpcap`, and `tshark`, then runs the harness inside the guest under sudo. The VM image does not contain a Rust toolchain and the runtime path does not download a base image or run `dnf`; if `target/vm-images/smoo-integration-vm.qcow2` is absent, run `cargo xtask vm-image build` once locally or `cargo xtask vm-image download` to trade bandwidth for compute. Set `SMOO_VM_IMAGE` to use an arbitrary qcow2 instead. The host must expose writable `/dev/kvm` (or set `SMOO_VM_ACCEL=tcg` for a slow smoke test). For sudo-less local KVM access, see `tools/udev/99-smoo-kvm.rules`.

`vm-image build` starts from the pinned Fedora cloud base image, applies `tools/vm-image/guest-setup.sh`, validates the required kernel modules and userspace tools, then writes the baked image and SHA256 metadata under `target/vm-images/`. The GHCR tag is the SHA256 of the base image identity (URL + expected checksum) and the guest setup script contents, so the default `vm-image download` target is deterministic and can be inspected with `cargo xtask vm-image ref`. Override the full ref with `SMOO_VM_IMAGE_REF` or the repository prefix with `SMOO_VM_IMAGE_REPOSITORY` if needed.

Each scenario is a
`#[tokio::test]` in `crates/smoo-test-harness/tests/`. v1 ships `smoke` (handshake + 1 R/W) and `rw_modest` (`fio` randwrite-with-md5 against the resulting `/dev/ublkbN`). On failure the artifact bundle is the source of truth: open `capture.pcapng` with the dissector at `tools/wireshark/smoo.lua` to triage wire-level issues.

CI runs the same VM flow (`.github/workflows/integration-tests.yml`) on GitHub-hosted `ubuntu-24.04` runners: it downloads the baked qcow2 with `cargo xtask vm-image download`, then runs `cargo xtask vm-integration` to execute the harness inside the guest. `.github/workflows/vm-image.yml` rebuilds and pushes the GHCR image when the VM image setup changes. During the VM substrate spike it publishes from both `main` and the `test-infra` WIP branch.

- CLIs are thin wrappers; logic lives in libraries
- Agents MUST uphold:
  - cancellation safety
  - `(export_id, request_id)` matching guarantees
  - all invariants in this document