SPDX-License-Identifier: LGPL-3.0-or-later
https://github.com/GSI-HPC/sind
A CLI tool for running local Slurm clusters using Docker containers, inspired by kind (Kubernetes in Docker).
- Linux host with cgroupv2 and the `nsdelegate` mount option (`mount -o remount,nsdelegate /sys/fs/cgroup`)
- Docker Engine 28.0+ (required for `--security-opt writable-cgroups=true`)
- For clusters with 10+ nodes: `fs.inotify.max_user_instances >= 1024` (default 128 is too low)
- Slurm 25.11
- OpenMPI 5.0 (with PMIx 6.x, PRRTE 4.x, UCX 1.20)
sind creates and manages containerized Slurm clusters for development, testing, and CI/CD workflows. Each node runs as a separate Docker container with systemd as init, providing a realistic multi-node Slurm environment without requiring bare-metal infrastructure.
While the cluster configuration file resembles a Kubernetes manifest, sind is not a reconciling controller. The configuration is a one-shot, one-way input for cluster creation:
- `sind create cluster` interprets the manifest once to generate the cluster (via `--config FILE` or piped to stdin)
- sind does not continuously watch or reconcile cluster state
- sind does not automatically repair drift or failures
sind provides commands for inspection (`get`), modification (`create`/`delete worker`), and simulation (`power`), but these are imperative operations, not declarative state management.
This design is intentional: sind is a development and testing tool that aids the creation of more sophisticated Slurm cluster management tooling, not a production cluster controller.
sind creates cluster resources in a specific order to ensure dependencies are available:
Phase 1: Global Infrastructure
- Create `sind-mesh` network (if not exists)
- Start `sind-dns` container (if not exists)
- Create `sind-ssh-config` volume and generate keypair (if not exists)
- Start `sind-ssh` container (if not exists)
Phase 2: Cluster Resources (concurrent pipelines, no barriers)
- Create cluster network
- Create config volume → write Slurm configuration
- Create munge volume → generate and write munge key
- Create data volume (if needed)
Phase 3: Node Containers
- Create and start each node container in parallel
- Start per-node systemd D-Bus monitor immediately after each container starts
- Wait for each node to become ready, accelerated by events
There is no barrier between node creation and readiness probing — each node's goroutine creates its container, starts a systemd monitor, and begins probing in a single pipeline. This allows early-starting nodes to be probed while later nodes are still being created.
sind uses two event sources to accelerate readiness detection:
- Docker events — a single `docker events` stream watches all cluster containers for start/die events
- Systemd D-Bus monitors — per-node `busctl monitor --watch-bind=yes` streams watch for unit state changes (e.g., sshd.service becoming active)
When an event arrives, readiness probes re-evaluate immediately instead of waiting for the next poll tick. If the event sources are unavailable, sind falls back to poll-only mode transparently.
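The following Go sketch illustrates the poll-plus-event pattern described above; it is not the actual sind probe code, and the poll interval and channel types are assumptions for the example.

```go
package ready // illustrative package name, not the actual sind package

import (
	"context"
	"time"
)

// waitReady re-runs a readiness probe whenever an event arrives or a poll
// tick fires, whichever comes first. events would carry Docker / systemd
// D-Bus wake-ups; probe wraps the per-node checks listed below.
func waitReady(ctx context.Context, events <-chan struct{}, probe func(context.Context) error) error {
	ticker := time.NewTicker(2 * time.Second) // poll interval is illustrative
	defer ticker.Stop()
	for {
		if err := probe(ctx); err == nil {
			return nil // node passed all readiness checks
		}
		select {
		case <-ctx.Done():
			return ctx.Err() // readiness timeout or cancellation
		case <-events:
			// event arrived: re-evaluate immediately instead of waiting
		case <-ticker.C:
			// fallback poll tick (sole driver when event sources are unavailable)
		}
	}
}
```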
| Check | Description |
|---|---|
| Container running | Docker container in running state |
| systemd ready | systemctl is-system-running returns running or degraded |
| sshd listening | Port 22 accepting connections |
| slurmctld ready | scontrol ping succeeds (controller only) |
| slurmd ready | slurmd service active (worker only) |
If any node fails to become ready within the timeout, `sind create cluster` fails and reports which nodes/checks failed. Partial clusters are not automatically cleaned up — use `sind delete cluster` to remove them.
Phase 4: Mesh Registration and Slurm (concurrent)
After all nodes are ready, sind runs mesh registration (batch DNS + known_hosts) and Slurm enablement concurrently. This is safe because Slurm uses short hostnames (controller, worker-0) resolved by Docker's embedded DNS on the cluster network. The mesh DNS records (*.cluster.realm.sind) are only used for SSH relay access and host-side resolution.
- Familiar UX for kind users
- No root/admin privileges required
- SELinux compatible
- Support for both static and dynamic Slurm node configurations
sind is written in Go and designed for dual use:
- CLI tool - Standalone command-line interface
- Go library - Embeddable package for wrapper tools and integrations
The CLI command structure is reflected in the library API, allowing programmatic access to all sind operations.
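As a rough illustration of the dual-use idea, a wrapper tool might drive cluster creation through the library instead of the CLI. The import path and function name below are hypothetical placeholders, not the actual sind API:

```go
package main

import (
	"context"
	"log"

	cluster "github.com/GSI-HPC/sind/pkg/cluster" // hypothetical import path
)

func main() {
	ctx := context.Background()
	// Hypothetical call mirroring `sind create cluster dev`; the real API
	// name and signature may differ.
	if err := cluster.CreateCluster(ctx, "dev"); err != nil {
		log.Fatal(err)
	}
}
```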
sind uses a minimal set of dependencies, following kind's approach of favoring simplicity and compatibility.
| Dependency | Purpose |
|---|---|
| `github.com/spf13/cobra` | CLI framework |
| `sigs.k8s.io/yaml` | YAML configuration parsing |
| `log/slog` (stdlib) | Structured logging interface |
| `github.com/charmbracelet/log` | Colorized log output (slog handler) |
| `github.com/mattn/go-isatty` | TTY detection for interactive commands |
| `github.com/njayp/ophis` | MCP server framework |
| `github.com/spf13/afero` | Filesystem abstraction for testability |
| `golang.org/x/sys` | Advisory file locking (flock) for realm locks |
Nodeset expansion (e.g., worker-[0-2,5] → individual hostnames) is implemented internally rather than using an external library, keeping the dependency footprint small.
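A simplified Go sketch of the bracket-expansion idea (not the actual sind implementation, which also handles comma-separated specs, validation, and error reporting):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// expand turns "worker-[0-2,5]" into worker-0 worker-1 worker-2 worker-5.
// Simplified: assumes at most one well-formed bracket group and ignores
// parse errors.
func expand(spec string) []string {
	lb, rb := strings.Index(spec, "["), strings.Index(spec, "]")
	if lb < 0 || rb < lb {
		return []string{spec} // plain hostname, nothing to expand
	}
	prefix, suffix := spec[:lb], spec[rb+1:]
	var out []string
	for _, part := range strings.Split(spec[lb+1:rb], ",") {
		lo, hi := part, part
		if i := strings.Index(part, "-"); i >= 0 {
			lo, hi = part[:i], part[i+1:]
		}
		start, _ := strconv.Atoi(lo)
		end, _ := strconv.Atoi(hi)
		for n := start; n <= end; n++ {
			out = append(out, fmt.Sprintf("%s%d%s", prefix, n, suffix))
		}
	}
	return out
}

func main() {
	fmt.Println(expand("worker-[0-2,5].dev"))
	// Output: [worker-0.dev worker-1.dev worker-2.dev worker-5.dev]
}
```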
sind interacts with Docker by shelling out to the docker CLI rather than using the Docker SDK for Go. This approach, proven by kind, provides:
- Simpler maintenance and fewer dependencies
- Wider compatibility across Docker versions
- No tight coupling to Docker daemon internals
sind wraps command execution in a thin abstraction layer (pkg/cmdexec) using Go's os/exec package, with proper output handling and error reporting. The executor interface is shared across pkg/docker, pkg/mesh, and pkg/cluster.
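A minimal sketch of what such an executor abstraction can look like; the interface and method names are illustrative, not the actual pkg/cmdexec API:

```go
package cmdexec

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
)

// Executor is an illustrative stand-in for the shared command-execution interface.
type Executor interface {
	Run(ctx context.Context, name string, args ...string) (output string, err error)
}

// osExecutor shells out via os/exec, capturing output and wrapping failures
// with the command line for diagnostics.
type osExecutor struct{}

func (osExecutor) Run(ctx context.Context, name string, args ...string) (string, error) {
	out, err := exec.CommandContext(ctx, name, args...).CombinedOutput()
	if err != nil {
		return "", fmt.Errorf("%s %s: %w: %s", name, strings.Join(args, " "), err, out)
	}
	return string(out), nil
}
```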
Runtime support: Docker only. Support for alternative runtimes (Podman, nerdctl) may be added later via a provider abstraction pattern.
This project is licensed under the GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later).
The git history follows Conventional Commits style.
- Fine-grained commits - Each commit should represent a single logical change, sized for easy comprehension when reading history
- Context-free messages - Commit messages state facts about the change, not the development story; they are written for future readers of the history, not as a journal of the development process
- No narrative - Avoid "I tried X, then Y, finally Z worked"; instead state what the commit does
<type>(<scope>): <description>
[optional body]
[optional footer]
Types: feat, fix, docs, style, refactor, test, chore, build, ci
Rules for maintaining consistency when adding new commands, flags, and output.
Commands follow verb-noun ordering with a two-level hierarchy:
sind <verb> <noun> [ARGS] [FLAGS]
- Multi-resource verbs (`create`, `delete`, `get`, `power`) group noun subcommands
- Single-purpose verbs (`ssh`, `enter`, `exec`, `logs`, `doctor`) stand alone
- Standalone verbs are reserved for frequently-used operations that justify a short path
| Pattern | Positional | Default | Examples |
|---|---|---|---|
| Cluster name | `[NAME]` or `[CLUSTER]` | `"default"` | `get cluster`, `enter`, `get nodes` |
| Node targets | `NODES` (required) | — | `power shutdown`, `delete worker` |
| Node format | `shortname.cluster` | cluster defaults to `"default"` | `worker-0.dev`, `controller` |
| Nodeset expansion | bracket patterns | — | `worker-[0-2].dev` |
| Pass-through | after `--` separator | — | `ssh NODE -- cmd`, `exec -- cmd` |
Rules:
- Cluster names are always positional, never flags
- Node targets support nodeset expansion and comma-separated specs
- Use `cobra.MaximumNArgs(1)` for optional cluster, `cobra.MinimumNArgs(1)` for required nodes
- Long-form only by default; add short flags (`-f`) only for frequently-typed flags
- Kebab-case for multi-word flags: `--tmp-size`, `--munge-key`
- Boolean flags for mode switches: `--all`, `--pull`, `--unmanaged`
- One persistent root flag: `--realm` (inherited by every subcommand)
- One persistent root counter: `-v` (repeatable, controls log verbosity; inherited by every subcommand)
| Command type | Output | Target |
|---|---|---|
| List resources (`get`) | tabwriter table, uppercase headers, 3-space padding | stdout |
| Single value (`get munge-key`) | raw value, one line | stdout |
| Mutations (`create`, `delete`, `power`) | silent on success | — |
| Errors | structured slog at error level (always visible) | stderr |
| Warnings | `Warning: ...` prefix | stderr |
| Logs (`-v`) | structured key=value, colorized on TTYs | stderr |
Rules:
- Mutations are silent — `exit 0` is the confirmation; use `-v` for progress
- Errors are always visible (slog error level is always enabled, even without `-v`)
- Command output (tables, status, doctor) is monochrome — no ANSI escapes
- Log output (`-v`) is colorized on interactive terminals, plain when piped
- Unicode checkmarks (✓/✗) only in `get cluster` and `doctor` output
- All `get` subcommands accept `--output|-o {human,json}`; default is `human`
Logging uses pkg/log with context-based injection. Silent by default. All log lines include millisecond timestamps (HH:MM:SS.mmm) for timing analysis.
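A minimal sketch of the context-injection pattern (illustrative only; the actual pkg/log API may differ):

```go
package sindlog // illustrative; shows the pattern, not the real pkg/log code

import (
	"context"
	"io"
	"log/slog"
)

type ctxKey struct{}

// Into returns a context carrying the given logger.
func Into(ctx context.Context, l *slog.Logger) context.Context {
	return context.WithValue(ctx, ctxKey{}, l)
}

// From extracts the logger from ctx; commands never fall back to slog.Default().
func From(ctx context.Context) *slog.Logger {
	if l, ok := ctx.Value(ctxKey{}).(*slog.Logger); ok {
		return l
	}
	return slog.New(slog.NewTextHandler(io.Discard, nil)) // silent by default
}
```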
| Level | Flag | What to log |
|---|---|---|
| Error | — | Always visible; command failures |
| Info | `-v` | Phase transitions: "creating cluster", "nodes ready", "slurm services enabled" |
| Debug | `-vv` | Individual operations: "waiting for node", "creating network", "enabling slurmd" |
| Trace | `-vvv` | Docker commands, probe retry attempts with error details |
Rules:
- Use `sindlog.From(ctx)` to extract the logger — never `slog.Default()`
- In errgroup goroutines, log with `gctx`, not the outer `ctx`
- Log messages use lowercase, present tense: "creating network", not "Created network"
- Include identifying attrs: `"node", shortName`, `"name", netName`, `"service", svcName`
All commands that accept cluster names or node names must set ValidArgsFunction:
- Cluster name commands → `completeClusterNames`
- Node name commands → `completeNodeNames`
- Commands with `DisableFlagParsing` (`ssh`, `exec`) → `ValidArgsFunction` with heuristics (best-effort despite cobra limitations)
When adding a `get` subcommand with positional arg completion for a second argument (like `logs NODE SERVICE`), write a dedicated completion function that switches on `len(args)`.
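A sketch of such a function; `completeNodeNames` is the helper referenced above, and the static service list is an assumption for the example:

```go
package cmd // illustrative package name

import "github.com/spf13/cobra"

// completeLogsArgs completes NODE for the first positional argument and
// SERVICE for the second, as for `sind logs NODE SERVICE`.
func completeLogsArgs(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) {
	switch len(args) {
	case 0: // first argument: node names
		return completeNodeNames(cmd, args, toComplete)
	case 1: // second argument: service names (illustrative static list)
		return []string{"slurmctld", "slurmd", "munge", "sshd"}, cobra.ShellCompDirectiveNoFileComp
	default:
		return nil, cobra.ShellCompDirectiveNoFileComp
	}
}
```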
When introducing a new command:
- Structure: verb-noun ordering, consistent with existing hierarchy
- Args: cluster as optional positional (default "default"), nodes as required positional
- Flags: long-form, kebab-case, minimal short flags
- Completion: add `ValidArgsFunction` for cluster/node args
- Output: table for lists, confirmation for mutations, silence for passthrough
- Logging: info for phases, debug for operations, trace for raw commands
- Errors: wrap with `fmt.Errorf("context: %w", err)`, no error prefixes
- Tests: unit test with mock executor, integration test in lifecycle test
- Docs: update DESIGN.md CLI Commands section, update docs/content/
Development follows Test-Driven Development (TDD) style:
- Write failing test
- Implement minimal code to pass
- Refactor
- High unit test coverage for all packages
- Integration tests for CLI commands and cluster operations
- Tests run in CI for every commit
sind create cluster [NAME] [--config FILE] [--data PATH] [--pull]
sind delete cluster [NAME]
sind delete cluster --all
sind get cluster [NAME]
sind get clusters
sind get node NODE[.CLUSTER]
sind get nodes [CLUSTER]
sind get networks
sind get realms
sind get volumes
sind get mesh
sind get dns
sind get ssh-config
sind get ssh-private-key
sind get ssh-public-key
sind get ssh-known-hosts
sind get munge-key [CLUSTER]

All `get` subcommands accept `--output|-o {human,json}`. The default is `human` (tabular text); `json` emits a machine-readable document.
NAME/CLUSTER defaults to default if omitted.
sind create cluster validates the environment before creating, warning or failing if conflicting resources (containers, networks, volumes with matching names) already exist.
sind delete cluster is idempotent and robust:
- Deleting a non-existent cluster is not an error
- Handles partial/broken clusters (e.g., failed creation)
- Removes all matching Docker resources regardless of state
- Updates `~/.local/state/sind/<realm>/known_hosts` (or `$XDG_STATE_HOME/sind/<realm>/known_hosts`) to remove deleted nodes
- Order: stops/removes containers → disconnects/removes networks → removes volumes
Example output:
$ sind get clusters
NAME NODES (S/C/W) SLURM STATUS
default 4 (1/1/2) 25.11 running
dev 3 (0/1/2) 25.11 running
NODES column shows total count and breakdown: Submitter / Controller / Worker.
$ sind get nodes dev
CONTAINER ROLE FQDN IP STATUS
sind-dev-controller controller controller.dev.sind.sind 172.19.0.2 running
sind-dev-worker-0 worker worker-0.dev.sind.sind 172.19.0.3 running
sind-dev-worker-1 worker worker-1.dev.sind.sind 172.19.0.4 running
Without a cluster argument, sind get nodes lists every node in the realm and adds a CLUSTER column. Rows are sorted by (cluster, role, natural-name) so worker-2 precedes worker-10:
$ sind get nodes
CONTAINER CLUSTER ROLE FQDN IP STATUS
sind-default-controller default controller controller.default.sind.sind 172.19.0.2 running
sind-default-worker-0 default worker worker-0.default.sind.sind 172.19.0.3 running
sind-dev-controller dev controller controller.dev.sind.sind 172.20.0.2 running
sind-dev-worker-2 dev worker worker-2.dev.sind.sind 172.20.0.3 running
sind-dev-worker-10 dev worker worker-10.dev.sind.sind 172.20.0.4 running
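A sketch of a natural-name comparison that yields the ordering shown above (worker-2 before worker-10); the real sind sorting also keys on cluster and role first:

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

var trailingNum = regexp.MustCompile(`^(.*?)-(\d+)$`)

// naturalLess compares node short names so that worker-2 sorts before worker-10.
func naturalLess(a, b string) bool {
	ma, mb := trailingNum.FindStringSubmatch(a), trailingNum.FindStringSubmatch(b)
	if ma != nil && mb != nil && ma[1] == mb[1] {
		na, _ := strconv.Atoi(ma[2])
		nb, _ := strconv.Atoi(mb[2])
		return na < nb
	}
	return a < b
}

func main() {
	names := []string{"worker-10", "worker-2", "controller"}
	sort.Slice(names, func(i, j int) bool { return naturalLess(names[i], names[j]) })
	fmt.Println(names) // [controller worker-2 worker-10]
}
```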
sind get cluster [NAME] displays detailed health information for a cluster:
$ sind get cluster dev
CLUSTER SLURM STATUS (R/S/P/T)
dev 25.11.4 running (3/0/0/3)
NETWORKS
NAME DRIVER SUBNET GATEWAY STATUS
sind-mesh bridge 172.18.0.0/16 172.18.0.1 ✓
sind-dev-net bridge 172.19.0.0/16 172.19.0.1 ✓
MESH SERVICES
NAME CONTAINER STATUS
dns sind-dns ✓
MOUNTS
MOUNT SOURCE TYPE STATUS
/etc/slurm sind-dev-config volume ✓
/etc/munge sind-dev-munge volume ✓
/data /home/user/project hostPath ✓
NODES
NAME ROLE IP STATUS SERVICES
controller.dev controller 172.19.0.2 running munge ✓ slurmctld ✓ sshd ✓
worker-0.dev worker 172.19.0.3 running munge ✓ slurmd ✓ sshd ✓
worker-1.dev worker 172.19.0.4 running munge ✓ slurmd ✗ sshd ✓
sind get node NODE[.CLUSTER] shows detailed health for a single node. NODE uses the format shortName or shortName.cluster (defaults to cluster "default"). Passing a full DNS FQDN ending in .sind is rejected — use the bare short name or the NODE.CLUSTER form:
$ sind get node controller.dev
CONTAINER ROLE FQDN IP STATUS
sind-dev-controller controller controller.dev.sind.sind 172.19.0.2 running
SERVICES
NAME STATUS
munge ✓
sshd ✓
slurmctld ✓
sind doctor validates host prerequisites for running sind:
sind doctor                          # check Docker version, cgroupv2, DNS policy

Checks the Docker Engine version, that cgroupv2 is mounted with nsdelegate, and that polkit allows host DNS resolution via systemd-resolved. Exits non-zero if any required prerequisite fails.
sind ssh [SSH_OPTIONS] NODE [-- COMMAND] # SSH into a specific node (passthrough)
sind enter [CLUSTER] # Interactive shell on submitter/controller
sind exec [CLUSTER] -- <cmd>     # One-shot command on submitter/controller

NODE uses DNS-style naming (see Node Arguments). CLUSTER defaults to `default`.
sind ssh passes all options and arguments through to the underlying SSH command. See the SSH section for details.
sind create worker [CLUSTER] [FLAGS] # add worker nodes
sind delete worker NODES             # remove worker nodes from cluster

`create worker` flags:
| Flag | Default | Description |
|---|---|---|
| `--count N` | 1 | Number of nodes to add |
| `--image IMAGE` | cluster default | Container image |
| `--cpus N` | cluster default (1) | CPU limit per node |
| `--memory SIZE` | cluster default (512m) | Memory limit |
| `--tmp-size SIZE` | 256m | /tmp tmpfs size |
| `--unmanaged` | false | Don't start slurmd, don't add to slurm.conf |
| `--pull` | false | Pull images before creating containers |
| `--cap-add CAP` | none | Add Linux capability (repeatable; e.g. SYS_ADMIN) |
| `--cap-drop CAP` | none | Drop Linux capability (repeatable) |
| `--device PATH` | none | Expose host device (repeatable; e.g. /dev/fuse) |
| `--security-opt OPT` | none | Security option (repeatable) |
Examples:
sind create worker # 1 managed node with cluster defaults
sind create worker --count 3 # 3 managed nodes
sind create worker --count 2 --unmanaged # 2 unmanaged nodes (slurmd not started)
sind create worker --cpus 2 --memory 1g # 1 managed node with resource limits
sind create worker dev --count 2           # 2 managed nodes in dev cluster

Managed node workflow:
By default (without --unmanaged), sind:
- Verifies `sind-nodes.conf` exists in `/etc/slurm` (fails if not present)
- Creates the worker container(s)
- Appends node definition(s) to `sind-nodes.conf`
- Reconfigures slurmctld (`scontrol reconfigure`)
- Starts slurmd on the new node(s)
Managed nodes require the sind-generated Slurm configuration (see Generated Configuration). If sind-nodes.conf is missing (e.g., user replaced the config), the command fails with an error. Use --unmanaged to add nodes without modifying Slurm configuration.
delete worker deletes containers entirely. Works with both managed and unmanaged nodes. For managed nodes, sind removes them from sind-nodes.conf and reconfigures slurmctld before deleting the container.
sind power shutdown NODES # graceful shutdown
sind power cut NODES # hard power off
sind power on NODES # power on
sind power reboot NODES # graceful cycle (shutdown + on)
sind power cycle NODES # hard cycle (cut + on)
sind power freeze NODES # simulate unresponsive node
sind power unfreeze NODES             # resume frozen node

| Command | Implementation |
|---|---|
| shutdown | docker stop (SIGTERM, then SIGKILL) |
| cut | docker kill (immediate SIGKILL) |
| on | docker start |
| reboot | docker stop + docker start |
| cycle | docker kill + docker start |
| freeze | docker pause (cgroup freezer) |
| unfreeze | docker unpause |
Freeze/unfreeze uses Docker's cgroup freezer to suspend all processes. The container remains "running" but is completely unresponsive, simulating a hung or unreachable node.
sind logs NODE [--follow] # container logs (stdout/stderr)
sind logs NODE SERVICE [--follow]     # journalctl for specific service

Examples:
sind logs controller --follow # tail container logs
sind logs controller slurmctld # slurmctld journal logs
sind logs worker-0 slurmd --follow    # follow slurmd logs

sind version [--json]                 # print version information
sind doctor # check host prerequisites
sind get realms # list active realms
sind get munge-key [CLUSTER] # output munge key (base64)
sind get ssh-config # show SSH config path for Include
sind get mesh # show mesh infrastructure info
sind get dns # list mesh DNS records
sind get ssh-private-key # output SSH private key
sind get ssh-public-key # output SSH public key
sind get ssh-known-hosts              # output SSH known_hosts

sind version prints version, commit, Go version, and platform. For release builds the output is `sind <version> (<commit>)`. For dev builds `git describe --tags --always --dirty` is used as the version, embedding tag distance and commit hash directly: `sind 0.5.0-3-gabc1234-dirty`. The `--json` flag outputs all fields as JSON.
sind get munge-key outputs the cluster's munge key encoded as base64, suitable for injection into external management tooling.
sind get ssh-config outputs the path to the SSH config file for the current realm. Add it as an Include in ~/.ssh/config to enable direct SSH access to nodes.
sind get mesh shows mesh infrastructure info: network name, DNS container/IP/zone/image, SSH container/volume/image. Useful for external consumers that need to connect to sind networks.
sind get ssh-private-key, sind get ssh-public-key, and sind get ssh-known-hosts dump SSH credentials to stdout. This replaces the need to extract files from Docker volumes.
Commands accepting node arguments use DNS-style names with optional nodeset expansion.
<role>.<cluster>
<role>-<N>.<cluster>
The cluster suffix defaults to .default if omitted.
Nodeset notation (as used in Slurm, pdsh, ClusterShell) is supported for specifying multiple nodes:
| Pattern | Expansion |
|---|---|
| `worker-[0-3]` | worker-0, worker-1, worker-2, worker-3 |
| `worker-[0,2,4]` | worker-0, worker-2, worker-4 |
| `worker-[0-2,5]` | worker-0, worker-1, worker-2, worker-5 |
| `worker-[0-1].dev` | worker-0.dev, worker-1.dev |
Multiple nodesets can be comma-separated:
sind power shutdown controller,worker-[0-3]
sind power cycle worker-[0-1].dev,worker-[0-3].default

sind power shutdown controller                 # controller.default
sind power cycle worker-0 # worker-0.default
sind power freeze worker-[0-3].dev # 4 nodes in dev cluster
sind power reboot controller,worker-[0-1]      # multiple nodes in default

The simplest valid configuration creates a minimal cluster with 1 controller and 1 worker node using the generic sind-node image:

kind: Cluster

This is equivalent to:
kind: Cluster
name: default
defaults:
  image: ghcr.io/gsi-hpc/sind-node:latest
nodes:
  - role: controller
  - role: worker

When `defaults.image` is omitted, sind uses the generic image `ghcr.io/gsi-hpc/sind-node:latest`.
Nodes can be specified in short form when only role (and optionally count) are needed:
nodes:
  - controller          # just the role
  - submitter           # optional roles work too
  - worker: 3           # role with count

This is equivalent to:
nodes:
  - role: controller
  - role: submitter
  - role: worker
    count: 3

The shorthand and full forms can be mixed in the same configuration.
kind: Cluster
name: test-cluster                          # default: "default"
realm: sind                                 # default: "sind"
defaults:
  image: ghcr.io/gsi-hpc/sind-node:25.11.2  # default: sind-node:latest
  tmpSize: 256m                             # per-node /tmp tmpfs size
  cpus: 1                                   # container CPU limit
  memory: 512m                              # container memory limit
storage:
  dataStorage:
    type: volume                            # volume | hostPath
    hostPath: ./data                        # only if type=hostPath
    mountPath: /data                        # default: /data
slurm:
  main: |                                   # appended to slurm.conf
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory
  cgroup: |                                 # appended to cgroup.conf
    ConstrainCores=yes
nodes:
  - role: controller
    tmpSize: 512m                           # override default
    cpus: 1
    memory: 1g
  - role: submitter                         # optional, at most one
  - role: worker
    count: 3                                # default: 1
    cpus: 2
    memory: 1g
  - role: worker
    count: 2
    managed: false                          # slurmd not started, not in slurm.conf

The `slurm` key contains named sections that map to Slurm config files. Each section supports two forms:
- String: content appended directly to the config file
- Map: each key creates a fragment in a `.conf.d/` directory, included via explicit `include` directives per fragment file
| Section | Config file | sind generates defaults |
|---|---|---|
| `main` | `slurm.conf` | yes |
| `cgroup` | `cgroup.conf` | yes |
| `gres` | `gres.conf` | no |
| `topology` | `topology.conf` | no |
| `plugstack` | `plugstack.conf` | yes (always scaffolded) |
String form — content appended to the config file:
slurm:
  main: |
    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory
  cgroup: |
    ConstrainCores=yes

Map form — named fragments in a `.conf.d/` directory:
slurm:
  main:
    scheduling: |
      SchedulerType=sched/backfill
      SchedulerParameters=bf_continue
    resources: |
      SelectType=select/cons_tres

This produces:
/etc/slurm/
├── slurm.conf # sind defaults + explicit includes per fragment
├── slurm.conf.d/
│ ├── resources.conf
│ └── scheduling.conf
├── sind-nodes.conf
├── cgroup.conf
├── plugstack.conf # always: include plugstack.conf.d/*
└── plugstack.conf.d/
plugstack.conf is always created with an include plugstack.conf.d/* directive, and PlugStackConfig is always set in slurm.conf. This allows SPANK plugins to be dropped in without additional configuration.
Standalone sections (gres, topology) are only created when configured. They require enabling in slurm.conf (e.g., GresTypes=gpu, TopologyPlugin=topology/tree) via the main section.
Validation rules:
- Fragment names must be plain filenames (no path separators)
- Fragment names and content must not be empty
| Role | Count | Required | Slurm Daemons | Description |
|---|---|---|---|---|
| `controller` | exactly 1 | yes | slurmctld | Cluster controller |
| `submitter` | 0-1 | no | none (clients only) | Job submission node |
| `worker` | 1+ | yes | slurmd | Worker nodes |
| Parameter | Scope | Default | Description |
|---|---|---|---|
| `image` | global + per-node | `ghcr.io/gsi-hpc/sind-node:latest` | Container image |
| `tmpSize` | global + per-node | `256m` | tmpfs size for /tmp |
| `cpus` | global + per-node | `1` | CPU limit |
| `memory` | global + per-node | `512m` | Memory limit |
| `capAdd` | global + per-node | none | Extra Linux capabilities (e.g. SYS_ADMIN) |
| `capDrop` | global + per-node | none | Dropped Linux capabilities |
| `devices` | global + per-node | none | Host devices to expose (e.g. /dev/fuse) |
| `securityOpt` | global + per-node | none | Extra security options |
| `count` | worker only | `1` | Number of worker nodes |
| `managed` | worker only | `true` | Start slurmd and add to slurm.conf |
Per-node scalar values override the defaults section. List fields (capAdd, capDrop, devices, securityOpt) are merged with defaults rather than replacing them.
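A sketch of these override/merge semantics; the struct and field names are simplified stand-ins, not the actual sind configuration types:

```go
package config // illustrative package name

// NodeSpec is a simplified stand-in for a node's resolved settings.
type NodeSpec struct {
	CPUs        int
	Memory      string
	CapAdd      []string
	SecurityOpt []string
}

// merge applies the defaults section to a per-node spec: unset scalars fall
// back to the defaults, list fields are appended rather than replaced.
func merge(def, node NodeSpec) NodeSpec {
	out := node
	if out.CPUs == 0 {
		out.CPUs = def.CPUs
	}
	if out.Memory == "" {
		out.Memory = def.Memory
	}
	out.CapAdd = append(append([]string{}, def.CapAdd...), node.CapAdd...)
	out.SecurityOpt = append(append([]string{}, def.SecurityOpt...), node.SecurityOpt...)
	return out
}
```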
- `nodes` - optional; if omitted, creates 1 controller + 1 worker
- `role: controller` - exactly one (auto-created if nodes omitted)
- `role: submitter` - at most one
- `role: worker` - at least one (auto-created if nodes omitted)
- `count` - only valid for worker role
| Type | Name Pattern | Example (sind create cluster dev) |
|---|---|---|
| Network | `<realm>-<cluster>-net` | `sind-dev-net` |
| Controller | `<realm>-<cluster>-controller` | `sind-dev-controller` |
| Submitter | `<realm>-<cluster>-submitter` | `sind-dev-submitter` |
| Worker | `<realm>-<cluster>-worker-<N>` | `sind-dev-worker-0` |
| Config volume | `<realm>-<cluster>-config` | `sind-dev-config` |
| Munge volume | `<realm>-<cluster>-munge` | `sind-dev-munge` |
| Data volume | `<realm>-<cluster>-data` | `sind-dev-data` |
| Type | Name Pattern | Example |
|---|---|---|
| Mesh network | `<realm>-mesh` | `sind-mesh` |
| DNS container | `<realm>-dns` | `sind-dns` |
| SSH container | `<realm>-ssh` | `sind-ssh` |
| SSH volume | `<realm>-ssh-config` | `sind-ssh-config` |
The default realm is sind and the default cluster name is default, resulting in prefixes like sind-default-*.
| Volume | Mount Point | Controller | Worker | Submitter |
|---|---|---|---|---|
| `sind-<cluster>-config` | `/etc/slurm` | rw | ro | ro |
| `sind-<cluster>-munge` | `/etc/munge` | ro | ro | ro |
| `sind-<cluster>-data` | `/data` | rw | rw | rw |
| tmpfs | `/tmp` | per-node | per-node | per-node |
SELinux relabeling (:z) is not used because containers run with --security-opt label=disable. This avoids expensive recursive relabeling of bind-mounted host directories.
Container mount flags:
-v sind-<cluster>-config:/etc/slurm:rw # controller
-v sind-<cluster>-config:/etc/slurm:ro # all others
-v sind-<cluster>-munge:/etc/munge:ro # all nodes
-v sind-<cluster>-data:/data:rw # all nodes
--tmpfs /tmp:rw,nosuid,nodev,size=1g # configurable size
--tmpfs /run:exec,mode=755 # systemd runtime
--tmpfs /run/lock # systemd lock files
By default, sind create cluster bind-mounts the current working directory as /data on all nodes:
-v /absolute/path/to/cwd:/data:rw
The --data flag controls the mount source:
- `--data .` (default) — bind-mount the current working directory
- `--data /path` — bind-mount a specific host directory
- `--data volume` — use a Docker-managed volume (`sind-<cluster>-data`)
When a YAML config specifies storage.dataStorage, the config takes precedence over --data.
The resolved host path is stored on each container as the sind.data.hostpath label so that
dynamically added workers (sind create worker) inherit the same mount.
sind applies labels to containers for filtering and metadata:
| Label | Example | Description |
|---|---|---|
| `sind.realm` | `sind` | Realm namespace |
| `sind.cluster` | `dev` | Cluster name |
| `sind.role` | `worker` | Node role |
| `sind.slurm.version` | `25.11.4` | Slurm version |
| `sind.data.hostpath` | `/home/user/project` | Resolved data mount host path |
sind enter and sind exec run commands directly inside the target container via docker exec
with the working directory set to /data. This means commands operate on the shared data mount.
sind ssh continues to use the SSH relay container for full SSH access (port forwarding, etc.).
Each cluster has an isolated Docker bridge network:
- Name: `sind-<cluster>-net`
- Nodes can reach each other by container hostname
All clusters automatically join a shared mesh network for cross-cluster communication:
| Event | Result |
|---|---|
| First cluster created | Creates sind-mesh network, starts sind-dns |
| Subsequent clusters | Connects cluster nodes to sind-mesh, updates DNS |
| Cluster deleted | Disconnects cluster nodes, updates DNS |
| Last cluster deleted | Removes sind-dns and sind-mesh network |
The sind-dns container (CoreDNS) provides name resolution across meshed clusters using a realm-aware zone:
<realm>.sind:53
Records follow the pattern:
<role>.<cluster>.<realm>.sind → container IP
Nodes are configured with:
--dns <sind-dns-ip>
--dns-search <cluster>.<realm>.sind
The DNS container is lightweight and does not run systemd/sshd.
The sind-ssh container provides SSH access to all cluster nodes. It is a lightweight container (no systemd) that runs on the mesh network.
| Resource | Purpose |
|---|---|
| `sind-ssh` container | SSH client for accessing nodes |
| `sind-ssh-config` volume | SSH keypair and known_hosts |
The sind-ssh-config volume contains:
| File | Description |
|---|---|
| `id_ed25519` | Private key (generated on first cluster creation) |
| `id_ed25519.pub` | Public key (injected into node images) |
| `known_hosts` | Host keys of all nodes (updated dynamically) |
| Event | Result |
|---|---|
| First cluster created | Creates sind-ssh-config volume, generates keypair, starts sind-ssh container |
| Node created | Collects sshd host key, appends to known_hosts |
| Node deleted | Removes entry from known_hosts |
| Last cluster deleted | Removes sind-ssh container and sind-ssh-config volume |
When sind creates a node, it waits for sshd to start, then collects the host key:
docker exec <node> cat /etc/ssh/ssh_host_ed25519_key.pub

The key is added to known_hosts with the node's DNS name:
controller.dev.sind.sind ssh-ed25519 AAAA...
worker-0.dev.sind.sind ssh-ed25519 AAAA...
The public key from sind-ssh-config is injected into nodes via:
docker exec <node> mkdir -p /root/.ssh
docker exec <node> sh -c 'cat >> /root/.ssh/authorized_keys' < pubkey

This happens after container start, before host key collection.
sind only configures SSH access for the root user. Additional user management (creating users, distributing SSH keys, configuring sudo, etc.) is left to the user.
sind ssh executes SSH via the sind-ssh container:
sind ssh [SSH_OPTIONS] NODE [-- COMMAND [ARGS...]]

Internally:
docker exec -it sind-ssh ssh [SSH_OPTIONS] <node>.<realm>.sind [COMMAND [ARGS...]]

All SSH options and arguments are passed through verbatim. Examples:
sind ssh worker-0 # interactive shell
sind ssh worker-0.dev # node in dev cluster
sind ssh -v worker-0 # verbose SSH
sind ssh worker-0 -- hostname # run command
sind ssh -t worker-0 -- top # force TTY allocation
sind ssh -L 8080:localhost:80 controller      # port forwarding

sind exports SSH configuration per realm to `$XDG_STATE_HOME/sind/<realm>/` (defaulting to `~/.local/state/sind/<realm>/`) for integration with the user's SSH client:
| File | Description |
|---|---|
| `ssh_config` | SSH config snippet |
| `id_ed25519` | Private key (copy from volume) |
| `known_hosts` | Host keys (copy from volume) |
The generated ssh_config (for default realm sind):
CanonicalizeHostname yes
CanonicalDomains default.sind.sind sind.sind
CanonicalizeMaxDots 2
Host *.sind.sind
ProxyCommand docker exec -i sind-ssh bash -c 'exec 3<>/dev/tcp/%h/22; cat <&3 & cat >&3; kill $!'
IdentityFile ~/.local/state/sind/sind/id_ed25519
UserKnownHostsFile ~/.local/state/sind/sind/known_hosts
User root
StrictHostKeyChecking yes
The Canonicalize* directives enable short-name resolution for the default realm: ssh controller expands to controller.default.sind.sind, and ssh controller.dev expands to controller.dev.sind.sind. For custom realms, the CanonicalDomains list reflects that realm's clusters.
To find the path for a realm, use sind get ssh-config. Add to the top of ~/.ssh/config (before any Host or Match blocks) for a single realm:
Include ~/.local/state/sind/sind/ssh_config
Or include all realms at once using a wildcard (supported by OpenSSH's Include):
Include ~/.local/state/sind/*/ssh_config
This allows direct use of standard SSH tools:
ssh controller.default.sind.sind
ssh worker-0.dev.sind.sind hostname
scp file.txt controller.dev.sind.sind:/tmp/

sind updates these files automatically when clusters or nodes are created/deleted. When the last cluster in a realm is deleted, the files and realm directory are removed.
Interactive sessions are routed based on cluster configuration:
| Command | Target Node |
|---|---|
| `sind ssh <node>` | explicit node |
| `sind enter [cluster]` | submitter (if exists) → controller |
| `sind exec [cluster] -- <cmd>` | submitter (if exists) → controller |
`sind enter` opens an interactive shell on the submitter (or controller if no submitter is configured). Equivalent to `sind ssh submitter` or `sind ssh controller`.
`sind exec` runs a one-shot command. Equivalent to `sind ssh <target> -- <cmd>`.
sind provides a generic multi-role image that works for all node types:
ghcr.io/gsi-hpc/sind-node:latest
ghcr.io/gsi-hpc/sind-node:<slurm-version>
This is the default image when defaults.image is not specified in the cluster configuration.
The generic image:
- Based on Rocky Linux 10
- Builds Slurm, OpenMPI, PMIx, PRRTE, and UCX from source
- Contains all Slurm daemons (slurmctld, slurmd) and a full MPI stack
- Slurm is built with `--with-pmix` for native PMIx job launch support
- sind enables the appropriate services based on node role
The Dockerfile uses a multi-stage build with a shared builder-base stage. UCX and PMIx build in parallel, PRRTE and Slurm depend on PMIx, and OpenMPI depends on all three. Component versions are pinned as ARG defaults in the Dockerfile and mirrored in docker-bake.hcl.
Custom images must provide:
All roles:
- systemd as init (PID 1)
- sshd service (enabled, sind injects authorized_keys at runtime)
- munge service (enabled)
- Slurm client tools (srun, sbatch, squeue, etc.)
Per-role requirements:
| Role | Additional Requirements |
|---|---|
| controller | slurmctld (installed, not enabled) |
| worker | slurmd (installed, not enabled) |
| submitter | Slurm client tools only |
sind enables Slurm services at container start based on the node's role. Services should be installed but not enabled in the image.
Example Dockerfiles are provided in the images/ directory.
During sind create cluster, before starting any containers, sind generates a random munge key and writes it to the sind-<cluster>-munge volume. This ensures all nodes share the same key from first boot.
sind auto-generates a minimal Slurm configuration based on cluster topology and writes it to the sind-<cluster>-config volume.
sind generates a multi-file configuration structure:
/etc/slurm/
├── slurm.conf # main config
├── sind-nodes.conf # sind-managed node definitions
├── cgroup.conf # cgroupv2 configuration
├── plugstack.conf # SPANK plugin config (always created)
├── plugstack.conf.d/ # SPANK plugin fragments (always created)
├── slurm.conf.d/ # main config fragments (if slurm.main is a map)
├── cgroup.conf.d/ # cgroup fragments (if slurm.cgroup is a map)
├── gres.conf # generic resources (if slurm.gres is set)
└── topology.conf # network topology (if slurm.topology is set)
The main slurm.conf always contains:
include /etc/slurm/sind-nodes.conf
PlugStackConfig=/etc/slurm/plugstack.conf
This file contains node and partition definitions for sind-managed nodes. sind assumes exclusive ownership of this file:
- `sind create cluster` generates initial node definitions here
- `sind create worker` appends new nodes (unless `--unmanaged`)
- `sind delete worker` removes nodes (for managed nodes)
Users should not edit sind-nodes.conf directly. To add custom node definitions, create a separate file and add an include directive to slurm.conf.
Nodes with managed: false in the cluster config are excluded from sind-nodes.conf.
sind generates a cgroup.conf for cgroupv2 support on worker nodes. This enables resource isolation and accounting for jobs.
sind delivers a working starter configuration. The slurm config key allows extending it declaratively at creation time (see Slurm Configuration Sections above). For post-creation changes, the /etc/slurm volume is writable on the controller node.
Users may:
- Use `slurm.main`, `slurm.cgroup`, etc. to extend config at creation time
- Edit config files directly after creation (sind does not modify them after creation)
- Add additional include files for custom configuration
- Replace the entire configuration (but `sind create worker` will then fail for managed nodes)
sind does not manage Slurm versions directly—the version is implicit in the chosen container images. However, sind discovers the Slurm version before cluster creation to:
- Generate version-appropriate configuration (slurm.conf)
- Display version information in CLI output
- Store version metadata on containers and volumes
Before creating any cluster resources, sind runs an ephemeral container to discover the Slurm version:
docker run --rm <image> scontrol --version
# Output: "slurm 25.11.0"

This happens once per unique image in the cluster configuration. The discovered version is then stored as labels on cluster resources:
--label sind.slurm.version=25.11.0
When the cluster configuration specifies multiple images (e.g., different images per role), sind discovers the version from each unique image. If images report different Slurm versions, sind logs a warning but continues with cluster creation. The controller image's version is used for configuration generation.
Mismatched Slurm versions can cause subtle runtime issues, but users may have legitimate reasons for mixed versions (e.g., testing rolling upgrades).
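A sketch of parsing the discovered version string from the `scontrol --version` output shown above (illustrative, not the actual sind code):

```go
package main

import (
	"fmt"
	"strings"
)

// parseSlurmVersion extracts "25.11.0" from output like "slurm 25.11.0".
func parseSlurmVersion(output string) (string, error) {
	fields := strings.Fields(strings.TrimSpace(output))
	if len(fields) < 2 || fields[0] != "slurm" {
		return "", fmt.Errorf("unexpected scontrol --version output: %q", output)
	}
	return fields[1], nil
}

func main() {
	v, _ := parseSlurmVersion("slurm 25.11.0")
	fmt.Println(v) // 25.11.0
}
```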
sind maintains awareness of version-specific configuration changes and generates compatible slurm.conf. This includes handling deprecated parameters and new required parameters across Slurm versions.
The mesh DNS uses a realm-aware hierarchical namespace:
<role>.<cluster>.<realm>.sind
<role>-<N>.<cluster>.<realm>.sind
The hierarchy is: node . cluster . realm . sind
Each realm gets its own CoreDNS zone (<realm>.sind), and nodes within a cluster are configured with --dns-search <cluster>.<realm>.sind so short names resolve within the cluster.
Examples (default realm sind):
- `controller.default.sind.sind`
- `submitter.default.sind.sind`
- `worker-0.default.sind.sind`
- `worker-1.default.sind.sind`
- `controller.dev.sind.sind`
Examples (custom realm ci-42):
- `controller.default.ci-42.sind`
- `worker-0.dev.ci-42.sind`
Within a cluster, short names resolve via the search domain: a node in the dev cluster of realm sind can reach controller without the full controller.dev.sind.sind.
Mutating operations acquire a per-realm advisory lock (flock) to prevent concurrent modifications to shared realm state. The lock file is stored at:
$XDG_STATE_HOME/sind/<realm>/lock # default: ~/.local/state/sind/<realm>/lock
- `sind create cluster`
- `sind delete cluster` (single and `--all`)
- `sind create worker`
- `sind delete worker`
Read-only operations (get, logs, ssh, etc.) do not acquire the lock.
- Lock is attempted non-blocking first; if free, the operation proceeds immediately
- If another operation holds the lock, sind logs `"waiting for another operation to complete"` (info level) and blocks until the lock is released
- Lock is released when the operation completes (success or failure)
- Context cancellation (e.g., Ctrl+C) unblocks a waiting operation
Locks are per-realm. Operations in different realms run concurrently without contention. This makes realm-based CI isolation safe for parallel jobs.
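A sketch of the lock acquisition behaviour using golang.org/x/sys/unix (simplified: the info-level log is abbreviated and context cancellation of a blocked wait is omitted):

```go
package realmlock // illustrative package name

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

// acquire takes the per-realm advisory lock: non-blocking first, then blocking
// if another operation holds it. The returned function releases the lock.
func acquire(path string) (release func(), err error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	if err := unix.Flock(int(f.Fd()), unix.LOCK_EX|unix.LOCK_NB); err != nil {
		fmt.Fprintln(os.Stderr, "waiting for another operation to complete")
		if err := unix.Flock(int(f.Fd()), unix.LOCK_EX); err != nil {
			f.Close()
			return nil, err
		}
	}
	return func() {
		unix.Flock(int(f.Fd()), unix.LOCK_UN)
		f.Close()
	}, nil
}
```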
Planned commands for suspending and resuming clusters without destroying them:
sind stop cluster [NAME] # stop all containers, preserve volumes
sind start cluster [NAME]            # start previously stopped cluster

stop cluster:
- Stops all node containers (`docker stop`)
- Preserves all volumes (config, munge, data)
- Preserves network configuration
- Cluster appears as "stopped" in `sind get clusters`
start cluster:
- Starts previously stopped containers
- Nodes rejoin mesh network
- DNS records restored
- Slurm daemons resume normal operation
This enables resource conservation when clusters are not actively in use without losing cluster state or configuration.
Planned support for a dedicated database node role:
nodes:
  - role: db              # slurmdbd + MariaDB
  - role: controller
  - role: worker
    count: 3

The db role would run slurmdbd and MariaDB for job accounting. sind would:
- Generate `slurmdbd.conf` with appropriate settings
- Configure `slurm.conf` to use the accounting database
- Initialize the MariaDB database schema
This enables testing of Slurm accounting features and multi-cluster federation scenarios.