Merged
10 changes: 1 addition & 9 deletions .github/workflows/build-artifacts.yml
@@ -100,10 +100,7 @@ jobs:
defaults:
run:
working-directory: runner
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ubuntu-latest, macos-latest]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Go
@@ -126,11 +123,6 @@ jobs:
if [[ "${{ inputs.go-integration-tests }}" == "true" ]]; then
SHORT=""
fi
# Skip failing integration tests on macOS for release builds.
# TODO: https://github.com/dstackai/dstack/issues/3005
if [[ "${{ inputs.staging }}" == "false" && "${{ startsWith(matrix.os, 'macos') }}" == "true" ]]; then
SHORT="-short"
fi
go version
go fmt $(go list ./... | grep -v /vendor/)
go vet $(go list ./... | grep -v /vendor/)
2 changes: 1 addition & 1 deletion mkdocs/docs/concepts/fleets.md
@@ -103,7 +103,7 @@ To create a fleet, define its configuration in a YAML file. The filename must en
`dstack apply` automatically connects to on-prem servers, installs the required dependencies, and adds them to the created fleet.

??? info "Host requirements"
1. Hosts must be pre-installed with Docker.
1. Hosts must be Linux-based and have Docker pre-installed.

=== "NVIDIA"
2. Hosts with NVIDIA GPUs must also be pre-installed with CUDA 12.1 and
1 change: 0 additions & 1 deletion mkdocs/docs/reference/env.md
@@ -157,7 +157,6 @@ For more details on the options below, refer to the [server deployment](../guide
* `DSTACK_SHIM_DOWNLOAD_URL` – Overrides `dstack-shim` binary download URL. The URL can contain `{version}` and/or `{arch}` placeholders,
see `DSTACK_RUNNER_DOWNLOAD_URL` for the details.
* `DSTACK_DEFAULT_CREDS_DISABLED` – Disables default credentials detection if set. Defaults to `None`.
* `DSTACK_LOCAL_BACKEND_ENABLED` – Enables local backend for debug if set. Defaults to `None`.

## CLI

18 changes: 17 additions & 1 deletion runner/.justfile
@@ -39,6 +39,9 @@ export shim_download_url := "s3://" + s3_bucket + "/" + version + "/binaries/dst
export shim_os := ""
export shim_arch := ""

# Go toolchain image for running tests in a container (keep in sync with go.mod)
export go_version := env("DSTACK_GO_VERSION", "1.25")

# Build runner
[private]
build-runner-binary:
@@ -78,10 +81,23 @@ clean-runner:
rm -f {{source_directory()}}/cmd/shim/shim
echo "Build artifacts cleaned!"

# Run tests for runner and shim
# Run tests for runner and shim (native; requires a Linux host)
test-runner:
cd {{source_directory()}} && go test -v ./...

# Run tests for runner and shim in a Linux container (use on macOS/Windows, where native builds are not available)
# Examples:
# just test-runner-in-container # short suite, all packages
# just test-runner-in-container -run TestPullImage ./internal/shim/
test-runner-in-container *args="-short ./...":
docker run --rm -t \
-v {{source_directory()}}:/src -w /src \
-v dstack-go-mod:/go/pkg/mod \
-v dstack-go-build:/root/.cache/go-build \
-v /var/run/docker.sock:/var/run/docker.sock \
golang:{{go_version}} \
go test -race {{args}}

# Validate shim is built for linux/amd64
[private]
validate-shim-binary:
66 changes: 18 additions & 48 deletions runner/README.md
@@ -2,9 +2,19 @@

For overview of `dstack-shim` and `dstack-runner`, see [/contributing/RUNNER-AND-SHIM.md](../contributing/RUNNER-AND-SHIM.md).

## Running locally
`dstack-shim` and `dstack-runner` can be built only for GOOS=linux. Use containers for development on other OS.

Here's the steps to build `dstack-shim` and `dstack-runner` and run `dstack` with them locally:
## Testing locally

Run shim and runner tests on any OS inside a Docker container:

```shell
just test-runner-in-container
```

## Running locally (standalone)

Build `dstack-shim` and `dstack-runner` and run them locally:

1. Build the runner executable

@@ -13,8 +23,6 @@ Here's the steps to build `dstack-shim` and `dstack-runner` and run `dstack` wit
go build
```

Note: The runner runs inside the Docker container, so ensure it's compiled for linux/amd64. For example, on macOS you'd run `GOOS=linux GOARCH=amd64 go build`.

2. Build the shim executable

```shell
@@ -40,52 +48,14 @@ Now you can call shim API:
>>> s.submit("","", "ubuntu", None)
```

### Local backend

You can also run `dstack` end-to-end with local shim and runner by enabling the `local` backend on dstack server:

```shell
DSTACK_LOCAL_BACKEND_ENABLED= dstack server --log-level=debug
```

The `local` backend will submit the run to the locally started shim and runner. The CLI will attach to the container just as if it were any other cloud backend:

```shell
✗ dstack apply .
Configuration .dstack.yml
Project main
User admin
Pool name default-pool
Min resources 2..xCPU, 4GB..
Max price -
Max duration 6h
Spot policy auto
Retry policy yes
Creation policy reuse-or-create
Termination policy destroy-after-idle
Termination idle time 300s

# BACKEND REGION INSTANCE RESOURCES SPOT PRICE
1 local local local 4xCPU, 8GB, 100GB no $0
(disk)
2 azure westeurope Standard_D2s_v3 2xCPU, 8GB, 100GB yes $0.012
(disk)
3 azure westeurope Standard_E2s_v4 2xCPU, 16GB, 100GB yes $0.015246
(disk)
...
Shown 3 of 4041 offers, $56.6266 max

Continue? [y/n]:
```

## Testing remotely
## Running with `dstack`

You can also test the built shim and runner using standard backends (including SSH fleets).
You can test the built shim and runner with `dstack` using standard backends (including SSH fleets).

> [!NOTE]
> To run with standard backends, both the runner and shim must be built for linux/amd64.
> To run with standard backends, both the runner and shim must be built for linux.

Build the runner and shim, and upload them to S3 automatically using `just` (see [`justfile`](justfile)).
Build the runner and shim and upload them to S3 using `just` (see [`justfile`](justfile)).

> [!IMPORTANT]
> Before running any `just` commands that upload to S3, you must set the following environment variables:
@@ -101,7 +71,7 @@ Build the runner and shim, and upload them to S3 automatically using `just` (see
just upload
```

To use the built shim and runner with the dstack server, pass the URLs via `DSTACK_SHIM_DOWNLOAD_URL` and `DSTACK_RUNNER_DOWNLOAD_URL`:
To use the built shim and runner with the `dstack` server, pass the URLs via `DSTACK_SHIM_DOWNLOAD_URL` and `DSTACK_RUNNER_DOWNLOAD_URL`:

```shell
export DSTACK_SHIM_DOWNLOAD_URL="https://${DSTACK_SHIM_UPLOAD_S3_BUCKET}.s3.amazonaws.com/${DSTACK_SHIM_UPLOAD_VERSION}/binaries/dstack-shim-linux-amd64"
@@ -112,7 +82,7 @@ dstack server --log-level=debug

## Dependencies (WIP)

These are nonexhaustive lists of external dependencies (executables, libraries) of the `dstack-*` binaries.
These are non-exhaustive lists of external dependencies (executables, libraries) of the `dstack-*` binaries.

**TODO**: inspect codebase, add missing dependencies.

4 changes: 4 additions & 0 deletions runner/cmd/runner/main.go
@@ -1,3 +1,7 @@
//go:build linux

// dstack-runner is supported only in Linux environments.

package main

import (
4 changes: 4 additions & 0 deletions runner/cmd/shim/main.go
@@ -1,3 +1,7 @@
//go:build linux

// dstack-shim is supported only in Linux environments.

package main

import (
32 changes: 8 additions & 24 deletions runner/internal/runner/executor/executor.go
@@ -11,7 +11,6 @@ import (
"os/exec"
"path"
"path/filepath"
"runtime"
"strconv"
"strings"
"sync"
@@ -90,34 +89,19 @@ type RunExecutor struct {
connectionTracker ConnectionTracker
}

// stubConnectionTracker is a no-op implementation for when procfs is not available (only required for tests on darwin)
type stubConnectionTracker struct{}

func (s *stubConnectionTracker) GetNoConnectionsSecs() int64 { return 0 }
func (s *stubConnectionTracker) Track(ticker <-chan time.Time) {}
func (s *stubConnectionTracker) Stop() {}

func NewRunExecutor(tempDir string, dstackDir string, currentUser linuxuser.User, sshd ssh.SshdManager) (*RunExecutor, error) {
mu := &sync.RWMutex{}
timestamp := NewMonotonicTimestamp()

// Try to initialize procfs, but don't fail if it's not available (e.g., on macOS)
var connectionTracker ConnectionTracker

if runtime.GOOS == "linux" {
proc, err := procfs.NewDefaultFS()
if err != nil {
return nil, fmt.Errorf("initialize procfs: %w", err)
}
connectionTracker = connections.NewConnectionTracker(connections.ConnectionTrackerConfig{
Port: uint64(sshd.Port()),
MinConnDuration: 10 * time.Second, // shorter connections are likely from dstack-server
Procfs: proc,
})
} else {
// Use stub connection tracker (only required for tests on darwin)
connectionTracker = &stubConnectionTracker{}
proc, err := procfs.NewDefaultFS()
if err != nil {
return nil, fmt.Errorf("initialize procfs: %w", err)
}
connectionTracker := connections.NewConnectionTracker(connections.ConnectionTrackerConfig{
Port: uint64(sshd.Port()),
MinConnDuration: 10 * time.Second, // shorter connections are likely from dstack-server
Procfs: proc,
})

return &RunExecutor{
tempDir: tempDir,
@@ -1,5 +1,3 @@
//go:build linux

package capabilities

import (
22 changes: 0 additions & 22 deletions runner/internal/runner/linux/capabilities/capabilities_darwin.go

This file was deleted.

7 changes: 0 additions & 7 deletions runner/internal/runner/metrics/metrics_test.go
@@ -1,17 +1,13 @@
package metrics

import (
"runtime"
"testing"

"github.com/dstackai/dstack/runner/internal/runner/schemas"
"github.com/stretchr/testify/assert"
)

func TestGetAMDGPUMetrics_OK(t *testing.T) {
if runtime.GOOS == "darwin" {
t.Skip("Skipping on macOS")
}
collector, err := NewMetricsCollector(t.Context())
assert.NoError(t, err)

@@ -43,9 +39,6 @@ func TestGetAMDGPUMetrics_OK(t *testing.T) {
}

func TestGetAMDGPUMetrics_ErrorGPUUtilNA(t *testing.T) {
if runtime.GOOS == "darwin" {
t.Skip("Skipping on macOS")
}
collector, err := NewMetricsCollector(t.Context())
assert.NoError(t, err)
metrics, err := collector.getAMDGPUMetrics("gpu,gfx,gfx_clock,vram_used,vram_total\n0,N/A,N/A,283,196300\n")
@@ -1,4 +1,4 @@
//go:build linux && cgo
//go:build cgo

package dcgm

@@ -1,4 +1,4 @@
//go:build linux && cgo
//go:build cgo

package dcgm

9 changes: 0 additions & 9 deletions runner/internal/shim/dcgm/wrapper_darwin.go

This file was deleted.

@@ -1,4 +1,4 @@
//go:build linux && !cgo
//go:build !cgo

package dcgm
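
Reviewer note: the `dcgm` files drop `linux` from their build constraints, while both `main` packages gain `//go:build linux`. To see how each constraint line in this PR evaluates for a given set of tags, the sketch below uses the standard `go/build/constraint` package. The `matches` helper and the tag sets are illustrative assumptions. A real build also sets tags for GOARCH, the compiler, and release versions, which are omitted here.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// matches reports whether a //go:build line is satisfied when exactly the
// given tags are set. GOOS values and "cgo" are treated as ordinary tags.
func matches(line string, tags ...string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	set := make(map[string]bool, len(tags))
	for _, t := range tags {
		set[t] = true
	}
	return expr.Eval(func(tag string) bool { return set[tag] }), nil
}

func main() {
	lines := []string{
		"//go:build linux",
		"//go:build linux && cgo", // before this PR
		"//go:build cgo",          // after this PR
		"//go:build !cgo",
	}
	envs := [][]string{{"linux", "cgo"}, {"linux"}, {"darwin", "cgo"}}
	for _, line := range lines {
		for _, env := range envs {
			ok, err := matches(line, env...)
			if err != nil {
				fmt.Println("parse error:", err)
				continue
			}
			fmt.Printf("%-24s tags=%v -> %v\n", line, env, ok)
		}
	}
}
```

With `darwin` and `cgo` set, the relaxed `//go:build cgo` line matches where `linux && cgo` did not. That appears acceptable here because the commands that import these packages are now restricted to Linux.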

9 changes: 1 addition & 8 deletions runner/internal/shim/docker.go
@@ -13,7 +13,6 @@ import (
"os/exec"
"os/user"
"path/filepath"
rt "runtime"
"strconv"
"strings"
"sync"
@@ -966,9 +965,6 @@ func (d *DockerRunner) startContainer(ctx context.Context, task *Task) error {
if err != nil {
return fmt.Errorf("inspect container: %w", err)
}
// FIXME: container_.NetworkSettings.Ports values (bindings) are not immediately available
// on macOS, so ports can be empty with local backend.
// Workaround: restart shim after submitting the run.
task.ports = extractPorts(ctx, container_.NetworkSettings.Ports)
return nil
}
@@ -1083,10 +1079,7 @@ func extractPorts(ctx context.Context, portMap nat.PortMap) []PortMapping {
}

func getNetworkMode(networkMode NetworkMode) container.NetworkMode {
if rt.GOOS == "linux" {
return container.NetworkMode(networkMode)
}
return "default"
return container.NetworkMode(networkMode)
}

func configureGpuDevices(hostConfig *container.HostConfig, gpuDevices []GPUDevice) {
8 changes: 3 additions & 5 deletions runner/internal/shim/docker_test.go
@@ -4,8 +4,6 @@ import (
"context"
"encoding/hex"
"math/rand"
"os"
"runtime"
"sync"
"testing"
"time"
@@ -18,7 +16,7 @@ import (
// TestDocker_SSHServer pulls ubuntu image (without sshd), installs openssh-server and exits
// Basically, it indirectly tests a shell script generated by getSSHShellCommands
func TestDocker_SSHServer(t *testing.T) {
if testing.Short() || (os.Getenv("CI") == "true" && runtime.GOOS == "darwin") {
if testing.Short() {
t.Skip()
}
t.Parallel()
@@ -44,7 +42,7 @@ func TestDocker_SSHServer(t *testing.T) {
}

func TestDocker_ShmNoexecByDefault(t *testing.T) {
if testing.Short() || (os.Getenv("CI") == "true" && runtime.GOOS == "darwin") {
if testing.Short() {
t.Skip()
}
t.Parallel()
@@ -69,7 +67,7 @@ func TestDocker_ShmNoexecByDefault(t *testing.T) {
}

func TestDocker_ShmExecIfSizeSpecified(t *testing.T) {
if testing.Short() || (os.Getenv("CI") == "true" && runtime.GOOS == "darwin") {
if testing.Short() {
t.Skip()
}
t.Parallel()