1 change: 1 addition & 0 deletions K8S_VERSION
@@ -0,0 +1 @@
1.28.5
15 changes: 3 additions & 12 deletions README.md
@@ -1,19 +1,10 @@
# GRANNY Experiments

This repo contains the experiments for the [GRANNY paper](
https://www.usenix.org/conference/nsdi25/presentation/segarra).
This repo contains the experiments for the [Granny paper](https://arxiv.org/abs/2302.11358).

All instructions in this repo assume that you have checked-out the repository,
and activated the python virtual environment (requires `python3-venv`):
When following any instructions in this repository, it is recommended to have a dedicated terminal with the repository's virtual environment activated (`source ./bin/workon.sh`).

```bash
source ./bin/workon.sh
inv -l # shows the different tasks
```

The Granny source-code is merged into the Faasm [repository](
https://github.com/faasm/faasm) tag [`0.27.0`](
https://github.com/faasm/faasm/releases/tag/v0.27.0)
This virtual environment provides commands to provision and deprovision K8s clusters on Azure (with AKS), access low-level monitoring tools (we recommend `k9s`), deploy Faabric clusters, run the experiments, and plot the results.

## Experiments in this repository

1 change: 1 addition & 0 deletions config/granny_aks_kubelet_config.json
@@ -0,0 +1 @@
{ "allowedUnsafeSysctls": ["net.*"] }
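The pattern in `allowedUnsafeSysctls` tells the kubelet to accept any pod-requested sysctl under the `net.` prefix, which the network tuning in this PR relies on. As an illustration only (using Python's `fnmatch`, not the kubelet's actual matching code), the allow-list behaves like this:

```python
from fnmatch import fnmatch

# Pattern copied from granny_aks_kubelet_config.json
ALLOWED_UNSAFE_SYSCTLS = ["net.*"]


def sysctl_allowed(name):
    """Return True if a pod-requested sysctl matches an allowed pattern."""
    return any(fnmatch(name, pattern) for pattern in ALLOWED_UNSAFE_SYSCTLS)


print(sysctl_allowed("net.core.rmem_max"))  # True
print(sysctl_allowed("kernel.shmmax"))      # False
```

Pods must still request each sysctl explicitly via their `securityContext`; the allow-list only stops the kubelet from rejecting them as unsafe.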
13 changes: 13 additions & 0 deletions config/granny_aks_os_config.json
@@ -0,0 +1,13 @@
{
    "sysctls": {
        "netCoreRmemMax": 16777216,
        "netCoreWmemMax": 16777216,
        "netIpv4TcpRmem": "4096 87380 16777216",
        "netIpv4TcpWmem": "4096 65536 16777216",
        "netCoreNetdevMaxBacklog": "30000",
        "netCoreRmemDefault": 16777216,
        "netCoreWmemDefault": 16777216,
        "netIpv4TcpMem": "16777216 16777216 16777216",
        "netIpv4RouteFlush": 1
    }
}
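These values cap socket buffers at 16777216 bytes (16 MiB) and widen the TCP read/write buffer ranges, which is useful for the message-heavy experiments in this repo. A quick consistency sketch (illustrative, not part of the repo):

```python
# Values copied from granny_aks_os_config.json
os_config = {
    "netCoreRmemMax": 16777216,
    "netCoreWmemMax": 16777216,
    "netIpv4TcpRmem": "4096 87380 16777216",
    "netIpv4TcpWmem": "4096 65536 16777216",
}

MiB = 1024 * 1024

# The hard caps on socket buffers are 16 MiB
assert os_config["netCoreRmemMax"] == 16 * MiB
assert os_config["netCoreWmemMax"] == 16 * MiB

# TCP buffer triples are "min default max", must be non-decreasing,
# and the max should match the core cap
for key in ("netIpv4TcpRmem", "netIpv4TcpWmem"):
    low, default, high = (int(v) for v in os_config[key].split())
    assert low <= default <= high == 16 * MiB

print("network buffer values are consistent")
```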
4 changes: 4 additions & 0 deletions tasks/__init__.py
@@ -1,7 +1,9 @@
from invoke import Collection

from . import cluster
from . import docker
from . import format_code
from . import k8s

import logging

@@ -20,8 +22,10 @@
logging.getLogger().setLevel(logging.DEBUG)

ns = Collection(
    cluster,
    docker,
    format_code,
    k8s,
)

ns.add_collection(elastic_ns, name="elastic")
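`Collection` namespaces each module's tasks under the module name, which is why the READMEs invoke e.g. `inv cluster.provision` and `inv elastic.plot`. A rough sketch of the naming scheme (plain Python; `namespaced_tasks` is a hypothetical helper, not invoke's actual API):

```python
def namespaced_tasks(modules):
    """Map (module, [task, ...]) pairs to 'module.task' strings,
    mimicking how invoke namespaces a Collection (underscores become dashes)."""
    return [
        "{}.{}".format(mod, task.replace("_", "-"))
        for mod, tasks in modules
        for task in tasks
    ]


print(namespaced_tasks([("k8s", ["install_kubectl", "install_k9s"])]))
# ['k8s.install-kubectl', 'k8s.install-k9s']
```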
143 changes: 143 additions & 0 deletions tasks/cluster.py
@@ -0,0 +1,143 @@
from invoke import task
from os.path import join
from subprocess import run
from tasks.util.env import (
    ACR_NAME,
    AKS_CLUSTER_NAME,
    AKS_NODE_COUNT,
    AKS_REGION,
    AKS_VM_SIZE,
    AZURE_PUB_SSH_KEY,
    AZURE_RESOURCE_GROUP,
    CONFIG_DIR,
    KUBECTL_BIN,
)
from tasks.util.version import get_k8s_version


# AKS commandline reference here:
# https://docs.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest
def _run_aks_cmd(name, az_args=None):
    cmd = [
        "az",
        "aks {}".format(name),
        "--resource-group {}".format(AZURE_RESOURCE_GROUP),
    ]

    if az_args:
        cmd.extend(az_args)

    cmd = " ".join(cmd)
    print(cmd)
    run(cmd, shell=True, check=True)


@task
def list(ctx):
    """
    List all AKS resources
    """
    _run_aks_cmd("list")


@task(optional=["sgx"])
def provision(
    ctx,
    nodes=AKS_NODE_COUNT,
    vm=AKS_VM_SIZE,
    location=AKS_REGION,
    name=AKS_CLUSTER_NAME,
    sgx=False,
    granny=True,
):
    """
    Provision the AKS cluster
    """
    k8s_ver = get_k8s_version()
    sgx = sgx and (sgx.lower() != "false")
    granny_kubelet_config = join(CONFIG_DIR, "granny_aks_kubelet_config.json")
    granny_os_config = join(CONFIG_DIR, "granny_aks_os_config.json")

    if sgx and "Standard_DC" not in vm:
        print(
            "Error provisioning SGX cluster: only `Standard_DC` VMs are supported"
        )
        return

    _run_aks_cmd(
        "create",
        [
            "--name {}".format(name),
            "--node-count {}".format(nodes),
            "--node-vm-size {}".format(vm),
            "--os-sku Ubuntu",
            "--kubernetes-version {}".format(k8s_ver),
            "--ssh-key-value {}".format(AZURE_PUB_SSH_KEY),
            "--location {}".format(location),
            # Could not create a role assignment for ACR. Are you an Owner on this subscription?
            # "--attach-acr {}".format(ACR_NAME.split(".")[0]),
            "{}".format(
                "--kubelet-config {}".format(granny_kubelet_config)
                if granny
                else ""
            ),
            "{}".format(
                "--linux-os-config {}".format(granny_os_config)
                if granny
                else ""
            ),
            "{}".format(
                "--enable-addons confcom --enable-sgxquotehelper"
                if sgx
                else ""
            ),
        ],
    )


@task
def details(ctx):
    """
    Show the details of the cluster
    """
    _run_aks_cmd(
        "show",
        [
            "--name {}".format(AKS_CLUSTER_NAME),
        ],
    )


@task
def delete(ctx, name=AKS_CLUSTER_NAME):
    """
    Delete the AKS cluster
    """
    _run_aks_cmd(
        "delete",
        [
            "--name {}".format(name),
            "--yes",
        ],
    )


@task
def credentials(ctx, name=AKS_CLUSTER_NAME, out_file=None):
    """
    Get credentials for the AKS cluster
    """
    # Set up the credentials
    _run_aks_cmd(
        "get-credentials",
        [
            "--name {}".format(name),
            "--overwrite-existing",
            "--file {}".format(out_file) if out_file else "",
        ],
    )

    # Check we can access the cluster
    cmd = "{} get nodes".format(KUBECTL_BIN)
    print(cmd)
    run(cmd, shell=True, check=True)
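`_run_aks_cmd` builds the `az` invocation from a list of fragments and shells out immediately. A dry-run variant (hypothetical; `build_aks_cmd` and the placeholder resource group below are not part of this PR) shows the assembly on its own:

```python
AZURE_RESOURCE_GROUP = "faasm"  # placeholder; the real value comes from tasks.util.env


def build_aks_cmd(name, az_args=None):
    """Assemble the `az aks ...` command string without running it."""
    cmd = [
        "az",
        "aks {}".format(name),
        "--resource-group {}".format(AZURE_RESOURCE_GROUP),
    ]
    if az_args:
        cmd.extend(az_args)
    return " ".join(cmd)


print(build_aks_cmd("list"))
# az aks list --resource-group faasm
```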
7 changes: 6 additions & 1 deletion tasks/elastic/README.md
@@ -1,4 +1,4 @@
# Elastic Scaling Micro-Benchmark
# Elastic Scaling Micro-Benchmark (Fig.12)

In this experiment we measure the benefits of elastically scaling up OpenMP
applications to exploit idle resources. We run a pipelined algorithm
@@ -44,6 +44,11 @@ You may now plot the results using:
inv elastic.plot
```

The plot will be available in [`/plots/elastic/elastic_speedup.pdf`](/plots/elastic/elastic_speedup.pdf); we also include it below:

![Elastic Scaling Plot](/plots/elastic/elastic_speedup.png)


## Clean-Up

Finally, delete the Granny cluster:
91 changes: 91 additions & 0 deletions tasks/k8s.py
@@ -0,0 +1,91 @@
from invoke import task
from os.path import join, exists
from os import makedirs
from shutil import copy, rmtree
from subprocess import run

from tasks.util.env import (
    BIN_DIR,
    GLOBAL_BIN_DIR,
    K9S_VERSION,
)

from tasks.util.version import get_k8s_version


def _download_binary(url, binary_name):
    makedirs(BIN_DIR, exist_ok=True)
    cmd = "curl -LO {}".format(url)
    run(cmd, shell=True, check=True, cwd=BIN_DIR)
    run("chmod +x {}".format(binary_name), shell=True, check=True, cwd=BIN_DIR)

    return join(BIN_DIR, binary_name)


def _symlink_global_bin(binary_path, name):
    global_path = join(GLOBAL_BIN_DIR, name)
    if exists(global_path):
        print("Removing existing binary at {}".format(global_path))
        run(
            "sudo rm -f {}".format(global_path),
            shell=True,
            check=True,
        )

    print("Symlinking {} -> {}".format(global_path, binary_path))
    run(
        "sudo ln -s {} {}".format(binary_path, name),
        shell=True,
        check=True,
        cwd=GLOBAL_BIN_DIR,
    )


@task
def install_kubectl(ctx, system=False):
    """
    Install the k8s CLI (kubectl)
    """
    k8s_ver = get_k8s_version()
    url = "https://dl.k8s.io/release/v{}/bin/linux/amd64/kubectl".format(
        k8s_ver
    )

    binary_path = _download_binary(url, "kubectl")

    # Symlink for kubectl globally
    if system:
        _symlink_global_bin(binary_path, "kubectl")


@task
def install_k9s(ctx, system=False):
    """
    Install the K9s CLI
    """
    tar_name = "k9s_Linux_amd64.tar.gz"
    url = "https://github.com/derailed/k9s/releases/download/v{}/{}".format(
        K9S_VERSION, tar_name
    )
    print(url)

    # Download the TAR
    workdir = "/tmp/k9s-csg"
    makedirs(workdir, exist_ok=True)

    cmd = "curl -LO {}".format(url)
    run(cmd, shell=True, check=True, cwd=workdir)

    # Untar
    run("tar -xf {}".format(tar_name), shell=True, check=True, cwd=workdir)

    # Copy k9s into place
    binary_path = join(BIN_DIR, "k9s")
    copy(join(workdir, "k9s"), binary_path)

    # Remove tar
    rmtree(workdir)

    # Symlink for k9s command globally
    if system:
        _symlink_global_bin(binary_path, "k9s")
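`install_kubectl` pins the kubectl version to the one recorded in the `K8S_VERSION` file, so the client matches the AKS control plane provisioned by `cluster.provision`. The URL construction can be sketched as follows (the `arch` parameter is an added assumption, not in the original):

```python
def kubectl_url(k8s_ver, arch="amd64"):
    """Build the official kubectl release URL for a pinned version (sketch)."""
    return "https://dl.k8s.io/release/v{}/bin/linux/{}/kubectl".format(
        k8s_ver, arch
    )


# With the version pinned in the K8S_VERSION file:
print(kubectl_url("1.28.5"))
# https://dl.k8s.io/release/v1.28.5/bin/linux/amd64/kubectl
```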
9 changes: 5 additions & 4 deletions tasks/kernels_mpi/README.md
@@ -1,11 +1,11 @@
# ParRes Kernels Experiment (MPI)
# ParRes Kernels Experiment - MPI (Fig.9b)

This experiment runs a set of the [ParRes Kernels](https://github.com/ParRes/Kernels)
as a microbenchmark for Granny's MPI implementation.

## Start AKS cluster

In the `experiment-base` terminal, run:
Create a new cluster:

```bash
inv cluster.provision --vm Standard_D8_v5 --nodes 3 cluster.credentials
@@ -63,8 +63,9 @@ To plot the results, just run:
inv kernels-mpi.plot
```

the plot will be available in [`./plots/kernels-mpi/mpi_kernels_slowdown.pdf`](
./plots/kernels-mpi/mpi_kernels_slowdown.pdf).
The plot will be available in [`/plots/kernels-mpi/mpi_kernels_slowdown.pdf`](/plots/kernels-mpi/mpi_kernels_slowdown.pdf); we also include it below:

![MPI Kernels Slowdown Plot](/plots/kernels-mpi/mpi_kernels_slowdown.png)

## Clean-up

9 changes: 5 additions & 4 deletions tasks/kernels_omp/README.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
# ParRes Kernels Experiment (OpenMP)
# ParRes Kernels Experiment - OpenMP (Fig.10)

This experiment runs a set of the [ParRes Kernels](https://github.com/ParRes/Kernels)
as a microbenchmark for Granny's OpenMP implementation.

## Start AKS cluster

In the `experiment-base` terminal, run:
Create a new cluster:

```bash
inv cluster.provision --vm Standard_D8_v5 --nodes 2 cluster.credentials
@@ -63,8 +63,9 @@ To plot the results, just run:
inv kernels-omp.plot
```

the plot will be available in [`./plots/kernels-omp/openmp_kernels_slowdown.pdf`](
./plots/kernels-omp/openmp_kernels_slowdown.pdf).
The plot will be available in [`/plots/kernels-omp/openmp_kernels_slowdown.pdf`](/plots/kernels-omp/openmp_kernels_slowdown.pdf); we also include it below:

![OpenMP Kernels Slowdown Plot](/plots/kernels-omp/openmp_kernels_slowdown.png)

## Clean-up
