These instructions describe how to deploy a DAOS Cluster using the example in terraform/examples/daos_cluster.
Deployment tasks described in these instructions:
- Deploy a DAOS cluster using Terraform
- Log into the first DAOS client instance
- Perform DAOS administrative tasks to prepare the storage
- Mount a DAOS container with DFuse (DAOS FUSE)
- Store files in a DAOS container
- Unmount the container
- Undeploy DAOS cluster (terraform destroy)
The steps in the Pre-Deployment Guide must be completed prior to deploying the DAOS cluster in this example.
The Pre-Deployment Guide describes how to build the DAOS images that are used to deploy server and client instances.
Clone the daos-stack/google-cloud-daos repository and change your working directory to the DAOS Cluster example directory.
cd ~/
git clone https://github.com/daos-stack/google-cloud-daos.git
cd ~/google-cloud-daos/terraform/examples/daos_cluster
Before you run terraform apply to deploy the DAOS cluster, you need to create a terraform.tfvars file in the terraform/examples/daos_cluster directory.
The terraform.tfvars file contains the variable values for the configuration.
To ensure a successful deployment of a DAOS cluster there are two pre-configured terraform.tfvars.*.example files that you can choose from.
You will need to decide which of these files to copy to terraform.tfvars.
The terraform.tfvars.tco.example file contains variables for a DAOS cluster deployment with
- 16 DAOS Client instances
- 4 DAOS Server instances, each with sixteen 375GB NVMe SSDs
To use the terraform.tfvars.tco.example file run
cp terraform.tfvars.tco.example terraform.tfvars
The terraform.tfvars.perf.example file contains variables for a DAOS cluster deployment with
- 16 DAOS Client instances
- 4 DAOS Server instances, each with four 375GB NVMe SSDs
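For a rough comparison of the two example configurations, the nominal raw NVMe capacity can be worked out from the server and SSD counts they list. The sketch below is only arithmetic on those counts; the function name raw_nvme_gb is hypothetical, and the totals that dmg reports after deployment may differ from these nominal figures.

```shell
# Hypothetical helper: nominal raw NVMe capacity in GB,
# computed as servers * SSDs per server * GB per SSD.
raw_nvme_gb() {
  local servers="$1" ssds_per_server="$2" ssd_gb="$3"
  echo $(( servers * ssds_per_server * ssd_gb ))
}

raw_nvme_gb 4 16 375   # tco example:  4 servers x 16 SSDs, prints 24000
raw_nvme_gb 4 4 375    # perf example: 4 servers x 4 SSDs, prints 6000
```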
To use the terraform.tfvars.perf.example file run
cp terraform.tfvars.perf.example terraform.tfvars
Now that you have a terraform.tfvars file, you need to replace the variable placeholders in the file with the values from your active gcloud configuration.
To update the variables in terraform.tfvars run
PROJECT_ID=$(gcloud config list --format 'value(core.project)')
REGION=$(gcloud config list --format 'value(compute.region)')
ZONE=$(gcloud config list --format 'value(compute.zone)')
sed -i "s/<project_id>/${PROJECT_ID}/g" terraform.tfvars
sed -i "s/<region>/${REGION}/g" terraform.tfvars
sed -i "s/<zone>/${ZONE}/g" terraform.tfvars
Billing Notification!
Running this example will incur charges in your project.
To avoid surprises, be sure to monitor your costs associated with running this example.
Don't forget to shut down the DAOS cluster with terraform destroy when you are finished.
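Before deploying, it may be worth confirming that no <project_id>, <region>, or <zone> placeholders remain in terraform.tfvars. The following is a minimal sketch; the function name check_tfvars is hypothetical and not part of the repository. It only looks for the three placeholder strings, so it will not notice a value that was replaced with an empty string because the corresponding gcloud property was unset.

```shell
# Hypothetical helper: succeed only if the given tfvars file exists and
# contains no unresolved <project_id>, <region>, or <zone> placeholders.
check_tfvars() {
  local file="${1:-terraform.tfvars}"
  if [ ! -f "${file}" ]; then
    echo "File not found: ${file}" >&2
    return 1
  fi
  # grep prints any offending lines with their line numbers.
  if grep -nE '<(project_id|region|zone)>' "${file}"; then
    echo "Unresolved placeholders in ${file}" >&2
    return 1
  fi
  echo "No placeholders left in ${file}"
}
```

Running check_tfvars with no arguments in the terraform/examples/daos_cluster directory checks terraform.tfvars.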
To deploy the DAOS cluster run
terraform init
terraform plan -out=tfplan
terraform apply tfplan
Verify that the daos-client and daos-server instances are running.
gcloud compute instances list \
--filter="name ~ daos" \
  --format="value(name,INTERNAL_IP)"
Log into the first client instance
gcloud compute ssh daos-client-0001
The dmg command is used to perform administrative tasks such as formatting storage and managing pools and therefore must be run with sudo.
Use dmg to verify that the DAOS storage system is ready.
sudo dmg system query -v
The State column should display "Joined" for all servers.
Rank UUID                                 Control Address   Fault Domain      State  Reason
---- ----                                 ---------------   ------------      -----  ------
0    0796c576-5651-4e37-aa15-09f333d2d2b8 10.128.0.35:10001 /daos-server-0001 Joined
1    f29f7058-8abb-429f-9fd3-8b13272d7de0 10.128.0.77:10001 /daos-server-0003 Joined
2    09fc0dab-c238-4090-b3f8-da2bd4dce108 10.128.0.81:10001 /daos-server-0002 Joined
3    2cc9140b-fb12-4777-892e-7d190f6dfb0f 10.128.0.30:10001 /daos-server-0004 Joined
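If the servers are still starting, some ranks may not show "Joined" yet. The check can be scripted as in the sketch below, which reads the query output on standard input. The function name all_ranks_joined is hypothetical, and the parsing assumes the column layout shown above (rank, UUID, control address, fault domain, state); other DAOS versions may print different columns.

```shell
# Hypothetical helper: reads `dmg system query -v` output on stdin and
# succeeds only if exactly the expected number of ranks are listed and
# every one of them is in the "Joined" state.
all_ranks_joined() {
  local expected="$1"
  awk -v expected="${expected}" '
    BEGIN { total = 0; joined = 0 }
    # Data rows start with a numeric rank; the state is the fifth field.
    $1 ~ /^[0-9]+$/ { total++; if ($5 == "Joined") joined++ }
    END { exit !(total == expected && joined == expected) }
  '
}

# Example use on the client instance, polling every 10 seconds for 4 servers:
# until sudo dmg system query -v | all_ranks_joined 4; do sleep 10; done
```

Passing the expected rank count means that empty output, for example from a failed dmg call, is not mistaken for a healthy system.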
View the amount of free NVMe storage.
sudo dmg storage query usage
The output depends on which terraform.tfvars.*.example file you copied to create the terraform.tfvars file. It will look similar to this
Hosts            SCM-Total SCM-Free SCM-Used NVMe-Total NVMe-Free NVMe-Used
-----            --------- -------- -------- ---------- --------- ---------
daos-server-0001 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %
daos-server-0002 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %
daos-server-0003 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %
daos-server-0004 48 GB     48 GB    0 %      1.6 TB     1.6 TB    0 %
This shows how much NVMe-Free space is available for each server.
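To get a single figure for the whole cluster, the per-server NVMe-Free values can be added up. The sketch below does this for output in the layout shown above; the function name total_nvme_free_gb is hypothetical, it only understands values printed in GB or TB, and it silently skips rows in any other shape.

```shell
# Hypothetical helper: reads `dmg storage query usage` output on stdin and
# prints the sum of the NVMe-Free column in GB.
# Assumes each host row has 13 whitespace-separated fields, with the
# NVMe-Free number in field 10 and its unit (GB or TB) in field 11.
total_nvme_free_gb() {
  awk '
    BEGIN { mult["GB"] = 1; mult["TB"] = 1000; sum = 0 }
    NF == 13 && ($11 in mult) { sum += $10 * mult[$11] }
    END { printf "%.0f\n", sum }
  '
}

# Example use on the client instance:
# sudo dmg storage query usage | total_nvme_free_gb
```

For the sample output above, four servers with 1.6 TB free each give 6400.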
Create a pool named pool1 that uses the total NVMe-Free for all servers.
sudo dmg pool create --size="100%" pool1
View the ACLs on pool1
sudo dmg pool get-acl pool1
# Owner: root@
# Owner Group: root@
# Entries:
A::OWNER@:rw
A:G:GROUP@:rw
Here we see that root owns the pool.
Add an ACE that will allow any user to create a container in the pool
sudo dmg pool update-acl -e A::EVERYONE@:rcta pool1
For more information about pools see
Create a container in the pool
daos container create --type=POSIX --properties=rf:0 pool1 cont1
For more information about containers see
Mount the container with dfuse
MOUNT_DIR="${HOME}/daos/cont1"
mkdir -p "${MOUNT_DIR}"
dfuse --singlethread --pool=pool1 --container=cont1 --mountpoint="${MOUNT_DIR}"
df -h -t fuse.daos
You can now store files in the DAOS container mounted on ${HOME}/daos/cont1.
For more information about DFuse see the DAOS FUSE section of the User Guide.
The cont1 container is now mounted on ${HOME}/daos/cont1
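If dfuse failed to mount, files written to ${HOME}/daos/cont1 would land on the local disk instead of in DAOS. A scripted check can guard against that. The sketch below looks for the directory in the mount table with the fuse.daos filesystem type used by the df command above; the function name is_dfuse_mounted is hypothetical, and the path must be given exactly as it appears in the mount table (absolute, no trailing slash).

```shell
# Hypothetical helper: succeeds if the given directory is listed as a
# fuse.daos mount. The optional second argument is the mount table to read
# and defaults to /proc/mounts.
is_dfuse_mounted() {
  local dir="$1" table="${2:-/proc/mounts}"
  # Mount table fields: device, mount point, filesystem type, options, ...
  awk -v dir="${dir}" '
    $2 == dir && $3 == "fuse.daos" { found = 1 }
    END { exit !found }
  ' "${table}"
}

# Example use on the client instance:
# is_dfuse_mounted "${HOME}/daos/cont1" && echo "cont1 is mounted"
```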
Create a 20GiB file which will be stored in the DAOS filesystem.
cd "${HOME}/daos/cont1"
# Create a 20GiB file (20 blocks of 1GiB each)
time LD_PRELOAD=/usr/lib64/libioil.so \
  dd if=/dev/zero of=./test20.img bs=1G count=20
Unmount the container and log out of the client instance.
cd ~/
fusermount -u "${HOME}/daos/cont1"
logout
To destroy the DAOS cluster run
terraform destroy
This will destroy all DAOS server and client instances.
You have successfully deployed a DAOS cluster using the terraform/examples/daos_cluster example!