An Ansible role to deploy the essentials of a highly available Kubernetes cluster. Includes in-cluster apiserver and ingress-controller load-balancing, DNS and flannel.
It was originally loosely based on the kelseyhightower/kubernetes-the-hard-way runbooks, but this version is completely Ansible-ised, and not GCP-specific.
- In common with kubernetes-the-hard-way, it downloads and runs the controller components (apiserver, controller-manager and scheduler) outside the cluster as systemd services, and does the same for the kubelets on the worker nodes (see the illustrative Ansible sketch after this list).
- However:
  - etcd runs in separate VMs.
  - The apiserver nodes are load-balanced using cloud-specific tools:
    - Libvirt (bare-metal): keepalived (using IPVS real-servers on the same hosts as the directors). The VIP floats between the apiservers, and requests are load-balanced in the kernel for processing by one of the peer apiservers.
    - AWS: Network Load Balancers, configured for internal load-balancing, one per zone for resilience.
- haproxy-ingress is used as an ingress controller (using the Gateway API). It runs as a DaemonSet on dedicated node-edge worker nodes with hostNetwork (see the illustrative Gateway API example after this list).
  - External DNS must be configured to point to the node-edge worker nodes, e.g.:

    ```
    *.k8s IN A 192.168.1.59
    *.k8s IN A 192.168.1.60
    *.k8s IN A 192.168.1.61
    ```
- flannel provides the layer 3 overlay network
- CoreDNS provides internal DNS
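As a hedged illustration of the systemd-services approach described above, the Ansible pattern looks roughly like the sketch below. The unit, template and task names are hypothetical, not the ones this role actually uses:

```yaml
# Illustrative only: running a control-plane binary as a systemd service
# from Ansible. The unit, template and file names are hypothetical.
- name: Template the kube-apiserver systemd unit
  ansible.builtin.template:
    src: kube-apiserver.service.j2             # hypothetical template
    dest: /etc/systemd/system/kube-apiserver.service
    mode: "0644"

- name: Enable and start kube-apiserver
  ansible.builtin.systemd:
    name: kube-apiserver
    daemon_reload: true
    enabled: true
    state: started
```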
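Similarly, a hedged sketch of how an application could be exposed through haproxy-ingress via the Gateway API. The gatewayClassName, names, namespace and hostname below are assumptions (real hostnames use `{{cluster_vars.dns_domain}}`), not values taken from this role:

```yaml
# Illustrative only: a Gateway plus an HTTPRoute, as consumed by a
# Gateway API-capable controller such as haproxy-ingress.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: edge-gateway            # hypothetical name
  namespace: haproxy-ingress    # hypothetical namespace
spec:
  gatewayClassName: haproxy     # assumed class name for haproxy-ingress
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: nginx-test              # hypothetical route for the nginx-test testapp
spec:
  parentRefs:
    - name: edge-gateway
      namespace: haproxy-ingress
  hostnames:
    - nginx-test.k8s.example.com   # stands in for nginx-test.{{cluster_vars.dns_domain}}
  rules:
    - backendRefs:
        - name: nginx-test         # backing Service name (assumed)
          port: 80
```

With wildcard records like the `*.k8s` entries above, traffic for such a hostname resolves to the node-edge nodes, where the hostNetwork DaemonSet picks it up.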
- Some testapps (applied using `-e testapps=true`) can be deployed with gateway-api enabled:
  - headlamp: One of the replacements for the dashboard project. Available at `headlamp.{{cluster_vars.dns_domain}}`.
    - Get the token using `kubectl create token headlamp`
  - nginx-test: just a simple nginx webserver that echoes the host it is running on.
    - `curl nginx-test.{{cluster_vars.dns_domain}}`
  - pyechoserver: A simple Python web server that returns the IP address and host it is running on (not really an echo server!).
    - `curl pyechoserver.{{cluster_vars.dns_domain}}`
  - tcpecho: A TCP echo server.
    - `nc tcpecho.{{cluster_vars.dns_domain}} 3495` will echo back what you type.
It supports (at present) AWS, libvirt (KVM/QEMU) and ESXi infrastructure.
This project is designed to operate using clusterverse to manage the base infrastructure. Please see the README.md there for detailed instructions on its usage.
- Tested on Ubuntu 24.04 and AlmaLinux 10.1
- ansible-core >= 2.17.4 (`ansible` on PyPI >= 10.4.0)
- See docs/EXAMPLE/Dockerfile for a full list of dependencies.
Please see the EXAMPLE directory in this repository for some basic configuration. This can be copied into the root directory and used as a starting point for your own configuration.
Clusters are defined as code within Ansible YAML files that are imported at runtime. Because clusters are built from scratch on the localhost, the automatic Ansible group_vars inclusion cannot work with anything except the special `all.yml` group (actual groups need to be in the inventory, which cannot exist until the cluster is built). The `group_vars/all.yml` file is instead used to bootstrap `merge_vars`, and the definitions are hierarchically defined in `cluster_defs`. Please see the full documentation in the main clusterverse/README.md.
- Cluster configuration is stored in `cluster_defs/**/cluster_vars[*].yml` files.
- Application configuration is stored in `cluster_defs/**/app_vars[*].yml` files.
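For orientation, a purely illustrative cluster_vars fragment is sketched below. The file path and all keys other than `cluster_vars.dns_domain` (which the testapps above reference) are hypothetical; the real schema is documented in the clusterverse README:

```yaml
# cluster_defs/libvirt/dougalab/cluster_vars.yml -- hypothetical path and keys,
# shown only to illustrate hierarchically-defined cluster config.
cluster_vars:
  dns_domain: k8s.example.com   # referenced elsewhere as {{cluster_vars.dns_domain}}
  image: ubuntu-24.04           # hypothetical key
  hosttype_vars:                # hypothetical structure: per-hosttype node counts
    master: { count: 3 }
    etcd: { count: 3 }
    node-edge: { count: 2 }
```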
For full clusterverse invocation examples and command-line arguments, please see the EXAMPLE README.md.
The role is designed to run in two modes:
- A playbook based on the `deploy.yml` example will be needed.
  - The `deploy.yml` sub-role idempotently deploys a cluster from the config defined above (if it is run again with no changes to the variables, it will do nothing). If the cluster variables are changed (e.g. a host is added), the cluster will reflect the new variables (e.g. a new host will be added to the cluster). Note: it will not remove nodes, nor, usually, will it reflect changes to disk volumes; these are limitations of the underlying cloud modules.
  - Example:

    ```
    ansible-playbook deploy.yml -e cloud_type=libvirt -e region=dougalab -e buildenv=dev -e testapps=true
    ```
- A playbook based on the `redeploy.yml` example will be needed.
  - The `redeploy.yml` sub-role will completely redeploy the cluster; this is useful, for example, to upgrade the underlying operating system version.
  - Please see the full documentation.
  - Example:

    ```
    ansible-playbook redeploy.yml -e canary=none -e cloud_type=esxifree -e clusterid=dougakube -e region=dougalab -e buildenv=dev -e testapps=true
    ```
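For orientation, a minimal calling playbook might look something like the sketch below; the play wiring and role name are assumptions (clusters are built from the localhost, per the configuration notes above), and the shipped deploy.yml example should be copied as the real starting point:

```yaml
# Hypothetical minimal deploy.yml-style playbook, illustrative only.
# The role name is assumed; copy the EXAMPLE deploy.yml for the real wiring.
- name: Deploy a highly available Kubernetes cluster
  hosts: localhost
  connection: local
  gather_facts: true
  roles:
    - role: kubernetes_cluster   # assumed role name
```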