Commits
21 commits
5bd9f54
Initial skc baremetal commit
assumptionsandg Nov 3, 2025
fe97b86
removed commented out tasks
claudia-lola Dec 3, 2025
3ec0a51
Revise README for Sushy Baremetal Environment
claudia-lola Jan 27, 2026
9343466
update to baremetal readme
claudia-lola Feb 3, 2026
f83fe52
Update etc/kayobe/environments/stackhpc-baremetal/kolla/config/ironic…
claudia-lola Feb 5, 2026
f1d2e57
revert ipa.yml change
claudia-lola Feb 5, 2026
ba251a7
README edits
claudia-lola Feb 5, 2026
bf6db34
edits to the sushy baremetal readme and set up playbooks, removed red…
claudia-lola Feb 9, 2026
7dee849
changes to sushy baremetal env
claudia-lola Feb 11, 2026
b1375cf
Initial skc baremetal commit
assumptionsandg Nov 3, 2025
a4d0b9d
removed commented out tasks
claudia-lola Dec 3, 2025
dc24694
Revise README for Sushy Baremetal Environment
claudia-lola Jan 27, 2026
cb82b61
update to baremetal readme
claudia-lola Feb 3, 2026
54120bc
Update etc/kayobe/environments/stackhpc-baremetal/kolla/config/ironic…
claudia-lola Feb 5, 2026
6225426
revert ipa.yml change
claudia-lola Feb 5, 2026
becf1ff
README edits
claudia-lola Feb 5, 2026
e2cb6f1
edits to the sushy baremetal readme and set up playbooks, removed red…
claudia-lola Feb 9, 2026
0c07187
changes to sushy baremetal env
claudia-lola Feb 11, 2026
06b91e1
baremetal README edits and comment out github workflow tasks
claudia-lola Mar 3, 2026
11c241b
Merge branch 'skc-baremetal-environment' of github.com:stackhpc/stack…
claudia-lola Mar 3, 2026
f80ca3b
README edits, baremetal group_vars edits, sushy auto-setup edits
claudia-lola Mar 17, 2026
69 changes: 69 additions & 0 deletions etc/kayobe/environments/stackhpc-baremetal/README.rst
@@ -0,0 +1,69 @@
Baremetal Environment
=====================

Review comment (Contributor): Annoying suggestion, but maybe we should create an ironic.rst from this and the Sushy readme... and integrate with #1011

This environment provides playbooks to automate the enrollment, inspection
and cleaning of baremetal nodes in Ironic. It is designed to be idempotent
and safe to re-run.

The purpose of this environment is to enroll baremetal nodes in Ironic,
verify that a connection can be made to the BMCs of the nodes using Redfish, then perform Redfish and agent-based
inspection, and finally clean the nodes and make them available.

Inventory
---------

Baremetal nodes are defined in the inventory file ``stackhpc-baremetal/inventory/hosts``.
This inventory can be hand-written or generated (e.g. by a Python script).
Each node must define the required Ironic and Redfish variables.
These variables can be set in ``inventory/group_vars/baremetal-redfish/ironic``.
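
As an illustration, a group variables file might look like the sketch below. The variable names are the ones consumed by the enrollment playbook in this change; every value is a hypothetical placeholder and must be replaced with settings that match your hardware and Ironic configuration.

```yaml
---
# Sketch only: all values below are hypothetical examples.
ironic_driver: redfish

# Redfish connection details. The address usually differs per node,
# so in practice it may belong in host_vars rather than group_vars.
ironic_redfish_address: "https://192.0.2.10"
ironic_redfish_system_id: /redfish/v1/Systems/1
ironic_redfish_username: admin
ironic_redfish_password: change-me
ironic_redfish_verify_ca: false

# Passed to Ironic as one --property key=value option per entry.
ironic_properties:
  capabilities: "boot_mode:uefi"

ironic_resource_class: baremetal-example

# Optional hardware interfaces. The enrollment playbook skips the
# matching option when a value is empty.
ironic_boot_interface: redfish-virtual-media
ironic_inspect_interface: redfish
ironic_management_interface: redfish
ironic_network_interface: neutron
ironic_raid_interface: ""
```

In a real deployment the password would normally come from an encrypted source rather than plain text.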

Enable the Environment
-----------------------

This environment is intended to be layered on top of a base Kayobe environment
(e.g. ``ci-aio``), so that baremetal-specific defaults override those provided
by the base environment.
Create a ``.kayobe-environment`` file in the root of the ``stackhpc-baremetal``
environment and add your base environment as a dependency. For example, when
using ``ci-aio`` as the base environment, the ``.kayobe-environment`` file
contains::

    dependencies:
      - ci-aio

Activate the environment using ``source kayobe-config/kayobe-env --environment stackhpc-baremetal``.

How to Run
----------

Run the full baremetal workflow using::

    kayobe playbook run \
        etc/kayobe/environments/stackhpc-baremetal/ansible/baremetal-all.yml

Workflow Overview
-----------------

The workflow is executed in the following order when ``baremetal-all.yml`` is run:

1. **Enroll nodes** – create Ironic nodes and move them to ``manageable``
2. **Check BMC is up** – verify Redfish connection
3. **Redfish inspection** – discover hardware
4. **Agent inspection** – collect LLDP
5. **Clean and provide** – clean nodes and move them to ``available``
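
One plausible way for ``baremetal-all.yml`` to chain these stages is a wrapper playbook that imports one playbook per stage, in order. This is only a sketch of that pattern: the imported file names are hypothetical placeholders, and the actual playbook in this change may be organised differently.

```yaml
---
# Sketch only: the imported file names are hypothetical placeholders,
# chosen to mirror the five stages listed above.
- name: Enroll nodes
  ansible.builtin.import_playbook: example-enroll.yml

- name: Check BMC is up
  ansible.builtin.import_playbook: example-check-bmc.yml

- name: Redfish inspection
  ansible.builtin.import_playbook: example-redfish-inspect.yml

- name: Agent inspection
  ansible.builtin.import_playbook: example-agent-inspect.yml

- name: Clean and provide
  ansible.builtin.import_playbook: example-clean-provide.yml
```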


Progress is tracked using the Ironic node ``extra`` field:

* ``kayobe_bmc_up``
* ``kayobe_redfish_inspect_done``
* ``kayobe_agent_inspect_done``
* ``kayobe_clean_done``

Completed stages are skipped on subsequent runs.
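
Because a stage is skipped when its marker is present, clearing the marker should cause that stage to run again on the next invocation. The sketch below shows one way to do that, assuming the same ``venv``, ``controller_host`` and ``openstack_auth_env`` variables used by the playbooks in this change; ``stage_marker`` is a hypothetical variable introduced only for this example.

```yaml
---
# Sketch only: clears one progress marker from each node's ``extra`` field.
- name: Clear a baremetal workflow progress marker
  hosts: baremetal
  gather_facts: false
  vars:
    venv: "{{ virtualenv_path }}/openstack-cli"
    controller_host: "{{ groups['controllers'][0] }}"
    # Hypothetical variable: the marker to clear.
    stage_marker: kayobe_redfish_inspect_done
  tasks:
    - name: Unset the progress marker on the Ironic node
      # NOTE: this is expected to fail if the marker is not currently set.
      ansible.builtin.command:
        cmd: "{{ venv }}/bin/openstack baremetal node unset {{ inventory_hostname }} --extra {{ stage_marker }}"
      delegate_to: "{{ controller_host }}"
      environment: "{{ openstack_auth_env }}"
      changed_when: true
```

The stage's other preconditions still apply after the marker is cleared. For example, the Redfish inspection playbook requires the node to be in the ``manageable`` state.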

Inspection Notes
----------------

* Redfish is the primary inspection mechanism
* Agent inspection is required for LLDP discovery
* iPXE / IPMI inspection is only supported when using discovery DHCP and *not* Ironic-managed boot
@@ -0,0 +1,105 @@
---

- name: Register baremetal compute nodes
  hosts: "{{ groups['controllers'][0] }}"
  vars:
    venv: "{{ virtualenv_path }}/openstack-cli"
  tasks:
    - name: Set up openstack cli virtualenv
      pip:
        virtualenv: "{{ venv }}"
        name:
          - python-openstackclient
          - python-ironicclient
        state: latest
        virtualenv_command: "python3.{{ ansible_facts.python.version.minor }} -m venv"
        extra_args: "{% if pip_upper_constraints_file %}-c {{ pip_upper_constraints_file }}{% endif %}"

- name: Ensure baremetal nodes are registered in ironic
  hosts: baremetal
  gather_facts: false
  max_fail_percentage: >-
    {{ baremetal_compute_register_max_fail_percentage |
       default(baremetal_compute_max_fail_percentage) |
       default(kayobe_max_fail_percentage) |
       default(100) }}
  tags:
    - baremetal
  vars:
    venv: "{{ virtualenv_path }}/openstack-cli"
    #todo: extract this as a variable
Review comment (Member): not sure what this refers to

    controller_host: "{{ groups['controllers'][0] }}"
  tasks:
    - name: Check Ironic variables are defined
      ansible.builtin.assert:
        that:
          - ironic_driver is defined
          - ironic_redfish_address is defined
          - ironic_properties is defined
          - ironic_resource_class is defined
        fail_msg: One or more Ironic variables are undefined.
Review comment (Member): I'm not sure if using with_items: would yield a more helpful message if a variable is not defined? Will it make it clear which one is missing currently? (assuming a single var not defined)
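
One way to make the failure message name the missing variable is to loop over the variable names so that each is asserted separately. The following is a minimal, untested sketch. It assumes the Ironic variables are set in the inventory (and therefore appear in `hostvars`), and it uses `loop`, the newer equivalent of `with_items`.

```yaml
- name: Check Ironic variables are defined
  ansible.builtin.assert:
    that:
      # Look the name up in this host's inventory variables.
      - hostvars[inventory_hostname][item] is defined
    fail_msg: "Required Ironic variable '{{ item }}' is undefined."
  loop:
    - ironic_driver
    - ironic_redfish_address
    - ironic_properties
    - ironic_resource_class
```

With this shape, each loop iteration reports its own result, so a single undefined variable is identified by name in the failure message.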


    - block:
        - name: Show baremetal node
          ansible.builtin.command:
            cmd: "{{ venv }}/bin/openstack baremetal node show {{ inventory_hostname }}"
Review comment (Member): nit: no need for cmd

          register: node_show
          failed_when:
            - '"HTTP 404" not in node_show.stderr'
            - node_show.rc != 0
          changed_when: false

        # NOTE: The openstack.cloud.baremetal_node module cannot be used in this
        # script due to requiring a MAC address pre-defined, instead, this should
        # be discovered by inspection following this script.
        #
        # NOTE: IPMI address must be passed with Redfish address to ensure existing
        # Ironic nodes match with new nodes during inspection.
        - name: Create baremetal nodes
          ansible.builtin.shell:
            cmd: |
              {{ venv }}/bin/openstack baremetal node create \
                --name {{ inventory_hostname }} \
                --driver {{ ironic_driver }} \
                --driver-info redfish_system_id={{ ironic_redfish_system_id }} \
                --driver-info redfish_address={{ ironic_redfish_address }} \
                {% if ironic_redfish_username %}
                --driver-info redfish_username={{ ironic_redfish_username }} \
                {% endif %}
                {% if ironic_redfish_password %}
                --driver-info redfish_password={{ ironic_redfish_password }} \
                {% endif %}
                --driver-info redfish_verify_ca={{ ironic_redfish_verify_ca }} \
                {% for key, value in ironic_properties.items() %}
                --property {{ key }}={{ value }} \
                {% endfor %}
                --resource-class {{ ironic_resource_class }} \
                {% if ironic_boot_interface %}
                --boot-interface {{ ironic_boot_interface }} \
                {% endif %}
                {% if ironic_inspect_interface %}
                --inspect-interface {{ ironic_inspect_interface }} \
                {% endif %}
                {% if ironic_management_interface %}
                --management-interface {{ ironic_management_interface }} \
                {% endif %}
                {% if ironic_network_interface %}
                --network-interface {{ ironic_network_interface }} \
                {% endif %}
                {% if ironic_raid_interface %}
                --raid-interface {{ ironic_raid_interface }} \
                {% endif %}
          when:
            - node_show.rc != 0

        - name: Manage baremetal nodes
          ansible.builtin.command:
            cmd: "{{ venv }}/bin/openstack baremetal node manage {{ inventory_hostname }} --wait"
          when:
            - node_show.rc != 0
      delegate_to: "{{ controller_host }}"
      vars:
        # NOTE: Without this, the controller's ansible_host variable will not
        # be respected when using delegate_to.
        ansible_host: "{{ hostvars[controller_host].ansible_host | default(controller_host) }}"
      environment: "{{ openstack_auth_env }}"
@@ -0,0 +1,103 @@
---
- name: Check baremetal compute node bmc is up
  hosts: baremetal
  gather_facts: false
  max_fail_percentage: >-
    {{ baremetal_compute_register_max_fail_percentage |
       default(baremetal_compute_max_fail_percentage) |
       default(kayobe_max_fail_percentage) |
       default(100) }}
  tags:
    - baremetal
  vars:
    venv: "{{ virtualenv_path }}/openstack-cli"
    controller_host: "{{ groups['controllers'][0] }}"

  tasks:
    - name: Check Ironic variables are defined
      ansible.builtin.assert:
        that:
          - ironic_driver is defined
          - ironic_redfish_address is defined
          - ironic_properties is defined
          - ironic_resource_class is defined
        fail_msg: One or more Ironic variables are undefined.

    - name: Show and check baremetal node
      delegate_to: "{{ controller_host }}"
      vars:
        # NOTE: Without this, the controller's ansible_host variable will not
        # be respected when using delegate_to.
        ansible_host: "{{ hostvars[controller_host].ansible_host | default(controller_host) }}"
      environment: "{{ openstack_auth_env }}"
      block:

        - name: Show baremetal node
          ansible.builtin.command:
            cmd: "{{ venv }}/bin/openstack baremetal node show {{ inventory_hostname }} -f json"
          register: node_show
          failed_when:
            - node_show.rc != 0
          changed_when: false

        - name: Check if bmc is up
          ansible.builtin.set_fact:
            kayobe_bmc_up: "{{ (node_show.stdout | from_json)['extra'].get('kayobe_bmc_up') }}"
            provision_state: "{{ (node_show.stdout | from_json)['provision_state'] }}"

        - name: Output when bmc last up run
          ansible.builtin.debug:
            msg: "BMC for node {{ inventory_hostname }} was up at {{ kayobe_bmc_up }}."
          when: kayobe_bmc_up != ""

        - name: Check BMC is up
          ansible.builtin.uri:
            url: "{{ ironic_redfish_address + '/redfish/v1' }}"
            method: GET
            status_code: 200
            validate_certs: false
            timeout: 10

        # #TODO(ClaudiaWatson): add an optional BMC reboot into the flow
        # - name: Reboot BMC
        #   community.general.redfish_command:
        #     category: Manager
        #     command: PowerReboot
        #     username: "{{ ironic_redfish_username }}"
        #     password: "{{ ironic_redfish_password }}"
        #   when:
        #     - kayobe_bmc_up == ""
        #     - ironic_redfish_username is defined

        - name: Check BMC back up again
          ansible.builtin.uri:
            url: "{{ ironic_redfish_address + '/redfish/v1' }}"
            method: GET
            status_code: 200
            validate_certs: false
            timeout: 10
          register: uri_output
          until: uri_output.status == 200
          delay: 2
          retries: 5  # Retries for 5 * 2 seconds = 10 seconds

        - name: Note when we are able to reach the bmc, the first time
          ansible.builtin.command:
            cmd: |
              {{ venv }}/bin/openstack baremetal node set {{ inventory_hostname }} --extra kayobe_bmc_up={{ now(utc=true, fmt='%Y-%m-%dT%H:%M:%SZ') }}
          register: node_set
          failed_when:
            - node_set.rc != 0
          changed_when: true
          when: kayobe_bmc_up == ""

        - name: Try move from enroll to manageable
          ansible.builtin.command:
            cmd: |
              {{ venv }}/bin/openstack baremetal node manage {{ inventory_hostname }} --wait 300
          register: node_set
          failed_when:
            - node_set.rc != 0
          changed_when: true
          when:
            - provision_state == "enroll"
@@ -0,0 +1,86 @@
---
- name: Check baremetal compute node bmc is up
  hosts: baremetal
  gather_facts: false
  max_fail_percentage: >-
    {{ baremetal_compute_register_max_fail_percentage |
       default(baremetal_compute_max_fail_percentage) |
       default(kayobe_max_fail_percentage) |
       default(100) }}
  tags:
    - baremetal
  vars:
    venv: "{{ virtualenv_path }}/openstack-cli"
    controller_host: "{{ groups['controllers'][0] }}"

  tasks:
    - name: Show and check baremetal node
      delegate_to: "{{ controller_host }}"
      vars:
        # NOTE: Without this, the controller's ansible_host variable will not
        # be respected when using delegate_to.
        ansible_host: "{{ hostvars[controller_host].ansible_host | default(controller_host) }}"
        redfish_inspect_timeout: 120
      environment: "{{ openstack_auth_env }}"
      block:

        - name: Show baremetal node
          ansible.builtin.command:
            cmd: "{{ venv }}/bin/openstack baremetal node show {{ inventory_hostname }} -f json"
          register: node_show
          failed_when:
            - node_show.rc != 0
          changed_when: false

        - name: Check BMC is up
          ansible.builtin.uri:
            url: "{{ ironic_redfish_address + '/redfish/v1' }}"
            method: GET
            status_code: 200
            validate_certs: false
            timeout: 10

        - name: Check for redfish inspection details
          ansible.builtin.set_fact:
            kayobe_redfish_inspect_done: "{{ (node_show.stdout | from_json)['extra'].get('kayobe_redfish_inspect_done') }}"
            inspect_interface: "{{ (node_show.stdout | from_json)['inspect_interface'] }}"
            provision_state: "{{ (node_show.stdout | from_json)['provision_state'] }}"

        - name: Output when redfish inspection was done
          ansible.builtin.debug:
            msg: "{{ inventory_hostname }} inspected at {{ kayobe_redfish_inspect_done }}."
          when: kayobe_redfish_inspect_done != ""

        - name: Fail if not redfish inspection
          ansible.builtin.fail:
            msg: "{{ inventory_hostname }} has the wrong inspect_interface: {{ inspect_interface }}"
          when:
            - inspect_interface != "redfish"
            - kayobe_redfish_inspect_done == ""

        - name: Fail if not in manageable state
          ansible.builtin.fail:
            msg: "{{ inventory_hostname }} has the wrong provision_state: {{ provision_state }}"
          when:
            - provision_state != "manageable"
            - kayobe_redfish_inspect_done == ""

        - name: Wait for inspection
          ansible.builtin.command:
            cmd: |
              {{ venv }}/bin/openstack baremetal node inspect {{ inventory_hostname }} --wait {{ redfish_inspect_timeout }}
          register: node_inspect
          failed_when:
            - node_inspect.rc != 0
          changed_when: true
          when: kayobe_redfish_inspect_done == ""

        - name: Note when redfish inspection is done
          ansible.builtin.command:
            cmd: |
              {{ venv }}/bin/openstack baremetal node set {{ inventory_hostname }} --extra kayobe_redfish_inspect_done={{ now(utc=true, fmt='%Y-%m-%dT%H:%M:%SZ') }}
          register: node_set
          failed_when:
            - node_set.rc != 0
          changed_when: true
          when: kayobe_redfish_inspect_done == ""