Adds support to deploy cyborg controlplane services #1102
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: amoralej
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Build failed (check pipeline). Post https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/3a20fb346aa44e9d80abc0ba1ff8cdf5 ✔️ openstack-meta-content-provider SUCCESS in 2h 58m 29s |
|
check-rdo |
  labels:
    app.kubernetes.io/name: nova-operator
    app.kubernetes.io/managed-by: kustomize
  name: cyborg-cyborg-admin-role
The existing roles for nova do not repeat the service name, e.g. nova_admin_role, novaconductor_admin_role. If possible, I think it would be good to use the same pattern for the cyborg roles.
// ensureTopology - when a Topology CR is referenced, remove the
// finalizer from a previous referenced Topology (if any), and retrieve the
// newly referenced topology object
func ensureTopology(
This function seems to be an exact duplicate of …. This might be a question more for the nova-operator maintainers, but is there a way to reuse code between the controllers of different services?
IIUC the approach is to not share code among the different groups in the operator (nova, placement and cyborg). Let's see what nova-operator maintainers think about it.
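To make the reuse question concrete, here is a minimal sketch of the behaviour the `ensureTopology` doc comment describes, written against small stand-in types so that it does not depend on any one API group. Everything in it is an assumption for illustration: `TopologyRef`, `Topology`, `Store` and the exported `EnsureTopology` are hypothetical, the in-memory map replaces the Kubernetes client, and adding the finalizer to the newly referenced object is assumed rather than taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// TopologyRef names a referenced Topology object (hypothetical, simplified).
type TopologyRef struct{ Namespace, Name string }

// Topology is a stand-in for the real Topology CR, reduced to its finalizers.
type Topology struct {
	Ref        TopologyRef
	Finalizers map[string]bool
}

// Store is an in-memory stand-in for the Kubernetes API client.
type Store map[TopologyRef]*Topology

// EnsureTopology removes finalizer from the previously referenced Topology
// when the reference changed or was cleared, then retrieves the newly
// referenced Topology. Adding the finalizer to the new object is an
// assumption of this sketch. It returns nil when nothing is referenced.
func EnsureTopology(s Store, current, last *TopologyRef, finalizer string) (*Topology, error) {
	if last != nil && (current == nil || *current != *last) {
		if old, ok := s[*last]; ok {
			delete(old.Finalizers, finalizer)
		}
	}
	if current == nil {
		return nil, nil
	}
	topo, ok := s[*current]
	if !ok {
		return nil, errors.New("topology not found: " + current.Name)
	}
	if topo.Finalizers == nil {
		topo.Finalizers = map[string]bool{}
	}
	topo.Finalizers[finalizer] = true
	return topo, nil
}

func main() {
	a := TopologyRef{Namespace: "openstack", Name: "topo-a"}
	b := TopologyRef{Namespace: "openstack", Name: "topo-b"}
	s := Store{
		a: &Topology{Ref: a, Finalizers: map[string]bool{"cyborgapi": true}},
		b: &Topology{Ref: b},
	}
	// The CR used to reference topo-a and now references topo-b.
	topo, err := EnsureTopology(s, &b, &a, "cyborgapi")
	if err != nil {
		panic(err)
	}
	fmt.Println(topo.Ref.Name, s[a].Finalizers["cyborgapi"])
}
```

Because the helper only needs the two references and a finalizer name, it could in principle live in a package imported by the nova, placement and cyborg controllers; whether that fits the operator's conventions is the open question for the maintainers.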
Using operator-sdk command:

    operator-sdk create api --group cyborg --version v1beta1 --kind Cyborg --resource --controller
    operator-sdk create api --group cyborg --version v1beta1 --kind CyborgAPI --resource --controller
    operator-sdk create api --group cyborg --version v1beta1 --kind CyborgConductor --resource --controller

Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Define CRD specs for Cyborg, CyborgAPI and CyborgConductor resources:

- Add CyborgSpec with DB, RabbitMQ, Keystone and TLS configuration
- Add CyborgAPISpec and CyborgConductorSpec with configSecret, replicas, resources, nodeSelector and TLS fields
- Implement defaulting and validation webhooks for all three CRDs
- Register CRDs in the operator scheme
- Update CRD YAML manifests and CSV for OLM

Reconcile and configuration logic will be created in next commits.

Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
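As an illustration of what "defaulting and validation" means for a spec like this, here is a minimal sketch that uses plain Go instead of the controller-runtime webhook interfaces. The reduced `CyborgAPISpec`, the default of one replica, and the rule that `configSecret` is required are all assumptions made for the example, not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// CyborgAPISpec is a heavily reduced, hypothetical version of the spec;
// only two of the fields named in the commit message are modelled.
type CyborgAPISpec struct {
	ConfigSecret string
	Replicas     *int32
}

// Default fills in unset fields. A default of one replica is an assumption.
func (s *CyborgAPISpec) Default() {
	if s.Replicas == nil {
		one := int32(1)
		s.Replicas = &one
	}
}

// Validate collects every problem found instead of stopping at the first,
// and returns nil when the spec is acceptable.
func (s *CyborgAPISpec) Validate() error {
	var errs []error
	if s.ConfigSecret == "" {
		errs = append(errs, errors.New("configSecret must be set"))
	}
	if s.Replicas != nil && *s.Replicas < 0 {
		errs = append(errs, errors.New("replicas must not be negative"))
	}
	return errors.Join(errs...)
}

func main() {
	spec := CyborgAPISpec{ConfigSecret: "cyborg-config"}
	spec.Default()
	fmt.Println(*spec.Replicas, spec.Validate())
}
```

Defaulting runs before validation, so the validation step can assume optional fields such as `Replicas` are populated.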
Add full reconcile logic for the Cyborg CR:

- Manage RBAC resources (ServiceAccount, Role, RoleBinding)
- Validate input password secret and RabbitMQ TransportURL secret
- Create MariaDB database and run DB sync job via a batch Job
- Register Cyborg service in Keystone
- Create a sub-level secret aggregating DB credentials, transport URL and service password to be consumed by CyborgAPI and CyborgConductor
- Track readiness via structured conditions on CyborgStatus
- Add functional tests covering the full reconcile flow

Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
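The sub-level secret step can be sketched as a pure function that turns the validated inputs into the data map of a single secret. This is only an illustration: the `Inputs` type, the function name and the key names (`database_username` and so on) are hypothetical and are not the keys used by the operator.

```go
package main

import (
	"fmt"
	"sort"
)

// Inputs gathers the values the commit message says are aggregated
// (hypothetical type; the real controller reads them from several secrets).
type Inputs struct {
	DatabaseUser     string
	DatabasePassword string
	TransportURL     string
	ServicePassword  string
}

// SubLevelSecretData returns the data map for the aggregated secret, or an
// error naming every input that is still missing. Key names are assumptions.
func SubLevelSecretData(in Inputs) (map[string][]byte, error) {
	values := map[string]string{
		"database_username": in.DatabaseUser,
		"database_password": in.DatabasePassword,
		"transport_url":     in.TransportURL,
		"service_password":  in.ServicePassword,
	}
	out := make(map[string][]byte, len(values))
	var missing []string
	for key, value := range values {
		if value == "" {
			missing = append(missing, key)
			continue
		}
		out[key] = []byte(value)
	}
	if len(missing) > 0 {
		sort.Strings(missing) // stable message regardless of map order
		return nil, fmt.Errorf("missing inputs: %v", missing)
	}
	return out, nil
}

func main() {
	data, err := SubLevelSecretData(Inputs{
		DatabaseUser:     "cyborg",
		DatabasePassword: "example",
		TransportURL:     "rabbit://example",
		ServicePassword:  "example",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(data), "keys")
}
```

Aggregating into one secret means CyborgAPI and CyborgConductor each need to watch a single object, which is the consumption pattern the commit message describes.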
Add full reconcile logic for the CyborgConductor CR:

- Validate input from the config secret created by the Cyborg controller
- Generate conductor config from templates (00-default.conf)
- Create a StatefulSet to run cyborg-conductor pods
- Track readiness (ReadyCount, conditions, hash, topology)
- Expose IsReady and topology helpers on CyborgConductor type
- Update CyborgConductorStatus with structured conditions and hash
- Extend Cyborg controller to propagate conductor and check readiness upwards
- Add functional tests for the conductor reconcile loop

Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Add full reconcile logic for the CyborgAPI CR:

- Validate input from config secret provided by the Cyborg controller
- Render WSGI/httpd and cyborg-api configuration templates
- Create a StatefulSet for cyborg-api pods with TLS support
- Register Keystone endpoints (public and internal) for the API
- Track readiness (ReadyCount, conditions, hash, topology)
- Expose IsReady and topology helpers on CyborgAPI type
- Extend Cyborg controller to create CyborgAPI and check readiness upwards
- Add functional tests covering the full API reconcile flow

Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
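The "check readiness upwards" idea from the CyborgConductor and CyborgAPI commits can be sketched as follows. This is a simplified model, not the PR's code: `SubStatus` is a hypothetical reduction of the sub-resource status, and treating "ReadyCount equals desired replicas" as the meaning of `IsReady` is an assumption (the real helpers may also consult conditions).

```go
package main

import (
	"fmt"
	"sort"
)

// SubStatus is the part of a CyborgAPI or CyborgConductor status used here
// (hypothetical reduction of the real status types).
type SubStatus struct {
	Replicas   int32 // desired replicas from the spec
	ReadyCount int32 // ready pods reported by the StatefulSet
}

// IsReady reports whether all desired replicas are ready. A sub-resource
// scaled to zero counts as ready under this assumption.
func (s SubStatus) IsReady() bool { return s.ReadyCount == s.Replicas }

// NotReady returns the sorted names of sub-resources that are not ready yet;
// the parent Cyborg CR would be ready only when the result is empty.
func NotReady(subs map[string]SubStatus) []string {
	var names []string
	for name, status := range subs {
		if !status.IsReady() {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	subs := map[string]SubStatus{
		"CyborgAPI":       {Replicas: 1, ReadyCount: 1},
		"CyborgConductor": {Replicas: 2, ReadyCount: 1},
	}
	fmt.Println("waiting for:", NotReady(subs))
}
```

Returning the names rather than a bare boolean lets the parent put a useful message into its own condition while it waits.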
Add an end-to-end kuttl test suite for the Cyborg operator:

- Cleanup step to delete any pre-existing Cyborg CR before the test
- Deploy step creating a full Cyborg CR (cyborg-kuttl)
- Assert step verifying all conditions are True on Cyborg, CyborgAPI, CyborgConductor and MariaDBDatabase CRs
- Error step covering missing-dependency failure scenarios
- Register cyborg container images (api, conductor, agent) as default RELATED_IMAGE env vars in the manager deployment
- Enable ENABLE_CYBORG=true in the CI webhook deploy script

Assisted-By: Claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Deployment using httpd is no longer supported in kolla upstream images since the 2026.1 release [1].
[1] https://review.opendev.org/c/openstack/kolla/+/986488
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
Similar to what is done for CyborgAPI and CyborgConductor and other OpenStack CRDs.
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
The Cyborg controller now generates a `{name}-agent-config` secret
containing the rendered configuration for the cyborg-agent service
running on EDPM compute nodes. This secret is consumed by the
edpm-ansible cyborg role to configure the agent on the dataplane.
The shared 00-default.conf template is updated to guard the
[database] section with a conditional, allowing reuse for the
agent config without a separate template.
Assisted-By: claude
Signed-off-by: Alfredo Moralejo <amoralej@redhat.com>
|
Build failed (check pipeline). Post ✔️ openstack-meta-content-provider SUCCESS in 3h 45m 37s |
|
check-rdo |
|
This change depends on a change that failed to merge. Change #1121 is needed. |
|
check-rdo |
|
Build failed (check pipeline). Post ✔️ openstack-meta-content-provider SUCCESS in 4h 23m 58s |
Add support for OpenStack Cyborg (accelerator lifecycle management service) in nova-operator, introducing three new CRDs and their controllers.
oc get cyborg/cyborgapi/cyborgconductor.
Assisted-By: Claude
Jira: OSPRH-27674
Depends-On: #1121