feat(cluster_healthcheck): add cluster health validation role #39
Open
stevefulme1 wants to merge 1 commit into
sabre1041 (Contributor) requested changes on May 22, 2026
Please review the issues that are being reported.
Also, please review the conflicted files.
kind: Pod
namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
label_selectors:
  - "app=cdi-operator"
This label does not match what is deployed
kind: Pod
namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
label_selectors:
  - "app=cdi-deployment"
This label does not match what is deployed
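One way to confirm which labels are actually deployed before choosing a selector is to list the pods and print their labels. The following is a minimal sketch, not the role's implementation: the registered variable name `__cluster_healthcheck_all_pods` and the `cdi-` name prefix used for filtering are assumptions made for illustration.

```yaml
# Sketch only: discover the labels carried by CDI pods instead of guessing them.
# __cluster_healthcheck_all_pods is a hypothetical variable name.
- name: kubevirt_health | List pods to confirm the labels actually deployed
  kubernetes.core.k8s_info:
    kind: Pod
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
  register: __cluster_healthcheck_all_pods

# Assumes CDI pod names start with "cdi-"; adjust if the cluster differs.
- name: kubevirt_health | Show labels of CDI-related pods
  ansible.builtin.debug:
    msg: "{{ item.metadata.name }}: {{ item.metadata.labels | default({}) }}"
  loop: "{{ __cluster_healthcheck_all_pods.resources }}"
  loop_control:
    label: "{{ item.metadata.name }}"
  when: item.metadata.name is match('cdi-')
```

The printed labels can then be copied into `label_selectors` so the check matches what the operator really creates.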

- name: mtv_health | Evaluate Provider readiness
  ansible.builtin.set_fact:
    __cluster_healthcheck_providers_not_ready: >-
This is not reporting correctly. Both providers are Ready in my testing environment
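A possible way to evaluate readiness is to look for a condition of type `Ready` with status `"True"` on each Provider, rather than relying on another status field. The sketch below assumes the Provider resources are served as `forklift.konveyor.io/v1beta1` and expose `status.conditions`; the registered variable `__cluster_healthcheck_providers` is a hypothetical name.

```yaml
# Sketch only: flag Providers that do not report a Ready=True condition.
# API version and condition layout are assumptions about the MTV Provider CR.
- name: mtv_health | Get MTV Providers
  kubernetes.core.k8s_info:
    api_version: forklift.konveyor.io/v1beta1
    kind: Provider
    namespace: "{{ cluster_healthcheck_mtv_namespace }}"
  register: __cluster_healthcheck_providers

- name: mtv_health | Start with an empty not-ready list
  ansible.builtin.set_fact:
    __cluster_healthcheck_providers_not_ready: []

# Kubernetes condition statuses are strings, so compare against 'True'.
- name: mtv_health | Evaluate Provider readiness
  ansible.builtin.set_fact:
    __cluster_healthcheck_providers_not_ready: >-
      {{ __cluster_healthcheck_providers_not_ready + [item.metadata.name] }}
  loop: "{{ __cluster_healthcheck_providers.resources }}"
  loop_control:
    label: "{{ item.metadata.name }}"
  when: >-
    (item.status | default({})).conditions | default([])
    | selectattr('type', 'equalto', 'Ready')
    | selectattr('status', 'equalto', 'True')
    | list | length == 0
```

With this shape, a Provider with no status at all is also treated as not ready, which seems the safer default for a health check.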
| selectattr('status.phase', 'equalto', 'Running')
| list | length) }}

- name: network_health | Check migration network configuration
This should only be checked if one has been defined in the HyperConverged CR
kubernetes.core.k8s_info:
  api_version: k8s.cni.cncf.io/v1
  kind: NetworkAttachmentDefinition
  namespace: "{{ cluster_healthcheck_mtv_namespace }}"
This should check in the openshift-cnv namespace
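Taken together with the previous comment, the check could first read the HyperConverged CR and only look up the NetworkAttachmentDefinition when a migration network is configured. This is a hedged sketch: it assumes the network name lives at `spec.liveMigrationConfig.network`, that the CR is served as `hco.kubevirt.io/v1beta1`, and that `cluster_healthcheck_kubevirt_namespace` resolves to `openshift-cnv`. The `__cluster_healthcheck_hco`, `__cluster_healthcheck_migration_network`, and `__cluster_healthcheck_migration_nad` names are hypothetical.

```yaml
# Sketch only: make the migration network check conditional on the HyperConverged CR.
- name: network_health | Read the HyperConverged CR
  kubernetes.core.k8s_info:
    api_version: hco.kubevirt.io/v1beta1
    kind: HyperConverged
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
  register: __cluster_healthcheck_hco

# Each default({}) guards a level that may be missing; the result is '' when unset.
- name: network_health | Extract the configured migration network, if any
  ansible.builtin.set_fact:
    __cluster_healthcheck_migration_network: >-
      {{ (((__cluster_healthcheck_hco.resources | first | default({})).spec
      | default({})).liveMigrationConfig | default({})).network | default('') }}

# Skipped entirely when no migration network is defined.
- name: network_health | Check migration network configuration
  kubernetes.core.k8s_info:
    api_version: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    name: "{{ __cluster_healthcheck_migration_network }}"
    namespace: "{{ cluster_healthcheck_kubevirt_namespace }}"
  register: __cluster_healthcheck_migration_nad
  when: __cluster_healthcheck_migration_network | length > 0
```

A later task could then report a failure only when the lookup ran and returned no resources, and report the check as not applicable when it was skipped.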
Adds a cluster_healthcheck role that validates OpenShift cluster health for virtualization migration readiness across six categories: OCP nodes, KubeVirt, MTV, storage, network, and post-migration VMs. Generates an HTML summary report with pass/fail/warning status.

Review feedback addressed:
- Fix CDI pod labels to use app.kubernetes.io/component selectors
- Fix Provider readiness to correctly detect Ready condition status
- Make migration network check conditional on HyperConverged CR config
- Check migration NAD in openshift-cnv namespace, not openshift-mtv
- Drop unrelated scaffolding file changes (CODE_OF_CONDUCT, etc.)
Force-pushed: d4928cd to 51d077e
Summary
Adds a new cluster_healthcheck role that validates the health of an OpenShift cluster for virtualization migration readiness. The role performs comprehensive checks across six categories and generates an HTML summary report with pass/fail/warning status and actionable recommendations.
Health checks included
Files added
Design decisions
- validate_migration role patterns (task naming, k8s_info usage, variable prefixing)
- cluster_healthcheck_ per collection convention
- __cluster_healthcheck_ double-underscore prefix
- (kubernetes.core.k8s_info, ansible.builtin.*)
- cluster_healthcheck_checks default
- cluster_healthcheck_post_migration_vms
Testing
- ansible-lint --profile production passes with 0 errors on the role (playbook FQCN resolution matches existing collection behavior)