15 changes: 14 additions & 1 deletion deploy/Makefile
@@ -78,6 +78,17 @@ fencing-kcli:

patch-nodes:
	@./openshift-clusters/scripts/patch-nodes.sh

ssh-node:
ifndef node
	@echo "Usage: make ssh-node node=<0|1> [cmd=\"command\"]"
	@echo "Examples:"
	@echo " make ssh-node node=0"
	@echo " make ssh-node node=1 cmd=\"sudo pcs status\""
	@exit 1
endif
	@./openshift-clusters/scripts/ssh-node.sh $(node) $(cmd)
Contributor

The $(cmd) might be expanded as an "empty" argument to the script if it's not provided. To allow for no command safely, we might need something like this:

@./openshift-clusters/scripts/ssh-node.sh $(node) $(if $(cmd),$(cmd))
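
For context on this concern: Make substitutes $(cmd) textually before the shell parses the recipe line, so an unset cmd should leave only trailing whitespace rather than an empty argument, and $(if $(cmd),$(cmd)) should expand to the same text as $(cmd). An empty argument appears only when the expansion is quoted. A minimal shell sketch of that difference (count_args is a hypothetical helper, not part of the PR):

```shell
#!/bin/bash
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

cmd=""  # stands in for an unset Make variable, which expands to nothing

unquoted=$(count_args 0 $cmd)    # unquoted empty expansion vanishes
quoted=$(count_args 0 "$cmd")    # quoted empty expansion stays as ""

echo "unquoted: ${unquoted} argument(s)"
echo "quoted:   ${quoted} argument(s)"
```

Under that assumption, the script would see a single argument for `make ssh-node node=0`, and its `[[ $# -gt 0 ]]` check after the shift would take the interactive branch.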


get-tnf-logs:
	@./openshift-clusters/scripts/get-tnf-logs.sh

@@ -117,5 +128,7 @@ help:
@echo " patch-nodes - Build resource-agents RPM and patch cluster nodes (default version: 4.11)"
@echo ""
@echo "Cluster Utilities:"
@echo " get-tnf-logs - Collect pacemaker and etcd logs from cluster nodes"
@echo " ssh-node node=<0|1> [cmd=\"..\"] - SSH into cluster node"
@echo " e.g. make ssh-node node=0 cmd=\"sudo pcs status\""
@echo " get-tnf-logs - Collect pacemaker and etcd logs from cluster nodes"

92 changes: 92 additions & 0 deletions deploy/openshift-clusters/scripts/ssh-node.sh
@@ -0,0 +1,92 @@
#!/bin/bash

set -euo pipefail
Contributor

This is going to make a few of the helpful error messages unreachable. I'll point them out below.
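
A small sketch of the behavior in question, assuming nothing about the real inventory: under `set -euo pipefail`, a `grep` that finds nothing inside a command substitution makes the assignment fail, and the script exits before any later check on the empty variable. The inner script runs in a child bash so the early exit can be observed.

```shell
#!/bin/bash
# Run a small script under the same strict mode in a child bash,
# so the early exit does not terminate this demo.
status=0
output=$(bash -c '
set -euo pipefail
MATCH=$(printf "alpha\n" | grep "beta")   # grep finds nothing, exits 1
if [[ -z "${MATCH}" ]]; then
    echo "friendly error"                  # never reached
fi
echo "end of script"
') || status=$?

echo "exit status: ${status}"
echo "output: '${output}'"
```

The child exits with grep's status and prints nothing, which is why the friendly error messages further down would not be shown.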


SCRIPT_DIR=$(dirname "$0")
DEPLOY_DIR="$(cd "${SCRIPT_DIR}/../.." && pwd)"
INVENTORY_FILE="${DEPLOY_DIR}/openshift-clusters/inventory.ini"

usage() {
    echo "Usage: $0 <node> [command]"
    echo ""
    echo "SSH into a cluster node via the hypervisor jump host."
    echo "If a command is provided, execute it and return."
    echo ""
    echo "Node can be specified as:"
    echo " master-0, master_0, node0, 0 -> first master node"
    echo " master-1, master_1, node1, 1 -> second master node"
    echo ""
    echo "Examples:"
    echo " $0 master-0 # Interactive SSH session"
    echo " $0 0 uptime # Run 'uptime' on master-0"
    echo " $0 1 'pcs status' # Run 'pcs status' on master-1"
    exit 1
}

if [[ $# -lt 1 ]]; then
    usage
fi

NODE_ARG="$1"
shift

# Check if inventory file exists
if [[ ! -f "${INVENTORY_FILE}" ]]; then
    echo "Error: Inventory file not found at ${INVENTORY_FILE}"
    echo "Run 'make inventory' first."
    exit 1
fi

# Normalize node argument to inventory name
case "${NODE_ARG}" in
    master-0|master_0|node0|0)
        NODE_PATTERN="master_0"
        ;;
    master-1|master_1|node1|1)
        NODE_PATTERN="master_1"
        ;;
    *)
        echo "Error: Unknown node '${NODE_ARG}'"
        usage
        ;;
esac

# Extract node IP from inventory
NODE_IP=$(grep -E "master_0|master_1" "${INVENTORY_FILE}" | grep "${NODE_PATTERN}" | grep -oP "ansible_host='\\K[^']+")
Contributor

I don't think this'll happen, but maybe we need to match the full word (master_0 or master_1), as the current pattern will also match master_10, master_100, and things like ostest_master_10.

One suggestion would be adding -w to the grep calls so that only whole words match.
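
A quick sketch of the difference, using hypothetical inventory lines (the real inventory.ini layout may differ):

```shell
#!/bin/bash
# Hypothetical inventory lines; the real inventory.ini layout may differ.
inventory=$(mktemp)
cat > "$inventory" <<'EOF'
master_0 ansible_host='192.168.111.20'
master_1 ansible_host='192.168.111.21'
master_10 ansible_host='192.168.111.30'
EOF

# Substring match: "master_1" also hits the master_10 line.
loose=$(grep -c "master_1" "$inventory")
# Whole-word match: the characters next to the match must not be
# letters, digits, or underscores.
strict=$(grep -cw "master_1" "$inventory")

echo "substring matches:  ${loose}"
echo "whole-word matches: ${strict}"
rm -f "$inventory"
```

One caveat: -w treats the underscore as a word character, so a prefixed host name such as ostest_master_1 would no longer match master_1 either. This only helps if the inventory names start with master_.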


if [[ -z "${NODE_IP}" ]]; then
Contributor

Might be unreachable: if the grep on line 55 doesn't find anything, it exits with code 1 and, under set -euo pipefail, the script exits before reaching this check.

Adding "|| true" to the command on line 55 should be enough, since we're checking the content of the grep output anyway.
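
A minimal sketch of that fix under the same strict mode. The inventory line is hypothetical, and the IP is extracted with a simplified `grep -oE` pattern here rather than the PR's `grep -oP` expression:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical inventory without the requested node.
inventory=$(mktemp)
echo "master_0 ansible_host='192.168.111.20'" > "$inventory"

# "|| true" swallows the pipeline's non-zero status so the check below still runs.
NODE_IP=$(grep -w "master_1" "$inventory" | grep -oE "[0-9]+(\.[0-9]+){3}" || true)

if [[ -z "${NODE_IP}" ]]; then
    message="Error: Could not find master_1 in inventory"
else
    message="Found ${NODE_IP}"
fi
echo "${message}"
rm -f "$inventory"
```

With the `|| true` in place, the empty-result branch is reached and the error message is printed instead of the script exiting silently.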

    echo "Error: Could not find ${NODE_PATTERN} in inventory"
    echo "Make sure the cluster is deployed and inventory is updated."
    exit 1
fi

# Extract hypervisor info from inventory
HYPERVISOR=$(grep -E "^[^#]*@" "${INVENTORY_FILE}" | head -1 | awk '{print $1}')
Contributor

This works well for parsing the ProxyJump host, but it could also hit the @ in "ansible_ssh_common_args" if the inventory were ordered differently. So although it works with the current inventory structure, it's a bit fragile.
Claude suggests parsing only the metal_machine section:
HYPERVISOR=$(awk '/^\[metal_machine\]/{found=1;next} /^\[/{found=0} found && /@/{print $1; exit}' "${INVENTORY_FILE}")
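
To illustrate the fragility, here is a sketch against a hypothetical inventory in which the node section comes first. The section name cluster_nodes, the host names, and the user are assumptions for the example, not taken from the real inventory.ini:

```shell
#!/bin/bash
# Hypothetical inventory with the node section listed before [metal_machine].
inventory=$(mktemp)
cat > "$inventory" <<'EOF'
[cluster_nodes]
master_0 ansible_host='192.168.111.20' ansible_ssh_common_args='-o ProxyJump=ec2-user@hypervisor.example.com'

[metal_machine]
ec2-user@hypervisor.example.com ansible_user=ec2-user
EOF

# Current approach: first non-comment line containing "@", first field.
first_at=$(grep -E "^[^#]*@" "$inventory" | head -1 | awk '{print $1}')

# Suggested approach: only look at lines inside the [metal_machine] section.
scoped=$(awk '/^\[metal_machine\]/{found=1;next} /^\[/{found=0} found && /@/{print $1; exit}' "$inventory")

echo "first '@' line: ${first_at}"
echo "metal_machine:  ${scoped}"
rm -f "$inventory"
```

With this ordering the first approach picks up the node name instead of the hypervisor, while the section-scoped awk still returns the user@host entry. A side effect of the awk version is that it exits 0 when nothing matches, so the empty-result check after it stays reachable under set -e.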


if [[ -z "${HYPERVISOR}" ]]; then
Contributor

Might be unreachable: if the grep on line 64 doesn't find anything, it exits with code 1 and, with set -euo pipefail, the script exits before reaching this check.

    echo "Error: Could not find hypervisor in inventory"
    exit 1
fi

# Common SSH options to avoid known_hosts issues after redeploys
SSH_OPTS=(
    -o StrictHostKeyChecking=no
    -o UserKnownHostsFile=/dev/null
    -o LogLevel=ERROR
Contributor

I would raise this above ERROR to make debugging easier. Note that OpenSSH has no WARN level; the next more verbose value is INFO.
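
One possible middle ground, sketched here as an assumption rather than a proposal from the PR: keep ERROR as the default but let the caller override it. SSH_LOG_LEVEL is a hypothetical variable name.

```shell
#!/bin/bash
# SSH_LOG_LEVEL is a hypothetical variable, not part of the PR.
# Valid OpenSSH values include QUIET, FATAL, ERROR, INFO, VERBOSE, and DEBUG.
SSH_OPTS=(
    -o StrictHostKeyChecking=no
    -o UserKnownHostsFile=/dev/null
    -o "LogLevel=${SSH_LOG_LEVEL:-ERROR}"
)

# Show the options that would be passed to ssh.
printf '%s\n' "${SSH_OPTS[*]}"
```

A call such as `SSH_LOG_LEVEL=DEBUG make ssh-node node=0` would then raise verbosity for a single run, while normal runs stay quiet about the known_hosts warnings that UserKnownHostsFile=/dev/null tends to produce at INFO.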

)

# ProxyJump with same options for the jump host
PROXY_CMD="ssh ${SSH_OPTS[*]} -W %h:%p ${HYPERVISOR}"

if [[ $# -gt 0 ]]; then
    # Run command and return
    ssh "${SSH_OPTS[@]}" \
        -o "ProxyCommand=${PROXY_CMD}" \
        "core@${NODE_IP}" "$@"
else
    # Interactive session
    echo "Connecting to ${NODE_PATTERN} (${NODE_IP}) via ${HYPERVISOR}..."
    ssh "${SSH_OPTS[@]}" \
        -o "ProxyCommand=${PROXY_CMD}" \
        "core@${NODE_IP}"
fi