Commit 99a120f

nxos-vxlan - lab cleanup
Signed-off-by: Harald Jensås <hjensas@redhat.com>
1 parent ed604ce commit 99a120f

9 files changed

Lines changed: 186 additions & 234 deletions

scenarios/networking-lab/devstack-nxsw-vxlan/TROUBLESHOOTING.md

Lines changed: 118 additions & 1 deletion
@@ -1,8 +1,9 @@
 # Troubleshooting Guide: Cisco NX-OS VXLAN EVPN

-This guide provides useful Cisco NX-OS commands for troubleshooting the spine-and-leaf VXLAN EVPN topology.
+This guide provides useful Cisco NX-OS commands for troubleshooting the spine-and-leaf VXLAN EVPN topology, as well as packet capture techniques for debugging network connectivity.

 ## Table of Contents
+- [Packet Capture and Traffic Analysis](#packet-capture-and-traffic-analysis)
 - [BGP EVPN Control Plane](#bgp-evpn-control-plane)
 - [VXLAN Overlay](#vxlan-overlay)
 - [VLAN Configuration](#vlan-configuration)
@@ -12,6 +13,122 @@ This guide provides useful Cisco NX-OS commands for troubleshooting the spine-an
 - [Port Configuration](#port-configuration)
 - [General System Information](#general-system-information)

+## Packet Capture and Traffic Analysis
+
+### Devstack Node - Monitoring ARP Traffic
+
+Monitor ARP traffic on the devstack node to troubleshoot connectivity between the overcloud and baremetal nodes.
+
+#### Monitor ARP on br-ex (OVS Bridge)
+```bash
+stack@devstack:~$ sudo tcpdump -i br-ex -envv arp
+```
+Shows ARP traffic on the OVS external bridge. You should see VLAN-tagged ARP requests from baremetal nodes coming in, and ARP replies from the router going out.
+
+**Expected output:**
+```
+22:dd:04:01:1b:08 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 78: vlan 103, p 0, ethertype ARP (0x0806), Request who-has 10.0.5.1 tell 10.0.5.87
+fa:16:3e:12:24:7c > 22:dd:04:01:1b:08, ethertype 802.1Q (0x8100), length 78: vlan 103, p 0, ethertype ARP (0x0806), Reply 10.0.5.1 is-at fa:16:3e:12:24:7c
+```
+
+#### Monitor ARP on trunk0 (Physical Interface)
+```bash
+stack@devstack:~$ sudo tcpdump -i trunk0 -envv arp
+```
+Shows ARP traffic on the physical trunk interface that connects to leaf01. This helps verify if packets are actually leaving/entering the physical interface with correct VLAN tags.
+
+**What to look for:**
+- **Ingress**: ARP requests from baremetal nodes should arrive with VLAN tags
+- **Egress**: ARP replies from router should leave with VLAN tags
+- If you see traffic on br-ex but NOT on trunk0, there's an OVS flow issue
+- If VLAN tags are present on trunk0 but missing when they arrive at the switch, there's an undercloud Neutron trunk issue
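
The tag checks above can be scripted when a capture is long. The following is a minimal sketch, assuming capture lines use the same `tcpdump -e` layout as the expected output above; `LINE_RE` and `parse_line` are hypothetical helper names, not part of the lab tooling.

```python
import re

# Matches the link-level header printed by `tcpdump -e`.
# Assumption: lines look like the expected output shown in the guide,
# optionally preceded by a timestamp.
LINE_RE = re.compile(
    r"(?P<src>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}) > "
    r"(?P<dst>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}), "
    r"ethertype [^,]+? \(0x[0-9a-f]{4}\)"
    r"(?:.*?vlan (?P<vlan>\d+))?"
)


def parse_line(line):
    """Return src/dst MAC and VLAN ID (None if untagged), or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    vlan = match.group("vlan")
    return {
        "src": match.group("src"),
        "dst": match.group("dst"),
        "vlan": int(vlan) if vlan is not None else None,
    }


if __name__ == "__main__":
    sample = (
        "22:dd:04:01:1b:08 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
        "length 78: vlan 103, p 0, ethertype ARP (0x0806), "
        "Request who-has 10.0.5.1 tell 10.0.5.87"
    )
    print(parse_line(sample))
```

A frame whose `vlan` field comes back as `None` was captured untagged, which is the condition the bullets above ask you to look for.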
+
+#### Monitor All Traffic on trunk0
+```bash
+stack@devstack:~$ sudo tcpdump -i trunk0 -envv
+```
+Shows all traffic including STP, LLDP, and data packets. Useful for verifying physical connectivity.
+
+### Cisco Switch - Monitoring Traffic
+
+Monitor traffic on Cisco NX-OS switches using the built-in ethanalyzer tool.
+
+#### Monitor ARP on a Specific Interface
+```bash
+leaf01# ethanalyzer local interface front-panel eth1/3 display-filter arp
+leaf01# ethanalyzer local interface front-panel eth1/4 display-filter arp
+```
+Captures and displays ARP traffic on the specified front-panel interface (e.g., Ethernet1/3, Ethernet1/4).
+
+**Expected output:**
+```
+2026-03-17 23:16:05.641404 fa:16:3e:12:24:7c -> 22:dd:04:01:1b:08 ARP 10.0.5.1 is at fa:16:3e:12:24:7c
+```
+
+**What to check:**
+- Are ARP packets arriving at the switch?
+- Are VLAN tags present or stripped?
+- Is bidirectional ARP traffic visible (both requests and replies)?
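
The bidirectional check can be sketched as a small classifier over saved ethanalyzer lines. The reply format is taken from the expected output above; the request format (`Who has ...? Tell ...`) is an assumption based on typical tshark summaries, and the request line in the example is hypothetical. `classify` and `seen_kinds` are illustrative names.

```python
import re

# "<timestamp> <src> -> <dst> ARP <info>" as in the expected output above.
ETH_RE = re.compile(r"(?P<src>\S+) -> (?P<dst>\S+)\s+ARP\s+(?P<info>.*)")


def classify(line):
    """Return 'request', 'reply', 'other', or None for non-ARP lines."""
    match = ETH_RE.search(line)
    if match is None:
        return None
    info = match.group("info")
    if " is at " in info:
        return "reply"
    # Assumption: requests are summarized as "Who has <ip>? Tell <ip>".
    if info.lower().startswith("who has"):
        return "request"
    return "other"


def seen_kinds(lines):
    """Collect which ARP directions appear in a capture."""
    return {kind for kind in map(classify, lines) if kind in ("request", "reply")}


if __name__ == "__main__":
    capture = [
        # Hypothetical request line; only the reply below comes from the guide.
        "2026-03-17 23:16:05.640000 22:dd:04:01:1b:08 -> ff:ff:ff:ff:ff:ff "
        "ARP Who has 10.0.5.1? Tell 10.0.5.87",
        "2026-03-17 23:16:05.641404 fa:16:3e:12:24:7c -> 22:dd:04:01:1b:08 "
        "ARP 10.0.5.1 is at fa:16:3e:12:24:7c",
    ]
    print("bidirectional:", seen_kinds(capture) == {"request", "reply"})
```

If only one kind shows up, the capture is one-way, which narrows the fault to the direction that is missing.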
+
+#### Monitor All Traffic on an Interface
+```bash
+leaf01# ethanalyzer local interface front-panel eth1/3
+```
+Shows all traffic including STP, ARP, and data packets. Press Ctrl+C to stop.
+
+#### Limit Number of Packets Captured
+```bash
+leaf01# ethanalyzer local interface front-panel eth1/3 limit-captured-frames 20
+```
+Captures only 20 frames and then stops automatically.
+
+#### Monitor Specific Protocol
+```bash
+# Monitor only ICMP traffic
+leaf01# ethanalyzer local interface front-panel eth1/3 display-filter icmp
+
+# Monitor only IPv4 traffic
+leaf01# ethanalyzer local interface front-panel eth1/3 display-filter ip
+```
+
+### Debugging VLAN Tag Issues
+
+If you suspect VLAN tags are being stripped or not applied correctly:
+
+1. **On devstack, capture on trunk0:**
+```bash
+sudo tcpdump -i trunk0 -envv 'vlan 103'
+```
+Verify that outbound traffic has VLAN tags
+
+2. **On leaf01, capture on the corresponding interface:**
+```bash
+ethanalyzer local interface front-panel eth1/4 display-filter arp
+```
+Check if the same traffic arrives with or without tags
+
+3. **Compare**:
+- If traffic leaves trunk0 tagged but arrives at the switch untagged → undercloud Neutron trunk issue
+- If traffic doesn't leave trunk0 at all → OVS flow issue on devstack
+- If traffic arrives tagged but isn't forwarded → switch VLAN configuration issue
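
The comparison in step 3 amounts to a small decision table. A minimal sketch follows, with the three diagnoses copied from the list above; the function name `diagnose` and its boolean parameters are hypothetical, and the fallback string is an addition for cases the guide does not cover.

```python
def diagnose(leaves_trunk0, tagged_on_trunk0, tagged_at_switch, forwarded_by_switch):
    """Map capture observations to the likely fault named in the guide."""
    if not leaves_trunk0:
        return "OVS flow issue on devstack"
    if tagged_on_trunk0 and not tagged_at_switch:
        return "undercloud Neutron trunk issue"
    if tagged_at_switch and not forwarded_by_switch:
        return "switch VLAN configuration issue"
    # Not covered by the three rules in the guide.
    return "no rule matched; keep investigating"


if __name__ == "__main__":
    # Tagged on trunk0, untagged on leaf01 Ethernet1/4.
    print(diagnose(
        leaves_trunk0=True,
        tagged_on_trunk0=True,
        tagged_at_switch=False,
        forwarded_by_switch=False,
    ))
```

The rules are checked in the order the traffic travels (devstack, then the link, then the switch), so the first fault along the path is the one reported.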
+
+### Check MAC Learning on Switches
+
+After seeing ARP traffic, verify that the switches learned the MAC addresses:
+
+```bash
+# Check specific VLAN
+show mac address-table vlan 103
+
+# Check specific interface
+show mac address-table interface ethernet 1/3
+```
+
+If MAC addresses aren't being learned, check:
+- Interface is in correct VLAN
+- VLAN exists in NVE peer list
+- BGP EVPN is advertising the routes
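
When matching a MAC seen in tcpdump against the switch table, note that tcpdump prints colon-separated bytes while NX-OS prints three dot-separated groups. A minimal sketch of the conversion and lookup is below; `to_cisco_mac` and `find_mac` are hypothetical helpers, and the sample table row is an illustrative layout rather than output captured from this lab.

```python
import re


def to_cisco_mac(mac):
    """Convert 'fa:16:3e:12:24:7c' style MACs to 'fa16.3e12.247c'."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


def find_mac(table_output, mac):
    """Return the lines of `show mac address-table` output that mention mac."""
    needle = to_cisco_mac(mac)
    return [
        line.strip()
        for line in table_output.splitlines()
        if needle in line.lower()
    ]


if __name__ == "__main__":
    # Illustrative row only; real column layout may differ by NX-OS release.
    table = "*  103     fa16.3e12.247c   dynamic  0         F      F    Eth1/3"
    print(to_cisco_mac("fa:16:3e:12:24:7c"))
    print(find_mac(table, "fa:16:3e:12:24:7c"))
```

An empty result from `find_mac` corresponds to the "MAC addresses aren't being learned" case above.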
+
 ## BGP EVPN Control Plane

 ### Check BGP EVPN Session Status

scenarios/networking-lab/devstack-nxsw-vxlan/automation-vars.yml

Lines changed: 4 additions & 17 deletions
@@ -32,7 +32,7 @@ stages:
 exit 1
 fi

-ssh -o StrictHostKeyChecking=no stack@devstack.netlab.example.com "
+ssh -o StrictHostKeyChecking=no stack@devstack.stack.lab "
 ROUTES=\$(ip -j r)
 ROUTE_EXISTS=\$(echo \"\$ROUTES\" | python3 -c '
 import sys, json
@@ -102,28 +102,15 @@ stages:
 Transitions nodes from 'enroll' to 'manageable' state. This validates
 basic hardware connectivity and prepares nodes for further operations.
 shell: |
-set -xe -o pipefail
+set -x -o pipefail

 # Get list of node UUIDs
 node_uuids=$(openstack --os-cloud devstack-admin baremetal node list -f value -c UUID)

-# Manage each node
+# Manage each node with --wait (300 second timeout)
 for uuid in $node_uuids; do
 echo "Managing node: $uuid"
-openstack --os-cloud devstack-admin baremetal node manage $uuid
-done
-
-# Wait for manageable state
-counter=0
-max_retries=60
-until ! openstack --os-cloud devstack-admin baremetal node list -f value -c "Provisioning State" | grep -v "manageable"; do
-((counter++))
-if (( counter > max_retries )); then
-echo "ERROR: Timeout waiting for nodes to reach manageable state"
-openstack --os-cloud devstack-admin baremetal node list
-exit 1
-fi
-sleep 5
+openstack --os-cloud devstack-admin baremetal node manage --wait 300 $uuid
 done

 echo "All nodes successfully reached manageable state"
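
One thing to watch in this hunk: with `-e` dropped from `set -xe`, a `manage --wait` call that fails or times out no longer aborts the loop, so the final success message prints either way. The sketch below shows the same loop with per-node failure tracking. It is written in Python rather than the stage's shell, and `manage_nodes` and its `run` parameter are hypothetical; only the `openstack ... baremetal node manage --wait 300` invocation comes from the diff.

```python
import subprocess


def manage_nodes(node_uuids, timeout=300, cloud="devstack-admin", run=subprocess.run):
    """Run `baremetal node manage --wait` per node and return the UUIDs that failed."""
    failed = []
    for uuid in node_uuids:
        cmd = [
            "openstack", "--os-cloud", cloud,
            "baremetal", "node", "manage",
            "--wait", str(timeout), uuid,
        ]
        print(f"Managing node: {uuid}")
        # A non-zero exit status is treated as a failure for this node.
        if run(cmd).returncode != 0:
            failed.append(uuid)
    return failed


if __name__ == "__main__":
    class FakeResult:
        """Stand-in for subprocess.CompletedProcess, used for a dry run."""
        def __init__(self, returncode):
            self.returncode = returncode

    # Dry run: pretend the node whose UUID ends in 'b' fails.
    failures = manage_nodes(
        ["node-a", "node-b"],
        run=lambda cmd: FakeResult(1 if cmd[-1].endswith("b") else 0),
    )
    print("failed nodes:", failures)
```

Continuing past a failed node and reporting all failures at the end is one possible reason for dropping `-e`; in that case the closing message would need to depend on the failure list being empty.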
