Commit ed9306d

document hostBGP subnets and container
Signed-off-by: Emanuele Di Pascale <emanuele@githedgehog.com>
1 parent 589e44c commit ed9306d

2 files changed

Lines changed: 214 additions & 0 deletions

docs/user-guide/host-settings.md

Lines changed: 191 additions & 0 deletions
@@ -88,3 +88,194 @@ kubectl fabric vpc attach --vpc-subnet vpc-2/default --connection server-1--leaf
[bonding]: https://www.kernel.org/doc/html/latest/networking/bonding.html

## HostBGP container

When using [HostBGP subnets](vpcs.md#hostbgp-subnets), BGP must be running on the host server with
an appropriate configuration applied. To simplify these steps, Hedgehog provides a
Docker container that automatically starts [FRR](https://docs.frrouting.org/en/latest/) with
a valid configuration to join the Fabric.

As a first step, download the Docker image from our registry:
```bash
docker pull ghcr.io/githedgehog/host-bgp
```

The container should then be run with host networking (so that FRR can communicate with the leaves
using the host's interfaces) and in privileged mode. It also takes the following input parameters:

- an optional ASN to use for BGP. If present, it must be the first parameter; if omitted, the container uses ASN 64999
- one or more VPC subnets with their related parameters, in the format
  `<VPC-SUBNET-NAME>:v=<VLAN>:i=<INTERFACE1>[:i=<INTERFACE2>...]:a=<ADDRESS1>[:a=<ADDRESS2>...]`, where:
    - `<VPC-SUBNET-NAME>` is a mnemonic ID for the VPC subnet to attach to.
      It can be anything, as long as it is a legal name for a route-map or prefix-list in FRR.
    - `v=<VLAN>` is the VLAN ID to be used for the VPC subnet; use 0 for untagged.
    - `i=<INTERFACE1>` is an interface used to establish a BGP unnumbered session with a
      Fabric leaf; if a VLAN ID was specified, a corresponding VLAN interface is created using
      the provided interface as the master device.
    - `a=<ADDRESS1>` is the Virtual IP (VIP) to be advertised to the leaves; it must have
      a prefix length of /32 and be part of the subnet the host is attaching to.

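The argument format above is easy to mistype when several interfaces or addresses are involved. The following is a minimal sketch of a shell helper that assembles one subnet argument and applies two basic checks. `build_subnet_arg` is a hypothetical name and is not part of the host-bgp image; the upper VLAN bound of 4094 is an assumption based on 802.1Q, not something the container is documented to enforce.

```bash
#!/usr/bin/env bash
# Hypothetical helper (NOT shipped with the host-bgp image): assemble one
# VPC-subnet argument in the documented format
#   <VPC-SUBNET-NAME>:v=<VLAN>:i=<IF>[:i=<IF>...]:a=<ADDR>[:a=<ADDR>...]
# Usage: build_subnet_arg NAME VLAN "IF1 IF2 ..." "ADDR1 ADDR2 ..."
build_subnet_arg() {
    local name=$1 vlan=$2 ifaces=$3 addrs=$4
    local arg item

    # VLAN must be numeric; 0 means untagged.
    # Assumption: 4094 (the 802.1Q maximum) is used as the upper bound.
    if ! [[ $vlan =~ ^[0-9]+$ ]] || (( 10#$vlan > 4094 )); then
        echo "invalid VLAN ID: $vlan" >&2
        return 1
    fi

    arg="${name}:v=${vlan}"
    for item in $ifaces; do
        arg+=":i=${item}"
    done
    for item in $addrs; do
        # Each VIP is expected to carry a /32 prefix length.
        if [[ $item != */32 ]]; then
            echo "VIP must be a /32: $item" >&2
            return 1
        fi
        arg+=":a=${item}"
    done
    echo "$arg"
}

# Prints the argument for one subnet with two interfaces and one VIP.
build_subnet_arg vpc-01 1001 "enp2s1 enp2s2" "10.100.34.5/32"
```

The printed string can then be passed to the container after the optional ASN. The helper does not check that the VIP belongs to the VPC subnet, nor that the name is legal in FRR.
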
As an example, the command might look something like this:
```bash
docker run --network=host --privileged --rm --detach --name hostbgp ghcr.io/githedgehog/host-bgp 64307 vpc-01:v=1001:i=enp2s1:i=enp2s2:a=10.100.34.5/32
```

!!! note
    Because the above command runs the container detached (`--detach`), any output it produces is not
    visible in the terminal where it was started. Verify that the container is running with `docker ps`, or
    examine its logs with `docker logs hostbgp` to investigate a failure.

With the above command:

- VLAN interfaces `enp2s1.1001` and `enp2s2.1001` are created, if they do not already exist
- BGP unnumbered sessions are established on those same interfaces, using ASN 64307
- the address `10.100.34.5/32` is configured on the loopback of the host server and advertised to the leaves

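To check these effects on the host, one can derive the expected interface names from the same argument and inspect them with iproute2. The sketch below only prints the `ip` commands to run. `expected_ifaces` is a hypothetical helper, and its handling of `v=0` (using the base interface as-is) is an assumption about the untagged case.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the interfaces on which a subnet argument is
# expected to run BGP, following the <interface>.<vlan> naming shown above.
expected_ifaces() {
    local spec=$1 vlan="" field
    local -a fields ifaces=()

    # Split the argument on ':'; the first field is the subnet name.
    IFS=: read -r -a fields <<< "$spec"
    for field in "${fields[@]:1}"; do
        case $field in
            v=*) vlan=${field#v=} ;;
            i=*) ifaces+=("${field#i=}") ;;
        esac
    done

    for field in "${ifaces[@]}"; do
        if [[ -n $vlan && $vlan != 0 ]]; then
            echo "${field}.${vlan}"
        else
            # Assumption: with v=0 the base interface is used untagged.
            echo "$field"
        fi
    done
}

# Print (do not run) the inspection commands for the example argument.
for ifc in $(expected_ifaces "vpc-01:v=1001:i=enp2s1:i=enp2s2:a=10.100.34.5/32"); do
    echo "ip -d link show dev $ifc"
done
echo "ip address show dev lo"
```

Running the printed commands on the host shows whether the VLAN interfaces exist and whether the VIP is present on the loopback.
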
To further modify the configuration or to troubleshoot the state of the system, an
expert user can invoke the FRR CLI using the following command:
```bash
docker exec -it hostbgp vtysh
```

For example, one could use vtysh to verify the configuration generated with the above command:
```bash
$ docker exec -t hostbgp vtysh -c "show run"
Building configuration...

Current configuration:
!
frr version 10.5.1_git
frr defaults traditional
hostname server-04
service integrated-vtysh-config
!
ip prefix-list vpc-01 seq 5 permit 10.100.34.5/32
!
route-map vpc-01 permit 10
 match ip address prefix-list vpc-01
exit
!
interface lo
 ip address 10.100.34.5/32
exit
!
router bgp 64307
 no bgp ebgp-requires-policy
 bgp bestpath as-path multipath-relax
 timers bgp 3 9
 neighbor enp2s1.1001 interface remote-as external
 neighbor enp2s2.1001 interface remote-as external
 !
 address-family ipv4 unicast
  network 10.100.34.5/32
  neighbor enp2s1.1001 route-map vpc-01 out
  neighbor enp2s2.1001 route-map vpc-01 out
  maximum-paths 4
 exit-address-family
exit
!
end
```

To stop the container, run the following command:
```bash
docker stop -t 1 hostbgp
```

Note that stopping the Docker container does not currently remove the VIPs from the loopback, nor
does it delete the VLAN interfaces. If needed, remove them manually; for example, to undo the
reference command above with iproute2, one could run:
```bash
sudo ip address delete dev lo 10.100.34.5/32
sudo ip link delete dev enp2s1.1001
sudo ip link delete dev enp2s2.1001
```

Users should consider starting the hostbgp container automatically at system boot, so
that connectivity is restored after a reboot.

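One possible way to do this on a systemd-based host is a small service unit wrapping the same `docker run` command. The sketch below prints such a unit; it is only an illustration under stated assumptions: the function name `hostbgp_unit`, the unit name `hostbgp.service`, the `/usr/bin/docker` path and the dependency on `docker.service` are all hypothetical and may differ on a given system. The container is started without `--detach` so that systemd supervises the foreground process.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: print a systemd unit that starts the hostbgp container
# at boot. All arguments are passed through to the container unchanged.
hostbgp_unit() {
    local args=$*
    cat <<EOF
[Unit]
Description=Hedgehog HostBGP container
After=docker.service network-online.target
Requires=docker.service
Wants=network-online.target

[Service]
# Remove a stale container with the same name, ignoring failures ("-" prefix).
ExecStartPre=-/usr/bin/docker rm --force hostbgp
ExecStart=/usr/bin/docker run --network=host --privileged --rm --name hostbgp ghcr.io/githedgehog/host-bgp ${args}
ExecStop=/usr/bin/docker stop -t 1 hostbgp
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Print a unit for the reference command used earlier in this section.
hostbgp_unit 64307 vpc-01:v=1001:i=enp2s1:i=enp2s2:a=10.100.34.5/32
```

The output could be saved as `/etc/systemd/system/hostbgp.service` and activated with `systemctl daemon-reload` followed by `systemctl enable --now hostbgp.service`.
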
### Example: multi-VPC multi-homed server

Let's assume that `server-03` is attached to both `leaf-01` and `leaf-02` with unbundled connections
`server-03--unbundled--leaf-01` and `server-03--unbundled--leaf-02`, and that we want it to be part
of two separate VPCs using HostBGP. We can create the VPCs and attachments, for example from the control node,
using the Fabric `kubectl` plugin:

```bash
core@control-1 ~ $ kubectl fabric vpc create --name=vpc-01 --subnet=10.0.1.0/24 --vlan=1001 --host-bgp=true
10:04:09 INF VPC created name=vpc-01
core@control-1 ~ $ kubectl fabric vpc create --name=vpc-02 --subnet=10.0.2.0/24 --vlan=1002 --host-bgp=true
10:04:24 INF VPC created name=vpc-02
core@control-1 ~ $ kubectl fabric vpc attach --name=s3-v1-l1 --conn=server-03--unbundled--leaf-01 --subnet=vpc-01/default
10:05:59 INF VPCAttachment created name=s3-v1-l1
core@control-1 ~ $ kubectl fabric vpc attach --name=s3-v1-l2 --conn=server-03--unbundled--leaf-02 --subnet=vpc-01/default
10:06:08 INF VPCAttachment created name=s3-v1-l2
core@control-1 ~ $ kubectl fabric vpc attach --name=s3-v2-l1 --conn=server-03--unbundled--leaf-01 --subnet=vpc-02/default
10:06:24 INF VPCAttachment created name=s3-v2-l1
core@control-1 ~ $ kubectl fabric vpc attach --name=s3-v2-l2 --conn=server-03--unbundled--leaf-02 --subnet=vpc-02/default
10:06:33 INF VPCAttachment created name=s3-v2-l2
```

Then we can configure `server-03` using the provided container:

```bash
docker run --network=host --privileged --rm --detach --name hostbgp ghcr.io/githedgehog/host-bgp vpc-01:v=1001:i=enp2s1:i=enp2s2:a=10.0.1.3/32 vpc-02:v=1002:i=enp2s1:i=enp2s2:a=10.0.2.3/32
```

This will generate the following FRR configuration:
```
!
ip prefix-list vpc-01 seq 5 permit 10.0.1.3/32
ip prefix-list vpc-02 seq 5 permit 10.0.2.3/32
!
route-map vpc-01 permit 10
 match ip address prefix-list vpc-01
exit
!
route-map vpc-02 permit 10
 match ip address prefix-list vpc-02
exit
!
interface lo
 ip address 10.0.1.3/32
 ip address 10.0.2.3/32
exit
!
router bgp 64999
 no bgp ebgp-requires-policy
 bgp bestpath as-path multipath-relax
 timers bgp 3 9
 neighbor enp2s1.1001 interface remote-as external
 neighbor enp2s1.1002 interface remote-as external
 neighbor enp2s2.1001 interface remote-as external
 neighbor enp2s2.1002 interface remote-as external
 !
 address-family ipv4 unicast
  network 10.0.1.3/32
  network 10.0.2.3/32
  neighbor enp2s1.1001 route-map vpc-01 out
  neighbor enp2s1.1002 route-map vpc-02 out
  neighbor enp2s2.1001 route-map vpc-01 out
  neighbor enp2s2.1002 route-map vpc-02 out
  maximum-paths 4
 exit-address-family
exit
!
```

We can then verify on either of the leaves attached to `server-03` that each VIP is only
learned in the VPC it belongs to:
```
leaf-01# show ip route vrf VrfVvpc-01
Codes:  K - kernel route, C - connected, S - static, B - BGP, O - OSPF, A - attached-host
        > - selected route, * - FIB route, q - queued route, r - rejected route, b - backup
       Destination        Gateway                                      Dist/Metric   Last Update
--------------------------------------------------------------------------------------------------------------------------------
 B>*   10.0.1.3/32        via fe80::e20:12ff:fefe:401  Ethernet1.1001  20/0          00:09:43 ago
leaf-01# show ip route vrf VrfVvpc-02
Codes:  K - kernel route, C - connected, S - static, B - BGP, O - OSPF, A - attached-host
        > - selected route, * - FIB route, q - queued route, r - rejected route, b - backup
       Destination        Gateway                                      Dist/Metric   Last Update
--------------------------------------------------------------------------------------------------------------------------------
 B>*   10.0.2.3/32        via fe80::e20:12ff:fefe:401  Ethernet1.1002  20/0          00:09:47 ago
leaf-01#
```

docs/user-guide/vpcs.md

Lines changed: 23 additions & 0 deletions
@@ -56,6 +56,11 @@ spec:
      subnet: 10.10.100.0/24
      vlan: 1100

    bgp-on-host: # Another subnet with hosts peering with leaves via BGP
      subnet: 10.10.50.0/25
      hostBGP: true
      vlan: 1050

  permit: # Defines which subnets of the current VPC can communicate to each other, applied on top of subnets "isolated" flag (doesn't affect VPC peering)
    - [subnet-1, subnet-2, subnet-3] # 1, 2 and 3 subnets can communicate to each other
    - [subnet-4, subnet-5] # Possible to define multiple lists
@@ -108,6 +113,24 @@ packet:
  Fabric and will be in `VrfV<VPC-name>` format, for example `VrfVvpc-1` for a VPC named `vpc-1` in the Fabric API.
* _CircuitID_ (suboption 1) identifies the VLAN which, together with the VRF (VPC) name, maps to a specific VPC subnet.

### HostBGP subnets

It is sometimes useful to run BGP directly on the host and peer with the Fabric, for example
to support active-active multi-homed servers, or simply to provide redundancy when other techniques such
as MCLAG or ESLAG are not available (for instance because of hardware limitations).

Consider this scenario: `server-1` is connected to two different Fabric switches, `sw-1` and `sw-2`, and attached to
`vpc-1/subnet-1` on both of them. This subnet is configured as `hostBGP`, so the switches are configured to peer with
`server-1` using unnumbered BGP (IPv4 unicast address family), importing only /32 prefixes within the VPC subnet and
exporting routes learned from other VPC peers. Similarly, BGP runs on `server-1`, which establishes an unnumbered BGP
session with each leaf and advertises one or more Virtual IPs (VIPs) in the VPC subnet. With this setup, the
host is part of the VPC and can be reached at any of the advertised VIPs over either link to the Fabric.

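Since the switches only import /32 prefixes that fall within the VPC subnet, it can help to check a candidate VIP before advertising it. The following is a minimal IPv4-only sketch; `ip_to_int` and `vip_in_subnet` are hypothetical helper names, the input is assumed to be well-formed dotted-quad notation, and the subnet `10.10.50.0/25` is taken from the `bgp-on-host` example above.

```bash
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed only if VIP has a /32 prefix length and lies inside SUBNET (CIDR).
# Usage: vip_in_subnet VIP/32 NETWORK/LEN
vip_in_subnet() {
    local vip=$1 subnet=$2
    [[ $vip == */32 ]] || return 1

    local net=${subnet%/*} len=${subnet#*/}
    # Build the netmask as an integer, e.g. /25 -> 0xFFFFFF80.
    local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))

    # The VIP is inside the subnet if both share the same network bits.
    (( ( $(ip_to_int "${vip%/*}") & mask ) == ( $(ip_to_int "$net") & mask ) ))
}

for vip in 10.10.50.5/32 10.10.50.200/32; do
    if vip_in_subnet "$vip" 10.10.50.0/25; then
        echo "$vip: inside the subnet"
    else
        echo "$vip: outside the subnet or not a /32"
    fi
done
```

Here `10.10.50.5/32` passes the check, while `10.10.50.200/32` does not, because a /25 starting at `10.10.50.0` only covers host addresses up to `10.10.50.127`.
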
Keep in mind that Hedgehog Fabric does not directly operate the host servers attached to it;
using subnets in HostBGP mode requires running a routing suite on the host and configuring it accordingly. To facilitate
this process, we provide a container image that can generate a valid configuration from a few input parameters.
For more details, see [the related section in the Host Settings page](host-settings.md#hostbgp-container).

## VPCAttachment

A VPCAttachment represents a specific VPC subnet assignment to the `Connection` object which means a binding between an
