Hairpin

Jean-François Roy edited this page Jul 27, 2025 · 2 revisions

https://discord.com/channels/673534664354430999/1097925444566794260/1274037461609218202

[09:40]Perihelion: I used a 10.82.2.0/24 cidr for my BGP services. If I don't have a LAN with that CIDR on the unifi side, it'll NAT
[09:58]wavefront ʲᶠʳᵒʸ[HOPS]
: Really. I need to check that. How did you tell or confirm or find out it was doing that? I certainly don’t have a network with my pod or service cidr.
[10:02]wavefront ʲᶠʳᵒʸ[HOPS]
: I don’t even know how that’s supposed to work — AFAIK when you create a unifi network it insists on assigning a unique vlan ID to it. But I don’t want to vlan that traffic?
[11:06]wavefront ʲᶠʳᵒʸ[HOPS]
: I don't see any rules in the nat table that would apply to traffic on the cluster service cidr.
[11:14]wavefront ʲᶠʳᵒʸ[HOPS]
: There are rules for the WAN interfaces, from miniupnpd and that's it.
[11:46]Rodent: This wasn't pod cidr, it was loadbalancers routing via bgp
[12:14]wavefront ʲᶠʳᵒʸ[HOPS]
: Yeah he was talking about NAT happening on the UDM when routing a (virtual) LB address flow with routes configured via BGP.
[12:14]wavefront ʲᶠʳᵒʸ[HOPS]
: And I just don't see any entries in the nat table indicating this. Can nat happen in other ways?
[12:30]Perihelion: I'm guessing it can but I'm too lazy to go through all of the routers iptables chains to figure out why
[12:30]Perihelion: Just make a fake network and all's good
[12:31]Perihelion: >How did you check
Whoami would show me all connections originating from the router IP instead of the actual client IPs
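The check described above can be reproduced with any service that reports the peer address it sees. Below is a minimal sketch of such an echo service, assuming nothing about the actual whoami image used in the chat; the names `WhoAmIHandler`, `start_whoami` and `query` are made up for illustration. If the router masquerades the flow, `remote_addr` would be the router's IP rather than the client's.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class WhoAmIHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a whoami/echo service."""

    def do_GET(self):
        # Report the TCP peer address and any X-Forwarded-For header.
        body = json.dumps({
            "remote_addr": self.client_address[0],
            "x_forwarded_for": self.headers.get("X-Forwarded-For"),
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


def start_whoami(host="127.0.0.1", port=0):
    """Start the echo service in a background thread (port 0 picks a free port)."""
    server = HTTPServer((host, port), WhoAmIHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(server):
    """Ask the service which source address it saw for this connection."""
    host, port = server.server_address[:2]
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Run behind a LoadBalancer address and queried from a LAN client, the reported `remote_addr` is the thing to compare against the client's real IP.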
[12:32]Perihelion: unifi vs vyos. Exactly same cilium config and mostly same bgpd config 
Image
Image
[12:36]DrAg0n141: I don't have that. When I access my echo server, it shows my own IP as real-ip and not the one of my UDM.
[12:48]Perihelion: Huh, indeed 😐
[12:48]wavefront ʲᶠʳᵒʸ[HOPS]
: Yeah I am seeing my client IP in X-Forwarded-For
[12:48]Perihelion: And you don't have a network with that cidr defined in unifi itself?
[12:48]wavefront ʲᶠʳᵒʸ[HOPS]
: Nope
[12:49]Perihelion: Wait, i have an idea maybe? Do you have a port forward rule for 80/443 from internet or do you use cloudflare tunnels?
[12:51]wavefront ʲᶠʳᵒʸ[HOPS]
: I don't have port forwarding, I do have a cloudflare tunnel, but I did not use it for the tests I just ran. Let me check on my phone using its mobile connection.
[12:51]Perihelion: no no
[12:51]Perihelion: make a portforward rule from internet to 80/443
[12:51]Perihelion: it's the hairpin i bet
[12:52]DrAg0n141: But then you have a port forward rule that does the nat
[12:52]Perihelion: indeed!
[12:52]wavefront ʲᶠʳᵒʸ[HOPS]
: Ah, yeah.
[12:52]Perihelion: it's thinking "ah you're going into 80/443, so i have to hairpin, because i don't recognize that subnet as my own"
[12:53]Perihelion: old edgeos had a specific switch for that
[12:53]Perihelion: this one doesn't
[12:53]Perihelion: and vyos just didn't do hairpin at all 😄
[12:54]wavefront ʲᶠʳᵒʸ[HOPS]
: It's quite possible the unifios control plane looks at the networks when it sets up the NAT table rules.
[12:54]Perihelion: i could prove it real fast...
[12:55]Perihelion: YEP
[12:55]Perihelion: removed port forwards, removed the fake network
[12:55]Perihelion: success
Image
[12:55]wavefront ʲᶠʳᵒʸ[HOPS]
: I have a simple port forward for plex, it's this: -A UBIOS_POSTROUTING_USER_HOOK -d 10.10.1.0/32 -p tcp -m tcp --dport 32400 -m comment --comment 00000000004294967300 -j MASQUERADE
[12:56]wavefront ʲᶠʳᵒʸ[HOPS]
: 10.10.1.0/32 is a LB service for plex, so anything matching that destination address and port will get masqueraded, no matter its origin.
[12:58]wavefront ʲᶠʳᵒʸ[HOPS]
: But there is a separate DNAT rule in prerouting that only applies to the WAN interface: -A UBIOS_PREROUTING_USER_HOOK -p tcp -m set --match-set UBIOS_KEY_ADDRv4_eth8 dst -m tcp --dport 32400 -m comment --comment 00000000004294967299 -j DNAT --to-destination 10.10.1.0:32400
[13:00]Perihelion: well then, mystery solved
[13:38]wavefront ʲᶠʳᵒʸ[HOPS]
: Yeah so if the port mapping goes to a unifi network address, the MASQUERADE rule has an extra match in its specification: -A UBIOS_POSTROUTING_USER_HOOK -d 192.168.1.200/32 -p tcp -m set --match-set UBIOS_ALL_NETv4_br0 src -m tcp --dport 32400 -m comment --comment 00000000004294967300 -j MASQUERADE
[13:41]wavefront ʲᶠʳᵒʸ[HOPS]
: So it's only going to apply if the packet src is on the same network as the destination, which presumably means that the sender used the WAN IP address and the packet went through the DNAT rule during PREROUTING, and so now its src needs to be changed (hairpin situation). Presumably if the source sent the packet to the LAN address, on the same network, it would not hit the router at all and would not get DNAT'ed. And if that packet with a LAN src and dst got processed by the router, it would not get DNAT'ed (it won't match the match-set param on the DNAT rule) and changing the src would not break the flow -- the router would undo the MASQUERADE for reply packets and send them along. 
[13:42]wavefront ʲᶠʳᵒʸ[HOPS]
: But if there is no network that matches, then that extra match-set spec is not added, and every packet is going to get MASQUERADE'ed.
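The difference between the two MASQUERADE rules quoted above can be modelled in a few lines. This is only a sketch of the matching logic, not of how UniFi OS generates rules: the `MasqueradeRule` class is hypothetical, and the contents of the `UBIOS_ALL_NETv4_br0` set are assumed to be the single subnet 192.168.1.0/24.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MasqueradeRule:
    """Hypothetical model of one POSTROUTING MASQUERADE rule."""
    dst: str                                   # -d <address>/32
    dport: int                                 # --dport
    src_set: Optional[Sequence[str]] = None    # --match-set ... src, or None if absent

    def matches(self, src: str, dst: str, dport: int) -> bool:
        if dport != self.dport:
            return False
        if ipaddress.ip_address(dst) not in ipaddress.ip_network(self.dst):
            return False
        if self.src_set is None:
            # No source match in the rule: every packet to dst:dport is masqueraded.
            return True
        return any(
            ipaddress.ip_address(src) in ipaddress.ip_network(net)
            for net in self.src_set
        )


# Forward to a BGP-advertised LB address: no UniFi network matches, so no src match.
LB_RULE = MasqueradeRule(dst="10.10.1.0/32", dport=32400)

# Forward to an address on a UniFi network (assumed set contents: 192.168.1.0/24).
LAN_RULE = MasqueradeRule(
    dst="192.168.1.200/32", dport=32400, src_set=("192.168.1.0/24",)
)
```

With these, `LB_RULE.matches("192.168.1.50", "10.10.1.0", 32400)` is true for any source, so the client IP is always rewritten, while `LAN_RULE` only matches sources inside 192.168.1.0/24, which is the hairpin case.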
[13:49]wavefront ʲᶠʳᵒʸ[HOPS]
: This actually explains why I was seeing my router IP in plex!
[13:53]wavefront ʲᶠʳᵒʸ[HOPS]
: This kind of sucks. Maybe I need to manually insert the port forward rules. Bleh
[13:53]wavefront ʲᶠʳᵒʸ[HOPS]
: Adding a fake network sounds like it will cause way more problems than it will solve.
[14:06]wavefront ʲᶠʳᵒʸ[HOPS]
: Sooooo, the latest Network app can configure NAT rules directly: Settings > Routing > NAT > Create Entry. A Destination rule with Interface set to the primary or secondary WAN, Destination set to "Main (public IP)", and Destination Port and Translated IP Address filled in will create the DNAT rule, but it will not create the MASQUERADE rule.
[14:08]wavefront ʲᶠʳᵒʸ[HOPS]
: Use that in place of the port forward entry (and delete the port forward).
[14:09]Rodent: I just have my nginx load balancer on the same vlan as the k8s nodes, and my other loadbalancer services on a second services vlan. Not seen any issues from it
[14:10]Rodent: I'm not sure I'd call it a fake network - it's just a network like any other
[14:11]wavefront ʲᶠʳᵒʸ[HOPS]
: You won't see issues like lost / not routed packets. It's more a performance optimization.
[14:15]wavefront ʲᶠʳᵒʸ[HOPS]
: Well OK, my issue is that I have to assign a VLAN ID. I guess it won't exactly do anything with it. Still feels gross?
[14:17]Rodent: To me it just felt logical
[14:17]wavefront ʲᶠʳᵒʸ[HOPS]
: And it's going to set up a bridge and routes that may conflict with the BGP routes?
[14:17]wavefront ʲᶠʳᵒʸ[HOPS]
: I guess the BGP routes will take priority.
[14:18]Rodent:
root@UDMPRO:~# ip route show
10.1.0.0/24 dev br0 proto kernel scope link src 10.1.0.1
10.1.1.0/24 dev br10 proto kernel scope link src 10.1.1.1
10.1.1.151 via 10.1.1.33 dev br10 proto bgp metric 20
10.1.1.152 via 10.1.1.33 dev br10 proto bgp metric 20
10.1.2.0/24 dev br20 proto kernel scope link src 10.1.2.1
10.1.2.10 via 10.1.1.33 dev br10 proto bgp metric 20
10.1.2.11 via 10.1.1.33 dev br10 proto bgp metric 20
10.1.2.12 via 10.1.1.31 dev br10 proto bgp metric 20
10.1.2.14 via 10.1.1.33 dev br10 proto bgp metric 20
10.1.3.0/24 dev br30 proto kernel scope link src 10.1.3.1
10.1.4.0/24 dev br40 proto kernel scope link src 10.1.4.1
10.1.8.0/24 dev br80 proto kernel scope link src 10.1.8.1
10.1.9.0/24 dev br90 proto kernel scope link src 10.1.9.1
88.88.209.0/24 dev eth8 proto kernel scope link src xx.xx.xx.xx
[14:19]Rodent: Seems fine, I think? 10.1.2.0/24 is the "fake" services vlan
[14:20]Rodent: 10.1.1.151 + 152 are nginx
[14:21]wavefront ʲᶠʳᵒʸ[HOPS]
: yeah so br20 is useless, but the more specific prefix routes (e.g. 10.1.2.10 via 10.1.1.33) will take precedence
[14:21]wavefront ʲᶠʳᵒʸ[HOPS]
: (e.g. ip route get to 10.1.2.11 should pick via 10.1.1.33 dev br10 and not dev br20)
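The precedence claim above is longest-prefix matching. A minimal sketch using a subset of the routes from the `ip route show` output, with a hypothetical `lookup` helper; it only compares prefix lengths and ignores metrics and policy routing, which the real kernel also considers.

```python
import ipaddress

# Subset of the routing table pasted above: (prefix, next hop / device).
ROUTES = [
    ("10.1.1.0/24", "dev br10"),
    ("10.1.2.0/24", "dev br20"),
    ("10.1.2.10/32", "via 10.1.1.33 dev br10"),
    ("10.1.2.11/32", "via 10.1.1.33 dev br10"),
    ("10.1.2.12/32", "via 10.1.1.31 dev br10"),
]


def lookup(dst):
    """Return the route with the longest matching prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in ROUTES
        if addr in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return None
    # The most specific prefix (largest prefix length) wins.
    return max(candidates, key=lambda entry: entry[0].prefixlen)[1]
```

Here `lookup("10.1.2.11")` selects the /32 BGP route via 10.1.1.33 on br10, while an address in 10.1.2.0/24 with no BGP route, such as 10.1.2.99, falls back to `dev br20`.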
[14:22]Rodent: ye
[14:22]wavefront ʲᶠʳᵒʸ[HOPS]
: So it's.... fine. I think I prefer the destination NAT rule approach. 
[14:23]wavefront ʲᶠʳᵒʸ[HOPS]
: It's a more targeted solution to the problem, so in theory fewer side effects.
[14:24]Rodent: Feels like it requires more setup/tinkering
[14:24]Rodent: Mine just works
[14:24]wavefront ʲᶠʳᵒʸ[HOPS]
: No more than adding a port forward entry?
[14:24]Rodent: Also I could have put the ip pool on the 10.1.1.0/24 network if I wanted to
[14:24]Rodent: and not have had the second network
[14:26]wavefront ʲᶠʳᵒʸ[HOPS]
: A port forward entry is literally just a DNAT + MASQUERADE pair of rules in the nat table. But for a private peer network (eg the kubernetes service network), you just need the DNAT rule on flows coming to the WAN interface. And now there's a way to add that directly.
[14:59]wavefront ʲᶠʳᵒʸ[HOPS]
: I suppose if some program on the cluster connects to <WAN interface IP:port>, then you'd be back to needing both the DNAT and MASQUERADE rules... bah
[01:20]Perihelion: How so? What are the downsides?
[10:49]wavefront ʲᶠʳᵒʸ[HOPS]
: I don't know! But unifi's control plane does a bunch of stuff when you add a network, and could do more or change behavior in the future.
[10:49]wavefront ʲᶠʳᵒʸ[HOPS]
: I don't know how it will change or interact when unifios 4.1 arrives with bgp support either.
[10:49]wavefront ʲᶠʳᵒʸ[HOPS]
: but in any case, your method is noted and maybe I'll adopt it!