zebra: clear NEXTHOP_FLAG_LINKDOWN when interface comes up #20397
Open
mike-dubrovsky wants to merge 1 commit into FRRouting:master
When the kernel installs a route while the nexthop interface is down, it sets RTNH_F_LINKDOWN on the route. Zebra copies this flag to NEXTHOP_FLAG_LINKDOWN. However, when the interface comes back up, the kernel does not send a route update to clear this flag. This causes kernel routes to remain marked as "linkdown" in zebra even after the nexthop interface is operational.

Fix this by having zebra track the interface operational state (IFF_LOWER_UP) and update NEXTHOP_FLAG_LINKDOWN accordingly in nexthop_active_check() for kernel and system routes. Also add NEXTHOP_FLAG_LINKDOWN to the NHE hash comparison so that changes to this flag result in a new NHE being created, and track linkdown changes to trigger ROUTE_ENTRY_CHANGED for proper NHE updates.

Signed-off-by: Mike Dubrovsky <mdubrovs@cisco.com>
mjstapp (Contributor) reviewed on Jan 8, 2026 and left a comment:
It sounds as if this should be handled in interface-change processing, not by adding some Linux-specific code to nexthop_active_check()?
Member
I agree with Mark: this is state that can come from any dplane. In any event, I would like to see a topotest that shows that this behavior is now working.
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Problem
When the kernel installs a route while the nexthop interface is down, it sends RTM_NEWROUTE with RTNH_F_LINKDOWN. Zebra copies this flag to NEXTHOP_FLAG_LINKDOWN. However, when the interface comes back up, the kernel does not send a netlink route update to clear this flag; it sends RTM_NEWLINK instead.
This causes kernel routes to remain marked as "linkdown" in zebra even after the nexthop interface is operational.
Root Cause
The flag NEXTHOP_FLAG_LINKDOWN was added via c704cb4. It is passed from the kernel but is never cleared in FRR.
Fix
Have zebra track the interface operational state (IFF_LOWER_UP) and update NEXTHOP_FLAG_LINKDOWN accordingly in nexthop_active_check() for kernel and system routes. Also add NEXTHOP_FLAG_LINKDOWN to the NHE hash comparison so that changes to this flag result in a new NHE being created, and track linkdown changes to trigger ROUTE_ENTRY_CHANGED for proper NHE updates.
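The approach described above can be sketched in a few lines of C. This is a minimal, self-contained illustration and not the code from this PR: every `sketch_` name, the flag values, and the struct layouts are hypothetical stand-ins for the zebra types, and the interface's `lower_up` field is assumed to mirror IFF_LOWER_UP as last reported by the dataplane.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value; the real NEXTHOP_FLAG_LINKDOWN differs. */
#define SKETCH_NH_FLAG_LINKDOWN (1u << 1)

/* Hypothetical stand-in for the interface state zebra tracks. */
struct sketch_iface {
	bool lower_up; /* assumed to mirror IFF_LOWER_UP */
};

/* Hypothetical stand-in for a nexthop that points at an interface. */
struct sketch_nexthop {
	uint32_t flags;
	const struct sketch_iface *ifp;
};

/*
 * Make the linkdown bit follow the interface state.
 * Returns true when the bit changed, which is the point where the
 * caller would mark the route entry as changed.
 */
static bool sketch_nexthop_sync_linkdown(struct sketch_nexthop *nh)
{
	assert(nh != NULL && nh->ifp != NULL);

	uint32_t old = nh->flags;

	if (nh->ifp->lower_up)
		nh->flags &= ~SKETCH_NH_FLAG_LINKDOWN;
	else
		nh->flags |= SKETCH_NH_FLAG_LINKDOWN;

	return nh->flags != old;
}

/*
 * Equality used for hashing: two nexthops that differ only in the
 * linkdown bit compare as different, so a flag change maps to a
 * different hash entry instead of silently reusing the old one.
 */
static bool sketch_nexthop_same(const struct sketch_nexthop *a,
				const struct sketch_nexthop *b)
{
	return a->ifp == b->ifp &&
	       (a->flags & SKETCH_NH_FLAG_LINKDOWN) ==
		       (b->flags & SKETCH_NH_FLAG_LINKDOWN);
}
```

In this sketch the sync function is the only place that touches the bit, so it does not matter whether it is called from nexthop evaluation or from interface-change processing; the return value carries the "changed" signal either way.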
Big Picture
The kernel nexthop flag RTNH_F_LINKDOWN is set when the link goes down; the nexthop is then skipped during FIB lookup (if the sysctl flag ignore_routes_with_linkdown is set).
This is useful for failover cases like the following, where the second default route takes over while the first is linkdown:
default via a.a.b.1 dev enp0s10 metric 20 onlink linkdown
default via x.x.x.49 dev wwx001e101f0000 metric 30
From FRR's point of view, this flag can be used to program hardware via dplane to stay in sync with kernel behavior.
Currently, however, it is purely cosmetic: it only affects show command output.
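The kernel behavior described in this section can be modeled roughly as follows. This is a simplified sketch, not kernel code: it only chooses among candidate routes for the same prefix by metric, the `sketch_` names are hypothetical, and the `ignore_linkdown` parameter stands in for the ignore_routes_with_linkdown sysctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one candidate route for a prefix. */
struct sketch_route {
	unsigned int metric; /* lower is preferred */
	bool linkdown;       /* models RTNH_F_LINKDOWN on the nexthop */
};

/*
 * Pick the lowest-metric candidate. When ignore_linkdown is true,
 * candidates whose nexthop is linkdown are skipped entirely.
 * Returns NULL when no usable candidate remains.
 */
static const struct sketch_route *
sketch_select(const struct sketch_route *routes, size_t n, bool ignore_linkdown)
{
	assert(routes != NULL || n == 0);

	const struct sketch_route *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (ignore_linkdown && routes[i].linkdown)
			continue;
		if (best == NULL || routes[i].metric < best->metric)
			best = &routes[i];
	}
	return best;
}
```

With the two default routes shown above (metric 20 linkdown, metric 30 up), this model selects the metric 30 route when `ignore_linkdown` is true and the metric 20 route otherwise.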
Repro
This issue is easy to reproduce:
Create a p2p link in a down state (veth1 stays down, so veth0 has no carrier):
ip link add veth0 type veth peer name veth1
ip link set veth0 up
Add a route while the link is down:
ip addr add 192.168.122.94/24 dev veth0
ip route add 192.168.100.0/24 via 192.168.122.1 dev veth0
Bring the link up:
ip link set veth1 up
Observe the stale linkdown flag:
K>* 192.168.100.0/24 [0/0] via 192.168.122.1, veth0 linkdown, weight 1, 00:00:54