Skip to content

Vrf fib conf rework#552

Merged
rjarry merged 4 commits intoDPDK:mainfrom
maxime-leroy:vrf_fib_conf_rework
Mar 16, 2026
Merged

Vrf fib conf rework#552
rjarry merged 4 commits intoDPDK:mainfrom
maxime-leroy:vrf_fib_conf_rework

Conversation

@maxime-leroy
Copy link
Copy Markdown
Collaborator

@maxime-leroy maxime-leroy commented Mar 13, 2026

Summary by CodeRabbit

  • New Features

    • Per‑VRF FIB settings exposed: set default max IPv4/IPv6 routes and per‑AF tbl8 values when creating or updating VRFs; VRF add/set CLI includes these options.
    • Interface-level commands can set VRF FIB defaults (interface set … rib4-routes/rib6-routes).
  • Bug Fixes

    • CLI help/validation simplified; default route counts may now be zero.
  • Chores

    • Tests updated to cover VRF create/reconfigure/resize and default inheritance.

@maxime-leroy
Copy link
Copy Markdown
Collaborator Author

still under test (don't merge)

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Per-VRF FIB state was moved into per-interface (iface) structures. New API types were added: struct gr_iface_info_vrf and struct gr_iface_info_vrf_fib; macros GR_VRF_SET_FIB and GR_VLAN_SET_MAC were introduced. A per-address-family FIB ops registry and callbacks (struct vrf_fib_ops, vrf_fib_ops_register) plus accessors (vrf_fib_get, vrf_fib_ptr_get/set) were added. IPv4/IPv6 FIB requests were simplified to default-only variants (GR_IP4_FIB_DEFAULT_SET, GR_IP6_FIB_DEFAULT_SET). Many FIB lifecycle and API functions were converted to use VRF iface context (init/reconfig/fini take struct iface *), and CLI wiring was updated to set FIB params via interface add/set and VRF attribute options.

📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can enable review details to help with troubleshooting, context usage and more.

Enable the reviews.review_details setting to include review details such as the model used, the time taken for each step and more in the review comments.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/infra/cli/route.c`:
- Around line 178-187: The grammar change removed the "route config set vrf ..."
path that smoke tests still call; either restore that grammar entry or update
the tests. To fix in-code, add a second CLI grammar entry that maps the "set vrf
<name> ..." path back to the existing handler (route_config_set) so existing
scripts keep working — mirror the existing "set default ..." declaration but use
the "set vrf" token and vrf name node, and ensure the same with_help argument
structure and route_config_set callback; alternatively, update
smoke/fib4_fullview_manualtest.sh and smoke/fib6_fullview_manualtest.sh to call
the new "interface set vrf main ..." command.

In `@modules/infra/control/vrf.c`:
- Around line 253-260: The cleanup loop at label fib_fail incorrectly uses af as
the upper bound and may call fib_ops[i]->fini on uninitialized entries; change
the logic to iterate only over the FIBs that were successfully initialized by
tracking a count (e.g., initialized_fibs) when you populate fib_ops and then use
for (unsigned i = 0; i < initialized_fibs; i++) { if (fib_ops[i])
fib_ops[i]->fini(iface); } in the fib_fail block (or alternatively use the
actual number-of-fibs variable used elsewhere) so that fib_ops is only finalized
for entries that were set up. Ensure the new counter is updated where fib_ops
entries are created.

In `@modules/ip6/control/route.c`:
- Around line 50-56: The helpers fib6_get_max_routes and fib6_get_num_tbl8
currently dereference vrf_fib_get(vrf, GR_AF_IP6) without a NULL check; update
both to defensively handle a NULL return from vrf_fib_get by checking the
pointer (result of vrf_fib_get(...)) and returning a safe default (e.g., 0) when
NULL, otherwise returning ->max_routes / ->num_tbl8; keep the functions inline
and preserve existing return types and semantics when the FIB exists.

In `@smoke/config_vrf_test.sh`:
- Around line 83-91: The test never verifies that a VRF created after changing
the defaults inherits those defaults: after running "grcli route config set
default rib4-routes 256 rib6-routes 256" and "grcli interface add vrf newvrf"
you must assert that newvrf actually got the default FIB sizes before you later
change it; add an immediate check after creating newvrf that queries "grcli -j
interface show name newvrf" and asserts ".rib4_max_routes == 256" (and similarly
for rib6 if desired) so the creation-time inheritance is validated prior to the
subsequent "grcli interface set vrf newvrf rib4-routes 128" call.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 10ea66c8-dc76-4c93-af08-64a8c3f00dde

📥 Commits

Reviewing files that changed from the base of the PR and between 4b2eb2c and 8bc9e1c.

📒 Files selected for processing (12)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/route.c
  • modules/infra/cli/vrf.c
  • modules/infra/control/gr_vrf.h
  • modules/infra/control/vrf.c
  • modules/ip/api/gr_ip4.h
  • modules/ip/cli/route.c
  • modules/ip/control/route.c
  • modules/ip6/api/gr_ip6.h
  • modules/ip6/cli/route.c
  • modules/ip6/control/route.c
  • smoke/config_vrf_test.sh

Comment thread modules/infra/cli/route.c
Comment thread modules/infra/control/vrf.c
Comment thread modules/ip6/control/route.c
Comment thread smoke/config_vrf_test.sh
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
modules/ip/control/route.c (1)

50-56: Potential NULL dereference in helper functions.

vrf_fib_get() can return NULL if the AF is not found. These inline helpers dereference the result without checking. While the current call flow ensures the FIB slot exists (init runs after the slot is populated in iface_vrf_init), this is fragile.

Defensive fix
 static inline uint32_t fib4_get_max_routes(const struct iface *vrf) {
-	return vrf_fib_get(vrf, GR_AF_IP4)->max_routes;
+	const struct gr_iface_info_vrf_fib *fib = vrf_fib_get(vrf, GR_AF_IP4);
+	return fib ? fib->max_routes : 0;
 }

 static inline uint32_t fib4_get_num_tbl8(const struct iface *vrf) {
-	return vrf_fib_get(vrf, GR_AF_IP4)->num_tbl8;
+	const struct gr_iface_info_vrf_fib *fib = vrf_fib_get(vrf, GR_AF_IP4);
+	return fib ? fib->num_tbl8 : 0;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/ip/control/route.c` around lines 50 - 56, The two inline helpers
fib4_get_max_routes and fib4_get_num_tbl8 currently dereference vrf_fib_get(...)
without NULL checks; update each to first store the result of vrf_fib_get(vrf,
GR_AF_IP4) into a local pointer, check if it's NULL, and if so return a safe
default (e.g., 0) or handle error path (optional log/assert) otherwise return
the requested field; mention vrf_fib_get, fib4_get_max_routes, fib4_get_num_tbl8
and keep iface_vrf_init behavior in mind so the change is defensive only.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@modules/ip/control/route.c`:
- Around line 50-56: The two inline helpers fib4_get_max_routes and
fib4_get_num_tbl8 currently dereference vrf_fib_get(...) without NULL checks;
update each to first store the result of vrf_fib_get(vrf, GR_AF_IP4) into a
local pointer, check if it's NULL, and if so return a safe default (e.g., 0) or
handle error path (optional log/assert) otherwise return the requested field;
mention vrf_fib_get, fib4_get_max_routes, fib4_get_num_tbl8 and keep
iface_vrf_init behavior in mind so the change is defensive only.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b0578b5e-91b2-49f8-99e9-a439cdac03df

📥 Commits

Reviewing files that changed from the base of the PR and between 8bc9e1c and ab74f0a.

📒 Files selected for processing (14)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/route.c
  • modules/infra/cli/vrf.c
  • modules/infra/control/gr_vrf.h
  • modules/infra/control/vrf.c
  • modules/ip/api/gr_ip4.h
  • modules/ip/cli/route.c
  • modules/ip/control/route.c
  • modules/ip6/api/gr_ip6.h
  • modules/ip6/cli/route.c
  • modules/ip6/control/route.c
  • smoke/config_vrf_test.sh
  • smoke/fib4_fullview_manualtest.sh
  • smoke/fib6_fullview_manualtest.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • smoke/config_vrf_test.sh

@maxime-leroy maxime-leroy force-pushed the vrf_fib_conf_rework branch 2 times, most recently from 9cb30c9 to 6258b7d Compare March 13, 2026 13:51
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/infra/control/vrf.c`:
- Around line 274-275: The current check in vrf.c rejects any set_attrs bits
other than GR_IFACE_SET_NAME and GR_VRF_SET_FIB, causing a description-only
update to be treated as EOPNOTSUPP even though iface_reconfig() already saved
GR_IFACE_SET_DESCR; update the mask to allow GR_IFACE_SET_DESCR by including it
in the allowed bits (i.e., change the condition that tests set_attrs & ~(...) to
also permit GR_IFACE_SET_DESCR), so the function (where set_attrs is validated
in vrf.c) no longer rejects description-only VRF reconfig requests.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6128bfcf-64c7-4601-bfec-39cffdadd9bd

📥 Commits

Reviewing files that changed from the base of the PR and between ab74f0a and 6258b7d.

📒 Files selected for processing (14)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/route.c
  • modules/infra/cli/vrf.c
  • modules/infra/control/gr_vrf.h
  • modules/infra/control/vrf.c
  • modules/ip/api/gr_ip4.h
  • modules/ip/cli/route.c
  • modules/ip/control/route.c
  • modules/ip6/api/gr_ip6.h
  • modules/ip6/cli/route.c
  • modules/ip6/control/route.c
  • smoke/config_vrf_test.sh
  • smoke/fib4_fullview_manualtest.sh
  • smoke/fib6_fullview_manualtest.sh
🚧 Files skipped from review as they are similar to previous changes (3)
  • modules/ip/cli/route.c
  • smoke/fib4_fullview_manualtest.sh
  • smoke/fib6_fullview_manualtest.sh

Comment thread modules/infra/control/vrf.c Outdated
Comment thread modules/infra/cli/vrf.c Outdated
Comment thread modules/infra/control/vrf.c Outdated
Comment thread modules/infra/cli/route.c
Comment thread modules/ip/control/route.c
Comment thread modules/infra/api/gr_infra.h Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/ip/control/route.c`:
- Around line 666-672: old_fib returned by vrf_fib_ptr_get can be NULL but is
passed unguarded into rte_fib_get_rib; before creating new_fib and calling
rib4_iter_one, check old_fib for NULL (same pattern as fib4_fini) and skip the
migration/rib4_iter_one call if old_fib is NULL (or handle appropriately),
ensuring fib4_migrate_ctx is only populated and rib4_iter_one invoked when
old_fib != NULL; update the code around vrf_fib_ptr_get, create_fib,
fib4_migrate_ctx, and the rib4_iter_one call to perform this guard.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6ea8fc10-8e16-44f0-a00b-c00f6f725a1b

📥 Commits

Reviewing files that changed from the base of the PR and between 6258b7d and dae1ae1.

📒 Files selected for processing (14)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/route.c
  • modules/infra/cli/vrf.c
  • modules/infra/control/gr_vrf.h
  • modules/infra/control/vrf.c
  • modules/ip/api/gr_ip4.h
  • modules/ip/cli/route.c
  • modules/ip/control/route.c
  • modules/ip6/api/gr_ip6.h
  • modules/ip6/cli/route.c
  • modules/ip6/control/route.c
  • smoke/config_vrf_test.sh
  • smoke/fib4_fullview_manualtest.sh
  • smoke/fib6_fullview_manualtest.sh
🚧 Files skipped from review as they are similar to previous changes (4)
  • smoke/fib6_fullview_manualtest.sh
  • modules/infra/api/gr_infra.h
  • modules/ip/api/gr_ip4.h
  • modules/infra/cli/vrf.c

Comment thread modules/ip/control/route.c Outdated
Comment on lines +666 to +672
old_fib = vrf_fib_ptr_get(vrf, GR_AF_IP4);
new_fib = create_fib(vrf);
if (new_fib == NULL)
return api_out(errno, 0, NULL);
return errno_log(errno, "create_fib");

struct fib4_migrate_ctx ctx = {.new_fib = new_fib};
rib4_iter_one(rte_fib_get_rib(vrf_fibs[req->vrf_id]), req->vrf_id, fib4_migrate_cb, &ctx);
rib4_iter_one(rte_fib_get_rib(old_fib), vrf->id, fib4_migrate_cb, &ctx);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Missing NULL check on old_fib before use.

vrf_fib_ptr_get can return NULL, but old_fib is passed directly to rte_fib_get_rib() at line 672 without a guard. If reconfig is called on a VRF that hasn't been initialized (or was already finalized), this will crash.

The fib4_fini function correctly guards against this at line 754. The same pattern should be applied here.

Proposed fix
 	old_fib = vrf_fib_ptr_get(vrf, GR_AF_IP4);
+	if (old_fib == NULL)
+		return errno_set(ENOENT);
 	new_fib = create_fib(vrf);
 	if (new_fib == NULL)
 		return errno_log(errno, "create_fib");
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
old_fib = vrf_fib_ptr_get(vrf, GR_AF_IP4);
new_fib = create_fib(vrf);
if (new_fib == NULL)
return api_out(errno, 0, NULL);
return errno_log(errno, "create_fib");
struct fib4_migrate_ctx ctx = {.new_fib = new_fib};
rib4_iter_one(rte_fib_get_rib(vrf_fibs[req->vrf_id]), req->vrf_id, fib4_migrate_cb, &ctx);
rib4_iter_one(rte_fib_get_rib(old_fib), vrf->id, fib4_migrate_cb, &ctx);
old_fib = vrf_fib_ptr_get(vrf, GR_AF_IP4);
if (old_fib == NULL)
return errno_set(ENOENT);
new_fib = create_fib(vrf);
if (new_fib == NULL)
return errno_log(errno, "create_fib");
struct fib4_migrate_ctx ctx = {.new_fib = new_fib};
rib4_iter_one(rte_fib_get_rib(old_fib), vrf->id, fib4_migrate_cb, &ctx);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/ip/control/route.c` around lines 666 - 672, old_fib returned by
vrf_fib_ptr_get can be NULL but is passed unguarded into rte_fib_get_rib; before
creating new_fib and calling rib4_iter_one, check old_fib for NULL (same pattern
as fib4_fini) and skip the migration/rib4_iter_one call if old_fib is NULL (or
handle appropriately), ensuring fib4_migrate_ctx is only populated and
rib4_iter_one invoked when old_fib != NULL; update the code around
vrf_fib_ptr_get, create_fib, fib4_migrate_ctx, and the rib4_iter_one call to
perform this guard.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@modules/infra/api/gr_infra.h`:
- Around line 128-141: The single GR_VRF_SET_FIB flag is too coarse causing
gr_iface_info_vrf_fib values {0,0} to be ambiguous; update the API to carry
per-AF presence/validity flags so callers can explicitly clear or set IPv4/IPv6
FIBs: add either per-AF bits (e.g. GR_VRF_SET_FIB_IPV4 and GR_VRF_SET_FIB_IPV6)
or a validity bitfield inside struct gr_iface_info_vrf (or inside
gr_iface_info_vrf_fib) and update the GR_IFACE_SET handling to check those
presence bits when applying or clearing max_routes/num_tbl8 for
gr_iface_info_vrf.ipv4 and .ipv6 so 0 is treated as a real value when the
corresponding presence flag is set and as “no change” when it is not.

In `@modules/infra/control/vrf.c`:
- Around line 298-342: The loop that resizes per-address-family FIBs (using
vrf_fib_api, vrf_fib_conf, fib_ops[af] and calling fib_ops[af]->reconfig(iface))
can partially commit: if a later reconfig() fails you only restore that AF’s
snapshot (old) and return, leaving earlier AFs changed; fix by recording
successful AF indices and their pre-change snapshots (e.g. an array of struct
gr_iface_info_vrf_fib old_configs indexed by af or a bitmap/list of succeeded
afs) before calling reconfig(), and if any reconfig() fails iterate the list of
previously succeeded AFs in reverse to restore their vrf_fib_conf entries and
call their fib_ops[af]->reconfig(iface) to roll them back (preserve/propagate
errno semantics and restore LOG messages accordingly) so the function leaves the
VRF in its original state on error.

In `@modules/ip/control/route.c`:
- Around line 699-708: The gr_fib4_info payload is sending uninitialized tbl8
counters when vrf_fib_ptr_get(vrf, GR_AF_IP4) returns NULL; ensure
info.used_tbl8 and info.num_tbl8 are initialized (e.g., set to 0) before calling
rte_fib_tbl8_get_stats or before sending via gr_fib4_info in route.c (symbols:
info, vrf_fib_ptr_get, rte_fib_tbl8_get_stats, gr_fib4_info,
HAVE_RTE_FIB_TBL8_GET_STATS); specifically, initialize those fields to 0 prior
to the vrf_fib_ptr_get() result check and only overwrite them if a valid struct
rte_fib* is returned.
- Around line 750-759: fib4_fini (and likewise fib6_fini) removes the published
FIB pointer with vrf_fib_ptr_set but immediately calls rte_fib_free, which can
race with in-flight lookups; add the same QSBR synchronization used in
fib4_reconfig: after calling vrf_fib_ptr_set(vrf, GR_AF_IP4, NULL) invoke the
QSBR grace-wait function (the same QSBR wait/advance used in fib4_reconfig) to
ensure all readers have exited before calling rte_fib_free(fib) and performing
route4_vrf_rm, ensuring the wait occurs between vrf_fib_ptr_set and rte_fib_free
(repeat the analogous change in fib6_fini).
- Around line 666-676: The migration currently iterates old_fib with
rib4_iter_one calling fib4_migrate_cb which may call nexthop_decref() while
old_fib is still live for readers; to avoid a use-after-free, change the flow so
migration of routes happens first but any cleanup that can free nexthops (the
failed-route cleanup currently in fib4_migrate_cb that invokes nexthop_decref())
is deferred until after vrf_fib_ptr_set(vrf, GR_AF_IP4, new_fib) and a full
rte_rcu_qsbr_synchronize(gr_datapath_rcu(), RTE_QSBR_THRID_INVALID); in
practice, modify fib4_migrate_cb (or the iterator callback) to record
failed-route nexthops into a temporary list/vec instead of calling
nexthop_decref() immediately, then after the pointer swap and rcu synchronize
iterate that collected list and call nexthop_decref() and finally call
rte_fib_free(old_fib).

In `@modules/ip6/control/route.c`:
- Around line 740-749: The tbl8 counters info.used_tbl8 and info.num_tbl8 can
remain uninitialized when vrf_fib_ptr_get(vrf) returns NULL; modify the block
around vrf_fib_ptr_get/vrf_fib_ptr_get(vrf, GR_AF_IP6) and
rte_fib6_tbl8_get_stats so that when vrf_fib_ptr_get returns NULL you explicitly
set info.used_tbl8 = 0 and info.num_tbl8 = 0 (or initialize them to 0 before the
HAVE_RTE_FIB_TBL8_GET_STATS ifdef) so gr_fib6_info always sends defined values
instead of stack garbage.
- Around line 791-800: After unpublishing the IPv6 FIB with vrf_fib_ptr_set in
fib6_fini, wait for the RCU grace period (use the same RCU synchronization call
used in fib6_reconfig) before calling rte_fib6_free and releasing nexthop
references; apply the same change to fib4_fini so both functions perform an RCU
synchronize after vrf_fib_ptr_set and prior to freeing the FIB.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6706fb2a-4cc7-45a9-b59f-44765b8897d7

📥 Commits

Reviewing files that changed from the base of the PR and between dae1ae1 and 75fd061.

📒 Files selected for processing (14)
  • modules/infra/api/gr_infra.h
  • modules/infra/cli/route.c
  • modules/infra/cli/vrf.c
  • modules/infra/control/gr_vrf.h
  • modules/infra/control/vrf.c
  • modules/ip/api/gr_ip4.h
  • modules/ip/cli/route.c
  • modules/ip/control/route.c
  • modules/ip6/api/gr_ip6.h
  • modules/ip6/cli/route.c
  • modules/ip6/control/route.c
  • smoke/config_vrf_test.sh
  • smoke/fib4_fullview_manualtest.sh
  • smoke/fib6_fullview_manualtest.sh
🚧 Files skipped from review as they are similar to previous changes (4)
  • smoke/fib6_fullview_manualtest.sh
  • modules/ip6/api/gr_ip6.h
  • modules/infra/control/gr_vrf.h
  • modules/infra/cli/route.c

Comment on lines +128 to +141
// VRF reconfiguration attribute flags.
#define GR_VRF_SET_FIB GR_BIT64(32)

// Per-AF FIB configuration.
struct gr_iface_info_vrf_fib {
uint32_t max_routes; // 0 = default
uint32_t num_tbl8; // 0 = auto
};

// Info structure for GR_IFACE_TYPE_VRF interfaces.
struct gr_iface_info_vrf {
struct gr_iface_info_vrf_fib ipv4;
struct gr_iface_info_vrf_fib ipv6;
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

GR_VRF_SET_FIB is too coarse to support resetting an AF back to defaults.

struct gr_iface_info_vrf_fib makes 0 a real value (default / auto), but the request only carries one GR_VRF_SET_FIB bit for the whole struct gr_iface_info_vrf. On GR_IFACE_SET, ipv4 = {0, 0} is indistinguishable from “leave IPv4 unchanged”, so once a VRF has an explicit override there is no way to clear it back to inherited defaults/auto through this public API. You need per-AF presence bits or explicit validity flags.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/infra/api/gr_infra.h` around lines 128 - 141, The single
GR_VRF_SET_FIB flag is too coarse causing gr_iface_info_vrf_fib values {0,0} to
be ambiguous; update the API to carry per-AF presence/validity flags so callers
can explicitly clear or set IPv4/IPv6 FIBs: add either per-AF bits (e.g.
GR_VRF_SET_FIB_IPV4 and GR_VRF_SET_FIB_IPV6) or a validity bitfield inside
struct gr_iface_info_vrf (or inside gr_iface_info_vrf_fib) and update the
GR_IFACE_SET handling to check those presence bits when applying or clearing
max_routes/num_tbl8 for gr_iface_info_vrf.ipv4 and .ipv6 so 0 is treated as a
real value when the corresponding presence flag is set and as “no change” when
it is not.

Comment on lines +298 to +342
if (set_attrs & GR_VRF_SET_FIB) {
for (unsigned af = 0; af < ARRAY_DIM(fib_ops); af++) {
const struct gr_iface_info_vrf_fib *api;
struct gr_iface_info_vrf_fib *fib_conf;
struct gr_iface_info_vrf_fib old;

api = vrf_fib_api(info, af);
if (api == NULL)
continue;
if (api->max_routes == 0 && api->num_tbl8 == 0)
continue;

if (fib_ops[af] == NULL)
return errno_set(ENOENT);

fib_conf = vrf_fib_conf(vrf, af);
old = *fib_conf;

if (api->max_routes) {
fib_conf->max_routes = api->max_routes;
if (!api->num_tbl8)
fib_conf->num_tbl8 = 0;
}
if (api->num_tbl8)
fib_conf->num_tbl8 = api->num_tbl8;

if (fib_conf->max_routes == old.max_routes
&& fib_conf->num_tbl8 == old.num_tbl8)
continue;

if (fib_ops[af]->reconfig(iface) < 0) {
*fib_conf = old;
return -errno;
}

LOG(INFO,
"resized %s FIB VRF %s(%u) max_routes %u -> %u num_tbl8 %u -> %u",
gr_af_name(af),
iface->name,
iface->id,
old.max_routes,
fib_conf->max_routes,
old.num_tbl8,
fib_conf->num_tbl8);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Rollback earlier AF changes when a later reconfig() fails.

This loop commits IPv4 and IPv6 one by one. If one FIB resize succeeds and a later one fails, the function returns an error after restoring only the failing AF’s config snapshot; the earlier AF stays resized, so a single multi-AF interface set vrf ... request can partially commit and drop routes while reporting failure. Track the AFs that already succeeded and roll them back before returning.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/infra/control/vrf.c` around lines 298 - 342, The loop that resizes
per-address-family FIBs (using vrf_fib_api, vrf_fib_conf, fib_ops[af] and
calling fib_ops[af]->reconfig(iface)) can partially commit: if a later
reconfig() fails you only restore that AF’s snapshot (old) and return, leaving
earlier AFs changed; fix by recording successful AF indices and their pre-change
snapshots (e.g. an array of struct gr_iface_info_vrf_fib old_configs indexed by
af or a bitmap/list of succeeded afs) before calling reconfig(), and if any
reconfig() fails iterate the list of previously succeeded AFs in reverse to
restore their vrf_fib_conf entries and call their fib_ops[af]->reconfig(iface)
to roll them back (preserve/propagate errno semantics and restore LOG messages
accordingly) so the function leaves the VRF in its original state on error.

Comment thread modules/ip/control/route.c Outdated
Comment thread modules/ip/control/route.c
Comment on lines +750 to +759
static void fib4_fini(struct iface *vrf) {
struct rte_fib *fib = vrf_fib_ptr_get(vrf, GR_AF_IP4);

if (iface->type != GR_IFACE_TYPE_VRF)
return;

memset(&fib4_conf[iface->id], 0, sizeof(fib4_conf[0]));
fib = vrf_fibs[iface->id];
vrf_fibs[iface->id] = NULL;
vrf_fib_ptr_set(vrf, GR_AF_IP4, NULL);
if (fib != NULL) {
LOG(INFO, "destroying IPv4 FIB for VRF %s(%u)", iface->name, iface->id);
rib4_iter_one(rte_fib_get_rib(fib), iface->id, route4_vrf_rm, (void *)iface);
LOG(INFO, "destroying IPv4 FIB for VRF %s(%u)", vrf->name, vrf->id);
rib4_iter_one(rte_fib_get_rib(fib), vrf->id, route4_vrf_rm, (void *)vrf);
rte_fib_free(fib);
}
memset(route_counts[iface->id], 0, sizeof(route_counts[iface->id]));
memset(route_counts[vrf->id], 0, sizeof(route_counts[vrf->id]));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's find and inspect the fib4_fini and fib4_reconfig functions
rg -A 20 "^static void fib4_fini" modules/ip/control/route.c

Repository: DPDK/grout

Length of output: 815


🏁 Script executed:

# Now let's look at fib4_reconfig to see how it uses QSBR
rg -A 30 "^static.*fib4_reconfig" modules/ip/control/route.c

Repository: DPDK/grout

Length of output: 1027


🏁 Script executed:

# Check for QSBR usage patterns in the file
rg "rte_rcu_qsbr_synchronize|gr_datapath_rcu" modules/ip/control/route.c -B 3 -A 3

Repository: DPDK/grout

Length of output: 534


🏁 Script executed:

# Check the fib6_fini for comparison
rg -A 20 "^static void fib6_fini" modules/ip/control/route.c

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Let's also check route4_vrf_rm to understand what it does
rg -B 5 -A 20 "^static.*route4_vrf_rm" modules/ip/control/route.c

Repository: DPDK/grout

Length of output: 605


🏁 Script executed:

# Check if fib6_fini exists
rg -A 20 "fib6_fini" modules/ip/control/route.c

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Let's check the broader context of when fib4_fini is called
rg -B 5 -A 5 "fib4_fini" modules/ip/control/route.c

Repository: DPDK/grout

Length of output: 569


🏁 Script executed:

# Verify the datapath RCU mechanism is used elsewhere for similar teardown
rg "vrf_fini|interface.*fini" modules/ip/control/ -A 10

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Search for fib6_fini more broadly
rg "fib6_fini|fib6_ops" modules/ip/control/route.c -A 15

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Check the vrf_fib_ops structure definition to understand the interface
rg -B 5 -A 10 "struct vrf_fib_ops" modules/

Repository: DPDK/grout

Length of output: 4213


🏁 Script executed:

# Look for where vrf_fini or fib4_fini is called from
rg -B 3 -A 3 "\.fini\(" modules/ -A 5

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Let's find where the fib_ops.fini is actually called
rg "fib_ops.*fini|ops->fini" modules/infra/control/ -B 5 -A 5

Repository: DPDK/grout

Length of output: 1807


🏁 Script executed:

# Check for vrf_fini or interface deletion patterns
rg "vrf_fini|iface.*destroy|interface.*delete" modules/infra/control/ -B 3 -A 10

Repository: DPDK/grout

Length of output: 9987


🏁 Script executed:

# Look at fib6_fini for comparison
rg -A 20 "^static void fib6_fini" modules/ip6/control/route.c

Repository: DPDK/grout

Length of output: 818


Add QSBR synchronization before tearing down the IPv4 FIB.

The FIB pointer is unpublished at line 753, but rte_fib_free() at line 757 can race with in-flight datapath lookups. Unlike fib4_reconfig() which waits for QSBR after publishing the new FIB and before freeing the old one, fib4_fini() has no synchronization barrier. In-flight lookups that acquired a reference to the old FIB or its nexthops before the pointer was NULLed can hit freed memory. The same issue exists in fib6_fini().

Suggested fix
 static void fib4_fini(struct iface *vrf) {
 	struct rte_fib *fib = vrf_fib_ptr_get(vrf, GR_AF_IP4);

 	vrf_fib_ptr_set(vrf, GR_AF_IP4, NULL);
 	if (fib != NULL) {
 		LOG(INFO, "destroying IPv4 FIB for VRF %s(%u)", vrf->name, vrf->id);
+		rte_rcu_qsbr_synchronize(gr_datapath_rcu(), RTE_QSBR_THRID_INVALID);
 		rib4_iter_one(rte_fib_get_rib(fib), vrf->id, route4_vrf_rm, (void *)vrf);
 		rte_fib_free(fib);
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/ip/control/route.c` around lines 750 - 759, fib4_fini (and likewise
fib6_fini) removes the published FIB pointer with vrf_fib_ptr_set but
immediately calls rte_fib_free, which can race with in-flight lookups; add the
same QSBR synchronization used in fib4_reconfig: after calling
vrf_fib_ptr_set(vrf, GR_AF_IP4, NULL) invoke the QSBR grace-wait function (the
same QSBR wait/advance used in fib4_reconfig) to ensure all readers have exited
before calling rte_fib_free(fib) and performing route4_vrf_rm, ensuring the wait
occurs between vrf_fib_ptr_set and rte_fib_free (repeat the analogous change in
fib6_fini).

Comment on lines +740 to +749
const struct iface *vrf = get_vrf_iface(v);
if (vrf == NULL)
continue;
info.vrf_id = v;
info.max_routes = fib6_get_max_routes(v);
info.max_routes = fib6_get_max_routes(vrf);
info.used_routes = fib6_total_routes(v);
#ifdef HAVE_RTE_FIB_TBL8_GET_STATS
rte_fib6_tbl8_get_stats(vrf_fibs[v], &info.used_tbl8, &info.num_tbl8);
struct rte_fib6 *fib = vrf_fib_ptr_get(vrf, GR_AF_IP6);
if (fib != NULL)
rte_fib6_tbl8_get_stats(fib, &info.used_tbl8, &info.num_tbl8);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Initialize tbl8 stats before sending gr_fib6_info.

When HAVE_RTE_FIB_TBL8_GET_STATS is enabled and vrf_fib_ptr_get() returns NULL, info.num_tbl8 and info.used_tbl8 are left uninitialized and still sent at Line 754. That turns a missing FIB into bogus stats and leaks stack garbage through the API.

Possible fix
 		info.vrf_id = v;
 		info.max_routes = fib6_get_max_routes(vrf);
 		info.used_routes = fib6_total_routes(v);
+		info.num_tbl8 = fib6_get_num_tbl8(vrf);
+		info.used_tbl8 = 0;
 `#ifdef` HAVE_RTE_FIB_TBL8_GET_STATS
 		struct rte_fib6 *fib = vrf_fib_ptr_get(vrf, GR_AF_IP6);
 		if (fib != NULL)
 			rte_fib6_tbl8_get_stats(fib, &info.used_tbl8, &info.num_tbl8);

Also applies to: 751-754

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/ip6/control/route.c` around lines 740 - 749, The tbl8 counters
info.used_tbl8 and info.num_tbl8 can remain uninitialized when
vrf_fib_ptr_get(vrf) returns NULL; modify the block around
vrf_fib_ptr_get/vrf_fib_ptr_get(vrf, GR_AF_IP6) and rte_fib6_tbl8_get_stats so
that when vrf_fib_ptr_get returns NULL you explicitly set info.used_tbl8 = 0 and
info.num_tbl8 = 0 (or initialize them to 0 before the
HAVE_RTE_FIB_TBL8_GET_STATS ifdef) so gr_fib6_info always sends defined values
instead of stack garbage.

Comment on lines +791 to +800
static void fib6_fini(struct iface *vrf) {
struct rte_fib6 *fib = vrf_fib_ptr_get(vrf, GR_AF_IP6);

if (iface->type != GR_IFACE_TYPE_VRF)
return;

memset(&fib6_conf[iface->id], 0, sizeof(fib6_conf[0]));
fib = vrf_fibs[iface->id];
vrf_fibs[iface->id] = NULL;
vrf_fib_ptr_set(vrf, GR_AF_IP6, NULL);
if (fib != NULL) {
LOG(INFO, "destroying IPv6 FIB for VRF %s(%u)", iface->name, iface->id);
rib6_iter_one(rte_fib6_get_rib(fib), iface->id, route6_vrf_rm, (void *)iface);
LOG(INFO, "destroying IPv6 FIB for VRF %s(%u)", vrf->name, vrf->id);
rib6_iter_one(rte_fib6_get_rib(fib), vrf->id, route6_vrf_rm, (void *)vrf);
rte_fib6_free(fib);
}
memset(route_counts[iface->id], 0, sizeof(route_counts[iface->id]));
memset(route_counts[vrf->id], 0, sizeof(route_counts[vrf->id]));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the fib6_fini and fib6_reconfig functions
find . -name "route.c" -path "*/ip6/control/*" -type f

Repository: DPDK/grout

Length of output: 83


🏁 Script executed:

# Read the route.c file to find these functions and understand the context
wc -l modules/ip6/control/route.c

Repository: DPDK/grout

Length of output: 85


🏁 Script executed:

# Get the fib6_fini, fib6_reconfig functions and nearby context
sed -n '780,850p' modules/ip6/control/route.c

Repository: DPDK/grout

Length of output: 1702


🏁 Script executed:

# Search for fib6_reconfig to see the full implementation
rg -n "fib6_reconfig|fib6_fini" modules/ip6/control/route.c -A 20

Repository: DPDK/grout

Length of output: 2410


🏁 Script executed:

# Also check the IPv4 counterpart to understand the pattern
rg -n "fib_fini|fib_reconfig" modules/ip/control/route.c -A 20 | head -80

Repository: DPDK/grout

Length of output: 36


🏁 Script executed:

# Search for rte_rcu_qsbr_synchronize usage in the codebase
rg -n "rte_rcu_qsbr_synchronize" --type c --type h

Repository: DPDK/grout

Length of output: 1716


🏁 Script executed:

# Check the IPv4 fib_fini implementation
sed -n '680,750p' modules/ip/control/route.c

Repository: DPDK/grout

Length of output: 1752


🏁 Script executed:

# Check where fib_fini is called from - likely in VRF teardown
rg -n "\.fini\s*=" modules/ip*/control/route.c -B 5 -A 5

Repository: DPDK/grout

Length of output: 1233


🏁 Script executed:

# Search for callers of fib_fini through vrf_fib_ops
rg -n "vrf_fib_ops|fib_ops|\.fini" modules/infra/control/iface.c -B 2 -A 2 | head -100

Repository: DPDK/grout

Length of output: 159


🏁 Script executed:

# Find where vrf_fib_ops functions are invoked
rg -n "vrf_fib_ops|fib_ops" modules/infra/control/*.c -B 3 -A 3

Repository: DPDK/grout

Length of output: 4095


🏁 Script executed:

# Look for the actual fib4_fini implementation to compare with fib6_fini
rg -n "^static void fib4_fini" modules/ip/control/route.c -A 10

Repository: DPDK/grout

Length of output: 496


🏁 Script executed:

# Search for where .fini callbacks are called
rg -n "->fini\(" modules --type c --type h -B 2 -A 2

Repository: DPDK/grout

Length of output: 458


Add RCU grace period before freeing the IPv6 FIB.

After unpublishing the table at line 794, fib6_fini() immediately frees the FIB and releases nexthop references. Without waiting for in-flight lookups to complete, freed memory can still be accessed. This mirrors the RCU synchronization already present in fib6_reconfig() (line 716).

Suggested fix
 static void fib6_fini(struct iface *vrf) {
 	struct rte_fib6 *fib = vrf_fib_ptr_get(vrf, GR_AF_IP6);

 	vrf_fib_ptr_set(vrf, GR_AF_IP6, NULL);
 	if (fib != NULL) {
 		LOG(INFO, "destroying IPv6 FIB for VRF %s(%u)", vrf->name, vrf->id);
+		rte_rcu_qsbr_synchronize(gr_datapath_rcu(), RTE_QSBR_THRID_INVALID);
 		rib6_iter_one(rte_fib6_get_rib(fib), vrf->id, route6_vrf_rm, (void *)vrf);
 		rte_fib6_free(fib);
 	}

Note: fib4_fini() has the same issue.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@modules/ip6/control/route.c` around lines 791 - 800, After unpublishing the
IPv6 FIB with vrf_fib_ptr_set in fib6_fini, wait for the RCU grace period (use
the same RCU synchronization call used in fib6_reconfig) before calling
rte_fib6_free and releasing nexthop references; apply the same change to
fib4_fini so both functions perform an RCU synchronize after vrf_fib_ptr_set and
prior to freeing the FIB.

Comment thread modules/infra/control/vrf.c Outdated
return 0;

fib_fail:
for (unsigned i = 0; i < af; i++)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do the same as line 235:

	for (af = 0; af < ARRAY_DIM(fib_ops); af++) {
		if (fib_ops[af] == NULL)
			continue;

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looping over all registered AFs (i < ARRAY_DIM(fib_ops)) would call fini() on AFs that were never init()'d -- both the one that failed and any registered AF that comes after it in the array. It works by accident because fib4_fini/fib6_fini check for a NULL fib pointer, but we'd be relying on every future fini() implementation being equally defensive.

i < af makes the contract explicit: only tear down what was successfully set up. The failing init() cleans up after itself.

Comment thread modules/infra/control/vrf.c Outdated
Comment thread modules/ip/control/route.c
Comment thread modules/ip6/control/route.c Outdated
Comment thread modules/ip6/control/route.c Outdated
fib6_conf_set used rib6_iter() to migrate routes during FIB resize,
which iterates all VRFs to find the matching one. Use rib6_iter_one()
directly with the old FIB's RIB, like the IPv4 counterpart already
does in fib4_conf_set.

Fixes: ac81860 ("ip6: add configurable FIB size per VRF")
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Robin Jarry <rjarry@redhat.com>
Instead of configuring FIB capacity through a separate per-VRF API
request (GR_IP4_FIB_CONF_SET / GR_IP6_FIB_CONF_SET), store FIB
parameters (max_routes, num_tbl8) directly in the VRF interface info.

This allows setting FIB capacity at VRF creation time and reconfiguring
it through the standard interface set API. A new vrf_fib_ops callback
mechanism lets IP modules hook into VRF init/reconfig/fini, replacing
the previous IFACE_REMOVE event subscription and lazy FIB creation.

The per-VRF rte_fib/rte_fib6 pointers are now stored in the private
VRF iface struct (fib_ptr[]) instead of separate global arrays, trading
one extra indirection on the lookup path for removing two
GR_MAX_IFACES-sized allocations. A fixed enum (GR_VRF_FIB_IP4,
GR_VRF_FIB_IP6) indexes both the FIB config and opaque pointer arrays.

The "route config set default" command now only sets the default
max_routes for new VRFs. The tbl8 default is always auto-derived.
Per-VRF tbl8 override is still available through "iface add/set vrf".

Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Robin Jarry <rjarry@redhat.com>
When requesting FIB info for a specific VRF, the default entry
(with VRF ID 0 and global default values) was unconditionally
sent before the per-VRF entry. This caused `route config show
vrf <name>` to always display a spurious default line.

Guard the default entry emission with a check on vrf_id so it
is only included when listing all VRFs.

Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Robin Jarry <rjarry@redhat.com>
The VRF reconfig handler rejects any set_attrs bits it does not
recognize. When description support was added, GR_IFACE_SET_DESCR was
not included in the VRF allowed mask, causing description-only updates
to return EOPNOTSUPP even though iface_reconfig() already saved the new
description before calling the type-specific handler.

Fixes: fd1097d ("infra: add interface description field")
Signed-off-by: Maxime Leroy <maxime@leroys.fr>
Reviewed-by: Robin Jarry <rjarry@redhat.com>
@grout-bot grout-bot force-pushed the vrf_fib_conf_rework branch from 7f332a3 to a26e19e Compare March 16, 2026 14:01
@rjarry rjarry merged commit 3e075e6 into DPDK:main Mar 16, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants