Skip to content

Warm-reload for port and VLAN ACL configuration changes#4807

Open
courtland wants to merge 1 commit into
faucetsdn:mainfrom
courtland:warm-reload-acl
Open

Warm-reload for port and VLAN ACL configuration changes#4807
courtland wants to merge 1 commit into
faucetsdn:mainfrom
courtland:warm-reload-acl

Conversation

@courtland
Copy link
Copy Markdown

@courtland courtland commented May 19, 2026

Warm-reload for changes to a port or VLAN's acls_in list.

This is a fresh PR following the closure of #4799, which combined this
work with VIP, router, and VLAN-base warm-restart. Per @gizmoguy's
feedback on scope and complexity, the new PR narrows to ACL only.

Mechanism

Adopts the per-ACL design @gizmoguy proposed in his
2026-05-01 review of #4733;
supersedes #4733 with @hieunt79 co-authored. For each port/VLAN whose
only change is acls_in, iterate the old and new ACL lists and call
del/add per-ACL with a cold-start priority decrement.
remove_overlap_ofmsgs cancels unchanged del+add pairs at flow-emit.

add_port_acl / del_port_acl gain @gizmoguy's suggested priority
kwarg (defaults to auth_priority, so dot1x callers unchanged).
add_vlan_acl / del_vlan_acl are the new VLAN-side methods
@gizmoguy noted were missing for per-ACL VLAN warm-reload.

What was required to make the per-ACL approach work

  • force_port_vlan rules were landing in the wrong table. The
    per-ACL install path hardcoded the same goto-instruction for both
    allow and force_port_vlan rules. Fine for dot1x (never uses
    force_port_vlan: 1) but wrong for arbitrary user ACLs. Fixed by
    deriving both instructions from the pipeline the way cold-start
    install does.

  • Stale flows when a port loses its last ACL. Empty↔non-empty
    transitions need the wildcard rule flipped, which the per-ACL loop
    can't do, so they fall back to cold_start_port. That path had a
    latent bug @hieunt79 found in VLAN ACL change detection to be able to warm reload #4733: its flowdel and the wildcard
    flowmod shared a flowmodkey, so remove_overlap_ofmsgs cancelled
    the delete and old ACL flows stayed on the switch. Fixed by using
    a priority-less flowdel.

  • Old acls_in lists weren't available where the loop runs.
    dp_init swaps self.dp partway through _apply_config_changes,
    so the per-ACL loop snapshots the old lists into local dicts before
    that point.

  • Priority decrement matches cold-start. The loop's
    priority -= len(acl.rules) mirrors cold-start install's priority
    assignment so unchanged ACLs' del+add pairs have matching
    flowmodkeys and remove_overlap_ofmsgs can cancel them. That's how
    the optimization saves wire traffic.

  • Separate changed_acl_vlans bucket. VLAN ACL changes used to
    fold into changed_vlans (forcing cold restart).
    _get_vlan_config_changes now signals them separately so they
    reach the warm-reload path.

Tests

@hieunt79's two integration tests are included verbatim.
FaucetConfigReloadPortAclRemoveAllTest and a unit-level
ValveRemoveAllPortACLsTestCase cover the wildcard-cancellation case
fixed in the second bullet above. Adding or removing the
first/last ACL on the DP still cold-starts because the port_acl
table itself is conditionally allocated.

Closes #4733.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

❌ Patch coverage is 96.07843% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.43%. Comparing base (9bc5528) to head (b202eee).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
faucet/valve_acl.py 88.24% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4807      +/-   ##
==========================================
+ Coverage   91.42%   91.43%   +0.02%     
==========================================
  Files          46       46              
  Lines        8923     8962      +39     
==========================================
+ Hits         8157     8194      +37     
- Misses        766      768       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@courtland courtland marked this pull request as ready for review May 19, 2026 17:20
Adopts the per-ACL design @gizmoguy proposed in his 2026-05-01 review of
faucetsdn#4733: for each port/VLAN whose only change is acls_in, iterate the old
and new ACL lists and call del/add per-ACL with a cold-start priority
decrement. remove_overlap_ofmsgs in valve_flowreorder cancels unchanged
del+add pairs at flow-emit, so unchanged ACL rules don't churn on the
wire. Inserting an ACL mid-list shifts subsequent priorities -- those
re-emit (acceptable per gizmoguy's fallback in the linked review).

dp.py
- _get_vlan_config_changes returns a new changed_acl_vlans set; ACL-only
  VLAN changes are no longer folded into changed_vlans so they
  warm-reload instead of cold

valve.py _apply_config_changes
- Snapshot the old acls_in lists before dp_init swaps self.dp
- Per-ACL del+add loop replaces the prior cold_start_port handling
  for changed_acl_ports, and adds the equivalent for changed_acl_vlans
- Empty<->non-empty port transitions need the wildcard rule flipped,
  so fall back to cold_start_port

valve_acl.py
- add_port_acl / del_port_acl get a priority kwarg defaulting to
  self.auth_priority so dot1x callers are unchanged
- build_acl_port_of_msgs derives allow/force_port_vlan instructions
  from the pipeline (matching cold-start build_acl_ofmsgs), so rules
  with force_port_vlan: 1 land in the right table; dot1x ACLs never
  carry force_port_vlan: 1 so its behavior is preserved
- New add_vlan_acl / del_vlan_acl symmetric to the port helpers
  (the methods gizmoguy noted were missing for per-ACL VLAN reload)
- del_port's flowdel is now priority-less so its flowmodkey differs
  from the acl_priority wildcard that add_port emits for a port with
  no acls_in; without this, remove_overlap_ofmsgs cancels the delete
  and old ACL flows stay on the switch (the @hieunt79 fix from faucetsdn#4733)

Tests
- FaucetConfigReloadVlanAclChangeTest and FaucetConfigReloadPortAclRemoveTest
  come from @hieunt79's faucetsdn#4733 (the latter exercises the bug gizmoguy
  reproduced reviewing faucetsdn#4733: removing one ACL from a port that has
  multiple ACLs must leave the others installed)
- FaucetConfigReloadPortAclRemoveAllTest covers removing the last ACL
  from a port; without the del_port fix above this hits the
  remove_overlap_ofmsgs cancellation and the old flow stays stale.
  Both ports share the same ACL so removing it from one port doesn't
  change the overall pipeline match set (which would otherwise force
  cold-start for table reconfiguration)
- ValveRemoveAllPortACLsTestCase covers the same scenario at unit
  level
- test_vlan_acl_update now expects warm reload (was cold)
- ValveChangeVLANACLTestCase.test_change_vlan_acl now expects warm

Co-Authored-By: hieunt79 <nguyenhieu264996@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant