rfcs: add RFC-18 link classification flex-algo #3288
ben-malbeclabs wants to merge 2 commits into main
Introduces onchain link color model using IS-IS Flexible Algorithm (RFC 9350) to separate VPN unicast and multicast forwarding topologies. Defines LinkColorInfo PDA, link_color field on Link, FlexAlgo feature flag, and controller changes for admin-group tagging, flex-algo definitions, system-colored-tunnel-rib BGP resolution, and per-tunnel color extended community stamping. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://datatracker.ietf.org/doc/html/rfc2119). | ||
| DoubleZero contributors operate links with different physical characteristics — low latency, high bandwidth, or both. Today all traffic uses the same IS-IS topology, so every service follows the same paths regardless of what those paths are optimized for. This RFC introduces a link classification model that allows DZF to assign named color labels to links onchain and use IS-IS Flexible Algorithm (flex-algo) to compute separate constraint-based forwarding topologies per color. Different traffic classes — VPN unicast and IP multicast — can then use different topologies. |
I assume you just mean that MTU could be used to define a topology, rather than concerns about MTU size based on additional labels?
Yes, I meant MTU as one of the link characteristics.
I think there are multiple potential use cases, but I am not sure they are worth explicitly calling out in this RFC.
| DZF creates a `LinkColorInfo` PDA per color. It stores the color's name and auto-assigned routing parameters. The program MUST auto-assign the next available admin-group bit (starting at 0) and the corresponding flex-algo number and EOS color value using the formula: | ||
| ``` | ||
| admin_group_bit = next available bit in 0–127 |
We would need to define a tracking mechanism for admin-group bits; a persistent bitmap or counter on GlobalState or something. Otherwise we'd have to scan all existing LinkColorInfo accounts at instruction time.
| The program MUST validate `admin_group_bit <= 127` on `create` and MUST return an explicit error if all 128 slots are exhausted. This is a hard constraint: EOS supports bits 0–127 only, and `128 + 127 = 255` is the maximum representable value in `flex_algo_number: u8`. | ||
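The bound in this paragraph can be sketched as below. This is a minimal illustration, not the program's actual code: the function name `flex_algo_number_for` and the `ColorError` type are hypothetical, and the `128 + admin_group_bit` mapping is inferred from the `128 + 127 = 255` arithmetic above rather than taken from the full formula, which is not visible in this excerpt.

```rust
/// Highest admin-group bit, per the "EOS supports bits 0–127 only" statement.
const MAX_ADMIN_GROUP_BIT: u8 = 127;
/// Assumed base offset, inferred from "128 + 127 = 255" in the RFC text.
const FLEX_ALGO_BASE: u8 = 128;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ColorError {
    AdminGroupBitOutOfRange(u8),
}

/// Validates the bit and derives the flex-algo number from it.
pub fn flex_algo_number_for(admin_group_bit: u8) -> Result<u8, ColorError> {
    if admin_group_bit > MAX_ADMIN_GROUP_BIT {
        return Err(ColorError::AdminGroupBitOutOfRange(admin_group_bit));
    }
    // Cannot overflow: the largest result is 128 + 127 = 255, which fits in u8.
    Ok(FLEX_ALGO_BASE + admin_group_bit)
}
```

Under these assumptions, bit 0 maps to flex-algo 128 and bit 127 maps to 255; any larger bit is rejected before the addition can overflow.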
| Admin-group bits from deleted colors MUST NOT be reused. Color deletion is not supported in this RFC, so this constraint applies to any future deletion implementation: reusing a bit before all devices have had their config updated would cause those devices to apply the new color's constraints to interfaces still carrying the old bit's admin-group. At current scale (128 available slots), exhaustion is not a practical concern. |
Once delete removes the PDA (as in line 160), we cannot enforce the no-reuse requirement without a persistent record of previously allocated bits that would survive PDA deletion.
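One possible shape for such a persistent record is a 128-bit "ever allocated" bitmap that is only ever set, never cleared. The sketch below is an assumption for discussion: `AdminGroupBitRecord` and its methods are hypothetical names, and where the state would live onchain is left open.

```rust
/// Hypothetical persistent record of admin-group bits. The idea is that this
/// state outlives any single LinkColorInfo PDA.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct AdminGroupBitRecord {
    /// Bit i is set once admin-group bit i has ever been handed out.
    ever_allocated: u128,
}

impl AdminGroupBitRecord {
    /// Returns the lowest never-used bit, or None once all 128 are spent.
    pub fn allocate(&mut self) -> Option<u8> {
        // trailing_zeros of the inverted map is the index of the first clear
        // bit; it is 128 when every bit is already set.
        let bit = (!self.ever_allocated).trailing_zeros();
        if bit >= 128 {
            return None;
        }
        self.ever_allocated |= 1u128 << bit;
        Some(bit as u8)
    }

    /// Deleting a color deliberately leaves its bit marked as used,
    /// which is what makes reuse impossible.
    pub fn on_color_deleted(&mut self, _bit: u8) {}
}
```

Because allocation reads a single `u128`, the instruction would not need to scan existing `LinkColorInfo` accounts, and exhaustion surfaces as `None` for the caller to turn into an explicit error.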
| **Scope:** | ||
| - Delivers traffic-class-level segregation: multicast vs. VPN unicast at the network level | ||
| - All unicast tenants share a single constrained topology today — the architecture is forward-compatible with per-tenant path differentiation without rework | ||
| - Per-tenant steering (directing one tenant to a different constrained topology) requires adding a `topology_color` field to the `Tenant` account — deferred to a future RFC that builds on the link color model defined here |
If in the future we want to allow a tenant to have their traffic avoid a link color, then topology_color should maybe be called include_topology_colors, and then in the future we could add exclude_topology_colors. Note the plural since we should make these vectors in case we want to allow multiple colors in the future.
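A rough sketch of how the two vectors could sit side by side is shown here. Everything in it is hypothetical: `TenantTopologyColors` is not an existing account, `Pubkey` is a 32-byte stand-in, and the "include minus exclude" rule is only one plausible reading of how an exclude list would interact with an include list.

```rust
/// Stand-in for a Solana Pubkey (assumption: 32 raw bytes).
pub type Pubkey = [u8; 32];

/// Hypothetical slice of the Tenant account holding both color vectors.
#[derive(Debug, Default, Clone)]
pub struct TenantTopologyColors {
    pub include_topology_colors: Vec<Pubkey>,
    pub exclude_topology_colors: Vec<Pubkey>,
}

impl TenantTopologyColors {
    /// Colors the tenant may use: included colors minus excluded ones,
    /// preserving the order of the include list.
    pub fn effective_colors(&self) -> Vec<Pubkey> {
        self.include_topology_colors
            .iter()
            .filter(|c| !self.exclude_topology_colors.contains(*c))
            .copied()
            .collect()
    }
}
```

Keeping both fields as vectors from the start means adding a second color later would not require an account layout change.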
| #[derive(BorshSerialize, BorshDeserialize, Debug)] | ||
| pub struct LinkColorInfo { | ||
| pub name: String, // e.g. "unicast-default" | ||
| pub admin_group_bit: u8, // auto-assigned, 0–127 |
auto-assigned from global ResourceExtension "AdminGroupBits"
…ulti-color, cleanup

- Replace onchain feature flag with controller features.yaml config file
- Add LinkColorInfo account with AdminGroupBits ResourceExtension for persistent bit allocation; bits never reused after deletion
- Change link_color: Pubkey to link_colors: Vec<Pubkey> (cap 8)
- Add include_topology_colors: Vec<Pubkey> on Tenant for per-tenant color assignment; defaults to UNICAST-DEFAULT (color 1)
- Redesign interface admin-group cleanup: overwrite remaining colors on deletion rather than targeted named no command
- Add full revert: enabled: false removes all flex-algo config
- Pin UNICAST-DEFAULT as protocol invariant (bit 0, first color created)
- Add controller startup check blocking enabled: true if any Vpn4v loopback has unset flex_algo_node_segment_idx
- Clarify clear sweep atomicity and idempotency
- Address all PR review comments (nikw9944, vihu, elitegreg)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
RFC-18 introduces a link classification model for DoubleZero using IS-IS
Flexible Algorithm (flex-algo). DZF assigns named color labels to links
onchain; the controller translates these into IS-IS TE admin-groups and
flex-algo topology definitions on Arista EOS devices. BGP color extended
communities steer VPN unicast traffic onto constrained topologies, while
multicast continues to use all links via IS-IS algo 0.
What this RFC specifies:
- `LinkColorInfo` onchain account: defines a color with auto-assigned admin-group bit (from a new `AdminGroupBits` ResourceExtension), flex-algo number, EOS color value, and include/exclude constraint
- `link_colors: Vec<Pubkey>` on the `Link` account: assigns one or more colors to a link; controller renders all assigned colors as a single overwrite `traffic-engineering administrative-group` command
- `include_topology_colors: Vec<Pubkey>` on the `Tenant` account: assigns specific topology colors to a tenant; defaults to color 1 (UNICAST-DEFAULT) if unset
- `features.yaml`: gates flex-algo topology config, link tagging, and BGP color community stamping independently for staged rollout
- `enabled: false` removes all flex-algo config from all devices
- … `flex_algo_node_segment_idx`; controller blocks enablement until complete
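
To illustrate the "single overwrite" rendering of a link's colors, the sketch below folds the admin-group bits of every assigned color into one 128-bit mask. It is an assumption-laden example: `admin_group_mask` and `LinkColorError` are hypothetical names, the cap of 8 comes from the commit message above, and the mask is only an intermediate representation, not the EOS command syntax.

```rust
/// Cap on colors per link, taken from the "cap 8" note in the commit message.
const MAX_LINK_COLORS: usize = 8;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum LinkColorError {
    TooManyColors(usize),
    BitOutOfRange(u8),
}

/// Folds the admin-group bits of every color on a link into one 128-bit mask,
/// so the full set can be written to the interface in a single overwrite.
pub fn admin_group_mask(bits: &[u8]) -> Result<u128, LinkColorError> {
    if bits.len() > MAX_LINK_COLORS {
        return Err(LinkColorError::TooManyColors(bits.len()));
    }
    let mut mask = 0u128;
    for &bit in bits {
        if bit > 127 {
            return Err(LinkColorError::BitOutOfRange(bit));
        }
        mask |= 1u128 << bit;
    }
    Ok(mask)
}
```

Recomputing the mask from the colors that remain after a deletion gives the overwrite behavior directly: the removed color's bit simply never appears in the new value.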