Commit ff68fa5

Update three-layer architecture docs, knowledge graph and RSS spec to match the actual thash_adjust and IPv6 reverse-proxy fixes.
Correct ff_rss_adjust_sport/6 signatures (first,last), record the FF_RSS_DIAG diagnostic gating, add external-research references, and merge valuable _rf_work draft findings into the formal spec while removing the drafts.
1 parent 9ba1e7d commit ff68fa5

19 files changed

Lines changed: 261 additions & 1233 deletions

docs/01-LAYER1-ARCHITECTURE.md

Lines changed: 2 additions & 0 deletions
@@ -371,6 +371,8 @@ F-Stack fully leverages modern NIC hardware capabilities:
371371

372372
- **Hardware RSS**: Based on 5-tuple (src-ip, dst-ip, src-port, dst-port, proto)
373373
- **Benefits**: Same connection always routed to the same RX queue → avoids TCP reordering
374+
- **Connect-side RSS affinity (reverse-proxy path)**: On outbound connect, F-Stack reverse-computes an ephemeral source port so the **inbound reply (SYN-ACK)** lands on the local RX queue. The runtime NIC RSS key (KEY_FINAL) is built and published **before** `dev_configure` by `ff_rss_thash_build_key()`, and `ff_rss_adjust_sport[6]()` solves the port within the kernel's ephemeral range `[first,last]` via `rte_thash_adjust_tuple` (reply field order). Gated by `[rss_check] thash_adjust` (default on, decoupled from `rss_check.enable`); diagnostics gated by compile macro `FF_RSS_DIAG` (default off, no dataplane impact). Details: `docs/ff_rss_check_opt_spec/zh_cn/`.
375+
- **IPv6 reverse-proxy address fix (FreeBSD 15, `lib/ff_veth.c`)**: VIP6 configured as a /128 host address (no on-link prefix route), link-local gateway scoped via `in6_setscope`, and DAD skipped (`ND6_IFF_NO_DAD`) because FreeBSD 15 `ip6_input` drops unicast to `NOTREADY`/`TENTATIVE` addresses and user-space has no timer to complete DAD.
374376

375377
## Summary
376378

docs/02-LAYER2-INTERFACES.md

Lines changed: 1 addition & 1 deletion
@@ -466,7 +466,7 @@ FF_TCPHPTS=1 make
466466
| `ff_write()` returns -1 | Send queue full | Retry later, or increase memory pool |
467467
| Segmentation fault | Cross-lcore socket operation | Ensure each socket operates within a single lcore |
468468
| Connection dropped | Network issue or timeout | Check RST/FIN flags, reconnect |
469-
| Uneven RSS flow distribution | NIC not supported or misconfigured | Check `ff_rss_tbl_init()` logs |
469+
| Uneven RSS flow distribution | NIC not supported or misconfigured; or thash reverse path disabled | Check `ff_rss_tbl_init()` logs; ensure `[rss_check] thash_adjust=1` so connect-side reverse computation aligns the NIC RSS key (KEY_FINAL) and the reply lands on the local queue — see `docs/ff_rss_check_opt_spec/zh_cn/` (R-F/R-G) |
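The switches named in this row can be sketched as a `config.ini` fragment. The section and key names (`[rss_check]`, `enable`, `thash_adjust`, `recheck`) are the ones these docs mention; the `enable` value is only an example, and any other keys the section may carry are omitted here.

```ini
[rss_check]
# Example value only; thash_adjust is documented as decoupled from this switch.
enable=1
# Connect-side reverse computation and NIC RSS key sync (documented default: 1).
thash_adjust=1
# Secondary soft re-verify after a successful adjust (documented default: 0; 1 is for debugging).
recheck=0
```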
470470

471471
## 8. Best Practices
472472

docs/03-LAYER3-FUNCTIONS.md

Lines changed: 4 additions & 0 deletions
@@ -222,6 +222,10 @@ int ff_rss_tbl_init(void);
222222
```
223223
224224
> **RSS lport optimization (see `ff_rss_check_opt_spec`)**: the connect-side RSS source-port selection has been extended with three optimizations: (0.1) IPv4 kernel-side port-range hooks migrated back to FreeBSD 15.0 (`freebsd/netinet/in_pcb.c`), (0.3) a dynamic fast path that reverse-calculates the source port via `rte_thash_adjust_tuple` with a forced soft re-verify (`ff_rss_thash_ctx_init` / `ff_rss_adjust_sport`), and (0.2) an independent IPv6 path (`ff_rss_check6` / `ff_rss_tbl6_init` / `ff_rss_tbl6_set/get_portrange` / `ff_rss_adjust_sport6`) that leaves the IPv4 structures/signatures untouched. A read-only helper `ff_rss_self_queue_info()` exposes the current process's queue id / nb_queues / reta_size. Details and verification: `docs/ff_rss_check_opt_spec/zh_cn/`. R-D (2026-06, spec 10 §R-D): the secondary soft re-verify in `ff_rss_adjust_sport` / `ff_rss_adjust_sport6` is now runtime-gated via `config.ini [rss_check] recheck=0`/`=1`, off by default to realize the ~100 ns/call performance saving; `recheck=1` is for debug re-verify only. R-E (2026-06, spec 10 §6, commit `ff9e3c449`): the IP_BIND_ADDRESS_NO_PORT bind-then-connect RSS port selection is ported to FreeBSD 15.0. `freebsd/netinet/in_pcb.c` adds `#ifdef`/`#ifndef FSTACK` gating in `in_pcbbind`/`in_pcbbind_setup`: on bind(addr,0) it defers port allocation and skips hash insertion, so the subsequent connect takes the R-A `INPLOOKUP_LPORT_RSS_CHECK` path and picks an RSS-affine source port. `freebsd/netinet6/in6_pcb.c` mirrors this for v6 (path B: under FSTACK the outer `if` in `in6_pcbconnect` is relaxed to `unspec || lport==0`, and the inner `in6p_laddr` assignment gains an `IN6_IS_ADDR_UNSPECIFIED` guard to preserve the user-supplied address). +16 / -1 lines, no lib changes; with FSTACK off it falls back to stock 15.0 behavior; no regression for REUSEPORT_LB MPASS or bind(addr,N).
225+
>
226+
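> **R-F (2026-07, thash reverse-path fix + `thash_adjust` switch)**: the signatures of `ff_rss_adjust_sport` / `ff_rss_adjust_sport6` **gain two parameters, `uint16_t first, uint16_t last`** (the ephemeral port range, passed in from `in_pcb.c`). The reverse computation aligns the candidate port to a reta_size-aligned block inside `[first,last]` and, after solving for the port, defensively checks that it falls within `[first,last]` (`lib/ff_dpdk_if.c:3252` / `:3689`; call sites `in_pcb.c` L962-964 / L899-901). The reverse-computed tuple uses **reply (inbound SYN-ACK) field order** (remote/local/dport=80/localPort), so the return packet lands on the local queue. The NIC RSS KEY_FINAL is built by `ff_rss_thash_build_key(port_id, reta_size)` (`ff_dpdk_if.c:3027`) **before** `dev_configure` and published to the global `rsskey` (v4 sport helper offset=80, v6=272, helper len=16); `ff_rss_thash_ctx_init(void)` (`ff_dpdk_if.c:3189`, primary only) reads back the NIC key/RETA after start for a diagnostic cross-check. The new `[rss_check] thash_adjust` option (default 1, decoupled from `rss_check.enable`) gates build_key, ctx_init, and the route② soft-scan fallback in adjust_sport; with `thash_adjust=0` only the kernel-side soft scan is used. Diagnostic dumps (`ff_rss_diag_dump_key` and others) are gated by the compile macro **`FF_RSS_DIAG`, off by default, with no dataplane impact**.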
227+
228+
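> **IPv6 reverse-proxy address fix (2026-07, `lib/ff_veth.c`)**: fixes three address/neighbor configuration problems that made the IPv6 reverse-proxy VIP unreachable on FreeBSD 15.0 (see `docs/ff_rss_check_opt_spec/zh_cn/11-IPv6反代VIP-onlink修复.md`): (1) **VIP6 as a /128 host address**: `ff_veth_setvaddr6` sets `ifra_prefixmask` to all `0xff` (`memset(..., 0xff, 16)`, `ff_veth.c:879-880`) so that no on-link prefix route is installed and traffic to other addresses in the VIP's subnet is correctly sent via the gateway; (2) **link-local gateway scope**: for a `fe80::/10` gateway, `ff_veth_setgateway6` calls `in6_setscope(&gw.sin6_addr, sc->ifp, NULL)` (`ff_veth.c:849-851`) to fill in the zone so the link-local gateway can be resolved; (3) **skip DAD**: initialization sets `ND6_IFF_NO_DAD` on the interface (`ff_veth.c:980`), because with no timer driving DAD in user space the address would stay `IN6_IFF_TENTATIVE` forever, and FreeBSD 15 `ip6_input` silently drops packets sent to `NOTREADY`/`TENTATIVE` unicast addresses. All three are gated by `#ifdef INET6`; pure-v4 and existing behavior are unaffected.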
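The effect of the /128 mask and the `fe80::/10` test can be illustrated with a small standalone sketch. It does not use the FreeBSD kernel structures; all function names and the sample addresses (`2001:db8::/32` documentation prefix) are hypothetical and chosen only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Sample addresses (hypothetical): a VIP, a same-/64 neighbor, a link-local gateway. */
static const uint8_t vip6[16]  = {0x20, 0x01, 0x0d, 0xb8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x10};
static const uint8_t peer6[16] = {0x20, 0x01, 0x0d, 0xb8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x20};
static const uint8_t gw6[16]   = {0xfe, 0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01};

/* Build the 16-byte netmask for a prefix length of 0..128 bits. */
static void prefix_to_mask6(unsigned plen, uint8_t mask[16])
{
    if (plen > 128)
        plen = 128;
    memset(mask, 0, 16);
    memset(mask, 0xff, plen / 8);          /* plen == 128 gives all 0xff */
    if (plen % 8)
        mask[plen / 8] = (uint8_t)(0xff << (8 - plen % 8));
}

/* Would `dst` be treated as on-link for an address `self` with prefix `plen`? */
static int on_link6(const uint8_t self[16], const uint8_t dst[16], unsigned plen)
{
    uint8_t mask[16];

    prefix_to_mask6(plen, mask);
    for (int i = 0; i < 16; i++) {
        if ((self[i] & mask[i]) != (dst[i] & mask[i]))
            return 0;
    }
    return 1;
}

/* fe80::/10 test: first 10 bits are 1111 1110 10. */
static int is_linklocal6(const uint8_t a[16])
{
    return a[0] == 0xfe && (a[1] & 0xc0) == 0x80;
}
```

With a /64 mask the neighbor `peer6` matches the VIP's prefix and would be reached directly; with the /128 mask only the VIP itself matches, so everything else falls through to the gateway route. The link-local test is what decides whether a gateway address needs a zone filled in.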
225229
226230
### 2.5 ff_msg_ring Structure (Inter-Process Communication)
227231

docs/F-Stack_Architecture_Layer1_System_Overview.md

Lines changed: 2 additions & 0 deletions
@@ -550,6 +550,8 @@ Effect:
550550
✓ Completely avoids TCP reordering and cache pollution
551551
```
552552
553+
**Connect-side reverse path + IPv6 reverse-proxy fix (2026-07, R-F/R-G).** For outbound connect, F-Stack reverse-computes the ephemeral source port so the **inbound reply (SYN-ACK)** lands on the initiating process's RX queue: the NIC RSS KEY_FINAL is built/published before `dev_configure` (`ff_rss_thash_build_key`), and `ff_rss_adjust_sport[6]` solves the port within the kernel's ephemeral range `[first,last]` using reply field order (`rte_thash_adjust_tuple`). Root cause of the earlier ~22-27% mis-queue was **inconsistent keys across adjust/soft-check/NIC** (LFSR key rewrite in `rte_thash_add_helper`), not byte order — fixed by publishing the unified KEY_FINAL to the NIC. Gated by `[rss_check] thash_adjust` (default on); diagnostics gated by compile macro `FF_RSS_DIAG` (default off). Separately, IPv6 reverse-proxy VIP addressing on FreeBSD 15 is fixed in `lib/ff_veth.c` (VIP6 /128 host addr, link-local gateway `in6_setscope`, `ND6_IFF_NO_DAD` because `ip6_input` drops unicast to NOTREADY/TENTATIVE). Details: `docs/ff_rss_check_opt_spec/zh_cn/`.
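To make "reply field order" concrete, here is a minimal software Toeplitz hash with a helper that hashes the reply of an outbound IPv4/TCP connection. It is a sketch, not F-Stack's code (the docs above say F-Stack relies on DPDK's `rte_thash` helpers). All function names are hypothetical, and `rss_example_key` is the widely published Microsoft RSS verification key, used here only so the result can be checked; it is not KEY_FINAL.

```c
#include <stddef.h>
#include <stdint.h>

/* Published Microsoft RSS verification key (illustration only, not KEY_FINAL). */
static const uint8_t rss_example_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/* Toeplitz hash over `len` input bytes; `key` must hold at least len + 4 bytes. */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data, size_t len)
{
    uint32_t hash = 0;
    /* 32-bit window over the key, starting at key bit 0. */
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | (uint32_t)key[3];

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                hash ^= window;
            /* Slide the window by one bit and pull in the next key bit. */
            window <<= 1;
            if (key[i + 4] & (1u << bit))
                window |= 1u;
        }
    }
    return hash;
}

/* Serialize src ip, dst ip, src port, dst port in wire (big-endian) order. */
static void pack_v4_tuple(uint8_t out[12], uint32_t src_ip, uint32_t dst_ip,
                          uint16_t src_port, uint16_t dst_port)
{
    for (int i = 0; i < 4; i++) {
        out[i]     = (uint8_t)(src_ip >> (24 - 8 * i));
        out[4 + i] = (uint8_t)(dst_ip >> (24 - 8 * i));
    }
    out[8]  = (uint8_t)(src_port >> 8);
    out[9]  = (uint8_t)(src_port & 0xff);
    out[10] = (uint8_t)(dst_port >> 8);
    out[11] = (uint8_t)(dst_port & 0xff);
}

/* Hash of the reply to an outbound connect local -> remote:
 * the reply's source is the remote end, its destination is the local end. */
static uint32_t reply_hash_v4(const uint8_t *key,
                              uint32_t local_ip, uint16_t local_port,
                              uint32_t remote_ip, uint16_t remote_port)
{
    uint8_t tuple[12];

    pack_v4_tuple(tuple, remote_ip, local_ip, remote_port, local_port);
    return toeplitz_hash(key, tuple, sizeof(tuple));
}

/* RETA lookup: the low bits of the hash index the redirection table. */
static uint16_t rss_queue(uint32_t hash, const uint16_t *reta, uint32_t reta_size)
{
    return reta[hash & (reta_size - 1)];
}
```

The hash only predicts the NIC's queue choice if the same key is used everywhere, which is the consistency problem the paragraph above identifies: a software hash computed with one key and a NIC programmed with another will disagree for a share of flows.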
554+
553555
### 4.4 Initialization Flow
554556
555557
```

docs/F-Stack_Architecture_Layer2_Interface_Specification.md

Lines changed: 2 additions & 0 deletions
@@ -641,6 +641,8 @@ Priority: per-socket marker > `config.ini [stack] kernel_coexist` > F-Stack. Whe
641641

642642
**R10 — residual-entry coexistence.** `ff_readv`/`ff_writev` (kernel fd → `ff_host_readv/writev`, mimic read/write, connection fds single-stack hot path), `ff_ioctl` (kernel fd uses the **raw Linux request** straight to `ff_host_ioctl`, NOT via `linux2freebsd_ioctl`; dual-stack fd same-driver since R10.1 syncs `FIONBIO`/`FIOASYNC` to the paired host fd (query ioctls like `FIONREAD` not forwarded, to avoid clobbering argp)), `ff_dup` (kernel fd → `ff_host_dup`+encode), `ff_dup2` (both-kernel → `ff_host_dup2`+encode; cross-stack rejected `errno=EINVAL`). Adds 5 host bridges `ff_host_readv/writev/ioctl/dup/dup2`. Known limitation: `ff_select` (encode kernel fd ≫ `FD_SETSIZE` hard limit) / `ff_poll` (conservatively not implemented) do not support kernel-fd coexistence — use `ff_epoll_*`/`ff_kqueue`.
643643

644+
**R-F/R-G — RSS connect-side reverse path + IPv6 reverse-proxy fix (internal, no public API change).** The internal RSS source-port reverse-computation helpers `ff_rss_adjust_sport` / `ff_rss_adjust_sport6` gained two range parameters `uint16_t first, uint16_t last` (ephemeral range from `freebsd/netinet/in_pcb.c`), and now build the tuple in **reply (inbound SYN-ACK) field order** so the reply lands on the local RX queue; the NIC RSS KEY_FINAL is built and published before `dev_configure` by `ff_rss_thash_build_key`. Gated by `[rss_check] thash_adjust` (default on, decoupled from `rss_check.enable`); diagnostics gated by compile macro `FF_RSS_DIAG` (default off). No public `ff_*` socket API signature changes. Separately, `lib/ff_veth.c` fixes IPv6 reverse-proxy addressing on FreeBSD 15 (VIP6 as /128 host addr, link-local gateway `in6_setscope`, `ND6_IFF_NO_DAD` to skip DAD since `ip6_input` drops unicast to NOTREADY/TENTATIVE). Details: `docs/ff_rss_check_opt_spec/zh_cn/`.
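The role of the `[first,last]` parameters can be sketched as follows. The assumption here, which the docs above do not spell out, is that the tuple adjustment only rewrites the low log2(reta_size) bits of the port, so a candidate placed in a reta_size-aligned block stays inside that block after adjustment. The function names are hypothetical.

```c
#include <stdint.h>

/*
 * Pick the base of a reta_size-aligned block that lies fully inside
 * [first,last], preferring the block that contains `candidate`.
 * Returns the base port, or -1 if no such block fits.
 * Assumption: reta_size is a power of two no larger than 65536.
 */
static int32_t aligned_block_base(uint16_t candidate, uint16_t first,
                                  uint16_t last, uint32_t reta_size)
{
    if (first > last || reta_size == 0 || reta_size > 65536 ||
        (reta_size & (reta_size - 1)) != 0)
        return -1;
    if ((uint32_t)last + 1 < reta_size)
        return -1;

    uint32_t mask = reta_size - 1;
    /* Lowest aligned base that is >= first. */
    uint32_t lo = ((uint32_t)first + mask) & ~mask;
    /* Highest aligned base whose block still ends at or before last. */
    uint32_t hi = ((uint32_t)last + 1 - reta_size) & ~mask;
    if (lo > hi)
        return -1;

    uint32_t base = (uint32_t)candidate & ~mask;
    if (base < lo)
        base = lo;
    if (base > hi)
        base = hi;
    return (int32_t)base;
}

/* Defensive guard applied to the solved port. */
static int port_in_range(uint16_t port, uint16_t first, uint16_t last)
{
    return port >= first && port <= last;
}
```

For example, with `first=10000`, `last=65535` and `reta_size=128`, a candidate of 10000 sits in the block starting at 9984, which begins below `first`, so the sketch moves it up to the block starting at 10112. A range narrower than one aligned block yields -1, the case where a caller would need some other way to pick a port.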
645+
644646
---
645647

646648
## 4. Multi-Process and Multi-Thread Interfaces

docs/F-Stack_Architecture_Layer3_Function_Index.md

Lines changed: 11 additions & 0 deletions
@@ -1195,6 +1195,17 @@ bash start.sh -c config.ini -b ./app
11951195

11961196
---
11971197

1198+
## Recent Code Delta (2026-07, RSS thash reverse path + IPv6 reverse-proxy fix)
1199+
1200+
> Kept in sync with the `01/02/03-LAYER*` docs; authoritative details are in `docs/ff_rss_check_opt_spec/zh_cn/`.
1201+
1202+
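- **`ff_rss_adjust_sport` / `ff_rss_adjust_sport6` signatures gain `uint16_t first, uint16_t last`** (ephemeral port range, passed in from `freebsd/netinet/in_pcb.c`): the reverse computation aligns the candidate port to a reta_size-aligned block inside `[first,last]` and checks that the solved port falls within `[first,last]`; the tuple uses reply (inbound SYN-ACK) field order so the return packet lands on the local queue (`lib/ff_dpdk_if.c:3252` / `:3689`).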
1203+
- **NIC RSS KEY_FINAL build/publish**: `ff_rss_thash_build_key(port_id, reta_size)` (`ff_dpdk_if.c:3027`) builds the v4/v6 thash contexts **before** `dev_configure` and publishes the global `rsskey` (v4 sport offset=80, v6=272, helper len=16); `ff_rss_thash_ctx_init(void)` (`:3189`, primary only) reads back the NIC key/RETA after start for a diagnostic cross-check.
1204+
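- **`[rss_check] thash_adjust` switch** (default 1, decoupled from `rss_check.enable`): gates build_key, ctx_init, and the route② soft-scan fallback in adjust_sport. Diagnostic dumps (`ff_rss_diag_dump_key`) are gated by the compile macro **`FF_RSS_DIAG`, off by default, with no dataplane impact**.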
1205+
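- **IPv6 reverse-proxy address fix (`lib/ff_veth.c`)**: VIP6 as a /128 host address (prefix mask all `0xff`, avoiding an on-link prefix route), `in6_setscope` fills in the zone for a link-local gateway, and `ND6_IFF_NO_DAD` skips DAD (FreeBSD 15 `ip6_input` drops unicast to NOTREADY/TENTATIVE addresses).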
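The list above mentions a soft-scan fallback without describing how it walks the port range. One plausible minimal form is a linear search over `[first,last]` that accepts the first port whose reply hash maps to the caller's queue. Everything in this sketch is hypothetical: the callback type, the function names, and `demo_hash`, which merely stands in for a real Toeplitz hash so the example is self-contained.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: RSS hash of the reply when the local port is `port`. */
typedef uint32_t (*reply_hash_fn)(uint16_t port, void *ctx);

/* Stand-in hash for demonstration only; a real caller would hash the reply tuple. */
static uint32_t demo_hash(uint16_t port, void *ctx)
{
    (void)ctx;
    return (uint32_t)port * 2654435761u;
}

/*
 * Scan [first,last], starting at `start` and wrapping once, for a port
 * whose reply hash selects `my_queue` in the redirection table.
 * Returns the port, or -1 if the range holds no match.
 * Assumption: reta_size is a power of two.
 */
static int soft_scan_port(uint16_t first, uint16_t last, uint16_t start,
                          const uint16_t *reta, uint32_t reta_size,
                          uint16_t my_queue, reply_hash_fn hash, void *ctx)
{
    if (first > last || reta_size == 0 || (reta_size & (reta_size - 1)) != 0)
        return -1;

    uint32_t span = (uint32_t)last - first + 1;
    uint32_t off = (start >= first && start <= last) ? (uint32_t)(start - first) : 0;

    for (uint32_t n = 0; n < span; n++) {
        uint16_t port = (uint16_t)(first + (off + n) % span);

        if (reta[hash(port, ctx) & (reta_size - 1)] == my_queue)
            return port;
    }
    return -1;
}
```

A scan like this costs one hash per candidate, which is why a direct solve is attractive as the fast path; the scan remains useful as a fallback because it depends only on being able to compute the hash.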
1206+
1207+
---
1208+
11981209
**Related Documents**:
11991210
- [Layer 1: System Architecture Overview](./F-Stack_Architecture_Layer1_System_Overview.md)
12001211
- [Layer 2: Interface Definition and Specification](./F-Stack_Architecture_Layer2_Interface_Specification.md)

docs/KNOWLEDGE_GRAPH_WIKI.md

Lines changed: 23 additions & 0 deletions
@@ -63,6 +63,29 @@ Traceability: `docs/kernel_event_support_spec/` and `docs/kernel_event_support_s
6363

6464
---
6565

66+
## 2B. Post-index code delta: RSS thash reverse path (R-F) + IPv6 reverse-proxy address fix
67+
68+
> **Manual addendum (not yet re-indexed)**: documented against current source (2026-07). Covers the RSS connect-side reverse-computation fix (`ff_rss_check_opt_spec` R-F) and the IPv6 reverse-proxy VIP/gateway/DAD fix (`ff_rss_check_opt_spec/zh_cn/11-*`). Re-index with `npx gitnexus analyze` to fold into the graph.
69+
70+
The feature makes the **inbound reply (SYN-ACK)** of an outbound connect land on the local RX queue, and fixes IPv6 reverse-proxy VIP reachability on FreeBSD 15. Reverse computation + NIC RSS key sync are gated by `[rss_check] thash_adjust` (default on, **decoupled from `rss_check.enable`**); diagnostics are gated by the compile macro `FF_RSS_DIAG` (default off, no dataplane impact).
71+
72+
| Symbol / Area | Where in source |
73+
|---------------|-----------------|
74+
| `ff_rss_adjust_sport` (**signature +`first,last`**) | `lib/ff_dpdk_if.c:3252` `int ff_rss_adjust_sport(void *softc, uint32_t saddr, uint32_t daddr, uint16_t dport, uint16_t *out_sport, uint16_t first, uint16_t last)` — aligns candidate to a reta_size-aligned block inside `[first,last]` (L3297-3308), builds the tuple in **reply field order** (remote/local/dport=80/localPort), calls `rte_thash_adjust_tuple`, then a defensive `[first,last]` range guard (L3352). Caller `freebsd/netinet/in_pcb.c` L962-964 passes `first,last`. |
75+
| `ff_rss_adjust_sport6` (**signature +`first,last`**) | `lib/ff_dpdk_if.c:3689` `int ff_rss_adjust_sport6(void *softc, const uint8_t *saddr6, const uint8_t *daddr6, uint16_t dport, uint16_t *out_sport, uint16_t first, uint16_t last)`. Caller `in_pcb.c` L899-901. v6 addrs are `const uint8_t *` (not `struct in6_addr *`). |
76+
| `recheck` re-verify (reply order) | On adjust success: `if (!recheck \|\| ff_rss_check(softc, saddr, daddr, dport, sport))` (`ff_dpdk_if.c:3362-3363`) — note `dport`/`sport` are the **reply** src/dst. `recheck` from `[rss_check] recheck` (default 0). |
77+
| `ff_rss_thash_build_key(port_id, reta_size)` | `lib/ff_dpdk_if.c:3027`, declared L168. Built **before** `dev_configure` (`init_port_start` L758-761 calls it when `nb_queues>1 && thash_adjust`), constructs v4 ctx (`rte_thash_add_helper "sport"` at `FF_RSS_THASH_V4_SPORT_OFF=80`, L152) + v6 ctx seeded from v4-rewritten key (`FF_RSS_THASH_V6_SPORT_OFF=272`, L163; helper len `FF_RSS_THASH_SPORT_HELPER_LEN=16`, L153), then publishes KEY_FINAL into the global `rsskey` (L3159+). |
78+
| `ff_rss_thash_ctx_init(void)` | `lib/ff_dpdk_if.c:3189` (primary only). Post-start diagnostic read-back of NIC key/RETA for cross-check; gated by `thash_adjust` at `ff_dpdk_if.c:1491-1493`. |
79+
| `thash_adjust` switch | `lib/ff_config.c`: default set `rcc->thash_adjust = 1` (L946), parsed `thash_adjust=` (L956-957). Gates `build_key` (`ff_dpdk_if.c:757-760`), `ctx_init` (L1491-1493), and the route② soft-scan fallback guard in `adjust_sport[6]` (L3262-3264 / L3701-3703). NULL cfg ⇒ treated as 1. |
80+
| `FF_RSS_DIAG` gating | `ff_rss_diag_dump_key` (`ff_dpdk_if.c:2990`) and its call sites (e.g. L3148-3157 in `build_key`, NIC-readback in `ctx_init`) wrapped in `#ifdef FF_RSS_DIAG`, **default off**. |
81+
| IPv6 VIP6 /128 host addr | `lib/ff_veth.c:861` `ff_veth_setvaddr6`: `memset(&ifr6.ifra_prefixmask.sin6_addr, 0xff, 16)` (L879-880) — /128, avoids on-link prefix route so same-subnet traffic goes via the gateway. |
82+
| IPv6 link-local gateway scope | `lib/ff_veth.c:829` `ff_veth_set_gateway6`: `if (IN6_IS_ADDR_LINKLOCAL(&gw.sin6_addr)) in6_setscope(&gw.sin6_addr, sc->ifp, NULL)` (L849-850) — completes the zone id for a `fe80::/10` gateway. |
83+
| IPv6 skip DAD | `lib/ff_veth.c:908` `ff_veth_setup_interface`: `ND_IFINFO(ifp)->flags \|= ND6_IFF_NO_DAD` (L980) — user-space has no timer to complete DAD, and FreeBSD 15 `ip6_input` silently drops unicast to `NOTREADY`/`TENTATIVE` addresses. |
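The helper offsets quoted in the table (80 for v4, 272 for v6, length 16) line up with simple layout arithmetic. This is an inference rather than something the source states: if the hash input is laid out in wire order, bit 80 (v4) and bit 272 (v6) are where the 16-bit destination port begins, and in reply field order the destination port is the local port being solved for. The struct and macro names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Wire-order layout of the RSS hash input; byte arrays avoid any padding. */
struct v4_tuple_wire {
    uint8_t src_addr[4];
    uint8_t dst_addr[4];
    uint8_t src_port[2];
    uint8_t dst_port[2];   /* in a reply, this is the local port */
};

struct v6_tuple_wire {
    uint8_t src_addr[16];
    uint8_t dst_addr[16];
    uint8_t src_port[2];
    uint8_t dst_port[2];   /* in a reply, this is the local port */
};

#define BIT_OFFSET(type, field) (offsetof(type, field) * 8u)
#define BIT_WIDTH(type, field)  (sizeof(((type *)0)->field) * 8u)

/* Compile-time checks of the arithmetic: 32+32+16 = 80 and 128+128+16 = 272. */
_Static_assert(BIT_OFFSET(struct v4_tuple_wire, dst_port) == 80, "v4 dst port at bit 80");
_Static_assert(BIT_OFFSET(struct v6_tuple_wire, dst_port) == 272, "v6 dst port at bit 272");
_Static_assert(BIT_WIDTH(struct v4_tuple_wire, dst_port) == 16, "port is 16 bits wide");
```

Read this way, the helper named "sport" covers the local source port of the outbound connection, which appears in the destination-port position once the tuple is written in reply order.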
84+
85+
Traceability: `docs/ff_rss_check_opt_spec/zh_cn/` (00-11), esp. `05-接口设计.md` (signatures), `11-IPv6反代VIP-onlink修复.md` (v6 address fix), and `10-实施与验证报告.md`.
86+
87+
---
88+
6689
## 3. Directory Structure
6790

6891
```
