docs(plans): AF_XDP integration plan for higher pps (Phase 1) #65

Merged
skullcrushercmd merged 2 commits into main from perf/portscan-afxdp-plan
Apr 27, 2026
Conversation

@skullcrushercmd Contributor

Phase 1 — Design + plan only. No scanner C code changes.

Phase 2 implementation is gated on explicit user/orchestrator approval after this plan PR merges.

Why

anygpt-4's c6in.metal multi-NIC bench hit 12.8 Mpps aggregate at 4 ENIs, gated by the host kernel TX/syscall path: a single AF_PACKET PACKET_TX_RING socket caps near 3 Mpps even with PACKET_QDISC_BYPASS. The Python adapter already documents this (vulnscanner-zmap-adapter.py:669).
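For context, this is the path being displaced: a minimal sketch of an AF_PACKET TX-ring sender with the qdisc bypass enabled (illustrative only, not the scanner's actual send path; block and frame sizes are made up):

```c
/* Sketch of an AF_PACKET PACKET_TX_RING sender with PACKET_QDISC_BYPASS.
 * Illustrative only -- not the scanner's actual code; sizes are arbitrary. */
#include <linux/if_packet.h>
#include <sys/mman.h>
#include <sys/socket.h>

static int open_af_packet_tx(int ifindex)
{
    int fd = socket(AF_PACKET, SOCK_RAW, 0);   /* proto 0: TX-only, no RX delivery */
    if (fd < 0)
        return -1;

    int one = 1;
    /* Skip the qdisc on TX (Linux >= 3.14). Helps, but the per-socket
     * sendto()-kick path still tops out around a few Mpps. */
    setsockopt(fd, SOL_PACKET, PACKET_QDISC_BYPASS, &one, sizeof(one));

    struct tpacket_req req = {
        .tp_block_size = 1 << 22,                    /* 4 MiB blocks            */
        .tp_block_nr   = 64,
        .tp_frame_size = 2048,
        .tp_frame_nr   = ((1 << 22) / 2048) * 64,    /* frames must fill blocks */
    };
    setsockopt(fd, SOL_PACKET, PACKET_TX_RING, &req, sizeof(req));

    struct sockaddr_ll addr = { .sll_family = AF_PACKET, .sll_ifindex = ifindex };
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    /* Frames are written into the mmap'd ring; each batch is flushed with
     * sendto(fd, NULL, 0, 0, NULL, 0) -- the syscall kick this plan avoids. */
    void *ring = mmap(NULL, (size_t)req.tp_block_size * req.tp_block_nr,
                      PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return ring == MAP_FAILED ? -1 : fd;
}
```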

AF_XDP lets the ENA backplane (~100 Mpps theoretical on c6in.metal) be the actual bottleneck instead.

What this PR adds

A single new file: plans/2026-04-27-portscan-afxdp-plan-v1.md (407 lines).

The plan is comprehensive and mergeable as a reference doc — the user has it in-tree without committing to implementation.

Sections:

  • §2: Walk-through of the existing scanner I/O paths (AF_PACKET sender, half-wired PF_RING ZC) with concrete file:line citations.
  • §3: Architecture diagram, file layout, dispatch refactor (resolves the pre-existing wart where engine.c:165 hardcodes sender_thread and never invokes pfring_zc_sender_thread; see the dispatch-table sketch after this list), per-NIC AF_XDP setup (XSK per (NIC, queue_id), UMEM/ring sizing, ENA zero-copy quirks), build-system integration (USE_AF_XDP=1 mirroring USE_PFRING_ZC=1), CLI plumbing (--io-engine= flag).
  • §4: Dependency surface (libxdp-dev, libbpf-dev), Ubuntu 22.04 vs 24.04 caveats, runtime probe sequence, capability requirements (CAP_NET_RAW already present, CAP_BPF needs to be added to systemd).
  • §5: Test plan — synthetic veth loopback, unit-style harness, live c6in.metal bench, AF_PACKET regression.
  • §6: Risk register including ENA driver-reset history (amzn-drivers#221), libxdp version skew, ZC lower-half-channel constraint, AIMD ceiling coordination with anygpt-33.
  • §7: Effort estimate — 6-8 days over four small Phase 2 PRs.
  • §8: Rollout plan — feature-flag default off, manual canary on c6in.metal first, gradual default-on per tier.
  • §9: Open questions (deliberately not blocking this PR): where the C changes live (upstream PR vs AnyVM-Tech/anyscan-engine-c fork), libnuma optionality, SO_PREFER_BUSY_POLL, AIMD coordination with anygpt-33.
  • §10: Reference index — kernel docs, libxdp API, xdp-tutorial, Suricata AF_XDP, Cloudflare postmortems, amzn-drivers issues.
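For reference, a minimal sketch of the dispatch-table shape §3 proposes. The io_engine_vtable_t name comes from the plan; the other identifiers below are illustrative, not merged engine code:

```c
/* Sketch of the §3 io_engine dispatch table -- illustrative, not engine.c.
 * Thread entry points follow pthread's void *(*)(void *) shape. */
#include <stddef.h>
#include <string.h>

typedef struct io_engine_vtable {
    const char *name;                    /* matches the --io-engine= CLI value    */
    void *(*sender_thread)(void *arg);   /* per-NIC TX loop                       */
    void *(*receiver_thread)(void *arg); /* RX/dedup loop, NULL for TX-only paths */
} io_engine_vtable_t;

/* Existing and planned entry points; the optional engines are only compiled in
 * when USE_AF_XDP=1 / USE_PFRING_ZC=1 are set at build time. */
void *sender_thread(void *);             /* legacy AF_PACKET path */
void *recv_thread(void *);
#ifdef USE_PFRING_ZC
void *pfring_zc_sender_thread(void *);
#endif
#ifdef USE_AF_XDP
void *afxdp_sender_thread(void *);
#endif

static const io_engine_vtable_t io_engines[] = {
    { "af_packet", sender_thread, recv_thread },
#ifdef USE_PFRING_ZC
    { "pfring_zc", pfring_zc_sender_thread, NULL },
#endif
#ifdef USE_AF_XDP
    { "af_xdp", afxdp_sender_thread, NULL },
#endif
};

/* engine.c would look the engine up once at startup instead of hardcoding
 * sender_thread, which is what currently strands the PF_RING ZC path. */
static const io_engine_vtable_t *io_engine_lookup(const char *name)
{
    for (size_t i = 0; i < sizeof(io_engines) / sizeof(io_engines[0]); i++)
        if (strcmp(io_engines[i].name, name) == 0)
            return &io_engines[i];
    return NULL;
}
```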

LOC estimate

| Component | Est. LOC (C) |
| --- | --- |
| src/send-afxdp.c (new) | ~280 |
| src/recv-afxdp.c (new) | ~120 |
| include/xdp-defs.h (new) | ~60 |
| src/engine.c dispatch refactor (modify) | ~50 |
| include/scanner_defs.h, scanner.h | ~30 |
| src/conf.c CLI plumbing (modify) | ~30 |
| Makefile (modify) | ~10 |
| Total | ~580 |

In line with the brief's 450-500 ballpark; the extra ~80 covers the engine.c dispatch refactor that is a prerequisite for AF_XDP and incidentally fixes the never-invoked PF_RING ZC path.

Coordination

  • ✅ Stayed out of anyscan_rate_controller.py (anygpt-33 owns it).
  • ✅ Did not touch /etc/anyscan/runtime.env or anything ops-owned.
  • ✅ Did not touch the AnyGPT submodule pointer.
  • §6 risk register flags one item that needs anygpt-33 coordination during Phase 2 (AIMD ceiling parameter).

Verification

  • cargo build --workspace: clean (only pre-existing dead-code warnings on anyscan-api.rs).
  • cargo test --workspace: 437 passed, 0 failed, 4 ignored — matches the brief's baseline expectation.

This is a doc-only change, so the build/test verification is just confirming no accidental damage.

Out of scope (explicit)

  • Writing the AF_XDP C code (Phase 2).
  • Implementing the upstream fork decision (open question §9.1).
  • Bumping the AnyGPT submodule pointer.
  • Editing prod systemd units or runtime.env.

Reviewer ask

Please verify the plan is comprehensive enough that a Phase 2 worker can execute task-by-task without needing additional context, and call out any architectural choice that should be re-litigated before Phase 2 begins (especially §9.1 — where the C changes physically live).

🤖 Generated with Claude Code

Comprehensive design + dependency + LOC + test + risk + rollout plan for
adding an AF_XDP I/O path to the bundled C scanner. No scanner code is
changed; Phase 2 implementation is gated on user approval.

Motivation: anygpt-4 c6in.metal 4-NIC bench hit 12.8 Mpps aggregate,
gated by the AF_PACKET TX/syscall path (single socket caps ~3 Mpps even
with PACKET_QDISC_BYPASS). AF_XDP lets the ENA backplane (~100 Mpps
theoretical on c6in.metal) be the actual bottleneck.

Design highlights:
- New send-afxdp.c / recv-afxdp.c slot into the same shape as the
  existing USE_PFRING_ZC build flag (template files in upstream).
- Plan also resolves a pre-existing dispatch wart: engine.c hardcodes
  sender_thread, so the existing PF_RING ZC files compile but never
  run. Plan introduces a small io_engine_vtable_t that wires both
  paths cleanly.
- Per-NIC: one XSK socket per (ENI, queue_id), TX-only (rx=NULL),
  XDP_USE_NEED_WAKEUP, native zero-copy on ENA where channel index
  permits, SKB-mode fallback otherwise.
- Build: USE_AF_XDP=1 mirrors USE_PFRING_ZC=1; runtime opt-in via new
  --io-engine={af_packet,af_xdp,pfring_zc} flag (default af_packet).
- Estimated ~580 LOC C (450 net new, 130 in modified files), 6-8 days
  of implementation effort over four small PRs.

Phase 2 is explicitly out of scope for this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e93252435f



```bash
apt-get install -y --no-install-recommends \
  libxdp1 libbpf1 libelf1 libz1
```

P2: Use installable runtime package names

The runtime install command includes libelf1 and libz1, but on Ubuntu 24.04 (the baseline this plan recommends in §4.1) those package names are not available (libelf1t64/zlib1g are). If this snippet is followed in Phase 2 bootstrap scripts, apt-get install will fail and block AF_XDP rollout on the target AMI, so the package list should be corrected or made distro-conditional.


@skullcrushercmd Contributor Author

Good catch — this was a real bug that would have broken Phase 2 bootstrap on the recommended Ubuntu 24.04 baseline. Fixed in af90ff0.

Specifically:

  • libz1 → zlib1g (the codex callout is right; libz1 does not exist on any Debian-family distro I can find).
  • libelf1 → libelf1t64 on Noble, retained as libelf1 for Jammy/bookworm (per the Ubuntu 64-bit time_t transition).
  • libxdp1 and libbpf1 are unchanged across Jammy/Noble/bookworm — verified they're not on the t64 rename list.

§4.2 now provides two separate apt-get install lines (Noble baseline + older-LTS fallback) and a note on the t64 rename so Phase 2 doesn't re-discover this. Phase 2 will pick the right line based on lsb_release -rs at install time rather than copy-pasting blindly.

The runtime install snippet in §4.2 used `libelf1 libz1`, but on Ubuntu
24.04 (the baseline §4.1 recommends) `libelf1` was renamed to
`libelf1t64` as part of the 64-bit time_t transition, and `libz1` was
never a real Debian-family package name (the runtime zlib package is
`zlib1g`). Phase 2 bootstrap scripts copying this snippet verbatim
would have hit `apt-get install: Unable to locate package`.

Provides separate, correct package lists for the Ubuntu 24.04 baseline
and the Ubuntu 22.04 / Debian bookworm fallback, plus a note on the
t64 rename so Phase 2 doesn't re-discover this.

Addresses chatgpt-codex-connector review on PR #65.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@skullcrushercmd merged commit 551d1f4 into main Apr 27, 2026
@skullcrushercmd deleted the perf/portscan-afxdp-plan branch April 27, 2026 19:08

@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: af90ff00e5


Comment on lines +171 to +173
| `bind_flags` | `XDP_USE_NEED_WAKEUP` | Standard kernel-side config; `xsk_ring_prod__needs_wakeup` decides when to `sendto`. |
| `xdp_flags` | `XDP_FLAGS_DRV_MODE` `\| XDP_ZEROCOPY` | Native zero-copy on ENA (driver supports it; see §3.5). Falls back to `XDP_FLAGS_SKB_MODE` (generic) if ENA refuses ZC for this queue. |



P1: Move zero-copy flag to AF_XDP bind flags

This plan puts XDP_ZEROCOPY under xdp_flags, but in AF_XDP the zero-copy/copy mode is selected via socket bind flags (sxdp_flags), not XDP program-attach flags. If Phase 2 implements this literally (the same placement is echoed later in the fallback probe section), AF_XDP setup can fail on otherwise supported hosts because the wrong flag set is passed at socket/program setup time. Keep xdp_flags for driver/SKB attach-mode selection and put the zero-copy policy in bind_flags (XDP_ZEROCOPY or XDP_COPY).
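For illustration, a minimal xsk_socket__create() call with the flags placed the way this comment suggests. The if_xdp/if_link constants and the libxdp call are real; the surrounding variable names are placeholders, not the plan's code:

```c
/* Sketch of the suggested flag placement: zero-copy policy in bind_flags,
 * xdp_flags only selecting how the XDP program is attached. Placeholder names. */
#include <linux/if_link.h>   /* XDP_FLAGS_DRV_MODE, XDP_FLAGS_SKB_MODE */
#include <linux/if_xdp.h>    /* XDP_ZEROCOPY, XDP_COPY, XDP_USE_NEED_WAKEUP */
#include <xdp/xsk.h>         /* libxdp: xsk_socket__create, xsk_socket_config */

static int try_bind_zerocopy(struct xsk_umem *umem, const char *ifname,
                             unsigned int queue_id, struct xsk_ring_prod *tx,
                             struct xsk_socket **out)
{
    struct xsk_socket_config cfg = {
        .rx_size    = 0,                              /* TX-only socket           */
        .tx_size    = XSK_RING_PROD__DEFAULT_NUM_DESCS,
        .xdp_flags  = XDP_FLAGS_DRV_MODE,             /* program attach mode only */
        .bind_flags = XDP_USE_NEED_WAKEUP | XDP_ZEROCOPY, /* ZC policy lives here */
    };

    /* rx == NULL for a TX-only socket. If the driver refuses zero-copy this
     * returns an error and the caller retries with XDP_COPY, then SKB mode. */
    return xsk_socket__create(out, ifname, queue_id, umem, NULL, tx, &cfg);
}
```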


@skullcrushercmd Contributor Author

Deployed to prod ✅

Prod redeploy

Deployed `2026-04-27 19:11 UTC`
Source HEAD `origin/main` @ `551d1f48` (covers #65, #66)
Build `cargo build --release --locked --bin anyscan-api --bin anyscan-worker` → 1m 4s
anyscan-api `1f3af11f…` → `00b4b83b…` (PID 3236925)
anyscan-worker `3f77e19e…` (unchanged — PR #66 doesn't touch worker rust source)
Old binaries preserved at `/opt/anyscan/bin/anyscan-{api,worker}.pre-pr66-deploy.bak`
Public site `HTTP 200 | 10ms | 61107b` ✓
Wedge-sweep janitor startup line confirmed

The api binary sha changed because PR #66's edits to anyscan_rate_controller.py, vulnscanner-zmap-adapter.py, and runtime.worker.env.template flow into the api binary via include_bytes! in HOSTED_AGENT_BUNDLE_ASSETS. Asset audit clean (no install-line-vs-asset-list mismatch this round either).

Fresh bundle: agent-bundle-linux-x86_64__20260427191153-3236925-5dd517c87d76.tar.gz

Size 17162815 bytes
Fingerprint 5dd517c87d76

Required content all confirmed in tar -tzf:

  • extensions/anyscan_rate_controller.py
  • extensions/portscan-adapter.py
  • env/runtime.env.template
  • bin/tune-scanner-host.sh
  • bin/reserve-control-bandwidth.sh

PR #66 plumbing verified inside the bundle

# anyscan_rate_controller.py:180-187
cpu_pressure = cpu_saturated and heartbeat_slip
if not cpu_pressure and not network_pressure:
    …
if cpu_pressure and not network_pressure:
    …                       # local CPU starvation — don't rate-cut
if network_pressure and not cpu_pressure:
    …                       # genuine network slip — rate-cut

Plus # survives even partial windows in the calibration writer (line 838) and ANYSCAN_RATE_MAX_CONCURRENT_SUBPROCESSES referenced in portscan-adapter.py (lines 47, 846) and runtime.env.template:76.

Bundle endpoint serves the freshly-built artifact

$ curl -fsSL "https://scan.anyvm.tech/api/agent/install.sh?rebuild=false&platform=linux-x86_64" | grep BUNDLE_NAME
BUNDLE_NAME='agent-bundle-linux-x86_64__20260427191214-3236925-5dd517c87d76.tar.gz'

Worker remote-update — one alive worker

The auto-recreated fleet worker (anyscan-ec2-worker, i-0b94844f5ace75d28 at 44.203.214.161) was alive and already running a post-#66 bundle from its fresh bootstrap. Remote-update fired against it cleanly:

| | Pre | Post |
| --- | --- | --- |
| agentd sha | a786750834… | a786750834… (same — PR #66 didn't touch worker source) |
| AGENT_BUNDLE_NAME | …191248-…5dd517c87d76 | …191309-…5dd517c87d76 |
| Service | active | active |

ANYSCAN_RATE_MAX_CONCURRENT_SUBPROCESSES=4 confirmed in /etc/agentd/runtime.env — PR #66's install-time default fired correctly. So the next 8-NIC metal launch will only run 4 shards by default, exactly as the deploy note said.

Note for the next bench cycle

When the user authorizes another c6in.metal launch and an 8-shard CPU-pressure handling test, the operator can override:

echo 'ANYSCAN_RATE_MAX_CONCURRENT_SUBPROCESSES=8' >> /etc/agentd/runtime.env
systemctl restart agentd

…then re-run the same bench shape to confirm the CPU-vs-network slip distinction handles the regressed case from the prior bench (8-NIC at 1.34M aggregate). Expectation: AIMD's cpu_pressure branch should not rate-cut on heartbeat lag when CPU is the cause, so per-NIC pps shouldn't collapse to 167k.

Out of scope per spec

skullcrushercmd added a commit that referenced this pull request Apr 27, 2026
…rk (#67)

Phase 2 PR 1 of 4 of the AF_XDP integration plan (PR #65 §9.1) ships a
refactor of the scanner C source (engine.c dispatch table + --io-engine
CLI flag + PF_RING ZC dispatch fix) which lives in a fork of the third-party
upstream scanner repository:

  - Upstream:        github.com/Lorikazzzz/VulnScanner-zmap-alternative-
  - Fork:            github.com/AnyVM-Tech/anyscan-engine-c
  - Phase 2 PR 1 commit on the fork:
      AnyVM-Tech/anyscan-engine-c@998c66b on
      branch perf/portscan-afxdp-phase2-pr1

Why fork: the plan §9.1 calls out that the upstream scanner is third-party
and proposes a fork under AnyVM-Tech as the resting place for the
integration patches (AF_XDP send/receive paths in PRs 2 + 3, build
integration in PR 4, and follow-on PF_RING ZC cluster init).

This commit only updates the AnyScan-side scripts to resolve from the new
fork:

  - install-external-deps.sh:11-12 — clone URL and local checkout dir now
    default to the AnyVM-Tech fork. Both can still be overridden via the
    existing ANYSCAN_VULNSCANNER_REPO_URL / ANYSCAN_VULNSCANNER_REPO_DIR
    environment variables (no behaviour change for callers that set them).
  - package-worker-bundle.sh:519-525 — preferred lookup order is now
    `anyscan-engine-c/scanner` first, the legacy
    `VulnScanner-zmap-alternative-/scanner` directory second (kept for
    transitional dev checkouts), and `/opt/anyscan/bin/scanner` last.

What is NOT in this PR:
  - The actual AF_XDP send/receive paths (PR 2 + 3 of Phase 2).
  - The Makefile / install-external-deps.sh `USE_AF_XDP=1` build flag
    plumbing (PR 4 of Phase 2).
  - Live c6in.metal benchmarks (PR 5 of Phase 2).
  - AnyGPT submodule pointer bump.
  - Any change to runtime.env or to the AIMD rate controller.

Test plan:
  - `cargo build --workspace` (release) — clean.
  - `cargo test --workspace --no-fail-fast` — 437 tests pass (matches
    post-#66 baseline: 371 + 31 + 2 + 33).
  - `python3 -m py_compile vulnscanner-zmap-adapter.py` — clean.
  - On the scanner fork:
      - `make` (default AF_PACKET) — builds.
      - `make test` — 11 dispatch smoke tests pass.
      - `gcc -fsyntax-only -DUSE_PFRING_ZC ...` — compiles, dispatch reaches
        the ZC thread bodies.
      - `./scanner --io-engine=af_xdp` exits 1 with a clear "USE_AF_XDP=1
        not set; AF_XDP send/receive paths land in PRs 2 + 3" message.
      - `./scanner --io-engine=pfring_zc` (without USE_PFRING_ZC) exits 1
        with the equivalent compile-flag error.
      - `./scanner --io-engine=bogus` exits 1 with "Unknown --io-engine".

Refs: AnyVM-Tech/AnyScan PR #65, plan §3.1 + §3.3 + §9.1.

Co-authored-by: AnyVM-Tech AO <agent@anyvm.tech>
@skullcrushercmd Contributor Author

Phase 2 — c6in.metal live bench (anygpt-42)

Live bench on freshly-deployed AF_XDP build (PR #71 + engine-c PR #3) on c6in.metal (128 vCPU, 8 ENIs). Driven by anygpt-42; replaces the wedged anygpt-4. Engine commit f1288d6, AnyScan commit 989c44e, scanner: /opt/agentd/bin/scanner (60,992 B stripped, libxdp.so.1 / libbpf.so.1 / libelf.so.1 linked).

Headline

| Config | Aggregate peak | Aggregate avg | vs prior baseline |
| --- | --- | --- | --- |
| AF_PACKET 8-NIC, threads=8 | 7.49 M pps | | 0.87× baseline 8.58 M (regression check) |
| AF_XDP 1-NIC ens1, threads=4 | 6.40 M | 5.83 M | 2.02× AF_PACKET 1-NIC 3.16 M |
| AF_XDP 8-NIC, cap=4, threads=8 | 22.43 M | 19.20 M | 2.66× AF_PACKET 8-NIC baseline ✅ |

Per-config wall + tx_dropped:

| Config | Wall | Per-NIC peak (pps) | tx_dropped | Notes |
| --- | --- | --- | --- | --- |
| AF_PACKET 8-NIC, t=8, cooldown=2 | 17.92 s | ~936 K | 0 | Counters via /sys/class/net/.../tx_packets (kernel TX ring) |
| AF_XDP 1-NIC ens1, t=4, c=2 | 25.54 s | 6.40 M | 0 | drv+copy mode (zerocopy Operation not supported on ENA at this kernel) |
| AF_XDP 8-NIC cap=4, t=4 | 29.02 s | ~1.55 M | 0 | Cap=4 design: 4 simultaneous scanners, each 4 sender threads = 16 active threads |
| AF_XDP 8-NIC cap=4, t=8 | 20.26 s | ~2.80 M | 0 | Best — 32 active threads, all within 128-core capacity |
| AF_XDP 8-NIC cap=4, t=16 (combined=16) | 21.46 s | ~2.85 M | 0 | No further gain — bottleneck is per-NIC, not thread count |
| AF_XDP 8-NIC cap=8, t=4 | 26.28 s | ~805 K | 0 | Regression — 32 sockets fight for memory bandwidth, cap=4 is the sweet spot |

Live vs synthetic projections (PR #65 §10)

| Comparison | Synthetic projection | Live | Verdict |
| --- | --- | --- | --- |
| AF_XDP single-NIC speedup over AF_PACKET 1-NIC | ~10–12 M (3.5× baseline) | 6.40 M (2.02× baseline) | Lower than projected; ENA forces drv+copy (not driver-mode zerocopy), per-thread copy budget caps throughput |
| AF_XDP 8-NIC cap=4 aggregate | 30–50 M (toward 14 M / ENI ENA spec) | 22.43 M peak / 19.2 M avg | Below projection but ~2.66× AF_PACKET 8-NIC baseline; ENA xdp drv+copy per-NIC ceiling appears to be ~3 M, not 14 M |
| AF_PACKET 8-NIC regression check | 8.58 M baseline holds | 7.49 M (87 %) | Within ~13 % of baseline; jitter likely from cooldown-time=2 per-shard tail rather than algorithmic regression |

Setup notes (operational findings the plan should fold in)

  1. MTU constraint: ENA driver rejects XDP attach at MTU=9001 (jumbo frames default). Bench had to lower all 8 ENIs to MTU=3498 (ip link set dev <iface> mtu 3498). Plan §6 should add this as a worker-bench host-prep step.
  2. Queue space: ENA also rejects XDP attach when combined-queue count is at hardware max (32 on c6in.metal). Setting ethtool -L <iface> combined 8 (or 16) freed up XDP TX queue slots. Plan §6 should add this too.
  3. Mode ladder: scanner correctly walks drv+zerocopy → drv+copy → skb. ENA on kernel 6.12.74 supports only drv+copy; zerocopy was tested across all 8 NICs and rejected with Operation not supported. Driver-side zerocopy patches in newer kernels (6.16+ ena_xdp_zc, in-flight upstream) would close the projection gap.
  4. /sys/class/net/.../statistics/tx_packets does NOT count XDP TX on ENA — counters appeared as 0 (or ~negative due to clock skew) for AF_XDP runs. Bench harness switched to scanner self-reported pps for AF_XDP rows. PR #65 §11 (telemetry) should call out this counter caveat.
  5. PACKET_FANOUT errno 22 on AF_PACKET path — visible in scanner stderr; affects RX dedup only, doesn't affect TX bench numbers but should be investigated separately.
  6. Cap=8 regression is real and worth an instrumented note in the plan: AF_XDP umem (16 MiB × 32 sockets = 512 MiB resident) plus simultaneous send-thread descriptor churn evicts cache lines; the rate-controller should keep cap=4 as the multi-NIC default on ≤128-core hosts. (A sketch of the per-socket UMEM sizing follows this list.)
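For reference, a minimal sketch of how a 16 MiB, 2048-byte-frame UMEM like the one sized above would be set up with libxdp. The xsk_umem__create call and config fields are real; the helper shape and ring sizes are illustrative, taken from the bench log rather than the engine source:

```c
/* Sketch only: one 16 MiB UMEM (8192 frames x 2048 B) per XSK socket, matching
 * the "umem=16 MiB ... frames x 2048 B" figure above. Not the engine's code. */
#include <stdlib.h>
#include <xdp/xsk.h>

#define FRAME_SIZE 2048
#define NUM_FRAMES 8192            /* 8192 * 2048 B = 16 MiB resident per socket */

static struct xsk_umem *make_umem(struct xsk_ring_prod *fill,
                                  struct xsk_ring_cons *comp)
{
    void *buf = NULL;
    size_t len = (size_t)NUM_FRAMES * FRAME_SIZE;

    /* UMEM memory must be page-aligned. */
    if (posix_memalign(&buf, 4096, len) != 0)
        return NULL;

    struct xsk_umem_config cfg = {
        .fill_size      = 4096,    /* ring sizes as reported in the bench log */
        .comp_size      = 4096,
        .frame_size     = FRAME_SIZE,
        .frame_headroom = 0,
    };

    struct xsk_umem *umem = NULL;
    if (xsk_umem__create(&umem, buf, len, fill, comp, &cfg) != 0) {
        free(buf);
        return NULL;
    }
    return umem;
}
```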

Bench harness

Custom bash harness (not via the AnyScan API/adapter path) — direct /opt/agentd/bin/scanner invocations, one subprocess per ENI, with --shards i/N sharding the target range. Captures /sys/class/net/<iface>/statistics/{tx_packets,tx_dropped} pre/post for AF_PACKET, and parses the scanner's own Mp/s avg self-reports for AF_XDP. Target 198.18.0.0/15 (RFC2544 benchmark range, IGW-dropped sink) ports 1-1024 = 134 M probes total per run, sharded 1/8 each. Cooldown 2 s. Rate cap 99999999 (effectively uncapped). All runs returned tx_dropped=0 — no kernel TX-ring overflow. Logs preserved in /tmp/anygpt-42-bench/ on the metal until termination.

Out of scope (per task instructions)

  • No api/adapter-driven bench (rate-controller per-window classification, heartbeat_jitter not captured) — direct scanner invocation traded these for a smaller measurement window. Future bench should drive via port_scan API on metal-afxdp-bench-1 worker so the adapter's classifier emits its histogram.
  • No AnyGPT submodule bump.
  • No /0 or production-scope scan.

Cleanup

  • c6in.metal i-0958c76a9ba1a0483 — terminated.
  • 7 secondary ENIs (eni-06c639cf.., eni-0b6da590.., eni-0ad37ebf.., eni-098e0e96.., eni-0815d240.., eni-04459a62.., eni-065de86f..) — deleted post-termination.
  • anyscan-ec2-worker-manager.service — restarted (will respawn the standard xlarge fleet from the AF_XDP bundle published in the PR #70 deploy).

@skullcrushercmd Contributor Author

Phase 2 — c6in.metal 15-NIC live bench (anygpt-48): adding NICs and a 6.19 kernel fail to close the gap

Live follow-on to anygpt-42's 8-NIC bench (issuecomment-4336192354). Same 198.18.0.0/15 × ports 1-1024 = 134 M-probe target; same custom bash harness (one scanner subprocess per ENI, --shards i/N). c6in.metal launched via tools/ec2_worker_manager.py once with ANYSCAN_EC2_INSTANCE_TYPE=c6in.metal ANYSCAN_MAX_ENIS=15 after stopping anyscan-ec2-worker-manager.service and terminating the existing xlarge fleet.

TL;DR

The 30–50 M-pps synthetic projection from this PR's plan §10 still does not hold on AWS c6in.metal in 2026-04. PR #73 (kernel backport opt-in) and PR #74 (15-ENI launch path) wire the knobs cleanly, but the underlying premises — "kernel 6.16+ unlocks ena_xdp_zc" and "more ENIs unlock more PCIe trees" — both fail in production for the reasons documented inline below. The 8-NIC cap=4 t=8 22.43 M-peak number from anygpt-42 remains the c6in.metal AF_XDP ceiling.

Headline

| Config | Aggregate peak | Aggregate avg | vs anygpt-42 8-NIC baseline |
| --- | --- | --- | --- |
| AF_PACKET 8-NIC, T=8 (anygpt-42) | 7.49 M | — | (the baseline) |
| AF_XDP 8-NIC cap=4, T=R=8 (anygpt-42, best) | 22.43 M | 19.20 M | 2.66× the AF_PACKET baseline ✅ |
| AF_PACKET 15-NIC, T=8 (anygpt-48) | 12.96 M | 7.69 M | 1.03× AF_PACKET 8-NIC — flat regression-check |
| AF_XDP 15-NIC, T=R=4 (apples-to-apples thread budget) | 8.56 M | 5.18 M | 0.27× the 22.43 M peak — sharp regression ❌ |
| AF_XDP 15-NIC, T=R=8 (matching anygpt-42 per-NIC config) | 12.18 M | 8.65 M | 0.54× peak / 0.45× avg — still regressed ❌ |
| AF_XDP 15-NIC drv+zerocopy (Bench C) | UNRUNNABLE | | ena_xdp_zc still not upstream as of Linux 6.19.11 |

Bench harness wall + tx_dropped

| Config | Wall | Per-NIC peak | tx_dropped | Notes |
| --- | --- | --- | --- | --- |
| AF_PACKET 15-NIC, T=8, c=2 | 19.05 s | ~469 K | 0 | /sys/class/net/<if>/statistics/tx_packets deltas; perfectly balanced 8.95 M packets per NIC across all 15 (sharding good) |
| AF_XDP 15-NIC, T=R=4, c=2 | 29.08 s | 0.42–0.78 M | n/a | drv+copy fallback; clear card asymmetry — card-1 NICs 0.66–0.78 M peak, card-0 NICs 0.42–0.44 M |
| AF_XDP 15-NIC, T=R=8, c=2 | 18.46 s | 0.72–1.04 M | n/a | drv+copy fallback; 240 active threads on 128-core (oversubscription); per-NIC peak ~3.9× lower than anygpt-42's 2.80 M @ 8-NIC |

Why it regressed: three live findings on top of the prior anyscan_afxdp_ena_constraint memory

1. PR #74's c6in.metal NetworkCard fixture is incorrect

The PR's test fixture and docstrings claim:

NetworkCards = [
  {NetworkCardIndex: 0, MaximumNetworkInterfaces: 5},  (primary)
  {NetworkCardIndex: 1, MaximumNetworkInterfaces: 4},
  {NetworkCardIndex: 2, MaximumNetworkInterfaces: 3},
  {NetworkCardIndex: 3, MaximumNetworkInterfaces: 3},
]

Live aws ec2 describe-instance-types --instance-types c6in.metal --region us-east-1:

TopLevel MaximumNetworkInterfaces = 16
NetworkCards:
  card 0: max_nics=8, perf=Up to 170 Gigabit
  card 1: max_nics=8, perf=Up to 170 Gigabit
total via cards = 16

So the live launch placed 15 ENIs as 8/7 across 2 cards, not 5/4/3/3 across 4. PR #74's tools/test_ec2_worker_manager.py::test_max_enis_15_on_c6in_metal_emits_15_network_interfaces (and the docs at tools/ec2_worker_manager.py:121-125) were authored against a synthetic mock that doesn't match AWS's reality. The 40-test unit suite passes against the mock; the launch-path code itself is fine — it's the verification that's mocking the wrong shape. Suggest: refresh the fixture to 2 cards × 8 + add an integration test that asserts against aws ec2 describe-instance-types output for the actual instance type.

The "more PCIe trees" justification in PR #74's commit body (and §6.1 of the plan doc) therefore over-promises: c6in.metal has 2 trees, not 4. That alone caps the multi-NIC headroom at ~2× single-tree, not ~4×.

2. PR #73's bookworm-backports source is the wrong suite for the current AMI

The PR defaults to ANYSCAN_KERNEL_BACKPORT_SUITE=bookworm-backports and ANYSCAN_KERNEL_BACKPORT_PACKAGE=linux-image-cloud-amd64. The current ANYSCAN_EC2_AMI_ID=ami-06e3e2b7faca0265d is Debian 13 (Trixie), not Debian 12 (Bookworm) — so bookworm-backports/linux-image-cloud-amd64 is at version 6.12.74-2~bpo12+1, which is the same kernel version the metal already runs. The opt-in completes successfully ("0 upgraded, 0 newly installed"), so the operator gets a green light without ever moving off 6.12.74.

For this bench I worked around it by apt-get install -t trixie-backports linux-image-amd64, which pulls 6.19.11-1~bpo13+1. Suggest: the install path should detect /etc/os-release ID=debian VERSION_CODENAME=trixie and switch the suite to trixie-backports. The package selection should also note that linux-image-cloud-amd64 from trixie-backports is currently still 6.12.74 — only the non-cloud linux-image-amd64 jumps to 6.19.

3. The big one: ena_xdp_zc still hasn't landed upstream as of Linux 6.19.11

Post-reboot probe:

$ uname -r
6.19.11+deb13-cloud-amd64

$ nm /lib/modules/.../ena.ko.xz | grep -iE 'xsk|_zc|zerocopy'
                 U xdp_convert_zc_to_xdp_frame   ← only undefined import (generic XDP)

$ strings /lib/modules/.../ena.ko.xz | grep -iE 'xsk|af_xdp_zc|XDP_ZEROCOPY|xsk_pool|xsk_buff'
(no matches)

Compare with mlx5_core.ko which has dozens of xsk_* and _zc symbols. The ENA driver has standard XDP_TX/REDIRECT/PASS/DROP paths but no driver-side zerocopy/XSK pool support — exactly what the in-flight upstream patches were supposed to add for "6.16+".

Live confirmation from the scanner's mode ladder when attaching XDP on ens1:

[*] afxdp: xsk_socket__create(ens1, q=0, mode=drv+zerocopy) failed: Operation not supported
[*] afxdp: xsk_socket__create(ens1, q=1, mode=drv+zerocopy) failed: Operation not supported
[*] afxdp: xsk_socket__create(ens1, q=2, mode=drv+zerocopy) failed: Operation not supported
[*] afxdp: xsk_socket__create(ens1, q=3, mode=drv+zerocopy) failed: Operation not supported
[*] afxdp: thread 0 bound ens1 queue 0 in mode=drv+copy (umem=16 MiB, tx=4096 rx=4096 frames × 2048 B)

Identical Operation not supported to anygpt-42 on the 6.12.74 kernel. The kernel backport is not a workaround for the ENA zerocopy gap on AWS today. The plan's "wait for kernel 6.16+ AMI" path needs to be revised — the patches still aren't merged. The viable paths to unlock zerocopy continue to be (a) Mellanox/mlx5 NICs on non-AWS bare metal, or (b) PF_RING ZC with a paid ntop license on AWS (PR #75 is currently engine-init-stub, so that path is not yet usable either).

4. Bonus: PR #74's multi-ENI launch path skips public-IP allocation

A single-NIC launch on this subnet auto-assigns a public IP because MapPublicIpOnLaunch=true on the subnet. The multi-ENI path (launch_args["NetworkInterfaces"] = build_network_interfaces(...)) does not set AssociatePublicIpAddress=True on the primary interface, so the launched metal had no public IP and was unreachable from outside the VPC. I worked around it by aws ec2 allocate-address + associate-address on the primary ENI. Suggest: when eni_attach['attached'] == 1 (single-ENI fallback even with the new path) the launch should set AssociatePublicIpAddress: True on the primary entry; for multi-ENI, the operator may still want the same on DeviceIndex=0, NetworkCardIndex=0.

Per-NIC detail (T=R=8 run, the better of the two AF_XDP runs)

  enp13s0   peak=0.74M avg=0.60M
  enp154s0  peak=0.91M avg=0.56M
  enp155s0  peak=0.90M avg=0.56M
  enp156s0  peak=0.89M avg=0.56M
  enp157s0  peak=0.81M avg=0.56M
  enp158s0  peak=0.83M avg=0.56M
  enp159s0  peak=0.77M avg=0.56M
  enp15s0   peak=0.74M avg=0.60M
  enp160s0  peak=0.90M avg=0.56M
  ens1      peak=0.73M avg=0.60M
  ens2      peak=0.72M avg=0.56M
  ens3      peak=1.04M avg=0.60M
  ens4      peak=0.72M avg=0.60M
  ens5      peak=0.74M avg=0.60M
  ens7      peak=0.74M avg=0.60M
AGGREGATE   peak=12.18M  avg=8.65M

Per-NIC numbers are roughly uniform within each card domain (enp154-160 cluster around 0.83 M, ens*+enp1[35]s0 around 0.75 M). The drop from 2.80 M per-NIC (anygpt-42, 8 NICs, T=R=8) to ~0.81 M (anygpt-48, 15 NICs, T=R=8) is the CPU-thrashing signature of 240 active threads on 128 cores — confirms that per-host AF_XDP throughput on c6in.metal is CPU-bound under drv+copy, not NIC-bound. Adding NICs without unlocking zerocopy doesn't help because the bottleneck moves to descriptor-copy CPU work, which scales with thread count regardless of NIC count.

Setup notes (delta from anygpt-42)

  1. Bundle: agent-bundle-linux-x86_64__anygpt-48-afxdp-kbackport-20260428174058.tar.gz (sha256 775cf88e1a3434846a038f210c1aa841e10efa215fbcb689cef7d6db34d930c0) built via package-worker-bundle.sh ANYSCAN_USE_AF_XDP=1 ANYSCAN_INSTALL_KERNEL_BACKPORT=1 ANYSCAN_USE_PFRING_ZC=0. PF_RING off per task brief (PR #75 engine-init stub).
  2. Metal i-043714fbb73cca641, AZ us-east-1a, EIP 54.165.21.227. Primary ENI eni-0c47a4a7f3ba69511.
  3. SSH wedge after systemctl reboot was longer than expected (~6 min), but came back into 6.19 cleanly. EC2 console showed boot reaching cloud-final.service then continuing into a normal start. No grub reorder needed; linux-image-amd64 set itself as the default.
  4. AF_XDP host prep was the same as anygpt-42: ip link set <if> mtu 3498 + ethtool -L <if> combined 8 (matched RSS to T=R=8) on every ENI before bench B.
  5. /sys/class/net/<if>/statistics/tx_packets continues to NOT count XDP TX on ENA — bench-A used kernel-counter deltas; bench-B used scanner self-reports (send: …M p/s).
  6. AF_PACKET path was perfectly balanced (each of 15 ENIs sent exactly 8,947,8xx packets ± 60), confirming the harness's --shards i/15 distribution is sound.

Summary verdict on the env-knob PRs

| PR | Wired correctly? | Closes 22.43 M → 30–50 M gap? |
| --- | --- | --- |
| #71 (AF_XDP build wireup) | ✅ scanner ships with libxdp linkage; mode ladder fires; falls back to drv+copy | n/a — was never claimed to |
| #73 (kernel backport opt-in) | ⚠️ wired correctly but defaults to wrong suite for current AMI; even on 6.19.11 from trixie-backports, ENA still has no zerocopy support | No — ena_xdp_zc not upstream as of 6.19.11 |
| #74 (15-ENI launch path) | ⚠️ launches 15 ENIs successfully, but fixture claims 4 cards × (5/4/3/3) when reality is 2 cards × 8; also drops public-IP allocation on multi-ENI path | No — adding ENIs at drv+copy is CPU-bound; 15-NIC regressed vs 8-NIC at every T=R config tested |
| #75 (PF_RING ZC build wireup) | (intentionally not exercised — engine-init stub blocks runtime) | (not testable yet) |
| #76 (PF_RING gating) | (not exercised — use_pfring_zc=0 in bundle) | (n/a) |

Cleanup

  • c6in.metal i-043714fbb73cca641 — terminated 18:32:08Z (live for ~36 min total, ~$3.30 in on-demand spend).
  • 15 ENIs (eni-0c47a4a7f3ba69511, eni-049606866a71c9d87, eni-0c9c10e4dc28adde6, eni-0a6a17a3e724ff90b, eni-0acf00ef684fffb47, eni-02dc9e888c93d5a33, eni-0aad977041b58b791, eni-087e24d177d27ee7b, eni-03da977940c9bb8ef, eni-07cb100d30af9c73a, eni-06afba14a7dde0c93, eni-0d21500d7a1f5d641, eni-0c35ff242e90b7a59, eni-06c8680f516ae4ee6, eni-09746474722608a97) — auto-deleted on termination (DeleteOnTermination=true was the default).
  • EIP eipalloc-023db7e0b15246cb7 (54.165.21.227) — released.
  • .external-runtime.env restored to c6in.xlarge (no ANYSCAN_MAX_ENIS); anyscan-ec2-worker-manager.service restarted; replacement xlarge i-0778d6f698047418f already running.
  • Bench logs preserved at scan.anyvm.tech:/root/.worktrees/AnyGPT/anygpt-48/anygpt-48-bench-logs.tar.gz.

Out of scope (per task brief)

cc PR #73 / PR #74 — flagging the fixture and suite-default issues as separate follow-ups rather than blocking the env-knob PRs that are otherwise wired correctly.

skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
Phase 1 design document for adding a DPDK io_engine to the bundled C
scanner (AnyVM-Tech/anyscan-engine-c). Mirrors PR #65's AF_XDP plan
structure across §1-§10.

Why now: PR #65's AF_XDP work landed but the c6in.metal bench revealed
ENA on kernel <=6.12.74 forces drv+copy (not drv+zerocopy), capping the
8-NIC ceiling at ~22 M pps — short of the 30-50 M pps projection. DPDK
via vfio-pci bypasses the ENA kernel driver entirely, projecting
50-100 M pps realistic on c6in.metal.

This supersedes PR #63's deferral recommendation (which was conditioned
on AF_XDP clearing the throughput target — it did not).

Plan scope:
- engine repo: ~1,100 LOC (send-dpdk.c, recv-dpdk.c, dpdk-eal.c,
  dpdk-defs.h, vtable slot in engine.c, USE_DPDK Makefile block)
- AnyScan-side wire-up: ~765 LOC (mirrors PR #71's ANYSCAN_USE_AF_XDP
  pattern across install-external-deps.sh / package-worker-bundle.sh /
  deploy.sh / runtime.worker.env.template / adapter.py + new
  tools/setup-dpdk.sh for hugepages and vfio-pci bind/unbind)
- NIC-binding decision: dedicated-DPDK-NIC pattern. eth0 stays on
  kernel for agentd heartbeat; ENIs eth1..eth7 (c6in.metal) go to
  vfio-pci. Single-NIC instances are DPDK-ineligible by design.
- Effort: 12-15 days implementation + canary, ~3-4 weeks total.

Phase 2 implementation is gated on user/orchestrator approval after
this plan PR merges. No engine C code, no runtime config, no submodule
bumps in this PR.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
… 4x5/4/3/3=15) (#77)

PR #74 mocked NetworkCards as 4 cards distributed 5/4/3/3=15, but actual
AWS DescribeInstanceTypes for c6in.metal returns 2 cards x 8 = 16
(anygpt-48 live bench, PR #65 issuecomment-4338158487). The launch path
code is fine - distribute_enis_across_cards handles any card layout -
but the synthetic test fixture and the docstring example encoded a
shape that doesn't match production AWS.

Refresh the fixture, the docstring, and every test that hardcoded
15-derived numbers. Add a new RecordedDescribeInstanceTypesIntegrityTests
class that anchors the fixture against
tools/c6in_metal_describe_instance_types.json (a real
`aws ec2 describe-instance-types` capture) so future drift gets caught
at unit-test time instead of bench time.

Effect on capacity claim: c6in.metal has 2 PCIe trees, not 4, so the
multi-NIC headroom caps at ~2x single-tree, not ~4x.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
…launches (#79)

When the launch payload uses an explicit NetworkInterfaces[] list
(ANYSCAN_MAX_ENIS set), AWS does NOT honor the subnet's
MapPublicIpOnLaunch — the operator has to opt in by setting
AssociatePublicIpAddress=True on the primary ENI explicitly.

In anygpt-48 (PR #65 issuecomment-4338158487) this caused the
c6in.metal launch to come up unreachable from outside the VPC; the
operator had to manually allocate-address + associate-address
post-launch as a workaround.

Add an opt-in env knob ANYSCAN_EC2_ASSOCIATE_PUBLIC_IP (default off so
existing fleets are unchanged). When set, plumb through ManagerConfig
to build_network_interfaces, which sets AssociatePublicIpAddress=True
on the entry with DeviceIndex=0 NetworkCardIndex=0 only — AWS rejects
the field on secondaries.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
ANYSCAN_KERNEL_BACKPORT_SUITE defaulted to bookworm-backports
regardless of host. On the current Debian 13 (Trixie) AMI this means
bookworm-backports/linux-image-cloud-amd64 resolves to 6.12.74 — the
same kernel the metal already runs — so the opt-in completes "0
upgraded, 0 newly installed" and the operator gets a false green
light without ever upgrading.

Detect /etc/os-release VERSION_CODENAME and default the suite to
<codename>-backports. Switch the default package to linux-image-amd64
(NOT linux-image-cloud-amd64) on non-bookworm suites, because
trixie-backports cloud-amd64 is still 6.12 as of 2026-04 — only the
non-cloud image jumps to 6.19.

Operator-set ANYSCAN_KERNEL_BACKPORT_SUITE / _PACKAGE / _SOURCES_LIST
still win — detection is just a smarter default. Source-list path is
also derived from the resolved suite so the file matches.

ANYSCAN_OS_RELEASE_FILE env override added so the test suite can
inject a synthetic os-release without touching /etc/os-release on the
test host.

See PR #65 issuecomment-4338158487 (anygpt-48 c6in.metal bench) for
the kernel-resolution trace.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
…undle + deploy + adapter (#81)

Phase 2 wire-up for the DPDK io_engine landing in
AnyVM-Tech/anyscan-engine-c PR #4. Mirrors PR #71's AF_XDP wire-up shape
across the install / bundle / deploy / adapter / install-time-probe
chain so the engine repo's USE_DPDK=1 build flag actually reaches every
producer of a worker bundle, and so the runtime --io-engine=dpdk knob
plumbed through ANYSCAN_SCANNER_IO_ENGINE has DPDK code to dispatch to.

Why DPDK now: AWS ENA on kernel ≤6.12.74 forces AF_XDP into drv+copy
mode, capping c6in.metal at ~22M pps aggregate (memory:
anyscan_afxdp_ena_constraint, also PR #65 issuecomment-4338158487 —
6.19.11 STILL does not have ena_xdp_zc). DPDK bypasses the kernel ENA
driver entirely via vfio-pci and removes the syscall-kick + lower-half
-channels-only ZC constraint.

What lands here:
  - install-external-deps.sh: ANYSCAN_USE_DPDK env knob;
    binary_has_dpdk_linkage probe (librte_eal.so via ldd → readelf -d);
    install_dpdk_build_deps (libdpdk-dev + dpdk apt-get, fail-open);
    cache short-circuit invalidation when cached binary lacks DPDK
    linkage; vulnscanner_make_args extension; post-build assertion.
  - package-worker-bundle.sh: same env knob, linkage probe,
    rebuild_scanner_with_dpdk helper, bundle_engine_make_args, README.txt
    use_dpdk field. Composes with USE_AF_XDP=1 USE_PFRING_ZC=1 — the
    earliest matching rebuild block produces a binary linked against
    every requested engine in a single make invocation.
  - deploy.sh: same env knob, linkage probe, make_args extension,
    pre-DPDK cached-binary drop, post-build assertion.
  - install-worker-bundle.sh: binary_has_dpdk_linkage,
    probe_dpdk_runtime_available (5 gates: scanner USE_DPDK-built,
    librte_eal.so loadable, vfio_pci kernel module, hugepages reserved
    in /sys/kernel/mm/hugepages/*, /dev/vfio/vfio present),
    apply_dpdk_availability writing ANYSCAN_DPDK_AVAILABLE.
  - vulnscanner-zmap-adapter.py: SUPPORTED_IO_ENGINES gains "dpdk";
    _IO_ENGINE_AVAILABILITY_KEYS maps "dpdk" → ANYSCAN_DPDK_AVAILABLE
    so the same fall-back-with-warning path the AF_XDP / PF_RING ZC
    plumbing already exercises picks up dpdk for free.
  - runtime.worker.env.template: full DPDK section documenting
    ANYSCAN_USE_DPDK (build-time), ANYSCAN_DPDK_AVAILABLE (install
    probe), ANYSCAN_DPDK_PCI_BDFS (BDF / iface CSV), and
    ANYSCAN_DPDK_HUGEPAGES_GB (default 4).
  - tools/setup-dpdk.sh (NEW, ~370 LOC): bind / unbind / status
    subcommands. Reserves hugepages (1 GiB pages preferred, falls back
    to 2 MiB), modprobe vfio-pci, dpdk-devbind.py --bind=vfio-pci.
    Idempotent (re-runs are no-ops). Reversible (`unbind` returns the
    NICs to ena and frees hugepages). Refuses to bind eth0 (agentd
    control-plane interface) and refuses to bind the only NIC. THP
    gets switched to "never" on bind (DPDK + THP fragments the static
    hugepage pool).
  - tools/test-install-external-deps-dpdk.sh (NEW, ~270 LOC): mirrors
    test-install-external-deps-afxdp.sh. Four cases × multiple
    assertions: default unset → no USE_DPDK=1 in make argv; opt-in +
    missing scanner → USE_DPDK=1; opt-in + cached non-DPDK binary →
    make clean + USE_DPDK=1; opt-in + cached DPDK-linked binary → no
    rebuild. Stubs make/git/ldd/readelf so it runs hermetically.
  - test_vulnscanner_adapter_io_engine.py: 7 new DPDK assertions
    covering the dpdk-with-runtime-available, dpdk-without-runtime
    -fall-back-with-warning, missing-availability-var, uppercase
    normalization, and cross-engine availability isolation cases.
    Updated test_invalid_value_falls_back_to_af_packet_with_warning
    to use "fake_engine" instead of "dpdk" — dpdk is now valid.

Verification (on Debian bookworm with libdpdk-dev 24.11 installed):
  - tools/test-install-external-deps-afxdp.sh: 11/11 (regression OK).
  - tools/test-install-external-deps-pfring-zc.sh: 10/10 (regression OK).
  - tools/test-install-external-deps-dpdk.sh: 10/10.
  - python3 -m unittest discover: 116/116 (32 in
    test_vulnscanner_adapter_io_engine, of which 7 are DPDK-specific).
  - All bash scripts parse cleanly via `bash -n`.
  - tools/setup-dpdk.sh status runs cleanly (no NICs bound, expected).

Engine PR for io_engine_dpdk: AnyVM-Tech/anyscan-engine-c#4

Out of scope (separate workers per the plan):
  - Phase 2 systemd unit edit adding CAP_SYS_RAWIO/CAP_IPC_LOCK/
    CAP_NET_ADMIN to anyscan-worker.service. Documented in the env
    template. Until that lands operators must add caps manually before
    flipping the runtime knob.
  - Live c6in.metal bench (plan §5.3).
  - AMI rebuild.
  - mlx5 / non-AWS hardware support.

Refs: plans/2026-04-28-portscan-dpdk-impl-v1.md (§3.10 wire-up, §3.11
NIC-binding decision, §4.3 kernel feature checks, §5.7 unit test shape).
      anygpt-50

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@skullcrushercmd Contributor Author

Phase 2 — c6in.metal AF_PACKET / AF_XDP / DPDK live bench (anygpt-52): regressions block AF_XDP + DPDK; AWS PPS allowance still untested

Live bench on c6in.metal (128 vCPU, 2 NetworkCards × 8 ENIs each = 16 max, 8 attached) driven by anygpt-52. Engine commit ccfd077 (post-PR #4 DPDK io_engine), AnyScan commit 4faa236 (post-PR #81 DPDK build wireup), scanner SHA256 19a0435964c5… linked against libxdp.so.1 + librte_eal.so.25 + libbpf.so.1. Bundle build proven OK; deploy comment on PR #70 (issuecomment-4338757890).

TL;DR

| Config | Aggregate avg pps | vs anygpt-42 baseline | AWS pps_allowance_exceeded |
| --- | --- | --- | --- |
| AF_PACKET 8-NIC, T=R=8, -r 50M, ~36 s wall | 7.98 M | 1.07× (anygpt-42 was 7.49 M) ✅ regression-check passes | 0 / no quota hit |
| AF_XDP 8-NIC | n/a, segfaulted (engine bug) ❌ | | |
| DPDK 7-NIC (vfio-pci) | n/a, EAL refused (engine bug) ❌ | | |

The 22 M AF_XDP ceiling from anygpt-42 is still neither confirmed nor refuted as AWS-imposed. AF_PACKET-only at ~8 M pps is well below the AWS ENA per-instance PPS allowance for c6in.metal — pps_allowance_exceeded, bw_in_allowance_exceeded, bw_out_allowance_exceeded all stayed at 0 across all 8 ENIs for the full bench window. Without working AF_XDP or DPDK on this build, we cannot push hard enough to surface the AWS quota.

Bench harness

  • Target: 198.18.0.0/15 × ports 1-1024 (= 134 M probes, 16.78 M per shard at 8 shards). Same target as anygpt-42 / anygpt-48.
  • One scanner subprocess per ENI, --shards i/N, -T 8 -R 8, -r 50M, -c 1 (cooldown 1 s), -q.
  • PPS measurement: /sys/class/net/<iface>/statistics/tx_packets delta ÷ wall time (ethtool -S does not surface tx_packets on ENA).
  • AWS quota: ethtool -S <iface> | grep allowance_exceeded PRE and POST on every NIC.

AF_PACKET 8-NIC results (T=R=8, -r 50M, wall 35.84 s)

enp154s0 tx_delta=16777290  (468377 pps)  rx_delta=16
enp155s0 tx_delta=16777260  (468377 pps)  rx_delta=15
enp156s0 tx_delta=16777261  (468377 pps)  rx_delta=15
enp157s0 tx_delta=16777263  (468377 pps)  rx_delta=16
ens1     tx_delta=16777328  (468379 pps)  rx_delta=98     ← control plane (also scanned)
ens2     tx_delta=16779063  (468427 pps)  rx_delta=1689   ← carries SSH traffic + DNS
ens3     tx_delta=16777284  (468378 pps)  rx_delta=18
ens4     tx_delta=16777269  (468377 pps)  rx_delta=17

AGGREGATE_TX_PACKETS=285,901,464  (16.78 M × 8 shards × ~2 passes; ran longer than one scan-cycle)
AGGREGATE_PPS=7,981,615           (≈ 7.98 M aggregate)

Per-NIC PPS is remarkably uniform at ~468 K each — matches the PR #65 plan §2 documented AF_PACKET single-socket cap (~3 M / 8 threads ≈ 375-500 K per scanner — consistent with vulnscanner-zmap-adapter.py:669's comment about per-PACKET_TX_RING socket throughput). 8-NIC scaling is near-linear because each scanner has its own ring on its own ENI.

AWS *_allowance_exceeded deltas across all 8 ENIs

| Counter | PRE | POST | Δ |
| --- | --- | --- | --- |
| pps_allowance_exceeded | 0 | 0 | 0 |
| bw_in_allowance_exceeded | 0 | 0 | 0 |
| bw_out_allowance_exceeded | 0 | 0 | 0 |
| conntrack_allowance_exceeded | 0 | 0 | 0 |
| linklocal_allowance_exceeded | 0 | 0 | 0 |
| conntrack_allowance_available | 6567014 | 6567010 | −4 (SSH/DNS, irrelevant) |

At ~8 M aggregate pps, AWS does not throttle. That's the only hard data point we have on the AWS quota for c6in.metal from this run.

Why AF_XDP did not run

[*] afxdp: xsk_socket__create(enp154s0, q=0, mode=drv+zerocopy) failed: Operation not supported
libbpf: elf: skipping unrecognized data section(8) .xdp_run_config
…
Segmentation fault

The bind-mode ladder in anyscan-engine-c/src/send-afxdp.c:287-289 (AFXDP_BIND_ZEROCOPY → AFXDP_BIND_DRV_COPY → AFXDP_BIND_SKB) is correct in source — it's just supposed to retry on each failure — but in practice the second attempt (drv+copy) segfaults the process instead of returning a clean error. The first attempt's xsk_socket__create returns -EOPNOTSUPP (matches memory anyscan_afxdp_ena_constraint: ENA on Linux 6.12 only supports drv+copy, not drv+zerocopy), so this is the exact code path we know we need on AWS — and it's broken.

Bug location: the engine's afxdp_try_bind() doesn't fully tear down the partially-created XSK / UMEM / xdp_program after a failed xsk_socket__create() before the next attempt. Repro is single-scanner -i ens1 --io-engine=af_xdp -T 4 -R 4 … on c6in.metal — segfaults every time.

This is a regression-blocker for AF_XDP on AWS. anygpt-42 was on engine f1288d6 (pre-PR #4); current ccfd077 (PR #4 merge) introduced the regression — most likely the DPDK changes share state-init code with the AF_XDP setup path. Splitting this into its own engine PR would have caught it via the bench gate plan §5.3 demands.
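A minimal sketch of the teardown-between-attempts shape the fix needs. afxdp_try_bind() is the engine's function but its body is not reproduced here; the helper and variable names below are hypothetical:

```c
/* Sketch of the bind-mode ladder with full cleanup between attempts.
 * Illustrative only; afxdp_umem_create() is a hypothetical helper standing in
 * for the engine's real UMEM setup. */
#include <stddef.h>
#include <linux/if_xdp.h>
#include <xdp/xsk.h>

/* Hypothetical helper: allocates and registers a fresh UMEM for one attempt. */
int afxdp_umem_create(struct xsk_umem **umem, struct xsk_ring_prod *fill,
                      struct xsk_ring_cons *comp);

static const unsigned int bind_modes[] = {
    XDP_USE_NEED_WAKEUP | XDP_ZEROCOPY,   /* drv+zerocopy                      */
    XDP_USE_NEED_WAKEUP | XDP_COPY,       /* drv+copy                          */
    XDP_USE_NEED_WAKEUP,                  /* last rung: SKB mode via xdp_flags */
};

static int bind_with_fallback(const char *ifname, unsigned int queue_id,
                              struct xsk_socket **xsk_out)
{
    for (size_t i = 0; i < sizeof(bind_modes) / sizeof(bind_modes[0]); i++) {
        struct xsk_umem *umem = NULL;
        struct xsk_ring_prod fill, tx;
        struct xsk_ring_cons comp;

        if (afxdp_umem_create(&umem, &fill, &comp) != 0)
            return -1;

        struct xsk_socket_config cfg = {
            .tx_size    = XSK_RING_PROD__DEFAULT_NUM_DESCS,
            .bind_flags = bind_modes[i],
        };
        if (xsk_socket__create(xsk_out, ifname, queue_id, umem,
                               NULL /* TX-only */, &tx, &cfg) == 0)
            return 0;

        /* The step the current code skips: release everything that was partially
         * set up before retrying, so the next attempt starts from clean state
         * instead of crashing on a half-initialized UMEM/XSK. */
        xsk_umem__delete(umem);
    }
    return -1;
}
```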

Why DPDK did not run

Two engine-side bugs found in PR #4's dpdk-eal.c:

  1. Hardcoded nb_tx_desc=1024 exceeds ENA's max=512:

    ETHDEV: Invalid value for nb_tx_desc(=1024), should be: <= 512, >= 128, and a product of 1
    [-] dpdk-eal: rte_eth_tx_queue_setup(port=0, q=0) failed: Invalid argument
    

    The TX descriptor count must be queried from rte_eth_dev_info_get(...).tx_desc_lim.nb_max per device, not hardcoded (mlx5 supports 4 K, ENA caps at 512); see the sketch after this list.

  2. EAL argv splitter mangles --socket-mem 1024: the trailing 1024 token gets replaced by the scanner program path on the way into rte_eal_init. Repro: scanner --io-engine=dpdk … -- --file-prefix=foo --socket-mem 1024 ends up as scanner --file-prefix=foo --socket-mem scanner in the EAL log. The space-separated form fails; the = form (--socket-mem=1024,1024) is untested.
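A hedged sketch of both fixes; the DPDK calls (rte_eth_dev_info_get, rte_eth_tx_queue_setup, rte_eal_init) are real API, while the surrounding function shape and the --file-prefix value are illustrative:

```c
/* Sketch: query per-device TX descriptor limits instead of hardcoding 1024,
 * and hand rte_eal_init() a pre-built argv so --socket-mem survives intact.
 * Illustrative only -- not the engine's dpdk-eal.c. */
#include <rte_eal.h>
#include <rte_ethdev.h>

static int setup_tx_queue(uint16_t port_id, uint16_t queue_id)
{
    struct rte_eth_dev_info info;
    if (rte_eth_dev_info_get(port_id, &info) != 0)
        return -1;

    /* Clamp the desired ring size to what this PMD supports:
     * ENA reports tx_desc_lim.nb_max = 512, mlx5 reports 4096. */
    uint16_t nb_tx_desc = 1024;
    if (nb_tx_desc > info.tx_desc_lim.nb_max)
        nb_tx_desc = info.tx_desc_lim.nb_max;

    return rte_eth_tx_queue_setup(port_id, queue_id, nb_tx_desc,
                                  rte_eth_dev_socket_id(port_id), NULL);
}

static int init_eal(const char *prog)
{
    /* Single-token "--socket-mem=1024,1024" sidesteps the argv-splitting bug
     * that swallowed the value of the space-separated form. */
    char *eal_argv[] = {
        (char *)prog,
        "--file-prefix=anyscan",       /* illustrative prefix */
        "--socket-mem=1024,1024",
        NULL,
    };
    return rte_eal_init(3, eal_argv);
}
```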

Plus deploy-side blockers (worked around but worth flagging for follow-up):

  1. librte-net-ena25 not in the bundle install path. Debian DPDK 24.11.4 ships ENA PMD as a separate package; without it, rte_eal_init succeeds but no eth ports are probed. Manual apt install librte-net-ena25 was required. install-external-deps.sh::install_dpdk_build_deps should pull this in when ANYSCAN_USE_DPDK=1 is set.

  2. setup-dpdk.sh bind refused active interfaces. dpdk-devbind safety check ("Warning: routing table indicates that interface is active. Not modifying") forces the operator to manually ip link set <ifc> down on each NIC before bind succeeds. PR #81's tools/setup-dpdk.sh should down the iface itself or at least call out the requirement in its README.

  3. 1 GiB hugepages reserved without hugetlbfs mount. setup-dpdk.sh reserved 8 × 1 GiB pages but didn't mount a 1G-pagesize hugetlbfs. EAL fell back to "no hugepages reported on node 0/1" until I manually mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge1g.

Other deploy-path bugs surfaced (worked around)

  1. PR #79 ANYSCAN_EC2_ASSOCIATE_PUBLIC_IP=true + multi-ENI is rejected by AWS: InvalidParameterCombination — The associatePublicIPAddress parameter cannot be specified when launching with multiple network interfaces. AWS only honors AssociatePublicIpAddress on NetworkInterfaces[] when there is exactly one NetworkInterface entry. PR #79's multi-ENI public-IP path is non-functional on every multi-ENI launch. Workaround: launch without it, then aws ec2 allocate-address && aws ec2 associate-address --network-interface-id <primary-eni> post-launch.

  2. /api/agent/install.sh?rebuild=false served a stub scanner. The bundle the worker fetched at bootstrap (agent-bundle-…20260428202928…) had a 37-byte shell script in place of the scanner binary (#!/bin/sh; echo cached-scanner-pfring). The API's bundle-cache path does not honor ANYSCAN_USE_AF_XDP/ANYSCAN_USE_DPDK flags from the original package-worker-bundle.sh invocation; bundles served from the API are AF_PACKET-only stubs unless those env vars are set in the API's systemd EnvironmentFile. Manually scp'd the real scanner from the operator-built bundle as a workaround.

  3. AF_XDP and DPDK runtime probes (in install-worker-bundle.sh) returned available=false on c6in.metal Debian 13 + kernel 6.12.74. Probe message: "kernel <5.10 or libxdp.so missing; ANYSCAN_AF_XDP_AVAILABLE=false" — but the kernel is 6.12 and libxdp1 is in the package archive. The probe's kernel-version check is wrong (likely cut -d. -f1-style parsing failing on 6.12.74).

What the AWS PPS allowance numbers actually tell us

Reproducing the table in memory anyscan_aws_pps_allowance:

| Source | Aggregate PPS hit | pps_allowance_exceeded non-zero? | Verdict |
| --- | --- | --- | --- |
| anygpt-4 / anygpt-42 AF_PACKET 8-NIC | 7.49 M | (not captured) | unknown |
| anygpt-42 AF_XDP 8-NIC cap=4 t=8 (best) | 22.43 M | (not captured) | unknown |
| anygpt-48 AF_XDP 15-NIC | 12.18 M (regressed) | (not captured) | unknown |
| anygpt-52 AF_PACKET 8-NIC T=8 R=8 | 7.98 M | 0 — quota not hit | AWS allowance ≥ 8 M pps for c6in.metal |
| anygpt-52 AF_XDP | engine bug, see §"Why AF_XDP" | | |
| anygpt-52 DPDK 7-NIC | engine bug, see §"Why DPDK" | | |

So all we've confirmed empirically is AWS allows ≥ 8 M pps. The 22 M historic ceiling could still be either AWS- or engine-imposed. Memory anyscan_aws_pps_allowance (and the underlying AWS public docs) suggest the c6in family quota is on the order of low-tens-of-M; the next bench that pushes >10 M is the one that decides this question.

Follow-up tickets (please file before next bench)

  1. AF_XDP fall-back segfault — anyscan-engine-c/src/send-afxdp.c::afxdp_try_bind() must clean up XSK/UMEM/xdp_program on xsk_socket__create() failure before the next bind-mode attempt. Repro single-line above.
  2. DPDK hardcoded nb_tx_desc=1024 — query rte_eth_dev_info.tx_desc_lim.nb_max per device.
  3. DPDK EAL argv splitter — passes --socket-mem 1024 as --socket-mem <next-token-which-is-actually-program-path>.
  4. PR #79 multi-ENI public IP — top-level launch flag silently misuses the AWS API. Either (a) only allow on single-NIC launches, (b) emit a post-launch EIP allocate+associate, or (c) document that the operator must do the EIP step themselves.
  5. API bundle cache — /api/agent/install.sh serves a stub scanner unless ANYSCAN_USE_AF_XDP=1 ANYSCAN_USE_DPDK=1 are in the api's systemd EnvironmentFile. Either pin the env vars there or hash the bundle's build flags into the cache key.
  6. install-external-deps.sh — must apt install librte-net-ena25 when ANYSCAN_USE_DPDK=1 (currently only installs libdpdk-dev).
  7. tools/setup-dpdk.sh bind — should ip link set <ifc> down automatically (or at least error with that hint instead of "interface active. Not modifying") and should mount 1 GiB hugetlbfs when reserving 1 GiB hugepages.
  8. AF_XDP install probe — kernel version check fails on 6.12.74. Ship a probe that handles 3-component versions and is a no-op when libxdp.so is present in ldconfig -p.

Cleanup

  • setup-dpdk.sh unbind — restored 7 ENIs to ENA driver, freed hugepages. ✓
  • aws ec2 terminate-instances --instance-ids i-0b8fe00496163d227 — instance shutting-down. ✓
  • aws ec2 release-address --allocation-id eipalloc-0bf5d678f6ff3bc20 — EIP released. ✓
  • systemctl start anyscan-ec2-worker-manager.service — watchdog back online (it'll re-launch its c6in.xlarge fleet). ✓

Cross-reference: memory anyscan_aws_pps_allowance (PR-comment-pointer to here updated), memory anyscan_afxdp_ena_constraint. Deploy proof: PR #70 issuecomment-4338757890.

— driven by anygpt-52, instance lifetime ≈ 1 h 12 min, total spend ≈ $7.

skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
…ches (#82)

PR #79's ANYSCAN_EC2_ASSOCIATE_PUBLIC_IP=true on a multi-ENI launch is
hard-rejected by AWS:

  InvalidParameterCombination — The associatePublicIPAddress parameter
  cannot be specified when launching with multiple network interfaces.

AWS only honors AssociatePublicIpAddress on NetworkInterfaces[] when
exactly one entry is supplied, even if the field appears only on the
primary entry of a multi-NIC payload. The entire RunInstances call
fails. Reported in PR #65 issuecomment-4339242358 (anygpt-52).

Fix: when len(NetworkInterfaces) > 1, suppress the field inline and
allocate-address + associate-address on the primary ENI post-launch.
The recreate path now also releases the previously-recorded EIP
before terminating the old instance so we don't leak Elastic IPs on
every recreate.

- build_network_interfaces only emits AssociatePublicIpAddress when
  the resulting payload is single-NIC (target_count == 1).
- Ec2WorkerManager._associate_public_ip_post_launch allocates an EIP
  (Domain=vpc), associates it with the primary ENI (DeviceIndex=0),
  records AllocationId/AssociationId in self.state. Allocate or
  associate failures are surfaced in eni_attach.public_ip but do not
  abort the recreate — the worker is still usable on private IPs.
- Ec2WorkerManager._release_recorded_eip disassociates and releases
  any previously-recorded EIP at the start of recreate_instance.

Tests:
- New: launch payload free of AssociatePublicIpAddress on multi-ENI;
  allocate_address + associate_address called post-launch with the
  primary ENI's NetworkInterfaceId; allocation_id persisted.
- New: AllocateAddress failure does not abort recreate.
- New: AssociateAddress failure still records AllocationId so the
  next recreate can release it.
- New: previously-recorded EIP is disassociated + released before
  terminating the old instance on the next recreate.
- Updated: prior tests that asserted the broken inline-flag behavior
  on multi-NIC now assert the field is suppressed everywhere.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
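
For reference, a minimal boto3-style Python sketch of the launch-payload and post-launch EIP flow described in the message above. Function names and signatures are modeled on the commit text rather than lifted from the repository, so treat them as hypothetical.

```python
# Hypothetical sketch of the #82 flow; the real Ec2WorkerManager helpers may
# differ in signature. `ec2` is a boto3 EC2 client (boto3.client("ec2")).

def build_network_interfaces(subnet_id, security_group_ids, target_count,
                             associate_public_ip):
    """Build the RunInstances NetworkInterfaces payload.

    AWS only accepts AssociatePublicIpAddress when exactly one interface is
    supplied, so the flag is suppressed on multi-ENI launches.
    """
    interfaces = []
    for device_index in range(target_count):
        entry = {
            "DeviceIndex": device_index,
            "SubnetId": subnet_id,
            "Groups": security_group_ids,
        }
        if associate_public_ip and target_count == 1:
            entry["AssociatePublicIpAddress"] = True
        interfaces.append(entry)
    return interfaces


def associate_public_ip_post_launch(ec2, primary_eni_id, state):
    """Allocate an EIP and attach it to the primary ENI after launch.

    Failures are recorded in `state` rather than raised: the worker is still
    usable on private IPs when the public-IP step does not complete.
    """
    try:
        alloc = ec2.allocate_address(Domain="vpc")
        state["allocation_id"] = alloc["AllocationId"]
        assoc = ec2.associate_address(
            AllocationId=alloc["AllocationId"],
            NetworkInterfaceId=primary_eni_id,
        )
        state["association_id"] = assoc["AssociationId"]
        state["public_ip"] = alloc["PublicIp"]
    except Exception as exc:
        state["public_ip_error"] = str(exc)
```
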
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
When the API process runs with ANYSCAN_USE_AF_XDP=1 (or any of the
sibling ANYSCAN_USE_DPDK / ANYSCAN_USE_PFRING_ZC /
ANYSCAN_INSTALL_KERNEL_BACKPORT knobs), package-worker-bundle.sh
rebuilds the scanner with matching feature linkage. The resulting
bundle carries a feature-flagged scanner binary even though the
*installed* /opt/anyscan/bin/scanner stays the same.

`current_hosted_agent_bundle_source_fingerprint` previously only
hashed the embedded asset payload and the installed binaries — the
build-flag env vars were absent from the cache key. So a default-flags
rebuild produced a fingerprint identical to a feature-flagged one and
silently overwrote the cached AF_XDP/DPDK bundle. Operators bootstrapping
via /api/agent/install.sh?rebuild=false then received an
AF_PACKET-only stub scanner. Reported in PR #65 issuecomment-4339242358
(anygpt-52): "/api/agent/install.sh?rebuild=false served a stub scanner".

Fix: fold each documented build-flag env var name + value into the
fingerprint hash. Bundles built with different flags now land in
different cache slots; rebuild=false serves the bundle that matches
the API's current build-flag environment instead of a stale one.

- BUNDLE_BUILD_FLAG_ENV_VARS pins the exact set as a static array, so adding
  a future ANYSCAN_USE_* knob is an explicit compile-time decision.
- hash_bundle_build_flag_env_vars takes an env-lookup closure so unit
  tests can hash hermetic inputs without poking std::env (which would
  race with parallel test execution).
- bundle_build_flag_env_fingerprint is a #[cfg(test)] helper that
  produces just the build-flag contribution as a SHA-256 hex digest.

Tests:
- Default vs ANYSCAN_USE_AF_XDP=1 produce different fingerprints.
- Each flag flipped on its own produces a unique fingerprint (no
  collisions between AF_XDP-only and DPDK-only builds).
- Same flag with values "1" / "0" / unset are all distinct.
- Repeated lookup with same input returns same fingerprint.
- Static check that the four documented flags are all in the const.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
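
For illustration only, the same cache-key idea in Python (the repository's implementation is Rust): fold each documented flag name and value into the digest through an injected lookup, so unit tests never touch the process environment.

```python
import hashlib

# Hypothetical Python illustration; the repository's implementation is Rust.
BUNDLE_BUILD_FLAG_ENV_VARS = (
    "ANYSCAN_USE_AF_XDP",
    "ANYSCAN_USE_DPDK",
    "ANYSCAN_USE_PFRING_ZC",
    "ANYSCAN_INSTALL_KERNEL_BACKPORT",
)


def hash_bundle_build_flag_env_vars(lookup):
    """Fold each documented build-flag name and value into a SHA-256 digest.

    `lookup` is any callable from env var name to value-or-None (os.environ.get
    in production, a dict's .get in tests), so tests never mutate the process
    environment and cannot race with parallel test execution.
    """
    hasher = hashlib.sha256()
    for name in BUNDLE_BUILD_FLAG_ENV_VARS:
        value = lookup(name)
        hasher.update(name.encode())
        hasher.update(b"=")
        # Keep unset distinct from "" and "0" so every flag flip moves the key.
        hasher.update(b"<unset>" if value is None else value.encode())
        hasher.update(b"\x00")
    return hasher.hexdigest()


# Default flags and an AF_XDP build land in different cache slots:
assert (hash_bundle_build_flag_env_vars({}.get)
        != hash_bundle_build_flag_env_vars({"ANYSCAN_USE_AF_XDP": "1"}.get))
```
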
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
…DPDK (#84)

Debian DPDK 24.11.x ships every Poll-Mode Driver as its own package
(librte-net-<vendor><abi>) instead of bundling them into libdpdk-dev.
Without the relevant PMD installed, rte_eal_init() succeeds but no
eth ports are probed and the scanner refuses to start.

anygpt-52 hit this on c6in.metal: ENA NICs were silently absent from
rte_eth_dev_count_avail() until librte-net-ena25 was apt-installed
manually. Reported in PR #65 issuecomment-4339242358.

Fix: install_dpdk_build_deps now pulls librte-net-ena25 (AWS ENA PMD —
every c6in/c5n/m5n/m6in instance) AND librte-net-mlx5-25 (Mellanox
ConnectX-5/6 PMD for non-AWS bare-metal hosts at Equinix/OVH/Hetzner)
alongside libdpdk-dev. The 25 ABI suffix matches Debian trixie's DPDK
24.11.x. Stock Intel ixgbe/i40e drivers are still in libdpdk-dev's
auto-pull set so we don't need to name them.

Falls back to libdpdk-dev alone if the PMD packages are unavailable in the
archive: better to ship a partial DPDK build than fail the install. The
runtime warning makes the fallback explicit, so operators know why
rte_eth_dev_count_avail() may return 0 later.

Test: new Case 5 in test-install-external-deps-dpdk.sh runs the install
script with ANYSCAN_INSTALL_DPDK_DEPS=true (the default) and stub id and
apt-get commands on PATH, then asserts apt-get install was called with
libdpdk-dev, librte-net-ena25, and librte-net-mlx5-25.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
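
A short Python sketch of that fallback behavior, assuming hypothetical helper names (the in-tree install script is shell):

```python
import subprocess

# Illustrative fallback only; install-external-deps.sh implements this in shell.
PMD_PACKAGES = ["librte-net-ena25", "librte-net-mlx5-25"]


def install_dpdk_build_deps():
    base = ["apt-get", "install", "-y", "libdpdk-dev"]
    try:
        # Preferred: core DPDK headers plus the ENA and mlx5 PMD packages.
        subprocess.run(base + PMD_PACKAGES, check=True)
    except subprocess.CalledProcessError:
        # PMDs absent from the archive: ship a partial DPDK build and warn,
        # so an operator seeing rte_eth_dev_count_avail() == 0 knows why.
        print("warning: PMD packages unavailable; installing libdpdk-dev only")
        subprocess.run(base, check=True)
```
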
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
…1g (#85)

Two operator-side speedbumps surfaced on c6in.metal during anygpt-52
(PR #65 issuecomment-4339242358):

1. dpdk-devbind refuses to bind active interfaces:
     Warning: routing table indicates that interface is active.
     Not modifying.
   Operators had to `ip link set <ifc> down` on every NIC by hand
   before the bind step succeeded.

2. Reserving 1 GiB hugepages was not enough for rte_eal_init:
     EAL: No available 1048576 kB hugepages reported on node 0/1
   Operators had to `mount -t hugetlbfs -o pagesize=1G nodev /mnt/huge1g`
   themselves before the scanner could find the pages.

Fix:

- bdf_to_iface() walks /sys/bus/pci/devices/<bdf>/net/ to map BDF →
  kernel iface name. cmd_bind invokes `ip link set <ifc> down` on
  each target before invoking dpdk-devbind, with best-effort failure
  semantics (missing ip command, missing iface, already-down iface
  all proceed to the bind).

- ensure_hugetlbfs_mount() mounts a hugetlbfs of the matching pagesize
  at the configured path after a successful nr_hugepages reservation.
  Default targets are /mnt/huge1g (1 GiB) and /mnt/huge2m (2 MiB);
  ANYSCAN_DPDK_HUGEPAGES_1G_MOUNT / _2M_MOUNT override or set them
  to "" to opt out (operators provisioning hugetlbfs via fstab).
  Idempotent: detects existing hugetlbfs of the right pagesize via
  findmnt / /proc/mounts and skips remount.

- ANYSCAN_DPDK_LOAD_ONLY=1 hook lets test-setup-dpdk.sh source the
  script for hermetic helper testing without triggering the argv
  dispatch.

Tests (new tools/test-setup-dpdk.sh):

- cmd_bind invokes `ip link set <ifc> down` BEFORE dpdk-devbind
  --bind=vfio-pci (order verified by line numbers in a single cmd
  log).
- ensure_hugetlbfs_mount calls `mount -t hugetlbfs -o pagesize=1G
  nodev <path>`.
- ensure_hugetlbfs_mount is a no-op when target is already hugetlbfs
  with the matching pagesize.
- ensure_hugetlbfs_mount is a no-op with an empty mount path.
- bdf_to_iface returns iface for a populated fake /sys tree.
- bdf_to_iface returns empty when net/ dir is missing.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
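
The helpers described above, rendered as a hypothetical Python sketch (the in-tree tools/setup-dpdk.sh implements them in shell):

```python
import os
import subprocess

# Hypothetical Python rendering; tools/setup-dpdk.sh implements these in shell.

def bdf_to_iface(bdf):
    """Map a PCI BDF (e.g. '0000:00:06.0') to its kernel interface name,
    or return '' when the device exposes no net/ directory."""
    net_dir = f"/sys/bus/pci/devices/{bdf}/net"
    try:
        entries = sorted(os.listdir(net_dir))
    except OSError:
        return ""
    return entries[0] if entries else ""


def down_iface_before_bind(bdf):
    """Best-effort `ip link set <ifc> down` so dpdk-devbind does not refuse
    an interface the routing table still considers active."""
    iface = bdf_to_iface(bdf)
    if iface:
        subprocess.run(["ip", "link", "set", iface, "down"], check=False)


def ensure_hugetlbfs_mount(pagesize, path):
    """Mount a hugetlbfs of the requested pagesize unless `path` already is one.

    An empty path opts out (operator provisions hugetlbfs via fstab). The real
    script also verifies the existing mount's pagesize; this sketch only checks
    the filesystem type.
    """
    if not path:
        return
    with open("/proc/mounts") as mounts:
        for line in mounts:
            fields = line.split()
            if len(fields) >= 3 and fields[1] == path and fields[2] == "hugetlbfs":
                return
    os.makedirs(path, exist_ok=True)
    subprocess.run(["mount", "-t", "hugetlbfs", "-o", f"pagesize={pagesize}",
                    "nodev", path], check=True)
```
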
skullcrushercmd added a commit that referenced this pull request Apr 28, 2026
probe_afxdp_runtime_available reported "kernel <5.10 or libxdp.so
missing" on c6in.metal Debian 13 + kernel 6.12.74 even though both
prerequisites were satisfied. The previous parameter-expansion parser
silently mishandled some 3-component release shapes; the failure
mode reported in PR #65 issuecomment-4339242358 (anygpt-52) was a
generic "false" with no indication of which check fired, leaving
the operator to guess.

Fix:

- New parse_kernel_major_minor() helper uses awk -F'[.-]' so 3-
  component releases like 6.12.74-cloud-amd64, 5.10.0-13-amd64,
  6.12.74+deb13+1-amd64, and 5.4.282-rt all parse cleanly. Returns
  "MAJOR MINOR" on stdout, "0 0" on parse failure.

- probe_afxdp_runtime_available emits a one-line stderr explanation
  whenever it returns "false" so the operator can immediately see
  which check fired ("kernel 4.19 < 5.10", "libxdp.so not in
  ldconfig -p", "could not parse kernel version"). Quiet on success.

- apply_afxdp_availability captures the probe stderr and includes
  the reason in its summary log line — replaces the previous
  hardcoded "kernel <5.10 or libxdp.so missing" that was wrong half
  the time.

- ANYSCAN_INSTALL_LOAD_ONLY=1 hook lets unit tests source the script
  for hermetic helper testing without triggering main().

Test (new tools/test-install-worker-bundle-afxdp-probe.sh, 21 cases):

- parse_kernel_major_minor across 8 release shapes (clean 3-component,
  +deb13 suffix, -cloud-amd64 suffix, -rt suffix, 4.x, 2-component,
  1-component, empty).
- probe_afxdp_runtime_available with stubbed uname + ldconfig:
  c6in.metal 6.12.74 + libxdp → true (the bug 5 repro).
  Kernel 4.19 too old → false + stderr names the version.
  Kernel 5.9 vs 5.10 vs 5.11 boundary correctness.
  libxdp.so missing → false + stderr names the missing library.
  Empty/non-numeric uname → false + stderr names parse failure.

Co-authored-by: skullcmd <skullcmd@anyvm.tech>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
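
An illustrative Python rendering of the probe logic described above (the in-tree helper is shell and uses awk, but the parsing and reporting rules are the same):

```python
import re
import subprocess
import sys

# Illustrative only: hypothetical Python port of the shell probe described above.

def parse_kernel_major_minor(release):
    """Return (major, minor) from a kernel release string, or (0, 0) on failure.

    Handles 3-component releases such as 6.12.74-cloud-amd64, 5.10.0-13-amd64,
    6.12.74+deb13+1-amd64, and 5.4.282-rt.
    """
    match = re.match(r"^(\d+)\.(\d+)", release)
    if not match:
        return (0, 0)
    return (int(match.group(1)), int(match.group(2)))


def probe_afxdp_runtime_available():
    """True when the kernel is >= 5.10 and libxdp.so appears in ldconfig -p.

    On failure, emit a one-line reason to stderr so the operator can see which
    check fired; stay quiet on success.
    """
    release = subprocess.run(["uname", "-r"], capture_output=True,
                             text=True).stdout.strip()
    major, minor = parse_kernel_major_minor(release)
    if (major, minor) == (0, 0):
        print(f"af_xdp probe: could not parse kernel version {release!r}",
              file=sys.stderr)
        return False
    if (major, minor) < (5, 10):
        print(f"af_xdp probe: kernel {major}.{minor} < 5.10", file=sys.stderr)
        return False
    ldconfig = subprocess.run(["ldconfig", "-p"], capture_output=True,
                              text=True).stdout
    if "libxdp.so" not in ldconfig:
        print("af_xdp probe: libxdp.so not in ldconfig -p", file=sys.stderr)
        return False
    return True
```
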