feat(portscan): ANYSCAN_SCANNER_IO_ENGINE env knob + adapter --io-engine plumbing#70

Merged
skullcrushercmd merged 1 commit into main from feat/scanner-io-engine-env on Apr 28, 2026

@skullcrushercmd

Summary

Phase 2 PR D of plans/2026-04-27-portscan-afxdp-plan-v1.md §3.7 — the runtime opt-in shape that lets a worker pick the scanner I/O backend without rebuilding the bundle. AnyScan-side only; the scanner fork already lands --io-engine={af_packet,af_xdp} (PRs A/B/C merged). PR C already shipped ANYSCAN_AF_XDP_AVAILABLE (kernel >=5.10 + libxdp.so probe) and CAP_BPF on the worker units, so this PR is the final adapter-layer wire-up.

Changes

  • vulnscanner-zmap-adapter.py — new resolve_io_engine() reads ANYSCAN_SCANNER_IO_ENGINE and returns the value that build_command appends to the scanner argv as --io-engine=<value>. AF_PACKET is both the unconditional default and the unconditional fallback.
  • runtime.worker.env.template — documents the new knob, including the ANYSCAN_AF_XDP_AVAILABLE=true + CAP_BPF prerequisites for af_xdp, and the silent-default behavior on AF_PACKET.
  • install-worker-bundle.sh — writes ANYSCAN_SCANNER_IO_ENGINE=af_packet on fresh installs (only when not already pinned, so in-place upgrades preserve operator choices).
  • test_vulnscanner_adapter_io_engine.py — new 16-test module: env reads, default behavior, the --io-engine flag composition through build_command, and an end-to-end adapter spawn against a stub scanner that confirms the subprocess receives the right flag.
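
The "only when not already pinned" rule for install-worker-bundle.sh can be illustrated with a minimal sketch. The real installer is a shell script whose contents are not shown here, so this Python version only models the intended behavior; the helper name `pin_default` is hypothetical.

```python
def pin_default(env_text: str, key: str, default: str) -> str:
    """Return env_text with `key=default` appended unless key is already set.

    Hypothetical helper: models the described installer behavior,
    not the actual shell code in install-worker-bundle.sh.
    """
    for line in env_text.splitlines():
        stripped = line.strip()
        # Comments and lines without '=' never count as a pinned value.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            # Operator (or an earlier install) already pinned a value; keep it.
            return env_text
    # Make sure the appended assignment starts on its own line.
    sep = "" if env_text == "" or env_text.endswith("\n") else "\n"
    return f"{env_text}{sep}{key}={default}\n"


if __name__ == "__main__":
    fresh = pin_default("", "ANYSCAN_SCANNER_IO_ENGINE", "af_packet")
    upgraded = pin_default("ANYSCAN_SCANNER_IO_ENGINE=af_xdp\n",
                           "ANYSCAN_SCANNER_IO_ENGINE", "af_packet")
    print(fresh, upgraded, sep="")
```

A fresh file gains the af_packet default, while a file that already carries an operator-chosen value is returned unchanged, which is the property that keeps in-place upgrades from overwriting operator choices.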

Runtime opt-in shape

| ANYSCAN_SCANNER_IO_ENGINE | ANYSCAN_AF_XDP_AVAILABLE | Forwarded to scanner | Notes |
|---|---|---|---|
| unset / af_packet | (any) | --io-engine=af_packet | unconditional default |
| af_xdp | true | --io-engine=af_xdp | engaged |
| af_xdp | false / unset | --io-engine=af_packet | warning to stderr (journal-visible audit trail) |
| unrecognized value | (any) | --io-engine=af_packet | warning to stderr |
| blank | (any) | --io-engine=af_packet | silent (treats blank as unset) |

Fallback behavior

If an operator sets af_xdp on a host where the install-time probe set ANYSCAN_AF_XDP_AVAILABLE=false (kernel <5.10 or no libxdp), the adapter does not crash the scanner — it logs [anyscan-adapter] ANYSCAN_SCANNER_IO_ENGINE=af_xdp requested but ANYSCAN_AF_XDP_AVAILABLE!=true; ... Falling back to af_packet. to stderr and forwards --io-engine=af_packet. The warning is loud on purpose so a misconfigured worker is visible in journalctl -u agentd instead of silently scanning at AF_PACKET speeds.
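
The opt-in table and fallback rule above can be sketched roughly as follows. This is not the adapter's actual code: the function name resolve_io_engine comes from the PR, but the signature, the `io_engine_flag` helper, the exact string comparison against "true", and the warning wording are all assumptions made for illustration.

```python
import sys

DEFAULT_ENGINE = "af_packet"


def _warn_stderr(message: str) -> None:
    # stderr is what makes the downgrade visible in the journal.
    print(message, file=sys.stderr)


def resolve_io_engine(env, warn=_warn_stderr) -> str:
    """Pick the scanner I/O engine from an environment mapping (sketch)."""
    requested = env.get("ANYSCAN_SCANNER_IO_ENGINE", "").strip()
    if not requested or requested == "af_packet":
        # Unset, blank, or explicit af_packet: silent default.
        return DEFAULT_ENGINE
    if requested != "af_xdp":
        # Illustrative wording, not the adapter's real log line.
        warn(f"[sketch] unrecognized ANYSCAN_SCANNER_IO_ENGINE={requested!r}; "
             "falling back to af_packet")
        return DEFAULT_ENGINE
    # Assumption: only the exact string "true" counts as available.
    if env.get("ANYSCAN_AF_XDP_AVAILABLE", "").strip() != "true":
        warn("[sketch] af_xdp requested but ANYSCAN_AF_XDP_AVAILABLE is not "
             "true; falling back to af_packet")
        return DEFAULT_ENGINE
    return "af_xdp"


def io_engine_flag(env, warn=_warn_stderr) -> str:
    """Hypothetical helper: the argv element build_command would append."""
    return f"--io-engine={resolve_io_engine(env, warn)}"


if __name__ == "__main__":
    import os
    print(io_engine_flag(os.environ))
```

Passing the environment and the warning sink as parameters keeps the sketch testable without touching the real process environment, which is one plausible way a test module could cover every row of the table.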

What is verified

  • python3 -m py_compile vulnscanner-zmap-adapter.py — clean
  • python3 -m unittest test_vulnscanner_adapter_multinic test_vulnscanner_adapter_io_engine test_anyscan_rate_controller -v -> 100/100 ok (31 + 16 + 53)
  • cargo build --workspace — clean (only pre-existing warnings)
  • cargo test --workspace -> 437/437 ok (no regressions)
  • bash -n install-worker-bundle.sh — clean

Out of scope

  • Scanner fork edits (PR D is AnyScan-side only).
  • anyscan_rate_controller.py and the multi-NIC adapter Python.
  • Prod runtime.env edits.
  • The live c6in.metal bench cycle (the orchestrator dispatches a separate worker after this merges).

🤖 Generated with Claude Code

…ine plumbing

Phase 2 PR D of plans/2026-04-27-portscan-afxdp-plan-v1.md §3.7. Wires
the runtime opt-in env knob into the Python adapter so workers can pick
between AF_PACKET (the default) and AF_XDP per host without rebuilding
the bundle. Pairs with PR C, which already wrote ANYSCAN_AF_XDP_AVAILABLE
into runtime.env after probing kernel >=5.10 + libxdp.so loadable, and
added CAP_BPF to anyscan-worker(-only).service.

- vulnscanner-zmap-adapter.py: new resolve_io_engine() reads
  ANYSCAN_SCANNER_IO_ENGINE and falls back to af_packet when:
    * the value is unset, blank, or unrecognized;
    * af_xdp is requested but ANYSCAN_AF_XDP_AVAILABLE is not true
      (warning emitted to stderr so the journal carries the downgrade).
  build_command appends --io-engine=<value> to scanner argv.
- runtime.worker.env.template: documents the new knob, including the
  CAP_BPF + AF_XDP availability prerequisites and the safe fallback.
- install-worker-bundle.sh: writes ANYSCAN_SCANNER_IO_ENGINE=af_packet
  on fresh installs (only when not already pinned, so in-place upgrades
  preserve operator choices).
- test_vulnscanner_adapter_io_engine.py: 16 unit tests covering env
  reads, default fallback, --io-engine flag composition, AF_XDP
  availability gating, and an end-to-end adapter spawn that confirms
  the scanner subprocess receives the correct flag.

Out of scope: scanner fork edits (PR D is AnyScan-side only),
anyscan_rate_controller.py changes, prod runtime.env edits, live bench.

Verified:
- python3 -m py_compile vulnscanner-zmap-adapter.py
- python3 -m unittest test_vulnscanner_adapter_multinic
  test_vulnscanner_adapter_io_engine test_anyscan_rate_controller
  -> 100/100 ok
- cargo build --workspace -> clean
- cargo test --workspace  -> 437/437 ok
- bash -n install-worker-bundle.sh -> ok
@skullcrushercmd skullcrushercmd merged commit b5da5fc into main Apr 28, 2026
@skullcrushercmd skullcrushercmd deleted the feat/scanner-io-engine-env branch April 28, 2026 03:00
@skullcrushercmd

Phase 1 deploy — AF_XDP build wire-up (#71) live on scan.anyvm.tech

Deploy timestamp: 2026-04-28T13:16:23Z (services restarted) — bundle published 13:17:48Z, api-rebuilt hosted bundle 13:19:46Z. Driven by anygpt-42.

Source HEADs

Build (with ANYSCAN_USE_AF_XDP=1)

  • install-external-deps.sh with ANYSCAN_USE_AF_XDP=1 cloned engine-c, ran apt-get install libxdp-dev libbpf-dev libelf-dev, and built with make USE_AF_XDP=1; the compile was clean (-DUSE_AF_XDP, -lxdp -lbpf -lelf -lz).
  • cargo build --release --bin anyscan-api --bin anyscan-worker (47s on 1.92.0 toolchain). Source /root/AnyGPT/apps/anyscan @ 989c44e.
  • package-worker-bundle.sh ANYSCAN_USE_AF_XDP=1 produced agent-bundle-linux-x86_64__20260428131650-2106120.tar.gz; api-driven rebuild via GET /api/agent/bundles/refresh produced agent-bundle-linux-x86_64__20260428131946-2105099-d14b6181b8a4.tar.gz (sha256 431823f1…dba017, 17,169,876B → published).

AF_XDP linkage proof

/opt/anyscan/bin/scanner (70,632B, mode 0755):

ldd:
  libxdp.so.1 => /lib/x86_64-linux-gnu/libxdp.so.1
  libbpf.so.1 => /lib/x86_64-linux-gnu/libbpf.so.1
  libelf.so.1 => /lib/x86_64-linux-gnu/libelf.so.1
strings: xsk_socket__fd | xsk_socket__create | xsk_socket__delete | af_xdp
--help: --io-engine=NAME    I/O engine: af_packet (default), pfring_zc, af_xdp

Bundled scanner (bin/scanner inside the published tarball, 60,992B stripped): identical libxdp.so.1 / libbpf.so.1 / libelf.so.1 linkage; same xsk_socket__* symbols.
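
A linkage check of this kind can be automated by parsing ldd output. The sketch below is an illustration only: the function name `missing_libs` and the default library list are assumptions, and it is not the check that install-external-deps.sh performs.

```python
def missing_libs(ldd_output: str, required=("libxdp.so", "libbpf.so", "libelf.so")):
    """Return the required library prefixes that ldd did not resolve (sketch)."""
    linked = set()
    for line in ldd_output.splitlines():
        # ldd prints "not found" for sonames it could not resolve.
        if "not found" in line:
            continue
        fields = line.split()
        if fields:
            # First column is the soname, e.g. "libxdp.so.1".
            linked.add(fields[0])
    return [lib for lib in required
            if not any(name.startswith(lib) for name in linked)]


if __name__ == "__main__":
    sample = """\
  libxdp.so.1 => /lib/x86_64-linux-gnu/libxdp.so.1
  libbpf.so.1 => /lib/x86_64-linux-gnu/libbpf.so.1
  libelf.so.1 => /lib/x86_64-linux-gnu/libelf.so.1
"""
    print(missing_libs(sample))
```

In practice the text would come from running ldd on the scanner binary; matching on the soname prefix avoids hard-coding the version suffix.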

Snapshots → pre-pr70-deploy.bak

| binary | before (bytes / mtime) | after (bytes / mtime) |
|---|---|---|
| anyscan-api | 38,032,936 / 2026-04-27 19:11 | 38,054,688 / 2026-04-28 13:16 |
| anyscan-worker | 17,793,816 / 2026-04-27 19:10 | 17,798,120 / 2026-04-28 13:15 |
| scanner | 36 (test-fixture stub) / 2026-04-28 12:46 | 70,632 / 2026-04-28 13:01 |

Service state post-restart

  • anyscan-api.service — active (running) since 13:16:23Z; listening on 0.0.0.0:8088; claim-wedge sweep healthy.
  • anyscan-worker.service — active (running). Pre-existing 401 register worker retry loop (local-anyscan-worker token mismatch) — orthogonal to AF_XDP; flagged for separate fix.

Phase 1.6 — existing-fleet xlarge

The pre-existing xlarge i-079b112b54c9552bc (registered as agent-20260428125346-8de2f8 on the AF_PACKET-only bundle …aa6ee0d04b2f) wedged at 13:08:52Z (last heartbeat) and was terminated at 13:08:56Z — a recurrence of the wedge that took out anygpt-4. POST /api/workers/{id}/remote-update returned 400 because the API rejects offline workers. The watchdog anyscan-ec2-worker-manager.service auto-launched replacement i-0c72660ea2b512c6f at 13:40:05Z (3.231.166.10); on bootstrap it pulls the latest hosted bundle (…d14b6181b8a4 — the AF_XDP one), so the deploy reaches the fleet via wedge-and-replace. Phase 2 will terminate this xlarge as part of the c6in.metal switch, so a manual remote-update would be moot.

Side-effects worth noting

  1. install-external-deps.sh rewrote ANYSCAN_EXTENSION_MANIFEST_PATHS and ANYSCAN_LOCAL_BOOTSTRAP_ARTIFACT_DIR in /etc/anyscan/runtime.env to source-tree paths (it was designed for dev workstations). Restored to the prod /opt/anyscan/extensions/ and /var/lib/anyscan/bootstrap-artifacts values before service restart.
  2. /opt/anyscan/bin/scanner was a 36-byte cached-scanner-afxdp echo stub (timestamp 2026-04-28T12:46:57Z, ~6 min before #71 merged) — looks like a leftover from tools/test-install-external-deps-afxdp.sh. Local worker was in the 401 retry loop and never invoked it; install-external-deps.sh's binary_has_afxdp_linkage check correctly forced a rebuild.
  3. First cargo build hit a transient rustup proxy race (.partial for rust-src / clippy vanishing mid-rename) — resolved by setting RUSTUP_TOOLCHAIN=1.92.0-x86_64-unknown-linux-gnu and invoking the toolchain cargo directly. systemd ExecStartPre build of api succeeded with cached artifacts on restart.

Phase 2 (c6in.metal AF_PACKET 8-NIC baseline + AF_XDP 1-NIC + AF_XDP 8-NIC cap=4 bench cycle) starts next.

@skullcrushercmd

Phase 1 deploy — env-knob plumbing (PRs #73 / #74 / #75 / #76) live on scan.anyvm.tech

Deploy timestamp: 2026-04-28T17:36:31Z (binary swap) → 17:36:52Z (api restart green after a 17:36:38Z status=233/RUNTIME_DIRECTORY transient that Restart=always recovered automatically) — bundle published 2026-04-28T17:41:01Z. Driven by anygpt-48 (continuation of the #71 deploy in anygpt-42).

Source HEADs

Build

  • cargo build --release --bin anyscan-api --bin anyscan-worker against /root/AnyGPT/apps/anyscan @ b62ff2e finished in 23.01s (mostly cached against the #71 deploy).
  • systemctl restart anyscan-api re-ran the ExecStartPre cargo build (--locked --target-dir /var/lib/anyscan/build-target) which finished in 8.85s — same source tree, two warnings (public_page / hash_path_recursively unused; pre-existing).
  • package-worker-bundle.sh ANYSCAN_USE_AF_XDP=1 ANYSCAN_INSTALL_KERNEL_BACKPORT=1 ANYSCAN_USE_PFRING_ZC=0 produced
    • agent-bundle-linux-x86_64__anygpt-48-afxdp-kbackport-20260428174058.tar.gz (17,168,505 B)
    • sha256 775cf88e1a3434846a038f210c1aa841e10efa215fbcb689cef7d6db34d930c0

Knob propagation proof — bundle README records all three env vars (PR #75 §wire-up):

$ grep -E 'use_af_xdp|use_pfring_zc|install_kernel_backport' README.txt
  use_af_xdp: 1
  use_pfring_zc: 0
  install_kernel_backport: 1
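
The same check can be done programmatically. The following is a minimal sketch that assumes the README records each knob as a `key: value` line, as the grep output shows; the function name `parse_build_flags` is hypothetical.

```python
def parse_build_flags(readme_text: str,
                      keys=("use_af_xdp", "use_pfring_zc", "install_kernel_backport")):
    """Extract the recorded build knobs from a bundle README (sketch)."""
    flags = {}
    for line in readme_text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        # Only keep lines that look like "<known key>: <value>".
        if sep and name in keys:
            flags[name] = value.strip()
    return flags


if __name__ == "__main__":
    sample = "  use_af_xdp: 1\n  use_pfring_zc: 0\n  install_kernel_backport: 1\n"
    print(parse_build_flags(sample))
```

Values are kept as strings so the caller decides how to interpret them.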

Scanner linkage — bundled bin/scanner (60,992 B stripped):

$ ldd bin/scanner | grep -E 'libxdp|libbpf|libpfring'
  libxdp.so.1 => /lib/x86_64-linux-gnu/libxdp.so.1
  libbpf.so.1 => /lib/x86_64-linux-gnu/libbpf.so.1
  (libpfring intentionally absent — PR #75 PF_RING build path skipped)

Matches the AF_XDP linkage from the #71 deploy; no PF_RING libpfring linkage as required (PR #75 engine-cluster init stub is acknowledged in the brief).

PR #74 launch-path verification (manager-side)

The 15-ENI distribution logic is exercised by tools/test_ec2_worker_manager.py's 40-case suite, which I re-ran on this host's freshly-pulled submodule:

$ python3 -m unittest tools.test_ec2_worker_manager
Ran 40 tests in 0.003s — OK

…including test_max_enis_15_on_c6in_metal_emits_15_network_interfaces and test_max_enis_15_on_c6in_xlarge_clamps_to_4. Live RunInstances call against a real c6in.metal still pending (this PR's "Test plan" unchecked item) — that lands in Phase 2.
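
The clamping behavior that those two test names describe can be sketched as below. The limits table holds only the values implied by the test names (15 interfaces emitted on c6in.metal, a clamp to 4 on c6in.xlarge); it is an assumption for illustration, not a statement of AWS's documented per-instance limits, and the function names and spec shape are hypothetical rather than the manager's real code.

```python
# Assumed limits, taken from the test names above; not authoritative AWS data.
ASSUMED_ENI_LIMITS = {"c6in.xlarge": 4, "c6in.metal": 15}


def effective_eni_count(instance_type: str, requested: int) -> int:
    """Clamp the requested ENI count to the assumed per-type limit."""
    limit = ASSUMED_ENI_LIMITS.get(instance_type, 1)  # unknown types: 1 ENI
    return max(1, min(requested, limit))


def network_interface_specs(instance_type: str, requested: int):
    """Build one minimal interface entry per ENI, indexed from 0."""
    count = effective_eni_count(instance_type, requested)
    return [{"DeviceIndex": index} for index in range(count)]


if __name__ == "__main__":
    print(len(network_interface_specs("c6in.metal", 15)))
    print(len(network_interface_specs("c6in.xlarge", 15)))
```

Falling back to a single interface for unknown instance types is a conservative choice for the sketch; the real manager may behave differently.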

Snapshots → pre-pr76-deploy.bak

| binary | before (bytes / mtime) | after (bytes / mtime) |
|---|---|---|
| anyscan-api | 38,054,688 / 2026-04-28 13:16 | 38,067,096 / 2026-04-28 17:36 |
| anyscan-worker | 17,798,120 / 2026-04-28 13:15 | 17,797,592 / 2026-04-28 17:36 |
| scanner (host stub) | 36 / 2026-04-28 17:30 | 36 / 2026-04-28 17:30 (unchanged — cached-scanner-afxdp echo wrapper is never invoked on the control-plane host; the real scanner ships in the bundle) |

Service state post-restart

  • anyscan-api.service — active (running) since 2026-04-28T17:36:52Z (MainPID 3197942); auto-built a default bundle (agent-bundle-linux-x86_64__20260428173910-3197942-1159249ed3e2.tar.gz) on startup as expected.
  • anyscan-worker.service — active (running). The pre-existing 401 register worker retry loop on local-anyscan-worker (flagged in the #71 deploy comment) is still present — its journal entries at 17:35:51Z–17:36:28Z prove it predates this deploy by ~5 min and is therefore not a regression. Still orthogonal to AF_XDP / kernel-backport / 15-ENI; tracking it stays a separate fix.

Side-effects worth noting

  1. The first restart attempt (17:36:38Z) hit status=233/RUNTIME_DIRECTORY because /run/anyscan had been transiently torn down between the worker stop and the api start. Restart=always retried at 17:36:52Z and succeeded — /run/anyscan recreated with anyscan:anyscan 0755 ownership. No operator intervention; flagging because future deploys may want a RuntimeDirectoryPreserve=yes or a small ordering tweak in the unit if this becomes a habit.
  2. /etc/anyscan/runtime.env was not modified this round (the install-external-deps.sh side-effects from the #71 deploy were avoided by going straight through package-worker-bundle.sh rather than re-running install-deps).
  3. PF_RING gating from PR #76 (ANYSCAN_PFRING_ZC_AVAILABLE) is recorded as use_pfring_zc: 0 in the bundle README, so even if a downstream operator later flipped the runtime knob the install-side gate would catch the missing libpfring linkage.

Linkage to the env-knob PRs

| PR | env knob | live verification |
|---|---|---|
| #73 | ANYSCAN_INSTALL_KERNEL_BACKPORT=1 | recorded in bundle README; tools/test-install-external-deps-kernel-backport.sh present in submodule and runnable; actual kernel install fires on the c6in.metal in Phase 2c |
| #74 | ANYSCAN_MAX_ENIS=15 | 40-test unit suite green on host (incl. 15-ENI placement); live RunInstances call in Phase 2b |
| #75 | ANYSCAN_USE_PFRING_ZC=1 | bundle path's PF_RING branch is deliberately not exercised here — PF_RING engine init is a stub; bundled OFF (use_pfring_zc: 0) |
| #76 | ANYSCAN_PFRING_ZC_AVAILABLE | gate implicit — the gate would have rejected a libpfring-less scanner if PR #75's path had been taken; not exercised this run |

Phase 2 (c6in.metal ANYSCAN_MAX_ENIS=15 launch + kernel-backport bootstrap + AF_PACKET 15-NIC baseline + AF_XDP 15-NIC cap=4 [+ zerocopy if ena_xdp_zc unlocks post-reboot]) starts next; the bench-results table will land on PR #65 to keep the AF_XDP-plan thread together with the prior 8-NIC numbers.

@skullcrushercmd

Local deploy on scan.anyvm.tech (anygpt-52)

Deploy timestamp: 2026-04-28T20:19:00Z

Source heads:

Bundle:

  • Name: agent-bundle-linux-x86_64__20260428201803-75359.tar.gz
  • SHA256 (bundle): 3726e04b57bee78fb6b8e38434a696ec6074b8b40c730020e86dabb62cc47ded
  • SHA256 (scanner inside bundle): 19a0435964c51919ec95c68992671a90fc13ca14f8964f90f3a2750aac218009
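
A downloaded bundle can be compared against a published digest like the ones above with a short sketch such as this one. It uses only Python's standard hashlib; the function names are hypothetical and the path in the usage example is a placeholder.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large bundles are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    import sys
    # Usage: python verify_bundle.py <bundle.tar.gz> <expected sha256>
    if len(sys.argv) == 3:
        print(matches_published(sys.argv[1], sys.argv[2]))
```

Only a full digest can be verified this way; a truncated digest is not enough input for the comparison.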

Build flags: ANYSCAN_USE_AF_XDP=1 ANYSCAN_USE_DPDK=1 ANYSCAN_INSTALL_KERNEL_BACKPORT=0

Linkage proof — ldd $(scanner) filtered to XDP/DPDK libs (full list):

libxdp.so.1
libbpf.so.1
librte_eal.so.25
librte_ethdev.so.25
librte_mbuf.so.25
librte_mempool.so.25
librte_kvargs.so.25
librte_log.so.25
librte_telemetry.so.25
librte_net.so.25
librte_ring.so.25
librte_meter.so.25

Both libxdp.so (AF_XDP) and librte_eal.so (DPDK) are present — bundle is correctly built for the c6in.metal AF_XDP+DPDK bench.

Services restarted on scan.anyvm.tech:

  • anyscan-api.service (rebuilt via systemd ExecStartPre cargo, sha256 aef04372…)
  • anyscan-worker.service (also restarted; local worker token rotated to ewt_1704_… via fresh enrollment token to clear pre-existing 401 spam — unrelated to fleet workers).

Note on engine-c structure: anyscan-engine-c lives at /root/AnyGPT/anyscan-engine-c/ (top-level sibling of apps/anyscan/), which is the path package-worker-bundle.sh derives via SCRIPT_DIR/../../anyscan-engine-c. It is not a registered git submodule of either AnyGPT or AnyScan (no .gitmodules entry); it's a peer working-tree clone of AnyVM-Tech/anyscan-engine-c. Consider promoting to a real submodule in a follow-up to make ownership explicit.

— posted from session anygpt-52, en route to c6in.metal bench gauntlet (PR #65 will get the bench results comment).

@skullcrushercmd

anygpt-55 deploy proof — fresh bundle for c6in.metal bench cycle

Deploy timestamp: 2026-04-29T11:51Z

| Component | SHA | Notes |
|---|---|---|
| AnyScan main | 360c108 | PRs #80#86 in (kernel-version probe, librte-net-ena25 install, ip-link-down setup-dpdk, hugetlbfs mount, public-IP knob) |
| anyscan-engine-c main | a22c8f2 | PR #5 fixes (AF_XDP teardown on mode-ladder, DPDK nb_tx_desc clamp via tx_desc_lim.nb_max, EAL argv split) |
| Bundle | agent-bundle-linux-x86_64__20260429115124-2670063.tar.gz | sha256 8bfdd5c3…0837bc25 |

Bundle scanner linkage:

$ ldd bin/scanner | grep -E 'libxdp|librte_eal|libbpf'
libxdp.so.1   => /lib/x86_64-linux-gnu/libxdp.so.1
libbpf.so.1   => /lib/x86_64-linux-gnu/libbpf.so.1
librte_eal.so.25 => /lib/x86_64-linux-gnu/librte_eal.so.25

Built with ANYSCAN_USE_AF_XDP=1 ANYSCAN_USE_DPDK=1 ANYSCAN_INSTALL_KERNEL_BACKPORT=0 (skip kernel backport — anyscan_afxdp_ena_constraint memory says 6.19.11 ena.ko still has zero xsk_* symbols, so the backport doesn't unlock zerocopy on ENA; drv+copy is the ceiling either way).

Proceeding with c6in.metal bench gauntlet.
