feat(portscan): ANYSCAN_SCANNER_IO_ENGINE env knob + adapter --io-engine plumbing #70
Phase 2 PR D of plans/2026-04-27-portscan-afxdp-plan-v1.md §3.7. Wires
the runtime opt-in env knob into the Python adapter so workers can pick
between AF_PACKET (the default) and AF_XDP per host without rebuilding
the bundle. Pairs with PR C, which already wrote ANYSCAN_AF_XDP_AVAILABLE
into runtime.env after probing kernel >=5.10 + libxdp.so loadable, and
added CAP_BPF to anyscan-worker(-only).service.
- vulnscanner-zmap-adapter.py: new resolve_io_engine() reads
  ANYSCAN_SCANNER_IO_ENGINE and falls back to af_packet when:
    * the value is unset, blank, or unrecognized;
    * af_xdp is requested but ANYSCAN_AF_XDP_AVAILABLE is not true
      (warning emitted to stderr so the journal carries the downgrade).
  build_command appends --io-engine=<value> to scanner argv.
- runtime.worker.env.template: documents the new knob, including the
  CAP_BPF + AF_XDP availability prerequisites and the safe fallback.
- install-worker-bundle.sh: writes ANYSCAN_SCANNER_IO_ENGINE=af_packet
  on fresh installs (only when not already pinned, so in-place upgrades
  preserve operator choices).
- test_vulnscanner_adapter_io_engine.py: 16 unit tests covering env
  reads, default fallback, --io-engine flag composition, AF_XDP
  availability gating, and an end-to-end adapter spawn that confirms
  the scanner subprocess receives the correct flag.
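
The resolution rules listed above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: the function signatures, the `env` parameter, the whitespace-only normalization, and the wording of the warning are all assumptions made for the example.

```python
import os
import sys

DEFAULT_IO_ENGINE = "af_packet"
KNOWN_IO_ENGINES = ("af_packet", "af_xdp")


def resolve_io_engine(env=None):
    """Pick the scanner I/O engine from the environment (hypothetical signature)."""
    env = os.environ if env is None else env
    requested = env.get("ANYSCAN_SCANNER_IO_ENGINE", "").strip()
    if requested not in KNOWN_IO_ENGINES:
        # Unset, blank, and unrecognized values all resolve to the default.
        return DEFAULT_IO_ENGINE
    if requested == "af_xdp":
        available = env.get("ANYSCAN_AF_XDP_AVAILABLE", "").strip()
        if available != "true":
            # Illustrative warning text; goes to stderr so the journal records it.
            print(
                "[anyscan-adapter] af_xdp requested but not available; "
                "falling back to af_packet",
                file=sys.stderr,
            )
            return DEFAULT_IO_ENGINE
    return requested


def build_command(base_argv, env=None):
    """Append the resolved engine flag to a copy of the scanner argv."""
    return list(base_argv) + [f"--io-engine={resolve_io_engine(env)}"]
```

Passing the environment as a plain mapping keeps the function easy to unit-test without mutating `os.environ`; whether the real adapter does this is not stated in the commit message.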
Out of scope: scanner fork edits (PR D is AnyScan-side only),
anyscan_rate_controller.py changes, prod runtime.env edits, live bench.
Verified:
- python3 -m py_compile vulnscanner-zmap-adapter.py
- python3 -m unittest test_vulnscanner_adapter_multinic
    test_vulnscanner_adapter_io_engine test_anyscan_rate_controller
    -> 100/100 ok
- cargo build --workspace -> clean
- cargo test --workspace -> 437/437 ok
- bash -n install-worker-bundle.sh -> ok
Phase 1 deploy: AF_XDP build wire-up (#71) live on scan.anyvm.tech

Deploy timestamp: 2026-04-28T13:16:23Z (services restarted); bundle published 13:17:48Z, api-rebuilt hosted bundle 13:19:46Z. Driven by anygpt-42.

Source HEADs

Build (with

| binary | before (bytes / mtime) | after (bytes / mtime) |
|---|---|---|
| anyscan-api | 38,032,936 / 2026-04-27 19:11 | 38,054,688 / 2026-04-28 13:16 |
| anyscan-worker | 17,793,816 / 2026-04-27 19:10 | 17,798,120 / 2026-04-28 13:15 |
| scanner | 36 (test-fixture stub) / 2026-04-28 12:46 | 70,632 / 2026-04-28 13:01 |
Service state post-restart
- `anyscan-api.service`: active (running) since 13:16:23Z; listening on `0.0.0.0:8088`; claim-wedge sweep healthy.
- `anyscan-worker.service`: active (running). Pre-existing 401 `register worker` retry loop (`local-anyscan-worker` token mismatch), orthogonal to AF_XDP; flagged for separate fix.
Phase 1.6 — existing-fleet xlarge
The pre-existing xlarge i-079b112b54c9552bc (registered as agent-20260428125346-8de2f8 on the AF_PACKET-only bundle …aa6ee0d04b2f) wedged at 13:08:52Z (last heartbeat) and was terminated at 13:08:56Z — a recurrence of the wedge that took out anygpt-4. POST /api/workers/{id}/remote-update returned 400 because the API rejects offline workers. The watchdog anyscan-ec2-worker-manager.service auto-launched replacement i-0c72660ea2b512c6f at 13:40:05Z (3.231.166.10); on bootstrap it pulls the latest hosted bundle (…d14b6181b8a4 — the AF_XDP one), so the deploy reaches the fleet via wedge-and-replace. Phase 2 will terminate this xlarge as part of the c6in.metal switch, so a manual remote-update would be moot.
Side-effects worth noting
- `install-external-deps.sh` rewrote `ANYSCAN_EXTENSION_MANIFEST_PATHS` and `ANYSCAN_LOCAL_BOOTSTRAP_ARTIFACT_DIR` in `/etc/anyscan/runtime.env` to source-tree paths (it was designed for dev workstations). Restored to the prod `/opt/anyscan/extensions/` and `/var/lib/anyscan/bootstrap-artifacts` values before service restart.
- `/opt/anyscan/bin/scanner` was a 36-byte `cached-scanner-afxdp` echo stub (timestamp 2026-04-28T12:46:57Z, ~6 min before #71 merged), which looks like a leftover from `tools/test-install-external-deps-afxdp.sh`. Local worker was in the 401 retry loop and never invoked it; install-external-deps.sh's `binary_has_afxdp_linkage` check correctly forced a rebuild.
- First cargo build hit a transient rustup proxy race (`.partial` for `rust-src`/`clippy` vanishing mid-rename), resolved by setting `RUSTUP_TOOLCHAIN=1.92.0-x86_64-unknown-linux-gnu` and invoking the toolchain cargo directly. systemd `ExecStartPre` build of api succeeded with cached artifacts on restart.
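
For readers unfamiliar with this kind of linkage gate, here is a rough Python sketch of how a check like `binary_has_afxdp_linkage` might decide that a 36-byte stub needs a rebuild. The real check lives in `install-external-deps.sh` and its implementation is not shown in this thread; the `ldd`-based approach and the helper name `ldd_output_has_libxdp` are assumptions for illustration only.

```python
import subprocess


def ldd_output_has_libxdp(ldd_output):
    """Return True if any line of ldd output names a libxdp shared object."""
    for line in ldd_output.splitlines():
        # On each ldd line the first whitespace-separated token is the library name.
        tokens = line.split()
        if tokens and tokens[0].startswith("libxdp.so"):
            return True
    return False


def binary_has_afxdp_linkage(path):
    """Hypothetical re-implementation: ask ldd whether `path` links libxdp."""
    try:
        result = subprocess.run(
            ["ldd", path], capture_output=True, text=True, check=False
        )
    except OSError:
        # ldd itself is missing or could not be executed; treat as "not linked".
        return False
    if result.returncode != 0:
        # Missing files and non-ELF files (e.g. a shell stub) make ldd exit non-zero.
        return False
    return ldd_output_has_libxdp(result.stdout)
```

Under these assumptions a shell echo wrapper is reported as not linked, which is the outcome that would force the rebuild described above.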
Phase 2 (c6in.metal AF_PACKET 8-NIC baseline + AF_XDP 1-NIC + AF_XDP 8-NIC cap=4 bench cycle) starts next.
Phase 1 deploy: env-knob plumbing (PRs #73 / #74 / #75 / #76) live on scan.anyvm.tech

Deploy timestamp: 2026-04-28T17:36:31Z (binary swap) → 17:36:52Z (api restart green after a 17:36:38Z

Source HEADs

Build

Knob propagation proof: bundle README records all three env vars (PR #75 §wire-up):

Scanner linkage: bundled

| binary | before (bytes / mtime) | after (bytes / mtime) |
|---|---|---|
| anyscan-api | 38,054,688 / 2026-04-28 13:16 | 38,067,096 / 2026-04-28 17:36 |
| anyscan-worker | 17,798,120 / 2026-04-28 13:15 | 17,797,592 / 2026-04-28 17:36 |
| scanner (host stub) | 36 / 2026-04-28 17:30 | 36 / 2026-04-28 17:30 (unchanged — cached-scanner-afxdp echo wrapper is never invoked on the control-plane host; the real scanner ships in the bundle) |
Service state post-restart
- `anyscan-api.service`: active (running) since 2026-04-28T17:36:52Z (MainPID 3197942); auto-built a default bundle (`agent-bundle-linux-x86_64__20260428173910-3197942-1159249ed3e2.tar.gz`) on startup as expected.
- `anyscan-worker.service`: active (running). The pre-existing `401 register worker` retry loop on `local-anyscan-worker` (flagged in the #71 deploy comment) is still present; its journal entries at 17:35:51Z–17:36:28Z prove it predates this deploy by ~5 min and is therefore not a regression. Still orthogonal to AF_XDP / kernel-backport / 15-ENI; tracking it stays a separate fix.
Side-effects worth noting
- The first restart attempt (17:36:38Z) hit `status=233/RUNTIME_DIRECTORY` because `/run/anyscan` had been transiently torn down between the worker stop and the api start. `Restart=always` retried at 17:36:52Z and succeeded; `/run/anyscan` was recreated with `anyscan:anyscan` 0755 ownership. No operator intervention; flagging because future deploys may want a `RuntimeDirectoryPreserve=yes` or a small ordering tweak in the unit if this becomes a habit.
- `/etc/anyscan/runtime.env` was not modified this round (the install-external-deps.sh side-effects from the #71 deploy were avoided by going straight through `package-worker-bundle.sh` rather than re-running install-deps).
- PF_RING gating from PR #76 (`ANYSCAN_PFRING_ZC_AVAILABLE`) is recorded as `use_pfring_zc: 0` in the bundle README, so even if a downstream operator later flipped the runtime knob the install-side gate would catch the missing libpfring linkage.
Linkage to the env-knob PRs
| PR | env knob | live verification |
|---|---|---|
| #73 | `ANYSCAN_INSTALL_KERNEL_BACKPORT=1` | recorded in bundle README; `tools/test-install-external-deps-kernel-backport.sh` present in submodule and runnable; actual kernel install fires on the c6in.metal in Phase 2c |
| #74 | `ANYSCAN_MAX_ENIS=15` | 40-test unit suite green on host (incl. 15-ENI placement); live RunInstances call in Phase 2b |
| #75 | `ANYSCAN_USE_PFRING_ZC=1` | bundle path's PF_RING branch is deliberately not exercised here: PF_RING engine init is a stub; bundled OFF (`use_pfring_zc: 0`) |
| #76 | `ANYSCAN_PFRING_ZC_AVAILABLE` gate | implicit: the gate would have rejected a libpfring-less scanner if PR #75's path had been taken; not exercised this run |

Phase 2 (c6in.metal ANYSCAN_MAX_ENIS=15 launch + kernel-backport bootstrap + AF_PACKET 15-NIC baseline + AF_XDP 15-NIC cap=4 [+ zerocopy if ena_xdp_zc unlocks post-reboot]) starts next; the bench-results table will land on PR #65 to keep the AF_XDP-plan thread together with the prior 8-NIC numbers.
Local deploy on scan.anyvm.tech (anygpt-52)

Deploy timestamp: 2026-04-28T20:19:00Z

Source heads:

Bundle:

Build flags:

Linkage proof: Both

Services restarted on scan.anyvm.tech:

Note on engine-c structure:

Posted from session anygpt-52, en route to c6in.metal bench gauntlet (PR #65 will get the bench results comment).

anygpt-55 deploy proof: fresh bundle for c6in.metal bench cycle

Deploy timestamp:

Bundle scanner linkage:

Built with

Proceeding with c6in.metal bench gauntlet.

Summary

Phase 2 PR D of `plans/2026-04-27-portscan-afxdp-plan-v1.md` §3.7: the runtime opt-in shape that lets a worker pick the scanner I/O backend without rebuilding the bundle. AnyScan-side only; the scanner fork already lands `--io-engine={af_packet,af_xdp}` (PRs A/B/C merged). PR C already shipped `ANYSCAN_AF_XDP_AVAILABLE` (kernel >=5.10 + libxdp.so probe) and `CAP_BPF` on the worker units, so this PR is the final adapter-layer wire-up.

Changes

- `vulnscanner-zmap-adapter.py`: new `resolve_io_engine()` reads `ANYSCAN_SCANNER_IO_ENGINE` and emits the value `build_command` appends as `--io-engine=<value>` on the scanner argv. AF_PACKET is the unconditional default and the unconditional fallback.
- `runtime.worker.env.template`: documents the new knob, including the `ANYSCAN_AF_XDP_AVAILABLE=true` + `CAP_BPF` prerequisites for `af_xdp`, and the silent-default behavior on AF_PACKET.
- `install-worker-bundle.sh`: writes `ANYSCAN_SCANNER_IO_ENGINE=af_packet` on fresh installs (only when not already pinned, so in-place upgrades preserve operator choices).
- `test_vulnscanner_adapter_io_engine.py`: new 16-test module covering env reads, default behavior, the `--io-engine` flag composition through `build_command`, and an end-to-end adapter spawn against a stub scanner that confirms the subprocess receives the right flag.

Runtime opt-in shape

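| `ANYSCAN_SCANNER_IO_ENGINE` | `ANYSCAN_AF_XDP_AVAILABLE` | scanner argv |
|---|---|---|
| `af_packet` | | `--io-engine=af_packet` |
| `af_xdp` | `true` | `--io-engine=af_xdp` |
| `af_xdp` | `false` / unset | `--io-engine=af_packet` |
| | | `--io-engine=af_packet` |
| | | `--io-engine=af_packet` |

Fallback behavior
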
If an operator sets `af_xdp` on a host where the install-time probe set `ANYSCAN_AF_XDP_AVAILABLE=false` (kernel <5.10 or no libxdp), the adapter does not crash the scanner: it logs `[anyscan-adapter] ANYSCAN_SCANNER_IO_ENGINE=af_xdp requested but ANYSCAN_AF_XDP_AVAILABLE!=true; ... Falling back to af_packet.` to stderr and forwards `--io-engine=af_packet`. The warning is loud on purpose so a misconfigured worker is visible in `journalctl -u agentd` instead of silently scanning at AF_PACKET speeds.

What is verified

- `python3 -m py_compile vulnscanner-zmap-adapter.py`: clean
- `python3 -m unittest test_vulnscanner_adapter_multinic test_vulnscanner_adapter_io_engine test_anyscan_rate_controller -v`: 100/100 ok (31 + 16 + 53)
- `cargo build --workspace`: clean (only pre-existing warnings)
- `cargo test --workspace`: 437/437 ok (no regressions)
- `bash -n install-worker-bundle.sh`: clean

Out of scope

- `anyscan_rate_controller.py` and the multi-NIC adapter Python.
- `runtime.env` edits.
- `c6in.metal` bench cycle (the orchestrator dispatches a separate worker after this merges).

🤖 Generated with Claude Code