fix(build): wire ANYSCAN_USE_DPDK=1 through install-external-deps + bundle + deploy + adapter#81
…undle + deploy + adapter

Phase 2 wire-up for the DPDK io_engine landing in AnyVM-Tech/anyscan-engine-c PR #4. Mirrors PR #71's AF_XDP wire-up shape across the install / bundle / deploy / adapter / install-time-probe chain so the engine repo's USE_DPDK=1 build flag actually reaches every producer of a worker bundle, and so the runtime --io-engine=dpdk knob plumbed through ANYSCAN_SCANNER_IO_ENGINE has DPDK code to dispatch to.

Why DPDK now: AWS ENA on kernel ≤6.12.74 forces AF_XDP into drv+copy mode, capping c6in.metal at ~22M pps aggregate (memory: anyscan_afxdp_ena_constraint, also PR #65 issuecomment-4338158487: 6.19.11 STILL does not have ena_xdp_zc). DPDK bypasses the kernel ENA driver entirely via vfio-pci and removes the syscall-kick + lower-half-channels-only ZC constraint.

What lands here:

- install-external-deps.sh: ANYSCAN_USE_DPDK env knob; binary_has_dpdk_linkage probe (librte_eal.so via ldd → readelf -d); install_dpdk_build_deps (libdpdk-dev + dpdk apt-get, fail-open); cache short-circuit invalidation when cached binary lacks DPDK linkage; vulnscanner_make_args extension; post-build assertion.
- package-worker-bundle.sh: same env knob, linkage probe, rebuild_scanner_with_dpdk helper, bundle_engine_make_args, README.txt use_dpdk field. Composes with USE_AF_XDP=1 USE_PFRING_ZC=1: the earliest matching rebuild block produces a binary linked against every requested engine in a single make invocation.
- deploy.sh: same env knob, linkage probe, make_args extension, pre-DPDK cached-binary drop, post-build assertion.
- install-worker-bundle.sh: binary_has_dpdk_linkage, probe_dpdk_runtime_available (5 gates: scanner USE_DPDK-built, librte_eal.so loadable, vfio_pci kernel module, hugepages reserved in /sys/kernel/mm/hugepages/*, /dev/vfio/vfio present), apply_dpdk_availability writing ANYSCAN_DPDK_AVAILABLE.
- vulnscanner-zmap-adapter.py: SUPPORTED_IO_ENGINES gains "dpdk"; _IO_ENGINE_AVAILABILITY_KEYS maps "dpdk" → ANYSCAN_DPDK_AVAILABLE so the same fall-back-with-warning path the AF_XDP / PF_RING ZC plumbing already exercises picks up dpdk for free.
- runtime.worker.env.template: full DPDK section documenting ANYSCAN_USE_DPDK (build-time), ANYSCAN_DPDK_AVAILABLE (install probe), ANYSCAN_DPDK_PCI_BDFS (BDF / iface CSV), and ANYSCAN_DPDK_HUGEPAGES_GB (default 4).
- tools/setup-dpdk.sh (NEW, ~370 LOC): bind / unbind / status subcommands. Reserves hugepages (1 GiB pages preferred, falls back to 2 MiB), modprobe vfio-pci, dpdk-devbind.py --bind=vfio-pci. Idempotent (re-runs are no-ops). Reversible (`unbind` returns the NICs to ena and frees hugepages). Refuses to bind eth0 (agentd control-plane interface) and refuses to bind the only NIC. THP gets switched to "never" on bind (DPDK + THP fragments the static hugepage pool).
- tools/test-install-external-deps-dpdk.sh (NEW, ~270 LOC): mirrors test-install-external-deps-afxdp.sh. Four cases × multiple assertions:
  - default unset → no USE_DPDK=1 in make argv;
  - opt-in + missing scanner → USE_DPDK=1;
  - opt-in + cached non-DPDK binary → make clean + USE_DPDK=1;
  - opt-in + cached DPDK-linked binary → no rebuild.
  Stubs make/git/ldd/readelf so it runs hermetically.
- test_vulnscanner_adapter_io_engine.py: 7 new DPDK assertions covering the dpdk-with-runtime-available, dpdk-without-runtime-fall-back-with-warning, missing-availability-var, uppercase normalization, and cross-engine availability isolation cases. Updated test_invalid_value_falls_back_to_af_packet_with_warning to use "fake_engine" instead of "dpdk", since dpdk is now valid.

Verification (on Debian bookworm with libdpdk-dev 24.11 installed):

- tools/test-install-external-deps-afxdp.sh: 11/11 (regression OK).
- tools/test-install-external-deps-pfring-zc.sh: 10/10 (regression OK).
- tools/test-install-external-deps-dpdk.sh: 10/10.
- python3 -m unittest discover: 116/116 (32 in test_vulnscanner_adapter_io_engine, of which 7 are DPDK-specific).
- All bash scripts parse cleanly via `bash -n`.
- tools/setup-dpdk.sh status runs cleanly (no NICs bound, expected).

Engine PR for io_engine_dpdk: AnyVM-Tech/anyscan-engine-c#4

Out of scope (separate workers per the plan):

- Phase 2 systemd unit edit adding CAP_SYS_RAWIO/CAP_IPC_LOCK/CAP_NET_ADMIN to anyscan-worker.service. Documented in the env template. Until that lands operators must add caps manually before flipping the runtime knob.
- Live c6in.metal bench (plan §5.3).
- AMI rebuild.
- mlx5 / non-AWS hardware support.

Refs: plans/2026-04-28-portscan-dpdk-impl-v1.md (§3.10 wire-up, §3.11 NIC-binding decision, §4.3 kernel feature checks, §5.7 unit test shape).

anygpt-50

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 73dbf98418
    if [ ! -e /dev/vfio/vfio ]; then
      # The vfio control char device is created by the vfio-pci module
Gate DPDK readiness on a bound VFIO device, not /dev/vfio/vfio
probe_dpdk_runtime_available treats /dev/vfio/vfio as proof that a NIC is bound for DPDK, but that node can exist as soon as VFIO is loaded even when no PCI device is attached to vfio-pci. In that state this probe can write ANYSCAN_DPDK_AVAILABLE=true, so the adapter forwards --io-engine=dpdk and the scanner can still fail at runtime due to zero usable DPDK ports. The check should verify at least one bound device (for example via /sys/bus/pci/drivers/vfio-pci/* or a /dev/vfio/<group> node) rather than only the control device.
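One way to express the stricter gate suggested here, as a minimal Python sketch rather than the installer's actual shell code. The function names `vfio_bound_devices` and `dpdk_vfio_gate` and the `sysfs_root` / `dev_root` parameters are hypothetical, introduced only so the check can be exercised against a fake tree. The sketch relies on the standard sysfs layout in which each device bound to a PCI driver appears as an entry named by its PCI address under `/sys/bus/pci/drivers/<driver>/`.

```python
import re
from pathlib import Path

# Matches a full PCI address such as 0000:00:06.0 (domain:bus:device.function).
_BDF_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def vfio_bound_devices(sysfs_root="/sys"):
    """Return the PCI addresses currently bound to the vfio-pci driver."""
    driver_dir = Path(sysfs_root) / "bus" / "pci" / "drivers" / "vfio-pci"
    if not driver_dir.is_dir():
        return []  # vfio-pci is not loaded, so nothing can be bound
    # Skip control files such as bind/unbind/new_id; keep only BDF-named entries.
    return sorted(p.name for p in driver_dir.iterdir() if _BDF_RE.match(p.name))


def dpdk_vfio_gate(sysfs_root="/sys", dev_root="/dev"):
    """True only if the VFIO control node exists AND at least one device is bound."""
    control_node = Path(dev_root) / "vfio" / "vfio"
    return control_node.exists() and bool(vfio_bound_devices(sysfs_root))
```

With this shape, a host where VFIO is loaded but no NIC has been bound reports the gate as closed, which is the case the comment is concerned about.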
        printf '[!] %s: 1 GiB hugepages reservation fell short (got %s, wanted %s); falling back to 2 MiB.\n' \
          "$SCRIPT_NAME" "$current" "$target_gb" >&2
      fi
    fi
    if [ -d "$hp2m_dir" ]; then
Subtract partial 1GiB reservations before 2MiB fallback
When 1GiB hugepage reservation partially succeeds, the fallback path immediately requests the full target_gb * 512 2MiB pages without accounting for already-reserved 1GiB pages. On fragmented hosts this can over-reserve memory beyond the configured target (e.g., partial 1GiB success plus full 2MiB fallback), which can unnecessarily starve system memory during bind. The fallback should either clear partial 1GiB pages first or request only the remaining capacity.
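The "request only the remaining capacity" option comes down to simple arithmetic, sketched below in Python. The helper name `remaining_2m_pages` is hypothetical and not taken from `setup-dpdk.sh`; the only fact relied on is that one GiB equals 512 pages of 2 MiB.

```python
PAGES_2M_PER_GIB = 512  # 1 GiB / 2 MiB


def remaining_2m_pages(target_gb, reserved_1g_pages):
    """2 MiB pages still needed once already-reserved 1 GiB pages are counted."""
    # Each reserved 1 GiB page already covers one GiB of the target.
    remaining_gb = max(target_gb - reserved_1g_pages, 0)
    return remaining_gb * PAGES_2M_PER_GIB
```

For a 4 GiB target where only one 1 GiB page could be reserved, `remaining_2m_pages(4, 1)` returns 1536 instead of the full 2048, so the combined reservation stays at the configured target.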
    if [ "$entry" = "eth0" ]; then
      printf '[!] %s: skipping eth0 (agentd control-plane interface, never bound to vfio-pci).\n' "$SCRIPT_NAME" >&2
      continue
    fi
    # Looks like a BDF (e.g. 0000:00:06.0)?
Reject control-plane NIC even when provided as PCI BDF
The eth0 safety rule only triggers for the literal token eth0; if the same interface is provided as a PCI BDF, it bypasses this guard and can still be bound to vfio-pci. In multi-NIC environments where another IPv4 NIC exists, count_remaining_kernel_nics may still pass, allowing the control-plane interface to be detached despite the script’s stated hard safety rule. Resolve each candidate BDF back to interface name(s) and enforce the control-plane rejection there as well.
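A rough Python sketch of the BDF-to-interface resolution proposed here, not the script's actual implementation. The names `ifaces_for_bdf`, `is_control_plane_candidate`, and `CONTROL_PLANE_IFACES` are hypothetical. The sketch assumes the usual sysfs layout in which a PCI NIC bound to a kernel driver exposes its interface names under `/sys/bus/pci/devices/<bdf>/net/`.

```python
from pathlib import Path

# Interfaces that must never be handed to vfio-pci (eth0 per the script's rule).
CONTROL_PLANE_IFACES = frozenset({"eth0"})


def ifaces_for_bdf(bdf, sysfs_root="/sys"):
    """Return kernel interface names backed by the PCI device at `bdf`."""
    net_dir = Path(sysfs_root) / "bus" / "pci" / "devices" / bdf / "net"
    if not net_dir.is_dir():
        return []  # no kernel netdev, e.g. the device is already bound to vfio-pci
    return sorted(p.name for p in net_dir.iterdir())


def is_control_plane_candidate(entry, sysfs_root="/sys"):
    """True if `entry` names eth0 directly or is a BDF that resolves to it."""
    if entry in CONTROL_PLANE_IFACES:
        return True
    return any(name in CONTROL_PLANE_IFACES
               for name in ifaces_for_bdf(entry, sysfs_root))
```

Running the same predicate over every candidate, whether it arrives as an interface name or as a BDF, keeps the safety rule in one place.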
Summary
Phase 2 wire-up for the DPDK io_engine landing in AnyVM-Tech/anyscan-engine-c#4. Mirrors PR #71's AF_XDP wire-up shape across the install / bundle / deploy / adapter / install-time-probe chain so the engine repo's `USE_DPDK=1` build flag actually reaches every producer of a worker bundle, and so the runtime `--io-engine=dpdk` knob plumbed through `ANYSCAN_SCANNER_IO_ENGINE` has DPDK code to dispatch to.

Why DPDK now: AWS ENA on kernel ≤6.12.74 forces AF_XDP into `drv+copy` mode, capping c6in.metal at ~22M pps aggregate (memory: `anyscan_afxdp_ena_constraint`, also PR #65 `issuecomment-4338158487`: 6.19.11 STILL does not have `ena_xdp_zc`). DPDK bypasses the kernel ENA driver entirely via `vfio-pci` and removes the syscall-kick + lower-half-channels-only ZC constraint. Plan: `plans/2026-04-28-portscan-dpdk-impl-v1.md` (merged in #72).

Companion engine PR
This PR is the AnyScan-side half. The engine-side half is AnyVM-Tech/anyscan-engine-c#4, the PR that adds `--io-engine=dpdk` support to the bundled scanner. Without it landing first, this PR's `USE_DPDK=1` build flag has nothing to compile in.

What's in the PR
Build-flag plumbing (mirrors PR #71)
- `install-external-deps.sh`: `ANYSCAN_USE_DPDK` env knob, `binary_has_dpdk_linkage` probe (librte_eal.so via ldd → readelf -d), `install_dpdk_build_deps` (libdpdk-dev + dpdk apt-get, fail-open), cache short-circuit invalidation when cached binary lacks DPDK linkage, `vulnscanner_make_args` extension, post-build assertion.
- `package-worker-bundle.sh`: same env knob, linkage probe, `rebuild_scanner_with_dpdk` helper, `bundle_engine_make_args`, `README.txt` `use_dpdk` field. Composes with `USE_AF_XDP=1` `USE_PFRING_ZC=1`: the earliest matching rebuild block produces a binary linked against every requested engine in a single make invocation.
- `deploy.sh`: same env knob, linkage probe, make_args extension, pre-DPDK cached-binary drop, post-build assertion.

Runtime probe
`install-worker-bundle.sh::probe_dpdk_runtime_available` checks five gates:

- `$VULNSCANNER_BIN_DEST` was USE_DPDK-built (closes the same gap the PR #75 review flagged for `pfring_zc`)
- `librte_eal.so` loadable via `ldconfig`
- `vfio_pci` kernel module loaded
- hugepages reserved in `/sys/kernel/mm/hugepages/*`
- `/dev/vfio/vfio` present (kernel-side prerequisite that vfio-pci's bind step would have created)

`apply_dpdk_availability` writes `ANYSCAN_DPDK_AVAILABLE` to `/etc/agentd/runtime.env` always (true OR false) so a partial upgrade can't leave a stale `true` in place.

Adapter
`vulnscanner-zmap-adapter.py::SUPPORTED_IO_ENGINES` gains `"dpdk"`. `_IO_ENGINE_AVAILABILITY_KEYS` maps `"dpdk"` → `ANYSCAN_DPDK_AVAILABLE` so the same fall-back-with-warning path the AF_XDP / PF_RING ZC plumbing already exercises picks up dpdk for free.

Host setup script (NEW)
`tools/setup-dpdk.sh` (~370 LOC): `bind` / `unbind` / `status` subcommands.

- Reversible: `unbind` returns the NICs to `ena` and frees hugepages.
- Refuses to bind `eth0` (agentd control-plane interface).
- … `/sys/class/net/<iface>/device`.

Documentation
`runtime.worker.env.template`: full DPDK section documenting `ANYSCAN_USE_DPDK` (build-time), `ANYSCAN_DPDK_AVAILABLE` (install probe), `ANYSCAN_DPDK_PCI_BDFS` (CSV of BDFs/ifaces), `ANYSCAN_DPDK_HUGEPAGES_GB` (default 4).

Tests (NEW)
- `tools/test-install-external-deps-dpdk.sh` (~270 LOC): mirrors `test-install-external-deps-afxdp.sh`. Four cases × multiple assertions, hermetic (stubs make/git/ldd/readelf).
- `test_vulnscanner_adapter_io_engine.py`: 7 new DPDK assertions covering with-runtime-available, without-runtime-fallback-with-warning, missing-availability-var, uppercase-normalization, and cross-engine availability isolation. Updated `test_invalid_value_falls_back_to_af_packet_with_warning` to use `fake_engine` instead of `dpdk` (dpdk is now valid).

Verification
Out of scope
These follow as separate work per the plan's §8 rollout:
- Phase 2 systemd unit edit adding `CAP_SYS_RAWIO` / `CAP_IPC_LOCK` / `CAP_NET_ADMIN` to `anyscan-worker.service`. Documented in the env template; until that lands operators must add caps manually before flipping the runtime knob.
- … `anyscan_aws_pps_allowance` AWS may enforce a quota that hits before the engine-side ceiling.

Test plan
- `tools/test-install-external-deps-dpdk.sh`: 10/10 assertions pass across the four cases (default, fresh-build, force-rebuild, cached-skip).
- `python3 -m unittest discover`: 116 tests pass, 32 in `test_vulnscanner_adapter_io_engine`, 7 of those DPDK-specific.
- `test-install-external-deps-afxdp.sh` (11/11) and `-pfring-zc.sh` (10/10) still pass.
- `tools/setup-dpdk.sh status` runs cleanly with no NICs bound.
- `setup-dpdk.sh bind` / `unbind` against a c6in.metal-class host: separate worker, see plan §5.5.

Refs: `plans/2026-04-28-portscan-dpdk-impl-v1.md` (§3.10 wire-up, §3.11 NIC-binding decision, §4.3 kernel feature checks, §5.7 unit test shape)
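For readers unfamiliar with the adapter, the fall-back-with-warning path described under Adapter might look roughly like the Python sketch below. This is an illustration under stated assumptions, not the code in `vulnscanner-zmap-adapter.py`: the function name `resolve_io_engine`, the `af_xdp` engine spelling, the `ANYSCAN_AF_XDP_AVAILABLE` and `ANYSCAN_PFRING_ZC_AVAILABLE` key names, and the rule that only the literal value `true` counts as available are all assumed. Only `SUPPORTED_IO_ENGINES`, `_IO_ENGINE_AVAILABILITY_KEYS`, `ANYSCAN_SCANNER_IO_ENGINE`, `ANYSCAN_DPDK_AVAILABLE`, and the `af_packet` fallback come from this PR.

```python
import warnings

DEFAULT_IO_ENGINE = "af_packet"
SUPPORTED_IO_ENGINES = ("af_packet", "af_xdp", "pfring_zc", "dpdk")

# Engine -> env var written by the install-time probe.
_IO_ENGINE_AVAILABILITY_KEYS = {
    "af_xdp": "ANYSCAN_AF_XDP_AVAILABLE",        # assumed key name
    "pfring_zc": "ANYSCAN_PFRING_ZC_AVAILABLE",  # assumed key name
    "dpdk": "ANYSCAN_DPDK_AVAILABLE",
}


def resolve_io_engine(env):
    """Pick the io engine to pass to the scanner, falling back to af_packet."""
    requested = env.get("ANYSCAN_SCANNER_IO_ENGINE", "").strip().lower()
    if not requested:
        return DEFAULT_IO_ENGINE
    if requested not in SUPPORTED_IO_ENGINES:
        warnings.warn(f"unknown io engine {requested!r}; using {DEFAULT_IO_ENGINE}")
        return DEFAULT_IO_ENGINE
    key = _IO_ENGINE_AVAILABILITY_KEYS.get(requested)
    # Each engine is gated only by its own availability key, so a 'true'
    # for one engine never enables another.
    if key is not None and env.get(key, "").strip().lower() != "true":
        warnings.warn(f"{requested} requested but {key} is not true; "
                      f"using {DEFAULT_IO_ENGINE}")
        return DEFAULT_IO_ENGINE
    return requested
```

Because availability is looked up through a single mapping, adding `dpdk` is a one-line change to each table, which matches the "for free" remark in the Adapter section.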