feat(component query): bulk spec querying with --arch enforcement by liunan-ms · Pull Request #204 · microsoft/azure-linux-dev-tools

liunan-ms · 2026-05-21T21:03:16Z

Introduces azldev component query for bulk-querying rendered RPM specs inside a single shared mock chroot, with arch-aware policy enforcement.

CLI

New azldev component query command: selects components via the standard component filter and renders results as a table, CSV, Markdown, or JSON.
New --arch flag (default x86_64, also accepts aarch64) that drives rpmspec --target=<arch> and ExclusiveArch/ExcludeArch policy enforcement.

Docs

CLI reference under docs/user/reference/cli/ regenerated.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates azldev component query to operate on locally rendered specs and to batch rpmspec execution in a shared mock chroot, including reporting binary subpackages.

Changes:

Added batched spec querying via an embedded Python helper and shared Go scaffolding for running batch scripts in mock.
Extended spec query results to include binary subpackages and updated tests to validate subpackage extraction.
Updated CLI help/docs and adjusted component query to require and validate rendered-specs-dir.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| scenario/internal/projecttest/testspec.go | Adds subpackage rendering support in test spec generation. |
| scenario/component_query_test.go | Updates scenario test to render first and validate subpackages in query output. |
| internal/rpm/specquery.go | Exposes rpmspec queryformat constants and adds subpackage parsing helper. |
| internal/rpm/specquery_test.go | Adds unit tests for subpackage-output parsing and SRPM parsing behavior. |
| internal/app/azldev/core/sources/specquery.go | Introduces batch spec query orchestration and input validation. |
| internal/app/azldev/core/sources/specquery_test.go | Adds tests for batched query JSON parsing and validation. |
| internal/app/azldev/core/sources/query_process.py | Implements parallel per-spec rpmspec execution inside mock chroot. |
| internal/app/azldev/core/sources/mockprocessor.go | Refactors shared batch-script runner and adds configurable required package install. |
| internal/app/azldev/command.go | Emits partial results even when a command returns an error, with typed-nil guarding. |
| internal/app/azldev/cmds/component/query.go | Reworks `component query` to read rendered specs and use batched rpmspec querying. |
| internal/app/azldev/cmds/component/mockproc.go | Centralizes mock required-package presets and mock processor creation. |
| internal/app/azldev/cmds/component/query_test.go | Updates query command tests for rendered-specs-dir validation behavior. |
| internal/app/azldev/cmds/component/query_internal_test.go | Adds internal tests for building batch query inputs and skip behavior. |
| docs/user/reference/cli/azldev_component_query.md | Updates CLI reference to match rendered-spec-based query and new `--arch` flag. |
| docs/user/reference/cli/azldev_component.md | Updates command list description for `component query`. |

liunan-ms · 2026-05-21T22:24:59Z


 [distros.azurelinux.versions.'4.0'.default-component-config]
-spec = { type = "upstream", upstream-distro = { name = "fedora", version = "42" } }
+spec = { type = "upstream", upstream-distro = { name = "fedora", version = "43" } }


java-srpm-macros is not available in f42 as a standalone binary RPM. Bumping the test mock to releasever=43 (and the matching f43-build URL) to unblock the scenario tests.

… enforcement

Add an --arch flag to 'component query' (default x86_64) that drives rpmspec's --target and enforces ExclusiveArch/ExcludeArch policy via a probe queryformat wrapped around the srpm query. Specs excluded by the selected arch surface as ExcludedFromArch entries and are summarized in a per-run log line.

Harden the per-spec query subprocess:
- 180s rpmspec timeout with a dedicated _RpmspecTimeout exception
- UTF-8 decoding with errors=replace on file IO and subprocess output

Plumb the arch parameter through BatchQuerySpecs and add ExcludedFromArch to SpecQueryResult / its JSON mapping.

Teach runFuncInternal to render partial results when the inner func returns both a value and an error, so per-component query failures still print the successful rows while exiting non-zero.

Add javapackages-common to the mock package list so %javadoc_package (from macros.fjava) doesn't silently disappear.

Bump scenario AZL 4.0 upstream + mock releasever to Fedora 43; java-srpm-macros is missing from f42-build koji.

Regenerate CLI docs.
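The subprocess hardening named in this commit message (a 180s timeout surfaced as a dedicated exception, plus UTF-8 decoding with errors=replace) could look roughly like the sketch below. This is not the code from query_process.py: `run_query` and `RpmspecTimeout` are hypothetical names standing in for whatever the helper actually uses, and the demo runs the Python interpreter rather than rpmspec so that it works outside a mock chroot.

```python
import subprocess
import sys

# Assumption: mirrors the 180s limit mentioned in the commit message.
RPMSPEC_TIMEOUT_SECONDS = 180


class RpmspecTimeout(Exception):
    """Hypothetical stand-in for the PR's _RpmspecTimeout exception."""


def run_query(args, timeout=RPMSPEC_TIMEOUT_SECONDS):
    """Run a command, returning (returncode, stdout, stderr) as text.

    Output is captured as bytes and decoded with errors="replace" so that a
    stray non-UTF-8 byte in a spec cannot abort the whole batch.
    """
    try:
        proc = subprocess.run(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout,
            check=False,
        )
    except subprocess.TimeoutExpired as exc:
        # Translate the generic timeout into a dedicated exception type so
        # callers can report "timed out" separately from other failures.
        raise RpmspecTimeout(f"{args[0]} timed out after {timeout}s") from exc

    stdout = proc.stdout.decode("utf-8", errors="replace")
    stderr = proc.stderr.decode("utf-8", errors="replace")
    return proc.returncode, stdout, stderr


if __name__ == "__main__":
    # Demo with the current interpreter instead of rpmspec.
    code, out, _ = run_query([sys.executable, "-c", "print('hello')"])
    print(code, out.strip())
```

Raising a dedicated exception type keeps a hung spec distinguishable from a spec that merely fails to parse, which matters when one bad spec should not hide results for the rest of the batch.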

Copilot

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 5 comments.

+}
+
+// validateSpecRelPath rejects spec relative paths that could escape the
+// specs-dir bind mount or contain control characters.


reubeno · 2026-05-27T21:32:51Z

@dmcilvaney would you mind reviewing this? You'll be most familiar with the parts that touch the parallel mock processor.

dmcilvaney · 2026-05-28T17:39:36Z

+
+// validateSpecRelPath rejects spec relative paths that could escape the
+// specs-dir bind mount or contain control characters.
+func validateSpecRelPath(relPath string) error {


We should consolidate path validation in one spot. We have some semi-common functions such as fileutils.ValidateFilename, and we should augment that if it has holes.

Added a ValidateRelPath to fileutils (built on top of ValidateFilename per segment) and delegate validateSpecRelPath to it.
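To illustrate the "validate each segment with the filename validator" idea from this reply, here is a minimal sketch. It is written in Python for brevity, whereas the real helpers are Go functions in fileutils. The names `validate_filename` and `validate_rel_path` and the exact rejection rules are assumptions for illustration, not the rules implemented in the PR.

```python
def validate_filename(name):
    """Reject a single path segment that is empty, a dot entry, or unsafe.

    Assumed rules (illustrative only): no empty names, no "." or "..",
    no embedded "/" and no ASCII control characters.
    """
    if name in ("", ".", ".."):
        raise ValueError(f"invalid path segment: {name!r}")
    if "/" in name:
        raise ValueError(f"segment contains a separator: {name!r}")
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in name):
        raise ValueError(f"segment contains a control character: {name!r}")


def validate_rel_path(rel_path):
    """Validate a relative path by validating every '/'-separated segment."""
    if not rel_path:
        raise ValueError("relative path is empty")
    if rel_path.startswith("/"):
        raise ValueError(f"path must be relative: {rel_path!r}")
    # Splitting on "/" turns "a//b" into an empty segment and "a/../b" into a
    # ".." segment, both of which the per-segment check rejects.
    for segment in rel_path.split("/"):
        validate_filename(segment)
```

Building the path check on top of the segment check means any future tightening of the filename rules applies to relative paths automatically.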

dmcilvaney · 2026-05-28T20:47:05Z

+			// (e.g. some items succeeded, some failed). Render what we have
+			// before propagating so callers see the partial output and the
+			// process still exits non-zero.
+			if results != nil && !isNilValue(results) {


Can you update render to also support this now? I worked around it by returning no error even when things failed. If we can still output the results table even on failure, I think returning an error code does make sense. Might need to update checkRenderErrors() if I recall right.

This might need an associated fix in the main repo's PR checks, but we can do that when we move the azldev version pin forward.

Fixed in efd4560

dmcilvaney · 2026-05-28T20:49:25Z

+			// (e.g. some items succeeded, some failed). Render what we have
+			// before propagating so callers see the partial output and the
+			// process still exits non-zero.
+			if results != nil && !isNilValue(results) {


Same deal with update, although there I think we just want to call filterDisplayResults() before returning and add results[idx].Error != "" along with the other criteria. Otherwise I think it will dump 7k+ values out on an error.

Fixed in efd4560

dmcilvaney · 2026-05-28T20:56:35Z

+    # Apply per-spec rewrites (e.g. ghc.spec) to a scratch copy if needed.
+    # _sourcedir/_specdir stay pinned to the original spec's directory via
+    # _rpmspec_args, so sidecar files still resolve correctly.
+    effective_spec = _maybe_rewrite_spec(spec_path, scratch_dir, name)


Looks like this doesn't use the new effective_spec's directory as spec_dir. Is that an issue, or intended (i.e., we want a different spec, but still want to read everything as if the spec were in its old spot)?

Fixed in efd4560

dmcilvaney · 2026-05-28T20:58:44Z

+	name        string
+	version     string
+	release     string
+	buildArch   string


Seems like the arch logic is mostly untested end-to-end. Since a lot of it lives on the Python side, it might be easiest to test here via a scenario test.

Fixed in efd4560

dmcilvaney · 2026-05-28T21:03:54Z

+        content = src.read()
+
+    for find, replace in rewrites:
+        content = content.replace(find, replace)


I've had to use the same find/replace pattern for similar things, and I haven't found a better way either. We should consider erroring if the string is no longer found (because the upstream spec changed or overlays were added); that would be easier to fix than a random parse error that's hard to diagnose.
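The suggestion above, failing loudly when a rewrite's find string no longer matches, can be sketched as follows. `apply_rewrites` and `SpecRewriteError` are hypothetical names chosen for this example; only the find/replace loop itself comes from the diff under review.

```python
class SpecRewriteError(Exception):
    """Raised when a rewrite's find string is absent from the spec text."""


def apply_rewrites(content, rewrites, spec_name="<spec>"):
    """Apply (find, replace) pairs in order, refusing silent no-ops.

    A missing find string usually means the upstream spec changed or an
    overlay already altered the text, so the rewrite table needs updating.
    """
    for find, replace in rewrites:
        if find not in content:
            raise SpecRewriteError(
                f"{spec_name}: rewrite target not found: {find!r}"
            )
        content = content.replace(find, replace)
    return content
```

Because rewrites are applied in order, each find string is checked against the text produced by the previous rewrite, so a later rule that depends on an earlier one still works.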

dmcilvaney · 2026-05-28T21:09:15Z

+	// Subpackages lists the binary package names the spec produces, in the
+	// order rpmspec reports them. Empty when not queried (e.g., the
+	// per-component buildenv path that only requests --srpm output).
+	Subpackages []string


Nit: Are we worried about cases where a package produces no outputs? Maybe make it nil-able to differentiate?

Good question, I don't think we need to. The batched query path always populates Subpackages (even to an empty slice) on a successful query, so len == 0 already means "queried, spec produces no built binary RPMs." The only caller that leaves it nil is the legacy per-component QuerySpec, which doesn't run the subpackage query at all. Happy to revisit if a real caller needs the distinction.

dmcilvaney · 2026-05-28T21:11:15Z

+			// (e.g. some items succeeded, some failed). Render what we have
+			// before propagating so callers see the partial output and the
+			// process still exits non-zero.
+			if results != nil && !isNilValue(results) {


Nit: Maybe update the comments for the various CmdFuncType public functions to indicate this behavior. It may be a bit surprising that a failed command still outputs results.

Fixed in efd4560

dmcilvaney · 2026-05-28T21:14:12Z

+        arch,
+    )
+    # Insert --builtrpms right after `-q` so it associates with the query.
+    bin_args.insert(2, "--builtrpms")


Could use something like bin_args.index("-q") + 1 to find the -q so it's stable if args move around.

Fixed in efd4560

dmcilvaney · 2026-05-28T21:15:57Z

+    return _ARCH_ALIASES.get(token, token)
+
+
+def _is_arch_excluded(arch, exclusive_arch, exclude_arch):


50/50 on whether this should live on the Go side somehow and get passed in as a table. I'm not aware of any other Go-side features that will care about arch parsing, so I think it's fine here, but that might change.

Agreed, keeping it here for now since query_process.py is the only consumer. If a Go-side feature ever needs the same canonicalization, the natural refactor is to define the table once in Go (e.g. under internal/utils/qemu) and ship it down via the inputs JSON alongside arch. I'll leave a TODO if you'd like.
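For readers unfamiliar with the policy being discussed, the sketch below shows one plausible shape for alias canonicalization plus the ExclusiveArch/ExcludeArch check. It is not the PR's implementation. The alias entries are illustrative guesses (the real contents of _ARCH_ALIASES are not shown in this thread), and the function assumes both tag values arrive as whitespace-separated strings, already macro-expanded.

```python
# Assumption: illustrative aliases only; the PR's actual table may differ.
_ARCH_ALIASES = {
    "amd64": "x86_64",
    "arm64": "aarch64",
}


def canonical_arch(token):
    """Map an alias to its canonical name; unknown tokens pass through."""
    return _ARCH_ALIASES.get(token, token)


def is_arch_excluded(arch, exclusive_arch, exclude_arch):
    """Return True if a spec should be skipped for the target arch.

    Standard RPM semantics: a non-empty ExclusiveArch restricts builds to the
    listed arches, and ExcludeArch removes the listed arches.
    """
    target = canonical_arch(arch)
    exclusive = {canonical_arch(t) for t in exclusive_arch.split()}
    excluded = {canonical_arch(t) for t in exclude_arch.split()}

    if target in excluded:
        return True
    # An empty ExclusiveArch means "no restriction".
    return bool(exclusive) and target not in exclusive
```

For example, `is_arch_excluded("aarch64", "x86_64", "")` is true because the spec is exclusive to another architecture, while a spec with neither tag set is never excluded.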

- consolidate spec rel-path validation into fileutils.ValidateRelPath
- render/update: surface partial results via runFuncInternal on error; widen filterDisplayResults to include errored entries; restore --fail-on-error on render as opt-in
- document partial-results-on-error contract on RunFunc* / CmdFuncType
- query_process.py: pin _sourcedir to the original spec dir when _maybe_rewrite_spec produces a scratch copy; raise loudly if a spec-rewrite find string no longer matches; look up -q by index when inserting --builtrpms; add TODO for moving _ARCH_ALIASES to Go
- add scenario test for --arch ExclusiveArch/ExcludeArch handling (TestSpec gains WithExclusiveArch / WithExcludeArch)

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 5 comments.

+	queryResults, err := mockProcessor.BatchQuerySpecs(
+		env, env, renderedSpecsDir, scratchDir, archStr,
+		inputs, env.FS(), env.CPUBoundConcurrency(),
+	)


+		if after, ok := strings.CutPrefix(trimmed, "subpkg="); ok && after != "" {
+			result = append(result, after)
+		}


+    # Insert --builtrpms right after `-q` so it associates with the query.
+    # Look up `-q` rather than hard-coding the index so this stays correct if
+    # _rpmspec_args ever reorders its preamble.
+    bin_args.insert(bin_args.index("-q") + 1, "--builtrpms")


+	if len(inputs) == 0 {
+		return nil, fmt.Errorf("no components have a rendered spec on disk; skipped %d", skipped)
 	}


+			if results != nil && !isNilValue(results) {
+				if reportErr := reportResults(env, results); reportErr != nil {
+					return errors.Join(err, reportErr)
+				}
+			}


Copilot AI review requested due to automatic review settings May 21, 2026 21:03

Copilot AI reviewed May 21, 2026

Comment thread internal/app/azldev/command.go

Comment thread internal/app/azldev/core/sources/specquery.go Outdated

Comment thread internal/app/azldev/cmds/component/mockproc.go Outdated

Comment thread scenario/internal/projecttest/testspec.go

liunan-ms force-pushed the liunan/specquery branch from a6a9778 to abb1a51 Compare May 21, 2026 22:12

reubeno and others added 2 commits May 27, 2026 16:43

draft: bulk spec querying

5043a57

Copilot AI review requested due to automatic review settings May 27, 2026 16:57

liunan-ms force-pushed the liunan/specquery branch from abb1a51 to 090fa31 Compare May 27, 2026 16:57

Copilot AI reviewed May 27, 2026

reubeno requested a review from dmcilvaney May 27, 2026 21:32

dmcilvaney reviewed May 28, 2026

Copilot AI review requested due to automatic review settings June 2, 2026 00:29

liunan-ms force-pushed the liunan/specquery branch from c199cc1 to efd4560 Compare June 2, 2026 00:29

Copilot AI reviewed Jun 2, 2026
