feat(component query): bulk spec querying with --arch enforcement#204
feat(component query): bulk spec querying with --arch enforcement#204liunan-ms wants to merge 3 commits into
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR updates azldev component query to operate on locally rendered specs and to batch rpmspec execution in a shared mock chroot, including reporting binary subpackages.
Changes:
- Added batched spec querying via an embedded Python helper and shared Go scaffolding for running batch scripts in mock.
- Extended spec query results to include binary subpackages and updated tests to validate subpackage extraction.
- Updated CLI help/docs and adjusted
component queryto require and validaterendered-specs-dir.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| scenario/internal/projecttest/testspec.go | Adds subpackage rendering support in test spec generation. |
| scenario/component_query_test.go | Updates scenario test to render first and validate subpackages in query output. |
| internal/rpm/specquery.go | Exposes rpmspec queryformat constants and adds subpackage parsing helper. |
| internal/rpm/specquery_test.go | Adds unit tests for subpackage-output parsing and SRPM parsing behavior. |
| internal/app/azldev/core/sources/specquery.go | Introduces batch spec query orchestration and input validation. |
| internal/app/azldev/core/sources/specquery_test.go | Adds tests for batched query JSON parsing and validation. |
| internal/app/azldev/core/sources/query_process.py | Implements parallel per-spec rpmspec execution inside mock chroot. |
| internal/app/azldev/core/sources/mockprocessor.go | Refactors shared batch-script runner and adds configurable required package install. |
| internal/app/azldev/command.go | Emits partial results even when a command returns an error, with typed-nil guarding. |
| internal/app/azldev/cmds/component/query.go | Reworks component query to read rendered specs and use batched rpmspec querying. |
| internal/app/azldev/cmds/component/mockproc.go | Centralizes mock required-package presets and mock processor creation. |
| internal/app/azldev/cmds/component/query_test.go | Updates query command tests for rendered-specs-dir validation behavior. |
| internal/app/azldev/cmds/component/query_internal_test.go | Adds internal tests for building batch query inputs and skip behavior. |
| docs/user/reference/cli/azldev_component_query.md | Updates CLI reference to match rendered-spec-based query and new --arch flag. |
| docs/user/reference/cli/azldev_component.md | Updates command list description for component query. |
a6a9778 to
abb1a51
Compare
|
|
||
| [distros.azurelinux.versions.'4.0'.default-component-config] | ||
| spec = { type = "upstream", upstream-distro = { name = "fedora", version = "42" } } | ||
| spec = { type = "upstream", upstream-distro = { name = "fedora", version = "43" } } |
There was a problem hiding this comment.
java-srpm-macros is not available in f42 as a standalone binary RPM. Bumping the test mock to releasever=43 (and the matching f43-build URL) to unblock the scenario tests.
… enforcement Add an --arch flag to 'component query' (default x86_64) that drives rpmspec's --target and enforces ExclusiveArch/ExcludeArch policy via a probe queryformat wrapped around the srpm query. Specs excluded by the selected arch surface as ExcludedFromArch entries and are summarized in a per-run log line. Harden the per-spec query subprocess: - 180s rpmspec timeout with a dedicated _RpmspecTimeout exception - UTF-8 decoding with errors=replace on file IO and subprocess output Plumb the arch parameter through BatchQuerySpecs and add ExcludedFromArch to SpecQueryResult / its JSON mapping. Teach runFuncInternal to render partial results when the inner func returns both a value and an error, so per-component query failures still print the successful rows while exiting non-zero. Add javapackages-common to the mock package list so %javadoc_package (from macros.fjava) doesn't silently disappear. Bump scenario AZL 4.0 upstream + mock releasever to Fedora 43; java-srpm-macros is missing from f42-build koji. Regenerate CLI docs.
abb1a51 to
090fa31
Compare
| } | ||
|
|
||
| // validateSpecRelPath rejects spec relative paths that could escape the | ||
| // specs-dir bind mount or contain control characters. |
|
@dmcilvaney would you mind reviewing this? You'll be most familiar with the parts that touch the parallel mock processor. |
|
|
||
| // validateSpecRelPath rejects spec relative paths that could escape the | ||
| // specs-dir bind mount or contain control characters. | ||
| func validateSpecRelPath(relPath string) error { |
There was a problem hiding this comment.
We should consolidate path validation to one spot, we have some semi common functions in fileutils.ValidateFilename, should augment that if it has holes.
There was a problem hiding this comment.
Added a ValidateRelPath to fileutils (built on top of ValidateFilename per segment) and delegate validateSpecRelPath to it.
| // (e.g. some items succeeded, some failed). Render what we have | ||
| // before propagating so callers see the partial output and the | ||
| // process still exits non-zero. | ||
| if results != nil && !isNilValue(results) { |
There was a problem hiding this comment.
Can you update render to also support this now? I worked around it by returning no error even when things failed. If we can still output the results table even on failure, I think returning an error code does make sense. Might need to update checkRenderErrors() if I recall right.
This might need an associated fix in the main repo's PR checks, but we can do that when we move the alzdev version pin forward.
| // (e.g. some items succeeded, some failed). Render what we have | ||
| // before propagating so callers see the partial output and the | ||
| // process still exits non-zero. | ||
| if results != nil && !isNilValue(results) { |
There was a problem hiding this comment.
Same deal with update, although there I think we just want to call filterDisplayResults() before returning and add results[idx].Error != "" along with the other criteria. Otherwise I think it will dump 7k+ values out on an error.
| # Apply per-spec rewrites (e.g. ghc.spec) to a scratch copy if needed. | ||
| # _sourcedir/_specdir stay pinned to the original spec's directory via | ||
| # _rpmspec_args, so sidecar files still resolve correctly. | ||
| effective_spec = _maybe_rewrite_spec(spec_path, scratch_dir, name) |
There was a problem hiding this comment.
Looks like this doesn't take the new effective_spec directory to use for spec_dir, is that an issue, or intended? (ie we want a different spec, but to still read everything as-if the spec was in its old spot).
| name string | ||
| version string | ||
| release string | ||
| buildArch string |
There was a problem hiding this comment.
Seems like the arch logic is mostly untested end-to-end. Since a lot of it lives in the python side might be easiest to test here via scenario test.
| content = src.read() | ||
|
|
||
| for find, replace in rewrites: | ||
| content = content.replace(find, replace) |
There was a problem hiding this comment.
I've had to do the same find/replace pattern when doing similar things, I haven't found a better way either, but we should consider erroring if the string isn't found anymore because the upstream spec changed or got overlays added, might make it easier to fix than a random parse error thats hard to diagnose.
| // Subpackages lists the binary package names the spec produces, in the | ||
| // order rpmspec reports them. Empty when not queried (e.g., the | ||
| // per-component buildenv path that only requests --srpm output). | ||
| Subpackages []string |
There was a problem hiding this comment.
Nit: Are we worried about cases were a package produces no outputs? Maybe make it nil-able to differentiate?
There was a problem hiding this comment.
Good question, I don't think we need to. The batched query path always populates Subpackages (even to an empty slice) on a successful query, so len == 0 already means "queried, spec produces no built binary RPMs." The only caller that leaves it nil is the legacy per-component QuerySpec, which doesn't run the subpackage query at all. Happy to revisit if a real caller needs the distinction.
| // (e.g. some items succeeded, some failed). Render what we have | ||
| // before propagating so callers see the partial output and the | ||
| // process still exits non-zero. | ||
| if results != nil && !isNilValue(results) { |
There was a problem hiding this comment.
Nit: Maybe update the comments for the various CmdFuncType public functions to indicate this behavior. It may be a bit surprising that a failed command still outputs results.
| arch, | ||
| ) | ||
| # Insert --builtrpms right after `-q` so it associates with the query. | ||
| bin_args.insert(2, "--builtrpms") |
There was a problem hiding this comment.
Could use something like bin_args.index("-q") + 1 to find the -q so its stable if args move around.
| return _ARCH_ALIASES.get(token, token) | ||
|
|
||
|
|
||
| def _is_arch_excluded(arch, exclusive_arch, exclude_arch): |
There was a problem hiding this comment.
50/50 if this should live in the golang side somehow, and get passed in as a table. I'm not aware of any other go side features that will care about arch parsing, so I think its fine here, but that might change.
There was a problem hiding this comment.
Agreed, keeping it here for now since query_process.py is the only consumer. If a Go-side feature ever needs the same canonicalization, the natural refactor is to define the table once in Go (e.g. under internal/utils/qemu) and ship it down via the inputs JSON alongside arch. I'll leave a TODO if you'd like.
- consolidate spec rel-path validation into fileutils.ValidateRelPath - render/update: surface partial results via runFuncInternal on error; widen filterDisplayResults to include errored entries; restore --fail-on-error on render as opt-in - document partial-results-on-error contract on RunFunc* / CmdFuncType - query_process.py: pin _sourcedir to the original spec dir when _maybe_rewrite_spec produces a scratch copy; raise loudly if a spec-rewrite find string no longer matches; look up -q by index when inserting --builtrpms; add TODO for moving _ARCH_ALIASES to Go - add scenario test for --arch ExclusiveArch/ExcludeArch handling (TestSpec gains WithExclusiveArch / WithExcludeArch)
c199cc1 to
efd4560
Compare
| queryResults, err := mockProcessor.BatchQuerySpecs( | ||
| env, env, renderedSpecsDir, scratchDir, archStr, | ||
| inputs, env.FS(), env.CPUBoundConcurrency(), | ||
| ) |
| if after, ok := strings.CutPrefix(trimmed, "subpkg="); ok && after != "" { | ||
| result = append(result, after) | ||
| } |
| # Insert --builtrpms right after `-q` so it associates with the query. | ||
| # Look up `-q` rather than hard-coding the index so this stays correct if | ||
| # _rpmspec_args ever reorders its preamble. | ||
| bin_args.insert(bin_args.index("-q") + 1, "--builtrpms") |
| if len(inputs) == 0 { | ||
| return nil, fmt.Errorf("no components have a rendered spec on disk; skipped %d", skipped) | ||
| } |
| if results != nil && !isNilValue(results) { | ||
| if reportErr := reportResults(env, results); reportErr != nil { | ||
| return errors.Join(err, reportErr) | ||
| } | ||
| } |
Introduces
azldev component queryfor bulk-querying rendered RPM specs inside a single shared mock chroot, with arch-aware policy enforcement.CLI
azldev component querycommand — selects components via the standard component filter, renders results as a table / CSV / Markdown / JSON.--archflag (defaultx86_64, also acceptsaarch64) drivingrpmspec --target=<arch>and ExclusiveArch/ExcludeArch policy.Docs
CLI reference under
docs/user/reference/cli/regenerated.