summarize() Mutates Argument List Between Calls #10121

@Mattral

Describe the bug

The summarize() function in scripts/performance/benchmark_utils.py mutates the argument list used to invoke the summarization script.

Inside the function, a list summarize_args is created and used for the first subprocess.check_call() invocation. Before the second invocation, the same list is modified using .extend():

summarize_args.extend(['--output-format', 'json'])

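Because the same list object is reused, the JSON command is produced by mutating the list that was used for the text command. Within a single call as written, the two subprocess invocations still receive the intended arguments. The risk is that any further use of summarize_args after this line (a third subprocess call, a retry, or a refactor that passes the list in from a caller or hoists it to module scope) sees the extra flags, and repeated extension duplicates them.

This makes command construction depend on statement order and leaves the function fragile to reuse and refactoring.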


Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

summarize() should construct a fresh argument list for each subprocess call so that the original list of arguments remains unchanged.

Each subprocess invocation should receive only the arguments required for that specific command execution.

For example:

  • First call should receive arguments for generating the text summary.
  • Second call should receive arguments for generating the JSON summary without mutating the original list.
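
The two bullets above can be sketched as a small helper that derives each command from an immutable base. The name `build_summary_commands` and its parameters are hypothetical and do not exist in benchmark_utils.py; only the `--output-format json` flag comes from this issue.

```python
def build_summary_commands(script, result_files):
    """Return (text_cmd, json_cmd) as two independent lists.

    Hypothetical helper for illustration; not part of benchmark_utils.py.
    """
    # A tuple cannot be extended in place, so neither command can leak
    # its flags into the other.
    base = (script, *result_files)
    text_cmd = list(base)
    json_cmd = [*base, "--output-format", "json"]
    return text_cmd, json_cmd


text_cmd, json_cmd = build_summary_commands("summarize", ["file1.csv"])
print(text_cmd)  # ['summarize', 'file1.csv']
print(json_cmd)  # ['summarize', 'file1.csv', '--output-format', 'json']
```

Keeping the shared part as a tuple is one possible design choice; copying the list before extending it would work equally well.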

Current Behavior

The same list object is reused and mutated:

summarize_args.extend(['--output-format', 'json'])

This changes summarize_args for every statement that follows it in the function.

If the list outlives a single pass through this code (for example, when it is supplied by a caller, retried, or reused in extended benchmarking workflows), it may contain duplicated flags such as:

--output-format json --output-format json

This may lead to unexpected behavior depending on how the summarization script parses arguments.
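
A minimal sketch of when the duplication actually appears, assuming the list is owned by a caller and outlives one call. That ownership is an assumption made for illustration, and the function names `extend_in_place` and `extend_copy` are hypothetical.

```python
def extend_in_place(args):
    # Mirrors the pattern in this issue: mutates whatever list it is given.
    args.extend(["--output-format", "json"])
    return args


def extend_copy(args):
    # Builds a new list; the caller's list is left untouched.
    return args + ["--output-format", "json"]


shared = ["script", "file1.csv"]  # hypothetical list owned by a caller
extend_in_place(shared)
extend_in_place(shared)
print(shared.count("--output-format"))  # 2

fresh = ["script", "file1.csv"]
extend_copy(fresh)
extend_copy(fresh)
print(fresh.count("--output-format"))  # 0
```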


Reproduction Steps

Example simplified reproduction demonstrating the mutation behavior:

def test_mutation():
    summarize_args = ["script", "file1.csv"]

    # First command
    print("First call:", summarize_args)

    # Mutate list
    summarize_args.extend(["--output-format", "json"])

    # Second command
    print("Second call:", summarize_args)

    # Simulate reuse
    summarize_args.extend(["--output-format", "json"])
    print("Third call:", summarize_args)


test_mutation()

Output:

First call: ['script', 'file1.csv']
Second call: ['script', 'file1.csv', '--output-format', 'json']
Third call: ['script', 'file1.csv', '--output-format', 'json', '--output-format', 'json']

This illustrates how the argument list grows due to mutation.
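
One place the mutation may become observable even within a single call is a test that records subprocess invocations. `unittest.mock` stores call arguments by reference, so a list that is extended after the first call also changes what was recorded for that call. The sketch below assumes a stand-in function, `run_both_mutating`, which is hypothetical and only mimics the pattern described in this issue.

```python
import subprocess
from unittest import mock


def run_both_mutating(summarize_args):
    # Same pattern as in this issue: extend the list between the two calls.
    subprocess.check_call(summarize_args)
    summarize_args.extend(["--output-format", "json"])
    subprocess.check_call(summarize_args)


# Patch check_call so no real process is started.
with mock.patch("subprocess.check_call") as fake_call:
    run_both_mutating(["script", "file1.csv"])

first_recorded = fake_call.call_args_list[0].args[0]
second_recorded = fake_call.call_args_list[1].args[0]

# Both recorded calls point at the same, already-extended list.
print(first_recorded is second_recorded)  # True
print(first_recorded)  # ['script', 'file1.csv', '--output-format', 'json']
```

A test asserting that the first call received only the base arguments would fail here, even though the real subprocess would have been started with the correct command.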


Possible Solution

Instead of mutating the original argument list, construct a new list for the JSON summary call.

Example fix:

with open(os.path.join(summary_dir, 'summary.txt'), 'wb') as f:
    subprocess.check_call(summarize_args, stdout=f)

with open(os.path.join(summary_dir, 'summary.json'), 'wb') as f:
    json_args = summarize_args + ['--output-format', 'json']
    subprocess.check_call(json_args, stdout=f)

This preserves the original argument list and avoids side effects.
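
A hedged way to check the fix without running a real summarization script is to patch `subprocess.check_call` and inspect what each call received. The function below is named `summarize_fixed` to make clear it is a sketch: its signature and the way `summarize_args` is built from a results directory are assumptions, since this issue does not show that part of benchmark_utils.py.

```python
import os
import subprocess
import tempfile
from unittest import mock


def summarize_fixed(script, result_dir, summary_dir):
    # Assumed construction of the base arguments (not shown in the issue).
    summarize_args = [script]
    for name in sorted(os.listdir(result_dir)):
        path = os.path.join(result_dir, name)
        if os.path.isfile(path):
            summarize_args.append(path)

    with open(os.path.join(summary_dir, 'summary.txt'), 'wb') as f:
        subprocess.check_call(summarize_args, stdout=f)

    with open(os.path.join(summary_dir, 'summary.json'), 'wb') as f:
        # New list for the JSON run; summarize_args is left unchanged.
        json_args = summarize_args + ['--output-format', 'json']
        subprocess.check_call(json_args, stdout=f)


with tempfile.TemporaryDirectory() as results, \
        tempfile.TemporaryDirectory() as summaries:
    # Create one empty result file so the command has an input path.
    open(os.path.join(results, 'file1.csv'), 'w').close()
    with mock.patch('subprocess.check_call') as fake_call:
        summarize_fixed('script', results, summaries)
    recorded = [c.args[0] for c in fake_call.call_args_list]

print(len(recorded[0]))   # 2: the script plus one result file
print(recorded[1][-2:])   # ['--output-format', 'json']
```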


Additional Information/Context

This issue exists in internal benchmarking utilities used under scripts/performance. While these scripts are primarily developer tooling, ensuring deterministic command construction helps prevent subtle errors in automated benchmarking workflows or CI environments.

Avoiding in-place mutation of argument lists that are reused across subprocess calls is also a common defensive practice when building commands in Python.


CLI version used

Not applicable; the issue is located in repository tooling (scripts/performance).

Environment details (OS name and version, etc.)

  • OS: Linux / macOS (reproducible on any OS)
  • Python: 3.x
  • Repository: aws/aws-cli
  • Path: scripts/performance/benchmark_utils.py

Metadata

Labels

  • benchmarking
  • bug: This issue is a bug.
  • p3: This is a minor priority issue
  • response-requested: Waiting on additional info and feedback. Will move to "closing-soon" in 7 days.
