Replace nested Stream.concat with direct iteration in expression merging by RanVaknin · Pull Request #6725 · aws/aws-sdk-java-v2

RanVaknin · 2026-02-10T21:59:22Z

[NEEDS MORE TEST COVERAGE TO MERGE]

Overview

This PR targets two bottlenecks in UpdateExpressionConverter.toExpression, which is part of the Update operation request pipeline. The cost of toExpression grows with input size, making it the most noticeable bottleneck on large inputs: ~40% of CPU time on SMALL datasets and ~74% on HUGE_FLAT datasets.

Interactive Flamegraph

1. String.format overhead (~18% CPU time)

groupExpressions uses String.format to build the update expression for each attribute. This can be replaced with simple string concatenation, which the JVM optimizes heavily.
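As a rough illustration of the swap, here is a minimal sketch. The class name, the `name = value` clause shape, and the placeholder tokens are hypothetical and are not taken from the SDK's groupExpressions code. It only shows that both forms yield the same string, while concatenation avoids parsing a format string on every call.

```java
final class FormatVsConcat {
    // Hypothetical clause builder: String.format parses the format string on each call.
    static String withFormat(String nameToken, String valueToken) {
        return String.format("%s = %s", nameToken, valueToken);
    }

    // Same result via plain concatenation, with no format-string parsing.
    static String withConcat(String nameToken, String valueToken) {
        return nameToken + " = " + valueToken;
    }

    public static void main(String[] args) {
        System.out.println(withFormat("#name", ":value")); // prints "#name = :value"
        System.out.println(withConcat("#name", ":value")); // prints "#name = :value"
    }
}
```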

2. Nested `Stream.concat()` in `mergeExpressionNames()` and `mergeExpressionValues()` (15-50% CPU time)

Current implementation:

private static Map<String, String> mergeExpressionNames(UpdateExpression expression) {
    return streamOfExpressionNames(expression)
        .reduce(Expression::joinNames)
        .orElseGet(Collections::emptyMap);
}

private static Stream<Map<String, String>> streamOfExpressionNames(UpdateExpression expression) {
    return Stream.concat(
        expression.setActions().stream().map(SetAction::expressionNames),
        Stream.concat(
            expression.removeActions().stream().map(RemoveAction::expressionNames),
            Stream.concat(
                expression.deleteActions().stream().map(DeleteAction::expressionNames),
                expression.addActions().stream().map(AddAction::expressionNames)
            )
        )
    );
}

The problem is that Stream.reduce() merges the maps pairwise through Expression.joinNames(), across all four action types (SET, REMOVE, DELETE, ADD). This method creates a new HashMap and copies all existing entries on every call:

public static Map<String, String> joinNames(Map<String, String> map1, Map<String, String> map2) {
    Map<String, String> result = new HashMap<>(map1);  // <- Allocate + copy all entries
    map2.forEach((key, value) -> {
        String oldValue = result.put(key, value);
        if (oldValue != null && !oldValue.equals(value)) {
            throw new IllegalArgumentException(...);
        }
    });
    return Collections.unmodifiableMap(result);
}

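Example: reduce() invokes joinNames() once per stream element after the first, and the stream above holds one map per action, so N actions produce N-1 intermediate HashMaps. With 20 actions that each contribute one name (5 per action type), it copies 1 entry, then 2, then 3, and so on up to 19, before producing the final 20-entry result. The total copying work grows quadratically with the number of actions, so the cost scales poorly with dataset size.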
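The copying cost can be made concrete with a small counting sketch. It assumes one single-entry map per stream element and uses a simplified stand-in for joinNames (no conflict check). The class, the counter, and the helper names are illustrative only and are not part of the SDK.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.IntStream;
import java.util.stream.Stream;

final class ReduceCopyCost {
    // Illustrative instrumentation: total entries copied by the copy-on-merge accumulator.
    static long copied = 0;

    // Simplified stand-in for a joinNames-style merge: copy map1, then add map2.
    static Map<String, String> copyJoin(Map<String, String> map1, Map<String, String> map2) {
        copied += map1.size();
        Map<String, String> result = new HashMap<>(map1);
        result.putAll(map2);
        return result;
    }

    // n single-entry maps, standing in for one expressionNames() map per action.
    static Stream<Map<String, String>> singleEntryMaps(int n) {
        return IntStream.range(0, n)
                        .mapToObj(i -> Collections.singletonMap("#k" + i, "attr" + i));
    }

    // Folds the maps with reduce() and reports how many entries were copied along the way.
    static long entriesCopiedByReduce(int n) {
        copied = 0;
        singleEntryMaps(n).reduce(ReduceCopyCost::copyJoin).orElseGet(Collections::emptyMap);
        return copied;
    }

    // Merges into one HashMap; each entry is written once and nothing is re-copied.
    static int sizeAfterSingleMapMerge(int n) {
        Map<String, String> merged = new HashMap<>();
        singleEntryMaps(n).forEach(merged::putAll);
        return merged.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesCopiedByReduce(20));   // 190 = 1 + 2 + ... + 19
        System.out.println(sizeAfterSingleMapMerge(20)); // 20
    }
}
```

Under these assumptions the reduce-based fold copies n(n-1)/2 entries in total, while the single-map merge touches each entry once.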

Instead we can iterate directly on each action and use a single HashMap:

private static Map<String, String> mergeExpressionNames(UpdateExpression expression) {
    Map<String, String> merged = new HashMap<>();
    
    for (SetAction action : expression.setActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    for (RemoveAction action : expression.removeActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    for (DeleteAction action : expression.deleteActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    for (AddAction action : expression.addActions()) {
        mergeNamesInto(merged, action.expressionNames());
    }
    
    return merged.isEmpty() ? Collections.emptyMap() : Collections.unmodifiableMap(merged);
}

private static void mergeNamesInto(Map<String, String> target, Map<String, String> source) {
    if (source == null || source.isEmpty()) {
        return;
    }
    source.forEach((key, value) -> {
        String oldValue = target.get(key);
        if (oldValue != null && !oldValue.equals(value)) {
            throw new IllegalArgumentException(
                String.format("Attempt to coalesce two expressions with conflicting expression names. "
                            + "Expression name key = '%s'", key));
        }
        target.put(key, value);
    });
}

This eliminates the intermediate HashMap allocations and the redundant entry copying. The same optimization also applies to mergeExpressionValues().
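The second commit on this PR describes doing only one map lookup, because put returns the previous value associated with the key. A minimal sketch of what such a helper could look like follows. It is not the committed code: the generic `mergeInto` signature, the `kind` parameter, and the error message wording are assumptions, and plain String values stand in for the SDK's attribute value type so the example stays self-contained.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class SingleLookupMerge {
    // Hypothetical generic helper usable for both names and values.
    // 'kind' only labels the error message (for example "names" or "values").
    static <V> void mergeInto(Map<String, V> target, Map<String, V> source, String kind) {
        if (source == null || source.isEmpty()) {
            return;
        }
        source.forEach((key, value) -> {
            // put returns the previous mapping (or null), so no separate get is needed.
            V oldValue = target.put(key, value);
            if (oldValue != null && !oldValue.equals(value)) {
                throw new IllegalArgumentException(
                    "Attempt to coalesce two expressions with conflicting expression "
                    + kind + ". Key = '" + key + "'");
            }
        });
    }

    public static void main(String[] args) {
        Map<String, String> merged = new HashMap<>();
        mergeInto(merged, Collections.singletonMap("#a", "attrA"), "names");
        mergeInto(merged, Collections.singletonMap("#b", "attrB"), "names");
        mergeInto(merged, Collections.singletonMap("#a", "attrA"), "names"); // same value, no conflict
        System.out.println(merged.size()); // 2
    }
}
```

One trade-off of this shape: on a conflict the target map has already been overwritten when the exception is thrown. That is harmless if the partially merged map is discarded on failure, as it would be when it is a local variable of the merging method.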

Results:

toExpression went from ~40% CPU time to ~15%

groupExpressions went from 18% CPU time to 6%
mergeExpressionValues and mergeExpressionNames went down from 18% to ~4%

Operation	Size	Master (μs/op)	String fix (μs/op)	Δ vs. master	+ Stream flattening (μs/op)	Total Δ vs. master
Update	TINY	1.38	1.395	+1.09%	1.366	-1.01%
Update	SMALL	9.174	8.653	-5.68%	7.202	-21.50%
Update	HUGE	39.23	37.51	-4.38%	34.438	-12.22%
Update	HUGE_FLAT	232.677	224.088	-3.69%	81.248	-65.08%


sonarqubecloud · 2026-02-10T23:23:03Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
94.9% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

RanVaknin added 2 commits February 9, 2026 21:30

d6916da Replace nested Stream.concat with direct iteration in expression merging

1e70ec5 Refactor merge function to only do one map lookup since put returns the previous value associated with key

RanVaknin wants to merge 2 commits into master from rvaknin/ddb-optimization-updateExpression-stream-flattening
