refactor(es/minifier): Improve tpl to str by cuyl · Pull Request #11415 · swc-project/swc

cuyl · 2026-01-03T16:15:02Z

Description:

Enhance Pure::convert_tpl_to_str to prefer cooked values as much as possible.
Improve Str::from_tpl_raw to return Wtf8Atom instead of Atom.
Add more test cases.

changeset-bot · 2026-01-03T16:15:07Z

🦋 Changeset detected

Latest commit: 02a7aab

The changes in this PR will be included in the next version bump.

github-actions · 2026-01-03T16:23:53Z

Binary Sizes

File	Size
`swc.linux-x64-gnu.node`	28M (28443720 bytes)

Commit: 58e014a

codspeed-hq · 2026-01-03T16:27:30Z

Merging this PR will not alter performance

✅ 185 untouched benchmarks

Comparing cuyl:refact-convert-tpl-to-str (02a7aab) with main (144be84)

cuyl · 2026-01-04T01:58:40Z

I'm not sure why we have force_str_for_tpl here. I can see two definitions:

Minification::force_str_for_tpl returns false
Eval::force_str_for_tpl returns true

There is some encoding logic in convert_str_value_to_tpl_cooked. Why do we need to convert Str::value to Tpl::cooked using convert_str_value_to_tpl_cooked? Aren't they the same thing?
Can I just remove convert_str_value_to_tpl_cooked?
@kdy1

kdy1 · 2026-01-05T04:06:42Z

Does CI pass if you remove it?

cuyl · 2026-01-05T04:08:37Z

Some test outputs changed here, but it does not look like a bug.
@kdy1

kdy1 · 2026-01-05T05:05:38Z

@claude Review this PR. My concerns are

the correctness of compress_array_join_as_tpl in crates/swc_ecma_minifier/src/compress/pure/misc.rs
as many edge cases as you can think of, based on the code changes

claude · 2026-01-05T05:06:03Z

Claude encountered an error

Command failed: git fetch origin --depth=20 refact-convert-tpl-to-str

I'll analyze this and get back to you.

kdy1 · 2026-01-05T05:09:36Z

@claude Review this PR. My concerns are

the correctness of compress_array_join_as_tpl in crates/swc_ecma_minifier/src/compress/pure/misc.rs
as many edge cases as you can think of, based on the code changes

You should run gh pr checkout 11415 to check out the PR branch.

claude · 2026-01-05T05:10:02Z

Claude encountered an error

Command failed: git fetch origin --depth=20 refact-convert-tpl-to-str

I'll analyze this and get back to you.

kdy1

Review Summary

I have identified several critical issues that need to be addressed:

CRITICAL: Compilation error in misc.rs line 1429 - calls removed function
HIGH: Surrogate pair handling logic has multiple bugs in lit.rs
MEDIUM: Error handling uses panic instead of graceful degradation
LOW: Missing test coverage for edge cases

Detailed comments are provided on each file. Please address the critical and high priority issues before merging.

kdy1 · 2026-01-05T05:21:28Z

Correction on my first comment:

You are absolutely right that my comment about line 1429 in misc.rs was incorrect. Looking at the diff more carefully, the change on that line is:

- cur_cooked.push_wtf8(&convert_str_value_to_tpl_cooked(&s.value));
+ cur_cooked.push_wtf8(&Cow::Borrowed(&s.value));

This shows you already fixed this correctly in the PR - the new version uses Cow::Borrowed. I misread the diff and thought the old function was still being called. My apologies for the confusion!

The other comments about surrogate pair handling, iterator advancement, and panic usage remain valid concerns for the from_tpl_raw implementation in lit.rs.

cuyl · 2026-01-05T05:43:22Z

I assumed the panic was unreachable code. I will add error handling later.

cuyl · 2026-01-07T03:25:54Z

So, can I try to remove force_str_for_tpl in this PR?

kdy1 · 2026-01-07T03:52:45Z

Yep, that would be nice. Thank you!

cuyl · 2026-01-07T06:37:21Z

It seems Optimizer::reduce_escaped_newline_for_str_lit is an inverse operation of Pure::convert_tpl_to_str, so I added some code to prevent an infinite loop.

cuyl · 2026-01-08T02:01:20Z

Removing force_str_for_tpl is complicated, so I will open another PR instead.

kdy1 · 2026-01-08T02:04:12Z

I see, it sounds good. Thanks!

cuyl · 2026-01-11T07:21:36Z

OK, force_str_for_tpl has now been removed, and all tests pass.

kdy1 · 2026-01-12T01:27:29Z

@claude Review this PR

claude · 2026-01-12T01:27:51Z

Claude encountered an error

Command failed: git fetch origin --depth=20 refact-convert-tpl-to-str

I'll analyze this and get back to you.

Change version from patch to major for swc_ecma_ast.

Copilot

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

Copilot · 2026-01-12T01:34:51Z

+                                            // We can be sure `prev_result` is less than 0xdc00
+                                            buf.push(unsafe {
+                                                CodePoint::from_u32_unchecked(prev_result)
+                                            });


After pushing the previous high surrogate at lines 369-371, prev_result should be reset to NO_PREV_RESULT to avoid pushing it again at line 396. Without this reset, if the current result is also a high surrogate but not followed by \u, the previous surrogate will be pushed twice. Add prev_result = NO_PREV_RESULT; after line 371.

Suggested change

- });
+ });
+ prev_result = NO_PREV_RESULT;
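
The effect of the suggested reset can be illustrated in isolation. The following is a minimal sketch and not the code from lit.rs: `decode_units`, `NO_PREV`, and the slice-of-`u32` input are hypothetical stand-ins for the real iterator-based parser. It shows a pending high surrogate being flushed exactly once because the pending slot is cleared right after the push.

```rust
/// Sentinel meaning "no high surrogate is pending" (hypothetical name).
const NO_PREV: u32 = u32::MAX;

/// Decode a sequence of UTF-16 code units (as they would come out of
/// `\uXXXX` escapes) into code points, keeping lone surrogates as-is.
fn decode_units(units: &[u32]) -> Vec<u32> {
    let mut out = Vec::new();
    let mut prev = NO_PREV;
    for &u in units {
        let is_high = (0xD800..=0xDBFF).contains(&u);
        let is_low = (0xDC00..=0xDFFF).contains(&u);
        if prev != NO_PREV {
            if is_low {
                // A valid pair: combine both halves into one code point.
                out.push(0x10000 + ((prev - 0xD800) << 10) + (u - 0xDC00));
                prev = NO_PREV;
                continue;
            }
            // Lone high surrogate: flush it, then reset the pending slot
            // so it cannot be flushed a second time later.
            out.push(prev);
            prev = NO_PREV;
        }
        if is_high {
            prev = u;
        } else {
            out.push(u);
        }
    }
    // Flush a high surrogate left pending at the end of the input.
    if prev != NO_PREV {
        out.push(prev);
    }
    out
}
```

In this sketch, dropping the `prev = NO_PREV;` line after the lone-surrogate flush would make the input `[0xD83E, 0x41]` emit the surrogate a second time at the final flush, which is the kind of double push the comment describes.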

Copilot · 2026-01-12T01:34:52Z

+                                            )
+                                        } else {
+                                            break;
+                                        }
+                                    }


When count >= max_len and result > 0x10ffff, the code emits an error but doesn't break out of the loop, allowing it to continue parsing more hex digits beyond the maximum length. This should break after emitting the error to prevent parsing extra digits. Change line 350 to include a break statement or restructure the logic to break after the if-else at line 351-353.

Suggested change

-         )
-     } else {
-         break;
-     }
- }
+         );
+     }
+     break;
+ }

Copilot · 2026-01-12T01:34:52Z

+                            '\n' | '\u{2028}' | '\u{2029}' => {}
+                            'u' | 'x' => {
+                                let mut count: u8 = 0;
+                                // result is a 4 digit hex value


The comment "result is a 4 digit hex value" is misleading because the result can be 2 digits for \x escapes, 4 digits for \uXXXX escapes, or up to 6 digits for \u{...} escapes. Consider updating the comment to be more accurate, such as "result accumulates the hex digits".

Suggested change

- // result is a 4 digit hex value
+ // result accumulates the hex digits of the escape sequence
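
To make the three escape shapes mentioned in this comment concrete, here is a small standalone sketch. It is an assumption-laden illustration, not the parser in lit.rs: `parse_hex_escape`, `fixed_hex`, and `parse_hex_digits` are hypothetical helpers that take the text after the backslash and return the decoded value plus the number of bytes consumed.

```rust
/// Parse a run of hex digits; rejects empty input and non-hex characters
/// (including the `+` sign that `from_str_radix` would otherwise accept).
fn parse_hex_digits(digits: &str) -> Option<u32> {
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    u32::from_str_radix(digits, 16).ok()
}

/// Parse exactly `len` hex digits starting at byte offset `start`.
fn fixed_hex(s: &str, start: usize, len: usize) -> Option<(u32, usize)> {
    let digits = s.get(start..start + len)?;
    parse_hex_digits(digits).map(|v| (v, start + len))
}

/// Parse the text following a backslash: `xHH`, `uHHHH`, or `u{H...}`.
/// Returns the value and how many bytes of `s` were consumed.
fn parse_hex_escape(s: &str) -> Option<(u32, usize)> {
    let bytes = s.as_bytes();
    match *bytes.first()? {
        // `\xHH`: exactly two hex digits.
        b'x' => fixed_hex(s, 1, 2),
        // `\u{...}`: variable length, capped at the largest code point.
        b'u' if bytes.get(1) == Some(&b'{') => {
            let end = s.find('}')?;
            let value = parse_hex_digits(&s[2..end])?;
            (value <= 0x10FFFF).then_some((value, end + 1))
        }
        // `\uHHHH`: exactly four hex digits.
        b'u' => fixed_hex(s, 1, 4),
        _ => None,
    }
}
```

Because the accumulated value can come from two, four, or a variable number of digits, a comment describing it as "a 4 digit hex value" only fits the middle case.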

Copilot · 2026-01-12T01:34:52Z

+                            iter.next();
+                            let mut cloned_iter = iter.clone();
+                            while let Some(cp) = cloned_iter.next() {
+                                if let Some(ch) = cp.to_char() {
+                                    if ch == '}' {
+                                        iter = cloned_iter;
+                                        template_longer_count += 1;
+                                        break;
+                                    }
+                                }
+                            }


The logic for handling ${ sequences in string-to-template conversion is incorrect. When converting a string literal containing ${ to a template, only the ${ needs to be escaped as \${ (adding 1 character for the backslash). The current code incorrectly tries to find a matching } and only increments the count if found. This should be simplified to: when ${ is encountered, increment template_longer_count by 1 (for the added backslash) without looking for a closing brace.

Suggested change

- iter.next();
- let mut cloned_iter = iter.clone();
- while let Some(cp) = cloned_iter.next() {
-     if let Some(ch) = cp.to_char() {
-         if ch == '}' {
-             iter = cloned_iter;
-             template_longer_count += 1;
-             break;
-         }
-     }
- }
+ // When converting to a template, `${` becomes `\${`,
+ // adding exactly one extra character (the backslash).
+ iter.next(); // Consume the '{'
+ template_longer_count += 1;
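
The counting rule proposed in this comment can be sketched on a plain `&str`. This is only an illustration of the suggestion, under the assumption that each `${` costs one extra backslash in a template; `dollar_brace_escapes` is a hypothetical name, and the real code works on WTF-8 code points and tracks other characters as well.

```rust
/// Count how many extra characters a template literal needs, compared with
/// the raw text, to escape every `${` as `\${`. No closing brace is required.
fn dollar_brace_escapes(s: &str) -> usize {
    let mut extra = 0;
    let mut chars = s.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '$' && chars.peek() == Some(&'{') {
            chars.next(); // Consume the '{'
            extra += 1; // One backslash for this `${`
        }
    }
    extra
}
```

Under this rule, `a${b` counts one escape even though no `}` follows, whereas the brace-matching loop quoted above would count zero for that input.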

kdy1

Opus is happy with it

The removal of convert_str_value_to_tpl_cooked appears to be correct. The function was doing an extra level of unescaping that was unnecessary because Str::value already contains the decoded string value. The direct use of Cow::Borrowed(&s.value) is semantically appropriate.

kdy1 · 2026-01-12T03:02:17Z

Thank you!

… es5 (#11529)

**Description:**

When the target environment is below ES2015 and non-BMP characters (like emojis) are encountered, we stop the conversion. This is because:

1. Tpl: `🦀` (may be output directly in source code or require minimal escaping) -> shorter
2. Str (in ES5 mode): `\uD83E\uDD80` (escape sequence for surrogate pair) -> extremely long

**Related issue (if exists):** #11415

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
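
The size argument in this follow-up can be checked with a short sketch. It rests on assumptions that are not confirmed by the thread: that an ES5 string literal writes every non-BMP character as a 12-character `\uXXXX\uXXXX` pair and every other character as one character. The names `has_non_bmp`, `es5_escaped_len`, and `should_convert_tpl_to_str` are hypothetical and are not SWC APIs.

```rust
/// True if the string contains a character outside the Basic Multilingual Plane.
fn has_non_bmp(s: &str) -> bool {
    s.chars().any(|c| c as u32 > 0xFFFF)
}

/// Length of the string body if each non-BMP character is written as a
/// surrogate-pair escape such as `\uD83E\uDD80` (assumed ES5-style output).
fn es5_escaped_len(s: &str) -> usize {
    s.chars()
        .map(|c| if c as u32 > 0xFFFF { 12 } else { 1 })
        .sum()
}

/// Hypothetical guard: skip the template-to-string conversion when the
/// target is below ES2015 and the cooked value has a non-BMP character.
fn should_convert_tpl_to_str(cooked: &str, target_is_es2015_or_later: bool) -> bool {
    target_is_es2015_or_later || !has_non_bmp(cooked)
}
```

For the crab emoji, the escaped string body is 12 characters long against a single character in the template, which is why the conversion is skipped for such targets in this sketch.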

cuyl force-pushed the refact-convert-tpl-to-str branch from 7eb2654 to e65312b January 4, 2026 00:03

refactor(es/minifier): Improve tpl to str

7c9954b

cuyl force-pushed the refact-convert-tpl-to-str branch from e65312b to 7c9954b January 4, 2026 01:02

cuyl force-pushed the refact-convert-tpl-to-str branch from 400c7bb to b94f214 January 5, 2026 02:56

remove convert_str_value_to_tpl_cooked

f5de72d

cuyl force-pushed the refact-convert-tpl-to-str branch from b94f214 to f5de72d January 5, 2026 03:17

cuyl marked this pull request as ready for review January 5, 2026 04:07

cuyl requested a review from a team as a code owner January 5, 2026 04:07