Skip to content

refactor(es/minifier): Improve tpl to str#11415

Merged
kdy1 merged 16 commits intoswc-project:mainfrom
cuyl:refact-convert-tpl-to-str
Jan 12, 2026
Merged

refactor(es/minifier): Improve tpl to str#11415
kdy1 merged 16 commits intoswc-project:mainfrom
cuyl:refact-convert-tpl-to-str

Conversation

@cuyl
Copy link
Copy Markdown
Contributor

@cuyl cuyl commented Jan 3, 2026

Description:

  • Enhance Pure::convert_tpl_to_str to prefer cooked values as much as possible.
  • Improve Str::from_tpl_raw to return Wtf8Atom instead of Atom.
  • Add more test cases.

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Jan 3, 2026

🦋 Changeset detected

Latest commit: 02a7aab

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jan 3, 2026

Binary Sizes

File Size
swc.linux-x64-gnu.node 28M (28443720 bytes)

Commit: 58e014a

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Jan 3, 2026

Merging this PR will not alter performance

✅ 185 untouched benchmarks


Comparing cuyl:refact-convert-tpl-to-str (02a7aab) with main (144be84)

Open in CodSpeed

@cuyl cuyl force-pushed the refact-convert-tpl-to-str branch from 7eb2654 to e65312b Compare January 4, 2026 00:03
@cuyl cuyl force-pushed the refact-convert-tpl-to-str branch from e65312b to 7c9954b Compare January 4, 2026 01:02
@cuyl
Copy link
Copy Markdown
Contributor Author

cuyl commented Jan 4, 2026

I'm not sure why we have force_str_for_tpl here, I can see 2 definition:

  1. Minification::force_str_for_tpl returns false
  2. Eval::force_str_for_tpl returns true

There are some encoding logic in convert_str_value_to_tpl_cooked, why we need convert Str::value to Tpl::cooked using convert_str_value_to_tpl_cooked, Aren't they the same thing?
Can I just remove convert_str_value_to_tpl_cooked?
@kdy1

@cuyl cuyl force-pushed the refact-convert-tpl-to-str branch from 400c7bb to b94f214 Compare January 5, 2026 02:56
@cuyl cuyl force-pushed the refact-convert-tpl-to-str branch from b94f214 to f5de72d Compare January 5, 2026 03:17
@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 5, 2026

Does CI passes, if you remove it?

@cuyl cuyl marked this pull request as ready for review January 5, 2026 04:07
@cuyl cuyl requested a review from a team as a code owner January 5, 2026 04:07
@cuyl
Copy link
Copy Markdown
Contributor Author

cuyl commented Jan 5, 2026

I have changed some outputs here, but looks not a bug.
@kdy1

@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 5, 2026

@claude Review this PR. My concerns are

  • the correctness of compress_array_join_as_tpl in crates/swc_ecma_minifier/src/compress/pure/misc.rs
  • (as many as) edge cases you can think of, based on the code changes

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jan 5, 2026

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 refact-convert-tpl-to-str

I'll analyze this and get back to you.

@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 5, 2026

@claude Review this PR. My concerns are

  • the correctness of compress_array_join_as_tpl in crates/swc_ecma_minifier/src/compress/pure/misc.rs
  • (as many as) edge cases you can think of, based on the code changes

You should do gh pr checkout 11415 to checkout the PR branch

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jan 5, 2026

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 refact-convert-tpl-to-str

I'll analyze this and get back to you.

Copy link
Copy Markdown
Member

@kdy1 kdy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

I've identified several critical issues that need to be addressed:

  1. CRITICAL: Compilation error in - calls removed function
  2. HIGH: Surrogate pair handling logic has multiple bugs in
  3. MEDIUM: Error handling uses instead of graceful degradation
  4. LOW: Missing test coverage for edge cases

Detailed comments are provided on each file. Please address the critical and high priority issues before merging.

Copy link
Copy Markdown
Member

@kdy1 kdy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

I have identified several critical issues that need to be addressed:

  1. CRITICAL: Compilation error in misc.rs line 1429 - calls removed function
  2. HIGH: Surrogate pair handling logic has multiple bugs in lit.rs
  3. MEDIUM: Error handling uses panic instead of graceful degradation
  4. LOW: Missing test coverage for edge cases

Detailed comments are provided on each file. Please address the critical and high priority issues before merging.

Comment thread crates/swc_ecma_minifier/src/compress/pure/misc.rs
Comment thread crates/swc_ecma_ast/src/lit.rs
Comment thread crates/swc_ecma_ast/src/lit.rs
Comment thread crates/swc_ecma_ast/src/lit.rs
Comment thread crates/swc_ecma_ast/src/lit.rs
Comment thread crates/swc_ecma_ast/src/lit.rs
Comment thread crates/swc_ecma_minifier/tests/eval.rs Outdated
@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 5, 2026

Correction on my first comment:

You are absolutely right that my comment about line 1429 in misc.rs was incorrect. Looking at the diff more carefully, the change on that line is:

- cur_cooked.push_wtf8(&convert_str_value_to_tpl_cooked(&s.value));
+ cur_cooked.push_wtf8(&Cow::Borrowed(&s.value));

This shows you already fixed this correctly in the PR - the new version uses Cow::Borrowed. I misread the diff and thought the old function was still being called. My apologies for the confusion!

The other comments about surrogate pair handling, iterator advancement, and panic usage remain valid concerns for the from_tpl_raw implementation in lit.rs.

@cuyl
Copy link
Copy Markdown
Contributor Author

cuyl commented Jan 5, 2026

I assumed the panic is unreachable code, I will add error handing later.

@cuyl
Copy link
Copy Markdown
Contributor Author

cuyl commented Jan 7, 2026

So, Can I try to remove force_str_for_tpl in this PR?

@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 7, 2026

Yeap, it would be nice. Thank you!

@cuyl cuyl force-pushed the refact-convert-tpl-to-str branch from 766606f to 627a9f4 Compare January 7, 2026 06:33
@cuyl
Copy link
Copy Markdown
Contributor Author

cuyl commented Jan 7, 2026

Seems Optimizer::reduce_escaped_newline_for_str_lit is a inverse operation of Pure::convert_tpl_to_str, added some code to prevent infinity loop.

@cuyl cuyl force-pushed the refact-convert-tpl-to-str branch from 627a9f4 to 2a37002 Compare January 8, 2026 01:57
@cuyl
Copy link
Copy Markdown
Contributor Author

cuyl commented Jan 8, 2026

Remove force_str_for_tpl is complicated, I would open another PR instead.

@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 8, 2026

I see, it sounds good. Thanks!

@cuyl cuyl requested a review from kdy1 January 8, 2026 09:43
@cuyl
Copy link
Copy Markdown
Contributor Author

cuyl commented Jan 11, 2026

OK, now force_str_for_tpl has been removed, and all tests passed

@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 12, 2026

@claude Review this PR

@kdy1 kdy1 requested a review from Copilot January 12, 2026 01:27
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jan 12, 2026

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 refact-convert-tpl-to-str

I'll analyze this and get back to you.

Change version from patch to major for swc_ecma_ast.
@kdy1 kdy1 requested a review from a team as a code owner January 12, 2026 01:34
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

// We can be sure `prev_result` is less than 0xdc00
buf.push(unsafe {
CodePoint::from_u32_unchecked(prev_result)
});
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After pushing the previous high surrogate at lines 369-371, prev_result should be reset to NO_PREV_RESULT to avoid pushing it again at line 396. Without this reset, if the current result is also a high surrogate but not followed by \u, the previous surrogate will be pushed twice. Add prev_result = NO_PREV_RESULT; after line 371.

Suggested change
});
});
prev_result = NO_PREV_RESULT;

Copilot uses AI. Check for mistakes.
Comment on lines +350 to +354
)
} else {
break;
}
}
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When count >= max_len and result > 0x10ffff, the code emits an error but doesn't break out of the loop, allowing it to continue parsing more hex digits beyond the maximum length. This should break after emitting the error to prevent parsing extra digits. Change line 350 to include a break statement or restructure the logic to break after the if-else at line 351-353.

Suggested change
)
} else {
break;
}
}
);
}
break;
}

Copilot uses AI. Check for mistakes.
'\n' | '\u{2028}' | '\u{2029}' => {}
'u' | 'x' => {
let mut count: u8 = 0;
// result is a 4 digit hex value
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "result is a 4 digit hex value" is misleading because the result can be 2 digits for \x escapes, 4 digits for \uXXXX escapes, or up to 6 digits for \u{...} escapes. Consider updating the comment to be more accurate, such as "result accumulates the hex digits".

Suggested change
// result is a 4 digit hex value
// result accumulates the hex digits of the escape sequence

Copilot uses AI. Check for mistakes.
Comment on lines +164 to +174
iter.next();
let mut cloned_iter = iter.clone();
while let Some(cp) = cloned_iter.next() {
if let Some(ch) = cp.to_char() {
if ch == '}' {
iter = cloned_iter;
template_longer_count += 1;
break;
}
}
}
Copy link

Copilot AI Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic for handling ${ sequences in string-to-template conversion is incorrect. When converting a string literal containing ${ to a template, only the ${ needs to be escaped as \${ (adding 1 character for the backslash). The current code incorrectly tries to find a matching } and only increments the count if found. This should be simplified to: when ${ is encountered, increment template_longer_count by 1 (for the added backslash) without looking for a closing brace.

Suggested change
iter.next();
let mut cloned_iter = iter.clone();
while let Some(cp) = cloned_iter.next() {
if let Some(ch) = cp.to_char() {
if ch == '}' {
iter = cloned_iter;
template_longer_count += 1;
break;
}
}
}
// When converting to a template, `${` becomes `\${`,
// adding exactly one extra character (the backslash).
iter.next(); // Consume the '{'
template_longer_count += 1;

Copilot uses AI. Check for mistakes.
Comment thread crates/swc_ecma_minifier/src/compress/pure/strings.rs
Copy link
Copy Markdown
Member

@kdy1 kdy1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opus is happy with it

The removal of convert_str_value_to_tpl_cooked appears to be correct. The function was doing an extra level of unescaping that was unnecessary because Str::value already contains the decoded string value. The direct use of Cow::Borrowed(&s.value) is semantically appropriate.

@kdy1 kdy1 enabled auto-merge (squash) January 12, 2026 02:08
@kdy1 kdy1 merged commit 0239523 into swc-project:main Jan 12, 2026
186 checks passed
@kdy1
Copy link
Copy Markdown
Member

kdy1 commented Jan 12, 2026

Thank you!

@kdy1 kdy1 modified the milestones: Planned, 1.15.10 Jan 20, 2026
kdy1 pushed a commit that referenced this pull request Feb 5, 2026
… es5 (#11529)

**Description:**
When the target environment is below ES2015, and non-BMP characters
(like emojis) are encountered, we stop the conversion.

This is because:
1. Tpl: `🦀` (may output directly in source code or require minimal
escaping) -> shorter
2. Str (in ES5 mode): `\uD83E\uDD80` (escape sequence for surrogate
pair) -> extremely long

**Related issue (if exists):**

#11415

---------

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
@swc-project swc-project locked as resolved and limited conversation to collaborators Feb 19, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants