optimize TextDecoder for legacy encodings by anonrig · Pull Request #6295 · cloudflare/workerd

anonrig · 2026-03-11T02:21:05Z

This should improve the performance of TextDecoder for legacy encodings (as reported by @ChALkeR).

ask-bonk

I'm Bonk, and I've done a quick review of your PR.

This PR optimizes TextDecoder for legacy encodings by reusing a Rust-side output buffer across decode calls instead of allocating a new `Vec<u16>` each time. The decoded data is returned through the CXX bridge as a raw pointer+length pair instead of by moving the `Vec`.
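The buffer-reuse idea summarized above can be sketched in plain Rust. This is only an illustration under stated assumptions, not the PR's code: it does not use the CXX bridge, `RawOutput`, `Decoder`, and `copy_out` are hypothetical names, and the byte-widening loop is a toy stand-in for a real legacy decoder.

```rust
use std::slice;

/// Hypothetical stand-in for the pointer+length pair handed to the caller.
#[repr(C)]
#[derive(Clone, Copy)]
struct RawOutput {
    ptr: *const u16,
    len: usize,
}

/// Hypothetical decoder that owns its output buffer for its whole lifetime.
struct Decoder {
    output: Vec<u16>,
}

impl Decoder {
    fn new() -> Self {
        Decoder { output: Vec::new() }
    }

    /// Toy decode step: widens each input byte to one UTF-16 code unit.
    /// A real legacy decoder would do the actual table lookups here.
    fn decode(&mut self, input: &[u8]) -> RawOutput {
        // `clear` drops the old contents but keeps the allocated capacity,
        // so repeated calls do not allocate once the buffer is large enough.
        self.output.clear();
        self.output.extend(input.iter().map(|&b| u16::from(b)));
        RawOutput {
            ptr: self.output.as_ptr(),
            len: self.output.len(),
        }
    }
}

/// Copies the decoded code units out of the decoder-owned buffer.
///
/// # Safety
/// The decoder that produced `raw` must still be alive and must not have
/// been used for another decode since `raw` was returned.
unsafe fn copy_out(raw: RawOutput) -> Vec<u16> {
    // SAFETY: upheld by the caller as described above.
    unsafe { slice::from_raw_parts(raw.ptr, raw.len) }.to_vec()
}
```

The trade-off in this sketch is that the returned pointer is only valid until the next `decode` call, so the caller has to consume or copy the data before decoding again.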

Issues

1. [Critical] Trailing semicolon makes `decode()` fail to compile

Line 194 of `lib.rs` has `};` (with a trailing semicolon) on the final `ffi::DecodeResult` expression. In Rust, this turns the tail expression into a statement, changing the function's implicit return from `ffi::DecodeResult` to `()`. This is a compile error, because the function signature requires `-> ffi::DecodeResult`. (The early return on the Malformed path at line 156 is correct, since it uses `return` explicitly.)
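The difference between a tail expression and a statement can be shown with a minimal sketch. The `DecodeResult` struct and its fields here are hypothetical stand-ins, not the actual `ffi::DecodeResult` definition from the PR.

```rust
/// Hypothetical stand-in for the bridge's result type; field names are assumptions.
#[derive(Debug, PartialEq)]
struct DecodeResult {
    len: usize,
    had_error: bool,
}

fn decode(input: &[u8]) -> DecodeResult {
    if input.is_empty() {
        // Early exit with an explicit `return`: the trailing semicolon is fine here.
        return DecodeResult { len: 0, had_error: true };
    }

    // Tail expression: no semicolon, so this value is the function's result.
    // Writing `DecodeResult { ... };` instead would make the body evaluate
    // to `()`, and rustc would reject it with a mismatched-types error (E0308).
    DecodeResult { len: input.len(), had_error: false }
}
```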

2. [Low] Unbounded buffer growth

The `state.output` buffer doubles on `OutputFull` but is never shrunk. A single large decode permanently inflates the allocation for the lifetime of the decoder. For streaming use cases (`TextDecoderStream`), this is probably fine since decoders are short-lived. For long-lived decoders that occasionally see a large chunk, you may want to shrink the buffer if it exceeds some threshold (e.g., `if state.output.capacity() > 4 * state.output.len() { state.output.shrink_to(...) }`). Not blocking, just something to be aware of.
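One way the shrinking suggestion above might look is sketched below. The `State` struct, the `write` and `trim` helpers, and the `MAX_RETAINED` floor are assumptions made for illustration; only the `capacity() > 4 * len()` check and the use of `Vec::shrink_to` come from the review comment.

```rust
/// Hypothetical floor: never shrink below this many code units.
const MAX_RETAINED: usize = 4096;

/// Hypothetical decoder state holding the reusable output buffer.
struct State {
    output: Vec<u16>,
}

impl State {
    /// Replaces the buffer contents, reusing the existing allocation when possible.
    fn write(&mut self, units: &[u16]) -> usize {
        self.output.clear();
        self.output.extend_from_slice(units);
        self.output.len()
    }

    /// Releases excess capacity after an unusually large decode.
    fn trim(&mut self) {
        let cap = self.output.capacity();
        // Shrink only when the buffer is both large in absolute terms and
        // mostly unused, to avoid reallocating on every small fluctuation.
        if cap > MAX_RETAINED && cap > 4 * self.output.len() {
            let target = self.output.len().max(MAX_RETAINED);
            // `shrink_to` keeps the contents and at least `target` capacity.
            self.output.shrink_to(target);
        }
    }
}
```

Calling `trim` only after the caller has consumed the output keeps the common path allocation-free while still bounding the memory retained by a long-lived decoder.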

src/rust/encoding/lib.rs

ask-bonk · 2026-03-11T02:23:32Z

I'm Bonk, and I've done a quick review of your PR.

The review has been posted on PR #6295. Here's a summary of what I found:

1. [Critical] Trailing semicolon on lib.rs:194 — The final ffi::DecodeResult { ... }; has a stray semicolon that turns the tail expression into a statement. This makes the function return () instead of ffi::DecodeResult, which is a compile error. I posted a suggestion to remove the semicolon.

2. [Low] Unbounded buffer growth — The reusable state.output buffer doubles on OutputFull but is never shrunk back. A single large decode permanently inflates memory for the decoder's lifetime. Not blocking, but noted as something to consider for long-lived decoders.

github-actions · 2026-03-11T02:28:58Z

The generated output of @cloudflare/workers-types matches the snapshot in types/generated-snapshot 🎉

Co-authored-by: ask-bonk[bot] <249159057+ask-bonk[bot]@users.noreply.github.com>

src/rust/encoding/lib.rs

optimize TextDecoder for legacy encodings

dfd8791

anonrig requested a review from mikea March 11, 2026 02:21

anonrig requested review from a team as code owners March 11, 2026 02:21

anonrig requested review from guybedford and jasnell March 11, 2026 02:22

ask-bonk bot reviewed Mar 11, 2026

src/rust/encoding/lib.rs (outdated, resolved)

Update lib.rs

e8cb5fb

Co-authored-by: ask-bonk[bot] <249159057+ask-bonk[bot]@users.noreply.github.com>

jasnell reviewed Mar 11, 2026

src/rust/encoding/lib.rs (resolved)

jasnell approved these changes Mar 11, 2026

mikea reviewed Mar 11, 2026

src/rust/encoding/lib.rs (outdated, resolved)

mikea requested changes Mar 11, 2026

anonrig requested a review from mikea March 12, 2026 20:47

jasnell reviewed Mar 14, 2026

src/rust/encoding/lib.rs (resolved)

jasnell approved these changes Mar 14, 2026

address pr reviews

39c25ea

anonrig force-pushed the yagiz/optimize-text-decoder branch from e99a6d4 to 39c25ea Compare March 16, 2026 13:37

anonrig enabled auto-merge (squash) March 16, 2026 13:37

anonrig wants to merge 3 commits into main from yagiz/optimize-text-decoder
