
src: optimize utf-8 byte length calculation using simdutf #61601

Open
mertcanaltin wants to merge 2 commits into nodejs:main from mertcanaltin:mert/stringView/buffer

Conversation

@mertcanaltin
Member

@mertcanaltin mertcanaltin commented Jan 31, 2026

I used a string view with simdutf for large strings; small strings are handled as before.
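
For readers unfamiliar with what is being measured, here is a minimal scalar sketch of computing the UTF-8 byte length of a UTF-16 string. It is not the PR's code: `Utf8LengthFromUtf16` and `DemoUtf8Length` are hypothetical names, and counting an unpaired surrogate as 3 bytes (as if it were replaced by U+FFFD) is an assumption. simdutf offers a vectorized counterpart, `simdutf::utf8_length_from_utf16`; which exact simdutf call the PR uses is not shown in this conversation.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Scalar reference: UTF-8 byte length of a UTF-16 code unit sequence.
// Assumption: an unpaired surrogate counts as 3 bytes (like U+FFFD).
std::size_t Utf8LengthFromUtf16(std::u16string_view s) {
  std::size_t bytes = 0;
  for (std::size_t i = 0; i < s.size(); ++i) {
    const char16_t c = s[i];
    if (c < 0x80) {
      bytes += 1;  // ASCII
    } else if (c < 0x800) {
      bytes += 2;  // two-byte UTF-8 sequence
    } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < s.size() &&
               s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
      bytes += 4;  // valid surrogate pair encodes one four-byte sequence
      ++i;         // skip the low surrogate
    } else {
      bytes += 3;  // rest of the BMP, or an unpaired surrogate
    }
  }
  return bytes;
}

// Usage example covering the one- to four-byte benchmark categories.
void DemoUtf8Length() {
  assert(Utf8LengthFromUtf16(u"a") == 1);
  assert(Utf8LengthFromUtf16(u"\u00e9") == 2);      // U+00E9
  assert(Utf8LengthFromUtf16(u"\u20ac") == 3);      // U+20AC
  assert(Utf8LengthFromUtf16(u"\U0001F600") == 4);  // surrogate pair
}
```

The loop handles one code unit per iteration, which is the kind of work a SIMD implementation processes in wide batches; that is where gains on long multi-byte strings would be expected to come from.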

The benchmark output for buffers is very long, so I created a gist for it.
All results:

➜  node git:(mert/stringView/buffer) ✗ node-benchmark-compare ./result.csv
                                                                                              confidence improvement accuracy (*)    (**)   (***)
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='four_bytes'                    1.49 %       ±2.19%  ±2.95%  ±3.93%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='latin1'                        0.71 %       ±2.58%  ±3.51%  ±4.72%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='one_byte'                      0.12 %       ±2.68%  ±3.62%  ±4.83%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='three_bytes'            *      2.57 %       ±2.24%  ±3.03%  ±4.03%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='two_bytes'                    -0.03 %       ±2.96%  ±4.00%  ±5.35%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='four_bytes'                      1.25 %       ±1.81%  ±2.44%  ±3.24%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='latin1'                          0.29 %       ±1.89%  ±2.55%  ±3.39%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='one_byte'                        0.92 %       ±2.08%  ±2.80%  ±3.73%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='three_bytes'                     0.63 %       ±1.63%  ±2.19%  ±2.92%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='two_bytes'                       2.27 %       ±2.71%  ±3.66%  ±4.88%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='four_bytes'                  -0.01 %       ±2.29%  ±3.09%  ±4.12%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='latin1'                       1.14 %       ±2.18%  ±2.94%  ±3.91%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='one_byte'                     1.15 %       ±2.48%  ±3.35%  ±4.46%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='three_bytes'                  1.55 %       ±2.14%  ±2.89%  ±3.84%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='two_bytes'                    0.29 %       ±2.24%  ±3.03%  ±4.03%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='four_bytes'            ***    216.82 %      ±15.57% ±21.51% ±29.69%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='latin1'                        -0.06 %       ±1.68%  ±2.27%  ±3.02%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='one_byte'               **      2.48 %       ±1.54%  ±2.09%  ±2.78%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='three_bytes'           ***    207.83 %       ±4.55%  ±6.26%  ±8.59%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='two_bytes'             ***    210.70 %       ±5.83%  ±8.06% ±11.15%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='four_bytes'                   -0.84 %       ±2.18%  ±2.97%  ±3.98%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='latin1'                        0.65 %       ±2.09%  ±2.83%  ±3.76%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='one_byte'                      2.12 %       ±2.26%  ±3.08%  ±4.15%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='three_bytes'                   0.85 %       ±2.35%  ±3.17%  ±4.21%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='two_bytes'                     0.56 %       ±2.32%  ±3.14%  ±4.21%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='four_bytes'                     -0.01 %       ±1.83%  ±2.46%  ±3.28%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='latin1'                          0.21 %       ±1.93%  ±2.61%  ±3.48%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='one_byte'                        1.56 %       ±1.82%  ±2.46%  ±3.27%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='three_bytes'                     0.20 %       ±1.48%  ±2.00%  ±2.66%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='two_bytes'                       1.10 %       ±1.85%  ±2.49%  ±3.32%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='four_bytes'                 -1.46 %       ±2.05%  ±2.77%  ±3.68%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='latin1'               *     -4.38 %       ±4.22%  ±5.73%  ±7.68%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='one_byte'                    1.62 %       ±3.54%  ±4.85%  ±6.62%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='three_bytes'                -0.72 %       ±1.49%  ±2.01%  ±2.67%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='two_bytes'                   1.30 %       ±2.01%  ±2.71%  ±3.61%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='four_bytes'           ***    241.57 %      ±16.03% ±21.83% ±29.44%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='latin1'                       -0.11 %       ±1.49%  ±2.01%  ±2.67%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='one_byte'                      1.09 %       ±1.55%  ±2.09%  ±2.78%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='three_bytes'          ***    345.77 %      ±19.42% ±26.39% ±35.52%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='two_bytes'            ***    320.12 %      ±13.94% ±18.83% ±25.06%

Be aware that when doing many comparisons the risk of a false-positive result increases.
In this case, there are 40 comparisons, you can thus expect the following amount of false-positive results:
  2.00 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.40 false positives, when considering a   1% risk acceptance (**, ***),
  0.04 false positives, when considering a 0.1% risk acceptance (***)
➜  node git:(mert/stringView/buffer) ✗ 
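
The false-positive counts printed by the tool follow from multiplying the number of comparisons by the accepted risk level (linearity of expectation). A small sketch of that arithmetic, with the hypothetical helper name `ExpectedFalsePositives`:

```cpp
#include <cassert>
#include <cmath>

// Expected number of false positives when every comparison is tested
// at the given risk level and no real difference exists.
double ExpectedFalsePositives(int comparisons, double risk) {
  assert(comparisons >= 0 && risk >= 0.0 && risk <= 1.0);
  return comparisons * risk;
}
```

With 40 comparisons this gives 2.00, 0.40, and 0.04 for the 5%, 1%, and 0.1% levels, matching the output above. A lone `*` in a 40-row table is therefore weak evidence on its own.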

@nodejs-github-bot nodejs-github-bot added buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Jan 31, 2026
@mertcanaltin mertcanaltin changed the title src: optimize utf-8 length calculation for small and large strings src: optimize utf-8 byte length calculation using simdutf Jan 31, 2026
Member

@BridgeAR BridgeAR left a comment

The benchmarks show mostly regressions, no?

@mertcanaltin
Member Author

The benchmarks show mostly regressions, no?

I only changed the UTF-8 calculation for Buffer.byteLength(). The regression results you see may be due to the local environment. Could you run benchmark-ci for me? That way we can be more certain.

@ChALkeR ChALkeR added the performance Issues and PRs related to the performance of Node.js. label Jan 31, 2026
@mertcanaltin mertcanaltin force-pushed the mert/stringView/buffer branch 2 times, most recently from 97a81a8 to c63b474 Compare January 31, 2026 22:37
@codecov

codecov bot commented Feb 1, 2026

Codecov Report

❌ Patch coverage is 69.56522% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.66%. Comparing base (330e3ee) to head (753dbd0).
⚠️ Report is 9 commits behind head on main.

Files with missing lines Patch % Lines
src/node_buffer.cc 69.56% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #61601      +/-   ##
==========================================
- Coverage   91.62%   89.66%   -1.97%     
==========================================
  Files         337      676     +339     
  Lines      140453   206320   +65867     
  Branches    21801    39526   +17725     
==========================================
+ Hits       128694   184999   +56305     
- Misses      11536    13450    +1914     
- Partials      223     7871    +7648     
Files with missing lines Coverage Δ
src/node_buffer.cc 69.79% <69.56%> (ø)

... and 462 files with indirect coverage changes

@mertcanaltin
Member Author

mertcanaltin commented Feb 1, 2026

Thanks for the review, @BridgeAR. I pushed a fix for the regressions and updated the results.

The issue was String::ValueView overhead eating the simdutf gains on one-byte strings. Now I grab external one-byte pointers directly (zero-copy), skip simdutf for small one-byte strings (≤1024) in favor of V8's native path, and use simdutf for two-byte strings above 128 bytes.

Results:
https://gist.github.com/mertcanaltin/816fdb8bb6e20d5e4f97e308db0263e9

@mertcanaltin mertcanaltin requested a review from BridgeAR February 1, 2026 14:41
@anonrig
Member

anonrig commented Feb 2, 2026

The benchmarks show mostly regressions, no?

@BridgeAR can you re-review? The benchmarks seem to be unstable enough to show differences in code paths this change does not touch.

@anonrig anonrig added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 2, 2026
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 2, 2026
@nodejs-github-bot
Collaborator

@mertcanaltin
Member Author

Hey @BridgeAR, if you're available, could you take a look at this?

@nodejs-github-bot
Collaborator

@anonrig
Member

anonrig commented Feb 27, 2026

@mertcanaltin I think the PR description (and the benchmark) needs to be updated since you added a minimum threshold.

@mertcanaltin mertcanaltin force-pushed the mert/stringView/buffer branch from 3da34b6 to ef28e76 Compare March 4, 2026 05:45
@mertcanaltin
Member Author

mertcanaltin commented Mar 4, 2026

@anonrig I updated the PR description and simplified the approach to use simdutf only for two-byte strings. One-byte strings delegate to V8's Utf8LengthV2. The result is a ~207-345% improvement on multi-byte strings.

@BridgeAR regressions are resolved, could you re-review?

@BridgeAR
Member

BridgeAR commented Mar 4, 2026

Benchmark: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1806/

@mertcanaltin
Member Author

mertcanaltin commented Mar 5, 2026

Benchmark: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/1806/

@BridgeAR Thanks for the benchmark results. The regressions on small strings were caused by extra branching overhead: the previous version ran separate IsOneByte(), threshold, and simdutf checks before reaching Utf8LengthV2, which added up for short inputs (repeat=1, repeat=2).

I fixed it by combining the checks into a single condition: if (length <= kSmallStringThreshold || source->IsOneByte()). Since || short-circuits, small strings pass the length check first and go straight to V8 without any additional calls. This keeps the fast simdutf path for large two-byte strings, while small strings take the same lightweight path as before.
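
The short-circuit behaviour described above can be sketched without V8. This is an illustration only, not the code in src/node_buffer.cc: `FakeString`, `Path`, `ChoosePath`, and `DemoShortCircuit` are hypothetical stand-ins, and the threshold value of 128 is an assumption, since the final value of kSmallStringThreshold is not stated in this thread.

```cpp
#include <cassert>
#include <cstddef>

// Assumed value for illustration; the PR does not state the final threshold.
constexpr std::size_t kSmallStringThreshold = 128;

enum class Path { kV8Native, kSimdutf };

// Hypothetical stand-in for a V8 string. It counts IsOneByte() calls so
// the effect of short-circuit evaluation can be observed.
struct FakeString {
  FakeString(std::size_t len, bool ob) : length(len), one_byte(ob) {}
  bool IsOneByte() const {
    ++is_one_byte_calls;
    return one_byte;
  }
  std::size_t length;
  bool one_byte;
  mutable int is_one_byte_calls = 0;
};

// Single combined condition: the cheap length comparison runs first, so
// small strings never pay for the IsOneByte() call.
Path ChoosePath(const FakeString& source) {
  if (source.length <= kSmallStringThreshold || source.IsOneByte()) {
    return Path::kV8Native;
  }
  return Path::kSimdutf;
}

// Usage example: a small two-byte string takes the native path without
// IsOneByte() ever being called.
void DemoShortCircuit() {
  FakeString small_two_byte(8, false);
  assert(ChoosePath(small_two_byte) == Path::kV8Native);
  assert(small_two_byte.is_one_byte_calls == 0);
}
```

Only strings that are both longer than the threshold and two-byte reach the simdutf path in this sketch, which matches the intent stated above.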

My local results for the resolved regressions:

repeat=1, two_bytes  -16.06% to +1.47% 
repeat=1, three_bytes  -17.44% to +1.52%
repeat=2, two_bytes  -19.59% to +1.86%
repeat=2, three_bytes  -10.65% to +1.06% 

All of my local results:

➜  node git:(mert/stringView/buffer) ✗ node-benchmark-compare ./result.csv
                                                                                              confidence improvement accuracy (*)    (**)   (***)
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='four_bytes'                   -1.60 %       ±2.24%  ±3.03%  ±4.04%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='latin1'                       -1.04 %       ±2.46%  ±3.34%  ±4.49%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='one_byte'                      0.40 %       ±1.99%  ±2.69%  ±3.57%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='three_bytes'            *     -1.79 %       ±1.67%  ±2.26%  ±3.02%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='base64' type='two_bytes'                    -0.77 %       ±1.93%  ±2.60%  ±3.46%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='four_bytes'                     -0.60 %       ±4.04%  ±5.55%  ±7.59%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='latin1'                          0.39 %       ±1.45%  ±1.96%  ±2.61%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='one_byte'                        1.50 %       ±1.63%  ±2.20%  ±2.93%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='three_bytes'                     1.52 %       ±1.64%  ±2.21%  ±2.94%
buffers/buffer-bytelength-string.js n=4000000 repeat=1 encoding='utf8' type='two_bytes'                *      1.47 %       ±1.37%  ±1.85%  ±2.46%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='four_bytes'                   0.10 %       ±2.07%  ±2.79%  ±3.70%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='latin1'                *     -2.04 %       ±1.84%  ±2.49%  ±3.32%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='one_byte'                     2.66 %       ±4.29%  ±5.90%  ±8.09%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='three_bytes'                  0.01 %       ±1.66%  ±2.25%  ±3.00%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='base64' type='two_bytes'                    0.61 %       ±2.10%  ±2.84%  ±3.79%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='four_bytes'            ***    196.97 %       ±3.63%  ±5.00%  ±6.88%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='latin1'                        -0.05 %       ±1.43%  ±1.93%  ±2.57%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='one_byte'                *      1.77 %       ±1.41%  ±1.91%  ±2.54%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='three_bytes'           ***    208.59 %       ±5.56%  ±7.69% ±10.62%
buffers/buffer-bytelength-string.js n=4000000 repeat=16 encoding='utf8' type='two_bytes'             ***    204.15 %       ±3.91%  ±5.37%  ±7.35%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='four_bytes'                   -0.01 %       ±1.68%  ±2.27%  ±3.01%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='latin1'                       -0.47 %       ±1.72%  ±2.32%  ±3.09%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='one_byte'                     -0.66 %       ±2.15%  ±2.91%  ±3.89%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='three_bytes'                  -0.61 %       ±1.53%  ±2.07%  ±2.75%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='base64' type='two_bytes'                    -0.23 %       ±2.05%  ±2.76%  ±3.68%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='four_bytes'                     -0.41 %       ±1.49%  ±2.01%  ±2.69%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='latin1'                          0.28 %       ±1.54%  ±2.08%  ±2.78%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='one_byte'               ***      3.25 %       ±1.61%  ±2.17%  ±2.88%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='three_bytes'                     1.06 %       ±1.73%  ±2.34%  ±3.13%
buffers/buffer-bytelength-string.js n=4000000 repeat=2 encoding='utf8' type='two_bytes'                *      1.86 %       ±1.53%  ±2.07%  ±2.78%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='four_bytes'                 -0.57 %       ±2.07%  ±2.79%  ±3.71%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='latin1'                     -2.11 %       ±2.59%  ±3.56%  ±4.90%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='one_byte'                   -1.13 %       ±2.10%  ±2.83%  ±3.77%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='three_bytes'                 0.06 %       ±3.00%  ±4.09%  ±5.54%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='base64' type='two_bytes'                  -1.17 %       ±1.72%  ±2.32%  ±3.09%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='four_bytes'           ***    222.29 %       ±5.34%  ±7.38% ±10.21%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='latin1'                       -0.05 %       ±1.04%  ±1.41%  ±1.88%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='one_byte'                      0.80 %       ±1.43%  ±1.93%  ±2.59%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='three_bytes'          ***    299.53 %       ±7.37% ±10.21% ±14.14%
buffers/buffer-bytelength-string.js n=4000000 repeat=256 encoding='utf8' type='two_bytes'            ***    301.01 %       ±8.69% ±12.04% ±16.70%

Be aware that when doing many comparisons the risk of a false-positive result increases.
In this case, there are 40 comparisons, you can thus expect the following amount of false-positive results:
  2.00 false positives, when considering a   5% risk acceptance (*, **, ***),
  0.40 false positives, when considering a   1% risk acceptance (**, ***),
  0.04 false positives, when considering a 0.1% risk acceptance (***)
➜  node git:(mert/stringView/buffer) ✗ 

Labels

buffer Issues and PRs related to the buffer subsystem. c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. performance Issues and PRs related to the performance of Node.js.

5 participants