Improve granularity of Perf.Utf8Encoding benchmarks #5048

Open
ylpoonlg wants to merge 1 commit into dotnet:main from ylpoonlg:github-encoding
Conversation

@ylpoonlg
Contributor

Implement the benchmark functions suggested in #1512:

  • GetByteCount: Covers UTF-16 validation
  • GetCharCount: Covers UTF-8 validation
  • GetBytesFromChars: Covers UTF-16 to UTF-8 transcoding
  • GetCharsFromBytes: Covers UTF-8 to UTF-16 transcoding
  • GetStringFromBytes: Covers UTF-8 validation, string allocation and UTF-16 transcoding
  • GetBytesFromString: Covers UTF-16 validation, byte array instantiation and UTF-8 transcoding

Call the methods on Encoding.UTF8 rather than on a new UTF8Encoding() instance, so that the benchmarks also exercise the JIT's devirtualization.
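For readers unfamiliar with these APIs, the six operations listed above can be sketched as plain calls on `Encoding.UTF8`. This is only an illustrative sketch: the class name `Utf8EncodingSketch`, the parameter-passing style, and the sample string are assumptions made here, while the actual benchmark class uses BenchmarkDotNet attributes and fields populated from a text file.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for the benchmark class; not the code from this PR.
public static class Utf8EncodingSketch
{
    // UTF-16 validation only: counts the UTF-8 bytes needed, writes nothing.
    public static int GetByteCount(string text) => Encoding.UTF8.GetByteCount(text);

    // UTF-8 validation only: counts the UTF-16 chars needed, writes nothing.
    public static int GetCharCount(byte[] bytes) => Encoding.UTF8.GetCharCount(bytes);

    // UTF-16 to UTF-8 transcoding into a caller-provided buffer (no allocation).
    public static int GetBytesFromChars(ReadOnlySpan<char> chars, Span<byte> destination)
        => Encoding.UTF8.GetBytes(chars, destination);

    // UTF-8 to UTF-16 transcoding into a caller-provided buffer (no allocation).
    public static int GetCharsFromBytes(ReadOnlySpan<byte> bytes, Span<char> destination)
        => Encoding.UTF8.GetChars(bytes, destination);

    // UTF-8 validation, string allocation, and UTF-16 transcoding.
    public static string GetStringFromBytes(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    // UTF-16 validation, byte array allocation, and UTF-8 transcoding.
    public static byte[] GetBytesFromString(string text) => Encoding.UTF8.GetBytes(text);

    public static void Main()
    {
        // Sample input chosen for illustration: 1-, 2-, and 3-byte UTF-8 sequences.
        string text = "h\u00E9llo \u20AC";
        byte[] bytes = GetBytesFromString(text);

        // Destination buffers are sized with the counting APIs.
        byte[] byteBuffer = new byte[GetByteCount(text)];
        char[] charBuffer = new char[GetCharCount(bytes)];

        int bytesWritten = GetBytesFromChars(text.AsSpan(), byteBuffer);
        int charsWritten = GetCharsFromBytes(bytes, charBuffer);

        Console.WriteLine($"bytes written: {bytesWritten}, chars written: {charsWritten}");
        Console.WriteLine(GetStringFromBytes(bytes) == text);
    }
}
```

Separating the counting, buffer-to-buffer, and allocating variants in this way is what lets each measurement isolate validation, transcoding, and allocation costs.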

Implement the suggestions from dotnet#1512:
* Break down benchmarks for Unicode validation and transcoding.
* Call from `Encoding.UTF8` so that devirtualization can be tested.
@ylpoonlg
Contributor Author

cc @dotnet/arm64-contrib @a74nh @SwapnilGaikwad @tannergooding

Contributor

Copilot AI left a comment

Pull request overview

Updates the UTF-8 encoding microbenchmarks to exercise more Encoding.UTF8 APIs (including span-based overloads) while simplifying setup state.

Changes:

  • Replace per-instance UTF8Encoding usage with Encoding.UTF8 calls.
  • Add new benchmarks for GetCharCount, span-based GetBytes/GetChars, and GetString from bytes.
  • Adjust setup state to include both UTF-16 chars and UTF-8 bytes buffers.

Comment on lines +21 to +32
+ _string = File.ReadAllText(Path.Combine(TextFilesRootPath, $"{Input}.txt"));
+ _bytes = Encoding.UTF8.GetBytes(_string);
+ _chars = _string.ToCharArray();
  }

  [Benchmark]
  [MemoryRandomization]
- public int GetByteCount() => _utf8Encoding.GetByteCount(_unicode);
+ public int GetByteCount() => Encoding.UTF8.GetByteCount(_string);

  [Benchmark]
- public byte[] GetBytes() => _utf8Encoding.GetBytes(_unicode);
+ [MemoryRandomization]
+ public int GetCharCount() => Encoding.UTF8.GetCharCount(_bytes);