Average multiple microbenchmark results #5215
Pull request overview
This PR updates the GC microbenchmark infrastructure to support aggregating (averaging) results across multiple microbenchmark runs/iterations, while also renaming/refactoring parts of the analysis/presentation pipeline and introducing an outlier-removal helper.
Changes:
- Add a configurable microbenchmark iteration count (`iterations`) and wire it into suite creation and execution.
- Replace the previous single-result comparison flow with a new per-benchmark aggregation/comparison pipeline (`MicrobenchmarkResultComparison`, `GCTraceMetrics`, `GCTraceMetricComparisonResult`).
- Refactor output generation to primarily emit JSON (markdown generation is currently disabled).
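The core of the aggregation change is grouping per-iteration results by benchmark name and averaging each metric. The PR does this in C# inside the new comparison pipeline; the sketch below is a hypothetical Python stand-in (the `(name, metrics)` tuple shape is an assumption, not the PR's `MicrobenchmarkResult` type) just to make the grouping/averaging step concrete:

```python
from collections import defaultdict
from statistics import mean

def average_by_benchmark(results):
    """Group per-iteration results by benchmark name and average each metric.

    `results` is a list of (benchmark_name, {metric: value}) pairs, one pair
    per iteration. This shape is a hypothetical stand-in for the PR's
    MicrobenchmarkResult objects.
    """
    grouped = defaultdict(list)
    for name, metrics in results:
        grouped[name].append(metrics)

    averaged = {}
    for name, iterations in grouped.items():
        metric_names = iterations[0].keys()
        # Average each metric across all iterations of this benchmark.
        averaged[name] = {m: mean(it[m] for it in iterations) for m in metric_names}
    return averaged
```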
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 18 comments.
| File | Description |
|---|---|
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/CreateSuiteCommand.cs | Reads configured iteration count and applies it to microbenchmark suite environment. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/BaseSuite/MicrobenchmarksToRun.txt | Updates baseline suite benchmark list. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/BaseSuite/Microbenchmarks.yaml | Renames environment iteration setting to iterations. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/Microbenchmark/MicrobenchmarkCommand.cs | Runs microbenchmarks for iterations and switches to new aggregation/comparison logic before presenting results. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/Microbenchmark/MicrobenchmarkAnalyzeCommand.cs | Updates analysis-only command to use the new aggregation/comparison logic. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Presentation.cs | Changes presentation API to accept precomputed grouped results; markdown output path currently disabled. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Markdown.cs | Markdown generation code is commented out. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Json/JsonOutput.cs | Removes unused placeholder type. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Json.cs | Moves JSON generator to Microbenchmarks presentation namespace and updates signature for grouped results. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Configurations/Microbenchmarks.Configuration.cs | Renames iteration to iterations in microbenchmark environment configuration. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Configurations/InputConfiguration.cs | Adds iterations map to input configuration. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResultsAnalyzer.cs | Removes old analyzer/comparison pipeline. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResultComparison.cs | Adds new JSON/trace mapping, per-benchmark analysis, and aggregation/grouping logic. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResult.cs | Introduces new MicrobenchmarkResult model (namespace currently mismatched vs usage). |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkComparisonResult.cs | Updates comparison to support averaged values/outlier removal and new trace-metric comparisons. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetrics.cs | Adds trace-derived metric extraction (includes reflection/stat bugs). |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetricComparisonResult.cs | Adds averaged comparison for trace metrics (baseline vs comparand). |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetricComparison.cs | Adds helper wrapper for metric comparison construction. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/BdnJsonResult.cs | Refactors BDN JSON model types; renames top-level to BdnJsonResult. |
| src/benchmarks/gc/GC.Infrastructure/GC.Analysis.API/Statistics.cs | Adds RemoveOutliers helper (IQR method). |
| src/benchmarks/gc/GC.Infrastructure/Configurations/Run.yaml | Adds iteration configuration block (currently mismatched with new iterations input model). |
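The table above mentions a `RemoveOutliers` helper using the IQR method (in `GC.Analysis.API/Statistics.cs`). The C# implementation isn't shown in this excerpt; the following Python sketch illustrates the standard IQR rule (drop values outside [Q1 − 1.5·IQR, Q3 + 1.5·IQR], Tukey's fences). The quartile interpolation used here is an assumption — the PR's helper may use a different quartile convention:

```python
def remove_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences).

    Quartiles use simple linear interpolation; the actual helper in the PR
    may compute them differently.
    """
    if len(values) < 4:
        return list(values)  # too few points to estimate quartiles meaningfully
    s = sorted(values)

    def quantile(q):
        pos = (len(s) - 1) * q
        lo, hi = int(pos), min(int(pos) + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (pos - lo)

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]
```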
```csharp
if (format == "markdown")
{
    Markdown.GenerateTable(configuration, comparisonResults, executionDetails, Path.Combine(configuration.Output.Path, "Results.md"));
    //Markdown.GenerateTable(configuration, comparisonResultsGroupedByName, executionDetails, Path.Combine(configuration.Output.Path, "Results.md"));
```
```csharp
PauseDurationMSec_MeanWhereIsEphemeral =
    GoodLinq.Average(GoodLinq.Where(processData.GCs, (gc => gc.Generation == 1 || gc.Generation == 0)), (gc => gc.PauseDurationMSec));
PauseDurationSeconds_SumWhereIsGen1 =
    GoodLinq.Sum(GoodLinq.Where(processData.GCs, (gc => gc.Generation == 1)), (gc => gc.PauseDurationMSec));
```
```csharp
IReadOnlyList<MicrobenchmarkComparisonResults> comparisonResults = MicrobenchmarkResultsAnalyzer.GetComparisons(configuration);
Presentation.Present(configuration, new()); // Execution details aren't available for the analysis-only mode.
```
```csharp
Run run = configuration.Runs.Values.FirstOrDefault();
```
```csharp
Presentation.Present(configuration, comparisonResultsGroupedName, new()); // Execution details aren't available for the analysis-only mode.
Directory.SetCurrentDirectory(currentDirectory);
AnsiConsole.Markup($"[bold green] ({DateTime.Now}) Wrote Microbechmark Results to: {Markup.Escape(Path.Combine(configuration.Output.Path, "Results.md"))} [/]");
```
```text
"System.Tests.Perf_GC<Byte>.NewOperator_Array(length: 10000)"
"System.Tests.Perf_GC<Char>.NewOperator_Array(length: 1000)"
"System.Tests.Perf_GC<Char>.NewOperator_Array(length: 10000)"
"System.IO.Tests.Perf_File.ReadAllBytesAsync(size: 104857600)"
```
The benchmark result is written to `<run name>/<datetime string>/<Namespace>.-report-full.json`, while the trace file is collected at `<run name>/<full name>_<idx>.etl.zip`. A duplicated benchmark therefore leads to a mismatch between the JSON and trace files.
```csharp
namespace GC.Infrastructure.Core.Analysis
{
    public sealed class GCTraceMetrics
```
Rename `ResultItem` to `GCTraceMetrics` and move it to the `GC.Infrastructure.Core.Analysis` namespace, since it's used by both gcperfsim and microbenchmarks.
```csharp
namespace GC.Infrastructure.Core.Analysis
{
    public sealed class GCTraceMetricComparisonResult
```
Rename `ComparisonResult` to `GCTraceMetricComparisonResult` and move it to `GC.Infrastructure.Core.Analysis`.
```csharp
{
    public sealed class MicrobenchmarkResult
```
Since it's a data transfer object, "BdnJsonResult" is more explicit.
```csharp
string[] jsonFiles = Directory.GetFiles(outputPathForRun, "*full.json", SearchOption.AllDirectories);

Parallel.ForEach(jsonFiles, (jsonFile) =>
{
    BdnJsonResult results = JsonConvert.DeserializeObject<BdnJsonResult>(File.ReadAllText(jsonFile));
    string fullName = results.Benchmarks.FirstOrDefault()?.FullName;
    benchmarkFullNameJsonMap[fullName] = benchmarkFullNameJsonMap.GetValueOrDefault(fullName, new());
    benchmarkFullNameJsonMap[fullName].Add(jsonFile);
});
```
```csharp
string[] sortedJsonFiles = jsonTraceMap.Keys
    .OrderBy(jsonFile => Path.GetFileName(Path.GetDirectoryName(jsonFile)))
    .ToArray();

string traceFileNameTemplate = _benchmarkNameToTraceFilePatternMap[benchmarkFullName];

string[] sortedTraceFiles = Enumerable.Where(Directory.GetFiles(outputPathForRun, "*.etl.zip", SearchOption.TopDirectoryOnly), traceFile =>
        Path.GetFileName(traceFile).ToLower().Contains(traceFileNameTemplate.ToLower()))
    .OrderBy(traceFile => traceFile)
```
```csharp
// If property isn't found on the GCTraceMetrics, look in GCStats.
// TODO: Add the case where we look into the map.
else
{
    pInfo = typeof(GCStats).GetProperty(metricName, BindingFlags.Instance | BindingFlags.Public);
    if (pInfo == null)
    {
        FieldInfo fieldInfo = typeof(GCStats).GetField(metricName, BindingFlags.Instance | BindingFlags.Public);
        if (fieldInfo == null)
        {
            // Out of luck!
            OriginalBaselineMetricCollection = Array.Empty<double>();
            OriginalComparandMetricCollection = Array.Empty<double>();
            OutliersFreeBaselineMetricCollection = Array.Empty<double>();
            OutliersFreeComparandMetricCollection = Array.Empty<double>();
            AveragedBaselineMetric = double.NaN;
            AveragedComparandMetric = double.NaN;
            return;
        }
        else
        {
            OriginalBaselineMetricCollection = GoodLinq.Select(baselines, baseline => (double)fieldInfo.GetValue(baseline));
            OriginalComparandMetricCollection = GoodLinq.Select(comparands, comparand => (double)fieldInfo.GetValue(comparand));
        }
    }
    else
    {
        OriginalBaselineMetricCollection = GoodLinq.Select(baselines, baseline => (double)pInfo.GetValue(baseline));
        OriginalComparandMetricCollection = GoodLinq.Select(comparands, comparand => (double)pInfo.GetValue(comparand));
    }
```
```csharp
string[] sortedTraceFiles = Enumerable.Where(Directory.GetFiles(outputPathForRun, "*.etl.zip", SearchOption.TopDirectoryOnly), traceFile =>
        Path.GetFileName(traceFile).ToLower().Contains(traceFileNameTemplate.ToLower()))
    .OrderBy(traceFile => traceFile)
    .ToArray();

if (sortedJsonFiles.Length != sortedTraceFiles.Length)
{
    throw new InvalidOperationException(
        $"The number of JSON files ({sortedJsonFiles.Length}) does not match the number of trace files ({sortedTraceFiles.Length}) for benchmark: {benchmarkFullName}");
}
```
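The hunk above sorts both file lists and relies on positional pairing after the count check. Assuming one JSON report and one trace file per iteration, and that sorting yields the same iteration order for both (the invariant the length check guards), the pairing reduces to a zip. The Python sketch below illustrates that idea; the function and argument names are hypothetical stand-ins, not the PR's C# code:

```python
import os

def pair_json_with_traces(json_files, trace_files, benchmark_full_name):
    """Pair iteration JSON reports with trace files by sorted position.

    Assumes both lists contain one entry per iteration and that sorting
    gives both the same iteration order; raises on a count mismatch, as
    the PR's code does.
    """
    # JSON reports live in per-iteration (datetime-named) directories,
    # so sort by the containing directory name.
    sorted_json = sorted(json_files, key=lambda p: os.path.basename(os.path.dirname(p)))
    sorted_traces = sorted(trace_files)
    if len(sorted_json) != len(sorted_traces):
        raise ValueError(
            f"JSON/trace count mismatch ({len(sorted_json)} vs {len(sorted_traces)}) "
            f"for benchmark: {benchmark_full_name}")
    return list(zip(sorted_json, sorted_traces))
```

Note that positional pairing is exactly what breaks in the duplicated-benchmark scenario a reviewer flags below: if two benchmarks share a report file name, the sorted orders no longer correspond.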
```csharp
runsToResults[run.Value] = runsToResults.GetValueOrDefault(run.Value, new());

Parallel.ForEach(jsonTraceMap, jsonTracePair =>
{
    string jsonPath = jsonTracePair.Key;
    string tracePath = jsonTracePair.Value;

    BdnJsonResult results = JsonConvert.DeserializeObject<BdnJsonResult>(File.ReadAllText(jsonPath));

    foreach (var benchmark in results?.Benchmarks)
    {
        Statistics statistics = benchmark.Statistics;

        MicrobenchmarkResult microbenchmarkResult = new()
        {
            Statistics = statistics,
            Parent = run.Value,
            MicrobenchmarkName = benchmarkFullName,
        };

        if (!excludeTraces)
        {
            using var analyzer = AnalyzerManager.GetAnalyzer(tracePath);
            List<GCProcessData> allPertinentProcesses = analyzer.GetProcessGCData("dotnet");
            List<GCProcessData> corerunProcesses = analyzer.GetProcessGCData("corerun");
            allPertinentProcesses.AddRange(corerunProcesses);

            GCProcessData? benchmarkGCData = null;
            foreach (var process in allPertinentProcesses)
            {
                string commandLine = process.CommandLine.Replace("\"", "").Replace("\\", "");
                string runCleaned = benchmark.FullName.Replace("\"", "").Replace("\\", "");
                if (commandLine.Contains(runCleaned) && commandLine.Contains("--benchmarkName"))
                {
                    benchmarkGCData = process;
                    break;
                }
            }

            if (benchmarkGCData != null)
            {
                int processID = benchmarkGCData.ProcessID;
                microbenchmarkResult.GCData = benchmarkGCData;
                microbenchmarkResult.GCTraceMetrics = new GCTraceMetrics(benchmarkGCData, tracePath, benchmark.FullName);
                /*
                TODO: THIS NEEDS TO BE ADDED BACK.
                if (configuration.Output.cpu_columns != null && configuration.Output.cpu_columns.Count > 0)
                {
                    // TODO: Add parameterize.
                    benchmark.Value.GCData.Parent.AddCPUAnalysis(yamlPath: @"C:\Users\musharm\source\repos\GC.Analysis.API\GC.Analysis.API\CPUAnalysis\DefaultMethods.yaml",
                        symbolLogFile: Path.Combine(configuration.Output.Path, run.Key, Guid.NewGuid() + ".txt"),
                        symbolPath: Path.Combine(configuration.Output.Path, run.Key));
                    var d1 = benchmark.Value.GCData.Parent.CPUAnalyzer.GetCPUDataForProcessName("dotnet");
                    d1.AddRange(benchmark.Value.GCData.Parent.CPUAnalyzer.GetCPUDataForProcessName("corerun"));
                    benchmark.Value.CPUData = d1.FirstOrDefault(p => p.ProcessID == processID);
                }
                */
            }
        }
        runsToResults[run.Value].Add(microbenchmarkResult);
    }
});
```
```csharp
Run run = configuration.Runs.Values.FirstOrDefault();
string outputPathForRun = Path.Combine(configuration.Output.Path, run.Name);
```
```csharp
public double CountIsBlockingGen2 { get; }
public double PauseDurationSeconds_SumWhereIsGen1 { get; }
public double PauseDurationMSec_MeanWhereIsEphemeral { get; }
public double PromotedMB_MeanWhereIsGen1 { get; }
```
This PR aims at calculating the average value of multiple microbenchmark results. The work revolves around: