Skip to content

Pawang/perf testing parallel2#310215

Draft
pwang347 wants to merge 17 commits intomainfrom
pawang/perfTestingParallel2
Draft

Pawang/perf testing parallel2#310215
pwang347 wants to merge 17 commits intomainfrom
pawang/perfTestingParallel2

Conversation

@pwang347
Copy link
Copy Markdown
Member

No description provided.

Copilot AI review requested due to automatic review settings April 15, 2026 19:52
@github-actions
Copy link
Copy Markdown
Contributor

Screenshot Changes

Base: 7a79f779 Current: 1ae67c00

Changed (2)

chat/aiCustomizations/aiCustomizationManagementEditor/McpBrowseMode/Light
Before After
before after
editor/inlineCompletions/other/JumpToHint/Dark
Before After
before after

blocks-ci screenshots changed

Replace the contents of test/componentFixtures/blocks-ci-screenshots.md with:

Updated blocks-ci-screenshots.md
<!-- auto-generated by CI — do not edit manually -->

#### editor/codeEditor/CodeEditor/Dark
![screenshot](https://hediet-screenshots.azurewebsites.net/images/cb32a3e854b5734fe5aaca2318f2e0a42ee821b05ea97883ea42c5ba95edb3c3)

#### editor/codeEditor/CodeEditor/Light
![screenshot](https://hediet-screenshots.azurewebsites.net/images/42624fbba5e0db7f32c224b5eb9c5dd3b08245697ae2e7d2a88be0d7c287129b)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a chat performance benchmarking harness to the VS Code repo, including a mock LLM server, scenario definitions, local CLI entrypoints, and a GitHub Actions workflow to run comparisons (and optional leak checks) in CI.

Changes:

  • Introduce chat perf regression runner + memory leak checker scripts under scripts/chat-simulation/.
  • Add mock LLM server + scenario registry (content, tool-call, multi-turn) and supporting fixtures/config.
  • Wire up new npm run perf:chat / perf:chat-leak scripts and add a CI workflow + docs for running the benchmarks.
Show a summary per file
File Description
scripts/chat-simulation/test-chat-perf-regression.js New Playwright-based perf runner collecting timing/rendering/memory metrics and comparing vs a baseline build/JSON.
scripts/chat-simulation/test-chat-mem-leaks.js New state-based leak checker that cycles scenarios and measures post-GC heap + DOM node deltas.
scripts/chat-simulation/common/utils.js Shared utilities for build resolution, VS Code launch, stats, and config handling.
scripts/chat-simulation/common/mock-llm-server.js Local SSE mock server implementing Copilot/CAPI-like endpoints and scenario-based responses.
scripts/chat-simulation/common/perf-scenarios.js Defines and registers benchmark scenarios (content, tool-call, multi-turn) used by the runners.
scripts/chat-simulation/config.jsonc Default configuration for perf regression and leak thresholds.
scripts/chat-simulation/fixtures/*.ts Fixture TS files used as deterministic tool-call read/edit targets for scenarios.
package.json Adds perf:chat and perf:chat-leak npm scripts.
.github/workflows/chat-perf.yml New CI workflow to run perf comparisons (matrix-split scenarios) and optional leak checks.
.github/skills/chat-perf/SKILL.md Documentation for running the new perf + leak tools and interpreting results.
.gitignore Ignores .chat-simulation-data output directory.
build/filters.ts Excludes *.jsonc from the copyright filter (to allow the new config).

Copilot's findings

  • Files reviewed: 18/19 changed files
  • Comments generated: 5


// -- Save baseline ---------------------------------------------------
if (opts.saveBaseline) {
if (!opts.baseline) { console.error('[chat-simulation] --save-baseline requires --baseline <path>'); process.exit(1); }
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--save-baseline can overwrite the baseline JSON generated earlier in the run: opts.baseline is set to the downloaded baseline build’s JSON path, so --save-baseline will end up writing the test results into the baseline file. This breaks subsequent comparisons/caching. Consider requiring an explicit --baseline <path> argument for --save-baseline (separate from --baseline-build) and writing the saved baseline to that user-provided path (or a dedicated --save-baseline-as).

Suggested change
if (!opts.baseline) { console.error('[chat-simulation] --save-baseline requires --baseline <path>'); process.exit(1); }
if (!opts.baseline) { console.error('[chat-simulation] --save-baseline requires --baseline <path>'); process.exit(1); }
if (opts.baselineBuild) {
console.error('[chat-simulation] --save-baseline cannot be used with --baseline-build; pass an explicit --baseline <path> to save a new baseline');
process.exit(1);
}

Copilot uses AI. Check for mistakes.
Comment on lines +1318 to +1320
if (!cur || !bas || !bas.median) { continue; }
const change = (cur.median - bas.median) / bas.median;
const pct = `${change > 0 ? '+' : ''}${(change * 100).toFixed(1)}%`;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Baseline comparison skips metrics when the baseline median is 0 because of !bas.median. Several count-based metrics (layouts/style recalcs/long tasks) can legitimately have a median of 0, which would silently exclude them from regression detection. Use an explicit null/undefined check (and handle division-by-zero for percent change) instead of a truthiness check.

Suggested change
if (!cur || !bas || !bas.median) { continue; }
const change = (cur.median - bas.median) / bas.median;
const pct = `${change > 0 ? '+' : ''}${(change * 100).toFixed(1)}%`;
if (!cur || !bas || cur.median === null || cur.median === undefined || bas.median === null || bas.median === undefined) { continue; }
const change = bas.median !== 0
? (cur.median - bas.median) / bas.median
: (cur.median === 0 ? 0 : Number.POSITIVE_INFINITY);
const pct = Number.isFinite(change)
? `${change > 0 ? '+' : ''}${(change * 100).toFixed(1)}%`
: '+∞%';

Copilot uses AI. Check for mistakes.
console.log(` [ext-host] Error collecting metrics: ${err}`);
}
} finally {
extHostInspector.close();
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extension-host inspector is closed in the inner finally without guarding against errors. If close() throws (e.g., WS already closed), the whole run will fail and metrics/diagnostics are lost. Wrap this close() in a try/catch (and/or set extHostInspector = null after closing) to keep cleanup best-effort.

Suggested change
extHostInspector.close();
try {
extHostInspector.close();
} catch (err) {
if (verbose) {
console.log(` [ext-host] Error closing inspector: ${err}`);
}
}

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +61
const opts = {
iterations: CONFIG.iterations ?? 3,
messages: CONFIG.messages ?? 5,
verbose: false,
/** @type {string | undefined} */
build: undefined,
leakThresholdMB: CONFIG.leakThresholdMB ?? 5,
};
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--messages is parsed and documented as controlling “messages to send per iteration”, but it is never used. The leak check currently always cycles through all scenarios once per iteration. Either remove the flag/help text, or implement it (e.g., limit scenarios per iteration or repeat a subset) so CLI behavior matches documentation.

Copilot uses AI. Check for mistakes.
Comment on lines +27 to +38
/** @param {string} text */
function stripJsoncComments(text) { return text.replace(/\/\/.*/g, '').replace(/\/\*[\s\S]*?\*\//g, ''); }

/**
* Load a namespaced section from config.jsonc.
* @param {string} section - Top-level key (e.g. 'perfRegression', 'memLeaks')
* @returns {Record<string, any>}
*/
function loadConfig(section) {
const raw = fs.readFileSync(path.join(__dirname, '..', 'config.jsonc'), 'utf-8');
const config = JSON.parse(stripJsoncComments(raw));
return config[section] ?? {};
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stripJsoncComments() uses regexes to remove // and /* */ comments. This is not a safe JSONC parser (it will also strip comment-like sequences inside strings, e.g. "https://..."), which can make config parsing brittle. Since the repo already depends on jsonc-parser, consider using it to parse config.jsonc instead of regex stripping.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants