Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 13 additions & 0 deletions perf-changelog.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3474,3 +3474,16 @@
- "Use scheduler-recv-interval values 2/60/30/1200/600/1920 for conc 1-4/8/16/32/64/128-256"
- "Set max-running-requests=256, chunked-prefill-size=16384, mem-fraction-static=0.8, cuda-graph-max-bs=CONC, and enable symm-mem"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1544

- config-keys:
- minimaxm2.5-fp8-h100-vllm
- minimaxm2.5-fp8-h200-vllm
- minimaxm2.5-fp4-b200-vllm
- minimaxm2.5-fp4-b300-vllm
- minimaxm2.5-fp4-mi355x-vllm
description:
- "Re-run MiniMax-M2.5 single-node vLLM sweeps (H100/H200 FP8, B200/B300/MI355X FP4) with no recipe change, to capture per-GPU power telemetry (avg_power_w) added in #1558 for the power/energy canvas"
- "Source rows for the canvas predate the 2026-05-27 power-capture merge, so they carry throughput/latency but no measured power; this re-run replaces the modeled power layer with measured power"
- "benchmarks-only: power comes from the benchmark runs, evals add nothing here"
benchmarks-only: true
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1666
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The new changelog entry's pr-link is set to https://github.com/SemiAnalysisAI/InferenceX/pull/XXX — a literal XXX placeholder rather than the actual PR number. The PR description references pull/1666 and every other entry in this file resolves to a real PR number; please replace XXX with 1666 before merge so the canvas re-run rows remain traceable.

Extended reasoning...

What the bug is

perf-changelog.yaml:3487 (the only line added by this PR's pr-link: field) literally reads:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

The XXX is a leftover template placeholder that was never substituted with the real PR number (1666).

How it manifests

Every other entry in perf-changelog.yaml resolves to a numeric PR — e.g. /pull/1544 at line 3476, and /pull/1648, /pull/1663, /pull/1647 in nearby blocks. This entry is the only one whose link does not resolve. As shipped, anyone clicking the link from a canvas row that originated in this sweep would get a 404, and any tooling that joins changelog rows back to their originating PR (for traceability or audit) will see an unparseable PR id.

Code path that triggers it

This is a pure data/config bug — the row is appended verbatim to perf-changelog.yaml, which is the authoritative changelog for sweep triggers. The placeholder is in the field that downstream tooling (and humans) use to map a sweep back to the PR that armed it. Because the sweep itself is armed by config-keys/description, the bad pr-link will not block execution, so it will silently land on main.

Why existing code doesn't prevent it

There is no schema validator on pr-link requiring a numeric PR id, and utils/process_changelog.py (mentioned in the PR description as the local validator) keys on config-keys, not the link. The author validated processing but not link well-formedness, so the placeholder slipped through.

Impact

Traceability is broken for the five canvas re-run rows generated by this entry. A follow-up cleanup PR will be required to replace XXX with 1666 (or any future correct number) — exactly the kind of trivial follow-up that wastes a review cycle when it could be caught here.

Fix

Change line 3487 from:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

to:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1666

Step-by-step proof

  1. Read perf-changelog.yaml at lines 3474–3487 on the PR's HEAD (commit c772387).
  2. Line 3476 (prior entry) ends in /pull/1544 — a valid PR id.
  3. Lines 3478–3487 are the new entry added by this PR.
  4. Line 3487 ends in /pull/XXX — a literal three-character placeholder, not a number.
  5. The PR description explicitly states the canvas should point to pull/1666 ("the canvas re-points to the new dump…"), and this PR is itself #1666, confirming the intended value is 1666.
  6. Conclusion: the placeholder was never substituted before commit, and will be merged as-is unless fixed.

42 changes: 42 additions & 0 deletions utils/matrix_logic/test_validation.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""Comprehensive tests for validation.py"""
import pytest
from validation import (
ChangelogEntry,
Fields,
SingleNodeMatrixEntry,
SingleNodeAgenticMatrixEntry,
Expand Down Expand Up @@ -932,3 +933,44 @@ def test_validation_runs_by_default(self, tmp_path):
with pytest.raises(ValueError) as exc_info:
load_runner_file(str(runner_file))
assert "must be a list" in str(exc_info.value)


class TestChangelogEntry:
"""Tests for ChangelogEntry, incl. the benchmarks-only / evals-only options."""

def _base(self, **extra):
entry = {
"config-keys": ["minimaxm2.5-fp4-b200-vllm"],
"description": ["re-run for power capture"],
"pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/1666",
}
entry.update(extra)
return entry

def test_defaults(self):
"""Both opt-out flags default to False."""
entry = ChangelogEntry.model_validate(self._base())
assert entry.evals_only is False
assert entry.benchmarks_only is False

def test_benchmarks_only_alias(self):
"""benchmarks-only YAML key maps to benchmarks_only."""
entry = ChangelogEntry.model_validate(self._base(**{"benchmarks-only": True}))
assert entry.benchmarks_only is True

def test_evals_only_alias(self):
entry = ChangelogEntry.model_validate(self._base(**{"evals-only": True}))
assert entry.evals_only is True

def test_evals_and_benchmarks_only_mutually_exclusive(self):
"""Setting both opt-out flags is rejected."""
with pytest.raises(ValueError) as exc_info:
ChangelogEntry.model_validate(
self._base(**{"evals-only": True, "benchmarks-only": True})
)
assert "mutually exclusive" in str(exc_info.value)

def test_unknown_field_forbidden(self):
"""extra='forbid' rejects typos like a singular 'benchmark-only'."""
with pytest.raises(ValueError):
ChangelogEntry.model_validate(self._base(**{"benchmark-only": True}))
12 changes: 12 additions & 0 deletions utils/matrix_logic/validation.py
Original file line number Diff line number Diff line change
Expand Up @@ -486,11 +486,23 @@ class ChangelogEntry(BaseModel):
description: list[str] = Field(min_length=1)
pr_link: str = Field(alias="pr-link")
evals_only: bool = Field(alias="evals-only", default=False)
benchmarks_only: bool = Field(
alias="benchmarks-only", default=False,
description="Skip the eval pass; generate benchmarks only (e.g. power-only re-runs)."
)
scenario_type: Optional[List[str]] = Field(
alias="scenario-type", default=None,
description="Restrict to specific scenario types (e.g., ['fixed-seq-len', 'agentic-coding'])"
)

@model_validator(mode='after')
def check_evals_benchmarks_exclusive(self) -> "ChangelogEntry":
if self.evals_only and self.benchmarks_only:
raise ValueError(
"'evals-only' and 'benchmarks-only' are mutually exclusive; set at most one."
)
return self


class ChangelogMetadata(BaseModel):
"""Pydantic model for validating changelog metadata structure."""
Expand Down
7 changes: 5 additions & 2 deletions utils/process_changelog.py
Original file line number Diff line number Diff line change
Expand Up @@ -175,8 +175,11 @@ def main():
raise
all_benchmark_results.extend(json.loads(result.stdout))

# Generate eval entries separately
eval_configs = [c for c in all_configs if c not in eval_configs_seen]
# Generate eval entries separately (skipped when the entry opts out via
# benchmarks-only, e.g. power-only re-runs that don't need eval scoring).
eval_configs = [] if entry.benchmarks_only else [
c for c in all_configs if c not in eval_configs_seen
]
if eval_configs:
eval_configs_seen.update(eval_configs)
base_cmd = [
Expand Down