test(integration): init results-dir and pass --systemname in real_accumulation tests #559

Open
FileSystemGuy wants to merge 1 commit into main from fix/group-E-integration-real-accumulation-env

@FileSystemGuy

Contributor

Summary

`main._main_impl()`'s orgname-resolution gate now requires every gated mode (closed/open/whatif) to have a `mlperf-results.yaml` sentinel pinned into the results-dir by `mlpstorage init`. The universal argparse plumbing additionally requires `--systemname` (or `MLPSTORAGE_SYSTEMNAME`) on emitting commands.

The three module fixtures in `tests/integration/test_real_accumulation.py` predated both, so every subprocess CLI invocation exited 2 with either "`--systemname/-sn` is required" or, for the vectordb/kvcache fixtures, "[E101] results-dir has not been initialized". All 12 tests in the file failed at fixture setup.
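
To satisfy both requirements, a fixture has to initialize the results-dir first and then pass a system name on every benchmark call. The following is a minimal sketch of the argv construction only, not the PR's actual fixture code. The helper names `init_argv` and `with_systemname` and the two constant values are hypothetical, and appending `--systemname` at the end of the argv is an assumption about where the parser accepts it.

```python
from pathlib import Path

# Hypothetical values for illustration; the PR text does not show the real ones.
TEST_ORGNAME = "testorg"
TEST_SYSTEMNAME = "test-sys"


def init_argv(orgname: str, results_dir: Path) -> list[str]:
    """argv for `mlpstorage init <orgname> <results-dir>` (shape taken from the PR text)."""
    return ["mlpstorage", "init", orgname, str(results_dir)]


def with_systemname(benchmark_argv: list[str], systemname: str) -> list[str]:
    """Return a copy of a benchmark argv with `--systemname <name>` appended.

    Assumption: the flag is accepted after the subcommand's other arguments.
    """
    return [*benchmark_argv, "--systemname", systemname]
```

A fixture would run the `init_argv(...)` command once (for example with `subprocess.run(..., check=True)`) before any benchmark invocation, then wrap each benchmark argv with `with_systemname(...)`.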

Changes

  • Define `TEST_ORGNAME` and `TEST_SYSTEMNAME` module-level constants and add a `_canonical_prefix(results_dir, mode)` helper that builds `<results_dir>/<mode>/<orgname>/results/<systemname>/`, so a future layout change needs only one edit.
  • In each of the three module fixtures: run `mlpstorage init <orgname> <results-dir>` before the benchmark invocations, and add `--systemname <test-sys>` to every benchmark argv.
  • Update every path-shape assertion to walk the canonical prefix instead of the (now incorrect) `<results_dir>/<type>/...` shape.
  • For vector_database, also add the `<index_type>` segment (DISKANN, the default) between engine and command. Current production splits per index_type (Rules.md §2.1.27), so the on-disk shape is `vector_database/<engine>/<index_type>/<command>/<datetime>/`.
  • Production now combines engine + index_type into the metadata's `model` slot (e.g. `milvus_DISKANN`) so the per-index_type workload grouping matches the per-index_type path split. Update the metadata-schema and discovery assertions to expect the combined token.
  • The heterogeneous test now symlinks the per-type subdir under each fixture's canonical prefix into the combined dir, so discovery walks exactly the trees the three fixtures produced.

Test plan

  • 6/6 vectordb + kvcache tests pass locally:
    `uv run python -m pytest tests/integration/test_real_accumulation.py -k 'vectordb or kvcache' -m '' -o addopts=` → 6 passed
  • The 5 training tests + heterogeneous test (which depends on the training fixture) need ~520GB free disk for unet3d's CAP-01 capacity gate; my dev box has ~133GB. The canonical-prefix change is the same mechanical pattern verified end-to-end on vectordb/kvcache, but a CI machine with sufficient disk should run them.
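
One way to check whether a machine can run the disk-heavy tests is a free-space probe, sketched below. This is not part of the PR: `has_free_space` and `REQUIRED_FREE_BYTES` are hypothetical names, and treating "~520GB" as 520 × 10^9 bytes is an assumption about the unit the CAP-01 gate uses.

```python
import shutil
from pathlib import Path

# Assumption: "~520GB" read as decimal gigabytes; the real gate may differ.
REQUIRED_FREE_BYTES = 520 * 10**9


def has_free_space(path: Path, required_bytes: int = REQUIRED_FREE_BYTES) -> bool:
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes
```

A test module could feed the result into `pytest.mark.skipif(...)` so that the training tests report as skipped on small machines and run on a CI machine with enough disk.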

…umulation tests

main._main_impl()'s orgname-resolution gate now requires every gated
mode (closed/open/whatif) to have a mlperf-results.yaml sentinel pinned
into the results-dir by `mlpstorage init`. The universal argparse
plumbing additionally requires --systemname (or MLPSTORAGE_SYSTEMNAME)
on emitting commands.

The three module fixtures in test_real_accumulation.py predated both,
so every subprocess CLI invocation exited 2 with either

  error: the following arguments are required: --systemname/-sn

or, for the vectordb/kvcache fixtures,

  [E101] results-dir `...` has not been initialized.

All 12 tests in the file failed at fixture setup as a result.

Changes:

* Define TEST_ORGNAME and TEST_SYSTEMNAME module-level constants so the
  per-test path math has a single source of truth, and add a
  `_canonical_prefix(results_dir, mode)` helper that builds
  <results_dir>/<mode>/<orgname>/results/<systemname>/ for assertions
  (one edit if the layout ever changes again).
* In each of the three module fixtures: run `mlpstorage init <orgname>
  <results-dir>` BEFORE the benchmark invocations, and add
  `--systemname <test-sys>` to every benchmark CLI argv.
* Update every path-shape assertion to walk the canonical prefix
  instead of the (now incorrect) `<results_dir>/<type>/...` shape.
* For vector_database, also add the `<index_type>` (DISKANN — the
  default) segment between engine and command: the current production
  splits per index_type (Rules.md §2.1.27) so the on-disk shape is
  `vector_database/<engine>/<index_type>/<command>/<datetime>/`.
* Production now combines engine + index_type into the metadata's
  `model` slot (e.g. `milvus_DISKANN`) so the per-index_type workload
  grouping matches the per-index_type path split. Update the metadata-
  schema and discovery assertions to expect the combined token.
* The heterogeneous test now symlinks the per-fixture canonical
  prefix's per-type subdir into the combined dir, so discovery walks
  exactly the trees the three fixtures produced.

Verified end-to-end: 6/6 vectordb + kvcache tests pass locally
(`uv run python -m pytest tests/integration/test_real_accumulation.py
-k 'vectordb or kvcache' -m '' -o addopts=`). The 5 training tests and
the heterogeneous test can't run on a dev box with under ~520GB free
(unet3d's CAP-01 disk gate requirement); their canonical-prefix change
is the same mechanical pattern as vectordb/kvcache.
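
The symlinking used by the heterogeneous test, as described in the change list above, can be sketched roughly as follows. The function name `link_type_subdirs` and the mapping argument are hypothetical, and the sketch assumes each fixture's canonical prefix contains one subdirectory named after its benchmark type.

```python
from pathlib import Path


def link_type_subdirs(fixture_prefixes: dict[str, Path], combined_prefix: Path) -> None:
    """Symlink <fixture_prefix>/<type> to <combined_prefix>/<type> for each fixture.

    `fixture_prefixes` maps a per-type subdir name to the canonical prefix of the
    fixture that produced it (both hypothetical inputs for this sketch).
    """
    combined_prefix.mkdir(parents=True, exist_ok=True)
    for type_name, prefix in fixture_prefixes.items():
        source = prefix / type_name
        target = combined_prefix / type_name
        # Link rather than copy, so discovery walks the trees the fixtures wrote.
        target.symlink_to(source, target_is_directory=True)
```

Linking avoids copying benchmark output and leaves each fixture's own tree untouched.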
@FileSystemGuy requested a review from a team June 26, 2026 23:21
@github-actions

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅
