test(integration): init results-dir and pass --systemname in real_accumulation tests by FileSystemGuy · Pull Request #559 · mlcommons/storage

FileSystemGuy · 2026-06-26T23:21:50Z

Summary

`main._main_impl()`'s orgname-resolution gate now requires every gated mode (closed/open/whatif) to have a `mlperf-results.yaml` sentinel pinned into the results-dir by `mlpstorage init`. The universal argparse plumbing additionally requires `--systemname` (or `MLPSTORAGE_SYSTEMNAME`) on emitting commands.

The three module fixtures in `tests/integration/test_real_accumulation.py` predated both requirements, so every subprocess CLI invocation exited with status 2, reporting either "`--systemname/-sn` is required" or, for the vectordb/kvcache fixtures, "[E101] results-dir has not been initialized". All 12 tests in the file failed at fixture setup.
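The exit status 2 described above is what `argparse` produces when a required option is missing. The following is a minimal sketch of that behavior, not the project's actual parser: the `build_parser` and `exit_code_for` names, the `demo` program name, and the way the environment variable feeds the default are all assumptions made for illustration.

```python
import argparse
import os


def build_parser(env=None):
    """Hypothetical parser: --systemname is required unless the env var is set."""
    env = os.environ if env is None else env
    default = env.get("MLPSTORAGE_SYSTEMNAME")
    parser = argparse.ArgumentParser(prog="demo")
    # Assumption: the environment variable acts as a fallback for the flag.
    parser.add_argument(
        "--systemname", "-sn",
        default=default,
        required=default is None,
    )
    return parser


def exit_code_for(argv, env=None):
    """Return the status argparse would exit with for argv (0 if parsing succeeds)."""
    try:
        build_parser(env).parse_args(argv)
    except SystemExit as exc:
        # argparse reports usage errors on stderr and exits with status 2.
        return exc.code
    return 0
```

With an empty environment and no flag, `exit_code_for([], env={})` returns 2; supplying either the flag or the environment variable makes parsing succeed.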

Changes

Define `TEST_ORGNAME` and `TEST_SYSTEMNAME` module-level constants and add a `_canonical_prefix(results_dir, mode)` helper that builds `<results_dir>/<mode>/<orgname>/results/<systemname>/`, so only one edit is needed if the layout changes again.
In each of the three module fixtures: run `mlpstorage init <orgname> <results-dir>` before the benchmark invocations, and add `--systemname <test-sys>` to every benchmark argv.
Update every path-shape assertion to walk the canonical prefix instead of the (now incorrect) `<results_dir>/<type>/...` shape.
For vector_database, also add the `<index_type>` segment (DISKANN, the default) between engine and command. Current production splits per index_type (Rules.md §2.1.27), so the on-disk shape is `vector_database/<engine>/<index_type>/<command>/<datetime>/`.
Production now combines engine + index_type into the metadata's `model` slot (e.g. `milvus_DISKANN`) so the per-index_type workload grouping matches the per-index_type path split. Update the metadata-schema and discovery assertions to expect the combined token.
The heterogeneous test now symlinks each per-fixture canonical-prefix's per-type subdir into the combined dir, so discovery walks exactly the trees the three fixtures produced.

Test plan

6/6 vectordb + kvcache tests pass locally:
`uv run python -m pytest tests/integration/test_real_accumulation.py -k 'vectordb or kvcache' -m '' -o addopts=` → 6 passed
The 5 training tests + heterogeneous test (which depends on the training fixture) need ~520GB free disk for unet3d's CAP-01 capacity gate; my dev box has ~133GB. The canonical-prefix change is the same mechanical pattern verified end-to-end on vectordb/kvcache, but a CI machine with sufficient disk should run them.
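Whether a given machine can run the training tests comes down to free disk space. The check below is a small sketch of how that could be probed before launching them; it is not part of the PR, the `free_gb` and `has_capacity` names are hypothetical, and it assumes "GB" means 10^9 bytes.

```python
import shutil

GB = 10 ** 9  # Assumption: decimal gigabytes; the description does not state the unit.


def free_gb(path="."):
    """Return the free space, in GB, on the filesystem that holds path."""
    return shutil.disk_usage(path).free / GB


def has_capacity(path, required_gb):
    """Return True if the filesystem holding path has at least required_gb free."""
    return free_gb(path) >= required_gb
```

For example, `has_capacity(results_dir, 520)` would be false on the dev box described above and true on a CI machine with enough disk.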

…umulation tests

main._main_impl()'s orgname-resolution gate now requires every gated mode (closed/open/whatif) to have a mlperf-results.yaml sentinel pinned into the results-dir by `mlpstorage init`. The universal argparse plumbing additionally requires --systemname (or MLPSTORAGE_SYSTEMNAME) on emitting commands.

The three module fixtures in test_real_accumulation.py predated both, so every subprocess CLI invocation exited 2 with either error: the following arguments are required: --systemname/-sn or, for the vectordb/kvcache fixtures, [E101] results-dir `...` has not been initialized. All 12 tests in the file failed at fixture setup as a result.

Changes:

* Define TEST_ORGNAME and TEST_SYSTEMNAME module-level constants so the per-test path math has a single source of truth, and add a `_canonical_prefix(results_dir, mode)` helper that builds <results_dir>/<mode>/<orgname>/results/<systemname>/ for assertions (one edit if the layout ever changes again).
* In each of the three module fixtures: run `mlpstorage init <orgname> <results-dir>` BEFORE the benchmark invocations, and add `--systemname <test-sys>` to every benchmark CLI argv.
* Update every path-shape assertion to walk the canonical prefix instead of the (now incorrect) `<results_dir>/<type>/...` shape.
* For vector_database, also add the `<index_type>` (DISKANN, the default) segment between engine and command: the current production splits per index_type (Rules.md §2.1.27) so the on-disk shape is `vector_database/<engine>/<index_type>/<command>/<datetime>/`.
* Production now combines engine + index_type into the metadata's `model` slot (e.g. `milvus_DISKANN`) so the per-index_type workload grouping matches the per-index_type path split. Update the metadata-schema and discovery assertions to expect the combined token.
* The heterogeneous test now symlinks the per-fixture canonical prefix's per-type subdir into the combined dir, so discovery walks exactly the trees the three fixtures produced.

Verified end-to-end: 6/6 vectordb + kvcache tests pass locally (`uv run python -m pytest tests/integration/test_real_accumulation.py -k 'vectordb or kvcache' -m '' -o addopts=`). The 5 training tests and the heterogeneous test can't run on a dev box with under ~520GB free (unet3d's CAP-01 disk gate requirement); their canonical-prefix change is the same mechanical pattern as vectordb/kvcache.

github-actions · 2026-06-26T23:21:57Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

FileSystemGuy requested a review from a team June 26, 2026 23:21

FileSystemGuy wants to merge 1 commit into main from fix/group-E-integration-real-accumulation-env
