ENH: Machine-readable validate output with store/reload #1822

yarikoptic wants to merge 16 commits into master
Conversation
Design plan for enhancing `dandi validate` with:

- Structured output formats (-f json/json_pp/json_lines/yaml)
- Auto-save _validation.jsonl sidecar alongside .log files
- --load to reload/re-render stored results with different groupings
- Upload validation persistence for later inspection
- Extended grouping options (severity, id, validator, standard, dandiset)
- Refactoring into dandi/validate/ subpackage (git mv separately)
- _record_version field on ValidationResult for forward compatibility
- VisiData integration via native JSONL support

Addresses #1515, #1753, #1748; enhances #1743.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Codecov Report

```
@@ Coverage Diff @@
##           master    #1822      +/-   ##
==========================================
+ Coverage   75.12%   76.46%   +1.33%
==========================================
  Files          84       87       +3
  Lines       11930    12503     +573
==========================================
+ Hits         8963     9561     +598
+ Misses       2967     2942      -25
==========================================
```
…e/ subpackage

Pure file move with no content changes, plus __init__.py re-exports for backward compatibility. Imports will be updated in the next commit.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Update imports across 13 files to use the new subpackage structure:

- dandi.validate_types → dandi.validate.types
- dandi.validate → dandi.validate.core (for explicit imports)
- Relative imports adjusted accordingly

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
- test_validate.py → dandi/validate/tests/test_core.py
- test_validate_types.py → dandi/validate/tests/test_types.py
- Update relative imports in moved test files
- Fix circular import: don't eagerly import core in __init__.py

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
…ate CLI

Decompose the monolithic validate() click command into helpers:

- _collect_results(): runs validation and collects results
- _filter_results(): applies min-severity and ignore filters
- _process_issues(): simplified, no longer handles ignore (moved to _filter)

No behavior changes; all existing tests pass unchanged.

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Add record_version: str = "1" for forward-compatible serialization. Uses no underscore prefix since Pydantic v2 excludes underscore-prefixed fields from serialization. Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
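The forward-compatibility idea behind this field can be sketched with a plain dataclass stand-in (the real `ValidationResult` is a Pydantic model; `ValidationRecord` and `load_record` here are hypothetical names, not dandi API):

```python
from dataclasses import dataclass


@dataclass
class ValidationRecord:
    # No underscore prefix: Pydantic v2 (used by the real ValidationResult)
    # excludes underscore-prefixed attributes from serialization, hence the
    # field is "record_version" on the model itself.
    record_version: str = "1"
    id: str = ""
    severity: str = ""


def load_record(data: dict) -> ValidationRecord:
    # Forward compatibility: tolerate records written by a newer dandi by
    # ignoring unknown fields instead of failing outright.
    known = {
        k: v for k, v in data.items()
        if k in ValidationRecord.__dataclass_fields__
    }
    return ValidationRecord(**known)
```

A loader in this style keeps old readers working when newer writers add fields; bumping `record_version` is reserved for changes that cannot be handled by simply ignoring extras.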
Add -f/--format {human,json,json_pp,json_lines,yaml} to produce
structured output using existing formatter infrastructure. Structured
formats suppress colored text and 'No errors found' message. Exit
code still reflects validation results.
Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
- Create dandi/validate/io.py with write/append/load JSONL utilities and validation_sidecar_path() helper
- Add -o/--output option to write structured output to file
- Auto-save _validation.jsonl sidecar next to logfile when using structured format without --output

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Add --summary/--no-summary flag that shows statistics after validation: total issues, breakdown by severity, validator, and standard. For human output, printed to stdout; for structured formats, printed to stderr. Also refactors _process_issues into _render_human (no exit) + _exit_if_errors, keeping _process_issues as backward-compatible wrapper. Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
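The per-severity breakdown can be sketched with `collections.Counter` (`summarize` is a hypothetical helper operating on dicts for illustration; the real `_print_summary` works on `ValidationResult` objects and also breaks down by validator and standard):

```python
from collections import Counter


def summarize(results: list[dict]) -> str:
    # Breakdown by severity; the same pattern extends to validator/standard.
    by_severity = Counter(r["severity"] for r in results)
    lines = [f"Total issues: {len(results)}"]
    lines += [f"  {sev}: {n}" for sev, n in by_severity.most_common()]
    return "\n".join(lines)
```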
Add --load to reload previously-saved JSONL validation results and re-render them with different formats/filters/grouping. Mutually exclusive with positional paths. Exit code reflects loaded results. Skip auto-save sidecar when loading. Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
- Add validation_log_path parameter to upload()
- In upload validation loop, append results to sidecar via append_validation_jsonl() when validation_log_path is set
- CLI cmd_upload derives sidecar path from logfile and passes it

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
Fix mypy errors by using IO[str] instead of object for file-like output parameters in _print_summary, _get_formatter, and _render_structured. Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
When --output is given without explicit --format, infer the format from the file extension: .json → json_pp, .jsonl → json_lines, .yaml/.yml → yaml. Error only if extension is unrecognized. Update design doc to reflect this behavior. Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
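The extension mapping above can be expressed as a small lookup. The mapping itself is exactly what the commit describes; `infer_format` is a hypothetical name for the sketch:

```python
from pathlib import Path

# .json -> pretty-printed, .jsonl -> line-delimited, .yaml/.yml -> YAML
_EXT_TO_FORMAT = {
    ".json": "json_pp",
    ".jsonl": "json_lines",
    ".yaml": "yaml",
    ".yml": "yaml",
}


def infer_format(output_path: str) -> str:
    suffix = Path(output_path).suffix.lower()
    try:
        return _EXT_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"cannot infer output format from extension {suffix!r}")
```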
Add severity, id, validator, standard, and dandiset as --grouping options. Uses section headers with counts (e.g. "=== ERROR (5 issues) ===") for human output. Structured output is unaffected (always flat). Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
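Grouping with counted section headers might look roughly like this sketch (dict-based records for illustration; the real code groups `ValidationResult` objects, and `group_results`/`render_grouped` are made-up names):

```python
from collections import defaultdict


def group_results(results: list[dict], key: str) -> dict[str, list[dict]]:
    grouped: dict[str, list[dict]] = defaultdict(list)
    for r in results:
        grouped[str(r.get(key, ""))].append(r)
    return dict(grouped)


def render_grouped(results: list[dict], key: str) -> str:
    # Section headers in the "=== ERROR (5 issues) ===" style described above.
    out = []
    for name, items in group_results(results, key).items():
        out.append(f"=== {name} ({len(items)} issues) ===")
        out.extend(f"  {r.get('id', '?')}" for r in items)
    return "\n".join(out)
```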
Limit how many results are shown per leaf group (or in the flat list
when no grouping is used). Excess results are replaced by a
TruncationNotice placeholder — a distinct dataclass (not a
ValidationResult) so consumers can isinstance() check.
- TruncationNotice dataclass + LeafItem/TruncatedResults type aliases
- _truncate_leaves() walks the grouped tree, caps leaf lists
- Human output: "... and N more issues" in cyan
- Structured output: {"_truncated": true, "omitted_count": N} sentinel
- Headers show original counts including omitted items
- Works without grouping (flat list) and with multi-level grouping
Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
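The leaf-capping described above can be sketched as a recursive walk (`TruncationNotice` mirrors the dataclass named in the commit; `truncate_leaves` is a stand-in for the real `_truncate_leaves`, whose signature may differ):

```python
from dataclasses import dataclass


@dataclass
class TruncationNotice:
    """Placeholder indicating omitted results in truncated output.

    omitted_count: number of results dropped from this leaf group.
    """

    omitted_count: int


def truncate_leaves(tree, limit: int):
    # Interior nodes are dicts of subgroups; leaves are lists of results.
    if isinstance(tree, dict):
        return {k: truncate_leaves(v, limit) for k, v in tree.items()}
    if len(tree) > limit:
        return list(tree[:limit]) + [TruncationNotice(omitted_count=len(tree) - limit)]
    return list(tree)
```

Because the notice is a distinct type, consumers can `isinstance()`-check for it instead of mistaking it for a real result; it also works on a bare flat list, covering the no-grouping case.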
Force-pushed from 4894aa1 to 2ddd5d4.
The _auto_save_sidecar() call was only in the structured-to-stdout branch, so the default human format (the most common usage) never wrote the _validation.jsonl sidecar next to the log file. Move the sidecar write and _exit_if_errors() into a shared path that runs after all rendering branches. The sidecar is now written whenever there are results, unless --output or --load is active.

Also update the validate docstring/help text to document the sidecar behavior, and update the design spec (Phase 1b, Phase 3, testing strategy) to reflect the --validation-log CLI option for upload and proper CLI integration testing via CliRunner through main().

Co-Authored-By: Claude Code 2.1.81 / Claude Opus 4.6 <noreply@anthropic.com>
```python
# First produce a JSONL to load
outfile = tmp_path / "input.jsonl"
r = CliRunner().invoke(
```
Check warning: Code scanning / CodeQL: "Variable defined multiple times" (test)

Copilot Autofix (AI, 7 days ago):
In general, to fix multiple redundant assignments, either remove the earlier assignment(s) or, if they were intended to be used, add the missing usage. Here, the first CliRunner().invoke(...) call is required to create the JSONL file, but its return value isn't used, so the variable r should not be assigned at that point.

Concretely, in dandi/cli/tests/test_cmd_validate.py, inside test_validate_auto_sidecar_skipped_with_load, replace the line that assigns to r at line 830:

```python
r = CliRunner().invoke(
    main, ["validate", "-f", "json_lines", "-o", str(outfile), str(simple2_nwb)]
)
```

with a direct call that doesn't assign the result:

```python
CliRunner().invoke(
    main, ["validate", "-f", "json_lines", "-o", str(outfile), str(simple2_nwb)]
)
```

The later assignment to r at line 840 must remain unchanged because its exit_code is used. No imports, helper methods, or additional definitions are needed.
```diff
@@ -827,7 +827,7 @@
     # First produce a JSONL to load
     outfile = tmp_path / "input.jsonl"
-    r = CliRunner().invoke(
+    CliRunner().invoke(
         main, ["validate", "-f", "json_lines", "-o", str(outfile), str(simple2_nwb)]
     )
     assert outfile.exists()
```
```diff
     """
     # Avoid heavy imports by importing with function:
-    from ..upload import upload
+    from ..upload import upload as upload_
```
This is better of course than clobbering the namespace, but in the long term a better name that distinguishes API vs CLI functions (with the CLI one usually marked private, since it is odd to expose it as library API) is still preferred.

That would be a breaking change; not saying do it here, just thinking out loud.
yeah -- typically we just clobbered click's interfaces (sorry didn't interleave... but at least manually pruned some unrelated):

```
❯ grep -l 'from \.\.\(.*\) import \1' dandi/cli/cmd_* | xargs grep '^def '
dandi/cli/cmd_delete.py:def delete(paths, skip_missing, dandi_instance, force, devel_debug=False):
dandi/cli/cmd_download.py:def download(
dandi/cli/cmd_move.py:def move(
dandi/cli/cmd_organize.py:def organize(
dandi/cli/cmd_upload.py:def upload(
dandi/cli/cmd_validate.py:def validate_bids(
dandi/cli/cmd_validate.py:def validate(
❯ grep 'from \.\.\(.*\) import \1' dandi/cli/cmd_*
dandi/cli/cmd_delete.py:    from ..delete import delete
dandi/cli/cmd_download.py:    from .. import download
dandi/cli/cmd_move.py:    from .. import move as move_mod
dandi/cli/cmd_organize.py:    from ..organize import organize
dandi/cli/cmd_upload.py:    from ..upload import upload
dandi/cli/cmd_validate.py:from ..validate import validate as validate_
```

but frankly and unfortunately here it doesn't matter much since those click functions are not usable as python interfaces! I wish it was otherwise. So, no point of giving them any special names really.
> but frankly and unfortunately here it doesn't matter much since those click functions are not usable as python interfaces!

How do you mean? Any other library can import them and manipulate the click groups; sometimes that might even be intentional, but I don't think so here:
```python
from dandi.cli.command import main
from dandi.cli.cmd_upload import upload  # Implying it is not private and intended to be imported and customized
import click


@main.command("upload2")  # Or even re-register under the same name if attempting some nasty injection
@click.pass_context
def wrapped_original(ctx):
    click.echo("Before original")  # Inject custom code here
    ctx.invoke(upload)
    click.echo("After original")  # Inject more custom code here
```

```python
jobs, jobs_per_file = jobs_pair

upload(
    sidecar = None
```
IDK about referring to this as a 'sidecar', since that could get confusing with BIDS language. What is meant is, specifically, something like a 'persistent file recording the validation results of this Dandiset'?

Even referring to it as a 'log' could get confusing with our own log file terminology (as in, the files that contain runtime errors rather than validations).
indeed there is a clash with "sidecar" in BIDS, but overall it is the same meaning, just rarely used outside of BIDS... here we could just call it validation_log_path for the variable.

as for the helper, ATM no critically better name than the current validation_sidecar_path comes to mind.

note that the validation log is not necessarily of a dandiset -- it could be of nwb files AFAIK, or potentially multiple dandisets... in this case what matters is that it is associated with the same run for which there is a log.
Nice to see a first step to a nicely scoped namespace!
;-) IMHO it makes sense to KISS while it is small, but as soon as it starts growing, a submodule is a logical way to group. Although indeed, since we do keep tests within those, it would make sense even for smaller modules, since it brings tests closer to the code and makes them easier to execute for localized changes.
```python
DATASET = "dataset"


CURRENT_RECORD_VERSION = "1"
```
You do not wish to use more official semantic versioning? I would recommend it in general FWIW, not that much harder to handle (plus looks more professional)
If already intending to use semantic versions, then rename this to 'MAJOR' or w/e it is intended to be
Frankly I hope that we would even never get to 2 ;) but it is more of a compatibility version. Like if you check the git-annex version, which is currently 10.20251029: the compatibility level is 10 plus a date-based part, and the 10 matches the compatibility version of the internal data structure, which it might upgrade from if you had an earlier one (eventually). Hard to fathom here what 'minor' vs 'patch' changes in such a simple record would be, from which we could derive "handling" specific to such changes. Thus I would keep it KISS as just a single-number version. MAJOR -- is that about army? (as in, how would that be more descriptive? and who said that every version must be semver?)
> MAJOR

I have seen conventions where people split each part of semantic versioning into separate strings, such as MAJOR=1, MINOR=0, and PATCH=2.
The case for semantic versioning (and minor or patch) is the same as for any schematically controlled entity. If you add a new field into the record (minor), or if there was an unintentionally incorrect value for a field of that record that gets fixed (patch), then bumping those versions separately would inform anyone using these records to 'filter out' undesirable ones.

> Frankly I hope that we would even never get to 2 ;)

Same, but that hope is often dashed by experience.

FTR we did something like this with NWB Benchmarks records (e.g., https://github.com/NeurodataWithoutBorders/nwb-benchmarks-results/blob/dev/results/timestamp-2025-12-12-22-52-02_environment-64757cc7724d891d1dfb3fac0e624a82c0b1b20c_machine-39cf5735e421691178577ff24e6b770032d52c04_results.json#L2) and it was extremely useful when analyzing / generating plots, because if you noticed some specific version was producing certain results, you could backtrace why, determine if it was a false artifact, filter out affected fields by version, etc.
```diff
     yaml_load,
 )
-from .validate_types import (
+from .validate.types import (
```
Though I do believe these import changes could have been their own PR (which would have been much easier and faster to review and merge)
IIRC I did it in a single commit, so moves are tracked etc. Could prep it as a separate PR if needed, or just mark such files "viewed" to hide them away for now.
> or just mark files "viewed" with such to hide away for now

That is exactly what I do. Just letting you know for 'next time' (if there is one).
```python
class TruncationNotice:
    """Placeholder indicating omitted results in truncated output."""

    omitted_count: int
```
Please add a description of this dataclass field to the docstring of the class
```python
    type=click.Choice(["human", "json", "json_pp", "json_lines", "yaml"]),
    default="human",
```
"human" is a bit of an odd value to specify here, especially against the others
Other validators refer to this as a 'summary' or something like that, right?
we could call it text or just default... I will go for text
I saw render functions below, so came to mind rendered but that would be "suggesting more than it is"... I feel text is ok
text is good and accurate 👍
```python
    if not paths:
        paths = (os.curdir,)
    # below we are using load_namespaces but it causes HDMF to whine if there
    # is no cached name spaces in the file. It is benign but not really useful
    # at this point, so we ignore it although ideally there should be a formal
    # way to get relevant warnings (not errors) from PyNWB
    ignore_benign_pynwb_warnings()
```
Again, pointing out that if this PR had been broken into smaller modular PRs, the negative hunks of this changeset may have aligned better with the relevant changes, making it easier to catch any drops or alterations from current behavior.
```python
    filtered = _filter_results(results, min_severity, ignore)

    if output_format == "human":
        _render_human(filtered, grouping, max_per_group=max_per_group)
```
The odd choice of 'human' for the value, as mentioned above, is especially observed here. This function is not creating Soylent Green.
Please choose a better name, such as '_format_summary' or '_generate_summary'
note: there is a dedicated --summary option to add a summary statement at the bottom!
Hmmm so the 'text' mode is not truly a summary/aggregate, just 'more human-readable' styling than JSON?
Still reports all invalidations? (though subject to filtering/grouping rules)?
```python
def _get_formatter(
    output_format: str, out: IO[str] | None = None
) -> JSONFormatter | JSONLinesFormatter | YAMLFormatter:
```
Seems odd for there not to be a 'Human (pending rename) Formatter' here to make clearer what the subformat even means
```python
    grouping: str,
    ignore: str | None = None,


def _render_structured(
```
Well, reviewed for ~3 hours and made it down to here (halfway through source?)
Will come back to this if I get more time away from critical tasks in the next couple weeks
Summary

Design plan for machine-readable `validate` output with store/reload capability. Adds structured output formats, automatic persistence of validation results alongside log files, and the ability to reload and re-render results with different grouping/filtering options.

Key design decisions:

- All structured formats (json, json_pp, json_lines, yaml) emit a uniform flat list of `ValidationResult` records; no envelope/non-envelope split
- JSONL as the primary interchange format: composable with cat/jq/grep/vd (VisiData)
- `_record_version` field on each record for forward-compatible deserialization
- Grouping affects human display only; structured output is always a stable flat schema
- Auto-save `_validation.jsonl` sidecar next to existing `.log` files
- Refactor `dandi/validate.py` + `dandi/validate_types.py` into a `dandi/validate/` subpackage

Closes #1515 (validate: Add -f|--format option to optionally serialize into json, json_pp, json_lines or yaml)
Closes #1753 (Provide easy means for introspecting upload validation failures)
Largely replaces #1743 (Add filtering of issues by type/ID or by file location)
Enhances #1743, #1737 (upload,validate: Add --validators option), and #1748 (Tidy up the `validate` command function in `cmd_validate.py`)

TODO

- `dandi/validate/` subpackage (`git mv` committed separately from import updates)
- `cmd_validate.py`: extract `_collect_results()`, `_filter_results()`, `_render_results()`
- `_record_version` to `ValidationResult`
- `--format` (`-f`) option: `human|json|json_pp|json_lines|yaml`
- `--output` (`-o`) + auto-save `_validation.jsonl` sidecar
- `--summary` flag
- `--load` (multiple paths, mutually exclusive with positional args)
- `dandi upload` validation persistence
- grouping: `severity`, `id`, `validator`, `standard`, `dandiset`
- `--max-per-group` truncation: cap results per leaf group with placeholder notice

`--max-per-group` feature (Step 5)

Limits how many results are shown per leaf group (or in the flat list when no grouping). Excess results are replaced by a `TruncationNotice` placeholder, a distinct data structure (not a `ValidationResult`), so it won't be confused with real results if the output is saved/reloaded.

Examples (against 147k+ validation results from bids-examples)
Flat truncation: `--max-per-group 5` with no grouping.

Grouped truncation: `-g severity --max-per-group 3` (and actually those are colored if output is not redirected).

Multi-level leaf-only truncation: `-g severity -g id --max-per-group 2`.

Structured output: `-g severity -f json_pp --max-per-group 2` emits `_truncated` placeholders:

```json
{
  "ERROR": [
    { "id": "DANDI.NO_DANDISET_FOUND", "severity": "ERROR", ... },
    { "id": "BIDS.NIFTI_HEADER_UNREADABLE", "severity": "ERROR", ... },
    { "_truncated": true, "omitted_count": 9567 }
  ],
  "HINT": [
    { "id": "BIDS.JSON_KEY_RECOMMENDED", "severity": "HINT", ... },
    { "id": "BIDS.JSON_KEY_RECOMMENDED", "severity": "HINT", ... },
    { "_truncated": true, "omitted_count": 138015 }
  ]
}
```

Headers show original counts (e.g. "9569 issues") even when only a few are displayed. The `_truncated` sentinel follows the `_record_version` naming convention for metadata fields.

Test plan
- `--format` output via `click.CliRunner`
- `ValidationResult` JSONL round-trip
- `--load` with multi-file concatenation, mutual exclusivity enforcement
- sidecar auto-save skipped when `--output` is used
- `--max-per-group`: flat truncation, grouped truncation, multi-level, JSON placeholder, no truncation when under limit
- `_truncate_leaves()` helper

Some demos
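As a small illustration of the JSONL composability claimed in the design decisions: because each line is a standalone JSON object, downstream filtering needs only a few lines of Python (or cat/jq/grep/vd). `iter_error_ids` below is purely illustrative, not dandi API:

```python
import json


def iter_error_ids(lines):
    # Stream over JSONL lines, yielding ids of ERROR-severity records.
    for line in lines:
        rec = json.loads(line)
        if rec.get("severity") == "ERROR":
            yield rec.get("id")
```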
See also
- running `dandi validate` across bids-examples and then using VisiData for navigation of a composite dump of records

TODOs

- `validate` without `-o` does not store the `_validation.jsonl`

Generated with Claude Code