Skip to content

Add OpenPMD support#1050

Open
pgrete wants to merge 200 commits intodevelopfrom
pgrete/pmd-output
Open

Add OpenPMD support#1050
pgrete wants to merge 200 commits intodevelopfrom
pgrete/pmd-output

Conversation

@pgrete
Copy link
Copy Markdown
Collaborator

@pgrete pgrete commented Apr 12, 2024

PR Summary

Leftover todos (potentially for future PRs):

  • cleaner integration in build system
  • add logic for selection of var (rather than all)
  • double check additional logic for restart outputs (like UserWorkBeforeRestart)
  • Add doc
  • Add callback to provide/forward standard compatible downstream code info
  • Add pattern file for Visit/Paraview
  • Add support for slices

For upcoming PRs:

  • Adjust python script to read opmd output (or is this not needed because of the existing openpmd-viewer)?
  • Add support for units
  • Add support for Metadata::None
  • Add support for Metadata::Fine
  • Add support for Metadata::CoordinatesVec

PR Checklist

  • Code passes cpplint
  • New features are documented.
  • Adds a test for any bugs fixed. Adds tests for new features.
  • Code is formatted
  • Changes are summarized in CHANGELOG.md
  • Change is breaking (API, behavior, ...)
    • Change is additionally added to CHANGELOG.md in the breaking section
    • PR is marked as breaking
    • Short summary API changes at the top of the PR (plus optionally with an automated update/fix script)
  • CI has been triggered on Darwin for performance regression tests.
  • Docs build
  • (@lanl.gov employees) Update copyright on changed files

Comment thread src/outputs/openpmd.cpp Outdated
Comment thread src/outputs/restart_opmd.cpp Outdated
@BenWibking
Copy link
Copy Markdown
Collaborator

These are Codex's review comments:

• The patch introduces several correctness and build issues in the new OpenPMD path, including disabled-dependency test
  builds failing and restart files that become inconsistent for coarsening, ghost zones, or common topological field
  layouts. These are blocking for affected configurations and should be fixed before considering the change correct.

  Full review comments:

  - [P2] Guard OpenPMD-only test includes — /Users/benwibking/parthenon_codes/parthenon/tst/unit/test_unit_params.cpp:29-33
    When configuring with -DPARTHENON_DISABLE_OPENPMD=ON while keeping unit tests enabled, these unconditional includes
    still require the external OpenPMD headers even though the dependency is not fetched or added to include paths. This
    makes the optional-disable build fail at compile time before the #ifdef PARTHENON_ENABLE_OPENPMD type list can exclude
    the OpenPMD test case; guard these includes with the same macro.
  - [P2] Reject coarsening for restart OpenPMD outputs — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
    parthenon_opmd.cpp:406-408
    With the default output_type set to restart, an input block that only adds coarsening_factor > 1 writes downsampled
    field chunks but still records full-size restart metadata; RestartReaderOPMD::ReadBlocks hard-codes coarsening_factor =
    1, so restarting from such a file reads full extents from a smaller dataset and fails. Please either disallow
    coarsening for restart outputs or require output_type=data/slice before writing.
  - [P2] Honor or reject ghost-zone OpenPMD outputs — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
                 parthenon_opmd.cpp:775-777
    If a user sets ghost_zones=true for an OpenPMD block, the file advertises IncludesGhost=1, but the field writer still
    packs IndexDomain::interior here and stores the interior block size. Restart construction subtracts ghost cells when
    this attribute is set, so these OpenPMD restart files fail the block-size check before data are read; either write the
    entire domain or reject the option for OpenPMD.
  - [P2] Size unpadded topological buffers by summing elements — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
    output_utils.cpp:109-114
    For OpenPMD restarts of face/edge variables on non-cubic blocks, the different topological elements do not have the
    same number of cells, e.g. F1 is (nx1+1)*nx2*nx3 while F2 is nx1*(nx2+1)*nx3. This check therefore throws for valid
    meshes; the unpadded size should be the sum over each element's total, not ntop_elems times the first total.
  - [P2] Avoid overlapping chunks for face/edge data — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
    parthenon_opmd.cpp:273-275
    On multi-block meshes with face-, edge-, or node-centered variables, every block writes an extent with the +
    TopologicalOffset* point in each relevant direction. Adjacent same-level blocks therefore both write the shared face/
    edge/node indices, e.g. F1 chunks at offsets 0 and nx1 both cover nx1, which makes ADIOS/OpenPMD output order-dependent
    or rejected for overlapping chunks; boundary points need to be assigned to only one block or stored block-locally.

@pgrete
Copy link
Copy Markdown
Collaborator Author

pgrete commented May 4, 2026

All Codex comments were reasonable (I'm actually surprised it picked up the topological elements logic).
I fixed/addressed all except for the last one as this is implicitly handled by ADIOS2 (so the data model allows "overlapping" "writes" at the API level as it internally uses staged buffers to optimize actual write operations).

@pgrete pgrete enabled auto-merge (squash) May 4, 2026 10:33
@pgrete pgrete disabled auto-merge May 4, 2026 10:33
@BenWibking
Copy link
Copy Markdown
Collaborator

All Codex comments were reasonable (I'm actually surprised it picked up the topological elements logic). I fixed/addressed all except for the last one as this is implicitly handled by ADIOS2 (so the data model allows "overlapping" "writes" at the API level as it internally uses staged buffers to optimize actual write operations).

Since you are writing in chunks, I assume the writes don't actually overlap? Just the indices, right?

@pgrete
Copy link
Copy Markdown
Collaborator Author

pgrete commented May 4, 2026

All Codex comments were reasonable (I'm actually surprised it picked up the topological elements logic). I fixed/addressed all except for the last one as this is implicitly handled by ADIOS2 (so the data model allows "overlapping" "writes" at the API level as it internally uses staged buffers to optimize actual write operations).

Since you are writing in chunks, I assume the writes don't actually overlap? Just the indices, right?

Correct, the actual write only happen on flush, which is collective and how indices are written to disk is handled internally by ADIOS2.

@BenWibking
Copy link
Copy Markdown
Collaborator

Ok, I re-ran codex review:

• The patch introduces regressions in existing HDF5 meshdata_name output handling and breaks supported single-backend test builds. It
  also accepts invalid OpenPMD coarsening settings that can crash or corrupt output.

  Full review comments:

  - [P2] Use the selected MeshData when filling HDF5 buffers — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
    parthenon_hdf5.cpp:399-399
    When an HDF5 output block sets meshdata_name to a non-base container, all_vars_info is built from that container above, but this
    lookup now falls back to pmb->meshblock_data.Get() with the default base name. That makes non-base outputs either throw for
    variables that only exist in the requested MeshData or silently write the base variable with the same label instead.
  - [P2] Guard backend-specific unit-test branches — /Users/benwibking/parthenon_codes/parthenon/tst/unit/test_unit_params.cpp:211-
    219
    In builds with only one restart backend enabled, these if constexpr branches still reference the other backend's non-dependent
    names while the template is parsed. For example, -DPARTHENON_DISABLE_OPENPMD=ON leaves OutputTypes as HDF5-only but
    RestartReaderOPMD/openPMD are not declared, so test_unit_params.cpp fails to compile; the symmetric HDF5-disabled case has the
    same issue for the HDF5 branch.
  - [P2] Validate OpenPMD coarsening factors before use — /Users/benwibking/parthenon_codes/parthenon/src/outputs/outputs.cpp:367-371
    If an input sets coarsening_factor to 0 or to a value that does not evenly divide each block dimension, the OpenPMD writer later
    divides by it to compute chunk extents and uses it in modulo/skipped-cell logic. That can produce a divide-by-zero or a chunk
    extent that does not match the number of values stored, e.g. nx1=5, coarsening_factor=2 selects 3 cells but advertises an extent
    of 2.

Let me know if these are real.

@pgrete
Copy link
Copy Markdown
Collaborator Author

pgrete commented May 4, 2026

  1. Was a bug (now fixed)
  2. Was true (also fixed -- though ugly)
  3. Was already hinted at with a "todo" and is now fixed.

// the public, perform publicly and display publicly, and to permit others to do so.
//========================================================================================

#ifndef OUTPUTS_OUTPUT_ATTR_HPP_
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This content has been moved from the hdf5 header to a central location as it's reused for opmd

@BenWibking
Copy link
Copy Markdown
Collaborator

Ok, it keeps giving me new issues:

• The new OpenPMD path has a likely build issue and mishandles existing arbitrary-shaped Metadata::None variables in both write and
  restart logic. It also incorrectly rejects valid coarsened 2-D outputs.

  Full review comments:

  - [P1] Add the standard headers for date formatting — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
    parthenon_opmd.cpp:379-379
    In OpenPMD-enabled builds this uses std::format/std::chrono without including <format>/<chrono> in the new translation unit, so
    compilation depends on external transitive includes and can fail before tests run. Add the standard headers directly, or use an
    existing date-formatting helper that is supported by the configured toolchains.
  - [P2] Handle Metadata::None fields before using cell bounds — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
    parthenon_opmd.cpp:784-786
    When an OpenPMD output includes a Metadata::None field, such as the existing advection metadata_none_var when explicitly
    requested or any arbitrary-shaped Independent/Restart field, these loops use mesh cell bounds instead of the variable's raw view
    shape. That truncates shapes like nx+1 and can index the wrong extents for non-mesh-backed variables; either reject these fields
    for OpenPMD or mirror the raw-shape handling used by the HDF5 packing path.
  - [P2] Preserve raw-shape bounds for unpadded restarts — /Users/benwibking/parthenon_codes/parthenon/src/outputs/
    output_utils.hpp:344-347
    OpenPMD restarts call this helper with is_padded=false; for Metadata::None variables the code has just set kb/jb/ib from the raw
    view shape, but this branch overwrites them with mesh cell bounds. Restarting an arbitrary-shaped field will therefore unpack the
    saved buffer using the wrong extents, so skip this adjustment when info.where is Metadata::None.
  - [P2] Gate coarsening checks on active dimensions — /Users/benwibking/parthenon_codes/parthenon/src/outputs/outputs.cpp:381-383
    For 2-D OpenPMD outputs base_block_size.nx(X3DIR) is 1, so any coarsening_factor > 1 is rejected here even though the writer has
    a 2-D path and no x3 data are emitted. Only require divisibility for dimensions present in pm->ndim (and similarly for x2 if 1-D
    output is later enabled).

Let me know if these are valid.

@BenWibking
Copy link
Copy Markdown
Collaborator

It looks like issues 2 and 3 are related to Metadata::None, which is not supported yet. So maybe those can be ignored.

Comment thread src/outputs/outputs.cpp
"Writing ghost zones not supported for OPMD outputs.");

const auto output_type_str = pin->GetOrAddString(
op.block_name, "output_type", "restart",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this sets the default output_type to restart. For HDF5 outputs, the default output_type is data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants