New DeltaAI Module #1306

Merged
sbryngelson merged 1 commit into MFlowCode:master from wilfonba:deltaAIFix
Mar 13, 2026

Conversation

@wilfonba
Contributor

@wilfonba wilfonba commented Mar 13, 2026

Description

Replaces old DeltaAI modules with new working module.

Fixes #(issue)

Type of change

  • Bug fix

Testing

I ran the examples/3D_performance_test/ case with a 360³ grid on 1, 2, 4, and 8 GPUs on DeltaAI. Results:

  • 1 rank: 0.4866 ns/gp/eq/rhs
  • 2 ranks: 0.2626 ns/gp/eq/rhs
  • 4 ranks: 0.1421 ns/gp/eq/rhs
  • 8 ranks: 0.0779 ns/gp/eq/rhs (2 nodes)
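These grind times imply the following strong-scaling figures (an illustrative calculation on the numbers above, not part of the PR):

```python
# Strong-scaling speedup and parallel efficiency from the grind times above.
# Grind time is ns per grid point per equation per RHS evaluation; lower is better.
times = {1: 0.4866, 2: 0.2626, 4: 0.1421, 8: 0.0779}  # ranks -> ns/gp/eq/rhs

base = times[1]
for ranks in sorted(times):
    speedup = base / times[ranks]   # relative to the 1-rank run
    efficiency = speedup / ranks    # 1.0 would be ideal strong scaling
    print(f"{ranks:>2} ranks: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

On these numbers the 8-rank run retains about 78% parallel efficiency.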

Copilot AI review requested due to automatic review settings March 13, 2026 17:36
@github-actions

Claude Code Review

Head SHA: 3bebe44

Files changed: 1

  • toolchain/modules

Summary

  • Replaces two separate modules (nvhpc-openmpi3/24.3 + cuda) with the unified nvidia/25.5 module on NCSA DeltaAI
  • The change is minimal (1 addition, 1 deletion) and isolated to the module loader configuration
  • Testing was performed on examples/3D_performance_test/ at 360³ resolution on 1, 2, 4, and 8 GPUs with near-ideal strong scaling observed
  • No Fortran source code changes; no GPU macro, precision, or parameter-system rules apply here

Findings

No issues found.


Improvement Opportunities

  1. PR body references unfiled issue (toolchain/modules, line N/A): The body contains Fixes #(issue) as a placeholder — this looks like an unfilled template. If there's a tracking issue for the broken DeltaAI module, linking it would be helpful; otherwise consider removing the placeholder before merge.

  2. Old module version documentation (toolchain/modules, line 72): The old module string (nvhpc-openmpi3/24.3 cuda) is simply deleted with no comment on why it broke (e.g., module was retired). A brief comment in the modules file or in the PR body about when the old module was deprecated would help future maintainers understand the change history.

  3. dai-gpu CUDA CC list (toolchain/modules, line 73): MFC_CUDA_CC=89,90 targets SM 89 (Ada Lovelace) and SM 90 (Hopper). Verifying that the H100s on DeltaAI are indeed SM 90 (they are) and that no other GPU arch is present on DeltaAI nodes is worth confirming, but this is unchanged and pre-existing.

Contributor

Copilot AI left a comment

Pull request overview

Updates the DeltaAI (NCSA) cluster module configuration used by the MFC toolchain’s module loader (./mfc.sh load) to reflect a newer NVIDIA toolchain module.

Changes:

  • Replaces the DeltaAI common module list entry from nvhpc-openmpi3/24.3 cuda to nvidia/25.5.

@coderabbitai
Contributor

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

The pull request modifies a module configuration for NCSA DeltaAI in the toolchain/modules directory. The dai-all module line is updated to replace the nvhpc-openmpi3/24.3 cuda toolchain with nvidia/25.5. All other configuration parameters on the line, including Python, CMake, and compiler settings (CC/CXX/FC), remain unchanged. The dai-gpu module configuration is unaffected by this change. No public or exported entities are altered.

🚥 Pre-merge checks (2 passed, 1 inconclusive)

❓ Inconclusive (1)

  • Title check: The title 'New DeltaAI Module' is vague and generic, and doesn't convey the specific technical change. Resolution: replace it with a more specific title, such as 'Replace DeltaAI nvhpc-openmpi3 module with nvidia/25.5' or 'Update DeltaAI toolchain module configuration'.

✅ Passed (2)

  • Description check: The description covers the main points but is missing critical details: the specific issue number is not filled in, and the testing section lacks context about what was being validated.
  • Docstring coverage: No functions found in the changed files to evaluate; docstring coverage check skipped.


Contributor

coderabbitai bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e3f345d3-3178-4fb4-be84-a650b3fcbdc0

📥 Commits

Reviewing files that changed from the base of the PR and between 598f5a5 and 3bebe44.

📒 Files selected for processing (1)
  • toolchain/modules


dai        NCSA DeltaAI
- dai-all  python cmake nvhpc-openmpi3/24.3 cuda
+ dai-all  python cmake nvidia/25.5
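A quick sanity check of the replacement module on a DeltaAI login node might look like the following (a sketch, not from the PR; it assumes DeltaAI uses Lmod and that nvidia/25.5 puts the HPC SDK compilers on PATH):

```shell
# Hypothetical verification on a DeltaAI login node (assumes Lmod).
module reset                 # start from the default environment
module load nvidia/25.5      # the module this PR switches to
which nvfortran nvc nvcc     # HPC SDK compilers should now be on PATH
nvfortran --version          # expect a 25.5-series HPC SDK banner
```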

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For NCSA DeltaAI, what components are included in the nvidia/25.5 module?
Specifically confirm whether it provides:

  1. CUDA toolkit/runtime paths, and
  2. MPI (OpenMPI/HPC-X) usable for multi-rank jobs with nvc/nvc++/nvfortran.

Please include official cluster/module documentation links.

💡 Result:

What the DeltaAI docs publicly confirm (re: NVIDIA HPC SDK 25.5)

1) CUDA toolkit/runtime paths

Yes—DeltaAI provides CUDA via its cudatoolkit module, and the documented nvcc path is inside the NVIDIA HPC SDK 25.5 install tree:

  • cudatoolkit/25.5_12.9
  • nvcc at /opt/nvidia/hpc_sdk/Linux_aarch64/25.5/compilers/bin/nvcc (CUDA 12.9) (docs.ncsa.illinois.edu)

So regardless of whether you reach it through a higher-level nvidia/25.5-style module, the official DeltaAI documentation confirms CUDA paths are being set from the HPC SDK 25.5 installation. (docs.ncsa.illinois.edu)

2) MPI (OpenMPI/HPC‑X) usable for multi-rank jobs with nvc/nvc++/nvfortran

Not as OpenMPI/HPC‑X per the DeltaAI docs. DeltaAI’s documented MPI for multi-rank jobs is Cray MPICH, and GPU-aware MPI is enabled via MPICH_GPU_SUPPORT_ENABLED=1.

The docs also describe the system programming environments (modules) as PrgEnv-gnu, PrgEnv-cray, and prgenv-nvidia (lowercase), i.e., a Cray Programming Environment workflow rather than an OpenMPI/HPC‑X one. (docs.ncsa.illinois.edu)

Separately, NVIDIA’s own HPC SDK 25.5 release notes indicate HPC‑X is selected by loading a dedicated nvhpc-hpcx-... environment module (i.e., it’s not implied to be automatically “the MPI” you get just by loading the base HPC SDK). (docs.nvidia.com)

Official documentation links (cluster + vendor)

  • DeltaAI “Programming Environment (Building Software)” (CUDA module + GPUDirect/Cray MPICH notes). (docs.ncsa.illinois.edu)
  • DeltaAI “Installed Software” (module system context). (docs.ncsa.illinois.edu)
  • NVIDIA HPC SDK 25.5 Release Notes (HPC‑X selection via nvhpc-hpcx-... module). (docs.nvidia.com)

What I could not confirm from public docs

I did not find a public DeltaAI page that enumerates the exact contents of a module named nvidia/25.5 (e.g., the full module show nvidia/25.5 output and whether it directly exposes an OpenMPI/HPC‑X tree).

If you paste the output of:

module show nvidia/25.5
module show prgenv-nvidia
module spider hpcx

I can tell you definitively whether nvidia/25.5 is setting CUDA vars/paths itself and whether it also wires in HPC‑X/OpenMPI on DeltaAI.


🏁 Script executed:

# Locate and inspect the toolchain/modules file
fd -t f "modules" -path "*toolchain*" | head -20

Repository: MFlowCode/MFC

Length of output: 225


🏁 Script executed:

# Find the toolchain/modules file
git ls-files | grep -E "toolchain.*modules" | head -20

Repository: MFlowCode/MFC

Length of output: 105


🏁 Script executed:

# Read the toolchain/modules file with line numbers
cat -n toolchain/modules | head -90

Repository: MFlowCode/MFC

Length of output: 3374


🏁 Script executed:

# Check the git diff to see what was replaced on Line 73
git diff HEAD -- toolchain/modules | head -100

Repository: MFlowCode/MFC

Length of output: 39


🏁 Script executed:

# Check git status and recent commits for this file
git status toolchain/modules
echo "---"
git log --oneline -10 -- toolchain/modules

Repository: MFlowCode/MFC

Length of output: 155


🏁 Script executed:

# Show the previous version of toolchain/modules (before "new delta AI module" commit)
git show 3bebe447~1:toolchain/modules | sed -n '59,76p'

Repository: MFlowCode/MFC

Length of output: 97


🏁 Script executed:

# Show the diff for the commit that added "new delta AI module"
git show 3bebe447 -- toolchain/modules

Repository: MFlowCode/MFC

Length of output: 3771


Add explicit MPI configuration to dai-all modules.

The nvidia/25.5 module provides CUDA, but DeltaAI's documented multi-rank workflow requires Cray MPICH with GPU support (set via MPICH_GPU_SUPPORT_ENABLED=1). Add either an explicit prgenv-nvidia module or cray-mpich module to the dai-all line to ensure MPI is available for multi-rank builds and runs, consistent with DeltaAI's programming environment guidelines.
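If this suggestion were adopted, the change might look like the following (a hypothetical sketch; the exact DeltaAI module names should be confirmed with `module spider` before committing):

```shell
# Hypothetical job-script fragment for multi-rank runs on DeltaAI.
# Module names below are assumptions based on the review comment, not verified.
module load nvidia/25.5 cray-mpich
export MPICH_GPU_SUPPORT_ENABLED=1   # GPU-aware Cray MPICH, per DeltaAI docs
```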

@sbryngelson sbryngelson merged commit 92e751f into MFlowCode:master Mar 13, 2026
36 checks passed
