Merged
42 changes: 42 additions & 0 deletions .claude/rules/common-pitfalls.md
# Common Pitfalls

## Array Bounds
- Arrays use non-unit lower bounds to make room for ghost cells
- Riemann solver indexing: left state at `j`, right state at `j+1`
- Off-by-one errors in ghost cell regions are a common source of bugs
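
As a hedged sketch (the names `buff_size`, `m`, and `f_riemann` are illustrative, not MFC's actual declarations), the indexing pattern looks like:

```fortran
! Interior cells 0..m plus buff_size ghost cells on each side,
! so arrays have non-unit lower bounds.
real(wp), allocatable :: q(:), flux(:)
allocate (q(-buff_size:m + buff_size))
allocate (flux(-1:m))

! Flux at face j+1/2: left state at j, right state at j+1.
! An off-by-one here silently reads the wrong ghost cells.
do j = -1, m
    flux(j) = f_riemann(q(j), q(j + 1))
end do
```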

## Blast Radius
- `src/common/` is shared by ALL three executables (pre_process, simulation, post_process)
- Any change to common/ requires testing all three targets
- Public subroutine signature changes affect all callers across all targets
- Parameter default changes affect all existing case files

## Physics Consistency
- Pressure formula MUST match `model_eqns` setting
- Model-specific conservative ↔ primitive conversion paths exist
- Volume fractions must sum to 1.0
- Boundary condition symmetry requirements must be maintained

## Compiler-Specific Issues
- Code must compile on gfortran, nvfortran, Cray ftn, and Intel ifx
- Each compiler has different strictness levels and warning behavior
- Fypp macros must expand correctly for both GPU and CPU builds
- GPU builds only work with nvfortran, Cray ftn, and AMD flang

## Test Golden Files
- Tests compare output against golden files in `tests/<hash>/golden.txt`
- If your change intentionally modifies output, regenerate golden files:
`./mfc.sh test --generate --only <affected_tests> -j 8`
- Do not regenerate ALL golden files unless you understand every output change
- Golden file diffs are compared within a tolerance, not for an exact match

## PR Checklist
Before submitting a PR:
- [ ] `./mfc.sh format -j 8` (auto-format)
- [ ] `./mfc.sh precheck -j 8` (5 CI lint checks)
- [ ] `./mfc.sh build -j 8` (compiles)
- [ ] `./mfc.sh test --only <relevant> -j 8` (tests pass)
- [ ] If adding parameters: all 3 locations updated
- [ ] If modifying `src/common/`: all three targets tested
- [ ] If changing output: golden files regenerated for affected tests
- [ ] One logical change per commit
46 changes: 46 additions & 0 deletions .claude/rules/fortran-conventions.md
# Fortran Conventions

## File Format
- Source files use `.fpp` extension (Fortran + Fypp preprocessor macros)
- Fypp preprocesses `.fpp` → `.f90` at build time via CMake
- Fypp supports conditional compilation, code generation, and regex macros

## Module Structure
Every Fortran module follows this pattern:
- File: `m_<feature>.fpp`
- Module: `module m_<feature>`
- `implicit none` required
- Explicit `intent(in)`, `intent(out)`, or `intent(inout)` on ALL subroutine/function arguments
- Initialization subroutine: `s_initialize_<feature>_module`
- Finalization subroutine: `s_finalize_<feature>_module`
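
A minimal sketch of that shape (`example` is a placeholder feature name, and the body comments stand in for real setup code):

```fortran
! m_example.fpp
module m_example

    implicit none

    public :: s_initialize_example_module, s_finalize_example_module

contains

    subroutine s_initialize_example_module()
        ! allocate and set up module state
    end subroutine s_initialize_example_module

    subroutine s_finalize_example_module()
        ! release module state
    end subroutine s_finalize_example_module

end module m_example
```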

## Naming
- Modules: `m_<feature>`
- Public subroutines: `s_<verb>_<noun>`
- Public functions: `f_<verb>_<noun>`
- Private/local variables: no prefix required
- Constants: descriptive names, not ALL_CAPS

## Forbidden Patterns
These are caught by `./mfc.sh precheck` (source lint step 4/5):
- `dsqrt`, `dexp`, `dlog`, `dble`, `dabs` → use `sqrt`, `exp`, `log`, `real(..., wp)`, `abs`
- `real(8)`, `real(4)` → use `wp` or `stp` kind parameters
- `goto`, `COMMON` blocks, global `save` variables
- `stop`, `error stop` → use `call s_mpi_abort()`
- Raw `!$acc` or `!$omp` directives → use Fypp GPU_* macros from `parallel_macros.fpp`

## Precision Types
- `wp` (working precision): used for computation. Double by default.
- `stp` (storage precision): used for I/O. Double by default.
- In single-precision mode (`--single`): both become single.
- In mixed-precision mode (`--mixed`): wp=double, stp=single.
- MPI type matching: `mpi_p` must match `wp`, `mpi_io_p` must match `stp`.
- Always use generic intrinsics: `sqrt` not `dsqrt`, `abs` not `dabs`.
- Cast with `real(..., wp)` or `real(..., stp)`, never `dble(...)`.
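
For example, converting an integer into working precision before taking a square root:

```fortran
real(wp) :: a
integer :: n

! Forbidden:  a = dsqrt(dble(n))
a = sqrt(real(n, wp))   ! generic intrinsic + kind-parameter cast
```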

## Size Guidelines (soft)
- Subroutine: ≤500 lines
- Helper routine: ≤150 lines
- Function: ≤100 lines
- File: ≤1000 lines
- Arguments: ≤6 preferred
109 changes: 109 additions & 0 deletions .claude/rules/gpu-and-mpi.md
# GPU and MPI Patterns

## GPU Offloading Architecture

Only `src/simulation/` is GPU-accelerated. Pre/post_process run on CPU only.

MFC uses a **backend-agnostic GPU abstraction** via Fypp macros. The same source code
compiles to either OpenACC or OpenMP target offload depending on the build flag:

- `./mfc.sh build --gpu acc` → OpenACC backend (NVIDIA nvfortran, Cray ftn)
- `./mfc.sh build --gpu mp` → OpenMP target offload backend (Cray ftn, AMD flang)
- `./mfc.sh build` (no --gpu) → CPU-only, GPU macros expand to plain Fortran

### Macro Layers (in src/common/include/)
- `parallel_macros.fpp` — **Use these.** Generic `GPU_*` macros that dispatch to the
correct backend based on `MFC_OpenACC` / `MFC_OpenMP` compile definitions.
- `acc_macros.fpp` — OpenACC-specific `ACC_*` implementations (do not call directly)
- `omp_macros.fpp` — OpenMP target offload `OMP_*` implementations (do not call directly)
- OMP macros generate **compiler-specific** directives: NVIDIA uses `target teams loop`,
Cray uses `target teams distribute parallel do simd`, AMD uses
`target teams distribute parallel do`
- `shared_parallel_macros.fpp` — Shared helpers (collapse, private, reduction generators)

### Key GPU Macros (always use the `GPU_*` prefix)
- `@:GPU_PARALLEL_LOOP(collapse=N, private=[...], reduction=[...], reductionOp='+')` —
Parallel loop over GPU threads. Most common GPU macro.
- `@:END_GPU_PARALLEL_LOOP()` — Required closing for GPU_PARALLEL_LOOP.
- `@:GPU_PARALLEL(code, ...)` — GPU parallel region (wraps code block).
- `@:GPU_LOOP(collapse=N, ...)` — Inner loop within a GPU parallel region.
- `@:GPU_DATA(code, copy=..., create=..., ...)` — Scoped data region.
- `@:GPU_ENTER_DATA(create=[...])` — Allocate device memory (unscoped).
- `@:GPU_EXIT_DATA(delete=[...])` — Free device memory.
- `@:GPU_UPDATE(host=[...])` — Copy device → host (before MPI send).
- `@:GPU_UPDATE(device=[...])` — Copy host → device (after MPI receive).
- `@:GPU_ROUTINE(function_name=..., nohost=True/False)` — Mark routine for device.
- `@:GPU_DECLARE(copyin=[...], link=[...])` — Declare device-resident data.
- `@:GPU_ATOMIC(atomic='update')` — Atomic operation on device.
- `@:GPU_WAIT()` — Synchronization barrier.
- `@:GPU_HOST_DATA(code, use_device_addr=[...])` — Host code with device pointers.

NEVER write raw `!$acc` or `!$omp` directives. Always use `GPU_*` Fypp macros.
The precheck source lint will catch raw directives and fail.
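
A hedged sketch of the most common pattern, a collapsed triple loop (the array names `q_cons`, `q_prim`, and `rho` are illustrative):

```fortran
@:GPU_PARALLEL_LOOP(collapse=3)
do l = 0, p
    do k = 0, n
        do j = 0, m
            q_cons(j, k, l) = rho(j, k, l)*q_prim(j, k, l)
        end do
    end do
end do
@:END_GPU_PARALLEL_LOOP()
```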

### Memory Management Macros (from macros.fpp)
- `@:ALLOCATE(var1, var2, ...)` — Fortran allocate + `GPU_ENTER_DATA(create=...)`
- `@:DEALLOCATE(var1, var2, ...)` — `GPU_EXIT_DATA(delete=...)` + Fortran deallocate
- Every `@:ALLOCATE` MUST have a matching `@:DEALLOCATE` in finalization
- Conditional allocation MUST have conditional deallocation
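
A hedged sketch of the pairing rule, including the conditional case (`viscous` and the array names are illustrative):

```fortran
subroutine s_initialize_example_module()
    @:ALLOCATE(q_cons(0:m), q_prim(0:m))
    if (viscous) then
        @:ALLOCATE(tau(0:m))      ! conditional allocation ...
    end if
end subroutine s_initialize_example_module

subroutine s_finalize_example_module()
    if (viscous) then
        @:DEALLOCATE(tau)         ! ... matched by conditional deallocation
    end if
    @:DEALLOCATE(q_cons, q_prim)
end subroutine s_finalize_example_module
```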

### GPU Field Setup (Cray-specific, from macros.fpp)
- `@:ACC_SETUP_VFs(...)` / `@:ACC_SETUP_SFs(...)` — GPU pointer setup for vector/scalar fields
- These compile only for Cray (`_CRAYFTN`); other compilers skip them

### Compiler-Backend Matrix
| Compiler | `--gpu acc` (OpenACC) | `--gpu mp` (OpenMP) | CPU-only |
|-----------------|----------------------|---------------------|----------|
| GNU gfortran | No | No | Yes |
| NVIDIA nvfortran| Yes (primary) | Yes | Yes |
| Cray ftn (CCE) | Yes | Yes (primary) | Yes |
| Intel ifx | No | No | Yes |
| AMD flang | No | Yes | Yes |

## Preprocessor Defines (`#ifdef` / `#ifndef`)

Raw `#ifdef` / `#ifndef` preprocessor guards are **normal and expected** in MFC.
They are NOT the same as raw `!$acc`/`!$omp` pragmas (which are forbidden).

Use `#ifdef` for feature, target, compiler, and library gating:

### Feature gating
- `MFC_MPI` — MPI-enabled build (`--mpi` flag, default ON)
- `MFC_OpenACC` — OpenACC GPU backend (`--gpu acc`)
- `MFC_OpenMP` — OpenMP target offload backend (`--gpu mp`)
- `MFC_GPU` — Any GPU build (either OpenACC or OpenMP)
- `MFC_DEBUG` — Debug build (`--debug`)
- `MFC_SINGLE_PRECISION` — Single-precision mode (`--single`)
- `MFC_MIXED_PRECISION` — Mixed-precision mode (`--mixed`)

### Target gating (for code in `src/common/` shared across executables)
- `MFC_PRE_PROCESS` — Only in pre_process builds
- `MFC_SIMULATION` — Only in simulation builds
- `MFC_POST_PROCESS` — Only in post_process builds

### Compiler gating (for compiler-specific workarounds)
- `_CRAYFTN` — Cray Fortran compiler
- `__NVCOMPILER_GPU_UNIFIED_MEM` — NVIDIA unified memory (GH-200 / `--unified`)
- `__PGI` — Legacy PGI/NVIDIA compiler
- `__INTEL_COMPILER` — Intel compiler
- `FRONTIER_UNIFIED` — Frontier HPC unified memory

### Library-specific code
- FFTW (`m_fftw.fpp`) uses heavy `#ifdef` gating for `MFC_GPU` and `__PGI`
- CUDA Fortran (`cudafor` module) is gated behind `__NVCOMPILER_GPU_UNIFIED_MEM`
- SILO/HDF5 interfaces may have conditional paths

When adding new `#ifdef` blocks, make sure every configuration has a valid path
(add an `#else` fallback where one is needed) so the code compiles in all
configurations (CPU-only, GPU-ACC, GPU-OMP, with/without MPI).
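
For instance, a reduction that needs MPI only when it is enabled (a sketch — the variable names are illustrative; `mpi_p` is the working-precision MPI type from the precision rules):

```fortran
#ifdef MFC_MPI
    call MPI_Allreduce(loc_sum, glb_sum, 1, mpi_p, MPI_SUM, &
                       MPI_COMM_WORLD, ierr)
#else
    glb_sum = loc_sum   ! serial fallback keeps non-MPI builds compiling
#endif
```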

## MPI

### Halo Exchange
- Pack/unpack offset calculations are error-prone — verify carefully
- Buffer sizing depends on dimensionality and QBMM state
- GPU coherence: always `GPU_UPDATE(host=...)` before MPI send,
`GPU_UPDATE(device=...)` after MPI receive
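
A hedged sketch of that coherence pattern (buffer names and MPI arguments are illustrative):

```fortran
@:GPU_UPDATE(host=[buff_send])      ! device -> host before the send
call MPI_Sendrecv(buff_send, cnt, mpi_p, dst, tag, &
                  buff_recv, cnt, mpi_p, src, tag, &
                  MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
@:GPU_UPDATE(device=[buff_recv])    ! host -> device after the receive
```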

### Error Handling
- Use `call s_mpi_abort()` for fatal errors, never `stop` or `error stop`
- MPI must be finalized before program exit
45 changes: 45 additions & 0 deletions .claude/rules/parameter-system.md
# Parameter System

## Overview
MFC has ~3,300 simulation parameters defined in Python and read by Fortran via namelist files.

## Parameter Flow: Python → Fortran

1. **Definition**: `toolchain/mfc/params/definitions.py` — source of truth
- Parameters are indexed families: `patch_icpp(i)%attr`, `fluid_pp(i)%attr`, etc.
- Each has type, default, constraints, and tags

2. **Validation**: `toolchain/mfc/case_validator.py`
- JSON schema validation via fastjsonschema
- Physics constraint checking (e.g., volume fractions sum to 1)
- Dependency validation (required/recommended params)

3. **Input Generation**: `toolchain/mfc/run/input.py`
- Python case dict → Fortran namelist `.inp` file
- Format: `&user_inputs / ... / &end`

4. **Fortran Reading**: `src/*/m_start_up.fpp`
- Reads `&user_inputs` namelist
- Each parameter must be declared in the namelist statement
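
The generated `.inp` file is a plain Fortran namelist; a hedged fragment (parameter values are hypothetical) looks like:

```
&user_inputs
    m = 199
    model_eqns = 2
    patch_icpp(1)%pres = 1.0
&end
```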

## Adding a New Parameter (3-location checklist)

YOU MUST update all 3 locations. Missing any causes silent failures.

1. **`toolchain/mfc/params/definitions.py`**: Add parameter with type, default, constraints
2. **`src/*/m_start_up.fpp`**: Add the parameter to the Fortran `namelist` declaration
   in the relevant target(s). If it is used only by simulation, add it there; if it is
   shared, add it to all three targets' `m_start_up.fpp`.
3. **`toolchain/mfc/case_validator.py`**: Add validation rules if the parameter has
physics constraints. Include `PHYSICS_DOCS` entry with title, category, explanation.
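
Conceptually, each definitions entry bundles a type, default, and constraints; the sketch below is a hypothetical illustration of that idea only — the real `definitions.py` API and parameter name are not shown here and may differ:

```python
# Hypothetical illustration -- not the actual definitions.py schema.
new_param = {
    "name": "my_new_param",            # hypothetical parameter name
    "type": float,
    "default": 0.0,
    "constraint": lambda v: v >= 0.0,  # e.g. must be non-negative
}

def validate(entry, value):
    """Check a candidate value against an entry's type and constraint."""
    return isinstance(value, entry["type"]) and entry["constraint"](value)

print(validate(new_param, 1.5))
print(validate(new_param, -1.0))
```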

## Case Files
- Case files are Python scripts (`.py`) that define a dict of parameters
- Validated with `./mfc.sh validate case.py`
- Examples in `examples/` directory
- Create new cases with `./mfc.sh new <name>`
- Search parameters with `./mfc.sh params <query>`
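
A hedged minimal sketch of the shape of a case file — the specific keys and values here are illustrative, not a runnable MFC case, and the JSON-printing convention follows the cases in `examples/`:

```python
#!/usr/bin/env python3
import json

# Case files define a dict of parameters and emit it as JSON.
case = {
    "run_time_info": "T",
    "m": 199, "n": 0, "p": 0,    # cells per direction (illustrative)
    "model_eqns": 2,
    "patch_icpp(1)%pres": 1.0,
    # a string expression becomes generated Fortran (analytical IC)
    "patch_icpp(1)%vel(1)": "0.1*sin(2*3.141592653589793*x)",
}

print(json.dumps(case))
```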

## Analytical Initial Conditions
String expressions in parameters become Fortran code via `case.py.__get_analytic_ic_fpp()`.
These are compiled into the binary, so syntax errors cause build failures, not runtime errors.