
Moving EL Bubbles with MPI Decomposition#1290

Draft
wilfonba wants to merge 6 commits into MFlowCode:master from wilfonba:MovingBubblesFresh-clean

Conversation


@wilfonba wilfonba commented Mar 3, 2026

Description

This PR adds support for moving Euler-Lagrange bubbles and various force models. Tracer particles that simply follow the background flow are implemented, as are Newton's 2nd Law particles that respond to body forces, pressure gradients, and drag. This PR is marked as a draft for now as there are several things that I need to complete before a proper merge can occur, but I wanted these changes to be visible to other contributors working on related features.

Fixes #(issue)

Type of change

  • New feature

Testing

16 rank CPU versus 16 rank GPU

This test was run using the examples/3D_lagrange_bubblescreen test case and compares the results of a 16 rank CPU simulation to a 16 rank GPU simulation. The visualization shows the following:

  1. The difference in pressure in the vertical plane, with darker colors corresponding to larger errors. There seems to be a reflection of some sort from the subsonic buffer outflow condition, and the acoustic source introduces a small amount of difference, though neither of these features was modified by this PR. (I intend to add a video with no bubbles in the domain to see if these artifacts still exist)
  2. The difference in void fraction in the horizontal plane. The difference is 1e-38 everywhere in ParaView, which corresponds to bitwise identical results.
  3. The bubbles from each simulation overlaid on top of each other. The bubbles from the CPU result are colored in blue while the bubbles from the GPU result are colored in white. The small bit of blue observed in the video is a result of rendering the spheres with a fixed number of facets.
  4. The pressure field in the lower front quadrant as a surface render with darker colors indicating higher pressure.
  5. The MPI domain decomposition as a black lattice.
test.mp4

Checklist

  • I added or updated tests for new behavior
  • I updated documentation if user-facing behavior changed

See the developer guide for full coding standards.

GPU changes (expand if you modified src/simulation/)
  • GPU results match CPU results
  • Tested on NVIDIA GPU or AMD GPU

AI code reviews

Reviews are not triggered automatically. To request a review, comment on the PR:

  • @coderabbitai review — incremental review (new changes only)
  • @coderabbitai full review — full review from scratch
  • /review — Qodo review
  • /improve — Qodo code suggestions

wilfonba added 2 commits March 3, 2026 12:46
…and MPI communication

Features:
- Lagrangian bubble movement with projection-based void fraction smearing
- Kahan summation for accurate void fraction boundary conditions
- Extended MPI communication for moving EL bubbles
- New 2D and 3D moving Lagrangian bubble examples
- Updated test cases and golden files
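
The Kahan (compensated) summation noted above can be sketched in a few lines; this is an illustrative Python version of the technique, not the project's Fortran implementation:

```python
def kahan_sum(values):
    """Compensated summation: recovers the low-order bits lost at each add."""
    total = 0.0
    c = 0.0  # running compensation for accumulated rounding error
    for v in values:
        y = v - c            # apply the correction carried from the last step
        t = total + y        # low-order digits of y may be lost here...
        c = (t - total) - y  # ...but are recovered into c for the next step
        total = t
    return total
```

For accumulations like void-fraction smearing, where many small contributions are added to a large running total, this keeps the rounding error bounded independently of the number of terms.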
@MFlowCode MFlowCode deleted a comment from github-actions bot Mar 4, 2026

github-actions bot commented Mar 4, 2026

Claude Code Review

Head SHA: 54d388d
Files changed: 47 (+5320 / -1238)

Changed files (key):

  • src/simulation/m_bubbles_EL.fpp — Moving EL bubble dynamics, MPI migration, BC enforcement
  • src/simulation/m_bubbles_EL_kernels.fpp — Cell-list smearing, Lagrange interpolation, force models
  • src/simulation/m_mpi_proxy.fpp — Particle MPI send/recv infrastructure
  • src/common/m_mpi_common.fpp — MPI reductions, beta-variable buffer exchange
  • src/common/m_boundary_common.fpp — Boundary condition handling for beta fields
  • src/simulation/m_bubbles.fpp — elemental → GPU_ROUTINE for device compatibility
  • src/simulation/m_sim_helpers.fpp — max_dt array → scalar + GPU reduction
  • src/pre_process/m_start_up.fpp — Namelist additions

Summary:

  • Adds Newton's 2nd Law (vel_model 2/3) and tracer (vel_model 1) particle motion with drag, pressure gradient, and gravity forces
  • Replaces GPU-atomic smearing with a cell-list approach (eliminates non-determinism for Gaussian/delta kernels)
  • Adds full MPI particle migration: bubbles crossing subdomain boundaries are packed/unpacked and transferred to neighbor ranks
  • Switches adap_dt_stop_max (MAX reduction) to adap_dt_stop_sum (GPU atomic add) in both EE and EL paths
  • Simplifies max_dt from a per-cell array to a scalar via GPU loop reduction
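
The cell-list restructuring mentioned in the summary can be illustrated with a minimal 1-D Python stand-in (the function names here are hypothetical, not the Fortran GPU kernels). Instead of each particle atomically updating the cells it touches, particles are first binned by cell index, and a cell-centric loop then accumulates each cell's contributions in a fixed order, so every cell has a single writer and the result is deterministic:

```python
from collections import defaultdict

def build_cell_list(positions, dx):
    """Bin particle indices by the cell containing each position."""
    cells = defaultdict(list)
    for idx, x in enumerate(positions):
        cells[int(x // dx)].append(idx)
    return cells

def smear(positions, weights, dx, ncells):
    """Cell-centric accumulation: one writer per cell, no atomics needed."""
    cells = build_cell_list(positions, dx)
    field = [0.0] * ncells
    for c in range(ncells):            # deterministic order of accumulation
        for idx in cells.get(c, []):
            field[c] += weights[idx]
    return field
```

The real smearing kernels spread each particle over a stencil of neighboring cells rather than a single cell, but the ownership pattern is the same: the loop is over cells, not particles.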

Findings

🔴 Bug: Wrong argument passed to f_interpolate_velocity in f_advance_step (adap_dt + vel_model 1)

File: src/simulation/m_bubbles.fpp, inside f_advance_step, vel_model case(1)

vTemp(l) = f_interpolate_velocity(fR, cell, l, q_prim_vf)  ! ← fR is bubble RADIUS

f_interpolate_velocity(pos, cell, i, q_prim_vf) expects a spatial coordinate pos. fR is the current bubble radius, not a position. The non-adap_dt path (in s_compute_bubble_EL_dynamics) correctly passes myPos(l). This will silently produce wrong velocities for tracer particles when adap_dt = .true..

Fix: Replace fR with fPos(l):

vTemp(l) = f_interpolate_velocity(fPos(l), cell, l, q_prim_vf)

🔴 Bug: Wrong denominator in 4th-order Lagrange basis function L(1)

File: src/simulation/m_bubbles_EL_kernels.fpp, f_interpolate_velocity (fd_order == 4)

L(1) = ((pos - xi(2))*(pos - xi(3))*(pos - xi(4))*(pos - xi(5)))/ &
       ((xi(1) - xi(2))*(xi(1) - xi(3))*(xi(1) - xi(4))*(xi(2) - xi(5)))  ! ← last factor WRONG

The Lagrange basis requires (xi(1) - xi(5)) in the last factor, not (xi(2) - xi(5)).

Fix:

L(1) = ((pos - xi(2))*(pos - xi(3))*(pos - xi(4))*(pos - xi(5)))/ &
       ((xi(1) - xi(2))*(xi(1) - xi(3))*(xi(1) - xi(4))*(xi(1) - xi(5)))

🟡 Issue: Duplicate hyper_cleaning in pre_process namelist

File: src/pre_process/m_start_up.fpp, around line 148

g0_ic, p0_ic, hyper_cleaning, hyper_cleaning  ! ← 'hyper_cleaning' listed twice

Fortran namelist variables must be unique. Some compilers will silently ignore the duplicate; others will raise a compile-time or runtime error. This should be deduplicated.


🟡 Issue: Optional arguments in f_advance_step used without present() guards

File: src/simulation/m_bubbles.fpp

f_advance_step declares fPos, fVel, fRe, cell, q_prim_vf as optional. Inside the select case (lag_vel_model) block these are dereferenced without any if (present(...)) guard. The EE path calls the function without these optionals and relies on lag_vel_model == dflt_int reaching the case default branch. This is safe in practice but fragile — any future change to lag_vel_model initialization could cause an undefined-behavior dereference of absent optionals. Consider adding a GPU_ROUTINE-compatible guard (e.g., pass a sentinel moving_lag_bubbles logical) or use non-optional arguments.


🟡 Issue: Missing @:DEALLOCATE for MPI particle buffers

File: src/simulation/m_mpi_proxy.fpp

p_send_buff, p_recv_buff, and p_send_ids are allocated in s_initialize_particles_mpi but have no matching @:DEALLOCATE in any finalization routine (s_finalize_mpi_proxy_module or equivalent). Per project convention every @:ALLOCATE requires a matching @:DEALLOCATE. This is a memory leak for long or repeated runs (though the OS reclaims memory on exit).


🟢 Improvements (non-blocking)

  • src/simulation/m_bubbles_EL.fpp line ~1987: 3*fV*fVel(l)/fR — bare integer 3 in floating-point arithmetic; prefer 3._wp for consistency with coding conventions (though standard Fortran promotes correctly).
  • The Kahan summation introduced in delta/Gaussian smearing is a good accuracy improvement for bubble projection; the cell-list restructuring correctly eliminates atomic-write non-determinism.
  • The max_dt scalar + GPU min reduction is cleaner than the previous array approach.

This is a draft PR; findings above are intended to help prioritize before merge. The GPU vs CPU comparison video is compelling evidence that the MPI decomposition is correct.

@MFlowCode MFlowCode deleted a comment from github-actions bot Mar 4, 2026

github-actions bot commented Mar 9, 2026

Claude Code Review

Head SHA: 6b06c84f6ca1e089fa9df87a486acf2c6b687827
Files changed: 48 — key files: m_bubbles_EL.fpp, m_bubbles_EL_kernels.fpp, m_mpi_proxy.fpp, m_mpi_common.fpp, m_boundary_common.fpp, m_global_parameters.fpp (3 targets), m_ib_patches.fpp, m_ibm.fpp + toolchain + tests


Summary

  • Adds translational motion for EL bubbles/particles under body forces, pressure-gradient force, and drag (Newton's 2nd law and tracer modes).
  • Introduces a full MPI particle migration framework: neighbor-rank discovery, pack/send/unpack with periodic and non-periodic support.
  • Replaces x_domain/y_domain/z_domain scalars with glb_bounds(3) computed via MPI allreduce from actual grid coordinates, fixing a long-standing IB issue.
  • Adds Gaussian-smearing beta-variable buffer communication (s_mpi_reduce_beta_variables_buffers) and beta boundary conditions for periodic/reflective walls.
  • Adds two new example cases and two new golden test entries; regenerates several existing golden files.

Findings

1. Duplicate namelist variable in src/pre_process/m_start_up.fpp (probable compile error)

The modified namelist declaration ends with:

& g0_ic, p0_ic, hyper_cleaning, hyper_cleaning

hyper_cleaning appears twice. Duplicate namelist variables are a Fortran error on most compilers. The second entry is almost certainly meant to be hyper_cleaning_speed and/or hyper_cleaning_tau — those were in the original namelist but appear to have been dropped. If they are still read/used by the pre_process, this will silently fail to read them.

2. Missing intent on dummy arguments in new subroutines

Project convention (CLAUDE.md) requires explicit intent on all subroutine arguments. Several new routines violate this:

  • src/simulation/m_mpi_proxy.fpp, s_initialize_particles_mpi: lag_num_ts declared in the type list without intent.
  • src/simulation/m_mpi_proxy.fpp, s_add_particles_to_transfer_list: pos, posPrev, and the integer dummy nBub/nbub all lack intent.
  • src/simulation/m_mpi_proxy.fpp, s_add_particle_to_direction (referenced but not in this diff — verify).

3. s_mpi_reduce_int_sum leaves output uninitialized in non-MPI builds
(src/common/m_mpi_common.fpp, newly added subroutine)

subroutine s_mpi_reduce_int_sum(var_loc, sum)
    integer, intent(in)  :: var_loc
    integer, intent(out) :: sum
#ifdef MFC_MPI
    call MPI_REDUCE(...)
#endif
end subroutine

In non-MPI builds the intent(out) variable sum is never written. The caller will read garbage. Add #else; sum = var_loc (analogous to the non-MPI branches in s_mpi_reduce_stability_criteria_extrema).

4. Fragile magic number in particle-buffer sizing
(src/simulation/m_mpi_proxy.fpp, s_initialize_particles_mpi)

nReal = 7 + 16*2 + 10*lag_num_ts

This hardcoded layout count must stay in exact sync with the pack/unpack loops in s_mpi_sendrecv_lag_bubbles. If a field is added or removed from the particle state, this number produces a silent buffer over/underrun that will corrupt data or segfault. A named parameter or a computed offset should be used instead.
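
The suggested fix (deriving the count from named field sizes) might look like the following Python sketch; the field names and per-field counts are illustrative stand-ins, not the actual MFC particle layout:

```python
# Hypothetical named layout constants replacing the magic number
# nReal = 7 + 16*2 + 10*lag_num_ts. Names and counts are assumptions.
SCALAR_FIELDS = 7        # e.g. id, radius, and similar per-particle scalars
PAIRED_FIELDS = 16 * 2   # e.g. paired current/previous state variables
PER_TS_FIELDS = 10       # entries stored per Lagrangian time stage

def particle_real_count(lag_num_ts: int) -> int:
    """Per-particle real count for the MPI transfer buffer."""
    return SCALAR_FIELDS + PAIRED_FIELDS + PER_TS_FIELDS * lag_num_ts
```

Pack/unpack loops can then reference the same constants, so adding or removing a field changes the buffer size and the loops together instead of silently desynchronizing them.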

5. Golden files regenerated only with nvfortran/GPU

The updated metadata for all modified golden files (016C1B8B, 2EE0C3AA, 3008BA80, CE7B0BC7) and both new golden files (B048D7C4, F6A8B601) were generated exclusively with NVHPC 25.11 + OpenACC on wingtip-gpu3. CI runs these tests with gfortran (CPU). Tolerance-based comparison should absorb GPU/CPU floating-point differences, but it is worth regenerating the existing four tests with gfortran CPU to confirm no tolerance-exceeding drift was introduced.

6. s_beta_periodic and s_beta_reflective called inside GPU_PARALLEL_LOOP without GPU_ROUTINE macro
(src/common/m_boundary_common.fpp)

s_beta_periodic and s_beta_reflective are tagged with $:GPU_ROUTINE(parallelism='[seq]', ...). s_beta_extrapolation (defined in the same file) is also called from a GPU loop context in the extrapolation/default case but is missing the GPU_ROUTINE decoration. Verify it is never reached on GPU, or add the decorator.


Minor / Improvement Opportunities

  • Case inconsistency: mapCells (capital C) and mapcells (lowercase) are used interchangeably across new code (Fortran is case-insensitive, so not a bug, but inconsistent with the rest of the file).
  • PR is draft: The PR body notes several items still to complete before merge; this review is offered as early feedback only.
  • s_compute_dt_from_cfl logic change (m_sim_helpers.fpp): if (any(Re_size > 0)) → if (viscous). These may not be identical in all configurations — worth a targeted test with non-Newtonian or partial-viscosity setups.

Review generated by Claude Code (claude-sonnet-4-6). This is a draft PR — findings are intended as early-stage feedback.


Claude Code Review

Head SHA: 4113e012f02b1c0aecce0eead9088e77f8d09580
Files changed: 48 (5322 additions / 1243 deletions)

Files reviewed (top 10 of 48)
  • src/simulation/m_bubbles_EL.fpp
  • src/simulation/m_bubbles_EL_kernels.fpp
  • src/simulation/m_bubbles.fpp
  • src/simulation/m_mpi_proxy.fpp
  • src/common/m_mpi_common.fpp
  • src/common/m_boundary_common.fpp
  • src/common/include/3dHardcodedIC.fpp
  • toolchain/mfc/params/definitions.py
  • examples/2D_moving_lag_bubs/case.py
  • examples/3D_moving_lag_particles/case.py

Summary

  • Adds tracer (vel_model=1) and Newton's 2nd Law (vel_model=2,3) translational motion for EL bubbles with drag, pressure-gradient, and gravity forces
  • Introduces full MPI particle migration (send/recv across subdomain boundaries) and periodic wrapping
  • Replaces GPU-atomic Gaussian smearing with a cell-list + cell-centric GPU loop + Kahan compensation for better accuracy and portability
  • Refactors stability-criteria computation to use GPU reductions instead of per-cell scalar fields, removing icfl_sf/vcfl_sf/Rc_sf arrays
  • Promotes x_domain/y_domain/z_domain to a single glb_bounds(3) struct computed with MPI_ALLREDUCE (correct for stretched grids on arbitrary decompositions)

Findings

Bug — Lagrange 4th-order velocity interpolation formula incorrect
src/simulation/m_bubbles_EL_kernels.fpp (f_interpolate_velocity, fd_order == 4 branch)

Three of the five Lagrange basis polynomials are wrong. The correct Lagrange formula for node k is: L(k) = product_{j≠k}(pos - xi(j)) / product_{j≠k}(xi(k) - xi(j)). Issues:

  • L(1) denominator: last factor is (xi(2) - xi(5)) but should be (xi(1) - xi(5))
  • L(4) numerator: has *(pos - xi(4)) but should have *(pos - xi(5)) (xi(4) is the excluded node)
  • L(5) numerator: identical to L(4) — both exclude xi(4) from the product instead of excluding their own index

This produces incorrect interpolated velocities for all moving-bubble simulations when fd_order == 4. The fd_order == 2 path is correct.
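
For reference, the general formula quoted above can be written so that each node's index is excluded from its own product automatically, which makes this class of per-node error impossible (an illustrative Python sketch, not the Fortran fix):

```python
def lagrange_basis(xi, k, pos):
    """L_k(pos) = prod_{j != k} (pos - xi[j]) / (xi[k] - xi[j])."""
    L = 1.0
    for j in range(len(xi)):
        if j != k:  # node k is excluded from both numerator and denominator
            L *= (pos - xi[j]) / (xi[k] - xi[j])
    return L

def interpolate(xi, values, pos):
    """Interpolate nodal values at pos; exact for polynomials of degree < len(xi)."""
    return sum(values[k] * lagrange_basis(xi, k, pos) for k in range(len(xi)))
```

A quick sanity check is that L_k(xi[k]) = 1 and L_k(xi[j]) = 0 for j != k; the hardcoded fd_order == 4 expressions flagged above fail this property at three of the five nodes.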


Bug — f_advance_step accesses optional arguments without present() guards
src/simulation/m_bubbles.fpp (inside the select case (lag_vel_model) block)

The case default branch unconditionally writes to fVel(l) and fPos(l), which are optional dummy arguments absent in the EE call site (m_bubbles_EE.fpp):

case default
    fVel(l) = fVel(l)   ! not present for EE path — undefined behavior / crash
    fPos(l) = fPos(l)   ! same

This will crash (or silently corrupt memory) for any Euler-Euler simulation using adap_dt = T. Either add if (present(fVel)) guards or remove the dead no-op default case.


Bug — s_mpi_reduce_int_sum leaves output uninitialized in non-MPI builds
src/common/m_mpi_common.fpp

The new subroutine only assigns sum inside #ifdef MFC_MPI. For CPU-only serial builds the argument is never set:

#ifdef MFC_MPI
    call MPI_REDUCE(var_loc, sum, ...)
#endif
! No #else — 'sum' is undefined for serial builds

Add #else / sum = var_loc / #endif to handle the non-MPI path.


Hardcoded physical constants in hcid=304/305 cases
src/common/include/3dHardcodedIC.fpp (~lines 608–629)

The density ratios 950._wp/1000._wp and 1._wp/950._wp are hardcoded for a specific fluid pair and will produce wrong results for any other material. These should be derived from user parameters (e.g., normFac/normMag) or at minimum documented as simulation-specific stubs that must be modified.


Example case files: _v / _g subscript mismatch
examples/2D_moving_lag_bubs/case.py lines 141–150; examples/3D_moving_lag_particles/case.py lines 353–362

"bub_pp%gam_v": gamma_g,   # gamma_g assigned to gam_v
"bub_pp%gam_g": gamma_v,   # gamma_v assigned to gam_g
"bub_pp%M_v":   MW_g,      # MW_g → M_v
"bub_pp%M_g":   MW_v,      # MW_v → M_g
"bub_pp%k_v":   k_g * ..., # k_g → k_v
"bub_pp%k_g":   k_v * ...

Every vapor/gas subscript pair is swapped relative to the variable names. Please confirm whether this matches the convention used in the existing 3D_lagrange_shbubcollapse example (which uses the same pattern and is working), or whether this is a copy-paste error with physical consequences.


Binary file committed to examples directory
examples/3D_moving_lag_particles/sum.png — This PNG appears to be a scratch artifact. Example directories should contain only case.py.


Minor

  • The no-op case default body in f_advance_step (fVel(l) = fVel(l)) is dead code beyond the safety issue above and can be removed entirely.
  • lag_params%T0, rho0, c0, x0, Thost were removed from descriptions.py but remain in definitions.py; confirm whether these are intentionally retained without descriptions or should also be removed.
  • examples/3D_moving_lag_particles/case.py sets "dt": 1.0 alongside "adap_dt": "T" — the fixed dt is unused and may confuse readers; consider removing or commenting it.

🤖 Generated with Claude Code
