Stokes_Constrained: parallel-correctness residuals (partition-dependent [p,λ] gauge + constraint-assembly drift + lossy knockout) #254

@lmoresi

Summary

While root-causing the #248 Zhong-benchmark velocity catastrophe (now fixed by #242; the cause was a broken Lower/Internal label in SphericalShellInternalBoundary), three smaller but real parallel-correctness issues in Stokes_Constrained (in-saddle multiplier free-slip) were isolated using reliable integral diagnostics. They are tracked here for a focused fix.

Method note: point evaluation (uw.function.evaluate) is unreliable for comparing serial and parallel fields here: it gave pointwise noise of 134% of |v| while the volume L2 norm differed by only 0.4%. Use partition-independent integrals only.
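A minimal sketch of why an integral diagnostic suits this comparison, in plain Python with no Underworld calls. The quadrature weights and field values are made-up stand-ins, and `partial_sums` is a hypothetical helper that plays the role of a per-rank local integral followed by a global reduction.

```python
import math

def partial_sums(weights, values, nparts):
    """Split quadrature points into nparts contiguous chunks (stand-ins for
    MPI ranks) and return each chunk's contribution to the integral of values**2."""
    n = len(weights)
    bounds = [round(k * n / nparts) for k in range(nparts + 1)]
    return [
        math.fsum(w * v * v for w, v in zip(weights[a:b], values[a:b]))
        for a, b in zip(bounds[:-1], bounds[1:])
    ]

def l2_norm(weights, values, nparts):
    """Volume L2 norm: square root of the globally reduced sum of local integrals."""
    return math.sqrt(math.fsum(partial_sums(weights, values, nparts)))

# Hypothetical quadrature data: 1000 points with unequal positive weights.
weights = [1.0 + 0.5 * math.sin(i) for i in range(1000)]
values = [math.cos(0.01 * i) for i in range(1000)]

serial = l2_norm(weights, values, 1)    # one "rank"
parallel = l2_norm(weights, values, 8)  # eight "ranks"
rel_diff = abs(serial - parallel) / serial
print(f"serial={serial:.12e} parallel={parallel:.12e} rel_diff={rel_diff:.1e}")
```

For a fixed field, the reduced value depends on the partition only through floating-point summation order, so a percent-level serial vs np=8 difference in such a quantity points at the solution itself rather than at sampling noise.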

Ruled out: (a) a velocity rigid-rotation nullspace (angular momentum ∫r×v dV is reproducible to ~1e-5 between serial and np=8); (b) iterative under-convergence (serial and np=8 are each bit-identical at tol 1e-7 vs 1e-11, so both fully converge, but to different answers).
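The angular-momentum check can be sketched as a weighted sum of r×v over quadrature points. This is an illustration in plain Python, not the repro script's code; the points, weights and the helper names `cross` and `angular_momentum` are assumptions. A rigid rotation v = ω×r produces a clearly nonzero result, which is why a reproducible value argues against a drifting rotation mode.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def angular_momentum(points, velocities, weights):
    """Discrete form of the integral of r x v dV: sum of w_i * (r_i x v_i)."""
    total = [0.0, 0.0, 0.0]
    for r, v, w in zip(points, velocities, weights):
        c = cross(r, v)
        for k in range(3):
            total[k] += w * c[k]
    return tuple(total)

# Hypothetical quadrature points and weights (not from the benchmark mesh).
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0),
          (0.0, -1.0, 0.0), (0.0, 0.0, 1.0)]
weights = [0.25] * 5

# Rigid rotation about z with angular speed 2: v = omega x r.
omega = (0.0, 0.0, 2.0)
velocities = [cross(omega, r) for r in points]

L = angular_momentum(points, velocities, weights)
print(L)
```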

1. [p,λ] / topography gauge is partition-inconsistent (the nullspace bug)

The combined constant-(pressure, multiplier) gauge (_build_block_gauge_nullspace_vector) is a true velocity null mode, but its level is partition-dependent: mean pressure = 3.333 (serial) vs 2.983 (np=8), a 10.5% difference. Because topography is read from the multiplier, this corrupts topography across ranks unless the mean is stripped. set_pressure_gauge("Upper", 0.0) (#250, global BdIntegral callback) pins it consistently: the serial vs np=8 difference drops to 0.4%, and velocity is bit-identical with or without the pin.
Fix: make the gauge partition-reproducible by construction (default-on pin, or enforce topography(reference="mean")); add a parallel regression asserting that meanP/topography match serial to ~1e-6. Highest priority: this affects topography correctness, and the fix already exists.
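A minimal sketch of the mean-stripping idea, in plain Python rather than the Underworld API. The multiplier samples and surface weights are invented, `weighted_mean` and `strip_mean` are hypothetical helpers, and the constant offset reuses the 3.333 vs 2.983 mean-pressure gap purely as an illustrative number. It assumes the two runs differ only by a constant gauge level.

```python
import math

def weighted_mean(field, weights):
    """Surface mean of a field from quadrature weights (integral / area)."""
    return math.fsum(w * f for f, w in zip(field, weights)) / math.fsum(weights)

def strip_mean(field, weights):
    """Remove the weighted mean so the result has zero surface average."""
    mean = weighted_mean(field, weights)
    return [f - mean for f in field]

# Hypothetical boundary multiplier samples and surface quadrature weights.
weights = [0.2, 0.3, 0.1, 0.4]
lam_serial = [1.0, -0.5, 2.0, 0.25]

# Assumption: the parallel run differs only by a constant gauge offset.
gauge_shift = 3.333 - 2.983
lam_parallel = [x - gauge_shift for x in lam_serial]

topo_serial = strip_mean(lam_serial, weights)
topo_parallel = strip_mean(lam_parallel, weights)
max_diff = max(abs(a - b) for a, b in zip(topo_serial, topo_parallel))
print(f"max difference after mean removal: {max_diff:.1e}")
```

Mean removal cancels only the constant part of the mismatch. Any remaining serial vs parallel difference, such as the ~0.4% reported with the gauge pinned, has to come from elsewhere.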

2. Residual ~0.4% velocity partition-dependence in constraint/augmentation assembly (separate, NOT a nullspace)

With the gauge pinned, velocity L2/|Ut|/|Ub| still differ by 0.36/0.37/0.41% between serial and np=8 (np=8 is bit-identical with and without the pin). Since both runs are fully converged, the assembled operator/RHS must differ by ~0.4% with partition; the locus is the add_constraint_bc / augmented-Lagrangian (augmentation_base=1e4) boundary assembly, not a nullspace. The effect is 3D-specific: 2D test_1063 is bit-identical to 1e-9, and boundary areas match exactly, so the suspect is the v·n / augmentation term at inter-rank seams on the 2-D free-slip surface.
Fix: find and fix the partition-dependent constraint/augmentation assembly. The goal is a converged velocity that is partition-independent to round-off.

3. Interior-multiplier knockout is lossy (docstring says "lossless")

_constrain_interior_multipliers_in_section (gated by _reduce_interior_multiplier, default True) pins interior multiplier DOFs. Disabling it changes the serial answer by 2% (L2/Ut) to 5% (Ub), so the pinned DOFs are not inert. The DOF count is partition-identical (5170 at np=1/2/4/8), so this is not a gross loss of DOFs. (Turning the knockout off makes the parallel spread worse, 1.7% at the CMB, so this is distinct from item 2 above and consistent with a block-error trade-off.)
Fix: determine whether the boundary-trace closure drops DOFs that matter (or whether "inert interior multiplier" is the wrong model); then fix or re-document. Also: monolithic MUMPS LU on the constrained system gives wrong physics (|Ut|=4e-3 vs the validated 1e-2) and segfaults at np=8; capture as known-bad or fix.

Reproduction

~/+Simulations/repro_248_internal_load.py (lifts the test_1064 Zhong setup; reports volume L2, |Ut|/|Ub|, angular momentum and meanP, all partition-independent integral diagnostics). Args: <method> <write|check> <solver> <tol> <knockout on|off> <gauge none|pin>. Build and run in the amr-dev pixi env; parallel runs need --with-mpi.
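A write/check comparison of this kind might look like the sketch below. It is not the repro script: `write_reference`, `check_against_reference`, the JSON layout and the L2 value are hypothetical, and only the two meanP numbers come from the measurements above.

```python
import json
import math
import os
import tempfile

def write_reference(path, diagnostics):
    """Store the reference (e.g. serial) diagnostics as JSON."""
    with open(path, "w") as fh:
        json.dump(diagnostics, fh, indent=2, sort_keys=True)

def check_against_reference(path, diagnostics, rel_tol=1e-6):
    """Return {name: relative difference} for every diagnostic outside rel_tol."""
    with open(path) as fh:
        reference = json.load(fh)
    failures = {}
    for name, ref_val in reference.items():
        val = diagnostics[name]
        if not math.isclose(val, ref_val, rel_tol=rel_tol, abs_tol=0.0):
            failures[name] = (val - ref_val) / ref_val
    return failures

# meanP values are the serial and np=8 numbers quoted above; L2 is a placeholder.
serial = {"meanP": 3.333, "L2": 1.0}
parallel = {"meanP": 2.983, "L2": 1.0}

with tempfile.TemporaryDirectory() as tmp:
    ref_path = os.path.join(tmp, "reference.json")
    write_reference(ref_path, serial)                      # "write" mode
    failures = check_against_reference(ref_path, parallel)  # "check" mode

for name, rel in failures.items():
    print(f"{name}: {100 * rel:+.1f}% vs reference")
```

With the unpinned gauge numbers, only meanP is flagged, at about -10.5%. The same check with rel_tol=1e-6 would serve as the parallel regression proposed in item 1.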

Related: #248 (velocity half resolved by #242; topography half open), #244 (constrained slow in parallel).

cc @gthyagi

Underworld development team with AI support from Claude Code

Labels: bug
