**Steps 2-3**: Solve the wave equation and compute the functional.

We create a ``ReducedFunctional`` for each source, which for our case means one per ensemble member. By creating a ``ReducedFunctional`` per component that we are parallelising over (i.e. per source), rather than one per ensemble member, we can change the ensemble parallel partition with minimal changes to the code::
    from firedrake.adjoint import *
    my_ensemble)
    continue_annotation()
    J_val = 0.0
    with set_working_tape() as tape:
Why is with set_working_tape() as tape: better here?
It makes it really clear which bit of the code is being recorded on that tape, and it makes sure that the ReducedFunctional for each
dham left a comment
Needs a manual section in the ensemble parallelism chapter to explain the overall mathematical model for ensemble parallel.
Split the classes for the two cases and introduce them in a way that avoids an immediate backward-incompatible change.
from firedrake.adjoint_utils.checkpointing import disk_checkpointing
class EnsembleAdjVec(OverloadedType):
Why not EnsembleAdjFloat?
    # Adjoint action is a reduction so we just piggyback.
    # Possibly don't do this if we're being created by the
    # reduction rf to avoid infinite recursion.
    if not _only_forward:
Could this be avoided if you make self._reduce into a cached_property and hence lazily evaluated?
    return self.derivative(hessian_input, apply_riesz=apply_riesz)
class EnsembleTransformReducedFunctional(AbstractReducedFunctional):
This name is a bit confusing to me as it makes it seem very generic. It could be equivalent to EnsembleReducedFunctional or it could be an ABC for EnsembleAllgatherReducedFunctional etc.
Maybe something like EnsembleNoReduceReducedFunctional (lol)? EnsemblePipelineReducedFunctional? EnsemblePassthroughReducedFunctional?
Previous implementation
The `EnsembleReducedFunctional` implements a functional with many independent terms all depending on the same control, which are calculated in parallel over the ensemble:

$$J(m) = \sum_i J_i(m)$$

This is three operations composed together: broadcast the control $m$ to each ensemble member (Bcast), evaluate each $J_i$ locally (Transform), and sum the results (Reduce).

It also had limited support for using distributed controls, i.e. a different $m_i$ for each $J_i$:

$$J(m_1, \dots, m_n) = \sum_i J_i(m_i)$$

This is two operations composed together: Transform and Reduce (no broadcast is needed, since each member owns its own control).
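As a rough illustration of this mathematical model, here is a plain-Python sketch of the composed operations. No Firedrake or MPI is used; `bcast`, `transform` and `reduce_sum` are stand-in names for the operations described above, not the real classes, and the quadratic terms are invented for the example.

```python
# Plain-Python stand-ins for the three ensemble operations. In Firedrake
# these act on EnsembleFunction/AdjFloat over an Ensemble communicator;
# here they are ordinary lists and floats.

def bcast(m, n):
    """Copy the single control m to every ensemble member."""
    return [m] * n

def transform(ms, local_terms):
    """Evaluate each independent term J_i on its own control m_i."""
    return [J_i(m_i) for J_i, m_i in zip(local_terms, ms)]

def reduce_sum(Js):
    """Sum the per-member values into the global functional."""
    return sum(Js)

# Hypothetical local terms J_i(m) = a_i * m**2 (a=a pins each coefficient).
terms = [lambda m, a=a: a * m ** 2 for a in (1, 2, 3)]

def J_single(m):
    # Single shared control: Bcast -> Transform -> Reduce.
    return reduce_sum(transform(bcast(m, len(terms)), terms))

def J_distributed(ms):
    # Distributed controls: only Transform -> Reduce.
    return reduce_sum(transform(ms, terms))

print(J_single(2.0))                   # (1 + 2 + 3) * 2**2 = 24.0
print(J_distributed([1.0, 2.0, 3.0]))  # 1*1 + 2*4 + 3*9 = 36.0
```

Note that the distributed-control variant simply drops the Bcast stage, which is the structural difference between the two cases described above.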
Issues with previous implementation
In the distributed controls case the `Control`s were `Function`s on each spatial comm and were not collective over the global comm. This breaks the `ReducedFunctional` contract and meant that this version would fail the Taylor test (e.g. the xfailed tests in `firedrake/tests/firedrake/adjoint/test_ensemble_reduced_functional.py`, lines 99 to 101 in a101140).

In the case with a single control $m$, the user had to implement the local part of the sum (i.e. manually sum the $J_i$ on the local rank themselves).
What does this PR do?
This PR splits the `EnsembleReducedFunctional` into separate `ReducedFunctional` classes for the Bcast, Transform, and Reduce operations. `EnsembleReducedFunctional` is then re-implemented in terms of these operations.

These all use `EnsembleFunction` as the control and/or functional as appropriate, so we get collective behaviour and the Taylor tests for a distributed control pass (and we can use them in optimisers).

It also implements an additional `OverloadedType` called `EnsembleAdjVec`, which is a distributed vector of `AdjFloat`. It is to `AdjFloat` what `EnsembleFunction` is to `Function`.
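One way to see why a distributed vector of scalar adjoint values is needed: the adjoint of a broadcast is a sum-reduction, and the adjoint of a sum-reduction is a broadcast, so the reverse sweep through the Bcast/Reduce stages carries one scalar adjoint per ensemble member. A pure-Python sketch (the function names here are illustrative stand-ins, not the Firedrake API) checking the adjoint identity $\langle Av, w\rangle = \langle v, A^*w\rangle$ for the broadcast:

```python
# Forward broadcast and its adjoint (a sum-reduction), and forward
# sum-reduction and its adjoint (a broadcast). Pure-Python stand-ins.

def bcast_fwd(m, n):
    """Copy one control value to n ensemble members."""
    return [m] * n

def bcast_adj(vbars):
    """Adjoint of 'copy to n members' is 'sum the n adjoint values'."""
    return sum(vbars)

def reduce_fwd(vs):
    """Sum n per-member values into one."""
    return sum(vs)

def reduce_adj(jbar, n):
    """Adjoint of 'sum n values' is 'copy the adjoint value to n members'."""
    return [jbar] * n

def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

n = 4
m, vbars = 3.0, [1.0, 2.0, 3.0, 4.0]
# Adjoint identity for Bcast: <Bcast(m), vbar> == m * Bcast_adj(vbar)
lhs = dot(bcast_fwd(m, n), vbars)
rhs = m * bcast_adj(vbars)
print(lhs == rhs)  # True
```

This is also the sense in which the diff comment above says the adjoint action "is a reduction so we just piggyback".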
To use distributed controls a user will need to create an `EnsembleFunctionSpace` in order to use an `EnsembleFunction` as the control. But in return you get the collective behaviour.

A significant API change for both individual and collective controls is that the $J_i$ on each spatial comm are now each passed as a separate `ReducedFunctional`, rather than taping all the $J_i$ and the local reduction on one tape.
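The per-term pattern can be mimicked in plain Python. The class names `MiniRF` and `EnsembleSumRF` below are hypothetical stand-ins, not the new Firedrake API: each $J_i$ becomes its own reduced functional, and the ensemble-level object owns the reduction, including in the derivative, so the user never sums terms by hand.

```python
class MiniRF:
    """Hypothetical stand-in for one per-term ReducedFunctional,
    with a hand-coded derivative for the example."""

    def __init__(self, f, df):
        self.f, self.df = f, df

    def __call__(self, m):
        return self.f(m)

    def derivative(self, m):
        return self.df(m)


class EnsembleSumRF:
    """Hypothetical stand-in for the composed Bcast -> Transform -> Reduce
    functional built from one ReducedFunctional per term."""

    def __init__(self, rfs):
        self.rfs = rfs

    def __call__(self, m):
        # Bcast m to every term, evaluate, then reduce by summation.
        return sum(rf(m) for rf in self.rfs)

    def derivative(self, m):
        # Adjoint of Bcast is a sum, so the per-term gradients are reduced.
        return sum(rf.derivative(m) for rf in self.rfs)


# Example terms J_i(m) = a_i * m**2 with dJ_i/dm = 2 * a_i * m.
rfs = [MiniRF(lambda m, a=a: a * m ** 2, lambda m, a=a: 2 * a * m)
       for a in (1, 2, 3)]
J = EnsembleSumRF(rfs)
print(J(2.0), J.derivative(2.0))  # 24.0 24.0
```

In this sketch the local reduction lives in `EnsembleSumRF` rather than on the user's tape, mirroring the API change described above.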