diff --git a/docs/src/dev/constraints.md b/docs/src/dev/constraints.md new file mode 100644 index 00000000..5b1e9c84 --- /dev/null +++ b/docs/src/dev/constraints.md @@ -0,0 +1,83 @@ + +# Distributed Constraints + +This is a framework for the implementation of distributed constraints. To reduce the number of communications, we rely on one core assumption: + +> If a slave DoF is **owned** by a process, all its master DoFs are **local** to that process (owned or ghost). + +We later will discuss what steps require this invariant, and how to get around it if it cannot be satisfied. + +Hanging node constraints, periodicity constraints, and AgFEM constraints can all be implemented within this framework (provided the local triangulation is correctly built, see the AgFEM note for more details). + +We will now describe the algorithm at a high-level for both the single-constraint-source and multiple-constraint-source cases, then describe each shared step in more detail. + +## Case 1: single constraint source + +This is the conceptually simple case: Our assumption implies that each process can fully resolve its owned constraints. Thus, we can proceed as follows: + +- Provided a globally consistent `DOF_to_is_slave` array, we build (in a single communication step) three global communicators: A slave global numbering (sDOF_gids), a free master global numbering (mfdof_gids), and a Dirichlet master global numbering (mddof_gids). +- Given sDOF_gids, we call a user-provided callback that allocated all LOCAL constraints and fills up all OWNED constraints. This requires no communication, since all masters of owned slaves are local by assumption. +- At this point, we have correct OWNED constraints. When then proceed to communicate the GHOST constraints from their owners to the processes that hold them as ghosts (one round of nearest-neighbor communication, similar to consistent!). +- After this, we have a complete set of globally consistent constraints in DOF local numbering. We can then reindex them to mdof numbering (which is what the space constructor expects). + +## Case 2: multiple constraint sources + +This case is a bit more complicated. The key concept is the following: + +- Because our assumption, we can merge the OWNED constraints locally without communication. +- However, the merged set of OWNED constraints CANNOT be closed (chain resolution) before making it consistent accros processors. Otherwise, a local owned slave might be constrained by a local ghost slave whose masters are not yet available locally, so the chain resolution would fail (or would be incomplete). +- On the other hand, once the merget set of constraints is made consistent, the chain resolution can be done locally without communication, since every process has a complete local view of the constraint DAG restricted to its local DoFs (owned + ghost + fictitious). + +The process has to be "locally merge, then consistent, then locally close". +Thus, we can proceed as follows: + +- We now provide a globally consistent `DOF_to_constraint` array, which assigns a constraint source to each DOF (0 for unconstrained). Then `DOF_to_is_slave` is given by `DOF_to_constraint .> 0`. We build the same three global communicators as in Case 1, that is a slave global numbering (sDOF_gids), a free master global numbering (mfdof_gids), and a Dirichlet master global numbering (mddof_gids). Note that these are the final global communicators, shared by all constraint sources, not one per source. +- For each constraint source, we extract the subset of slaves belonging to that source and build the corresponding sub-communicator `csDOF_gids`. This communicator is then provided to the source's callback, which fills up the OWNED constraints for that source. +- We proceed to merging the OWNED constraints from all sources locally, without communication. This may create chains of master-slave relationships across sources. All these masters are local. +- We then perform the same communication step as in Case 1 to make the merged set of constraints consistent across processes. +- At this point, we have a complete, consistent local view of the merged constraint DAG on each process. We can then run `close!` locally on each process to resolve all chains, without any further communication. +- Finally, we reindex to mdof numbering as in Case 1. + +## Building-block overview + +### Building the global communicators + +Currently, we do this in a very similar way to how we build DOF communicators for generic spaces. That is, we use an available `cell_gids` communicator together with a `cell_to_DOFs` table. The key thing here is that the slave, free master, and Dirichlet master split creates a coloring of the local DOFs (each DOF has only one color), so all three communicators can be built with a single cell-based table communication step. + +This step is the one that strictly requires the assumption at the beginning. We cannot account for DOFs that do not belong to the local spaces because they do not belong to the local triangulation. + +### Consistent constraints + +To obtain consistent constraints accross processes, we need to have a communicator for the slaves (i.e `sDOF_gids`) and the local constraint tables `sDOF_to_DOFs` and `sDOF_to_coeffs` filled with the OWNED constraints. + +Additionally, we need some kind of globally consistent key to be able to communicate master DOFs unambiguously across processes. +To this end, we create a single global numbering of all local DOFs (`DOF_gids`) built as a (local and global) concatenation of the three global communicators (slaves first, then free masters, then Dirichlet masters). Each process can compute this numbering locally without communication. + +The procedure is then simple: Map the DOFs to global ids, communicate using `consistent!` to overwrite ghost slave rows with their owners' definitions, then map back to local DOF ids. + +After the communication step, we have a consistent set of constraints. However, some ghost constraints may reference master DOFs that do not belong to the local index space. We can these `fictitious` or `remote` DOFs. This means that when mapping back to local DOF ids, we need to keep track and assign new local ids to these fictitious masters, so that they can be referenced in the local constraint tables for chain resolution. + +### Reindexing to mDOF numbering + +After consistency, we have a complete set of constraints in DOF local numbering. We then need to reindex them to mDOF numbering, which is what the space constructor expects. This is a purely local step, no communication needed. This reindexing also extends the global communicators for free and Dirichlet masters to cover any fictitious masters that were introduced in the consistency step. + +This reindexing and extension do not require any communication. + +### Local merging of constraints + +Merging of constraints from multiple sources is purely local, since we are only merging the OWNED constraints and our assumption guarantees that all masters of owned slaves are local. + +The main thing to lookout for is that (of course) all constraints must be defined on the same local index space. +This is also where the assumption is needed: If we allow for remote masters in the owned constraints, the local numbering of these remote masters might be different within each source. + +Thus, to remove this assumption we would need to first unify the local index space across sources. This means we would need a global numbering of the original DOFs to be able to identify and match remote masters accross sources. + +### Chain resolution (closing) and topological ordering + +`close!` can now be run **independently and locally on each process**, with no further communication. The only thing we need is a **global topological ordering** of the DAGs. What I mean by this: + +- Imagine we have a slave `A` that is constraines by two other slaves `B` and `C`, such that `B` and `C` do not depend on each other. +- Because `A` depends on `B` (or `C`), `B` will be resolved BEFORE `A` in ALL processes (since local DAGs are consistent). So this relationship is unambiguous and naturally global. +- However, since `B` and `C` do not depend on each other, their ordering is not fixed by the DAG structure. Then two different processes might resolve `B` before `C` and the other way around, which would lead to different final constraints after closing. + +This second case is the one where we need a global tie-breaking rule to ensure that all processes make the same choice. In practice, this is very simple: We just need to provode a globally-consistent key (e.g. global DoF id) to be used within a priority queue to create the topological ordering. This way, all processes will break ties in the same way and end up with the same closed constraints. diff --git a/src/Constraints.jl b/src/Constraints.jl new file mode 100644 index 00000000..5e3e1465 --- /dev/null +++ b/src/Constraints.jl @@ -0,0 +1,459 @@ +############################# +## OPTION 1: Single-constraint set, generated from local spaces and local masters. + +# This function REQUIRES that : +# - all `DOFs` are LOCAL (i.e. no masters on other processors) +# - The OWNED constraints are filled in, GHOST constraints are allocated (but will be thrown away) +# +# The reason we require this is because we rely on the local space cell_to_dofs to generate +# the global constraint and master gids. +function generate_distributed_constraints( + cell_gids::PRange, spaces::AbstractArray{<:FESpace}, + _sDOF_to_dof, sDOF_to_DOFs, sDOF_to_coeffs +) + dof_is_slave = map(spaces, _sDOF_to_dof) do space, _sDOF_to_dof + dof_is_slave = fill(false, num_free_dofs(space)) + dof_is_slave[_sDOF_to_dof] .= true + return dof_is_slave + end + + sDOF_gids, mfdof_gids, mddof_gids, sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof = + generate_constraint_gids(cell_gids, spaces, dof_is_slave) + + map(sDOF_to_DOF, _sDOF_to_dof, sDOF_to_DOFs, sDOF_to_coeffs) do sDOF_to_DOF, _sDOF_to_dof, sDOF_to_DOFs, sDOF_to_coeffs + @notimplementedif sDOF_to_DOF != _sDOF_to_dof """ + TODO: Apply permutation to tables + """ + end + + return generate_distributed_constraints( + sDOF_gids, mfdof_gids, mddof_gids, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, sDOF_to_coeffs + ) +end + +# Same as above, but with a callback to generate the master dofs and coefficients. +# What we callback allows you to do is to wait until after the global constraint gids have been generated. +function generate_distributed_constraints( + cell_gids::PRange, spaces::AbstractArray{<:FESpace}, callback::Function, dof_is_slave +) + sDOF_gids, mfdof_gids, mddof_gids, sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof = + generate_constraint_gids(cell_gids, spaces, dof_is_slave) + + sDOF_to_DOFs, sDOF_to_coeffs = callback(sDOF_to_DOF, sDOF_gids) + + return generate_distributed_constraints( + sDOF_gids, mfdof_gids, mddof_gids, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, sDOF_to_coeffs + ) +end + +function generate_distributed_constraints( + sDOF_gids::PRange, mfdof_gids::PRange, mddof_gids::PRange, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, sDOF_to_coeffs +) + + DOF_gids, offsets = concatenate_constraint_gids( + sDOF_gids, mfdof_gids, mddof_gids, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF + ) + new_DOFs = consistent_constraints!( + sDOF_gids, DOF_gids, sDOF_to_DOFs, sDOF_to_coeffs + ) + + new_mfdof_gids, new_mddof_gids, mDOF_to_dof, sDOF_to_dof = reindex_constraints!( + sDOF_gids, mfdof_gids, mddof_gids, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, new_DOFs, offsets + ) + + return sDOF_gids, new_mfdof_gids, new_mddof_gids, mDOF_to_dof, sDOF_to_dof, sDOF_to_DOFs, sDOF_to_coeffs +end + +########################################################################################### +########################################################################################### +########################################################################################### +## OPTION 2: Multiple-constraint sets, generated from generated from local spaces and local masters. +# See `/docs/src/dev/constraints.md` for the rationale behind this implementation. + +# In the below implementation, `dof_to_constraint` is an array that, for each local dof, +# gives the constraint set it belongs to (0 for unconstrained dofs). +# In particular we have `dof_is_slave = dof_to_constraint .> 0`. +# +# This also means that only ONE constraint can be applied to each DOF +# (i.e. we cannot arbitrarily combine constraints from different sets on the same DOF). +# Also, this means that, for each constraint set, +# `` c_callback(csDOF_to_DOF, csDOF_gids) `` +# might not asked for ALL possible slaves in that set, but only a subset of them +# (e.g. only the ones that have been selected). The callbacks need to account for that. +function generate_distributed_constraints( + cell_gids::PRange, spaces::AbstractArray{<:FESpace}, callback::Tuple, dof_to_constraint +) + + # Create constraint gids + dof_is_slave = map(x -> x .> 0, dof_to_constraint) + sDOF_gids, mfdof_gids, mddof_gids, sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof = + generate_constraint_gids(cell_gids, spaces, dof_is_slave) + + # Generate partial constraints and merge them into (inconsistent but complete) slave tables + nc = length(callback) + sDOF_to_c = map(getindex, dof_to_constraint, sDOF_to_DOF) + c_to_csDOF_gids, sDOF_to_csDOF = split_gids_by_color(sDOF_gids, sDOF_to_c, nc) + + c_to_csDOF_to_DOF = map(sDOF_to_DOF, sDOF_to_c) do sDOF_to_DOF, sDOF_to_c + ntuple(nc) do c + sDOF_to_DOF[findall(==(c), sDOF_to_c)] + end + end |> tuple_of_arrays + + partials = map(x -> Vector{Any}(undef, nc), partition(cell_gids)) + for (c, (cb, csDOF_to_DOF, csDOF_gids)) in enumerate(zip(callback, c_to_csDOF_to_DOF, c_to_csDOF_gids)) + csDOF_to_DOFs, csDOF_to_coeffs = cb(csDOF_to_DOF, csDOF_gids) + map(partials, csDOF_to_DOFs, csDOF_to_coeffs) do partials, csDOF_to_DOFs, csDOF_to_coeffs + partials[c] = (csDOF_to_DOFs, csDOF_to_coeffs) + end + end + + sDOF_to_DOFs, sDOF_to_coeffs = map(sDOF_to_c,sDOF_to_csDOF,partials) do sDOF_to_c, sDOF_to_csDOF, partials + c_to_csDOF_DOFs = map(first, partials) + sDOF_to_DOFs = Arrays.merge_entries(sDOF_to_c, sDOF_to_csDOF, c_to_csDOF_DOFs...) + c_to_csDOF_coeffs = map(last, partials) + sDOF_to_coeffs = Arrays.merge_entries(sDOF_to_c, sDOF_to_csDOF, c_to_csDOF_coeffs...) + return sDOF_to_DOFs, sDOF_to_coeffs + end |> tuple_of_arrays + + # Make constraints consistent + DOF_gids, offsets = concatenate_constraint_gids( + sDOF_gids, mfdof_gids, mddof_gids, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF + ) + new_DOFs = consistent_constraints!( + sDOF_gids, DOF_gids, sDOF_to_DOFs, sDOF_to_coeffs + ) + + # Close fully consistent constraint tables + sDOF_indices, sDOF_to_DOF, sDOF_to_DOFs, sDOF_to_coeffs = map( + spaces, new_DOFs, sDOF_to_DOF, sDOF_to_DOFs, sDOF_to_coeffs, partition(sDOF_gids) + ) do space, new_DOFs, sDOF_to_DOF, sDOF_to_DOFs, sDOF_to_coeffs, sDOF_ids + n_DOFs = num_free_dofs(space) + num_dirichlet_dofs(space) + length(new_DOFs) + new_sDOF_to_DOF, new_sDOF_to_DOFs, new_sDOF_to_coeffs, _ = FESpaces.close_slave_constraint_tables( + n_DOFs, n_DOFs, sDOF_to_DOF, sDOF_to_DOFs, sDOF_to_coeffs; keys = local_to_global(sDOF_ids) + ) + + # close! reindexes the constraints locally to follow the topological ordering of the DAG. + # We need to apply the same permutation to the sDOF_gids to keep consistency. + DOF_to_sDOF = find_inverse_index_map(sDOF_to_DOF, n_DOFs) + perm = DOF_to_sDOF[new_sDOF_to_DOF] + new_sDOF_ids = LocalIndices( + global_length(sDOF_ids), part_id(sDOF_ids), + local_to_global(sDOF_ids)[perm], local_to_owner(sDOF_ids)[perm] + ) + + return new_sDOF_ids, new_sDOF_to_DOF, new_sDOF_to_DOFs, new_sDOF_to_coeffs + end |> tuple_of_arrays + sDOF_gids = PRange(sDOF_indices) + + # Reindex DOF tables to signed mDOF indices, expand mfdof/mddof gids, etc.. + new_mfdof_gids, new_mddof_gids, mDOF_to_dof, sDOF_to_dof = reindex_constraints!( + sDOF_gids, mfdof_gids, mddof_gids, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, new_DOFs, offsets + ) + sDOF_to_mdofs = sDOF_to_DOFs # Has been reindexed! + + return sDOF_gids, new_mfdof_gids, new_mddof_gids, mDOF_to_dof, sDOF_to_dof, sDOF_to_mdofs, sDOF_to_coeffs +end + +########################################################################################### +########################################################################################### +########################################################################################### +# Helpers + +function generate_constraint_gids( + cell_gids::PRange, spaces::AbstractArray{<:FESpace}, dof_is_slave +) + cell_to_DOFs, DOF_is_slave, DOF_to_dof = map(spaces, dof_is_slave) do space, dof_is_slave + nfree = num_free_dofs(space) + ndir = num_dirichlet_dofs(space) + if iszero(ndir) + nlDOFs = nfree + cell_to_DOFs = get_cell_dof_ids(space) + DOF_is_slave = dof_is_slave + DOF_to_dof = collect(Int32(1):Int32(nlDOFs)) + else + nlDOFs = nfree + ndir + dof_reindex = PosNegReindex(Int32(1):Int32(nfree),Int32(nfree+1):Int32(nlDOFs)) + cell_to_DOFs = lazy_map(Broadcasting(dof_reindex),get_cell_dof_ids(space)) + DOF_is_slave = vcat(dof_is_slave, zeros(eltype(dof_is_slave),ndir)) + DOF_to_dof = vcat(Int32(1):Int32(nfree),-(Int32(1):Int32(ndir))) + end + return cell_to_DOFs, DOF_is_slave, DOF_to_dof + end |> tuple_of_arrays + + return generate_constraint_gids( + cell_gids, cell_to_DOFs, DOF_is_slave, DOF_to_dof + ) +end + +function generate_constraint_gids( + cell_gids::PRange, cell_to_DOFs::AbstractArray{<:Table}, DOF_is_slave, DOF_to_dof +) + # Create pos/neg local numberings + DOF_to_color = map(DOF_is_slave, DOF_to_dof) do DOF_is_slave, DOF_to_dof + DOF_to_color = zeros(Int8,length(DOF_is_slave)) + for DOF in eachindex(DOF_is_slave) + if !iszero(DOF_is_slave[DOF]) # Slave DOF (1) + DOF_to_color[DOF] = Int8(1) + else # Master DOF: free (2) / dirichlet (3) + DOF_to_color[DOF] = Int8(2) + Int8(DOF_to_dof[DOF] < 1) + end + end + return DOF_to_color + end + + # Generate global master and slave dof ids + gids, DOF_to_clid, color_to_clid_to_lid = generate_gids_by_color( + cell_gids, cell_to_DOFs, DOF_to_color, 3 + ) + sDOF_gids, mfdof_gids, mddof_gids = gids + + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF = map( + DOF_to_clid, color_to_clid_to_lid + ) do DOF_to_clid, color_to_clid_to_lid + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF = color_to_clid_to_lid + n_mfdofs = length(mfdof_to_DOF) + DOF_to_mDOF = zeros(Int32,length(DOF_to_clid)) + for (mfdof, DOF) in enumerate(mfdof_to_DOF) + DOF_to_mDOF[DOF] = mfdof + end + for (mddof, DOF) in enumerate(mddof_to_DOF) + DOF_to_mDOF[DOF] = n_mfdofs + mddof + end + return sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF + end |> tuple_of_arrays + + return sDOF_gids, mfdof_gids, mddof_gids, sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof +end + +function consistent_constraints!( + sDOF_gids::PRange, DOF_gids::PRange, sDOF_to_DOFs, sDOF_to_coeffs +) + + # Map to global dofs + map(sDOF_to_DOFs, partition(DOF_gids)) do sDOF_to_DOFs, DOF_ids + l2g = local_to_global(DOF_ids) + data = sDOF_to_DOFs.data + for k in eachindex(data) + iszero(data[k]) && continue + data[k] = l2g[data[k]] + end + end + + t1 = consistent!(PVector(map(jagged_array,sDOF_to_DOFs), partition(sDOF_gids))) + t2 = consistent!(PVector(map(jagged_array,sDOF_to_coeffs), partition(sDOF_gids))) + wait(t1) + + # Map to local dofs, gather external DOFs + new_DOFs = map(sDOF_to_DOFs, partition(DOF_gids), partition(sDOF_gids)) do sDOF_to_DOFs, DOF_ids, sDOF_ids + rank = part_id(DOF_ids) + n_DOF = length(DOF_ids) + ptrs = sDOF_to_DOFs.ptrs + data = sDOF_to_DOFs.data + g2l = global_to_local(DOF_ids) + new_DOFs = Dict{Int,Tuple{Int32,Int32}}() + for (sdof,owner) in enumerate(local_to_owner(sDOF_ids)) + for k in ptrs[sdof]:ptrs[sdof+1]-1 + @assert !iszero(data[k]) # All entries should be nonzero after communication + gid = data[k] + lid = g2l[gid] + if iszero(lid) # Remote DOF + lid, dof_owner = get!(new_DOFs,gid,(n_DOF+1,owner)) + @assert isequal(dof_owner, owner) && !isequal(owner, rank) + n_DOF += isequal(lid,n_DOF+1) # Only increment if new + end + data[k] = lid + end + end + return new_DOFs + end + + wait(t2) + return new_DOFs +end + +# Reindexes the master DOF entries in sDOF_to_DOFs (currently holding extended local DOF +# indices after consistent_constraints!) to signed local mDOF indices (positive = mfdof, +# negative = mddof). Also expands mfdof_gids and mddof_gids to include any fictitious +# master DOFs that arrived from other processors, and produces mDOF_to_dof / sDOF_to_dof. +# +# offsets = cumsum((0, global_length(sDOF_gids), global_length(mfdof_gids), global_length(mddof_gids))) +# new_DOFs: per-part Dict{Int,Tuple{Int32,Int32}} mapping DOF_global_gid → (DOF_lid, owner), +# built by consistent_constraints!. +function reindex_constraints!( + sDOF_gids::PRange, mfdof_gids::PRange, mddof_gids::PRange, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, new_DOFs, offsets +) + new_mfdof_indices, new_mddof_indices, mDOF_to_dof, sDOF_to_dof = map( + partition(mfdof_gids), partition(mddof_gids), + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, new_DOFs + ) do mfdof_ids, mddof_ids, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof, + sDOF_to_DOFs, new_DOFs + + n_DOFs = length(DOF_to_mDOF) + n_mfdofs = length(mfdof_to_DOF) + n_mddofs = length(mddof_to_DOF) + + # Classify each fictitious DOF as mfdof or mddof using its global DOF gid, + # assign it a new local mDOF lid, and record the DOF_lid → signed_mDOF_lid + # mapping needed for the sDOF_to_DOFs conversion below. + new_mfdof = Dict{Int,Tuple{Int32,Int32}}() + new_mddof = Dict{Int,Tuple{Int32,Int32}}() + DOF_lid_to_signed_mDOF = Dict{Int,Int32}() + n_new_mfdofs = 0 + n_new_mddofs = 0 + for (DOF_gid, (DOF_lid, owner)) in new_DOFs + if DOF_gid <= offsets[3] # mfdof: offsets[2] < DOF_gid <= offsets[3] + n_new_mfdofs += 1 + mfdof_gid = DOF_gid - offsets[2] + mfdof_lid = Int32(n_mfdofs + n_new_mfdofs) + new_mfdof[mfdof_gid] = (mfdof_lid, owner) + DOF_lid_to_signed_mDOF[DOF_lid] = mfdof_lid + else # mddof: offsets[3] < DOF_gid <= offsets[4] + n_new_mddofs += 1 + mddof_gid = DOF_gid - offsets[3] + mddof_lid = Int32(n_mddofs + n_new_mddofs) + new_mddof[mddof_gid] = (mddof_lid, owner) + DOF_lid_to_signed_mDOF[DOF_lid] = -mddof_lid + end + end + + # Reindex sDOF_to_DOFs.data in-place: local DOF indices → signed local mDOF indices + data = sDOF_to_DOFs.data + for k in eachindex(data) + DOF = data[k] + if DOF <= n_DOFs + mDOF = DOF_to_mDOF[DOF] + data[k] = (mDOF <= n_mfdofs) ? mDOF : -(mDOF - n_mfdofs) + else # fictitious DOF introduced by consistent_constraints! + data[k] = DOF_lid_to_signed_mDOF[DOF] + end + end + + # mDOF_to_dof layout: [orig mfdofs | new fictitious mfdofs (0) | orig mddofs | new fictitious mddofs (0)] + mDOF_to_dof = zeros(Int32, n_mfdofs + n_new_mfdofs + n_mddofs + n_new_mddofs) + for (mfdof, DOF) in enumerate(mfdof_to_DOF) + mDOF_to_dof[mfdof] = DOF_to_dof[DOF] + end + mddof_offset = n_mfdofs + n_new_mfdofs + for (mddof, DOF) in enumerate(mddof_to_DOF) + mDOF_to_dof[mddof_offset + mddof] = DOF_to_dof[DOF] + end + + sDOF_to_dof = DOF_to_dof[sDOF_to_DOF] + + new_mfdof_ids = expand_gids(mfdof_ids, new_mfdof) + new_mddof_ids = expand_gids(mddof_ids, new_mddof) + return new_mfdof_ids, new_mddof_ids, mDOF_to_dof, sDOF_to_dof + end |> tuple_of_arrays + + new_mfdof_gids = PRange(new_mfdof_indices) + new_mddof_gids = PRange(new_mddof_indices) + return new_mfdof_gids, new_mddof_gids, mDOF_to_dof, sDOF_to_dof +end + +# Can be replaced by union_ghost in PartitionedArrays v0.5 +function expand_gids(gids, new_gids) + rank = part_id(gids) + n_global = global_length(gids) + n_old = local_length(gids) + n_new = n_old + length(new_gids) + lid_to_gid = Vector{Int}(undef,n_new) + lid_to_owner = Vector{Int32}(undef,n_new) + lid_to_gid[1:n_old] .= local_to_global(gids) + lid_to_owner[1:n_old] .= local_to_owner(gids) + for (gid, (lid, owner)) in new_gids + lid_to_gid[lid] = gid + lid_to_owner[lid] = owner + end + return LocalIndices(n_global, rank, lid_to_gid, lid_to_owner) +end + +function concatenate_constraint_gids( + sDOF_gids::PRange, mfdof_gids::PRange, mddof_gids::PRange, + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF +) + function map_b_to_a!(a_l2g,a_l2o,b2a,b_ids,o) + @views a_l2g[b2a] .= local_to_global(b_ids) .+ o + @views a_l2o[b2a] .= local_to_owner(b_ids) + end + offsets = cumsum((0, map(length, (sDOF_gids, mfdof_gids, mddof_gids))...)) + DOF_gids_indices = map( + partition(sDOF_gids), partition(mfdof_gids), partition(mddof_gids), + sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF + ) do s_ids, mf_ids, md_ids, sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF + n_DOFs = length(sDOF_to_DOF) + length(mfdof_to_DOF) + length(mddof_to_DOF) + lid_to_gid = Vector{Int}(undef, n_DOFs) + lid_to_owner = Vector{Int32}(undef, n_DOFs) + map_b_to_a!(lid_to_gid, lid_to_owner, sDOF_to_DOF, s_ids, offsets[1]) + map_b_to_a!(lid_to_gid, lid_to_owner, mfdof_to_DOF, mf_ids, offsets[2]) + map_b_to_a!(lid_to_gid, lid_to_owner, mddof_to_DOF, md_ids, offsets[3]) + LocalIndices(offsets[end], part_id(s_ids), lid_to_gid, lid_to_owner) + end + return PRange(DOF_gids_indices), offsets +end + +########################################################################################### +########################################################################################### +########################################################################################### +## OPTION 3: Single-constraint set, generated from global space and (possibly) non-local masters. +# TODO: I am not 100% sure how to make this useful. In what kind of situations would +# we have non-local masters but no global numbering? It is weird. + +# This function allows for non-local masters, but REQUIRES a pre-existing global numbering +# for the original dofs. +# function generate_constraint_gids( +# DOF_gids::PRange, sDOF_to_DOF::AbstractArray{<:AbstractVector}, sDOF_to_DOFs, DOF_to_isdirichlet +# ) +# DOF_to_color = map( +# partition(DOF_gids), sDOF_to_DOF, sDOF_to_DOFs, DOF_to_isdirichlet +# ) do DOF_ids, sDOF_to_DOF, sDOF_to_DOFs, DOF_to_isdirichlet +# DOF_to_color = zeros(Int8, length(DOF_ids)) +# DOF_to_oDOF = local_to_own(DOF_ids) +# for (sDOF, DOF) in enumerate(sDOF_to_DOF) +# iszero(DOF_to_oDOF[DOF]) && continue # Ghost DOF, ignore +# DOF_to_color[DOF] = 1 +# for DOF2 in dataview(sDOF_to_DOFs, sDOF) +# DOF_to_color[DOF2] = 2 + Int8(DOF_to_isdirichlet[DOF2]) +# end +# end +# return DOF_to_color +# end +# wait(consistent!(PVector(DOF_to_color, partition(DOF_gids)))) +# +# sDOF_gids, mfdof_gids, mddof_gids = split_gids_by_color( +# DOF_gids, DOF_to_color, 3 +# ) +# +# mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF = map( +# DOF_to_color +# ) do DOF_to_clid, color_to_clid_to_lid +# sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF = color_to_clid_to_lid +# n_mfdofs = length(mfdof_to_DOF) +# DOF_to_mDOF = zeros(Int32,length(DOF_to_clid)) +# for (mfdof, DOF) in enumerate(mfdof_to_DOF) +# DOF_to_mDOF[DOF] = mfdof +# end +# for (mddof, DOF) in enumerate(mddof_to_DOF) +# DOF_to_mDOF[DOF] = n_mfdofs + mddof +# end +# return sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF +# end |> tuple_of_arrays +# +# return sDOF_gids, mfdof_gids, mddof_gids, sDOF_to_DOF, mfdof_to_DOF, mddof_to_DOF, DOF_to_mDOF, DOF_to_dof +# end diff --git a/src/FESpaces.jl b/src/FESpaces.jl index 01109490..a26b46b9 100644 --- a/src/FESpaces.jl +++ b/src/FESpaces.jl @@ -67,36 +67,44 @@ function FESpaces.gather_free_and_dirichlet_values(f::DistributedFESpace,cell_va return free_values, dirichlet_values end -function dof_wise_to_cell_wise!(cell_wise_vector,dof_wise_vector,cell_to_ldofs,cell_prange) - map(cell_wise_vector, - dof_wise_vector, - cell_to_ldofs, - partition(cell_prange)) do cwv,dwv,cell_to_ldofs,indices +function dof_wise_to_cell_wise!(cell_wise_vector,dof_wise_vector,cell_to_ldofs,cell_ids) + map(cell_wise_vector,dof_wise_vector,cell_to_ldofs,cell_ids) do cwv,dwv,cell_to_ldofs,cell_ids cache = array_cache(cell_to_ldofs) - ncells = length(cell_to_ldofs) - ptrs = cwv.ptrs - data = cwv.data - cell_own_to_local = own_to_local(indices) - for cell in cell_own_to_local + for cell in cell_ids ldofs = getindex!(cache,cell_to_ldofs,cell) - p = ptrs[cell]-1 + p = cwv.ptrs[cell]-1 for (i,ldof) in enumerate(ldofs) if ldof > 0 - data[i+p] = dwv[ldof] + cwv.data[i+p] = dwv[ldof] end end end end + return cell_wise_vector end -function cell_wise_to_dof_wise!(dof_wise_vector,cell_wise_vector,cell_to_ldofs,cell_range) - map(dof_wise_vector, - cell_wise_vector, - cell_to_ldofs, - partition(cell_range)) do dwv,cwv,cell_to_ldofs,indices +function posneg_wise_to_cell_wise!(cell_wise_vector,pos_wise_vector,neg_wise_vector,cell_to_posneg,cell_ids) + map(cell_wise_vector,pos_wise_vector,neg_wise_vector,cell_to_posneg,cell_ids) do cwv,pwv,nwv,cell_to_posneg,cell_ids + cache = array_cache(cell_to_posneg) + for cell in cell_ids + lids = getindex!(cache,cell_to_posneg,cell) + p = cwv.ptrs[cell]-1 + for (i,lid) in enumerate(lids) + if lid > 0 + cwv.data[i+p] = pwv[lid] + elseif lid < 0 + cwv.data[i+p] = nwv[-lid] + end + end + end + end + return cell_wise_vector +end + +function cell_wise_to_dof_wise!(dof_wise_vector,cell_wise_vector,cell_to_ldofs,cell_ids) + map(dof_wise_vector,cell_wise_vector,cell_to_ldofs,cell_ids) do dwv,cwv,cell_to_ldofs,cell_ids cache = array_cache(cell_to_ldofs) - cell_ghost_to_local = ghost_to_local(indices) - for cell in cell_ghost_to_local + for cell in cell_ids ldofs = getindex!(cache,cell_to_ldofs,cell) p = cwv.ptrs[cell]-1 for (i,ldof) in enumerate(ldofs) @@ -106,25 +114,42 @@ function cell_wise_to_dof_wise!(dof_wise_vector,cell_wise_vector,cell_to_ldofs,c end end end + return dof_wise_vector end -function dof_wise_to_cell_wise(dof_wise_vector,cell_to_ldofs,cell_prange) - cwv = map(cell_to_ldofs) do cell_to_ldofs - cache = array_cache(cell_to_ldofs) - ncells = length(cell_to_ldofs) - ptrs = Vector{Int32}(undef,ncells+1) - for cell in 1:ncells - ldofs = getindex!(cache,cell_to_ldofs,cell) - ptrs[cell+1] = length(ldofs) +function cell_wise_to_posneg_wise!(pos_wise_vector,neg_wise_vector,cell_wise_vector,cell_to_posneg,cell_ids) + map(pos_wise_vector,neg_wise_vector,cell_wise_vector,cell_to_posneg,cell_ids) do pwv,nwv,cwv,cell_to_posneg,cell_ids + cache = array_cache(cell_to_posneg) + for cell in cell_ids + lids = getindex!(cache,cell_to_posneg,cell) + p = cwv.ptrs[cell]-1 + for (i,lid) in enumerate(lids) + if lid > 0 + pwv[lid] = cwv.data[i+p] + elseif lid < 0 + nwv[-lid] = cwv.data[i+p] + end + end end - PArrays.length_to_ptrs!(ptrs) - ndata = ptrs[end]-1 - data = Vector{Int}(undef,ndata) - data .= -1 + end + return pos_wise_vector, neg_wise_vector +end + +function allocate_cell_wise_vector(T, cell_to_lids) + map(cell_to_lids) do cell_to_lids + ptrs = Arrays.generate_ptrs(cell_to_lids) + data = zeros(T,ptrs[end]-1) JaggedArray(data,ptrs) end - dof_wise_to_cell_wise!(cwv,dof_wise_vector,cell_to_ldofs,cell_prange) - cwv +end + +function dof_wise_to_cell_wise( + dof_wise_vector, cell_to_ldofs, cell_ids; + T = eltype(eltype(dof_wise_vector)) +) + cwv = allocate_cell_wise_vector(T,cell_to_ldofs) + dof_wise_to_cell_wise!(cwv,dof_wise_vector,cell_to_ldofs,cell_ids) + return cwv end function fetch_vector_ghost_values_cache(vector_partition,partition) @@ -136,131 +161,445 @@ function fetch_vector_ghost_values!(vector_partition,cache) assemble!((a,b)->b, vector_partition, cache) end +""" + generate_gids(cell_gids::PRange, cell_to_lids, nlids) -> PRange + +Given a set of cell global ids, a distributed array of local tables mapping cells to local dof ids, +and a distributed array with the number of local dofs in each partition, this function +generates the global dof ids and returns them as a `PRange` of `LocalIndices`. + +Ignores negative local dof ids (usually used for Dirichlet dofs). +""" function generate_gids( cell_range::PRange, cell_to_ldofs::AbstractArray{<:AbstractArray}, - nldofs::AbstractArray{<:Integer}) + nldofs::AbstractArray{<:Integer} +) + ranks = linear_indices(partition(cell_range)) + cell_ldofs_to_data = allocate_cell_wise_vector(Int, cell_to_ldofs) + cache_fetch = fetch_vector_ghost_values_cache(cell_ldofs_to_data,partition(cell_range)) # Find and count number owned dofs ldof_to_owner, nodofs = map(partition(cell_range),cell_to_ldofs,nldofs) do indices,cell_to_ldofs,nldofs ldof_to_owner = fill(Int32(0),nldofs) cache = array_cache(cell_to_ldofs) - lcell_to_owner = local_to_owner(indices) - for cell in 1:length(cell_to_ldofs) - owner = lcell_to_owner[cell] + for (cell, owner) in enumerate(local_to_owner(indices)) ldofs = getindex!(cache,cell_to_ldofs,cell) for ldof in ldofs - if ldof>0 - # TODO this simple approach concentrates dofs - # in the last part and creates imbalances + if ldof > 0 + # NOTE: this approach concentrates dofs in the last processor ldof_to_owner[ldof] = max(owner,ldof_to_owner[ldof]) end end end - me = part_id(indices) - nodofs = count(p->p==me,ldof_to_owner) + nodofs = count(isequal(part_id(indices)),ldof_to_owner) ldof_to_owner, nodofs end |> tuple_of_arrays - cell_ldofs_to_part = dof_wise_to_cell_wise(ldof_to_owner, - cell_to_ldofs, - cell_range) - - # Note1 : this call potentially updates cell_prange with the - # info required to exchange info among nearest neighbours - # so that it can be re-used in the future for other exchanges - - # Note2 : we need to call reverse() as the senders and receivers are - # swapped in the AssemblyCache of partition(cell_range) - - # Exchange the dof owners - cache_fetch=fetch_vector_ghost_values_cache(cell_ldofs_to_part,partition(cell_range)) - fetch_vector_ghost_values!(cell_ldofs_to_part,cache_fetch) |> wait - - cell_wise_to_dof_wise!(ldof_to_owner, - cell_ldofs_to_part, - cell_to_ldofs, - cell_range) - # Find the global range of owned dofs first_gdof = scan(+,nodofs,type=:exclusive,init=one(eltype(nodofs))) - - # Distribute gdofs to owned ones - ldof_to_gdof = map(first_gdof,ldof_to_owner,partition(cell_range)) do first_gdof,ldof_to_owner,indices - me = part_id(indices) + + # Exchange the dof owners. Cell owner always has correct dof owner. + cell_ldofs_to_owner = dof_wise_to_cell_wise!( + cell_ldofs_to_data,ldof_to_owner,cell_to_ldofs,own_to_local(cell_range) + ) + fetch_vector_ghost_values!(cell_ldofs_to_owner,cache_fetch) |> wait + cell_wise_to_dof_wise!( + ldof_to_owner,cell_ldofs_to_owner,cell_to_ldofs,ghost_to_local(cell_range) + ) + + # Fill owned gids + ldof_to_gdof = map(ranks,first_gdof,ldof_to_owner) do rank,first_gdof,ldof_to_owner offset = first_gdof-1 - ldof_to_gdof = Vector{Int}(undef,length(ldof_to_owner)) + ldof_to_gdof = zeros(Int,length(ldof_to_owner)) odof = 0 for (ldof,owner) in enumerate(ldof_to_owner) - if owner == me + if owner == rank odof += 1 - ldof_to_gdof[ldof] = odof - else - ldof_to_gdof[ldof] = 0 - end - end - for (ldof,owner) in enumerate(ldof_to_owner) - if owner == me - ldof_to_gdof[ldof] += offset + ldof_to_gdof[ldof] = odof + offset end end ldof_to_gdof end - # Create cell-wise global dofs - cell_to_gdofs = dof_wise_to_cell_wise(ldof_to_gdof, - cell_to_ldofs, - cell_range) - - # Exchange the global dofs + # Exchange gids + cell_to_gdofs = dof_wise_to_cell_wise!( + cell_ldofs_to_data,ldof_to_gdof,cell_to_ldofs,own_to_local(cell_range) + ) fetch_vector_ghost_values!(cell_to_gdofs,cache_fetch) |> wait - # Distribute global dof ids also to ghost - map(cell_to_ldofs,cell_to_gdofs,ldof_to_gdof,ldof_to_owner,partition(cell_range)) do cell_to_ldofs,cell_to_gdofs,ldof_to_gdof,ldof_to_owner,indices - gdof = 0 + # Fill ghost gids with exchanged information + map( + cell_to_ldofs,cell_to_gdofs,ldof_to_gdof,ldof_to_owner,partition(cell_range) + ) do cell_to_ldofs,cell_to_gdofs,ldof_to_gdof,ldof_to_owner,indices cache = array_cache(cell_to_ldofs) - cell_ghost_to_local = ghost_to_local(indices) - cell_local_to_owner = local_to_owner(indices) - for cell in cell_ghost_to_local - ldofs = getindex!(cache,cell_to_ldofs,cell) + lcell_to_owner = local_to_owner(indices) + for cell in ghost_to_local(indices) p = cell_to_gdofs.ptrs[cell]-1 + ldofs = getindex!(cache,cell_to_ldofs,cell) + cell_owner = lcell_to_owner[cell] for (i,ldof) in enumerate(ldofs) - if ldof > 0 && ldof_to_owner[ldof] == cell_local_to_owner[cell] + if (ldof > 0) && isequal(ldof_to_owner[ldof],cell_owner) ldof_to_gdof[ldof] = cell_to_gdofs.data[i+p] end end end end - dof_wise_to_cell_wise!(cell_to_gdofs, - ldof_to_gdof, - cell_to_ldofs, - cell_range) - + dof_wise_to_cell_wise!(cell_to_gdofs,ldof_to_gdof,cell_to_ldofs,own_to_local(cell_range)) fetch_vector_ghost_values!(cell_to_gdofs,cache_fetch) |> wait - - cell_wise_to_dof_wise!(ldof_to_gdof, - cell_to_gdofs, - cell_to_ldofs, - cell_range) + cell_wise_to_dof_wise!(ldof_to_gdof,cell_to_gdofs,cell_to_ldofs,ghost_to_local(cell_range)) # Setup DoFs LocalIndices ngdofs = reduction(+,nodofs,destination=:all,init=zero(eltype(nodofs))) - local_indices = map(ngdofs,partition(cell_range),ldof_to_gdof,ldof_to_owner) do ngdofs, - indices, - ldof_to_gdof, - ldof_to_owner - me = part_id(indices) - LocalIndices(ngdofs,me,ldof_to_gdof,ldof_to_owner) + local_indices = map(LocalIndices,ngdofs,ranks,ldof_to_gdof,ldof_to_owner) + + return PRange(local_indices) +end + +""" + generate_posneg_gids(cell_gids::PRange, cell_to_lposneg, nlpos, nlneg) -> (PRange,PRange) + +Similar to `generate_gids`, but also handles negative local dof ids. Returns two sets of +global ids: one for positive local ids and another for negative local ids. +This can be used to generate simultaneously free and dirichlet dof global ids. +""" +function generate_posneg_gids( + cell_range::PRange, + cell_to_lposneg::AbstractArray{<:AbstractArray}, + nlpos::AbstractArray{<:Integer}, + nlneg::AbstractArray{<:Integer} +) + ranks = linear_indices(partition(cell_range)) + cell_lids_to_data = allocate_cell_wise_vector(Int, cell_to_lposneg) + cache_fetch = fetch_vector_ghost_values_cache(cell_lids_to_data,partition(cell_range)) + + # Find and count number owned dofs + lpos_to_owner, lneg_to_owner, nopos, noneg = map( + partition(cell_range),cell_to_lposneg,nlpos,nlneg + ) do indices,cell_to_lposneg,nlpos,nlneg + lpos_to_owner = fill(zero(Int32),nlpos) + lneg_to_owner = fill(zero(Int32),nlneg) + cache = array_cache(cell_to_lposneg) + for (cell, owner) in enumerate(local_to_owner(indices)) + lids = getindex!(cache,cell_to_lposneg,cell) + for lid in lids + if lid > 0 + lpos_to_owner[lid] = max(owner,lpos_to_owner[lid]) + elseif lid < 0 + lneg_to_owner[-lid] = max(owner,lneg_to_owner[-lid]) + end + end + end + rank = part_id(indices) + nopos = count(isequal(rank),lpos_to_owner) + noneg = count(isequal(rank),lneg_to_owner) + return lpos_to_owner, lneg_to_owner, nopos, noneg + end |> tuple_of_arrays + + # Find the global range of owned dofs + first_gpos = scan(+,nopos,type=:exclusive,init=1) + first_gneg = scan(+,noneg,type=:exclusive,init=1) + + # Exchange the dof owners. Cell owner always has correct dof owner. + cell_ldofs_to_owner = posneg_wise_to_cell_wise!( + cell_lids_to_data,lpos_to_owner,lneg_to_owner,cell_to_lposneg,own_to_local(cell_range) + ) + fetch_vector_ghost_values!(cell_ldofs_to_owner,cache_fetch) |> wait + cell_wise_to_posneg_wise!( + lpos_to_owner,lneg_to_owner,cell_ldofs_to_owner,cell_to_lposneg,ghost_to_local(cell_range) + ) + + # Fill owned gids + lpos_to_gpos = map(ranks,first_gpos,lpos_to_owner) do rank,first_gpos,lpos_to_owner + offset = first_gpos-1 + lpos_to_gpos = zeros(Int,length(lpos_to_owner)) + opos = 0 + for (lpos,owner) in enumerate(lpos_to_owner) + if owner == rank + opos += 1 + lpos_to_gpos[lpos] = opos + offset + end + end + lpos_to_gpos + end + + lneg_to_gneg = map(ranks,first_gneg,lneg_to_owner) do rank,first_gneg,lneg_to_owner + offset = first_gneg-1 + lneg_to_gneg = zeros(Int,length(lneg_to_owner)) + oneg = 0 + for (lneg,owner) in enumerate(lneg_to_owner) + if owner == rank + oneg += 1 + lneg_to_gneg[lneg] = oneg + offset + end + end + lneg_to_gneg end - # Setup dof range - dofs_range = PRange(local_indices) + # Exchange gids + cell_to_gposneg = posneg_wise_to_cell_wise!( + cell_lids_to_data,lpos_to_gpos,lneg_to_gneg,cell_to_lposneg,own_to_local(cell_range) + ) + fetch_vector_ghost_values!(cell_to_gposneg,cache_fetch) |> wait + + # Fill ghost gids with exchanged information + map( + cell_to_lposneg,cell_to_gposneg,lpos_to_gpos,lneg_to_gneg,lpos_to_owner,lneg_to_owner,partition(cell_range) + ) do cell_to_lposneg,cell_to_gposneg,lpos_to_gpos,lneg_to_gneg,lpos_to_owner,lneg_to_owner,indices + cache = array_cache(cell_to_lposneg) + lcell_to_owner = local_to_owner(indices) + for cell in ghost_to_local(indices) + p = cell_to_gposneg.ptrs[cell]-1 + lids = getindex!(cache,cell_to_lposneg,cell) + cell_owner = lcell_to_owner[cell] + for (i,lid) in enumerate(lids) + if (lid > 0) && isequal(lpos_to_owner[lid],cell_owner) + lpos_to_gpos[lid] = cell_to_gposneg.data[i+p] + elseif (lid < 0) && isequal(lneg_to_owner[-lid],cell_owner) + lneg_to_gneg[-lid] = cell_to_gposneg.data[i+p] + end + end + end + end + + posneg_wise_to_cell_wise!(cell_to_gposneg,lpos_to_gpos,lneg_to_gneg,cell_to_lposneg,own_to_local(cell_range)) + fetch_vector_ghost_values!(cell_to_gposneg,cache_fetch) |> wait + cell_wise_to_posneg_wise!(lpos_to_gpos,lneg_to_gneg,cell_to_gposneg,cell_to_lposneg,ghost_to_local(cell_range)) + + # Setup DoFs LocalIndices + ngpos = reduction(+,nopos,destination=:all,init=0) + ngneg = reduction(+,noneg,destination=:all,init=0) + local_indices_pos = map(LocalIndices,ngpos,ranks,lpos_to_gpos,lpos_to_owner) + local_indices_neg = map(LocalIndices,ngneg,ranks,lneg_to_gneg,lneg_to_owner) + + return PRange(local_indices_pos), PRange(local_indices_neg) +end + +""" + generate_gids_by_color(cell_gids::PRange, cell_to_lids, lid_to_color, ncolors) + +Similar to `generate_gids`, but uses a global partition given by `lid_to_color` to generate +a different set of global ids per color. Returns: + +- a tuple with a `PRange` of `LocalIndices` per color. +- a mapping `lid_to_clid` that maps local ids to local color ids +- a mapping `color_to_clid_to_lid` that maps, for each color, local color ids to local ids. +""" +function generate_gids_by_color( + cell_range::PRange, + cell_to_lids::AbstractArray{<:AbstractArray}, + lid_to_color::AbstractArray, + ncolors = getany(reduction(max,map(maximum,lid_to_color);destination=:all)) +) + cell_lids_to_data = allocate_cell_wise_vector(Int, cell_to_lids) + cache_fetch = fetch_vector_ghost_values_cache(cell_lids_to_data,partition(cell_range)) + + lid_to_owner, color_to_noids = map( + partition(cell_range),cell_to_lids,lid_to_color + ) do indices, cell_to_lids, lid_to_color + lid_to_owner = fill(zero(Int32),length(lid_to_color)) + cache = array_cache(cell_to_lids) + for (cell, owner) in enumerate(local_to_owner(indices)) + lids = getindex!(cache,cell_to_lids,cell) + for lid in lids + (lid < 0) && continue + lid_to_owner[lid] = max(owner,lid_to_owner[lid]) + end + end + + rank = part_id(indices) + color_to_noids = zeros(Int,ncolors) + for (owner, color) in zip(lid_to_owner,lid_to_color) + color_to_noids[color] += isequal(owner,rank) + end + + return lid_to_owner, color_to_noids + end |> tuple_of_arrays + + # Find first gid per color + color_to_fgid = map(1:ncolors) do c + scan(+,map(Base.Fix2(getindex,c), color_to_noids); type=:exclusive, init=1) + end |> to_parray_of_arrays + + # Start exchanging dof owners + cell_lids_to_owner = dof_wise_to_cell_wise!( + cell_lids_to_data,lid_to_owner,cell_to_lids,own_to_local(cell_range) + ) + t1 = fetch_vector_ghost_values!(cell_lids_to_owner,cache_fetch) + + # Note: lid_to_owner is still not consistent, but owned data is correct + lid_to_gid, lid_to_clid, color_to_clid_to_lid = map( + partition(cell_range), lid_to_color, lid_to_owner, color_to_fgid + ) do indices, lid_to_color, lid_to_owner, color_to_fgid + + # Count number of dofs per color + rank = part_id(indices) + color_to_nldofs = zeros(Int,ncolors) + lid_to_clid = zeros(Int32,length(lid_to_color)) + lid_to_gid = zeros(Int,length(lid_to_color)) + for (lid, (color, owner)) in enumerate(zip(lid_to_color,lid_to_owner)) + color_to_nldofs[color] += 1 + lid_to_clid[lid] = color_to_nldofs[color] + if isequal(owner,rank) + lid_to_gid[lid] = color_to_fgid[color] + color_to_fgid[color] += 1 + end + end + + # Find clid to lid mapping + color_to_clid_to_lid = [fill(zero(Int32),n) for n in color_to_nldofs] + for (lid, (color, clid)) in enumerate(zip(lid_to_color,lid_to_clid)) + color_to_clid_to_lid[color][clid] = lid + end + return lid_to_gid, lid_to_clid, color_to_clid_to_lid + end |> tuple_of_arrays + + # Finish exchanging the dof owners. + wait(t1) + cell_wise_to_dof_wise!( + lid_to_owner,cell_lids_to_owner,cell_to_lids,ghost_to_local(cell_range) + ) + + # Start exchanging gids + cell_to_gids = dof_wise_to_cell_wise!( + cell_lids_to_data,lid_to_gid,cell_to_lids,own_to_local(cell_range) + ) + t2 = fetch_vector_ghost_values!(cell_to_gids,cache_fetch) - return dofs_range + # Find the global range of owned dofs + color_to_ngids = map(1:ncolors) do c + reduction(+,map(Base.Fix2(getindex,c), color_to_noids); destination=:all, init=0) + end |> to_parray_of_arrays + + # Finish exchanging the gids. + wait(t2) + map( + cell_to_lids,cell_to_gids,lid_to_gid,lid_to_owner,partition(cell_range) + ) do cell_to_lids,cell_to_gids,lid_to_gid,lid_to_owner,indices + cache = array_cache(cell_to_lids) + lcell_to_owner = local_to_owner(indices) + for cell in ghost_to_local(indices) + p = cell_to_gids.ptrs[cell]-1 + lids = getindex!(cache,cell_to_lids,cell) + cell_owner = lcell_to_owner[cell] + for (i,lid) in enumerate(lids) + if (lid > 0) && isequal(lid_to_owner[lid],cell_owner) + lid_to_gid[lid] = cell_to_gids.data[i+p] + end + end + end + end + + dof_wise_to_cell_wise!(cell_to_gids,lid_to_gid,cell_to_lids,own_to_local(cell_range)) + fetch_vector_ghost_values!(cell_to_gids,cache_fetch) |> wait + cell_wise_to_dof_wise!(lid_to_gid,cell_to_gids,cell_to_lids,ghost_to_local(cell_range)) + + # Setup DoFs LocalIndices per color + color_to_indices = map( + partition(cell_range), lid_to_color, lid_to_gid, lid_to_owner, color_to_clid_to_lid, color_to_ngids + ) do indices, lid_to_color, lid_to_gid, lid_to_owner, color_to_clid_to_lid, color_to_ngids + rank = part_id(indices) + + color_to_indices = () + for c in 1:ncolors + cids = LocalIndices( + color_to_ngids[c], rank, + lid_to_gid[color_to_clid_to_lid[c]], + lid_to_owner[color_to_clid_to_lid[c]] + ) + color_to_indices = (color_to_indices..., cids) + end + return color_to_indices + end |> tuple_of_arrays + + return map(PRange,color_to_indices), lid_to_clid, color_to_clid_to_lid +end + +""" + split_gids_by_color(gids::PRange, lid_to_color::AbstractArray, ncolors) -> NTuple{ncolors,PRange}, lid_to_clid + +Given a set of global ids and a mapping from local ids to colors, this function splits +the global ids into different sets of global ids per color. Returns a tuple with a `PRange` +of `LocalIndices` per color and a mapping `lid_to_clid` that maps local ids to local ids per color. +""" +function split_gids_by_color( + gids::PRange, lid_to_color::AbstractArray, + ncolors = getany(reduction(max,map(maximum,lid_to_color);destination=:all)) +) + color_to_nlids, color_to_noids = map(partition(gids), lid_to_color) do ids, lid_to_color + rank = part_id(ids) + lid_to_owner = local_to_owner(ids) + color_to_nlids = zeros(Int,ncolors) + color_to_noids = zeros(Int,ncolors) + for (owner,color) in zip(lid_to_owner,lid_to_color) + color_to_nlids[color] += 1 + color_to_noids[color] += isequal(owner,rank) + end + return color_to_nlids, color_to_noids + end |> tuple_of_arrays + + # I wish we could reduce arrays directly... + color_to_fgid = map(1:ncolors) do c + scan(+,map(Base.Fix2(getindex,c), color_to_noids); type=:exclusive, init=1) + end |> to_parray_of_arrays + color_to_ngids = map(1:ncolors) do c + reduction(+,map(Base.Fix2(getindex,c), color_to_noids); destination=:all, init=0) + end |> to_parray_of_arrays + + lid_to_cgid = map(partition(gids), lid_to_color, color_to_fgid) do ids, lid_to_color, color_to_fgid + rank = part_id(ids) + lid_to_cgid = zeros(Int,length(lid_to_color)) + color_to_offset = zeros(Int,ncolors) + for (lid,(owner,color)) in enumerate(zip(local_to_owner(ids),lid_to_color)) + if isequal(owner,rank) + lid_to_cgid[lid] = color_to_fgid[color] + color_to_offset[color] + color_to_offset[color] += 1 + end + end + return lid_to_cgid + end + + t = consistent!(PVector(lid_to_cgid,partition(gids))) + + lid_to_clid = map(lid_to_color) do lid_to_color + lid_to_clid = zeros(Int32,length(lid_to_color)) + offsets = zeros(Int, ncolors) + for (lid, color) in enumerate(lid_to_color) + offsets[color] += 1 + lid_to_clid[lid] = offsets[color] + end + return lid_to_clid + end + + wait(t) + + color_to_gids = map( + partition(gids), lid_to_color, lid_to_clid, lid_to_cgid, color_to_nlids, color_to_ngids + ) do ids, lid_to_color, lid_to_clid, lid_to_cgid, color_to_nlids, color_to_ngids + color_to_lid_to_gid = [zeros(Int, n) for n in color_to_nlids] + color_to_lid_to_owner = [zeros(Int32, n) for n in color_to_nlids] + for (color, clid, cgid, owner) in zip(lid_to_color, lid_to_clid, lid_to_cgid, local_to_owner(ids)) + color_to_lid_to_gid[color][clid] = cgid + color_to_lid_to_owner[color][clid] = owner + end + + color_to_ids = () + for color in 1:ncolors + cids = LocalIndices( + color_to_ngids[color], part_id(ids), + color_to_lid_to_gid[color], color_to_lid_to_owner[color] + ) + color_to_ids = (color_to_ids..., cids) + end + return color_to_ids + end |> tuple_of_arrays + + return map(PRange,color_to_gids), lid_to_clid end # FEFunction related + """ """ struct DistributedFEFunctionData{T<:AbstractVector} <: GridapType @@ -329,6 +668,11 @@ function FESpaces.zero_dirichlet_values(U::DistributedSingleFieldFESpace) map(zero_dirichlet_values,U.spaces) end +function FESpaces.get_dirichlet_dof_ids(U::DistributedSingleFieldFESpace) + spaces = map(FESpaces.DirichletFESpace,U.spaces) + return generate_gids(get_triangulation(U),spaces) +end + function FESpaces.FEFunction( f::DistributedSingleFieldFESpace,free_values::AbstractVector,isconsistent=false) _EvaluationFunction(FEFunction,f,free_values,isconsistent) @@ -901,7 +1245,7 @@ function FESpaces.ZeroMeanFESpace(space::DistributedSingleFieldFESpace,dΩ::Dist vol = sum(dvol) return vol, dvol end |> tuple_of_arrays - vol = reduce(+,_vol,init=zero(eltype(vol))) + vol = reduce(+,_vol,init=zero(eltype(_vol))) metadata = DistributedZeroMeanCache(dvol,vol) return DistributedSingleFieldFESpace( diff --git a/src/GridapDistributed.jl b/src/GridapDistributed.jl index da799997..0dc73a47 100644 --- a/src/GridapDistributed.jl +++ b/src/GridapDistributed.jl @@ -74,4 +74,6 @@ include("Redistribution.jl") include("Adaptivity.jl") +include("Constraints.jl") + end # module diff --git a/src/PArraysExtras.jl b/src/PArraysExtras.jl index 3e5264ee..10af235f 100644 --- a/src/PArraysExtras.jl +++ b/src/PArraysExtras.jl @@ -82,3 +82,7 @@ function _local_range(p,np,n,ghost=false,periodic=false) periodic && return start:stop return max(1, start):min(n,stop) end + +# This is type piracy, but I have no other option... + +PArrays.jagged_array(t::Table) = jagged_array(t.data, t.ptrs) diff --git a/test/Constraints.jl b/test/Constraints.jl new file mode 100644 index 00000000..e4ed5a60 --- /dev/null +++ b/test/Constraints.jl @@ -0,0 +1,122 @@ + +using Gridap +using Gridap.Arrays: Table +using Gridap.Geometry: get_node_coordinates +using GridapDistributed +using PartitionedArrays +using Test + +# 4×2 Cartesian mesh on [0,1]², 2-part partition (left/right). No Dirichlet BCs. +# +# Node layout (global IDs): +# y=1: 5(0,1) 6(¼,1) 13(½,1) 14(¾,1) 15(1,1) +# y=½: 3(0,½) 4(¼,½) 10(½,½) 11(¾,½) 12(1,½) +# y=0: 1(0,0) 2(¼,0) 7(½,0) 8(¾,0) 9(1,0) +# +# P1 owns x∈{0,¼}, P2 owns x∈{½,¾,1}. +# +# Three constraints stressing all cases: +# S1: (¼,½) = 0.5·(¼,0) + 0.5·(¼,1) — owned by P1, masters local on both parts +# S2: (¾,0) = 1.0·(½,0) — owned by P2, master local on both parts (ghost on P1) +# S3: (¾,½) = 0.5·(½,½) + 0.5·(1,½) — owned by P2, (1,½) is fictitious on P1 + +np = (2, 1) +ranks = DebugArray(LinearIndices((prod(np),))) +model = CartesianDiscreteModel(ranks, np, (0,1,0,1), (4,2)) +V = FESpace(model, ReferenceFE(lagrangian, Float64, 1)) +cell_gids = GridapDistributed.get_cell_gids(model) +spaces = local_views(V) +dof_ids = partition(get_free_dof_ids(V)) + +# Local DOF index of the node at (x,y), or nothing if not in this part's local index. +function dof_at(sp, x, y; tol=1e-10) + for (i, c) in enumerate(get_node_coordinates(get_triangulation(sp))) + abs(c[1]-x) < tol && abs(c[2]-y) < tol && return i + end + return nothing +end + +# Owned rows carry real master local dofs; ghost rows are pre-sized with zeros +# (consistent! fills them in from the owner). +owned_constraints = Dict( + (0.25, 0.5) => ([(0.25, 0.0), (0.25, 1.0)], [0.5, 0.5]), # S1, owned by P1 + (0.75, 0.0) => ([(0.50, 0.0)], [1.0] ), # S2, owned by P2 + (0.75, 0.5) => ([(0.50, 0.5), (1.00, 0.5)], [0.5, 0.5]), # S3, owned by P2 +) + +_sDOF_to_dof_parts, sDOF_to_mdofs_parts, sDOF_to_coeffs_parts = map( + spaces, dof_ids +) do sp, ids + l2own = local_to_own(ids) + entries = Tuple{Int32, Vector{Int32}, Vector{Float64}}[] + for ((sx, sy), (mcoords, cs)) in owned_constraints + ld = dof_at(sp, sx, sy) + ld === nothing && continue + if !iszero(l2own[ld]) + mdofs = Int32[dof_at(sp, mc...) for mc in mcoords] + push!(entries, (Int32(ld), mdofs, Float64.(cs))) + else + push!(entries, (Int32(ld), zeros(Int32, length(mcoords)), zeros(Float64, length(mcoords)))) + end + end + sort!(entries; by = first) + sdofs = Int32[e[1] for e in entries] + ptrs = Int32[1; 1 .+ cumsum(length(e[2]) for e in entries)] + mdata = reduce(vcat, getindex.(entries, 2); init = Int32[]) + cdata = reduce(vcat, getindex.(entries, 3); init = Float64[]) + sdofs, Table(mdata, ptrs), Table(cdata, ptrs) +end |> tuple_of_arrays + +sDOF_gids, new_mfdof_gids, new_mddof_gids, +mDOF_to_dof, sDOF_to_dof, sDOF_to_mdofs, sDOF_to_coeffs = + GridapDistributed.generate_distributed_constraints( + cell_gids, spaces, + _sDOF_to_dof_parts, sDOF_to_mdofs_parts, sDOF_to_coeffs_parts + ) + +# For the slave at (sx,sy), return sorted [(master_local_dof, coeff)] pairs. +# master_local_dof = mDOF_to_dof[m] for an mfdof entry m > 0; +# entries with mDOF_to_dof = 0 are fictitious (master not in this part's local index). +function slave_data(sp, sdof_vec, smdofs, scoeffs, mdof_to_dof, sx, sy) + ld = dof_at(sp, sx, sy) + ld === nothing && return nothing + idx = findfirst(==(Int32(ld)), sdof_vec) + idx === nothing && return nothing + mrow = smdofs.data[smdofs.ptrs[idx]:smdofs.ptrs[idx+1]-1] + coeffs = scoeffs.data[scoeffs.ptrs[idx]:scoeffs.ptrs[idx+1]-1] + mdofs = [m > 0 ? Int(mdof_to_dof[m]) : error("unexpected mddof") for m in mrow] + return sort!(collect(zip(mdofs, coeffs))) +end + +@testset "generate_distributed_constraints" begin + map(spaces, sDOF_to_dof, sDOF_to_mdofs, sDOF_to_coeffs, mDOF_to_dof) do sp, sdof_vec, smdofs, scoeffs, mdof_to_dof + + # S1: (¼,½) = 0.5·(¼,0) + 0.5·(¼,1) + # Slave owned by P1, ghost on P2. Masters are local on both parts. + if dof_at(sp, 0.25, 0.5) !== nothing + d_m1 = dof_at(sp, 0.25, 0.0) + d_m2 = dof_at(sp, 0.25, 1.0) + @test slave_data(sp, sdof_vec, smdofs, scoeffs, mdof_to_dof, 0.25, 0.5) == + sort([(d_m1, 0.5), (d_m2, 0.5)]) + end + + # S2: (¾,0) = 1.0·(½,0) + # Slave owned by P2, ghost on P1. Master (½,0) is local on both parts. + if dof_at(sp, 0.75, 0.0) !== nothing + d_m = dof_at(sp, 0.5, 0.0) + @test slave_data(sp, sdof_vec, smdofs, scoeffs, mdof_to_dof, 0.75, 0.0) == + [(d_m, 1.0)] + end + + # S3: (¾,½) = 0.5·(½,½) + 0.5·(1,½) + # Slave owned by P2, ghost on P1. On P1, (1,½) is fictitious: mDOF_to_dof = 0. + # On P2, both masters are local: mDOF_to_dof gives their local dof. + if dof_at(sp, 0.75, 0.5) !== nothing + d_m1 = dof_at(sp, 0.5, 0.5) + d_m2 = something(dof_at(sp, 1.0, 0.5), 0) # 0 when fictitious on P1 + @test slave_data(sp, sdof_vec, smdofs, scoeffs, mdof_to_dof, 0.75, 0.5) == + sort([(d_m1, 0.5), (d_m2, 0.5)]) + end + + end +end diff --git a/test/FESpacesTests.jl b/test/FESpacesTests.jl index 76cf57cb..2965bc68 100644 --- a/test/FESpacesTests.jl +++ b/test/FESpacesTests.jl @@ -9,6 +9,44 @@ using Test u((x,y)) = x+y +function gid_generation_tests(distribute,parts) + ranks = distribute(LinearIndices((prod(parts),))) + + model = CartesianDiscreteModel(ranks,parts,(0,1,0,1),(8,4)) + reffe = ReferenceFE(lagrangian,Float64,1) + cgids = GridapDistributed.get_cell_gids(model) + + V0 = FESpace(model,reffe) + I0 = partition(get_free_dof_ids(V0)) + + V1 = FESpace(model,reffe;dirichlet_tags=["tag_1","tag_2","tag_3","tag_4","tag_5","tag_6"]) + I1 = partition(get_free_dof_ids(V1)) + + fgids_1, dgids_1 = GridapDistributed.generate_posneg_gids( + cgids,get_cell_dof_ids(V1),map(num_free_dofs,local_views(V1)),map(num_dirichlet_dofs,local_views(V1)) + ) + + # 1 -> free, 2 -> dirichlet + lid_to_color = map(local_views(V0),local_views(V1)) do V0, V1 + lid_to_color = zeros(Int16,num_free_dofs(V0)) + for (I,J) in zip(get_cell_dof_ids(V0),get_cell_dof_ids(V1)) + lid_to_color[I] .= map(j -> ifelse(j > 0, 1, 2), J) + end + return lid_to_color + end + + (fgids_2, dgids_2), _ = GridapDistributed.split_gids_by_color( + get_free_dof_ids(V0), lid_to_color + ) + + (fgids_3, dgids_3), _, _ = GridapDistributed.generate_gids_by_color( + cgids, get_cell_dof_ids(V0), lid_to_color + ) + + @test fgids_1 == fgids_2 == fgids_3 + @test dgids_1 == dgids_2 == dgids_3 +end + function assemble_tests(das,dΩ,dΩass,U,V) # Assembly dv = get_fe_basis(V)