Skip to content

[Patch] migrate SerialPatchTree & PatchtreeField to sham::DeviceBuffer#1798

Open
TApplencourt wants to merge 15 commits into
Shamrock-code:mainfrom
TApplencourt:migrate/serial-patch-tree-devicebuffer
Open

[Patch] migrate SerialPatchTree & PatchtreeField to sham::DeviceBuffer#1798
TApplencourt wants to merge 15 commits into
Shamrock-code:mainfrom
TApplencourt:migrate/serial-patch-tree-devicebuffer

Conversation

@TApplencourt
Copy link
Copy Markdown
Contributor

Continues the sycl::buffer -> sham::DeviceBuffer migration (#490, #672, #1461). Swap the underlying storage in:

  • PatchtreeField: internal_buf is now std::optional<DeviceBuffer>. allocate() now requires a DeviceScheduler_ptr.
  • SerialPatchTree: serial_tree_buf and linked_patch_ids_buf become std::optional<DeviceBuffer<...>>. attach_buf/detach_buf API preserved for caller compatibility.
  • compute_patch_owner: returns DeviceBuffer, kernel rewritten with sham::kernel_call + MultiRef.
  • make_patch_tree_field: takes DeviceScheduler_ptr instead of sycl::queue, reduction loop migrated to sham::kernel_call.
  • host_for_each_leafs: uses BufferMirror instead of sycl::host_accessor.

Propagate the DistributedData<DeviceBuffer> change through ReattributeDataUtility (compute_new_pid, extract_elements, reatribute_patch_objects).

Update callers: BasicSPHGhosts, GSPHGhostHandler, SPHSetup, SPHUtilities, GSPHUtilities.

Delete the now-dead reduce_field method and patch_reduc_tree.hpp (PatchFieldReduction had zero callers).

Assisted-by: Claude Opus 4.7 noreply@anthropic.com

Continues the sycl::buffer -> sham::DeviceBuffer migration (Shamrock-code#490, Shamrock-code#672,
Shamrock-code#1461). Swap the underlying storage in:

- PatchtreeField<T>: internal_buf is now std::optional<DeviceBuffer<T>>.
  allocate() now requires a DeviceScheduler_ptr.
- SerialPatchTree<T>: serial_tree_buf and linked_patch_ids_buf become
  std::optional<DeviceBuffer<...>>. attach_buf/detach_buf API preserved
  for caller compatibility.
- compute_patch_owner: returns DeviceBuffer<u64>, kernel rewritten with
  sham::kernel_call + MultiRef.
- make_patch_tree_field: takes DeviceScheduler_ptr instead of sycl::queue,
  reduction loop migrated to sham::kernel_call.
- host_for_each_leafs: uses BufferMirror<host> instead of sycl::host_accessor.

Propagate the DistributedData<DeviceBuffer<u64>> change through
ReattributeDataUtility (compute_new_pid, extract_elements,
reatribute_patch_objects).

Update callers: BasicSPHGhosts, GSPHGhostHandler, SPHSetup,
SPHUtilities, GSPHUtilities.

Delete the now-dead reduce_field method and patch_reduc_tree.hpp
(PatchFieldReduction had zero callers).

Assisted-by: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

Thanks @TApplencourt for opening this PR!

You can do multiple things directly here:
1 - Comment pre-commit.ci run to run pre-commit checks.
2 - Comment pre-commit.ci autofix to apply fixes.
3 - Add label autofix.ci to fix authorship & pre-commit for every commit made.
4 - Add label light-ci to only trigger a reduced & faster version of the CI (need the full one before merge).
5 - Add label trigger-ci to create an empty commit to trigger the CI.

Once the workflow completes a message will appear displaying informations related to the run.

Also the PR gets automatically reviewed by gemini, you can:
1 - Comment /gemini review to trigger a review
2 - Comment /gemini summary for a summary
3 - Tag it using @gemini-code-assist either in the PR or in review comments on files

@TApplencourt
Copy link
Copy Markdown
Contributor Author

pre-commit.ci autofix

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the codebase to replace SYCL buffers and accessors with sham::DeviceBuffer and BufferMirror abstractions, while transitioning from direct SYCL queue submissions to sham::kernel_call. Key changes include updating PatchtreeField and SerialPatchTree to use std::optionalsham::DeviceBuffer and replacing sycl::host_accessor with mirror_tosham::host() or copy_to_stdvec(). Feedback identifies several performance bottlenecks where data transfers or heap allocations occur within loops. Furthermore, the reviewer pointed out multiple safety regressions where std::optional buffers and scheduler pointers are dereferenced without proper validation, which could lead to runtime crashes.

Comment on lines +303 to +304
auto tree = sptree.serial_tree_buf->template mirror_to<sham::host>();
auto lpid = sptree.linked_patch_ids_buf->template mirror_to<sham::host>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Creating and destroying BufferMirror objects inside the triple loop over periodic offsets is highly inefficient. Each creation triggers a Device-to-Host copy, and each destruction triggers a Host-to-Device copy (write-back). These mirrors should be moved outside the loops (before line 296) to avoid redundant data transfers.

Comment on lines +369 to +370
auto tree = sptree.serial_tree_buf->template mirror_to<sham::host>();
auto lpid = sptree.linked_patch_ids_buf->template mirror_to<sham::host>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Similar to the periodic case, creating mirrors inside the for_each_patch_shift lambda causes repeated data transfers for every shift. These mirrors should be moved outside the for_each_patch_shift call (before line 362).

Comment thread src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp Outdated
Comment thread src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp Outdated
Comment thread src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp Outdated
Comment thread src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp Outdated
Comment thread src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp Outdated
bool err_id_in_newid = false;
{
sycl::host_accessor nid{newid_buf_map.get(id), sycl::read_only};
std::vector<u64> nid = newid_buf_map.get(id).copy_to_stdvec();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

copy_to_stdvec() performs a heap allocation and a full buffer copy from device to host. Calling it inside for_each_patchdata is inefficient and can become a performance bottleneck as the number of patches increases. Consider if this logic can be moved to a device kernel or if a more efficient host access method is available.

if (!pdat.is_empty()) {

sycl::host_accessor nid{new_pid.get(current_pid), sycl::read_only};
std::vector<u64> nid = new_pid.get(current_pid).copy_to_stdvec();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the usage in compute_new_pid, calling copy_to_stdvec() inside a loop over patches is inefficient due to repeated allocations and data transfers.

TApplencourt and others added 5 commits May 8, 2026 16:37
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@tdavidcl
Copy link
Copy Markdown
Member

tdavidcl commented May 8, 2026

pre-commit.ci autofix

Comment thread src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp
@github-actions
Copy link
Copy Markdown
Contributor

Workflow report

workflow report corresponding to commit 08d1a1b
Commiter email is timothee.davidcleris@proton.me

Pre-commit check report

Some failures were detected in base source checks checks.
Check the On PR / Linting / Base source checks (pull_request) job in the tests for more detailed output

Suggested changes

Detailed changes :
diff --git a/src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp b/src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp
index 203a79a8..1d691a2e 100644
--- a/src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp
+++ b/src/shamrock/include/shamrock/scheduler/SerialPatchTree.hpp
@@ -256,10 +256,9 @@ sham::DeviceBuffer<u64> SerialPatchTree<vec>::compute_patch_owner(
     sham::DeviceScheduler_ptr dev_sched, sham::DeviceBuffer<vec> &position_buffer, u32 len) {
     sham::DeviceBuffer<u64> new_owned_id(len, dev_sched);
 
-
-   if(len == 0){
-   return new_owned_id;
-   }
+    if (len == 0) {
+        return new_owned_id;
+    }
 
     using namespace shamrock::patch;
 

tdavidcl added a commit that referenced this pull request May 30, 2026
This PR cherry pick code deletions from #1798

---------

Co-authored-by: tapplencourt <tapplencourt@anl.gov>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants