ncclAllReduce may require additional barriers #1587
Open
Labels
bug: Something isn't working
Description
| checkNCCL(ncclAllReduce(w_grad_ptr, |
A thread on Zulip pointed out that NCCL calls inside a Legion task need some extra care, and Rohan spotted a problem in FlexFlow's use of `ncclAllReduce`. We may need to call `concurrent_task_barrier` immediately before and after the `ncclAllReduce` call, and also call `set_concurrent_barrier` on the task. The comment for that barrier has more information.
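To make the proposed call ordering concrete, here is a minimal sketch of a task body that brackets the collective with a barrier on each side. It does not link against Legion or NCCL: `concurrent_task_barrier_stub`, `nccl_all_reduce_stub`, `update_task_body`, and `call_log` are hypothetical stand-ins invented for illustration, and the real signatures of `concurrent_task_barrier` and `ncclAllReduce` are intentionally not reproduced. The `set_concurrent_barrier` registration step is not shown.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order of calls so the bracketing can be inspected.
// Hypothetical helper, not part of FlexFlow, Legion, or NCCL.
static std::vector<std::string> call_log;

// Hypothetical stand-in for Legion's concurrent_task_barrier.
void concurrent_task_barrier_stub() {
  call_log.push_back("barrier");
}

// Hypothetical stand-in for ncclAllReduce; returns 0 on success.
int nccl_all_reduce_stub(float *buf, int count) {
  (void)buf;
  (void)count;
  call_log.push_back("allreduce");
  return 0;
}

// Sketch of a task body: barrier, collective, barrier.
void update_task_body(float *w_grad_ptr, int count) {
  // Barrier before the collective call.
  concurrent_task_barrier_stub();

  int rc = nccl_all_reduce_stub(w_grad_ptr, count);
  assert(rc == 0);

  // Barrier after the collective call.
  concurrent_task_barrier_stub();
}
```

Calling `update_task_body` on a gradient buffer leaves `call_log` holding the barrier, the all-reduce, and the second barrier, in that order. In the real fix the stubs would be replaced by the actual Legion and NCCL calls.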