Labels
bug: Something isn't working
Description
Describe the bug
In non-preserve-order repartitioning mode, all input partition tasks share clones of the same SpillPoolWriter for each output partition. SpillPoolWriter used #[derive(Clone)], but its Drop implementation unconditionally set writer_dropped = true and finalized the current spill file. As a result, when the first input task finishes and its clone is dropped, the SpillPoolReader sees writer_dropped = true on an empty queue and returns EOF, silently discarding every batch subsequently written by the still-running input tasks.
This bug requires three conditions to trigger:
- Non-preserve-order repartitioning (so spill writers are cloned across input tasks)
- Memory pressure causing batches to spill to disk
- Input tasks finishing at different times (the common case with varying partition sizes)
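The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch and not the actual DataFusion code: `Shared`, `NaiveWriter`, `CountedWriter`, `active_writers`, and `reader_sees_eof` are hypothetical names, the queue holds integers instead of spilled batches, and spill-file finalization is omitted. The counted variant shows one possible way to mark the writer as dropped only when the last clone goes away; it is an assumption, not necessarily the fix adopted upstream.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Hypothetical state shared by all writer clones and the reader.
#[derive(Default)]
struct Shared {
    queue: VecDeque<u32>,  // stand-in for spilled batches
    writer_dropped: bool,  // the flag the reader uses to decide on EOF
    active_writers: usize, // only used by the counted variant
}

// Model of the bug: derive(Clone) combined with an unconditional Drop.
#[derive(Clone)]
struct NaiveWriter(Arc<Mutex<Shared>>);

impl NaiveWriter {
    fn push(&self, batch: u32) {
        self.0.lock().unwrap().queue.push_back(batch);
    }
}

impl Drop for NaiveWriter {
    fn drop(&mut self) {
        // Runs for every clone, so the first clone to finish signals EOF.
        self.0.lock().unwrap().writer_dropped = true;
    }
}

// Possible alternative (an assumption): count live clones explicitly.
struct CountedWriter(Arc<Mutex<Shared>>);

impl CountedWriter {
    fn new(shared: Arc<Mutex<Shared>>) -> Self {
        shared.lock().unwrap().active_writers += 1;
        CountedWriter(shared)
    }

    fn push(&self, batch: u32) {
        self.0.lock().unwrap().queue.push_back(batch);
    }
}

impl Clone for CountedWriter {
    fn clone(&self) -> Self {
        self.0.lock().unwrap().active_writers += 1;
        CountedWriter(Arc::clone(&self.0))
    }
}

impl Drop for CountedWriter {
    fn drop(&mut self) {
        let mut shared = self.0.lock().unwrap();
        shared.active_writers -= 1;
        // Only the last clone to be dropped marks the writer as finished.
        if shared.active_writers == 0 {
            shared.writer_dropped = true;
        }
    }
}

// Simplified reader check: an empty queue plus writer_dropped means EOF.
fn reader_sees_eof(shared: &Arc<Mutex<Shared>>) -> bool {
    let shared = shared.lock().unwrap();
    shared.queue.is_empty() && shared.writer_dropped
}

// Returns whether the reader reports EOF while a second task is still running.
fn naive_premature_eof() -> bool {
    let shared = Arc::new(Mutex::new(Shared::default()));
    let task_a = NaiveWriter(Arc::clone(&shared));
    let task_b = task_a.clone();
    drop(task_a); // the first input task finishes
    let eof = reader_sees_eof(&shared);
    task_b.push(1); // written after the reader has already seen EOF
    eof
}

fn counted_premature_eof() -> bool {
    let shared = Arc::new(Mutex::new(Shared::default()));
    let task_a = CountedWriter::new(Arc::clone(&shared));
    let task_b = task_a.clone();
    drop(task_a); // the first input task finishes
    let eof = reader_sees_eof(&shared);
    task_b.push(1); // still reaches the reader
    eof
}
```

In this model, `naive_premature_eof()` returns `true` because the first dropped clone sets the flag while the queue is empty, whereas `counted_premature_eof()` returns `false` because one clone is still alive. The model leaves out the memory-pressure condition: it assumes every batch goes through the shared queue.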
To Reproduce
No response
Expected behavior
No response
Additional context
No response