feat(taskbroker): Batch Status Updates#618

Merged

george-sentry merged 15 commits into main from george/push-taskbroker/batch-updates on May 14, 2026

Conversation

@george-sentry (Member) commented Apr 30, 2026

Linear

Completes STREAM-918

Description

On the usual workload of 100 millisecond tasks, with the new "claimed" status, we can do around 5K tasks per second in the sandbox. By batching status updates, we reduce DB load, making all queries take less time. This can increase throughput by 1K to 2K tasks per second.
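As a sketch of the idea (sqlx and the table/column names here are placeholders, not taskbroker's actual schema), a batched write turns N status updates into a single round trip:

    use sqlx::PgPool;

    // Illustrative only: one statement updates the whole batch instead of
    // issuing a query per task.
    async fn set_task_statuses(
        pool: &PgPool,
        ids: &[i64],
        status: i32,
    ) -> Result<(), sqlx::Error> {
        sqlx::query("UPDATE tasks SET status = $1 WHERE id = ANY($2)")
            .bind(status)
            .bind(ids)
            .execute(pool)
            .await?;
        Ok(())
    }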

@george-sentry george-sentry requested a review from a team as a code owner April 30, 2026 21:02
@linear-code (Bot) commented Apr 30, 2026
@george-sentry george-sentry marked this pull request as draft April 30, 2026 23:24
@george-sentry (Member, Author) commented:

Since we may want to treat claimed → processing updates the same way, I'm actually going to create a more general Flusher struct that can be used by both push threads and the gRPC server.

@george-sentry george-sentry marked this pull request as ready for review May 1, 2026 08:02
Comment thread src/flusher.rs
/// Run flusher that receives values of type T from a channel and flushes
/// them using the provided async `flush` function either when the batch is
/// full or when the max flush interval has elapsed.
pub async fn run_flusher<T, F>(
@george-sentry (Member, Author) commented:

I created this function because I'm also planning to batch claimed → processing updates in the push pool, which will use basically identical machinery.
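
For context, a minimal sketch of what such a generic flusher loop could look like (parameter names and bounds here are assumptions; the real signature lives in src/flusher.rs):

    use std::future::Future;
    use std::time::Duration;
    use tokio::sync::mpsc;
    use tokio::time::interval;

    pub async fn run_flusher<T, F, Fut>(
        mut rx: mpsc::Receiver<T>,
        batch_size: usize,
        flush_interval: Duration,
        mut flush: F,
    ) where
        F: FnMut(Vec<T>) -> Fut,
        Fut: Future<Output = ()>,
    {
        let mut buffer: Vec<T> = Vec::with_capacity(batch_size);
        let mut ticker = interval(flush_interval);
        loop {
            tokio::select! {
                maybe_item = rx.recv() => match maybe_item {
                    Some(item) => {
                        buffer.push(item);
                        // Flush early once the batch is full.
                        if buffer.len() >= batch_size {
                            flush(std::mem::take(&mut buffer)).await;
                        }
                    }
                    // Channel closed: drain whatever is left and exit.
                    None => break,
                },
                // Flush whatever has accumulated when the interval elapses.
                _ = ticker.tick() => {
                    if !buffer.is_empty() {
                        flush(std::mem::take(&mut buffer)).await;
                    }
                }
            }
        }
        if !buffer.is_empty() {
            flush(buffer).await;
        }
    }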

@george-sentry george-sentry changed the title from "feat(taskbroker): Batch Status Updates and Delete Completed Tasks Immediately" to "feat(taskbroker): Batch Status Updates" on May 1, 2026
Comment thread src/grpc/status_flusher.rs Outdated
}
}

_ = interval.tick() => {
Review comment:

When does this trigger?

@george-sentry (Member, Author) replied:

This code now lives in flusher.rs, but it contains a similar loop. This condition triggers every interval_ms after the previous tick.

The flusher only handles the tick when select! actually chooses this branch. If messages keep arriving and the rx.recv() arm keeps winning before the tick is ready, the tick still advances in the background. When the tick is ready and this arm is selected, the buffer is flushed.
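
As a self-contained illustration of those tick semantics (plain tokio, not taskbroker code):

    use std::time::Duration;
    use tokio::time::{interval, MissedTickBehavior};

    #[tokio::main]
    async fn main() {
        let mut ticker = interval(Duration::from_millis(100));
        // The default behavior (Burst) fires missed ticks back-to-back to
        // catch up; Skip drops them instead, avoiding a burst of flushes
        // after a long await has monopolized the loop.
        ticker.set_missed_tick_behavior(MissedTickBehavior::Skip);
        ticker.tick().await; // the first tick completes immediately
        ticker.tick().await; // roughly 100 ms after the first
    }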

Comment thread src/grpc/status_flusher.rs Outdated
Comment on lines +124 to +126
for id in ids {
buffer.push((id, status));
}
Review comment:
Let's say there is a DB issue, would we keep appending to the buffer indefinitely? I think we should add a limit after which we stop and retry on the DB.

@george-sentry (Member, Author) replied:

This was actually dead code, but similar logic now lives elsewhere.

No, we only append to the buffer while it hasn't reached the desired batch size. So if there's a DB issue, here's what should happen.

  1. Timer runs out or buffer fills up → call flush (this function)
  2. As long as flush is running, the (now empty) buffer does not receive any more IDs
  3. Flush fails because store is unresponsive or some other problem
  4. IDs are pushed back onto the buffer (which was emptied right before attempting the flush)
  5. Flush function exits

So if the DB has a problem, we will keep retrying the same batch of IDs over and over again until it succeeds.
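
A minimal sketch of steps 1 to 5 (Store, TaskStatus, and set_task_statuses are hypothetical stand-ins, not the actual taskbroker API):

    use std::io;

    // Hypothetical stand-ins for illustration only.
    struct Store;
    #[derive(Clone, Copy)]
    enum TaskStatus {
        Complete,
        Failure,
    }

    impl Store {
        async fn set_task_statuses(
            &self,
            _batch: &[(String, TaskStatus)],
        ) -> io::Result<()> {
            Ok(())
        }
    }

    // Drain the buffer, attempt one batched write, and push the batch back
    // on failure so the next tick retries the same IDs.
    async fn flush(buffer: &mut Vec<(String, TaskStatus)>, store: &Store) {
        let batch = std::mem::take(buffer);
        if batch.is_empty() {
            return;
        }
        if let Err(err) = store.set_task_statuses(&batch).await {
            eprintln!("status flush failed, retrying next tick: {err}");
            buffer.extend(batch); // the batch goes back untouched
        }
    }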

george-sentry and others added 2 commits May 11, 2026 09:15
Co-authored-by: Markus Unterwaditzer <markus-github@unterwaditzer.net>
@cursor (Bot) left a comment:

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 25dce9e.

@markstory (Member) left a comment:

Makes sense to me. I didn't see any paths that would lead to data loss. The additional queue could mean that writes the client considers accepted are lost during a crash. While we'll run the task an additional time, we shouldn't lose any data, as the activations will still be in Postgres.

@george-sentry (Member, Author) commented May 14, 2026

> The additional queue could mean that writes the client considers accepted are lost during a crash. While we'll run the task an additional time, we shouldn't lose any data, as the activations will still be in Postgres.

True. I also thought about it this way.

If I batch updates, set_task_status will run faster so the taskworker's internal result queue will drain quicker. If the taskbroker crashes, the queued updates will be lost.

If I don't batch updates, set_task_status will run slower so the taskworker's internal result queue will drain slower. If the taskworker crashes, the queued updates will be lost.

Regardless, the outcome is roughly the same -- a certain number of tasks are lost and executed a second time. The only difference is whether they are queued in the taskbroker or the taskworker.

@george-sentry george-sentry merged commit 4567466 into main May 14, 2026
24 checks passed
@george-sentry george-sentry deleted the george/push-taskbroker/batch-updates branch May 14, 2026 22:01
Comment thread src/fetch/mod.rs
}

_ = async {
let start = Instant::now();
Review comment:

Why this change?

@george-sentry (Member, Author) replied:

Not sure. I'm moving it back to where it was in #637.

Comment thread src/grpc/server.rs
Comment on lines +281 to +283
for id in ids {
buffer.push((id, status));
}
Review comment:
Let's say the request fails because the DB has a problem (not a transient one) or it is saturated.
Are we going to pile up requests until the broker runs out of memory? This would leave the database in an inconsistent state that will be very hard to troubleshoot.

Do we go through this code only when the worker updates the status, or also when we claim tasks or reset them during upkeep? The upkeep and claim phases have an easier and more deterministic way to batch requests, so they should not go through this.

@george-sentry (Member, Author) replied:

I don't think that'll happen because the buffer is bounded, meaning requests won't pile up. It's not obvious here because the logic that limits buffer size is in flusher.rs.

When I simulated various database issues during testing (such as high latency and connection reset errors), this did not appear to be a problem. Taskbroker was able to recover on its own as soon as those issues were resolved.

@george-sentry (Member, Author) added:

But it may be worth testing again because that was a while ago.

@george-sentry (Member, Author) added:

Correction... there was a bug in the buffering logic in flusher.rs that would result in unbounded growth. I'm fixing it in #637.
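
For illustration, one way to bound the buffer is a select! precondition that disables the recv arm once the buffer is full, so backpressure lands on the channel rather than the buffer (a sketch of the intended behavior, not the code from #637):

    use std::time::Duration;
    use tokio::sync::mpsc;
    use tokio::time::interval;

    async fn bounded_flusher(mut rx: mpsc::Receiver<u64>, max_buffer_size: usize) {
        let mut buffer: Vec<u64> = Vec::new();
        let mut ticker = interval(Duration::from_millis(100));
        loop {
            tokio::select! {
                // The `if` guard disables this arm while the buffer is
                // full, so unread IDs wait in the (bounded) channel.
                Some(id) = rx.recv(), if buffer.len() < max_buffer_size => {
                    buffer.push(id);
                }
                _ = ticker.tick() => {
                    // Stand-in for the batched DB write; on failure the
                    // IDs would be pushed back rather than cleared.
                    buffer.clear();
                }
            }
        }
    }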
