fix: forward controlActionCancel to cancelCh in poll-mode fetchAndRunLoop#1245
fix: forward controlActionCancel to cancelCh in poll-mode fetchAndRunLoop#1245ata-the-legend wants to merge 1 commit intoriverqueue:masterfrom
Conversation
…Loop When using drivers that don't support LISTEN/NOTIFY (e.g. riverdatabasesql), job cancel events are routed in-process via queueControlCh. The controlActionCancel case was missing from fetchAndRunLoop's queueControlCh handler, causing cancel events to be silently dropped and ctx.Done() to never fire inside a running Work() call. Forward the job ID to cancelCh so the existing maybeCancelJob call handles it, matching the behaviour of the LISTEN/NOTIFY path in handleControlNotification. Adds a test that verifies ctx.Done() fires in a running job after JobCancel is called when using a poll-only driver (SupportsListener() == false).
|
I see two issues here:
Rather than going down this path, I think it's more likely we'd want to think about a non-Postgres notifier option of some kind (i.e. Redis). This would lose transactionality, but would be far more scalable and compatible with non-NOTIFY Postgres environments, dbsql drivers, or other databases. Another option which we've talked about before is letting you use a pgx listener/notifer even if you're using database/sql for the rest of your driver: #352 This potentially covers some Postgres use cases that leverage database/sql, albeit only if you have access to a conn or pool that does have |
|
Thanks for the detailed explanation. One small clarification: the fix doesn't rely on Postgres NOTIFY at all. You're right though that this only works single-process — in a multi-process deployment that channel send never reaches the other server's producer. So it's not a general solution and I understand why you wouldn't want to merge it as-is. For my use case I'm working around it by polling the contract status from inside |
Problem
When using River with a driver that does not support LISTEN/NOTIFY (e.g.
riverdatabasesql), callingJobCancelon a running job has no effect: the job's context is never cancelled andctx.Done()never fires insideWork().Root Cause
JobCancel(non-tx) callsnotifyProducerWithoutListenerQueueControlEvent, which sends acontrolEventPayload{Action: "cancel"}toqueueControlCh. However, infetchAndRunLoop, thequeueControlChreceive block only handledcontrolActionPause,controlActionResume, andcontrolActionMetadataChanged. ThecontrolActionCancelcase was absent and fell through todefault, silently discarding the event.The
controlActionCancelcase exists correctly inhandleControlNotification(the LISTEN path), but was never added to the parallel poll-mode path.Fix
Add
controlActionCanceltofetchAndRunLoop'squeueControlChswitch, forwardingmsg.JobIDtocancelCh. This is consumed by the existingmaybeCancelJobcall, which cancels the running job's context viacontext.WithCancelCause.Testing
Added a test that verifies
ctx.Done()fires insideWork()afterJobCancelis called when the driver is in poll mode.