Add optional stale-timeout to QueuedJobsTable::isQueued()#503

Closed
dereuromark wants to merge 1 commit into master from feature/isqueued-stale-timeout

@dereuromark
Owner

Problem

QueuedJobsTable::isQueued() counts any row with completed IS NULL. A job that a worker fetched and then died on (OOM, PHP timeout, container kill) without ever marking the row completed or failed stays completed IS NULL indefinitely.

Callers that gate on isQueued() then wedge behind that ghost row. The concrete case: a non-concurrent QueueScheduler row (allow_concurrent = false) checks isQueued($reference, $task) before dispatching the next run. Once a prior job is stuck "running" (fetched, never completed), every subsequent dispatch — cron and manual "Run" alike — is held back forever, even though nothing is actually executing.
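The wedge described above can be modelled in a few lines. This is a minimal, language-neutral sketch in Python, not the plugin's PHP code: `JobRow`, `is_queued`, and `may_dispatch` are hypothetical stand-ins, and the field names are assumptions loosely based on the columns mentioned in this description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobRow:
    # Hypothetical stand-in for a queued job row; field names are assumptions.
    reference: str
    job_task: str
    fetched: Optional[datetime] = None    # set when a worker picks the job up
    completed: Optional[datetime] = None  # set when the job finishes


def is_queued(rows, reference, job_task=None):
    """Original behaviour as described: any matching row with no completion counts."""
    return any(
        r.reference == reference
        and (job_task is None or r.job_task == job_task)
        and r.completed is None
        for r in rows
    )


def may_dispatch(rows, reference, job_task):
    """Non-concurrent gate: dispatch only when no matching job is still open."""
    return not is_queued(rows, reference, job_task)


# A worker fetched this job and died before marking it completed or failed.
ghost = JobRow("nightly-report", "Report", fetched=datetime(2026, 5, 1, 2, 0))
rows = [ghost]

# The ghost row never changes, so this gate stays closed on every later check.
blocked_now = not may_dispatch(rows, "nightly-report", "Report")
# Jobs under a different reference are unaffected.
other_reference_ok = may_dispatch(rows, "other-job", "Report")
```

Because nothing ever writes `completed` on the ghost row, repeating the check later gives the same answer, which is why both cron and manual runs stay blocked.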

Change

Adds an optional $staleTimeout (seconds) parameter to isQueued():

public function isQueued(string $reference, ?string $jobTask = null, ?int $staleTimeout = null): bool

When provided, a row that was fetched longer ago than the timeout and is still not completed is presumed abandoned and excluded. Rows not yet fetched, and rows fetched within the window, still count.
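The rule in the paragraph above reduces to a small per-row predicate. The following Python sketch is illustrative only and is not the PR's PHP implementation: `counts_as_queued` is a hypothetical name, and the treatment of a row fetched exactly at the cutoff is an assumption, since the description does not specify the boundary.

```python
from datetime import datetime, timedelta


def counts_as_queued(fetched, completed, now, stale_timeout=None):
    """Return True if a row should still count as queued.

    fetched / completed are datetimes or None; stale_timeout is seconds or None.
    """
    if completed is not None:
        return False  # finished rows never count
    if stale_timeout is None:
        return True   # default: original behaviour, any open row counts
    if fetched is None:
        return True   # not yet fetched: still legitimately waiting
    # Assumption: a row fetched exactly at the cutoff still counts.
    cutoff = now - timedelta(seconds=stale_timeout)
    return fetched >= cutoff


now = datetime(2026, 5, 28, 12, 0, 0)
one_hour = 3600

never_fetched = counts_as_queued(None, None, now, one_hour)
fetched_recently = counts_as_queued(now - timedelta(minutes=10), None, now, one_hour)
fetched_long_ago = counts_as_queued(now - timedelta(hours=2), None, now, one_hour)
long_ago_no_timeout = counts_as_queued(now - timedelta(hours=2), None, now)
```

With a one-hour timeout, only the row fetched two hours earlier is excluded. Without a timeout the same row still counts, which is the backward-compatible default.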

$staleTimeout defaults to null, which preserves the exact original behaviour — fully backward compatible. Existing testIsQueued() is untouched and still passes.

Tests

Added testIsQueuedStaleTimeout() and testIsQueuedStaleTimeoutIgnoresUnfetched() covering the fetched-but-abandoned, still-within-window, and never-fetched cases. Full QueuedJobsTableTest green; phpcs and phpstan clean on the changed file.

isQueued() counts any row with `completed IS NULL`. A job that a worker
fetched and then died on (OOM, timeout, kill) without marking it
completed or failed stays `completed IS NULL` forever, so callers that
gate on isQueued() — notably a non-concurrent scheduler deciding
whether to dispatch the next run — can wedge permanently behind a job
that will never make progress.

Adds an optional `$staleTimeout` (seconds) parameter: a row fetched
longer ago than the timeout and still not completed is presumed
abandoned and excluded. Not-yet-fetched rows, and rows fetched within
the window, still count. The default `null` preserves the original
behaviour, so this is fully backward compatible.
@codecov-commenter

codecov-commenter commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.36%. Comparing base (25b2c08) to head (77da2bc).

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #503      +/-   ##
============================================
+ Coverage     78.32%   78.36%   +0.03%     
- Complexity      978      979       +1     
============================================
  Files            45       45              
  Lines          3313     3318       +5     
============================================
+ Hits           2595     2600       +5     
  Misses          718      718              

@dereuromark
Owner Author

Superseded by #504. The stale-fetch discriminator here was wrong: a stale fetched job that still has retries remaining is one the queue itself re-runs, so excluding it from isQueued() would let the scheduler dispatch a duplicate and pile up failed rows. #504 instead persists the terminal 'aborted' verdict the queue already computes, which fixes the wedge without that side effect.
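The objection can be illustrated with a rough Python sketch. It is not the plugin's actual retry bookkeeping: `StaleRow`, `attempts`, `max_attempts`, and both helper functions are hypothetical, introduced only to show how a time-based check and the queue's own retry decision can disagree.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StaleRow:
    fetched: datetime
    attempts: int      # hypothetical: times a worker has picked the job up
    max_attempts: int  # hypothetical: attempts allowed before the queue gives up


def queue_will_rerun(row):
    """Hypothetical retry rule: the queue retries while attempts remain."""
    return row.attempts < row.max_attempts


def excluded_by_stale_timeout(row, now, stale_timeout):
    """The time-only check from this PR: ignores whether retries remain."""
    return row.fetched < now - timedelta(seconds=stale_timeout)


now = datetime(2026, 5, 28, 12, 0, 0)

# Fetched two hours ago, but the queue would still retry it.
retryable = StaleRow(fetched=now - timedelta(hours=2), attempts=1, max_attempts=3)
# Same age, but no attempts left: genuinely terminal.
exhausted = StaleRow(fetched=now - timedelta(hours=2), attempts=3, max_attempts=3)

# The scheduler dispatches a new job whenever the old row is excluded.
duplicate_for_retryable = (
    excluded_by_stale_timeout(retryable, now, 3600) and queue_will_rerun(retryable)
)
duplicate_for_exhausted = (
    excluded_by_stale_timeout(exhausted, now, 3600) and queue_will_rerun(exhausted)
)
```

In this model the time-based check treats both rows the same, yet only the exhausted one is safe to ignore. That is why a verdict computed by the queue is a better signal than elapsed time alone.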

@dereuromark dereuromark deleted the feature/isqueued-stale-timeout branch May 28, 2026 13:41

2 participants