Support cancelling ML async jobs #1144

Draft
carlosgjs wants to merge 9 commits into RolnickLab:main from uw-ssec:carlos/asynccancel3

Conversation

@carlosgjs
Collaborator

@carlosgjs carlosgjs commented Feb 20, 2026

Summary

This pull request introduces improvements to job cancellation and cleanup logic, particularly for asynchronous jobs using NATS/Redis. It adds better handling for unknown progress states, ensures cleanup routines are consistently invoked, and improves code clarity by renaming and refactoring functions.

Job cancellation and cleanup improvements:

  • The cancel method in the Job model now calls cleanup_async_job_if_needed to ensure async resources are cleaned up when a job is cancelled.

  • The cleanup function _cleanup_job_if_needed has been renamed to cleanup_async_job_if_needed and its type signature clarified; all references throughout the codebase have been updated to use the new name.
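
The cancel-then-cleanup flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Job` dataclass, its `is_async` and `cleaned_up` fields, and the free-standing `cancel` function are hypothetical stand-ins, and only the name `cleanup_async_job_if_needed` and the revoked status come from the description.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for the Job model (assumption, not the real class)."""

    pk: int
    is_async: bool = True
    status: str = "STARTED"
    cleaned_up: bool = False


def cleanup_async_job_if_needed(job: Job) -> bool:
    """Release async resources for the job; return True if cleanup actually ran."""
    if not job.is_async or job.cleaned_up:
        # Nothing to do for sync jobs or jobs that were already cleaned up.
        return False
    # The real function would remove the NATS stream/consumer and Redis keys here.
    job.cleaned_up = True
    return True


def cancel(job: Job) -> None:
    """Mark the job as revoked, then make sure async resources are released."""
    job.status = "REVOKED"
    cleanup_async_job_if_needed(job)
```

Making the cleanup idempotent, as assumed here, lets both the cancel path and any later completion path call it without coordinating.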

Handling unknown progress states:

  • The JobStateProgress class now includes an unknown flag to indicate when progress cannot be determined, such as missing Redis keys.

  • When progress information is unknown during NATS pipeline result processing, a warning is logged and the task returns early without processing the results further. If the cause was a transient Redis error rather than a cancellation, the NATS task will be retried.

  • The _commit_update method returns a JobStateProgress with unknown=True when Redis keys are missing, making this state distinguishable from other errors.
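
The unknown-progress handling above could look roughly like the sketch below. It assumes a plain dict in place of Redis, and the key names, the `remaining` and `total` fields, and the `commit_update` and `process_results` functions are hypothetical. Only `JobStateProgress` and its `unknown` flag are named in the description.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class JobStateProgress:
    remaining: int = 0
    total: int = 0
    unknown: bool = False  # True when progress cannot be determined


def commit_update(store: dict, job_id: int, processed: int) -> JobStateProgress:
    """Apply a progress update; report unknown=True if the state keys are missing."""
    total_key = f"job:{job_id}:total"      # hypothetical key names
    pending_key = f"job:{job_id}:pending"
    if total_key not in store or pending_key not in store:
        # Missing keys: the job was probably cancelled and cleaned up.
        return JobStateProgress(unknown=True)
    store[pending_key] = max(store[pending_key] - processed, 0)
    return JobStateProgress(remaining=store[pending_key], total=store[total_key])


def process_results(store: dict, job_id: int, results: list) -> bool:
    """Return True if the results were processed, False on early return."""
    progress = commit_update(store, job_id, len(results))
    if progress.unknown:
        logger.warning("Progress info is unknown for job %s; skipping results.", job_id)
        return False
    # Saving the results would happen here in real code.
    return True
```

Returning a distinct `unknown` state, rather than raising or returning `None`, lets the caller tell "keys are gone" apart from other failures.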

Resource cleanup enhancements:

  • The cleanup method for async job state now deletes the lock key in addition to other cache keys, ensuring all Redis resources are properly released.
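
As a rough illustration of the point above, a cleanup routine that treats the lock key as part of the job's state might look like this. A dict again stands in for Redis, and the key layout and function names are assumptions for the example, not the project's actual ones.

```python
def state_keys(job_id: int) -> list[str]:
    """All keys belonging to one job's async state, including the lock key."""
    prefix = f"job:{job_id}"  # hypothetical key prefix
    return [f"{prefix}:total", f"{prefix}:pending", f"{prefix}:lock"]


def cleanup(store: dict, job_id: int) -> int:
    """Delete every state key for the job; return how many were actually removed."""
    removed = 0
    for key in state_keys(job_id):
        if store.pop(key, None) is not None:
            removed += 1
    return removed
```

Deriving the lock key from the same helper as the other keys is one way to keep cleanup from missing it.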

Testing

  • Created an async ML job with a 1000-image collection
  • Waited until results started to come in
  • Clicked cancel
  • Verified in the debugger and logs that tasks stop being processed
celeryworker-1  | [2026-02-20 17:22:33,724: WARNING/ForkPoolWorker-1] Progress info is unknown for job 111 when processing results. Job may be cancelled.Or this could be a transient Redis error and the NATS task will be retried.
  • Verified NATS stream/consumers are removed
  • Job status is set to revoked: [image]
  • Verified clicking Retry is able to restart the job

Checklist

  • I have tested these changes appropriately.
  • I have added and/or modified relevant tests. FYI, test_cancel_job() is currently just a stub
  • I updated relevant documentation or comments.
  • I have verified that this PR follows the project's coding standards.
  • Any dependent changes have already been merged to main.

@netlify

netlify bot commented Feb 20, 2026

👷 Deploy request for antenna-ssec pending review.

Visit the deploys page to approve it

Name Link
🔨 Latest commit 9827ed2

@netlify

netlify bot commented Feb 20, 2026

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit 9827ed2
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/6998e31bc0893a0008c9d33e

@coderabbitai
Contributor

coderabbitai bot commented Feb 20, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@carlosgjs carlosgjs changed the title from "Carlos/asynccancel3" to "Support cancelling ML async jobs" Feb 20, 2026
@carlosgjs carlosgjs requested a review from mihow February 20, 2026 22:18