Fix data integrity: ensure FINISHED status only after successful score upload #2447

Open
Didayolo wants to merge 2 commits into develop from m8-finished-must-have-scores

@Didayolo Didayolo commented Jun 26, 2026

Member

Original PR: #2424

Description

Fixes a data integrity bug where submissions were marked as FINISHED before their scores were successfully uploaded to the server, resulting in FINISHED submissions with no scores.

Problem: The compute worker set submission status to FINISHED before uploading scores to the Django server. If the score upload failed (network error, server timeout, etc.), the submission would be marked FINISHED but have no scores in the database.

Root cause:

# Old code (compute_worker.py)
run._update_status(SubmissionStatus.FINISHED)  # Mark FINISHED first
run.push_scores()  # Upload scores second - might fail!

Solution:

  1. Reorder operations: Upload scores BEFORE setting FINISHED status
  2. Add retry logic: Exponential backoff (3 retries) for transient failures
  3. Preserve ordering: The status update happens only after a successful upload, so a FINISHED submission always has scores
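
The retry step in item 2 could look roughly like the sketch below. It is not the code from this PR: the helper name `call_with_backoff`, the `base_delay` of one second, and the reading of "3" as three total attempts are all assumptions made for illustration. The `sleep` parameter is injectable so the behavior can be checked without waiting.

```python
import time


def call_with_backoff(func, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on any exception with exponentially growing delays.

    Hypothetical helper: name, defaults, and signature are illustrative only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: re-raise so the caller never treats
                # a failed upload as a success.
                raise
            # Wait base_delay * 1, 2, 4, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Re-raising on the last attempt is what keeps the third item in the list meaningful: the caller only reaches the status update if the upload call returned normally.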

Code changes:

  • compute_worker/compute_worker.py:
    • Moved push_scores() before _update_status(FINISHED) in run_wrapper()
    • Added retry logic with exponential backoff in push_scores()
    • Added 30s timeout for score POST requests

# New code
if run.is_scoring:
    run.push_scores()  # Upload scores first with retry logic
run.push_output()
if run.is_scoring:
    run._update_status(SubmissionStatus.FINISHED)  # Mark FINISHED only after success
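
To illustrate why the new ordering helps, here is a small self-contained sketch. `FakeRun` and `finish` are hypothetical stand-ins, not the worker's real classes, and the status strings are placeholders for `SubmissionStatus` values. The sketch assumes that `push_scores()` raises once its retries are exhausted.

```python
class FakeRun:
    """Hypothetical stand-in for the compute worker's run object."""

    def __init__(self, upload_ok, is_scoring=True):
        self.upload_ok = upload_ok
        self.is_scoring = is_scoring
        self.status = "Scoring"
        self.events = []  # records the order of calls

    def push_scores(self):
        self.events.append("push_scores")
        if not self.upload_ok:
            # Assumption: a failed upload surfaces as an exception.
            raise ConnectionError("score upload failed")

    def push_output(self):
        self.events.append("push_output")

    def _update_status(self, status):
        self.events.append("status:" + status)
        self.status = status


def finish(run):
    """Same ordering as the new code: scores, output, then status."""
    if run.is_scoring:
        run.push_scores()
    run.push_output()
    if run.is_scoring:
        run._update_status("Finished")
```

With `FakeRun(upload_ok=False)`, `finish` raises before `_update_status` is ever called, so the status stays at its previous value instead of becoming Finished without scores.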

Issues this PR resolves

Fixes #2423

Background

This bug was discovered during the EEG Foundation Challenge incident analysis (8,328 submissions, 51% failure rate). Analysis showed submissions stuck in FINISHED state with no scores in the database.

Checklist for hand testing

  • Create a competition with at least one scoring phase
  • Submit a valid submission
  • Verify submission reaches FINISHED status
  • Verify scores are present in the database and displayed on leaderboard
  • Test with network interruptions to verify retry logic

Checklist

  • Code review by me
  • Hand tested by me
  • I'm proud of my work
  • Code review by reviewer
  • Hand tested by reviewer
  • CircleCI tests are passing
  • Ready to merge

hananechrif and others added 2 commits June 22, 2026 11:11
…e upload

The compute_worker must upload scores before marking submission as FINISHED.
Previously, status was updated first, creating a race condition where the
leaderboard could read FINISHED submissions without scores.

Changes:
- Reorder run_wrapper: call push_scores() and push_output() before _update_status(FINISHED)
- Add retry logic in push_scores() with exponential backoff (3 attempts)
- Increase timeout to 30s for score uploads
Fix data integrity: ensure FINISHED status only after successful score upload
@Didayolo Didayolo self-assigned this Jun 26, 2026

Development

Successfully merging this pull request may close these issues.

Data integrity: score or leaderboard write fails after Finished status

2 participants