Add pre-flight BAM integrity check before scAbsolute calling#34

Merged: shafighi merged 1 commit into main from bam-preflight-validation on Jun 1, 2026

@shafighi shafighi commented Jun 1, 2026

New validate_bams rule runs samtools quickcheck on every BAM in the sample sheet before any qc / scale_scAbsolute job is scheduled. If any file is truncated or otherwise unreadable, the workflow aborts up front with the full list of bad files so the user can fix them all in one pass (typically by re-copying from the source) rather than discovering failures one cell at a time deep into a multi-hour run.
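The pre-flight check described above could be sketched roughly as follows. This is an illustration only, not the code from the PR: the function names `quickcheck_ok`, `find_bad_bams`, and `validate_or_abort` are hypothetical, and the real `validate_bams` rule is a Snakemake rule whose body is not shown here. The sketch assumes `samtools` is on `PATH` and relies only on `samtools quickcheck` returning a non-zero exit status for a file that fails its check.

```python
import subprocess
from typing import Callable, Iterable, List


def quickcheck_ok(bam_path: str) -> bool:
    """Return True if `samtools quickcheck` exits with status 0 for this file."""
    result = subprocess.run(
        ["samtools", "quickcheck", bam_path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_bad_bams(
    bam_paths: Iterable[str],
    check: Callable[[str], bool] = quickcheck_ok,
) -> List[str]:
    """Check every BAM and return all failures, rather than stopping at the first."""
    return [path for path in bam_paths if not check(path)]


def validate_or_abort(
    bam_paths: Iterable[str],
    check: Callable[[str], bool] = quickcheck_ok,
) -> None:
    """Abort up front with the full list of bad files, if there are any."""
    bad = find_bad_bams(bam_paths, check)
    if bad:
        raise SystemExit(
            "BAM pre-flight check failed for:\n" + "\n".join(bad)
        )
```

Collecting every failure before aborting is what lets the user fix all bad files in one pass. The `check` parameter is only there so the logic can be exercised without `samtools` installed.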

Failures are recorded in the SAME canonical CSV used by the downstream combine / merge step:

results/<binSize>/<sampleName>_<binSize>_failed_cells.csv

with a new failure_reason value "truncated_bam" alongside the existing "missing_output" and "process_crash". One file, one schema, one place to look regardless of which stage caught the failure.
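As a rough illustration of the "one file, one schema" idea, the sketch below appends failures from any stage to the CSV path given above. The column names `cell` and `failure_reason`, the helper names, and the `root` parameter are assumptions made for this example; the PR text names the file and the three `failure_reason` values but does not show the actual column layout.

```python
import csv
from pathlib import Path
from typing import Iterable

# The three values named in the PR description.
FAILURE_REASONS = {"truncated_bam", "missing_output", "process_crash"}

# Hypothetical column layout; the real schema is not shown in the PR text.
FIELDS = ["cell", "failure_reason"]


def failed_cells_path(root: str, sample_name: str, bin_size: str) -> Path:
    """Build <root>/<binSize>/<sampleName>_<binSize>_failed_cells.csv."""
    return Path(root) / str(bin_size) / f"{sample_name}_{bin_size}_failed_cells.csv"


def record_failures(
    root: str,
    sample_name: str,
    bin_size: str,
    cells: Iterable[str],
    reason: str,
) -> Path:
    """Append one row per failed cell, writing the header only for a new file."""
    if reason not in FAILURE_REASONS:
        raise ValueError(f"unknown failure_reason: {reason!r}")
    path = failed_cells_path(root, sample_name, bin_size)
    path.parent.mkdir(parents=True, exist_ok=True)
    needs_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if needs_header:
            writer.writeheader()
        for cell in cells:
            writer.writerow({"cell": cell, "failure_reason": reason})
    return path
```

Because every stage appends to the same file with the same columns, a downstream combine / merge step only has to read one CSV to see which cells failed and why.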

Also updates README.md to document the failure_reason vocabulary and to recommend running snakemake with --keep-going, so that single-cell failures later in the run do not abort the entire batch.

Copilot AI review requested due to automatic review settings June 1, 2026 13:49
@shafighi shafighi merged commit 73f9dff into main Jun 1, 2026
1 check failed

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
