RNome Minimal Modification Analysis Pipeline

Nextflow pipeline used for aligning nanopore modBAM files using minimap2 (preserving modification tags) and summarising modifications using modkit.

Data processing notes:

The output (raw) modBAMs from MinKNOW were first merged and sorted using samtools (1.19.1):

 samtools merge -@ 16 - data/bam_pass/*.bam | samtools sort -@ 16 -o PBG54229_pass_2aeee96b_4a0fe7c1_merged.bam
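Because every later step depends on the modification tags surviving, it can be worth spot-checking the merged file before running the pipeline. The helper below is a minimal sketch and not part of the repository: `count_untagged_reads` is a hypothetical name, and it assumes the tags are written in the current SAM spelling (`MM:Z:` and `ML:B:`).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): reads SAM records on stdin
# and prints how many of them lack an MM or an ML tag.
count_untagged_reads() {
    awk -F'\t' '
        /^@/ { next }                      # skip SAM header lines
        {
            mm = 0; ml = 0
            # Optional tags start at column 12 of a SAM record.
            for (i = 12; i <= NF; i++) {
                if ($i ~ /^MM:Z:/) mm = 1
                if ($i ~ /^ML:B:/) ml = 1
            }
            if (!(mm && ml)) missing++
        }
        END { print missing + 0 }
    '
}
```

For example, `samtools view PBG54229_pass_2aeee96b_4a0fe7c1_merged.bam | head -n 1000 | count_untagged_reads` checks the first 1000 records. A non-zero count is not necessarily an error, since some reads may legitimately carry no modification calls.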

Overview

The pipeline:

Converts modBAMs to FASTQ with samtools, preserving the RNA modification tags, and aligns the reads to a reference (genome or transcriptome) with minimap2.
Summarises RNA modifications (m6A, m5C, inosine and pseudouridine) on the merged BAMs with modkit pileup and writes them in bedRmod format.
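The two steps above can be sketched as plain shell functions. This is an illustration under stated assumptions, not the contents of main.nf: the function names are hypothetical, and the minimap2 presets are common choices for nanopore direct RNA reads, not necessarily the ones the pipeline uses. The tag-preserving part is the combination of `samtools fastq -T MM,ML` and `minimap2 -y`.

```shell
#!/usr/bin/env bash
# Sketch only: these presets are an assumption, not necessarily what main.nf uses.
minimap2_preset() {
    case "$1" in
        genome)        echo "-ax splice -uf -k14" ;;  # spliced alignment
        transcriptome) echo "-ax map-ont" ;;          # unspliced alignment
        *) echo "unknown reference_type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: align_modbam <reference_type> <reference> <in.bam> <out.bam>
align_modbam() {
    local preset
    preset=$(minimap2_preset "$1") || return 1
    # -T MM,ML writes the modification tags into the FASTQ header comment;
    # minimap2 -y copies that comment back into each SAM record as tags.
    # $preset is intentionally unquoted so it splits into separate options.
    samtools fastq -T MM,ML "$3" \
        | minimap2 $preset -y "$2" - \
        | samtools sort -o "$4" -
    samtools index "$4"
}

# Hypothetical wrapper: summarise_mods <aligned.bam> <reference.fa> <out.bed>
# Writes modkit's default bedMethyl table; conversion to bedRmod is not shown.
summarise_mods() {
    modkit pileup "$1" "$3" --ref "$2"
}
```

A call such as `align_modbam genome /path/to/GRCh38.fa.gz /path/to/sample1.bam sample1_gm.bam` mirrors one samplesheet row. How the pipeline turns the modkit output into bedRmod is not described in this README, so that step is left out.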

The pipeline is written to run on an HPC cluster or an AWS EC2 instance, with all data available locally on the server. It requires Nextflow (≥22.04) and Docker (the HPC run script uses Singularity instead).
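The requirements above can be checked before launching a run. The following is a minimal sketch, not something shipped with the repository: `version_ge` and `check_requirements` are hypothetical names, the comparison relies on `sort -V` being available, and the sketch accepts either Docker or Singularity as the container engine.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when dotted version A is >= B (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical pre-flight check; prints one message per unmet requirement.
check_requirements() {
    local nf_version status=0
    if ! command -v nextflow >/dev/null 2>&1; then
        echo "nextflow not found on PATH" >&2
        status=1
    else
        # `nextflow -version` prints a banner; take the first x.y.z-like token.
        nf_version=$(nextflow -version 2>/dev/null \
            | grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
        if ! version_ge "${nf_version:-0}" "22.04"; then
            echo "Nextflow >= 22.04 required, found ${nf_version:-unknown}" >&2
            status=1
        fi
    fi
    # Either container engine satisfies the requirement in this sketch.
    if ! command -v docker >/dev/null 2>&1 \
        && ! command -v singularity >/dev/null 2>&1; then
        echo "neither docker nor singularity found on PATH" >&2
        status=1
    fi
    return "$status"
}
```

Running `check_requirements && echo ready` on the target machine gives a quick go/no-go answer.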

Running the pipeline

The pipeline uses a samplesheet to specify its inputs: a sample_id, the reference type, the reference to align against and the path to the modBAM. Use unique sample_id values to avoid filename collisions. The same sample can be aligned to different references by giving it one row per reference, each with its own sample_id.

Example samplesheet.csv:

sample_id,reference_type,reference,bam
sample1_gm,genome,/path/to/GRCh38.fa.gz,/path/to/sample1.bam
sample1_tx,transcriptome,/path/to/gencode.fa.gz,/path/to/sample1.bam
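Since duplicate sample_id values lead to filename collisions, a quick check of the samplesheet before launching can save a failed run. The validator below is a hedged sketch: `validate_samplesheet` is a hypothetical helper, and it assumes that `genome` and `transcriptome` (the two values in the example) are the only valid reference_type entries.

```shell
#!/usr/bin/env bash
# Hypothetical validator: validate_samplesheet <samplesheet.csv>
# Prints one line per problem and returns non-zero if any were found.
validate_samplesheet() {
    awk -F',' '
        { sub(/\r$/, "") }                 # tolerate Windows line endings
        NR == 1 {
            if ($0 != "sample_id,reference_type,reference,bam") {
                print "line 1: unexpected header: " $0; bad = 1
            }
            next
        }
        $0 == "" { next }                  # ignore blank lines
        NF != 4 {
            print "line " NR ": expected 4 columns, found " NF; bad = 1
            next
        }
        # Assumption: only these two reference types are valid.
        $2 != "genome" && $2 != "transcriptome" {
            print "line " NR ": unknown reference_type: " $2; bad = 1
        }
        seen[$1]++ { print "line " NR ": duplicate sample_id: " $1; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

Usage: `validate_samplesheet samplesheet.csv && echo "samplesheet OK"`. The sketch does not check that the reference and BAM paths exist.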

Run:

nextflow run main.nf \
    -profile aws_ec2,docker \
    --samplesheet samplesheet.csv \
    --outdir results/

Convenience scripts are included to run the nextflow command on an EC2 instance (run_ec2.sh) or on an HPC cluster using the Slurm scheduler and Singularity (run_slurm.sh).

bash run_ec2.sh

sbatch run_slurm.sh

Output Structure

results/
├── alignment/              # Individual aligned BAMs
│   ├── sample1_gm.bam
│   └── sample1_tx.bam
├── bedRmod/               # Modification calls
│   ├── sample1_gm_bedRmod.bed.gz
│   └── sample1_tx_bedRmod.bed.gz
├── reference/             # Processed references
└── pipeline_info/         # Execution reports
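Given the layout above, a finished run can be checked against the samplesheet. The helper below is a sketch with a hypothetical name, `missing_outputs`, and it assumes that every row produces exactly the two files shown in the tree: `alignment/<sample_id>.bam` and `bedRmod/<sample_id>_bedRmod.bed.gz`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: missing_outputs <samplesheet.csv> <outdir>
# Lists expected result files that are absent; prints nothing when all exist.
missing_outputs() {
    local sheet=$1 outdir=${2%/} sample_id _rest f
    # Skip the header row, then read the first CSV column of each sample row.
    tail -n +2 "$sheet" | while IFS=, read -r sample_id _rest; do
        [ -n "$sample_id" ] || continue
        for f in "alignment/${sample_id}.bam" \
                 "bedRmod/${sample_id}_bedRmod.bed.gz"; do
            [ -e "$outdir/$f" ] || echo "$outdir/$f"
        done
    done
}
```

Usage: `missing_outputs samplesheet.csv results/` prints one path per missing file, so empty output means every expected file is present.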
