sedef_smk

Snakemake pipeline for running SEDEF on haplotype-resolved assemblies and generating final segmental duplication BED and BigBed files.

This workflow is adapted from DongAhn’s segmental duplication pipeline and is mainly intended for internal use in the lab.

Overview

This pipeline takes:

haplotype FASTA files
RepeatMasker BED
TRF BED

and performs:

FASTA preparation and indexing
WindowMasker
repeat BED processing
FASTA masking
SEDEF
downstream filtering
final BED and BigBed generation

Input

The pipeline expects a tab-delimited manifest.tab file in the working directory.

Required columns:

SAMPLE
H1
H2
RM
TRF

Example:

SAMPLE	H1	H2	RM	TRF
ASM	ASM/hap1.fasta	ASM/hap2.fasta	ASM/RM.bed.gz	ASM/trf.bed.gz
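
Because a stray space in place of a tab silently shifts columns, it can help to check the manifest before launching the workflow. The following is a minimal sketch, not part of the pipeline itself: the function name `read_manifest` is hypothetical, and the only assumptions taken from this README are the tab delimiter and the five required column names.

```python
import csv
import io

# Column names as listed under "Required columns" in this README.
REQUIRED_COLUMNS = ["SAMPLE", "H1", "H2", "RM", "TRF"]


def read_manifest(handle):
    """Parse a tab-delimited manifest and return one dict per sample row.

    Hypothetical helper for illustration; raises ValueError when a required
    column is missing from the header or empty in a row.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        raise ValueError("manifest is missing columns: " + ", ".join(missing))

    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        empty = [c for c in REQUIRED_COLUMNS if not (row.get(c) or "").strip()]
        if empty:
            raise ValueError(f"line {line_no}: empty value for " + ", ".join(empty))
        rows.append({c: row[c].strip() for c in REQUIRED_COLUMNS})
    return rows


# The example manifest from this README, written with explicit tabs.
example = (
    "SAMPLE\tH1\tH2\tRM\tTRF\n"
    "ASM\tASM/hap1.fasta\tASM/hap2.fasta\tASM/RM.bed.gz\tASM/trf.bed.gz\n"
)
manifest_rows = read_manifest(io.StringIO(example))
```

To check a real file, open it and pass the handle, for example `read_manifest(open("manifest.tab", newline=""))`. A row whose first two fields are separated by spaces instead of a tab ends up one field short and is reported as having an empty TRF value.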

Output

Outputs are written under:

results/{sample}/

Main final outputs:

results/{sample}/final_outputs/beds/{hap}.SDs.bed
results/{sample}/final_outputs/beds/{hap}.SDs.lowid.bed
results/{sample}/final_outputs/bigbeds/{hap}.SDs.bb
results/{sample}/final_outputs/bigbeds/{hap}.SDs.lowid.bb
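
As a rough illustration of how the path patterns above expand, the sketch below fills in `{sample}` and `{hap}` for a list of samples and haplotype labels. The helper names `expected_outputs` and `missing_outputs` are hypothetical, and the labels `H1` and `H2` are an assumption borrowed from the manifest column names; the actual values of the `{hap}` wildcard are defined by the workflow and may differ.

```python
import os

# Path patterns copied from the "Main final outputs" list in this README.
FINAL_PATTERNS = [
    "results/{sample}/final_outputs/beds/{hap}.SDs.bed",
    "results/{sample}/final_outputs/beds/{hap}.SDs.lowid.bed",
    "results/{sample}/final_outputs/bigbeds/{hap}.SDs.bb",
    "results/{sample}/final_outputs/bigbeds/{hap}.SDs.lowid.bb",
]


def expected_outputs(samples, haps):
    """Return every final output path for each sample/haplotype pair."""
    return [
        pattern.format(sample=sample, hap=hap)
        for sample in samples
        for hap in haps
        for pattern in FINAL_PATTERNS
    ]


def missing_outputs(samples, haps, root="."):
    """Return the expected paths that do not exist under the given root."""
    return [
        path
        for path in expected_outputs(samples, haps)
        if not os.path.exists(os.path.join(root, path))
    ]


# Assumed haplotype labels; replace with the labels the workflow really uses.
paths = expected_outputs(["ASM"], ["H1", "H2"])
```

After a run, calling `missing_outputs` with the sample names from manifest.tab gives a quick list of final files that were not produced.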

Run

Local

./runlocal 8

Cluster

./runcluster 30

Notes

This pipeline is mainly intended for the Eichler lab compute environment.
Contig names containing # are temporarily rewritten during intermediate steps and restored in final outputs.
RM and TRF are expected to be RepeatMasker and TRF BED outputs, which can be generated by the Rhodonite annotation workflow (https://github.com/vollgerlab/Rhodonite).
Sample and haplotype expansion is driven by manifest.tab.
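
The note above about contig names containing # describes a reversible rename. The sketch below shows one way such a round trip could work; it is not the pipeline's actual implementation. The placeholder string `__HASH__`, the function names, and the choice to rewrite only the first BED column by default are all assumptions made for illustration.

```python
def make_safe_names(contigs, placeholder="__HASH__"):
    """Map each original contig name to a temporary name without '#'.

    The placeholder is hypothetical; any string that cannot occur in a real
    contig name would do. Raises ValueError if two names would collide.
    """
    forward = {name: name.replace("#", placeholder) for name in contigs}
    if len(set(forward.values())) != len(forward):
        raise ValueError("temporary contig names collide")
    return forward


def rename_bed_line(line, mapping, columns=(0,)):
    """Rewrite the contig fields of one tab-delimited BED line.

    Only the column indices in `columns` are touched; names that are not in
    the mapping are left unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    for index in columns:
        if index < len(fields):
            fields[index] = mapping.get(fields[index], fields[index])
    return "\t".join(fields)


# Forward mapping for intermediate steps, inverse mapping for final outputs.
forward = make_safe_names(["sample#1#contig1", "contig2"])
restore = {safe: original for original, safe in forward.items()}

temp_line = rename_bed_line("sample#1#contig1\t100\t200", forward)
final_line = rename_bed_line(temp_line, restore)
```

Keeping an explicit mapping, rather than reversing the string substitution, means the original names are restored exactly even if a contig name happens to contain the placeholder text.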

Repository structure

.
├── Snakefile
├── runlocal
├── runcluster
├── rules/
├── scripts/
├── schema/
└── resources/
