TAFFISH app for StringTie, a transcript assembly and quantification tool for RNA-seq alignments.
- name: stringtie
- command: taf-stringtie
- version: 3.0.3-r2
- kind: tool
- image: ghcr.io/taffish/stringtie:3.0.3-r2
- upstream: StringTie 3.0.3
- runtime version: 3.0.3
- default command: stringtie
- native platforms: linux/amd64
This app builds StringTie from the official stringtie-3.0.3.offline.tar.gz
source package, which includes the upstream test data and bundled build-time
source dependencies. The main stringtie binary and the Python 3-compatible
prepDE.py3 helper are installed in PATH.
taf install stringtie
TAFFISH wrapper help:
taf-stringtie --help
taf-stringtie --version
taf-stringtie --compile
Upstream StringTie help and version:
taf-stringtie -- -h
taf-stringtie -- --version
taf-stringtie stringtie --version
Assemble transcripts from a coordinate-sorted BAM:
taf-stringtie -p 8 -o sample.gtf sample.sorted.bam
Use a guide annotation:
taf-stringtie -p 8 -G reference.gtf -o sample.guided.gtf sample.sorted.bam
Estimate expression against a guide annotation:
taf-stringtie -e -B -G reference.gtf -o ballgown/sample.gtf sample.sorted.bam
taf-stringtie prepDE.py3 -i samples.txt
Long-read, mixed-read, nascent-aware, and merge examples:
taf-stringtie -L -o long.gtf long.sorted.bam
taf-stringtie --mix -o mix.gtf short.sorted.bam long.sorted.bam
taf-stringtie -N -G reference.gtf -o total-rna.mature.gtf sample.sorted.bam
taf-stringtie --nasc -G reference.gtf -o total-rna.with-nascent.gtf sample.sorted.bam
taf-stringtie --merge -G reference.gtf -o merged.gtf gtf_list.txt
taf-stringtie defaults to the upstream stringtie command. Because StringTie
uses option-led arguments, common calls such as taf-stringtie -o out.gtf input.bam are passed directly to StringTie.
Command mode remains enabled. If the first argument is another executable name, TAFFISH runs that command in the same container:
taf-stringtie stringtie --version
taf-stringtie prepDE.py3 -h
taf-stringtie prepDE.py -h
The historical prepDE.py command name is provided as a symlink to the
Python 3-compatible prepDE.py3, because upstream's original prepDE.py
script has a Python 2 shebang and Python 2 is not included in Debian 12.
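The command-mode rule described above (option-led arguments go to the default stringtie command, a leading executable name selects that command) can be pictured as a first-argument check. The sketch below only illustrates that rule; it is not TAFFISH's actual dispatcher. The function name `taf_dispatch_mode` and the `KNOWN_COMMANDS` list are hypothetical, with the list copied from the runtime commands named in this README.

```shell
# Illustration only: mimics the documented rule, not TAFFISH's real dispatcher.
# KNOWN_COMMANDS is an assumption taken from the runtime commands in this README.
KNOWN_COMMANDS="stringtie prepDE.py3 prepDE.py python3 bash diff"

# Print "command" when the first argument names a known executable,
# otherwise "default" (the arguments would go to the stringtie default command).
taf_dispatch_mode() {
  local first="${1-}" name
  for name in $KNOWN_COMMANDS; do
    if [ "$first" = "$name" ]; then
      echo "command"
      return 0
    fi
  done
  echo "default"
}
```

Under this sketch, `taf_dispatch_mode -o out.gtf input.bam` prints `default`, while `taf_dispatch_mode prepDE.py3 -h` prints `command`.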
The Dockerfile verifies this upstream archive checksum:
stringtie-3.0.3.offline.tar.gz
884bc6523c4af0d7b05518db5bbcf38e39d1804be9b3f8ee01d7078712b1262a
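For readers who download the archive themselves, the same digest can be checked locally. This is a minimal sketch and not the Dockerfile's own verification step, which is not reproduced here. It assumes `sha256sum` from GNU coreutils (or a compatible implementation) is installed; `verify_sha256` is a hypothetical helper name.

```shell
# Sketch: verify a file against an expected SHA-256 digest.
# verify_sha256 is a hypothetical helper; sha256sum is assumed to be installed.
verify_sha256() {
  local file="$1" expected="$2"
  # sha256sum -c reads "<digest>  <path>" lines from stdin and exits
  # non-zero when the computed digest does not match.
  printf '%s  %s\n' "$expected" "$file" | sha256sum -c >/dev/null 2>&1
}
```

A call such as `verify_sha256 stringtie-3.0.3.offline.tar.gz 884bc6523c4af0d7b05518db5bbcf38e39d1804be9b3f8ee01d7078712b1262a && echo OK` prints `OK` only when the digest matches.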
Installed runtime commands include:
- stringtie
- prepDE.py3
- prepDE.py as a compatibility symlink to prepDE.py3
- python3, bash, and diff
The image also keeps the upstream StringTie test data under:
/opt/stringtie/tests
The stringtie binary is built with upstream's bundled HTSlib-related source
components from the offline package. Runtime dynamic dependencies are kept
small and include libstdc++6 and zlib1g.
This release is a native linux/amd64 package. The upstream offline build
currently drives bundled HTSlib with x86-specific compiler flags such as
-msse4.1, -mssse3, and -mpopcnt, so native linux/arm64 builds are not
declared for this app.
The TAFFISH wrapper embeds Docker and Podman --platform linux/amd64 runtime
arguments. On an arm64 host with Docker or Podman emulation configured, normal
TAFFISH calls still run the amd64 image through container emulation. This is
not the same as native arm64 support. Apptainer compatibility depends on the
host and image execution setup.
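To make the native-versus-emulated distinction concrete, a host can be classified by its machine name before running the app. The sketch below is an assumption-laden convenience, not part of TAFFISH: `host_run_mode` is a hypothetical helper, and it only inspects `uname -m`; it does not check whether Docker or Podman emulation is actually configured.

```shell
# Sketch: classify how the linux/amd64 image would run on a given host.
# host_run_mode is a hypothetical helper; it looks only at the machine name.
host_run_mode() {
  local arch="${1:-$(uname -m)}"
  case "$arch" in
    x86_64|amd64)  echo "native" ;;    # matches the declared native platform
    aarch64|arm64) echo "emulated" ;;  # would need container emulation
    *)             echo "unknown" ;;
  esac
}
```

Calling `host_run_mode` with no argument uses the current host; passing a name such as `aarch64` classifies that architecture instead.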
StringTie expects coordinate-sorted SAM, BAM, or CRAM alignments from RNA-seq
read mapping. Spliced short-read alignments generally need XS strand tags;
long-read alignments from minimap2 can use ts tags. CRAM input can use
--ref or --cram-ref to provide the reference FASTA.
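Since coordinate sorting is a precondition, it can help to check the alignment header before assembly. The sketch below reads SAM header text on standard input and looks for `SO:coordinate` on the `@HD` line, as defined by the SAM format. `is_coordinate_sorted` is a hypothetical helper name, and producing the header requires a tool such as samtools, which is not bundled in this image.

```shell
# Sketch: succeed only if the SAM header on stdin declares coordinate sorting.
# is_coordinate_sorted is a hypothetical helper; the header is tab-separated.
is_coordinate_sorted() {
  awk -F'\t' '
    $1 == "@HD" {
      for (i = 2; i <= NF; i++) if ($i == "SO:coordinate") ok = 1
    }
    END { exit !ok }   # exit 0 when SO:coordinate was seen, 1 otherwise
  '
}
```

With samtools available separately, a typical call would be `samtools view -H sample.sorted.bam | is_coordinate_sorted && echo sorted`.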
Typical outputs are GTF files containing assembled transcripts and expression
attributes such as coverage, FPKM, and TPM. With -A, StringTie writes gene
abundance tables. With -B or -b, it writes Ballgown table files. When
StringTie is run with -e -G, prepDE.py3 can convert the resulting GTF
outputs into gene and transcript count matrices.
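The `samples.txt` file passed to `prepDE.py3 -i` lists one sample per line as a sample ID followed by the path to its GTF, separated by whitespace. The sketch below generates such a list. It assumes a per-sample layout of `<root>/<sample>/<sample>.gtf`, which is an illustrative choice rather than something this app enforces, and `make_samples_list` is a hypothetical helper name. Paths must not contain whitespace.

```shell
# Sketch: print "<sample_id> <gtf_path>" for every <root>/<sample>/*.gtf.
# make_samples_list and the directory layout are assumptions for illustration.
make_samples_list() {
  local root="$1" gtf
  for gtf in "$root"/*/*.gtf; do
    [ -f "$gtf" ] || continue   # skip the literal pattern when nothing matches
    # The sample ID is taken from the name of the directory holding the GTF.
    printf '%s %s\n' "$(basename "$(dirname "$gtf")")" "$gtf"
  done
}
```

For example, `make_samples_list ballgown > samples.txt` followed by `taf-stringtie prepDE.py3 -i samples.txt` would feed the generated list to the helper script.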
This core StringTie app does not bundle read aligners or preprocessing tools such as HISAT2, STAR, minimap2, GMAP, or samtools. Those tools can be run as separate TAFFISH apps before StringTie. StringTie itself does not call them during normal assembly or quantification.
The optional upstream SuperReads_RNA module is not built or exposed in this
core image. That module has its own installation path and runtime dependencies
including Python, Perl, HISAT2, GMAP, and samtools; it is better handled as a
separate app or flow if needed.
The smoke tests validate representative local runs on upstream test BAM/GFF fixtures. They do not replace full biological validation on large RNA-seq projects, remote CRAM/URL access testing, or aligner/index preparation.
The TAFFISH app packaging files are licensed under Apache-2.0. The packaged upstream StringTie software is released under the MIT License. The offline source package also contains bundled third-party source components used during compilation; see the upstream package for their notices. Bundled third-party components, datasets, models, and external resources keep their own license terms.
Please cite the relevant StringTie publications:
- Pertea et al. 2015, Nature Biotechnology. DOI: 10.1038/nbt.3122; PMID:25690850
- Pertea et al. 2016, Nature Protocols. DOI: 10.1038/nprot.2016.095; PMID:27560171
- Kovaka et al. 2019, Genome Biology. DOI: 10.1186/s13059-019-1910-1; PMID:31842956
- Shumate et al. 2022, PLOS Computational Biology. DOI: 10.1371/journal.pcbi.1009730; PMID:35793599
- Shinder et al. 2025, StringTie 3 preprint. DOI: 10.1101/2025.05.21.655404
Smoke tests cover:
- version binding and dynamic-library checks
- upstream help for --merge, --mix, and nascent-aware options
- prepDE.py3 and the compatibility prepDE.py name
- short-read assembly
- long-read guided assembly
- mixed-read and nascent-aware modes
- transcript merge mode
- expression-only/Ballgown output and prepDE.py3 count-matrix generation