Commit ffcb905

Rust rewrite
1 parent 7e15d40 commit ffcb905

1 file changed

Lines changed: 6 additions & 6 deletions

README.md

@@ -2,16 +2,17 @@
 
 # Seqsum
 
-Robust checksums for nucleotide sequences. Accepts input from either standard input or `fast[a|q][.gz|.zst|.xz|.bz2]` files. Generates individual checksums for each sequence, plus an aggregate checksum for a collection. Warnings are shown for duplicate sequences and within-collection checksum collisions at the selected bit depth. Sequences are uppercased before hashing with [rapidhash](https://github.com/Nicoshev/rapidhash) (`v3`) and may be normalised (with `-n`) to use only `ACGTN-`. Read IDs and FASTQ base quality scores do not inform the checksum. Output is tab-delimited text to stdout.
+> [!WARNING]
+> Seqsum was rewritten in Rust in 0.3.0. The original Python version of seqsum and how to use is archived in the [`python`](https://github.com/bede/seqsum/tree/python) branch. It remains available on PyPI.
 
-By default, seqsum outputs individual checksums and, when there is more than one sequence, an aggregate checksum. This can be modified with `--individual` (`-i`) or `--aggregate` (`-a`).
+Robust checksums for nucleotide sequences. Accepts input from either standard input or `fast[a|q][.gz|.zst]` files. Generates *individual* checksums for each sequence, plus an *aggregate* checksum for a collection. Warnings are shown for duplicate sequences and within-collection checksum collisions at the selected bit depth. Sequences are uppercased before hashing with [RapidHash](https://github.com/Nicoshev/rapidhash) (v3) and may be normalised (with `-n`) to use only `ACGTN-`. Read IDs and FASTQ base quality scores do not inform the checksum. Output is tab-delimited text to stdout.
 
-Uses [`paraseq`](https://github.com/mbhall88/paraseq) for efficient FASTA/FASTQ parsing.
+By default, seqsum outputs individual checksums and, when there is more than one sequence, an aggregate checksum. This can be modified with `--individual` (`-i`) or `--aggregate` (`-a`).
 
 ## Install
 
 ```bash
-cargo install --path .
+cargo install seqsum
 ```
 
 ## Development
@@ -20,8 +21,6 @@ cargo install --path .
 git clone https://github.com/bede/seqsum.git
 cd seqsum
 cargo test
-cargo fmt --all --check
-cargo clippy --all-targets -- -D warnings
 ```
 
 ## Command line usage
@@ -52,3 +51,4 @@ $ cat tests/data/MN908947.fasta | seqsum -
 ```bash
 $ seqsum -h
 ```
+
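
For readers of this diff, the checksum pipeline described in the README text (uppercase, optionally normalise to `ACGTN-`, hash each sequence, then combine) can be sketched roughly as follows. This is an illustration under stated assumptions, not seqsum's code: the standard library's `DefaultHasher` stands in for RapidHash v3, normalisation is assumed to replace any other character with `N`, and the aggregate is assumed to hash the sorted individual checksums. The function names `prepare`, `individual_checksum`, and `aggregate_checksum` are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Uppercase a sequence and, when `normalise` is set, replace every byte
/// outside `ACGTN-` with `N` (an ASSUMED reading of the `-n` option).
fn prepare(seq: &[u8], normalise: bool) -> Vec<u8> {
    seq.iter()
        .map(|b| b.to_ascii_uppercase())
        .map(|b| match b {
            b'A' | b'C' | b'G' | b'T' | b'N' | b'-' => b,
            _ if normalise => b'N',
            _ => b,
        })
        .collect()
}

/// Stand-in 64-bit hash using std's `DefaultHasher`. This is NOT RapidHash v3,
/// so the values will not match seqsum's output.
fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Checksum of a single sequence; IDs and quality scores are never passed in.
fn individual_checksum(seq: &[u8], normalise: bool) -> u64 {
    hash_bytes(&prepare(seq, normalise))
}

/// HYPOTHETICAL aggregate: sort the individual checksums so that input order
/// does not matter, then hash their big-endian bytes.
fn aggregate_checksum(checksums: &[u64]) -> u64 {
    let mut sorted = checksums.to_vec();
    sorted.sort_unstable();
    let bytes: Vec<u8> = sorted.iter().flat_map(|c| c.to_be_bytes()).collect();
    hash_bytes(&bytes)
}

fn main() {
    let seqs: [&[u8]; 2] = [b"acgtn-", b"ACGTRY"];
    let sums: Vec<u64> = seqs.iter().map(|s| individual_checksum(s, true)).collect();
    // Tab-delimited output, one line per sequence plus an aggregate line.
    for (i, sum) in sums.iter().enumerate() {
        println!("seq{}\t{:016x}", i + 1, sum);
    }
    println!("aggregate\t{:016x}", aggregate_checksum(&sums));
}
```

Sorting before aggregating is only one way to make a collection checksum independent of record order; whether seqsum does this is not stated in the diff.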
