The Big Book of Compression Algorithms

Everything compresses differently.

A survey of the major algorithms — Huffman to Zstandard, lossless to lossy, zip files to video codecs — covering mechanics, tradeoffs, and ideal contexts. Closes with a bonus section on why LLMs are, at their core, just really expensive lossy compressors.
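
To make the tagline concrete, here is a minimal sketch (not taken from the book) that runs a single general-purpose compressor over three kinds of input. It assumes Python's standard `zlib` module. The helper name `compression_ratio` and the sample inputs are hypothetical, chosen only for illustration.

```python
import random
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower means smaller output)."""
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical sample inputs; a fixed seed keeps the run repeatable.
rng = random.Random(0)
samples = {
    # One short pattern repeated many times: ideal for LZ77-style matching.
    "repetitive": b"abc" * 10_000,
    # Random draws from a 4-symbol alphabet: about 2 bits of entropy per byte.
    "four-symbol": bytes(rng.choice(b"ACGT") for _ in range(30_000)),
    # Uniformly random bytes: essentially incompressible.
    "random": bytes(rng.getrandbits(8) for _ in range(30_000)),
}

ratios = {name: compression_ratio(data) for name, data in samples.items()}

for name, ratio in ratios.items():
    print(f"{name:>12}: {ratio:.3f}")
```

The repetitive input should shrink to a tiny fraction of its size, the four-symbol input to roughly a quarter or a bit more, and the random input should not shrink at all. Exact figures depend on the zlib build and compression level.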

Read Online

The book is published at: https://cloudstreet-dev.github.io/The-Big-Book-of-Compression-Algorithms/

What's Inside

Part I — Foundations

Chapter | Topic
What is Compression? | Entropy, Shannon's theorem, lossless vs lossy
Huffman Coding | Frequency trees, prefix codes, canonical Huffman
Arithmetic Coding | Interval subdivision, range coding, ANS/FSE

Part II — Dictionary and String Methods

Chapter | Topic
LZ77, LZ78, LZW | Sliding windows, growing dictionaries, how ZIP works
Deflate, Brotli, Zstandard | Why Zstd won, tunable tradeoffs
LZMA and 7-Zip | Ultra-high compression, Markov models, range coding

Part III — Specialized and Simple Methods

Chapter | Topic
RLE and Simple Methods | RLE, delta encoding, BWT, bzip2 pipeline
Image Compression | PNG, JPEG, WebP, AVIF: what you lose and when it matters
Audio Compression | FLAC, MP3, AAC, Opus: psychoacoustics and bitrate tradeoffs

Part IV — Systems and Scale

Chapter | Topic
Video Compression | H.264, H.265, AV1: inter-frame prediction, why video is different
Database and Columnar Compression | Snappy, LZ4, Parquet, dictionary/RLE/delta encodings
Network Compression | HTTP compression, HPACK, when to compress in transit vs at rest

Part V — Making Decisions

Chapter | Topic
Choosing the Right Algorithm | Decision framework, speed vs ratio vs compatibility

Bonus

Chapter | Topic
LLMs as Lossy Compressors | Training as compression, inference as decompression, hallucinations as artifacts

Building Locally

Requires mdBook:

cargo install mdbook
mdbook build     # outputs to ./book/
mdbook serve     # serves at http://localhost:3000 with live reload

About

Published by CloudStreet. Written for developers who want to understand not just which algorithm to use, but why it works and where it breaks down.

Licensed under CC BY 4.0.
