Add compression-oriented function reordering pass #8696
Implement the --reorder-functions-by-similarity optimization pass in wasm-opt.

Gzip and Brotli compression algorithms rely on finding repeated byte patterns inside a sliding window (e.g., 32 KB for Gzip). If structurally similar functions are placed far apart in the Wasm binary, the compressor cannot detect matches across them. The existing --reorder-functions pass sorts functions strictly by call frequency to shrink LEB128 indexes, but in doing so it scatters mutually compressible functions and ultimately increases gzipped delivery sizes.

This new pass traverses defined function bodies in post-order and extracts a similarity sorting key based on signature type IDs, local variable types, and structural opcode sequences. Sorting defined functions lexicographically by this key physically groups structurally similar functions together in the output binary, so the compressor sees matching bytes close to each other.

Empirical benchmarks on real-world Flutter and Poppler Wasm examples show a significant improvement, saving up to 2.13% and 0.98% in compressed delivery size compared to the baseline (no reordering).
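The key-and-sort idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `FuncInfo`, `similarityKey`, and `orderBySimilarity` are hypothetical names, and the byte layout of the key (signature ID, then local types, then opcodes) is an assumption made only for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a defined function; not a Binaryen type.
struct FuncInfo {
  std::string name;
  uint32_t signatureId;            // assumed numeric ID of the signature type
  std::vector<uint8_t> localTypes; // assumed one byte per local's type
  std::vector<uint8_t> opcodes;    // opcode bytes from a post-order walk
};

// Build the sorting key: signature ID (big-endian), local types, opcodes.
std::vector<uint8_t> similarityKey(const FuncInfo& f) {
  std::vector<uint8_t> key;
  for (int shift = 24; shift >= 0; shift -= 8) {
    key.push_back(static_cast<uint8_t>(f.signatureId >> shift));
  }
  key.insert(key.end(), f.localTypes.begin(), f.localTypes.end());
  key.insert(key.end(), f.opcodes.begin(), f.opcodes.end());
  return key;
}

// Return function names in the order they would be emitted.
std::vector<std::string> orderBySimilarity(const std::vector<FuncInfo>& funcs) {
  std::vector<std::pair<std::vector<uint8_t>, std::string>> keyed;
  for (const auto& f : funcs) {
    keyed.emplace_back(similarityKey(f), f.name);
  }
  // Comparing only the keys; std::vector compares lexicographically.
  std::stable_sort(keyed.begin(), keyed.end(),
                   [](const auto& a, const auto& b) { return a.first < b.first; });
  std::vector<std::string> names;
  for (const auto& entry : keyed) {
    names.push_back(entry.second);
  }
  return names;
}
```

With three functions where two share a signature, local types, and an opcode prefix, those two end up adjacent regardless of their original positions.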
Below is a comparison of the uncompressed and gzip-compressed binary sizes for both configurations. I think there are still some tweaks we can make. I've been able to get 2% savings on some files, but the pass does not do as well on others (I still need to figure out why).
tlively left a comment
Mostly comments on algorithmic improvements. Let me know if you'd rather land as-is to get the measured benefit without investing more time in algorithmic improvements and I can review with that in mind.
// Capture important immediate type/operator information
// TODO: There's probably more data that would be useful to capture.
You could probably extract and reuse the HashStringifyWalker from Outlining.cpp. It turns expression trees into strings by shallowly hashing each expression, including all of its immediates. You would just want it to use a normal PostWalker (but probably modified to also call addUniqueSymbol at control flow boundaries, e.g. end and else) instead of the custom StringifyWalker it currently uses. Nothing a little extra templating can't solve!
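To illustrate what "shallowly hashing each expression" in post-order means, here is a toy sketch. It does not use Binaryen's `Expression`, `PostWalker`, or `HashStringifyWalker`; the `Expr` struct, `makeExpr`, `shallowHash`, and `stringify` are hypothetical names invented for this example, and the FNV-style mixing is just one possible choice of hash.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Toy expression node (hypothetical; not Binaryen's Expression).
struct Expr {
  uint32_t opcode = 0;
  uint64_t immediate = 0; // e.g. a constant value or a local index
  std::vector<std::unique_ptr<Expr>> children;
};

std::unique_ptr<Expr> makeExpr(uint32_t opcode, uint64_t immediate) {
  auto e = std::make_unique<Expr>();
  e->opcode = opcode;
  e->immediate = immediate;
  return e;
}

// Shallow hash: mixes the node's own opcode and immediate, never its children.
uint64_t shallowHash(const Expr& e) {
  uint64_t h = 1469598103934665603ULL; // FNV-1a offset basis
  h = (h ^ e.opcode) * 1099511628211ULL;
  h = (h ^ e.immediate) * 1099511628211ULL;
  return h;
}

// Post-order walk: children first, then the node itself, one symbol per node.
void stringify(const Expr& e, std::vector<uint64_t>& out) {
  for (const auto& child : e.children) {
    stringify(*child, out);
  }
  out.push_back(shallowHash(e));
}
```

Because each symbol depends only on one node, two functions that differ in a single constant produce strings that differ in a single position, which keeps the rest of the string available for matching.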
// does not help and can regress size due to breaking natural call
// proximity.
Not call proximity, but LEB size, right?
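For context on the LEB size point: function indices in `call` instructions are encoded as unsigned LEB128, so an index below 128 takes one byte while larger indices take two or more. A small helper (the name `uleb128Size` is made up for this sketch) shows where the boundaries fall:

```cpp
#include <cstddef>
#include <cstdint>

// Number of bytes needed to encode `value` as unsigned LEB128.
// Each output byte carries 7 payload bits.
std::size_t uleb128Size(uint32_t value) {
  std::size_t bytes = 1;
  while (value >= 0x80) {
    value >>= 7;
    ++bytes;
  }
  return bytes;
}
```

Moving a frequently called function from index 127 to index 128 therefore costs one extra byte at every call site, which is the size effect that reordering can disturb.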
size_t numThreads = ThreadPool::get()->size();
std::vector<std::function<ThreadWorkState()>> doWorkers;
I don't think we have any other passes that use ThreadPool directly. This is typically done by ParallelFunctionAnalysis or with a nested Pass for which isFunctionParallel() returns true.
ThreadPool::get()->work(doWorkers);

// 3. Sort defined functions by the similarity heuristic
std::sort(keys.begin(), keys.end());
Sorting only works when the similarities are at the beginning of the strings, right? It seems like looking for matching substrings would be more robust. You could check out what Outlining.cpp does with a suffix tree to find common substrings, for example.
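A small sketch of why prefix-driven sorting can miss matches. It uses plain strings as stand-ins for the similarity keys and a brute-force dynamic-programming longest-common-substring instead of a suffix tree, so it only illustrates the concern and is not what Outlining.cpp does; both function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the shared prefix, which is all a lexicographic sort can exploit.
std::size_t commonPrefixLength(const std::string& a, const std::string& b) {
  std::size_t n = std::min(a.size(), b.size());
  std::size_t i = 0;
  while (i < n && a[i] == b[i]) {
    ++i;
  }
  return i;
}

// Length of the longest common substring via O(|a| * |b|) dynamic programming.
std::size_t longestCommonSubstring(const std::string& a, const std::string& b) {
  std::vector<std::size_t> prev(b.size() + 1, 0);
  std::vector<std::size_t> cur(b.size() + 1, 0);
  std::size_t best = 0;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    for (std::size_t j = 1; j <= b.size(); ++j) {
      // Extend the match ending at (i-1, j-1), or reset on a mismatch.
      cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
      best = std::max(best, cur[j]);
    }
    std::swap(prev, cur);
  }
  return best;
}
```

For keys such as `"Xabcdef"`, `"Yabcdef"`, and `"Xzzzzzz"`, a lexicographic sort puts `"Xzzzzzz"` between the other two even though they share a six-character substring and it shares only one character with either.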
I assume the background here is #4322? Some prior work is there.
No, though I did find that after starting this. A while ago I was playing with compressed wat vs wasm with brotli/gzip and added a note to try reordering for gzip. I haven't tried out the idea from cromulate. I was also going to ask if you still have your