Skip to content

Optimize VBA compression architecture and preserve benchmark gains #20

Description

@jozefizso

Context

The compression library currently has two useful reference points:

  • Old algorithm: benchmark branch
  • New algorithm: feature/architecture_change branch

The benchmark project now exercises a corpus of real VBA source files from the ExcelVBA sample repository, plus repeated and low-compressibility variants. This gives us a more realistic signal than only synthetic byte arrays.

Current architecture analysis

Old algorithm

The old compression path spends most of its time in copy-token discovery. For each position in the decompressed chunk, it searches backwards through previously seen bytes to find the best matching sequence. This makes compression cost grow quickly as the chunk gets larger or as the input contains many partial matches.

The old approach also creates more short-lived objects and temporary buffers during tokenization and decompression. This increases managed allocation pressure and causes more GC work during larger inputs.

New algorithm

The new branch changes the hot path in two important ways:

  1. It introduces indexed match lookup through a MatchFinder that keys candidate positions by a 3-byte prefix.
  2. It adds a fast path for repeated-byte runs, where the best copy token can be found without walking the general candidate chain.

For decompression, the new branch writes output directly into a byte list. Copy tokens append from already decompressed output instead of seeking through a stream and allocating temporary copied sequences.

The result is a much better balance: compression is substantially faster, and managed allocations are lower across every benchmarked input.

Benchmark comparison

Positive deltas mean the new algorithm is better.

Compression

Input Old algorithm mean New algorithm mean Speed delta Allocation delta
ExcelVbaCorpus 25.970 ms 2.009 ms +92.3% +63.4% less
ExcelVbaCorpusRepeated 163.750 ms 15.978 ms +90.2% +59.1% less
ExcelVbaCorpusWithNoise 448.400 ms 26.664 ms +94.1% +66.2% less
ExcelVbaLargestModuleRepeated 151.270 ms 12.019 ms +92.1% +61.8% less
LowCompressibility 534.190 ms 16.495 ms +96.9% +19.1% less

Decompression

Input Old algorithm mean New algorithm mean Speed delta Allocation delta
ExcelVbaCorpus 894.200 us 749.380 us +16.2% +15.9% less
ExcelVbaCorpusRepeated 7.737 ms 6.891 ms +10.9% +13.7% less
ExcelVbaCorpusWithNoise 25.995 ms 20.099 ms +22.7% +6.7% less
ExcelVbaLargestModuleRepeated 7.853 ms 6.264 ms +20.2% +13.2% less
LowCompressibility 113.700 us 57.320 us +49.6% +24.7% less

Proposed changes

  • Keep the indexed match-finder architecture for compression.
  • Keep the repeated-byte run fast path because it gives a large win for repetitive data and low-compressibility test cases.
  • Keep decompression writing into a contiguous byte buffer rather than using stream seeking and temporary copy arrays.
  • Add or preserve regression tests for malformed compressed data, copy-token bounds, and decompression equivalence.
  • Keep the benchmark project as part of the repository so future compression changes can be compared against the real VBA corpus.
  • Consider adding a longer benchmark job for release/performance validation, while keeping ShortRun useful for quick local checks.

Notes

The benchmark values above came from BenchmarkDotNet ShortRun, so exact percentages may vary between machines or runs. The compression improvement is large enough that the direction is clear: the new architecture should be preferred, provided compatibility tests continue to pass.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions