
perf(level22): complete donor parity path #110

Open
polaz wants to merge 18 commits into main from perf/level22-donor-parity

Conversation


@polaz polaz commented Apr 30, 2026

Summary

  • port level 22 donor BT/optimal path, entropy handling, and header/block behavior into the Rust encoder
  • preserve C decoder interoperability and donor ratio parity on corpus proxy and random/structured scenarios
  • optimize the level22 hot path (remove per-block clone, reduce BT/opt loop overhead, tighten candidate/match accounting)

Testing

  • cargo clippy -p structured-zstd --all-targets --features dict_builder -- -D warnings
  • cargo nextest run -p structured-zstd -E 'test(level22_sequences_match_donor_on_small_corpus_proxy) or test(level22_sequences_match_donor_on_corpus_proxy) or test(fse)'

Benchmark Notes

  • controlled level22 run: small-4k-log-lines improved to the ~0.63-0.65 ms range on the Rust path
  • controlled level22 run: decodecorpus-z000033 improved to the ~0.80-0.87 s range on the Rust path, with ratio parity preserved against the donor path
  • the remaining gap is concentrated in donor-equivalent BT traversal/search cost on the decodecorpus path

Summary by CodeRabbit

  • New Features

    • Explicit strategies for compression levels 16–22 and matcher entropy seeding to improve large-window compression.
  • Improvements

    • Two-phase block encoding with post-split fallback, refined entropy/table selection, RLE-literal handling, and savings-driven framing/splitting for better compression on large inputs.
  • Bug Fixes

    • Reject empty dictionary sources.
  • Documentation

    • README updated: mapping narrowed to levels 5–15 and new notes for 16–22; prior fallback description removed.
  • Tests

    • Added Level 22 tests and relaxed a cross-block size assertion.
  • Chores

    • Enabled an experimental dev feature for a dev dependency and updated bench flamegraph invocation.

polaz added 2 commits April 17, 2026 23:33
- align HC main hash with donor hash4 formula
- keep donor regression coverage for level 22 parity gap
- preserve current donor-path findings before full BT rewrite
Copilot AI review requested due to automatic review settings April 30, 2026 13:28

coderabbitai Bot commented Apr 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough


Adds a post-split compressed-path for levels 16–22, refactors block/literals/FSE/Huffman table construction and selection, extends Matcher API to accept seeded entropy, tightens dictionary/FSE error handling, updates benches/scripts/README, and expands tests.

Changes

Encoding core (blocks, FSE, Huffman, frame/matcher)

Layer / File(s) Summary
Data shapes & helpers
zstd/src/encoding/blocks/compressed.rs
Introduced EncodedBlockParts, RawSequence, SequencePrefixSums, and constants for split thresholds to support partitioned encoding.
Block collection & partitioning
zstd/src/encoding/blocks/compressed.rs
Added collect_block_parts, SplitEstimator, and compress_block_with_post_split to gather parts and partition sequence streams.
Per-part encoding
zstd/src/encoding/blocks/compressed.rs
Added encode_block_parts, emit_single_sequence_block, encode_raw_sequences, and partition-level Raw fallback with state snapshot/restore.
FSE mode selection & encoding
zstd/src/encoding/blocks/compressed.rs, zstd/src/fse/fse_encoder.rs
Introduced FseTableMode::Rle(u8), cost-based choose_table logic, encode_sequences accepting explicit per-field modes, and FSETable changes (start_state slot, max_num_bits_for_symbol, donor-based table construction).
Literals / Huffman integration
zstd/src/encoding/blocks/compressed.rs, zstd/src/huff0/huff0_encoder.rs
Literals pipeline updated (RLE/all-identical detection, compress_literals returns HuffmanTableUpdate), Huffman table serialization prefers FSE-encoded weight descriptions with round-trip gating and candidate selection across table-log.
Frame compressor & matcher seeding
zstd/src/encoding/frame_compressor.rs, zstd/src/encoding/mod.rs
Added Matcher::seed_dictionary_entropy(...), PreviousFseTable::Rle(u8), seed only Custom cached tables; reworked donor-block read loop with pending-input suffix handling and savings-driven splitting.
Tests (encoder-level)
zstd/src/encoding/blocks/compressed.rs, zstd/src/huff0/*
Added tests for RLE table selection, raw-fallback state restore, Huffman weight-description roundtrips, and level-22 literal block emission parsing.

Dictionary FSE & dictionary creation

Layer / File(s) Summary
FSE table construction APIs
zstd/src/fse/fse_encoder.rs
Rewrote table builders to donor-based normalization and global state-table generation; removed build_table_from_bytes; added start-state slots and max_num_bits_for_symbol.
Serialization & validation
zstd/src/dictionary/mod.rs
serialize_fse_table_from_corpus returns io::Result<Vec<u8>> and validates counts; callers updated with ?; create_fastcover_dict_from_source rejects empty input and unit test added.
Sort cleanup
zstd/src/dictionary/fastcover.rs
Segment ranking sort simplified to sort_by_key(Reverse(...)).

Level dispatch and fastest path

Layer / File(s) Summary
Level dispatch
zstd/src/encoding/levels/fastest.rs
For levels 16–22 with matcher window ≥128 KiB, use compress_block_with_post_split; snapshot/restore around compress_block to avoid mutated state on Raw fallback.

Tests, benches, scripts, docs, small cleanups

Layer / File(s) Summary
Bench signatures
zstd/benches/support/mod.rs
supported_levels() size changed from 5→6 to include level22; removed Scenario::is_empty.
Cross-validation tests
zstd/tests/cross_validation.rs
Added three tests exercising CompressionLevel::Level(22) roundtrip and size-parity properties.
Roundtrip integrity
zstd/src/tests/roundtrip_integrity.rs
Relaxed multi-block size assertion to allow 16-byte slack.
Decoding test cleanup
zstd/src/decoding/sequence_section_decoder.rs
Removed std-only debug-print loop in test_ll_default.
Cargo / scripts / docs
zstd/Cargo.toml, scripts/bench-flamegraph.sh, README.md
Added experimental feature to [dev-dependencies]; cargo flamegraph now passes --features dict_builder; README updated to map compression strategies (levels 5–15 coverage narrowed; explicit behaviors for 16–22 documented).

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant FrameComp as FrameCompressor
    participant Matcher as Matcher
    participant BlockComp as BlockCompressor
    participant FSEEnc as FSE_Encoder
    participant HuffEnc as Huffman_Encoder

    FrameComp->>FrameComp: load cached prev FSE/Huff tables
    FrameComp->>Matcher: seed_dictionary_entropy(huff?, ll?, ml?, of?)
    Matcher-->>FrameComp: primed or noop

    FrameComp->>FrameComp: read donor block (fixed-capacity)
    FrameComp->>BlockComp: compress_block_with_post_split(input)
    BlockComp->>FSEEnc: choose_table_for_field(...) -> (Custom | RLE | Repeat | Encoded)
    FSEEnc-->>BlockComp: FseTableMode / table bytes

    BlockComp->>HuffEnc: request literal table description
    HuffEnc->>FSEEnc: encode_weight_description(...) (validate round-trip)
    FSEEnc-->>HuffEnc: serialized weight bytes or reject
    HuffEnc-->>BlockComp: table bytes or raw fallback

    BlockComp->>BlockComp: estimate partition costs, emit compressed or raw (restore state on raw)
    BlockComp-->>FrameComp: emit block(s)
    FrameComp->>FrameComp: compute savings, store pending_input
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


"I hop through tables, twitching my nose,
Counting symbol carrots where entropy grows.
I split the blocks and stitch them tight,
RLE crumbs tucked in each little byte.
Hooray — a crunchy, lossless code delight!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 72.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'perf(level22): complete donor parity path' directly and specifically describes the main change: implementing level 22 donor parity optimization, which is the core focus across multiple files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.


Copilot AI left a comment


Pull request overview

Ports C zstd “donor” behavior for level 22 into the Rust encoder (entropy/table decisions, block splitting, and framing), aiming for ratio parity and C decoder interoperability while addressing a level-22 performance hot path.

Changes:

  • Add donor-parity pre-block splitting and revised framing/block handling for high compression levels.
  • Rework Huffman/FSE table construction/selection and add additional interoperability/regression tests (including donor FFI validation).
  • Update benchmarks/docs to include and describe level 22 behavior.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
zstd/tests/cross_validation.rs Adds regression + ratio tests around Rust level 22 and FFI roundtrips.
zstd/src/tests/roundtrip_integrity.rs Relaxes an assertion to tolerate small overhead deltas in a size comparison test.
zstd/src/huff0/mod.rs Adjusts Huffman round-trip test helper behavior for non-writeable table descriptions.
zstd/src/huff0/huff0_encoder.rs Major Huffman encoder/table build updates + new donor-FFI acceptance tests.
zstd/src/fse/fse_encoder.rs Refactors FSE table construction/normalization toward donor parity and adds RLE tracking.
zstd/src/encoding/mod.rs Updates compression level docs and extends Matcher with dictionary entropy seeding hook.
zstd/src/encoding/levels/fastest.rs Routes level 16–22 through post-split compressed-block path and adds state rollback for raw fallback.
zstd/src/encoding/frame_compressor.rs Implements donor-style pre-splitting and framing policy changes; threads entropy into matcher.
zstd/src/encoding/blocks/compressed.rs Adds post-split block emission, RLE literals, and donor-like table selection heuristics.
zstd/src/dictionary/mod.rs Switches dict FSE table build path to symbol-count construction.
zstd/src/decoding/sequence_section_decoder.rs Removes debug printing from a test path.
zstd/benches/support/mod.rs Adds a level 22 benchmark configuration and removes an unused helper.
zstd/Cargo.toml Enables experimental feature on dev-dependency zstd for donor interop tests.
README.md Updates strategy/level coverage documentation to reflect new level 16–22 behavior.

Comment thread zstd/src/huff0/huff0_encoder.rs
Comment thread zstd/src/huff0/mod.rs Outdated
Comment thread zstd/src/encoding/levels/fastest.rs Outdated

codecov Bot commented Apr 30, 2026

Codecov Report

❌ Patch coverage is 91.56909% with 144 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
zstd/src/huff0/huff0_encoder.rs 90.41% 63 Missing ⚠️
zstd/src/encoding/blocks/compressed.rs 92.82% 36 Missing ⚠️
zstd/src/fse/fse_encoder.rs 89.92% 28 Missing ⚠️
zstd/src/dictionary/mod.rs 79.59% 10 Missing ⚠️
zstd/src/encoding/frame_compressor.rs 96.39% 7 Missing ⚠️



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
zstd/src/encoding/blocks/compressed.rs (1)

151-176: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Don't precompute repeat-offset codes before raw-fallback is decided.

collect_block_parts() bakes seq.of against the full-block repeat-offset history, but emit_single_sequence_block() can later replace a partition with BlockType::Raw. A raw partition does not advance decoder repeat-offset history, so any later compressed partition — and the next block if the last partition falls back to raw — will interpret those precomputed repcodes against the wrong history. This can produce invalid streams on mixed raw/compressed split outcomes.

Please defer encode_offset_with_history() until a partition is definitively kept compressed, or snapshot/recompute state.offset_hist alongside the Huffman/FSE state when a partition falls back to raw.

Also applies to: 256-291

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@zstd/src/encoding/blocks/compressed.rs` around lines 151 - 176,
collect_block_parts currently calls encode_offset_with_history() and stores
seq.of using the live state.offset_hist, which is invalid if later
emit_single_sequence_block chooses BlockType::Raw for a partition; instead stop
precomputing repeat-offset codes in collect_block_parts: record the raw offset
values (e.g., store offset as a plain u32 or keep a SequenceRaw with ll, ml,
raw_offset) and defer calling encode_offset_with_history() until the partition
is definitely emitted as compressed inside emit_single_sequence_block (or when
the compressed partition is finalized); alternatively, if you prefer to keep the
current structure, snapshot state.offset_hist (and any related decoder repeat
state) when producing each candidate sequence so you can recompute encodings
when the final decision is made—refer to collect_block_parts,
encode_offset_with_history, emit_single_sequence_block, BlockType::Raw and
state.offset_hist to locate the changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/encoding/levels/fastest.rs`:
- Around line 73-83: The saved snapshot variables (saved_offset_hist,
saved_huff_table, saved_ll_previous, saved_ml_previous, saved_of_previous) are
cloned before an early return when compression_level matches
CompressionLevel::Level(16..=22) and state.matcher.window_size() is large, which
wastes work; move the snapshot/cloning to after that level/window-size check
(i.e., perform the check that calls compress_block_with_post_split and returns
BlockType::Compressed first) so you only clone the state when falling back to
compress_block where the saved_* values are actually used, or alternatively
ensure compress_block_with_post_split documents/manages state rollback if you
prefer keeping clones.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a6c79ff9-5204-41ca-a93a-af3d309c2c25

📥 Commits

Reviewing files that changed from the base of the PR and between 95af26d and 5d02a9b.

📒 Files selected for processing (16)
  • README.md
  • zstd/Cargo.toml
  • zstd/benches/support/mod.rs
  • zstd/src/decoding/sequence_section_decoder.rs
  • zstd/src/dictionary/fastcover.rs
  • zstd/src/dictionary/mod.rs
  • zstd/src/encoding/blocks/compressed.rs
  • zstd/src/encoding/frame_compressor.rs
  • zstd/src/encoding/levels/fastest.rs
  • zstd/src/encoding/match_generator.rs
  • zstd/src/encoding/mod.rs
  • zstd/src/fse/fse_encoder.rs
  • zstd/src/huff0/huff0_encoder.rs
  • zstd/src/huff0/mod.rs
  • zstd/src/tests/roundtrip_integrity.rs
  • zstd/tests/cross_validation.rs
💤 Files with no reviewable changes (1)
  • zstd/src/decoding/sequence_section_decoder.rs

Comment thread zstd/src/encoding/levels/fastest.rs Outdated

polaz commented Apr 30, 2026

@coderabbitai Re: repeat-offset codes before raw fallback. Fixed in c47314f: post-split collection now stores raw offsets, encode_offset_with_history() is deferred until a partition is emitted as compressed, and raw fallback restores offset_hist; added regression coverage plus level22 Rust→C interop and ratio parity tests.


coderabbitai Bot commented Apr 30, 2026

Tip

For best results, initiate chat on the files or code changes.

@polaz: Thanks for the detailed summary — deferring encode_offset_with_history() to the compressed-emission site and restoring offset_hist on the raw fallback is exactly the right fix. The regression coverage and level22 interop/parity tests round it out nicely. Marking this as resolved.


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-08T16:00:30.438Z
Learning: In `zstd/src/encoding/match_generator.rs`, the `RowMatchGenerator` (used for `CompressionLevel::Level(4)` / `MatcherBackend::Row`) uses a stable 4-byte hash key, consistent with the 4-byte lookahead constraint shared by `HcMatchGenerator`. Previous-block tail positions are backfilled into the row tables before matching/skip begins (analogous to `backfill_boundary_positions` for the HC backend), and a regression test covers cross-boundary tail reuse. Do not flag missing backfill or key-width issues for the Row backend.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 99
File: zstd/src/encoding/match_generator.rs:1675-1701
Timestamp: 2026-04-10T07:20:54.796Z
Learning: In `zstd/src/encoding/match_generator.rs` (structured-world/structured-zstd), `DfastMatchGenerator::skip_matching` (the dense/non-sparse path, i.e. when `used_sparse` is false) backfills the previous-slice tail (`current_abs_start.saturating_sub(DFAST_MIN_MATCH_LEN + 3) .. current_abs_start`) before calling `insert_positions(current_abs_start, current_abs_end)`. This mirrors the `skip_matching_dense` backfill and `HcMatchGenerator::backfill_boundary_positions` patterns, ensuring positions at block boundaries become hashable as soon as the next slice extends history. A regression test `dfast_dense_skip_matching_backfills_previous_tail_for_next_block` covers immediate cross-boundary matching after a dense skip. Do not flag the absence of this backfill as a missing step.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-08T21:51:56.652Z
Learning: In `zstd/src/decoding/ringbuffer.rs` (structured-world/structured-zstd), the test `copy_bytes_overshooting_preserves_prefix_for_runtime_chunk_lengths` (or equivalent) includes a misaligned extra-capacity case that uses `add(1)`-offset pointers and a `chunk_plus_one` length to exercise the rounded-up wildcopy overshoot path (where `copy_multiple > copy_at_least`) and validates prefix correctness. Do not flag the absence of a misaligned-pointer case in this test — it is present by design.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 99
File: zstd/src/encoding/frame_compressor.rs:779-805
Timestamp: 2026-04-11T07:49:03.699Z
Learning: In `zstd/src/encoding/frame_compressor.rs` (structured-world/structured-zstd, PR `#99`), hinted tiny frames (`source_size_hint <= 2^14` and `total_uncompressed <= 2^14`) are intentionally emitted as single-segment across ALL compression levels, including levels that produce compressed (non-raw) blocks. The `hinted_small_compressible_frames_use_single_segment_across_levels` test asserts `single_segment_flag() == true` for compressible hinted inputs (e.g. 4 KiB repeated log lines). Do not re-suggest asserting `single_segment_flag() == false` for compressible hinted frames — the donor-parity design requires single-segment for all hinted tiny frames.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-10T12:52:14.244Z
Learning: In `zstd/src/encoding/match_generator.rs` (structured-world/structured-zstd), `HcMatchGenerator::skip_matching`'s dense-tail reseed (the `Some(true)` / sparse incompressible path) iterates position-by-position over `tail_start..current_abs_end` and skips positions where `(pos - current_abs_start) % INCOMPRESSIBLE_SKIP_STEP == 0`. This prevents sparse-grid positions already indexed in the sparse pass from being reinserted and creating self-loop chain entries. A regression test `hc_sparse_skip_matching_does_not_reinsert_sparse_tail_positions` (line ~4086) covers this invariant. Do not flag the absence of a bulk `insert_positions(tail_start, current_abs_end)` call in the sparse-hint dense-tail path as a missing optimization.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 99
File: zstd/src/encoding/frame_compressor.rs:779-805
Timestamp: 2026-04-11T07:49:03.699Z
Learning: In `zstd/src/encoding/frame_compressor.rs` (structured-world/structured-zstd, PR `#99`), a temporary compatibility guard exists for sub-256-byte compressed payloads: the 1-byte-FCS single-segment form is not yet fully C-FFI compatible for payloads below 256 bytes, and a compat guard was added to handle this case while the path is hardened. Do not flag this guard as unnecessary.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-08T21:51:51.957Z
Learning: In `zstd/src/decoding/ringbuffer.rs` (structured-world/structured-zstd), the test suite for `copy_bytes_overshooting` includes a misaligned extra-capacity case that exercises the rounded-up wildcopy path (where `copy_multiple > copy_at_least`) and validates prefix correctness for that scenario. Do not flag the absence of a misaligned/overshooting extra-capacity sub-case as missing test coverage.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-11T07:15:56.589Z
Learning: In `zstd/src/encoding/match_generator.rs` (structured-world/structured-zstd, PR `#99`), `RowMatchGenerator::skip_matching_with_hint(Some(true))` performs sparse-prefix insertion via `insert_positions_with_step(current_abs_start, current_abs_end, INCOMPRESSIBLE_SKIP_STEP)` followed by a dense-tail reseed over `ROW_MIN_MATCH_LEN + INCOMPRESSIBLE_SKIP_STEP` positions, skipping sparse-grid positions (`!(pos - current_abs_start).is_multiple_of(INCOMPRESSIBLE_SKIP_STEP)`). The `MatcherBackend::Row` dispatch in `MatchGeneratorDriver::skip_matching_with_hint` now passes the hint through to the Row backend. A regression test `row_skip_matching_with_incompressible_hint_uses_sparse_prefix` covers this. Do not flag Row as ignoring incompressible hints or defaulting to dense-only skip.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 48
File: zstd/src/encoding/match_generator.rs:1554-1559
Timestamp: 2026-04-02T12:03:04.619Z
Learning: In `zstd/src/encoding/match_generator.rs`, `HcMatchGenerator::insert_position` has an early-return guard at `abs_pos >= u32::MAX` that disables all further HC insertions for streams >4 GiB, degrading `CompressionLevel::Better` to repcodes-only. This is a known limitation tracked in issue `#51`; the proper fix (table rebasing) is deferred. A TODO comment and a limitation note on `CompressionLevel::Better` doc were added in PR `#48`. Do not re-flag this as a new bug.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 99
File: zstd/src/encoding/match_generator.rs:1682-1687
Timestamp: 2026-04-09T23:59:12.317Z
Learning: In `zstd/src/encoding/match_generator.rs` (structured-world/structured-zstd), `DfastMatchGenerator::skip_matching_dense` backfills the previous-slice tail (`max(current_abs_start - 3, history_abs_start) .. current_abs_start`) before seeding the current slice via `insert_positions`. This ensures that positions at internal priming-slice boundaries become hashable as soon as the next slice extends history, keeping cross-slice dictionary priming fully dense. The pattern mirrors `HcMatchGenerator::backfill_boundary_positions`. Do not flag the absence of this backfill as a missing step in `skip_matching_dense`.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-06T01:40:24.378Z
Learning: In `zstd/benches/compare_ffi.rs` (structured-world/structured-zstd), Rust FastCOVER trains with the post-finalization content budget in both the `REPORT_DICT_TRAIN` emission path (around lines 208-225) and the Criterion benchmark path (around lines 266-280). Both paths were aligned in commit 8622344. Do not flag these ranges as using inconsistent budget values.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-11T11:56:17.814Z
Learning: In `zstd/src/encoding/match_generator.rs` (structured-world/structured-zstd, PR `#102`), after commit 46927db, hash-mix kernel dispatch was moved from the global `HASH_MIX_KERNEL` AtomicU8 hot-path read into a per-matcher instance field `hash_mix_kernel` stored directly in `DfastMatchGenerator` and `RowMatchGenerator`. The hot path no longer performs an atomic load or global branch; kernel selection happens once at matcher initialisation and is subsequently a plain struct-field read. Do not flag the hot path as doing a per-call atomic load — that pattern was superseded by commit 46927db.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 99
File: zstd/src/encoding/incompressible.rs:73-84
Timestamp: 2026-04-10T19:40:54.470Z
Learning: In `zstd/src/encoding/incompressible.rs` (structured-world/structured-zstd, PR `#99`), `compression_level_allows_raw_fast_path(level, window_size)` intentionally returns `true` for `Fastest`, `Default`, and `Better` (unconditionally), `true` for `Best` and `Level(_)` when `window_size <= BETTER_WINDOW_SIZE_BYTES`, and `false` for `Uncompressed`. The inclusion of `Better` and all numeric `Level(_)` (window-gated) is a deliberate perf-first decision for the incompressible early-exit throughput gains documented in issue `#97`. Do not flag Better as returning false or suggest restricting Level(_) to 0..=3 only.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 53
File: zstd/src/tests/roundtrip_integrity.rs:498-509
Timestamp: 2026-04-02T22:26:07.979Z
Learning: In `structured-zstd` (`zstd/src/tests/roundtrip_integrity.rs`), `best_level_does_not_regress_vs_better` uses a `<=` (not strict `<`) assertion because the `repeat_offset_fixture(b"HelloWorld", ...)` input is simple enough that HC saturates at both Better (16 candidates) and Best (32 candidates) search depths, producing identical compressed sizes (~30243 bytes). Strict `<` would be a false positive on this fixture. The strict `Best < Better` quality assertion lives in `cross_validation::best_level_beats_better_on_corpus_proxy` on the decodecorpus sample. Do not re-flag the `<=` as a weakened guard.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 53
File: zstd/src/tests/roundtrip_integrity.rs:479-523
Timestamp: 2026-04-02T19:19:44.750Z
Learning: In `structured-zstd` (`zstd/src/tests/roundtrip_integrity.rs`), the per-level roundtrip test suites (Better, Best) share a common `level_roundtrip_suite!` macro that expands the same 7-test matrix (compressible, random, multi-block, streaming, edge cases, repeat offsets, large literals) for each level via module-scoped macro invocations. Do not suggest re-inlining these as separate flat test functions — the macro was added intentionally to eliminate cross-level drift.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-11T11:31:54.292Z
Learning: In `zstd/src/encoding/match_generator.rs` (structured-world/structured-zstd, PR `#102`), the test `row_hash_and_row_extracts_high_bits` acquires `HASH_MIX_KERNEL_TEST_LOCK` before any `hash_mix_u64` call (fixed in commit 904d2b0). This lock guards all tests that force a specific `HashMixKernel` via `with_forced_hash_mix_kernel(...)` to prevent cross-test kernel races caused by the global `HASH_MIX_KERNEL` atomic. Do not flag the absence of this lock as missing — it is acquired at the top of each affected test.

Learnt from: polaz
Repo: structured-world/structured-zstd PR: 0
File: :0-0
Timestamp: 2026-04-11T11:31:46.105Z
Learning: In `zstd/src/encoding/match_generator.rs` (structured-world/structured-zstd, PR `#102`), the test-only `with_forced_hash_mix_kernel(...)` helper forces the global `HASH_MIX_KERNEL` atomic to a specific kernel for coverage. Tests that call this helper must acquire a shared mutex lock *before* the `with_forced_hash_mix_kernel` call (not only around the assertion) to prevent cross-test kernel races when `cargo nextest` runs tests in parallel. This pattern was locked in commit 904d2b0. Do not flag the lock acquisition as unnecessarily early.


@sw-release-bot sw-release-bot Bot left a comment


⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'structured-zstd vs C FFI (x86_64-gnu)'.
The benchmark results for this commit are worse than the previous results, exceeding the 1.30 regression threshold.

| Benchmark suite | Current: a1f04b7 | Previous: 95af26d | Ratio |
|---|---|---|---|
| compress/fastest/small-4k-log-lines/matrix/pure_rust | 0.12 ms | 0.031 ms | 3.87 |
| compress/default/small-4k-log-lines/matrix/pure_rust | 0.191 ms | 0.108 ms | 1.77 |
| compress/better/small-4k-log-lines/matrix/pure_rust | 0.196 ms | 0.058 ms | 3.38 |
| compress/level4-row/small-4k-log-lines/matrix/pure_rust | 0.159 ms | 0.07 ms | 2.27 |
| compress/best/small-4k-log-lines/matrix/pure_rust | 0.196 ms | 0.059 ms | 3.32 |
| compress/better/decodecorpus-z000033/matrix/pure_rust | 121.842 ms | 75.918 ms | 1.60 |
| compress/best/decodecorpus-z000033/matrix/pure_rust | 129.261 ms | 93.58 ms | 1.38 |
| compress/better/low-entropy-1m/matrix/pure_rust | 7.591 ms | 4.932 ms | 1.54 |
| compress/best/low-entropy-1m/matrix/pure_rust | 8.241 ms | 5.103 ms | 1.61 |
| decompress/fastest/small-4k-log-lines/rust_stream/matrix/pure_rust | 0.005 ms | 0.003 ms | 1.67 |
| decompress/fastest/small-4k-log-lines/rust_stream/matrix/c_ffi | 0.002 ms | 0.001 ms | 2 |
| decompress/default/small-4k-log-lines/rust_stream/matrix/pure_rust | 0.005 ms | 0.003 ms | 1.67 |
| decompress/default/small-4k-log-lines/rust_stream/matrix/c_ffi | 0.002 ms | 0.001 ms | 2 |
| decompress/better/small-4k-log-lines/rust_stream/matrix/pure_rust | 0.005 ms | 0.003 ms | 1.67 |
| decompress/better/small-4k-log-lines/rust_stream/matrix/c_ffi | 0.002 ms | 0.001 ms | 2 |
| decompress/level4-row/small-4k-log-lines/rust_stream/matrix/pure_rust | 0.005 ms | 0.003 ms | 1.67 |
| decompress/level4-row/small-4k-log-lines/rust_stream/matrix/c_ffi | 0.002 ms | 0.001 ms | 2 |
| decompress/best/small-4k-log-lines/rust_stream/matrix/pure_rust | 0.005 ms | 0.003 ms | 1.67 |
| decompress/best/small-4k-log-lines/rust_stream/matrix/c_ffi | 0.002 ms | 0.001 ms | 2 |
| decompress/fastest/decodecorpus-z000033/rust_stream/matrix/pure_rust | 7.147 ms | 2.635 ms | 2.71 |
| decompress/fastest/decodecorpus-z000033/rust_stream/matrix/c_ffi | 1.091 ms | 0.609 ms | 1.79 |
| decompress/default/decodecorpus-z000033/rust_stream/matrix/pure_rust | 6.69 ms | 2.132 ms | 3.14 |
| decompress/default/decodecorpus-z000033/rust_stream/matrix/c_ffi | 1.039 ms | 0.553 ms | 1.88 |
| decompress/better/decodecorpus-z000033/rust_stream/matrix/pure_rust | 6.945 ms | 2.748 ms | 2.53 |
| decompress/better/decodecorpus-z000033/rust_stream/matrix/c_ffi | 1.198 ms | 0.626 ms | 1.91 |
| decompress/level4-row/decodecorpus-z000033/rust_stream/matrix/pure_rust | 6.565 ms | 2.122 ms | 3.09 |
| decompress/level4-row/decodecorpus-z000033/rust_stream/matrix/c_ffi | 0.982 ms | 0.51 ms | 1.93 |
| decompress/best/decodecorpus-z000033/rust_stream/matrix/pure_rust | 6.891 ms | 2.698 ms | 2.55 |
| decompress/best/decodecorpus-z000033/rust_stream/matrix/c_ffi | 1.188 ms | 0.616 ms | 1.93 |

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz


Copilot AI left a comment


Pull request overview

Ports the donor (C zstd) level 22 “optimal” compression path into the Rust encoder to improve level22 performance and ratio parity while maintaining C decoder interoperability.

Changes:

  • Implement donor-style pre-block splitting, entropy handling, and literal/sequence encoding behavior for higher compression levels (16–22).
  • Rework Huffman/FSE table construction and selection to better match donor behavior (including RLE/repeat handling).
  • Add regression and cross-validation tests plus benchmark wiring for level 22 parity checks.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file:

  • zstd/tests/cross_validation.rs: Adds level22 cross-validation against FFI decode and size comparisons on corpus proxy.
  • zstd/src/tests/roundtrip_integrity.rs: Relaxes a size assertion for a multi-block offset reuse test.
  • zstd/src/huff0/mod.rs: Tightens Huffman table description expectations in roundtrip helper; reduces test loop upper bound.
  • zstd/src/huff0/huff0_encoder.rs: Implements donor-like Huffman weight/table sizing/selection and FSE-based weight descriptions; adds extensive interoperability tests.
  • zstd/src/fse/fse_encoder.rs: Reworks interleaved encoding and donor-like normalization / table-log selection; adds start-state tracking.
  • zstd/src/encoding/mod.rs: Updates CompressionLevel docs and adds matcher hook for seeding dictionary entropy tables.
  • zstd/src/encoding/levels/fastest.rs: Adds level 16–22 post-split path; adds state restoration on raw fallback for oversized compressed blocks.
  • zstd/src/encoding/frame_compressor.rs: Adds donor pre-split block sizing, contiguous-buffer chunking via pending suffix, and dictionary entropy seeding.
  • zstd/src/encoding/blocks/compressed.rs: Adds post-split compressed-block emission, literal RLE handling, and donor-like FSE table mode selection (including RLE/repeat).
  • zstd/src/dictionary/mod.rs: Switches FSE table building to symbol-count-based construction.
  • zstd/src/dictionary/fastcover.rs: Uses sort_by_key(Reverse(..)) for segment ordering.
  • zstd/src/decoding/sequence_section_decoder.rs: Removes debug-print loop from a test.
  • zstd/benches/support/mod.rs: Adds level22 to supported benchmark levels; removes unused Scenario helper.
  • zstd/Cargo.toml: Enables experimental feature on dev-dependency zstd crate for tests/interop.
  • README.md: Updates documented strategy coverage to include btopt/btultra/btultra2-style implementations for levels 16–22.

Comment thread zstd/src/huff0/huff0_encoder.rs
Comment thread zstd/src/huff0/huff0_encoder.rs Outdated
Comment thread zstd/src/huff0/huff0_encoder.rs Outdated
Comment thread zstd/src/dictionary/mod.rs
Comment thread zstd/src/encoding/levels/fastest.rs

@coderabbitai (bot) left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/dictionary/mod.rs`:
- Around line 349-354: serialize_fse_table_from_corpus can panic because
build_table_from_symbol_counts asserts total > 1 when symbols are degenerate
(e.g., bounded_fse_symbols returns [0u8]); change
serialize_fse_table_from_corpus to return io::Result<Vec<u8>> and detect
degenerate counts before calling build_table_from_symbol_counts: compute total
from counts, and if total <= 1 return an Ok(serialized_fallback) or an explicit
Err with context (mirror serialize_huffman_table behavior), then propagate that
Result through finalized_content_budget and callers such as
create_fastcover_dict_from_source and train_fastcover_raw_from_slice so the
panic is avoided and callers can handle the empty-input case.
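A minimal sketch of the suggested guard, with illustrative names (the real function also serializes the table; only the degenerate-count check is shown here):

```rust
use std::io;

// Illustrative guard: compute the histogram total up front and surface an
// io::Error instead of letting build_table_from_symbol_counts assert.
fn guard_fse_counts(counts: &[u32]) -> io::Result<()> {
    let total: u32 = counts.iter().sum();
    if total <= 1 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "degenerate symbol counts: not enough statistics for an FSE table",
        ));
    }
    Ok(())
}
```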

In `@zstd/src/encoding/levels/fastest.rs`:
- Around line 73-77: The call to compress_block_with_post_split can emit
multiple partition blocks or raw fallbacks but the caller unconditionally
returns BlockType::Compressed; update the contract so callers can observe
multi-block emission: either (A) change compress_block_with_post_split's
documentation and signature to return a precise result (e.g.,
Result<BlockEmission, Error> or Vec<BlockType>/EmissionSummary) and have the
caller use that return value instead of unconditionally returning
BlockType::Compressed, or (B) add and return a new enum variant (e.g.,
BlockType::MultiPart or BlockEmission::Multiple) that represents multipart
emission and update the caller here to return that variant after calling
compress_block_with_post_split; ensure compress_block_with_post_split, its doc
comments, and the caller (the code currently returning BlockType::Compressed)
consistently reflect the new contract.
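A hypothetical miniature of option (A)/(B) from the comment above: an emission summary type that lets the caller observe multi-part output instead of unconditionally reporting a single compressed block. All names here are illustrative, not the crate's real API:

```rust
// Illustrative result type for a post-split block emitter.
#[derive(Debug, PartialEq)]
enum BlockEmission {
    Raw,             // fell back to a raw block
    Compressed,      // one compressed block was written
    Multiple(usize), // several partition blocks were written
}

// Summarize what the emitter actually did, so the caller no longer has to
// assume a single Compressed block.
fn summarize_emission(parts_written: usize, fell_back_to_raw: bool) -> BlockEmission {
    match (parts_written, fell_back_to_raw) {
        (_, true) => BlockEmission::Raw,
        (1, false) => BlockEmission::Compressed,
        (n, false) => BlockEmission::Multiple(n),
    }
}
```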

In `@zstd/src/huff0/huff0_encoder.rs`:
- Around line 151-154: The FSE/raw switch is using weights.len() / 2 which
undercounts the raw nibble byte cost for odd-length weights; update the
comparison that considers raw_description_is_representable and the branch using
encoded.len() to use the correct ceiling byte count for the raw nibble form
(i.e. compute raw bytes as (weights.len() + 1) / 2 or call a div_ceil helper) so
that both write_table() and try_table_description_size() use the true
weights.len().div_ceil(2) value when deciding between FSE vs raw description.
- Around line 271-280: The estimator estimate_compressed_size currently returns
bits.div_ceil(8) but must account for the byte-aligned terminator emitted by
encode_stream; change estimate_compressed_size to compute byte_count =
bits.div_ceil(8) and if bits % 8 == 0 (i.e., already byte-aligned) add one extra
byte to byte_count before returning; apply the identical adjustment to the other
payload-size estimator used by build_from_counts (the second estimator in this
module) so best_size and the early-break heuristic use the corrected sizes.
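The two sizing fixes above can be sketched together; the helper names are illustrative (the real logic lives inside `write_table()` and the estimators in `huff0_encoder.rs`):

```rust
// 1. Raw nibble descriptions pack two 4-bit weights per byte, so an odd
//    weight count still occupies a full trailing byte: ceiling, not floor.
fn raw_description_bytes(weight_count: usize) -> usize {
    weight_count.div_ceil(2) // (weight_count + 1) / 2, not weight_count / 2
}

// 2. The bitstream estimator must reserve room for the byte-aligned
//    terminator: an exactly byte-aligned bit count still grows by one byte.
fn estimated_stream_bytes(bits: usize) -> usize {
    let mut bytes = bits.div_ceil(8);
    if bits % 8 == 0 {
        bytes += 1; // terminator lands in a fresh byte when already aligned
    }
    bytes
}
```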

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4db2d9a4-b87d-4f8e-91c2-d2c0cd95bcb8

📥 Commits

Reviewing files that changed from the base of the PR and between c47314f and 44168f4.

📒 Files selected for processing (3)
  • zstd/src/dictionary/mod.rs
  • zstd/src/encoding/levels/fastest.rs
  • zstd/src/huff0/huff0_encoder.rs

Comment thread zstd/src/dictionary/mod.rs Outdated
Comment thread zstd/src/encoding/levels/fastest.rs
Comment thread zstd/src/huff0/huff0_encoder.rs
Comment thread zstd/src/huff0/huff0_encoder.rs Outdated

Copilot AI left a comment


Pull request overview

Ports the C zstd “donor” behavior for level 22 (bt* optimal paths, entropy/table selection, and framing/block splitting) into the Rust encoder to improve performance and maintain decoder interoperability.

Changes:

  • Add level 22-specific encoding behaviors: pre-block splitting, post-split fallback, entropy/table selection updates, and RLE literal handling.
  • Rework Huffman/FSE table building and serialization to better match donor semantics and improve interoperability validation.
  • Expand tests/bench support and update docs to reflect the new level-to-strategy mapping.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Show a summary per file:

  • zstd/tests/cross_validation.rs: Adds regression and size-parity checks for Rust level 22 vs FFI decoder/donor output.
  • zstd/src/tests/roundtrip_integrity.rs: Relaxes a multi-block compression-size assertion to tolerate new framing/entropy behavior.
  • zstd/src/huff0/mod.rs: Strengthens Huffman round-trip helper by asserting table-description writeability; adjusts test coverage range.
  • zstd/src/huff0/huff0_encoder.rs: Adds FSE-based Huffman weight description path, size estimation helpers, and extensive interop/roundtrip tests.
  • zstd/src/fse/fse_encoder.rs: Refactors interleaved encoding, adds donor-normalization table-log selection, and updates start-state handling.
  • zstd/src/encoding/mod.rs: Updates documentation for levels 16–22 and adds a matcher hook for seeding dictionary entropy.
  • zstd/src/encoding/levels/fastest.rs: Uses post-split block compression for levels 16–22 and restores entropy/history on oversize raw fallback.
  • zstd/src/encoding/frame_compressor.rs: Implements donor-like pre-splitting and framing policy; adds entropy seeding from cached dictionary state.
  • zstd/src/encoding/blocks/compressed.rs: Introduces two-phase block encoding, post-split emission, table selection changes, and RLE literal sections.
  • zstd/src/dictionary/mod.rs: Replaces byte-slice FSE table build with explicit symbol counting; keeps vec! macro import explicit.
  • zstd/src/dictionary/fastcover.rs: Uses sort_by_key(Reverse(...)) for segment ordering.
  • zstd/src/decoding/sequence_section_decoder.rs: Removes debug printing from tests.
  • zstd/benches/support/mod.rs: Adds level 22 to benchmark level mapping and removes unused helper.
  • zstd/Cargo.toml: Enables experimental feature for dev zstd dependency to support donor interop tests.
  • README.md: Updates level strategy mapping documentation to include btopt/btultra/btultra2-style coverage.

Comment thread zstd/src/encoding/blocks/compressed.rs
Comment thread zstd/src/tests/roundtrip_integrity.rs Outdated
Comment thread zstd/src/encoding/blocks/compressed.rs Outdated

@coderabbitai (bot) left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/encoding/blocks/compressed.rs`:
- Around line 335-389: The estimator currently only measures encode_block_parts
output in estimate_subblock_size, causing derive_block_splits to pick splits
that ignore the real emission overhead (emit_single_sequence_block adds a 3-byte
BlockHeader and may switch to Raw based on min_gain); update
estimate_subblock_size to mirror the real emission cost model used in
emit_single_sequence_block by adding the fixed BlockHeader size (3 bytes) and
applying the same min_gain/raw conversion logic (i.e., if the compressed payload
+ header doesn't meet min_gain compared to raw, return the raw-emission size
instead) so the split scorer uses the exact same cost decisions as the emitter.
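A hedged sketch of the cost model this comment asks for. The 3-byte header and min-gain fallback mirror the decisions attributed to `emit_single_sequence_block` above, but the function and parameter names here are illustrative:

```rust
const BLOCK_HEADER_BYTES: usize = 3;

// Score a partition the way the emitter would actually write it: add the
// fixed block header, and fall back to the raw-block size whenever the
// compressed form does not save at least `min_gain` bytes.
fn estimated_emitted_size(compressed_payload: usize, raw_len: usize, min_gain: usize) -> usize {
    let compressed_total = compressed_payload + BLOCK_HEADER_BYTES;
    let raw_total = raw_len + BLOCK_HEADER_BYTES;
    if compressed_total + min_gain > raw_total {
        raw_total // the emitter would choose a Raw block; score that instead
    } else {
        compressed_total
    }
}
```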

In `@zstd/src/huff0/huff0_encoder.rs`:
- Around line 434-487: The sentinel “infinity” used in nodes.resize and the
low_s < 0 branches (currently 1usize<<30 and 1usize<<31) is too small; replace
those magic constants with usize::MAX for the HuffNode.count sentinel (in the
nodes.resize initializer and in both places where you use the large constant in
the selection blocks). Also change the initial merge sum assignment
(nodes[node_nb].count = nodes[low_s as usize].count + nodes[(low_s - 1) as
usize].count) to use saturating_add to avoid overflow (e.g.,
.saturating_add(...)). Update uses of these symbols: nodes.resize(..., HuffNode
{ count: ... }), the low_s < 0 branches, and the nodes[node_nb].count assignment
to use usize::MAX and saturating_add accordingly.
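The two pieces of that fix, sketched with illustrative names (the real code sits inside the Huffman tree builder in `huff0_encoder.rs`):

```rust
// An unreachable sentinel that no real symbol count can exceed, replacing the
// 1usize << 30 / 1usize << 31 magic constants.
const NODE_COUNT_SENTINEL: usize = usize::MAX;

// The low_s < 0 branches compare against the sentinel instead of a magic
// "large" value.
fn is_unset(count: usize) -> bool {
    count == NODE_COUNT_SENTINEL
}

// The initial merge sum saturates so two pathological counts cannot wrap.
fn merged_count(a: usize, b: usize) -> usize {
    a.saturating_add(b)
}
```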

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5f0c4cdd-1833-4037-8ee9-48744c509ea5

📥 Commits

Reviewing files that changed from the base of the PR and between 44168f4 and 97eeb6a.

📒 Files selected for processing (5)
  • zstd/src/dictionary/mod.rs
  • zstd/src/encoding/blocks/compressed.rs
  • zstd/src/encoding/levels/fastest.rs
  • zstd/src/huff0/huff0_encoder.rs
  • zstd/src/tests/roundtrip_integrity.rs

Comment thread zstd/src/encoding/blocks/compressed.rs Outdated
Comment thread zstd/src/huff0/huff0_encoder.rs

Copilot AI left a comment


Pull request overview

Ports zstd “donor” behavior for high compression levels (notably level 22) into the Rust encoder, focusing on BT/optimal parsing parity, entropy/table handling, and block framing/splitting while maintaining C decoder interoperability.

Changes:

  • Add level 22 cross-validation and interoperability regression tests, plus loosen one cross-block assertion to tolerate small framing overhead.
  • Rework Huffman + FSE encoder table selection/normalization and add donor-compat validation tests (including calling into libzstd HUF routines).
  • Implement donor-style pre-splitting / post-split fallback behavior in the encoder hot path, and expose matcher hooks for dictionary entropy seeding.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 2 comments.

Show a summary per file:

  • zstd/tests/cross_validation.rs: Adds level 22 roundtrip + ratio parity/regression tests against the FFI decoder/encoder.
  • zstd/src/tests/roundtrip_integrity.rs: Relaxes a cross-block comparison to allow bounded slack for framing/table decisions.
  • zstd/src/huff0/mod.rs: Tightens Huffman round-trip assumptions and reduces a test loop range.
  • zstd/src/huff0/huff0_encoder.rs: Adds FSE-based Huffman weight description path, donor-compat tests, and table size estimation helpers.
  • zstd/src/fse/fse_encoder.rs: Reworks FSE table construction/normalization to match donor behavior; adds RLE constraints and start-state tracking.
  • zstd/src/encoding/mod.rs: Updates compression-level docs and adds a matcher hook for dictionary entropy seeding.
  • zstd/src/encoding/levels/fastest.rs: Adds post-split compressed path for levels 16–22 and restores entropy/history state on oversize fallback.
  • zstd/src/encoding/frame_compressor.rs: Implements donor-style pre-splitting across reads, savings tracking, single-segment framing policy adjustments, and entropy seeding wiring.
  • zstd/src/encoding/blocks/compressed.rs: Refactors block encoding into collect/encode phases, adds post-split block emission, and donor-style entropy/table selection (incl. RLE modes).
  • zstd/src/dictionary/mod.rs: Switches FSE table building to symbol-count path and adds validation for insufficient statistics.
  • zstd/src/dictionary/fastcover.rs: Minor sort refactor using Reverse key ordering.
  • zstd/src/decoding/sequence_section_decoder.rs: Removes debug print loop from a test.
  • zstd/benches/support/mod.rs: Adds level 22 to benchmark level mappings and removes an unused helper.
  • zstd/Cargo.toml: Enables experimental feature for the dev-dependency zstd crate (needed by new donor-interop tests).
  • README.md: Updates documented strategy/back-end mapping for levels 16–22.

Comment thread zstd/src/encoding/blocks/compressed.rs Outdated
Comment thread zstd/src/encoding/blocks/compressed.rs Outdated

@coderabbitai (bot) left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/dictionary/mod.rs`:
- Around line 343-359: The code currently picks sample_data over raw_content,
computes bounded_fse_symbols(source, max_symbol) and then errors if total <= 1;
update this to fall back to raw_content when sample_data was used and the first
pass yields total <= 1: after computing counts and total from symbols, if total
<= 1 and source was sample_data (i.e., sample_data.is_empty() was false),
recompute symbols/counts/total using raw_content as source and proceed; only
return the io::Error if total is still <= 1 after this second pass. Ensure you
reference the same bounded_fse_symbols, counts, and total variables so
callers/logic remain unchanged.
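The suggested two-pass source selection can be sketched as follows. The `usable` check stands in for the real "total > 1" test computed from `bounded_fse_symbols`; names are illustrative:

```rust
use std::io;

// Prefer sample_data, but retry with raw_content when the first pass yields
// degenerate statistics; only error when both passes fail.
fn choose_training_source<'a>(
    sample_data: &'a [u8],
    raw_content: &'a [u8],
) -> io::Result<&'a [u8]> {
    // Stand-in for the real histogram check: "total > 1" over counted symbols.
    let usable = |s: &[u8]| s.len() > 1;
    if !sample_data.is_empty() && usable(sample_data) {
        return Ok(sample_data);
    }
    if usable(raw_content) {
        return Ok(raw_content); // second pass over the raw content
    }
    Err(io::Error::new(
        io::ErrorKind::InvalidInput,
        "insufficient statistics in both sample data and raw content",
    ))
}
```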

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 98827d74-7fc5-4c39-b1fe-2bf571525187

📥 Commits

Reviewing files that changed from the base of the PR and between 97eeb6a and 23fb1ee.

📒 Files selected for processing (3)
  • zstd/src/dictionary/mod.rs
  • zstd/src/encoding/blocks/compressed.rs
  • zstd/src/huff0/huff0_encoder.rs

Comment thread zstd/src/dictionary/mod.rs Outdated

Copilot AI left a comment


Pull request overview

Ports Zstd “donor” level 22 optimal (BT/price-parser) behavior into the Rust encoder to improve performance and maintain C decoder interoperability, with additional entropy/table-selection and framing/block-splitting parity work.

Changes:

  • Adds level 16–22 strategy semantics (btopt/btultra/btultra2-style profiles), dictionary entropy seeding hooks, and donor-aligned single-segment framing and block pre-splitting.
  • Reworks entropy coding paths (HUF weight descriptions via FSE when needed, FSE table construction/selection including RLE modes) and compressed-block post-split fallback behavior.
  • Expands/adjusts regression and interoperability tests/benches (level 22 corpus proxy, donor HUF/FSE acceptance checks, dictionary empty-source rejection).

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 2 comments.

Show a summary per file:

  • zstd/tests/cross_validation.rs: Adds level-22 corpus-proxy cross-validation and FFI decode regression tests.
  • zstd/src/tests/roundtrip_integrity.rs: Relaxes cross-block compression assertion with explicit overhead slack.
  • zstd/src/huff0/mod.rs: Strengthens round-trip invariant checks; narrows looped fixtures.
  • zstd/src/huff0/huff0_encoder.rs: Implements/validates FSE-based Huffman weight descriptions; adds donor interop tests.
  • zstd/src/fse/fse_encoder.rs: Reworks FSE encoder interleaving and donor-aligned table building/normalization.
  • zstd/src/encoding/mod.rs: Updates level→strategy documentation; adds matcher dictionary-entropy seeding hook.
  • zstd/src/encoding/levels/fastest.rs: Adds post-split compressed-block path and rollback for oversize fallback.
  • zstd/src/encoding/frame_compressor.rs: Adds donor pre-splitting and contiguous-input block handling; updates framing policy.
  • zstd/src/encoding/blocks/compressed.rs: Major refactor: post-split encoding, literals RLE handling, FSE RLE mode, table selection cost model.
  • zstd/src/dictionary/mod.rs: Switches FSE table build to symbol counts, propagates Result, rejects empty training source.
  • zstd/src/dictionary/fastcover.rs: Uses key-based sorting for segment scoring.
  • zstd/src/decoding/sequence_section_decoder.rs: Removes debug printing from tests.
  • zstd/benches/support/mod.rs: Adds level22 to benchmark matrix; removes unused helper.
  • zstd/Cargo.toml: Enables experimental feature on dev-dependency zstd for interoperability testing.
  • README.md: Updates documented strategy coverage for levels 5–22.

Comment thread zstd/src/fse/fse_encoder.rs Outdated
Comment thread zstd/src/fse/fse_encoder.rs

@coderabbitai (bot) left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/fse/fse_encoder.rs`:
- Line 56: The assertion in zstd/src/fse/fse_encoder.rs currently forbids a
two-symbol interleaved stream by using assert!(data.len() > 2); relax this to
assert!(data.len() >= 2) so len == 2 is allowed (the even branch can init both
terminal states and write them out); do not change the singleton fallback
logic—leave handling of len == 1 to the caller/single-stream path.
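The relaxed precondition can be sketched like this; the function is an illustrative stand-in for the real interleaved-encoder entry point:

```rust
// Sketch of the relaxed guard: len == 2 is enough for the even branch to
// initialize both terminal states from the last two symbols and flush them.
// len == 1 stays with the caller's single-stream path.
fn init_interleaved_states(data: &[u8]) -> (u8, u8) {
    assert!(data.len() >= 2, "interleaved path needs at least two symbols");
    (data[data.len() - 1], data[data.len() - 2])
}
```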

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 69702e5a-e994-44f7-accc-66d5bd668490

📥 Commits

Reviewing files that changed from the base of the PR and between 23fb1ee and b903f45.

📒 Files selected for processing (2)
  • zstd/src/dictionary/mod.rs
  • zstd/src/fse/fse_encoder.rs

Comment thread zstd/src/fse/fse_encoder.rs Outdated

@coderabbitai (bot) left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/fse/fse_encoder.rs`:
- Around line 338-345: The current check uses total > 1 which allows
single-symbol (repeated) histograms to proceed into donor normalization; change
the guard to reject one-symbol histograms by counting distinct non-zero symbols
(e.g., let distinct = counts.iter().filter(|&&c| c > 0).count();) and assert
distinct > 1 (or return the RLE path) before calling
donor_optimal_table_log/max_symbol; update the logic around total, max_symbol,
and donor_optimal_table_log so only histograms with more than one distinct
symbol build FSE tables (use the symbols counts, total, max_symbol,
donor_optimal_table_log, probs to locate the code to change).
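The distinct-symbol guard can be sketched as a small table-mode chooser; the enum and function names are illustrative, not the crate's real API:

```rust
// Illustrative table-mode decision: FSE tables only make sense when the
// histogram has more than one distinct non-zero symbol; a single-symbol
// histogram should take the RLE path instead.
#[derive(Debug, PartialEq)]
enum TableChoice {
    Fse,
    Rle,
}

fn choose_table(counts: &[u32]) -> TableChoice {
    let distinct = counts.iter().filter(|&&c| c > 0).count();
    if distinct > 1 {
        TableChoice::Fse
    } else {
        TableChoice::Rle
    }
}
```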

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5e296704-c24a-4c74-a851-57da7c522b7f

📥 Commits

Reviewing files that changed from the base of the PR and between b903f45 and 9135def.

📒 Files selected for processing (2)
  • zstd/src/encoding/match_generator.rs
  • zstd/src/fse/fse_encoder.rs

Comment thread zstd/src/fse/fse_encoder.rs

@coderabbitai (bot) left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@zstd/src/encoding/blocks/compressed.rs`:
- Around line 389-393: The estimator currently resets the right partition to the
block-entry state by reassigning self.scratch_state.last_huff_table,
.fse_tables.ll_previous/.ml_previous/.of_previous, and .offset_hist from the
original block-entry values; instead, after scoring/encoding the left partition
(the "first" split) propagate the left partition's post-emit state into the
scratch_state used to score the right partition so the "second" split is
evaluated with the actual Huffman/FSE tables and repeat offsets that would be
present at emission time (adjust the code paths around derive_block_splits()/the
split-scoring routine to clone the post-state produced by the left partition and
use those values for last_huff_table, fse_tables.* and offset_hist when
computing the right-hand cost, rather than reinitializing from the block-entry
state).
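A hypothetical miniature of that state-propagation fix: score the right partition from the left partition's post-emit entropy state rather than resetting to the block-entry snapshot. The struct loosely mirrors the real scratch state; all names are illustrative:

```rust
// Loose stand-in for the scratch entropy state (Huffman/FSE tables and
// repeat-offset history would live here in the real encoder).
#[derive(Clone)]
struct EntropyState {
    offset_hist: [u32; 3],
}

// Score both partitions of a candidate split. The closure returns the
// partition's cost plus the post-emit state it would leave behind.
fn score_split<F>(entry: &EntropyState, score_part: F) -> usize
where
    F: Fn(&EntropyState) -> (usize, EntropyState),
{
    let (left_cost, left_post) = score_part(entry);
    // Propagate the left partition's post-emit state; do NOT reset to `entry`.
    let (right_cost, _right_post) = score_part(&left_post);
    left_cost + right_cost
}
```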

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3ef26892-fa6d-4839-8e42-67d51051e63a

📥 Commits

Reviewing files that changed from the base of the PR and between eed734e and 6279ca7.

📒 Files selected for processing (1)
  • zstd/src/encoding/blocks/compressed.rs

Comment thread zstd/src/encoding/blocks/compressed.rs
Labels: none yet
Projects: none yet
2 participants