Improve FSST LIKE contains handling by gatesn · Pull Request #8573 · vortex-data/vortex

gatesn · 2026-06-24T15:04:06Z

Rationale for this change

FSST's compressed DFA path is not always the fastest path for short %needle% LIKE patterns. For short substring needles, decoding the FSST array once and using the existing canonical LIKE implementation is faster than walking the compressed byte stream through the DFA.

This also makes Shared transparent to parent reductions so FSST parent kernels are still reached when arrays are wrapped by the shared cache layer.
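A minimal sketch of what "transparent to parent reductions" could mean, under loud assumptions: the `Array` enum, `unwrap_shared`, and `has_fsst_parent_kernel` below are hypothetical stand-ins invented for illustration, not Vortex types or functions. The idea shown is only that a parent looks through any `Shared` wrapper before deciding whether an FSST-specific kernel applies.

```rust
/// Hypothetical stand-in for an array tree. These are NOT Vortex's real types.
#[derive(Debug)]
enum Array {
    /// An FSST-compressed string array.
    Fsst,
    /// Any already-canonical array.
    Canonical,
    /// A cache layer that wraps another array without changing its contents.
    Shared(Box<Array>),
}

/// Peel any number of `Shared` wrappers to expose the underlying encoding.
fn unwrap_shared(mut array: &Array) -> &Array {
    while let Array::Shared(inner) = array {
        array = &**inner;
    }
    array
}

/// Decide whether an FSST parent kernel can be used for `child`.
/// Without the unwrapping step, a `Shared(Fsst)` child would be missed.
fn has_fsst_parent_kernel(child: &Array) -> bool {
    matches!(unwrap_shared(child), Array::Fsst)
}
```

In this sketch, `has_fsst_parent_kernel(&Array::Shared(Box::new(Array::Fsst)))` is true, which mirrors the case the regression test is described as covering.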

What changes are included in this PR?

Short contains-style FSST LIKE patterns now fall back to canonicalized LIKE evaluation. The FSST LikeKind parser is made crate-visible so the LIKE kernel can choose that path before constructing the DFA matcher.
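The selection step described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the shape of `LikeKind`, the `parse_like_kind` and `choose_plan` helpers, the `Plan` enum, and especially the `SHORT_NEEDLE_MAX_LEN` threshold are hypothetical and are not taken from the PR's diff.

```rust
/// Hypothetical classification of a LIKE pattern. The real FSST `LikeKind` may differ.
#[derive(Debug, PartialEq)]
enum LikeKind<'a> {
    /// `prefix%`
    Prefix(&'a [u8]),
    /// `%needle%`
    Contains(&'a [u8]),
}

/// Hypothetical cutoff for a "short" needle. The PR text does not state a value.
const SHORT_NEEDLE_MAX_LEN: usize = 8;

/// Which execution path to take for a LIKE predicate over an FSST array.
#[derive(Debug, PartialEq)]
enum Plan {
    /// Decode the FSST array once, then run the canonical LIKE implementation.
    DecodeThenCanonicalLike,
    /// Walk the compressed byte stream with a DFA matcher.
    CompressedDfa,
    /// Pattern shape is not handled by this kernel at all.
    NotPushedDown,
}

/// True if the bytes contain a LIKE wildcard or an escape character.
fn has_special(bytes: &[u8]) -> bool {
    bytes.iter().any(|b| matches!(b, b'%' | b'_' | b'\\'))
}

/// Recognize only plain `%needle%` and `prefix%` patterns.
fn parse_like_kind(pattern: &[u8]) -> Option<LikeKind<'_>> {
    let n = pattern.len();
    if n >= 2 && pattern[0] == b'%' && pattern[n - 1] == b'%' {
        let needle = &pattern[1..n - 1];
        return (!has_special(needle)).then(|| LikeKind::Contains(needle));
    }
    if n >= 1 && pattern[n - 1] == b'%' {
        let prefix = &pattern[..n - 1];
        return (!has_special(prefix)).then(|| LikeKind::Prefix(prefix));
    }
    None
}

/// Pick the path before any DFA is constructed.
fn choose_plan(pattern: &[u8]) -> Plan {
    match parse_like_kind(pattern) {
        Some(LikeKind::Contains(needle)) if needle.len() <= SHORT_NEEDLE_MAX_LEN => {
            Plan::DecodeThenCanonicalLike
        }
        Some(_) => Plan::CompressedDfa,
        None => Plan::NotPushedDown,
    }
}
```

The point of parsing first is that the cheap classification runs before the comparatively expensive DFA construction, so short contains patterns never pay for a matcher they will not use.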

The PR also adds a regression test covering FSST parent kernels through SharedArray.

What APIs are changed? Are there any user-facing changes?

No public API changes. The only behavior change is in internal query execution: matching results are unchanged, but some FSST LIKE predicates should execute faster.

Signed-off-by: Nicholas Gates <nick@nickgates.com>

codspeed-hq · 2026-06-24T15:23:27Z

Merging this PR will degrade performance by 36.13%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 2 improved benchmarks
❌ 9 regressed benchmarks
✅ 1578 untouched benchmarks
⏩ 4 skipped benchmarks¹

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| | Mode | Benchmark | `BASE` | `HEAD` | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | `fsst_contains[path]` | 5 ms | 10.9 ms | -54.35% |
| ❌ | Simulation | `fsst_contains[email]` | 4.4 ms | 9.5 ms | -53.28% |
| ❌ | Simulation | `like_substr_high_match` | 8.1 ms | 16.6 ms | -51% |
| ❌ | Simulation | `fsst_contains[json]` | 13.9 ms | 28.3 ms | -50.99% |
| ❌ | Simulation | `fsst_contains[cb]` | 15.5 ms | 30.4 ms | -49.12% |
| ❌ | Simulation | `chunked_bool_canonical_into[(1000, 10)]` | 16.3 µs | 26.9 µs | -39.46% |
| ❌ | Simulation | `fsst_contains[log]` | 25.4 ms | 40.5 ms | -37.35% |
| ❌ | Simulation | `fsst_contains[urls]` | 10.4 ms | 15 ms | -30.87% |
| ❌ | Simulation | `slice_empty_vortex` | 310 ns | 368.3 ns | -15.84% |
| ⚡ | Simulation | `bitwise_not_vortex_buffer_mut[128]` | 244.4 ns | 215.3 ns | +13.55% |
| ⚡ | Simulation | `bitwise_not_vortex_buffer_mut[1024]` | 304.7 ns | 275.6 ns | +10.58% |
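For readers checking the numbers: the Efficiency column appears roughly consistent with `BASE / HEAD - 1`, expressed as a percentage. This is inferred from the rows above, not from CodSpeed's documentation, and the displayed timings are rounded, so recomputed values only match approximately. The `efficiency_pct` helper is a hypothetical name used for illustration.

```rust
/// Relative change as the table seems to report it (an inference from the
/// rows above, not a documented formula): positive when HEAD is faster
/// than BASE, negative when it is slower.
fn efficiency_pct(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}
```

Under this reading, a benchmark whose time roughly doubles reports about -50%, which matches the `fsst_contains` rows.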

Tip

Investigate this regression by commenting `@codspeedbot fix this regression` on this PR, or directly use the CodSpeed MCP with your agent.

Comparing ngates/fsst-like-pushdown (d15418f) with develop (2a19323)

¹ 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived on CodSpeed to remove them from the performance reports.

Improve FSST LIKE contains handling

d15418f

Signed-off-by: Nicholas Gates <nick@nickgates.com>

gatesn wants to merge 1 commit into develop from ngates/fsst-like-pushdown
