IN LIST: reinterpret FixedSizeBinary for primitive fast paths #23018

Draft
geoffreyclaude wants to merge 11 commits into apache:main from geoffreyclaude:perf/in_list_fixed_size_binary_filter

Conversation

@geoffreyclaude geoffreyclaude commented Jun 18, 2026

Contributor

Which issue does this PR close?

Rationale for this change

FixedSizeBinary means every value has the same number of bytes. For widths 1, 2, 4, 8, and 16, those bytes have the same shape as the primitive values optimized earlier in the stack.

That lets DataFusion reuse the existing fast paths without copying the bytes.

For example, a FixedSizeBinary(4) value is four bytes wide, just like a UInt32. The bytes can be checked by the same fixed-width lookup machinery. The value is still treated as binary data; this is only an internal lookup representation.
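The idea can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `fsb4_key` and `fsb4_in_list` are hypothetical names, the flat `&[u8]` buffer stands in for an Arrow values buffer, and the linear `contains` scan stands in for whichever fixed-width filter is actually selected. Native byte order is enough here because the keys are only compared for equality.

```rust
/// Hypothetical helper (not from the PR): turn one 4-byte binary value into
/// a `u32` lookup key. The key is never ordered or displayed, so the native
/// byte order is sufficient.
fn fsb4_key(value: &[u8]) -> Option<u32> {
    if value.len() != 4 {
        return None;
    }
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(value);
    Some(u32::from_ne_bytes(bytes))
}

/// Evaluate `value IN (needles...)` over a flat buffer of 4-byte values.
/// Returns `None` if the buffer length is not a multiple of 4.
fn fsb4_in_list(values: &[u8], needles: &[[u8; 4]]) -> Option<Vec<bool>> {
    if values.len() % 4 != 0 {
        return None;
    }
    // Needles go through the same byte-to-key mapping as the values.
    let keys: Vec<u32> = needles.iter().map(|n| u32::from_ne_bytes(*n)).collect();
    Some(
        values
            .chunks_exact(4)
            .map(|chunk| fsb4_key(chunk).map_or(false, |k| keys.contains(&k)))
            .collect(),
    )
}
```

Two values produce the same key exactly when their four bytes are identical, so the lookup result matches a byte-wise comparison.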

Other fixed-size binary widths stay on the generic fallback path.

What changes are included in this PR?

  • Routes FixedSizeBinary(1) and FixedSizeBinary(2) through the bitmap filters.
  • Routes FixedSizeBinary(4), (8), and (16) through branchless or direct-probe filters based on list size.
  • Adds shared width validation so bitmap, branchless, and direct-probe wrappers all accept matching FixedSizeBinary needles.
  • Keeps unsupported FixedSizeBinary widths on ArrayStaticFilter.
  • Adds focused coverage for sliced lists, null semantics, bitmap-width lookup, and direct-probe-width lookup.
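The routing described above can be sketched as follows. The strategy names mirror the list, but everything else is an assumption for illustration: `choose_strategy`, the `branchless_max` threshold (the description only says the choice depends on list size), and `ByteBitmap`, which shows one plausible shape for a 1-byte bitmap filter rather than the PR's implementation.

```rust
/// Strategy names follow the PR description; the selection logic is a sketch.
#[derive(Debug, PartialEq)]
enum Strategy {
    Bitmap,
    Branchless,
    DirectProbe,
    Generic, // stands in for ArrayStaticFilter
}

/// `branchless_max` is a hypothetical threshold, passed in rather than
/// hard-coded because the PR text does not state the cut-off.
fn choose_strategy(width: usize, list_len: usize, branchless_max: usize) -> Strategy {
    match width {
        1 | 2 => Strategy::Bitmap,
        4 | 8 | 16 if list_len <= branchless_max => Strategy::Branchless,
        4 | 8 | 16 => Strategy::DirectProbe,
        _ => Strategy::Generic,
    }
}

/// One plausible bitmap filter for 1-byte values: 256 bits, one per byte value.
struct ByteBitmap([u64; 4]);

impl ByteBitmap {
    fn from_needles(needles: &[u8]) -> Self {
        let mut bits = [0u64; 4];
        for &n in needles {
            // High two bits pick the word, low six bits pick the bit.
            bits[(n >> 6) as usize] |= 1u64 << (n & 63);
        }
        ByteBitmap(bits)
    }

    fn contains(&self, v: u8) -> bool {
        (self.0[(v >> 6) as usize] >> (v & 63)) & 1 == 1
    }
}
```

With a bitmap, membership costs one shift and one mask regardless of list size, which is why the narrow widths need no list-size threshold in this sketch.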

Are these changes tested?

Yes.

  • cargo fmt --all --check
  • cargo test -p datafusion-physical-expr fixed_size_binary --lib
  • cargo test -p datafusion-physical-expr test_in_list_from_array_type_combinations --lib
  • cargo test -p datafusion-physical-expr reinterpreted_ --lib
  • cargo test -p datafusion-physical-expr in_list_binary_types --lib
  • cargo clippy -p datafusion-physical-expr --all-targets --all-features -- -D warnings

Are there any user-facing changes?

No. This is an internal performance optimization only.

Local benchmark snapshot

Benchmark command:

cargo bench -p datafusion-physical-expr --profile release-nonlto --bench in_list_strategy -- fixed_size_binary --save-baseline pr23018_rebased

Method: compare adjacent saved baselines using raw Criterion sample minima (min(time / iters)). Lower is better; changes within +/-5% are treated as noise.
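For readers who want to reproduce the arithmetic, here is a minimal sketch of that method. The function names and the `(total_time, iterations)` sample layout are assumptions made for illustration; they are not Criterion's API or the script used for this snapshot.

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Faster,
    Slower,
    Noise,
}

/// Minimum per-iteration time over raw samples of (total_time, iterations).
/// Samples with zero iterations are skipped.
fn sample_min(samples: &[(f64, u64)]) -> Option<f64> {
    samples
        .iter()
        .filter(|(_, iters)| *iters > 0)
        .map(|(time, iters)| *time / (*iters as f64))
        .fold(None, |best: Option<f64>, t| Some(best.map_or(t, |b| b.min(t))))
}

/// Percent change from `before` to `after` (negative means faster), with
/// changes inside +/-5% classified as noise.
fn compare(before: f64, after: f64) -> (f64, Verdict) {
    let change_pct = (after - before) / before * 100.0;
    let verdict = if change_pct < -5.0 {
        Verdict::Faster
    } else if change_pct > 5.0 {
        Verdict::Slower
    } else {
        Verdict::Noise
    };
    (change_pct, verdict)
}
```

Applied to the first row of the table below, `compare(29.76, 10.86)` gives a change of about -63.5%, and the speedup factor is `29.76 / 10.86`, about 2.74x.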

Compared baselines: #23016 -> #23018

Relevant scope: FixedSizeBinary rows.

Summary: 8 relevant rows, 8 faster, 0 slower, 0 within +/-5%.

| Benchmark | Before | After | Change |
| --- | --- | --- | --- |
| fixed_size_binary/fsb16/list=10000/match=0% | 29.76 us | 10.86 us | -63.5% (2.74x faster) |
| fixed_size_binary/fsb16/list=10000/match=50% | 78.30 us | 12.11 us | -84.5% (6.46x faster) |
| fixed_size_binary/fsb16/list=256/match=0% | 27.59 us | 11.28 us | -59.1% (2.45x faster) |
| fixed_size_binary/fsb16/list=256/match=50% | 73.52 us | 11.98 us | -83.7% (6.14x faster) |
| fixed_size_binary/fsb16/list=4/match=0% | 26.68 us | 12.49 us | -53.2% (2.14x faster) |
| fixed_size_binary/fsb16/list=4/match=50% | 73.79 us | 12.44 us | -83.1% (5.93x faster) |
| fixed_size_binary/fsb16/list=64/match=0% | 26.49 us | 11.25 us | -57.5% (2.36x faster) |
| fixed_size_binary/fsb16/list=64/match=50% | 71.19 us | 12.41 us | -82.6% (5.73x faster) |

@github-actions github-actions Bot added the physical-expr Changes to the physical-expr crates label Jun 18, 2026
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_fixed_size_binary_filter branch 3 times, most recently from 70c420f to 098e0a6 Compare June 18, 2026 10:29
@geoffreyclaude geoffreyclaude changed the title Implement FixedSizeBinary zero-copy reinterpretation optimization IN LIST: reinterpret FixedSizeBinary for primitive fast paths Jun 18, 2026
@geoffreyclaude

Contributor Author

run benchmark in_list_strategy

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4745227444-590-9nfvd 6.12.68+ #1 SMP Sat May 2 07:49:07 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing perf/in_list_fixed_size_binary_filter (098e0a6) to c7e9284 (merge-base) diff using: in_list_strategy
Results will be posted here when complete


@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                                HEAD                                   perf_in_list_fixed_size_binary_filter
-----                                                                ----                                   -------------------------------------
dictionary/i32/dict=10/list=16                                       1.00      7.6±0.01µs        ? ?/sec    1.01      7.7±0.01µs        ? ?/sec
dictionary/i32/dict=100/list=16                                      1.00      7.7±0.01µs        ? ?/sec    1.01      7.8±0.01µs        ? ?/sec
dictionary/i32/dict=100/list=16/NOT_IN                               1.00      7.7±0.01µs        ? ?/sec    1.01      7.8±0.01µs        ? ?/sec
dictionary/i32/dict=100/list=4                                       1.00      7.7±0.01µs        ? ?/sec    1.00      7.7±0.01µs        ? ?/sec
dictionary/i32/dict=100/list=64                                      1.00      7.7±0.01µs        ? ?/sec    1.01      7.8±0.01µs        ? ?/sec
dictionary/i32/dict=1000/list=16                                     1.01      9.0±0.01µs        ? ?/sec    1.00      8.9±0.01µs        ? ?/sec
dictionary/utf8_long/dict=100/list=16                                1.06      8.8±0.05µs        ? ?/sec    1.00      8.3±0.01µs        ? ?/sec
dictionary/utf8_short/dict=50/list=32                                1.06      8.6±0.06µs        ? ?/sec    1.00      8.1±0.01µs        ? ?/sec
dictionary/utf8_short/dict=50/list=8                                 1.05      8.5±0.11µs        ? ?/sec    1.00      8.1±0.01µs        ? ?/sec
dictionary/utf8_short/dict=500/list=20                               1.00     10.4±0.06µs        ? ?/sec    1.16     12.1±0.01µs        ? ?/sec
f32/large_list/list=64/match=0%                                      2.06     16.0±0.12µs        ? ?/sec    1.00      7.8±0.00µs        ? ?/sec
f32/large_list/list=64/match=50%                                     2.04     18.7±0.14µs        ? ?/sec    1.00      9.2±0.01µs        ? ?/sec
f32/small_list/list=32/match=0%                                      1.07     16.3±0.01µs        ? ?/sec    1.00     15.2±0.02µs        ? ?/sec
f32/small_list/list=32/match=50%                                     1.55     23.5±0.38µs        ? ?/sec    1.00     15.2±0.01µs        ? ?/sec
f32/small_list/list=4/match=0%                                       4.60     16.0±0.01µs        ? ?/sec    1.00      3.5±0.00µs        ? ?/sec
f32/small_list/list=4/match=50%                                      7.98     27.6±0.26µs        ? ?/sec    1.00      3.5±0.01µs        ? ?/sec
fixed_size_binary/fsb16/list=10000/match=0%                          2.71     33.3±0.06µs        ? ?/sec    1.00     12.3±0.04µs        ? ?/sec
fixed_size_binary/fsb16/list=10000/match=50%                         5.79     83.5±0.35µs        ? ?/sec    1.00     14.4±0.05µs        ? ?/sec
fixed_size_binary/fsb16/list=256/match=0%                            3.04     31.2±0.18µs        ? ?/sec    1.00     10.3±0.02µs        ? ?/sec
fixed_size_binary/fsb16/list=256/match=50%                           6.90     78.0±0.43µs        ? ?/sec    1.00     11.3±0.01µs        ? ?/sec
fixed_size_binary/fsb16/list=4/match=0%                              2.68     31.0±0.04µs        ? ?/sec    1.00     11.5±0.01µs        ? ?/sec
fixed_size_binary/fsb16/list=4/match=50%                             6.41     74.1±0.32µs        ? ?/sec    1.00     11.6±0.04µs        ? ?/sec
fixed_size_binary/fsb16/list=64/match=0%                             3.00     30.9±0.12µs        ? ?/sec    1.00     10.3±0.03µs        ? ?/sec
fixed_size_binary/fsb16/list=64/match=50%                            6.66     75.9±2.60µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
narrow_integer/i16/list=256/match=0%                                 2.47     13.1±0.09µs        ? ?/sec    1.00      5.3±0.01µs        ? ?/sec
narrow_integer/i16/list=256/match=50%                                3.29     17.5±0.21µs        ? ?/sec    1.00      5.3±0.01µs        ? ?/sec
narrow_integer/i16/list=4/match=0%                                   2.45     13.0±0.04µs        ? ?/sec    1.00      5.3±0.00µs        ? ?/sec
narrow_integer/i16/list=4/match=50%                                  4.31     22.9±0.31µs        ? ?/sec    1.00      5.3±0.03µs        ? ?/sec
narrow_integer/i16/list=64/match=0%                                  2.35     12.6±0.02µs        ? ?/sec    1.00      5.4±0.01µs        ? ?/sec
narrow_integer/i16/list=64/match=50%                                 3.43     18.5±0.17µs        ? ?/sec    1.00      5.4±0.03µs        ? ?/sec
narrow_integer/u8/list=16/match=0%                                   2.49     12.9±0.14µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
narrow_integer/u8/list=16/match=50%                                  4.71     24.5±0.16µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
narrow_integer/u8/list=4/match=0%                                    2.29     11.9±0.02µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
narrow_integer/u8/list=4/match=50%                                   5.08     26.4±0.29µs        ? ?/sec    1.00      5.2±0.00µs        ? ?/sec
nulls/narrow_integer/u8/list=16/match=50%/nulls=20%                  6.52     34.5±0.98µs        ? ?/sec    1.00      5.3±0.02µs        ? ?/sec
nulls/primitive/i32/large_list/list=64/match=50%/nulls=20%           2.03     20.5±0.07µs        ? ?/sec    1.00     10.1±0.01µs        ? ?/sec
nulls/primitive/i32/small_list/list=16/match=50%/nulls=20%           2.33     23.1±0.13µs        ? ?/sec    1.00      9.9±0.00µs        ? ?/sec
nulls/primitive/i32/small_list/list=16/match=50%/nulls=20%/NOT_IN    2.49     24.8±0.13µs        ? ?/sec    1.00     10.0±0.01µs        ? ?/sec
nulls/primitive/i32/small_list/list=16/match=50%/nulls=50%           1.77     17.6±0.16µs        ? ?/sec    1.00      9.9±0.01µs        ? ?/sec
nulls/utf8/long_24b/list=16/match=50%/nulls=20%                      1.18     82.4±0.24µs        ? ?/sec    1.00     69.6±0.19µs        ? ?/sec
nulls/utf8/short_8b/list=16/match=50%/nulls=20%                      1.08     71.6±0.27µs        ? ?/sec    1.00     66.4±0.37µs        ? ?/sec
nulls/utf8view/long_24b/list=16/match=50%/nulls=20%                  1.00     97.8±0.75µs        ? ?/sec    1.08    105.7±0.24µs        ? ?/sec
nulls/utf8view/short_8b/list=16/match=50%/nulls=20%                  5.05     58.7±0.28µs        ? ?/sec    1.00     11.6±0.01µs        ? ?/sec
nulls/utf8view/short_8b/list=16/match=50%/nulls=20%/NOT_IN           5.00     58.6±0.32µs        ? ?/sec    1.00     11.7±0.01µs        ? ?/sec
nulls/utf8view/short_8b/list=16/match=50%/nulls=50%                  4.11     50.8±0.30µs        ? ?/sec    1.00     12.4±0.01µs        ? ?/sec
primitive/i32/large_list/list=256/match=0%                           1.53     12.0±0.02µs        ? ?/sec    1.00      7.8±0.01µs        ? ?/sec
primitive/i32/large_list/list=256/match=50%                          1.96     18.9±0.14µs        ? ?/sec    1.00      9.6±0.01µs        ? ?/sec
primitive/i32/large_list/list=64/match=0%                            1.54     12.1±0.05µs        ? ?/sec    1.00      7.9±0.00µs        ? ?/sec
primitive/i32/large_list/list=64/match=50%                           1.95     19.3±0.20µs        ? ?/sec    1.00      9.9±0.01µs        ? ?/sec
primitive/i32/small_list/list=16/match=50%/NOT_IN                    2.56     25.3±0.23µs        ? ?/sec    1.00      9.9±0.01µs        ? ?/sec
primitive/i32/small_list/list=32/match=0%                            1.00     12.1±0.09µs        ? ?/sec    1.26     15.2±0.01µs        ? ?/sec
primitive/i32/small_list/list=32/match=50%                           1.21     18.4±1.18µs        ? ?/sec    1.00     15.2±0.01µs        ? ?/sec
primitive/i32/small_list/list=4/match=0%                             3.52     12.2±0.01µs        ? ?/sec    1.00      3.5±0.00µs        ? ?/sec
primitive/i32/small_list/list=4/match=50%                            6.71     23.2±0.65µs        ? ?/sec    1.00      3.5±0.00µs        ? ?/sec
primitive/i64/large_list/list=128/match=0%                           1.58     12.4±0.07µs        ? ?/sec    1.00      7.8±0.01µs        ? ?/sec
primitive/i64/large_list/list=128/match=50%                          2.01     18.6±0.17µs        ? ?/sec    1.00      9.3±0.01µs        ? ?/sec
primitive/i64/large_list/list=32/match=0%                            1.51     12.7±0.06µs        ? ?/sec    1.00      8.4±0.01µs        ? ?/sec
primitive/i64/large_list/list=32/match=50%                           2.15     22.3±0.54µs        ? ?/sec    1.00     10.4±0.01µs        ? ?/sec
primitive/i64/small_list/list=16/match=0%                            1.00     12.7±0.01µs        ? ?/sec    1.25     15.9±0.02µs        ? ?/sec
primitive/i64/small_list/list=16/match=50%                           1.43     22.7±0.37µs        ? ?/sec    1.00     15.9±0.02µs        ? ?/sec
primitive/i64/small_list/list=4/match=0%                             2.91     12.6±0.02µs        ? ?/sec    1.00      4.3±0.01µs        ? ?/sec
primitive/i64/small_list/list=4/match=50%                            5.15     22.3±0.72µs        ? ?/sec    1.00      4.3±0.00µs        ? ?/sec
timestamp_ns/large_list/list=32/match=0%                             3.02     25.0±0.11µs        ? ?/sec    1.00      8.3±0.01µs        ? ?/sec
timestamp_ns/large_list/list=32/match=50%                            6.02     59.4±0.59µs        ? ?/sec    1.00      9.9±0.01µs        ? ?/sec
timestamp_ns/small_list/list=16/match=0%                             1.56     24.9±0.03µs        ? ?/sec    1.00     15.9±0.01µs        ? ?/sec
timestamp_ns/small_list/list=16/match=50%                            3.71     59.1±0.59µs        ? ?/sec    1.00     15.9±0.02µs        ? ?/sec
timestamp_ns/small_list/list=4/match=0%                              5.80     25.1±0.03µs        ? ?/sec    1.00      4.3±0.01µs        ? ?/sec
timestamp_ns/small_list/list=4/match=50%                             13.71    59.3±0.43µs        ? ?/sec    1.00      4.3±0.01µs        ? ?/sec
utf8/long_24b/list=256/match=0%                                      1.19     41.3±0.06µs        ? ?/sec    1.00     34.7±0.04µs        ? ?/sec
utf8/long_24b/list=256/match=50%                                     1.31     94.8±0.46µs        ? ?/sec    1.00     72.5±0.46µs        ? ?/sec
utf8/long_24b/list=4/match=0%                                        1.15     41.3±0.04µs        ? ?/sec    1.00     35.8±0.36µs        ? ?/sec
utf8/long_24b/list=4/match=50%                                       1.30     93.2±0.30µs        ? ?/sec    1.00     71.7±0.45µs        ? ?/sec
utf8/long_24b/list=64/match=0%                                       1.17     40.9±0.07µs        ? ?/sec    1.00     34.8±0.47µs        ? ?/sec
utf8/long_24b/list=64/match=50%                                      1.27     94.0±0.45µs        ? ?/sec    1.00     74.3±0.67µs        ? ?/sec
utf8/mixed_len/list=16/match=0%                                      1.14     40.3±0.10µs        ? ?/sec    1.00     35.3±0.15µs        ? ?/sec
utf8/mixed_len/list=16/match=50%                                     1.25    120.6±1.09µs        ? ?/sec    1.00     96.3±0.77µs        ? ?/sec
utf8/mixed_len/list=64/match=0%                                      1.18     43.0±0.12µs        ? ?/sec    1.00     36.4±0.12µs        ? ?/sec
utf8/mixed_len/list=64/match=50%                                     1.21    124.6±0.83µs        ? ?/sec    1.00    102.8±0.94µs        ? ?/sec
utf8/shared_prefix/pfx=12/list=32/match=50%                          1.28     93.8±0.36µs        ? ?/sec    1.00     73.1±0.65µs        ? ?/sec
utf8/short_8b/list=16/match=50%/NOT_IN                               1.00     84.2±0.28µs        ? ?/sec    1.09     91.7±0.31µs        ? ?/sec
utf8/short_8b/list=256/match=0%                                      1.00     33.8±0.04µs        ? ?/sec    2.19     74.0±0.15µs        ? ?/sec
utf8/short_8b/list=256/match=50%                                     1.00     85.1±0.34µs        ? ?/sec    1.05     89.1±0.30µs        ? ?/sec
utf8/short_8b/list=4/match=0%                                        1.00     34.3±0.35µs        ? ?/sec    2.17     74.4±0.17µs        ? ?/sec
utf8/short_8b/list=4/match=50%                                       1.00     83.1±0.34µs        ? ?/sec    1.03     85.6±0.32µs        ? ?/sec
utf8/short_8b/list=64/match=0%                                       1.00     33.6±0.03µs        ? ?/sec    2.22     74.5±0.16µs        ? ?/sec
utf8/short_8b/list=64/match=50%                                      1.00     85.2±0.33µs        ? ?/sec    1.09     92.8±0.35µs        ? ?/sec
utf8view/len_12b/list=16/match=0%                                    2.49     25.8±0.03µs        ? ?/sec    1.00     10.4±0.02µs        ? ?/sec
utf8view/len_12b/list=16/match=50%                                   5.93     66.2±0.40µs        ? ?/sec    1.00     11.2±0.01µs        ? ?/sec
utf8view/len_12b/list=64/match=0%                                    2.59     25.9±0.03µs        ? ?/sec    1.00     10.0±0.02µs        ? ?/sec
utf8view/len_12b/list=64/match=50%                                   5.94     65.9±0.44µs        ? ?/sec    1.00     11.1±0.02µs        ? ?/sec
utf8view/long_24b/list=16/match=0%                                   3.62     47.8±0.08µs        ? ?/sec    1.00     13.2±0.01µs        ? ?/sec
utf8view/long_24b/list=16/match=50%                                  1.44    106.6±0.25µs        ? ?/sec    1.00     74.1±0.37µs        ? ?/sec
utf8view/long_24b/list=256/match=0%                                  2.71     47.3±0.07µs        ? ?/sec    1.00     17.5±0.02µs        ? ?/sec
utf8view/long_24b/list=256/match=50%                                 1.27    106.7±0.35µs        ? ?/sec    1.00     83.9±0.22µs        ? ?/sec
utf8view/long_24b/list=4/match=0%                                    3.96     47.9±0.05µs        ? ?/sec    1.00     12.1±0.02µs        ? ?/sec
utf8view/long_24b/list=4/match=50%                                   1.46    106.3±0.33µs        ? ?/sec    1.00     73.0±0.51µs        ? ?/sec
utf8view/long_24b/list=64/match=0%                                   2.55     47.2±0.07µs        ? ?/sec    1.00     18.5±0.03µs        ? ?/sec
utf8view/long_24b/list=64/match=50%                                  1.25    106.5±0.43µs        ? ?/sec    1.00     85.1±0.48µs        ? ?/sec
utf8view/mixed_len/list=16/match=0%                                  2.79     35.8±0.03µs        ? ?/sec    1.00     12.8±0.02µs        ? ?/sec
utf8view/mixed_len/list=16/match=50%                                 2.78    107.5±0.51µs        ? ?/sec    1.00     38.6±0.27µs        ? ?/sec
utf8view/mixed_len/list=64/match=0%                                  2.81     35.6±0.10µs        ? ?/sec    1.00     12.7±0.02µs        ? ?/sec
utf8view/mixed_len/list=64/match=50%                                 2.25    117.5±0.58µs        ? ?/sec    1.00     52.3±0.24µs        ? ?/sec
utf8view/shared_prefix/pfx=12/list=32/match=0%                       4.82     50.4±0.21µs        ? ?/sec    1.00     10.4±0.02µs        ? ?/sec
utf8view/shared_prefix/pfx=12/list=32/match=50%                      1.46    108.5±0.37µs        ? ?/sec    1.00     74.3±0.28µs        ? ?/sec
utf8view/shared_prefix/pfx=16/list=64/match=0%                       4.54     47.4±0.05µs        ? ?/sec    1.00     10.4±0.03µs        ? ?/sec
utf8view/shared_prefix/pfx=16/list=64/match=50%                      1.43    107.9±0.35µs        ? ?/sec    1.00     75.5±0.20µs        ? ?/sec
utf8view/shared_prefix/pfx=8/list=16/match=0%                        3.55     37.1±0.05µs        ? ?/sec    1.00     10.4±0.02µs        ? ?/sec
utf8view/shared_prefix/pfx=8/list=16/match=50%                       1.55     94.5±1.96µs        ? ?/sec    1.00     60.9±0.20µs        ? ?/sec
utf8view/short_8b/list=16/match=0%                                   2.45     25.0±0.02µs        ? ?/sec    1.00     10.2±0.01µs        ? ?/sec
utf8view/short_8b/list=16/match=50%                                  6.02     66.4±0.33µs        ? ?/sec    1.00     11.0±0.02µs        ? ?/sec
utf8view/short_8b/list=256/match=0%                                  2.49     25.1±0.05µs        ? ?/sec    1.00     10.1±0.02µs        ? ?/sec
utf8view/short_8b/list=256/match=50%                                 6.09     67.4±0.55µs        ? ?/sec    1.00     11.1±0.02µs        ? ?/sec
utf8view/short_8b/list=4/match=0%                                    2.24     25.6±0.03µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
utf8view/short_8b/list=4/match=50%                                   5.70     65.1±1.54µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
utf8view/short_8b/list=64/match=0%                                   2.37     24.9±0.19µs        ? ?/sec    1.00     10.5±0.01µs        ? ?/sec
utf8view/short_8b/list=64/match=50%                                  5.73     66.8±0.42µs        ? ?/sec    1.00     11.7±0.02µs        ? ?/sec

Resource Usage

in_list_strategy — base (merge-base)

| Metric | Value |
| --- | --- |
| Wall time | 1195.3s |
| Peak memory | 40.7 MiB |
| Avg memory | 30.9 MiB |
| CPU user | 1380.6s |
| CPU sys | 1.1s |
| Peak spill | 0 B |

in_list_strategy — branch

| Metric | Value |
| --- | --- |
| Wall time | 1310.3s |
| Peak memory | 42.3 MiB |
| Avg memory | 30.1 MiB |
| CPU user | 1451.3s |
| CPU sys | 1.1s |
| Peak spill | 0 B |

@geoffreyclaude

Contributor Author

run benchmark in_list

@adriangbot


🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4745623089-592-5pkzr 6.12.68+ #1 SMP Sat May 2 07:49:07 UTC 2026 aarch64 GNU/Linux


Comparing perf/in_list_fixed_size_binary_filter (098e0a6) to c7e9284 (merge-base) diff using: in_list
Results will be posted here when complete


@adriangbot


🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                  HEAD                                   perf_in_list_fixed_size_binary_filter
-----                                                  ----                                   -------------------------------------
in_list/Float32/list=100/nulls=0%                      8.08     60.6±2.29µs        ? ?/sec    1.00      7.5±0.03µs        ? ?/sec
in_list/Float32/list=100/nulls=20%                     3.23     25.6±0.92µs        ? ?/sec    1.00      7.9±0.01µs        ? ?/sec
in_list/Float32/list=28/nulls=0%                       2.90     46.4±2.08µs        ? ?/sec    1.00     16.0±0.02µs        ? ?/sec
in_list/Float32/list=28/nulls=20%                      3.86     62.0±0.52µs        ? ?/sec    1.00     16.1±0.03µs        ? ?/sec
in_list/Float32/list=3/nulls=0%                        4.92     15.4±0.06µs        ? ?/sec    1.00      3.1±0.01µs        ? ?/sec
in_list/Float32/list=3/nulls=20%                       4.81     15.4±0.08µs        ? ?/sec    1.00      3.2±0.01µs        ? ?/sec
in_list/Float32/list=8/nulls=0%                        3.04     15.5±0.06µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list/Float32/list=8/nulls=20%                       3.00     15.5±0.06µs        ? ?/sec    1.00      5.2±0.02µs        ? ?/sec
in_list/Int16/list=100/nulls=0%                        6.59     35.1±0.56µs        ? ?/sec    1.00      5.3±0.01µs        ? ?/sec
in_list/Int16/list=100/nulls=20%                       5.27     28.5±1.50µs        ? ?/sec    1.00      5.4±0.01µs        ? ?/sec
in_list/Int16/list=28/nulls=0%                         10.69    57.5±0.84µs        ? ?/sec    1.00      5.4±0.07µs        ? ?/sec
in_list/Int16/list=28/nulls=20%                        10.84    58.6±2.82µs        ? ?/sec    1.00      5.4±0.01µs        ? ?/sec
in_list/Int16/list=3/nulls=0%                          2.27     12.1±0.04µs        ? ?/sec    1.00      5.3±0.01µs        ? ?/sec
in_list/Int16/list=3/nulls=20%                         2.25     12.2±0.03µs        ? ?/sec    1.00      5.4±0.01µs        ? ?/sec
in_list/Int16/list=8/nulls=0%                          2.28     12.1±0.04µs        ? ?/sec    1.00      5.3±0.01µs        ? ?/sec
in_list/Int16/list=8/nulls=20%                         2.28     12.3±0.08µs        ? ?/sec    1.00      5.4±0.01µs        ? ?/sec
in_list/Int32/list=100/nulls=0%                        6.32     46.1±0.57µs        ? ?/sec    1.00      7.3±0.01µs        ? ?/sec
in_list/Int32/list=100/nulls=20%                       6.91     50.4±0.80µs        ? ?/sec    1.00      7.3±0.01µs        ? ?/sec
in_list/Int32/list=28/nulls=0%                         2.32     37.1±0.45µs        ? ?/sec    1.00     16.0±0.04µs        ? ?/sec
in_list/Int32/list=28/nulls=20%                        3.66     58.8±1.71µs        ? ?/sec    1.00     16.1±0.03µs        ? ?/sec
in_list/Int32/list=3/nulls=0%                          3.94     12.3±0.07µs        ? ?/sec    1.00      3.1±0.01µs        ? ?/sec
in_list/Int32/list=3/nulls=20%                         3.79     12.2±0.04µs        ? ?/sec    1.00      3.2±0.01µs        ? ?/sec
in_list/Int32/list=8/nulls=0%                          2.39     12.2±0.04µs        ? ?/sec    1.00      5.1±0.01µs        ? ?/sec
in_list/Int32/list=8/nulls=20%                         2.35     12.2±0.10µs        ? ?/sec    1.00      5.2±0.01µs        ? ?/sec
in_list/TimestampNs/list=100/nulls=0%                  8.97     66.6±1.84µs        ? ?/sec    1.00      7.4±0.01µs        ? ?/sec
in_list/TimestampNs/list=100/nulls=20%                 4.58     36.8±0.68µs        ? ?/sec    1.00      8.0±0.02µs        ? ?/sec
in_list/TimestampNs/list=28/nulls=0%                   5.44     41.4±0.73µs        ? ?/sec    1.00      7.6±0.02µs        ? ?/sec
in_list/TimestampNs/list=28/nulls=20%                  3.99     32.8±1.05µs        ? ?/sec    1.00      8.2±0.01µs        ? ?/sec
in_list/TimestampNs/list=3/nulls=0%                    6.74     24.9±0.06µs        ? ?/sec    1.00      3.7±0.01µs        ? ?/sec
in_list/TimestampNs/list=3/nulls=20%                   7.29     27.6±0.31µs        ? ?/sec    1.00      3.8±0.01µs        ? ?/sec
in_list/TimestampNs/list=8/nulls=0%                    3.53     25.5±0.06µs        ? ?/sec    1.00      7.2±0.01µs        ? ?/sec
in_list/TimestampNs/list=8/nulls=20%                   3.79     27.7±0.20µs        ? ?/sec    1.00      7.3±0.01µs        ? ?/sec
in_list/UInt8/list=100/nulls=0%                        3.52     18.3±0.42µs        ? ?/sec    1.00      5.2±0.01µs        ? ?/sec
in_list/UInt8/list=100/nulls=20%                       4.25     22.4±1.77µs        ? ?/sec    1.00      5.3±0.02µs        ? ?/sec
in_list/UInt8/list=28/nulls=0%                         12.21    63.5±0.93µs        ? ?/sec    1.00      5.2±0.02µs        ? ?/sec
in_list/UInt8/list=28/nulls=20%                        8.75     46.6±0.61µs        ? ?/sec    1.00      5.3±0.07µs        ? ?/sec
in_list/UInt8/list=3/nulls=0%                          2.34     12.1±0.04µs        ? ?/sec    1.00      5.2±0.02µs        ? ?/sec
in_list/UInt8/list=3/nulls=20%                         2.33     12.4±0.04µs        ? ?/sec    1.00      5.3±0.03µs        ? ?/sec
in_list/UInt8/list=8/nulls=0%                          2.39     12.5±0.07µs        ? ?/sec    1.00      5.2±0.01µs        ? ?/sec
in_list/UInt8/list=8/nulls=20%                         2.32     12.2±0.03µs        ? ?/sec    1.00      5.3±0.02µs        ? ?/sec
in_list/Utf8/list=100/nulls=0%/str=100                 1.18     94.4±0.76µs        ? ?/sec    1.00     80.2±1.62µs        ? ?/sec
in_list/Utf8/list=100/nulls=0%/str=12                  1.90     60.3±0.58µs        ? ?/sec    1.00     31.7±1.02µs        ? ?/sec
in_list/Utf8/list=100/nulls=0%/str=3                   1.03    112.5±0.52µs        ? ?/sec    1.00    109.3±0.76µs        ? ?/sec
in_list/Utf8/list=100/nulls=20%/str=100                1.07     80.6±0.72µs        ? ?/sec    1.00     75.2±0.80µs        ? ?/sec
in_list/Utf8/list=100/nulls=20%/str=12                 1.84     55.7±2.42µs        ? ?/sec    1.00     30.2±0.88µs        ? ?/sec
in_list/Utf8/list=100/nulls=20%/str=3                  1.00     52.1±1.97µs        ? ?/sec    1.51     78.8±0.60µs        ? ?/sec
in_list/Utf8/list=28/nulls=0%/str=100                  1.14    138.0±1.90µs        ? ?/sec    1.00    120.7±2.74µs        ? ?/sec
in_list/Utf8/list=28/nulls=0%/str=12                   2.19     70.9±1.90µs        ? ?/sec    1.00     32.3±0.86µs        ? ?/sec
in_list/Utf8/list=28/nulls=0%/str=3                    1.03    100.4±2.36µs        ? ?/sec    1.00     97.4±1.12µs        ? ?/sec
in_list/Utf8/list=28/nulls=20%/str=100                 1.00    117.0±3.51µs        ? ?/sec    1.05    123.4±2.54µs        ? ?/sec
in_list/Utf8/list=28/nulls=20%/str=12                  1.97     63.2±1.37µs        ? ?/sec    1.00     32.0±1.45µs        ? ?/sec
in_list/Utf8/list=28/nulls=20%/str=3                   1.00     49.1±0.97µs        ? ?/sec    1.44     70.9±0.56µs        ? ?/sec
in_list/Utf8/list=3/nulls=0%/str=100                   1.10     72.2±0.46µs        ? ?/sec    1.00     65.4±0.41µs        ? ?/sec
in_list/Utf8/list=3/nulls=0%/str=12                    1.15     34.2±0.19µs        ? ?/sec    1.00     29.8±0.91µs        ? ?/sec
in_list/Utf8/list=3/nulls=0%/str=3                     1.00     35.5±0.09µs        ? ?/sec    2.25     79.9±0.30µs        ? ?/sec
in_list/Utf8/list=3/nulls=20%/str=100                  1.12     76.0±1.10µs        ? ?/sec    1.00     68.1±0.25µs        ? ?/sec
in_list/Utf8/list=3/nulls=20%/str=12                   1.31     37.2±0.17µs        ? ?/sec    1.00     28.3±0.42µs        ? ?/sec
in_list/Utf8/list=3/nulls=20%/str=3                    1.00     39.5±0.23µs        ? ?/sec    1.66     65.6±0.28µs        ? ?/sec
in_list/Utf8/list=8/nulls=0%/str=100                   1.12     73.6±0.56µs        ? ?/sec    1.00     65.8±0.47µs        ? ?/sec
in_list/Utf8/list=8/nulls=0%/str=12                    1.00     35.2±0.35µs        ? ?/sec    1.04     36.5±0.86µs        ? ?/sec
in_list/Utf8/list=8/nulls=0%/str=3                     1.00     36.9±0.13µs        ? ?/sec    2.48     91.2±0.78µs        ? ?/sec
in_list/Utf8/list=8/nulls=20%/str=100                  1.12     77.8±0.53µs        ? ?/sec    1.00     69.2±0.25µs        ? ?/sec
in_list/Utf8/list=8/nulls=20%/str=12                   1.26     37.8±0.23µs        ? ?/sec    1.00     29.9±0.80µs        ? ?/sec
in_list/Utf8/list=8/nulls=20%/str=3                    1.00     40.1±0.25µs        ? ?/sec    1.65     66.4±0.35µs        ? ?/sec
in_list/Utf8/mixed/list=100/match=0%/nulls=0%          1.40     71.6±0.55µs        ? ?/sec    1.00     51.2±0.85µs        ? ?/sec
in_list/Utf8/mixed/list=100/match=0%/nulls=20%         1.05     76.8±2.46µs        ? ?/sec    1.00     73.1±0.84µs        ? ?/sec
in_list/Utf8/mixed/list=100/match=25%/nulls=0%         1.32    142.8±1.99µs        ? ?/sec    1.00    108.3±1.64µs        ? ?/sec
in_list/Utf8/mixed/list=100/match=25%/nulls=20%        1.03    126.3±2.64µs        ? ?/sec    1.00    123.2±2.30µs        ? ?/sec
in_list/Utf8/mixed/list=100/match=75%/nulls=0%         1.11    167.2±5.27µs        ? ?/sec    1.00    150.4±1.52µs        ? ?/sec
in_list/Utf8/mixed/list=100/match=75%/nulls=20%        1.04    153.0±2.88µs        ? ?/sec    1.00    146.4±1.69µs        ? ?/sec
in_list/Utf8/mixed/list=28/match=0%/nulls=0%           1.18    135.0±0.49µs        ? ?/sec    1.00    114.7±1.48µs        ? ?/sec
in_list/Utf8/mixed/list=28/match=0%/nulls=20%          1.06     70.3±2.46µs        ? ?/sec    1.00     66.3±0.62µs        ? ?/sec
in_list/Utf8/mixed/list=28/match=25%/nulls=0%          1.19    160.0±1.03µs        ? ?/sec    1.00    134.7±1.75µs        ? ?/sec
in_list/Utf8/mixed/list=28/match=25%/nulls=20%         1.19    125.5±2.13µs        ? ?/sec    1.00    105.0±2.20µs        ? ?/sec
in_list/Utf8/mixed/list=28/match=75%/nulls=0%          1.15    176.1±3.29µs        ? ?/sec    1.00    152.6±1.08µs        ? ?/sec
in_list/Utf8/mixed/list=28/match=75%/nulls=20%         1.07    162.1±2.90µs        ? ?/sec    1.00    151.3±2.81µs        ? ?/sec
in_list/Utf8/mixed/list=3/match=0%/nulls=0%            1.17     48.0±0.35µs        ? ?/sec    1.00     40.8±0.87µs        ? ?/sec
in_list/Utf8/mixed/list=3/match=0%/nulls=20%           1.65     62.8±0.74µs        ? ?/sec    1.00     38.0±0.63µs        ? ?/sec
in_list/Utf8/mixed/list=3/match=25%/nulls=0%           1.24     88.6±1.30µs        ? ?/sec    1.00     71.7±0.79µs        ? ?/sec
in_list/Utf8/mixed/list=3/match=25%/nulls=20%          1.66     84.5±0.69µs        ? ?/sec    1.00     50.8±2.91µs        ? ?/sec
in_list/Utf8/mixed/list=3/match=75%/nulls=0%           1.11    129.5±1.48µs        ? ?/sec    1.00    116.3±0.81µs        ? ?/sec
in_list/Utf8/mixed/list=3/match=75%/nulls=20%          2.19    132.7±1.23µs        ? ?/sec    1.00     60.5±0.50µs        ? ?/sec
in_list/Utf8/mixed/list=8/match=0%/nulls=0%            1.17     48.8±0.41µs        ? ?/sec    1.00     41.7±0.33µs        ? ?/sec
in_list/Utf8/mixed/list=8/match=0%/nulls=20%           1.75     63.6±0.97µs        ? ?/sec    1.00     36.4±0.96µs        ? ?/sec
in_list/Utf8/mixed/list=8/match=25%/nulls=0%           1.22    102.7±1.17µs        ? ?/sec    1.00     84.5±0.87µs        ? ?/sec
in_list/Utf8/mixed/list=8/match=25%/nulls=20%          1.60     90.7±0.83µs        ? ?/sec    1.00     56.5±3.51µs        ? ?/sec
in_list/Utf8/mixed/list=8/match=75%/nulls=0%           1.14    155.1±2.66µs        ? ?/sec    1.00    135.6±1.41µs        ? ?/sec
in_list/Utf8/mixed/list=8/match=75%/nulls=20%          1.13    123.2±4.19µs        ? ?/sec    1.00    109.4±1.18µs        ? ?/sec
in_list/Utf8View/list=100/nulls=0%/str=100             6.83    115.3±0.72µs        ? ?/sec    1.00     16.9±0.06µs        ? ?/sec
in_list/Utf8View/list=100/nulls=0%/str=12              4.23     40.5±0.70µs        ? ?/sec    1.00      9.6±0.02µs        ? ?/sec
in_list/Utf8View/list=100/nulls=0%/str=3               3.77     52.7±1.91µs        ? ?/sec    1.00     14.0±0.03µs        ? ?/sec
in_list/Utf8View/list=100/nulls=20%/str=100            1.62     96.5±1.63µs        ? ?/sec    1.00     59.5±0.73µs        ? ?/sec
in_list/Utf8View/list=100/nulls=20%/str=12             5.65     52.2±3.03µs        ? ?/sec    1.00      9.2±0.02µs        ? ?/sec
in_list/Utf8View/list=100/nulls=20%/str=3              2.32     34.4±0.66µs        ? ?/sec    1.00     14.8±0.02µs        ? ?/sec
in_list/Utf8View/list=28/nulls=0%/str=100              9.43    126.3±0.99µs        ? ?/sec    1.00     13.4±0.04µs        ? ?/sec
in_list/Utf8View/list=28/nulls=0%/str=12               8.33     78.3±0.42µs        ? ?/sec    1.00      9.4±0.02µs        ? ?/sec
in_list/Utf8View/list=28/nulls=0%/str=3                3.59     39.3±1.21µs        ? ?/sec    1.00     10.9±0.02µs        ? ?/sec
in_list/Utf8View/list=28/nulls=20%/str=100             2.20     99.8±2.69µs        ? ?/sec    1.00     45.3±0.97µs        ? ?/sec
in_list/Utf8View/list=28/nulls=20%/str=12              3.65     38.0±0.77µs        ? ?/sec    1.00     10.4±0.02µs        ? ?/sec
in_list/Utf8View/list=28/nulls=20%/str=3               3.49     37.6±0.82µs        ? ?/sec    1.00     10.8±0.02µs        ? ?/sec
in_list/Utf8View/list=3/nulls=0%/str=100               6.65     82.4±0.34µs        ? ?/sec    1.00     12.4±0.12µs        ? ?/sec
in_list/Utf8View/list=3/nulls=0%/str=12                2.70     25.8±0.06µs        ? ?/sec    1.00      9.6±0.02µs        ? ?/sec
in_list/Utf8View/list=3/nulls=0%/str=3                 2.69     25.7±0.05µs        ? ?/sec    1.00      9.6±0.02µs        ? ?/sec
in_list/Utf8View/list=3/nulls=20%/str=100              2.05     78.4±0.96µs        ? ?/sec    1.00     38.3±1.00µs        ? ?/sec
in_list/Utf8View/list=3/nulls=20%/str=12               3.51     33.9±0.50µs        ? ?/sec    1.00      9.7±0.03µs        ? ?/sec
in_list/Utf8View/list=3/nulls=20%/str=3                3.46     33.4±0.49µs        ? ?/sec    1.00      9.7±0.02µs        ? ?/sec
in_list/Utf8View/list=8/nulls=0%/str=100               6.43     83.1±0.56µs        ? ?/sec    1.00     12.9±0.19µs        ? ?/sec
in_list/Utf8View/list=8/nulls=0%/str=12                2.37     26.8±0.08µs        ? ?/sec    1.00     11.3±0.02µs        ? ?/sec
in_list/Utf8View/list=8/nulls=0%/str=3                 2.43     26.4±0.09µs        ? ?/sec    1.00     10.8±0.03µs        ? ?/sec
in_list/Utf8View/list=8/nulls=20%/str=100              1.57     79.0±0.96µs        ? ?/sec    1.00     50.2±1.15µs        ? ?/sec
in_list/Utf8View/list=8/nulls=20%/str=12               3.43     33.6±0.58µs        ? ?/sec    1.00      9.8±0.02µs        ? ?/sec
in_list/Utf8View/list=8/nulls=20%/str=3                2.26     34.0±0.60µs        ? ?/sec    1.00     15.1±0.15µs        ? ?/sec
in_list/Utf8View/mixed/list=100/match=0%/nulls=0%      6.50     79.9±1.40µs        ? ?/sec    1.00     12.3±0.09µs        ? ?/sec
in_list/Utf8View/mixed/list=100/match=0%/nulls=20%     2.25     89.9±2.26µs        ? ?/sec    1.00     40.1±0.62µs        ? ?/sec
in_list/Utf8View/mixed/list=100/match=25%/nulls=0%     3.40    114.2±1.81µs        ? ?/sec    1.00     33.6±0.34µs        ? ?/sec
in_list/Utf8View/mixed/list=100/match=25%/nulls=20%    1.43    102.2±2.94µs        ? ?/sec    1.00     71.7±0.60µs        ? ?/sec
in_list/Utf8View/mixed/list=100/match=75%/nulls=0%     2.10    134.1±1.86µs        ? ?/sec    1.00     63.7±0.67µs        ? ?/sec
in_list/Utf8View/mixed/list=100/match=75%/nulls=20%    1.47    124.6±3.27µs        ? ?/sec    1.00     84.7±0.71µs        ? ?/sec
in_list/Utf8View/mixed/list=28/match=0%/nulls=0%       8.32    104.5±0.79µs        ? ?/sec    1.00     12.6±0.06µs        ? ?/sec
in_list/Utf8View/mixed/list=28/match=0%/nulls=20%      2.37     95.5±2.89µs        ? ?/sec    1.00     40.2±0.84µs        ? ?/sec
in_list/Utf8View/mixed/list=28/match=25%/nulls=0%      3.90    142.0±1.97µs        ? ?/sec    1.00     36.5±0.60µs        ? ?/sec
in_list/Utf8View/mixed/list=28/match=25%/nulls=20%     1.48    104.6±1.75µs        ? ?/sec    1.00     70.8±0.48µs        ? ?/sec
in_list/Utf8View/mixed/list=28/match=75%/nulls=0%      1.68    138.5±2.63µs        ? ?/sec    1.00     82.2±1.41µs        ? ?/sec
in_list/Utf8View/mixed/list=28/match=75%/nulls=20%     1.36    126.5±3.41µs        ? ?/sec    1.00     93.2±0.95µs        ? ?/sec
in_list/Utf8View/mixed/list=3/match=0%/nulls=0%        3.26     37.4±0.18µs        ? ?/sec    1.00     11.5±0.06µs        ? ?/sec
in_list/Utf8View/mixed/list=3/match=0%/nulls=20%       1.09     39.8±0.40µs        ? ?/sec    1.00     36.5±0.51µs        ? ?/sec
in_list/Utf8View/mixed/list=3/match=25%/nulls=0%       2.02     77.8±1.46µs        ? ?/sec    1.00     38.5±1.26µs        ? ?/sec
in_list/Utf8View/mixed/list=3/match=25%/nulls=20%      1.06     74.4±0.89µs        ? ?/sec    1.00     70.4±0.71µs        ? ?/sec
in_list/Utf8View/mixed/list=3/match=75%/nulls=0%       2.04     99.1±1.55µs        ? ?/sec    1.00     48.6±0.58µs        ? ?/sec
in_list/Utf8View/mixed/list=3/match=75%/nulls=20%      1.22    114.6±2.61µs        ? ?/sec    1.00     93.6±0.72µs        ? ?/sec
in_list/Utf8View/mixed/list=8/match=0%/nulls=0%        3.17     38.2±0.23µs        ? ?/sec    1.00     12.1±0.11µs        ? ?/sec
in_list/Utf8View/mixed/list=8/match=0%/nulls=20%       1.00     40.1±0.80µs        ? ?/sec    1.00     40.1±0.47µs        ? ?/sec
in_list/Utf8View/mixed/list=8/match=25%/nulls=0%       2.68     76.6±1.04µs        ? ?/sec    1.00     28.6±0.44µs        ? ?/sec
in_list/Utf8View/mixed/list=8/match=25%/nulls=20%      1.01     67.2±0.84µs        ? ?/sec    1.00     66.4±0.38µs        ? ?/sec
in_list/Utf8View/mixed/list=8/match=75%/nulls=0%       2.39     93.0±1.93µs        ? ?/sec    1.00     38.9±0.50µs        ? ?/sec
in_list/Utf8View/mixed/list=8/match=75%/nulls=20%      1.28    104.5±3.16µs        ? ?/sec    1.00     81.9±0.85µs        ? ?/sec
in_list_cols/Int32/list=28/match=0%/nulls=0%           1.00     50.9±0.57µs        ? ?/sec    1.00     51.1±0.48µs        ? ?/sec
in_list_cols/Int32/list=28/match=0%/nulls=20%          1.00     67.0±0.31µs        ? ?/sec    1.00     67.0±0.26µs        ? ?/sec
in_list_cols/Int32/list=28/match=100%/nulls=0%         1.01   1835.1±5.65ns        ? ?/sec    1.00   1821.5±8.79ns        ? ?/sec
in_list_cols/Int32/list=28/match=100%/nulls=20%        1.00     66.4±0.27µs        ? ?/sec    1.01     67.4±0.46µs        ? ?/sec
in_list_cols/Int32/list=28/match=50%/nulls=0%          1.00     27.2±0.11µs        ? ?/sec    1.00     27.2±0.07µs        ? ?/sec
in_list_cols/Int32/list=28/match=50%/nulls=20%         1.00     66.8±0.25µs        ? ?/sec    1.01     67.3±0.28µs        ? ?/sec
in_list_cols/Int32/list=3/match=0%/nulls=0%            1.00      5.5±0.01µs        ? ?/sec    1.00      5.5±0.01µs        ? ?/sec
in_list_cols/Int32/list=3/match=0%/nulls=20%           1.00      6.7±0.02µs        ? ?/sec    1.01      6.8±0.02µs        ? ?/sec
in_list_cols/Int32/list=3/match=100%/nulls=0%          1.01   1843.5±5.92ns        ? ?/sec    1.00   1821.4±5.78ns        ? ?/sec
in_list_cols/Int32/list=3/match=100%/nulls=20%         1.00      6.7±0.03µs        ? ?/sec    1.01      6.8±0.03µs        ? ?/sec
in_list_cols/Int32/list=3/match=50%/nulls=0%           1.00      5.5±0.01µs        ? ?/sec    1.00      5.5±0.02µs        ? ?/sec
in_list_cols/Int32/list=3/match=50%/nulls=20%          1.00      6.7±0.02µs        ? ?/sec    1.00      6.7±0.02µs        ? ?/sec
in_list_cols/Int32/list=8/match=0%/nulls=0%            1.00     14.5±0.04µs        ? ?/sec    1.00     14.5±0.03µs        ? ?/sec
in_list_cols/Int32/list=8/match=0%/nulls=20%           1.00     18.7±0.05µs        ? ?/sec    1.00     18.8±0.06µs        ? ?/sec
in_list_cols/Int32/list=8/match=100%/nulls=0%          1.01   1839.9±5.03ns        ? ?/sec    1.00   1820.5±4.91ns        ? ?/sec
in_list_cols/Int32/list=8/match=100%/nulls=20%         1.00     18.6±0.05µs        ? ?/sec    1.00     18.5±0.05µs        ? ?/sec
in_list_cols/Int32/list=8/match=50%/nulls=0%           1.00     14.6±0.05µs        ? ?/sec    1.00     14.6±0.04µs        ? ?/sec
in_list_cols/Int32/list=8/match=50%/nulls=20%          1.00     18.6±0.07µs        ? ?/sec    1.01     18.8±0.05µs        ? ?/sec
in_list_cols/Utf8/list=28/match=0%                     1.00    178.4±0.42µs        ? ?/sec    1.00    178.9±0.54µs        ? ?/sec
in_list_cols/Utf8/list=28/match=100%                   1.02    735.1±9.41µs        ? ?/sec    1.00    722.0±2.91µs        ? ?/sec
in_list_cols/Utf8/list=28/match=50%                    1.00   1141.8±2.17µs        ? ?/sec    1.01   1152.2±2.90µs        ? ?/sec
in_list_cols/Utf8/list=3/match=0%                      1.00     18.7±0.09µs        ? ?/sec    1.00     18.7±0.04µs        ? ?/sec
in_list_cols/Utf8/list=3/match=100%                    1.00     76.5±0.33µs        ? ?/sec    1.01     77.2±0.28µs        ? ?/sec
in_list_cols/Utf8/list=3/match=50%                     1.00     97.9±0.90µs        ? ?/sec    1.00     98.1±1.09µs        ? ?/sec
in_list_cols/Utf8/list=8/match=0%                      1.00     50.4±0.09µs        ? ?/sec    1.01     50.7±0.08µs        ? ?/sec
in_list_cols/Utf8/list=8/match=100%                    1.02    209.3±1.68µs        ? ?/sec    1.00    205.0±1.13µs        ? ?/sec
in_list_cols/Utf8/list=8/match=50%                     1.00    314.2±1.74µs        ? ?/sec    1.02    321.8±0.60µs        ? ?/sec

Resource Usage

in_list — base (merge-base)

Metric         Value
------         -----
Wall time      270.1s
Peak memory    46.5 MiB
Avg memory     22.9 MiB
CPU user       581.0s
CPU sys        1.5s
Peak spill     0 B

in_list — branch

Metric         Value
------         -----
Wall time      270.1s
Peak memory    43.7 MiB
Avg memory     21.0 MiB
CPU user       581.4s
CPU sys        1.6s
Peak spill     0 B


@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_fixed_size_binary_filter branch from 098e0a6 to ce699c4 Compare June 18, 2026 20:55
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_fixed_size_binary_filter branch 2 times, most recently from 867992e to 2ac665f Compare June 19, 2026 05:35
Replaces HashSet<u8> with a 32-byte stack-allocated bitmap. Provides O(1) membership testing via bit-shifting, significantly reducing memory overhead and improving cache locality. Triggers for UInt8 arrays.
Implements an 8 KB heap-allocated bitmap for UInt16. Maintains O(1) performance while handling the larger value space. Triggers for UInt16 arrays.
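The two bitmap commits above can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: the type names `U8Bitmap` and `U16Bitmap` are hypothetical, and only the sizes (32 bytes inline, 8 KiB on the heap) are taken from the commit messages.

```rust
/// 256 bits, one per possible `u8` value: 4 x 8 bytes = 32 bytes, held inline.
/// (Hypothetical name, for illustration only.)
pub struct U8Bitmap([u64; 4]);

/// 65,536 bits, one per possible `u16` value: 1024 x 8 bytes = 8 KiB, boxed.
/// (Hypothetical name, for illustration only.)
pub struct U16Bitmap(Box<[u64; 1024]>);

impl U8Bitmap {
    /// Sets one bit per IN-list value.
    pub fn new(values: &[u8]) -> Self {
        let mut words = [0u64; 4];
        for &v in values {
            // High bits pick the word, low 6 bits pick the bit inside it.
            words[usize::from(v >> 6)] |= 1u64 << (v & 63);
        }
        Self(words)
    }

    /// O(1) membership test: one load, one shift, one mask.
    #[inline]
    pub fn contains(&self, v: u8) -> bool {
        (self.0[usize::from(v >> 6)] >> (v & 63)) & 1 == 1
    }
}

impl U16Bitmap {
    /// Same scheme as `U8Bitmap`, over a 16-bit value space.
    pub fn new(values: &[u16]) -> Self {
        let mut words = Box::new([0u64; 1024]);
        for &v in values {
            words[usize::from(v >> 6)] |= 1u64 << (v & 63);
        }
        Self(words)
    }

    #[inline]
    pub fn contains(&self, v: u16) -> bool {
        (self.0[usize::from(v >> 6)] >> (v & 63)) & 1 == 1
    }
}
```

Lookup cost does not depend on the list length in this scheme, which is consistent with the near-constant UInt8 and Int16 timings in the benchmark table.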
Introduces zero-copy buffer reinterpretation to allow signed integers and other 1 or 2-byte primitive types (e.g. Float16) to use the high-performance bitmap filters. Triggers for all types with 1-byte or 2-byte width.
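A minimal sketch of what zero-copy reinterpretation means here, using plain slices rather than Arrow buffers. The function name `as_u16` is hypothetical, and the PR's actual mechanism for Arrow arrays is not shown on this page.

```rust
/// Views a slice of `i16` as `u16` without copying any data.
/// (Hypothetical helper, standing in for the Arrow buffer reinterpretation.)
pub fn as_u16(values: &[i16]) -> &[u16] {
    // SAFETY: `i16` and `u16` have the same size and alignment, and every bit
    // pattern is valid for both, so the same memory can be read as either.
    unsafe { std::slice::from_raw_parts(values.as_ptr().cast::<u16>(), values.len()) }
}
```

Membership only needs bit-pattern equality, so a filter built over the reinterpreted list values gives the same answers as one built over the original signed values: `-1i16` and `65535u16` are the same 16 bits.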
Adds a const-generic unrolled comparison chain that avoids CPU branching. Outperforms hash lookups for very small lists. Triggers for primitives when list size <= 32 (4-byte), 16 (8-byte), or 4 (16-byte).
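The idea of a const-generic comparison chain can be sketched as below. This is an assumption-laden illustration (`BranchlessFilter` here is a hypothetical standalone type over `u32`), showing only the technique named in the commit message.

```rust
/// Membership test over a list whose length `N` is known at compile time.
/// (Hypothetical type, for illustration only.)
pub struct BranchlessFilter<const N: usize> {
    needles: [u32; N],
}

impl<const N: usize> BranchlessFilter<N> {
    pub fn new(needles: [u32; N]) -> Self {
        Self { needles }
    }

    /// Compares `v` against every needle and ORs the results together.
    #[inline]
    pub fn contains(&self, v: u32) -> bool {
        let mut hit = false;
        for i in 0..N {
            // `|=` instead of `||`: no short-circuit, so no data-dependent
            // early exit. With `N` constant the compiler is free to unroll.
            hit |= self.needles[i] == v;
        }
        hit
    }
}
```

The work grows linearly with `N`, which is why this approach only pays off below a size threshold; the Int32 rows in the benchmark table (list=3 faster than list=8, which is faster than list=28) show that trend.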
Implements a fast hash table using open addressing with linear probing and a 25% load factor. Replaces the legacy HashSet for primitives, reducing indirection. Triggers for primitives when list size exceeds branchless thresholds.
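Open addressing with linear probing at a 25% load factor can be sketched as follows. The type name `DirectProbeSet`, the `Option<u64>` slot representation, and the multiplicative hash are all assumptions made for this illustration; the PR's real layout and hash function are not shown on this page.

```rust
/// Flat hash set: one array of slots, collisions resolved by stepping forward.
/// (Hypothetical type, for illustration only.)
pub struct DirectProbeSet {
    slots: Vec<Option<u64>>,
    mask: usize,
}

impl DirectProbeSet {
    pub fn new(values: &[u64]) -> Self {
        // At least 4 slots per value keeps the load factor at or below 25%,
        // so probe sequences stay short and always reach an empty slot.
        let capacity = (values.len().max(1) * 4).next_power_of_two();
        let mut set = Self { slots: vec![None; capacity], mask: capacity - 1 };
        for &v in values {
            set.insert(v);
        }
        set
    }

    pub fn capacity(&self) -> usize {
        self.slots.len()
    }

    /// Assumed hash: multiply by a large odd constant, keep the high bits.
    fn slot(&self, v: u64) -> usize {
        ((v.wrapping_mul(0x9E37_79B9_7F4A_7C15) >> 32) as usize) & self.mask
    }

    fn insert(&mut self, v: u64) {
        let mut i = self.slot(v);
        loop {
            let current = self.slots[i];
            match current {
                Some(existing) if existing == v => return, // duplicate
                None => {
                    self.slots[i] = Some(v);
                    return;
                }
                Some(_) => i = (i + 1) & self.mask, // linear probe
            }
        }
    }

    pub fn contains(&self, v: u64) -> bool {
        let mut i = self.slot(v);
        loop {
            let current = self.slots[i];
            match current {
                Some(existing) if existing == v => return true,
                None => return false, // empty slot ends the probe sequence
                Some(_) => i = (i + 1) & self.mask,
            }
        }
    }
}
```

Compared with a general-purpose `HashSet`, every probe here touches a single contiguous array, which is the reduced indirection the commit message refers to.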
Introduces a two-stage filter for ByteView types. Stage 1 uses a fast DirectProbeFilter on masked views (len + prefix) for quick rejection; Stage 2 performs full verification only for potential long-string matches. Triggers for Utf8View and BinaryView.
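The two-stage idea can be sketched with plain byte slices. This is a simplification under stated assumptions: the key below packs the length and the first four bytes, a `HashMap` stands in for the DirectProbeFilter, and stage 2 runs for every stage-1 hit, whereas the commit message says the real filter only needs full verification for long strings. `TwoStageFilter` and `key` are hypothetical names.

```rust
use std::collections::HashMap;

/// Stage-1 key: length in the high 32 bits, first up-to-4 bytes in the low 32.
/// (Assumed encoding, loosely modelled on the "len + prefix" of a view.)
fn key(s: &[u8]) -> u64 {
    let mut prefix = [0u8; 4];
    let n = s.len().min(4);
    prefix[..n].copy_from_slice(&s[..n]);
    ((s.len() as u64) << 32) | u64::from(u32::from_le_bytes(prefix))
}

/// Hypothetical two-stage membership filter over byte strings.
pub struct TwoStageFilter {
    by_key: HashMap<u64, Vec<Vec<u8>>>,
}

impl TwoStageFilter {
    pub fn new(needles: &[&[u8]]) -> Self {
        let mut by_key: HashMap<u64, Vec<Vec<u8>>> = HashMap::new();
        for n in needles {
            by_key.entry(key(n)).or_default().push(n.to_vec());
        }
        Self { by_key }
    }

    pub fn contains(&self, s: &[u8]) -> bool {
        match self.by_key.get(&key(s)) {
            // Stage 1: no needle shares this length and prefix, reject cheaply.
            None => false,
            // Stage 2: compare full contents against the few candidates.
            Some(candidates) => candidates.iter().any(|c| c.as_slice() == s),
        }
    }
}
```

Most non-matching values differ in length or prefix, so they never reach the full comparison.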
FixedSizeBinary(N) arrays share the same contiguous buffer layout as primitive arrays, so for power-of-2 widths (1, 2, 4, 8, 16) we can zero-copy reinterpret them and use the optimized primitive filters (bitmap, branchless, hash) instead of falling through to the NestedTypeFilter fallback.
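The routing described in this PR can be sketched as below. The width cases and list-size thresholds come from the PR description and the commit messages above; the names `Strategy`, `choose_strategy`, `fsb4_keys`, and `in_list_fsb4` are hypothetical, and a `HashSet` stands in for the real primitive filters. Unlike the zero-copy reinterpretation in the PR, this sketch decodes each 4-byte value into a `u32` so that it stays in safe Rust.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Strategy {
    Bitmap,
    Branchless,
    DirectProbe,
    Generic,
}

/// Picks a lookup strategy from the value width in bytes and the list length.
pub fn choose_strategy(width: usize, list_len: usize) -> Strategy {
    match width {
        1 | 2 => Strategy::Bitmap,
        4 if list_len <= 32 => Strategy::Branchless,
        8 if list_len <= 16 => Strategy::Branchless,
        16 if list_len <= 4 => Strategy::Branchless,
        4 | 8 | 16 => Strategy::DirectProbe,
        // Any other FixedSizeBinary width stays on the generic fallback.
        _ => Strategy::Generic,
    }
}

/// Reads a contiguous FixedSizeBinary(4) value buffer as `u32` lookup keys.
/// The keys are only used for equality, so byte order does not matter as
/// long as the list and the probed values are decoded the same way.
pub fn fsb4_keys(bytes: &[u8]) -> impl Iterator<Item = u32> + '_ {
    bytes
        .chunks_exact(4)
        .map(|c| u32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
}

/// Evaluates `value IN (list)` for every 4-byte value in `haystack`.
pub fn in_list_fsb4(haystack: &[u8], list: &[u8]) -> Vec<bool> {
    let set: HashSet<u32> = fsb4_keys(list).collect();
    fsb4_keys(haystack).map(|k| set.contains(&k)).collect()
}
```

For example, `in_list_fsb4(b"abcdwxyzabcd", b"abcd1234")` treats the haystack as three 4-byte values and the list as two.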
@geoffreyclaude geoffreyclaude force-pushed the perf/in_list_fixed_size_binary_filter branch from 2ac665f to b62beb4 Compare June 19, 2026 05:55
Labels

physical-expr Changes to the physical-expr crates
