Summary
IN LIST evaluates expressions like:
The list on the right is constant, so DataFusion can build a lookup structure once and reuse it for every input row. This matters for dynamic filter pushdown, where these checks can run millions of times during large scans.
The goal of this stack is to pick a lookup strategy that matches the data type and list size, instead of sending every common case through the same generic comparator-based path.
The SQL behavior should stay the same, including IN, NOT IN, dictionaries, slices, and null handling. These are internal performance optimizations.
Current Stack
Review order follows this list:
How The Optimizations Work
The stack is a sequence of small lookup improvements. Each PR keeps the same result semantics, but makes a common membership check cheaper.
Bitmap Lookup
For small integer domains, the IN list can be represented as a bitmap.
For UInt8, there are only 256 possible values. If the list contains 42, set bit 42. To check an input row with value 42, read bit 42.
That turns membership into one indexed bit test.
Reusing The Same Bytes
Some Arrow types have different meanings but the same physical width. For example, Int8 and UInt8 are both one byte wide.
The bitmap only cares about the exact bits, so compatible values can reuse the same lookup storage without copying the array. This is the “zero-copy reinterpretation” part of the stack.
Tiny Lists
For very small primitive lists, comparing directly can be cheaper than hashing.
For example:
can be checked like:
x == 10 OR x == 20 OR x == 30
That is the branchless small-list path.
Larger Primitive Lists
Once primitive lists get larger, direct comparison is no longer the best fit. The stack then switches to a compact purpose-built lookup table for fixed-width primitive values.
This is still hash-table-style lookup, but it is simpler than the generic fallback because primitive values can be stored and probed directly.
Strings And Binary Views
Strings and binary values can be expensive to compare byte-for-byte. The string/view PRs first ask a cheaper question:
Could this value possibly match anything in the IN list?
Length and prefix checks can reject many non-matches quickly. If a value still looks like a possible match, long values are verified with exact byte comparison before returning true.
Fixed-Size Binary
FixedSizeBinary(1), (2), (4), (8), and (16) have fixed-width byte layouts that match the primitive-width optimizations above.
Those values can reuse the bitmap, branchless, and direct-probe paths internally. Other fixed-size binary widths stay on the generic fallback.
Expected Impact
Reference benchmarks from #19390 show the largest wins in areas where the old path did much more work than necessary:
| Area |
Why it can be faster |
UInt8 / UInt16 |
one bitmap bit test per input value |
| Small primitive lists |
direct comparisons avoid hash-table overhead |
| Larger primitive lists |
compact fixed-width lookup avoids the generic comparator path |
| Date, timestamp, float, and signed integer paths |
same-width values can reuse primitive fast paths |
Utf8View / BinaryView |
length and prefix checks reject many non-matches cheaply |
FixedSizeBinary supported widths |
fixed-width byte values can reuse primitive lookup machinery |
The detailed benchmark numbers live in #19390 and have been refreshed in the dedicated PRs as review changes the stack. #23017 was closed because its local benchmark snapshot did not justify the added regular Utf8 / LargeUtf8 path.
Implementation Plan
Additional Context
Summary
IN LISTevaluates expressions like:The list on the right is constant, so DataFusion can build a lookup structure once and reuse it for every input row. This matters for dynamic filter pushdown, where these checks can run millions of times during large scans.
The goal of this stack is to pick a lookup strategy that matches the data type and list size, instead of sending every common case through the same generic comparator-based path.
The SQL behavior should stay the same, including
IN,NOT IN, dictionaries, slices, and null handling. These are internal performance optimizations.Current Stack
Review order follows this list:
UInt8UInt16UInt8andUInt16bitmap filter implementationsUtf8ViewandBinaryViewFixedSizeBinarywidths reuse the primitive fast paths (now stacked directly on IN LIST: add string-view filters for Utf8View and BinaryView #23016)How The Optimizations Work
The stack is a sequence of small lookup improvements. Each PR keeps the same result semantics, but makes a common membership check cheaper.
Bitmap Lookup
For small integer domains, the
INlist can be represented as a bitmap.For
UInt8, there are only 256 possible values. If the list contains42, set bit42. To check an input row with value42, read bit42.That turns membership into one indexed bit test.
Reusing The Same Bytes
Some Arrow types have different meanings but the same physical width. For example,
Int8andUInt8are both one byte wide.The bitmap only cares about the exact bits, so compatible values can reuse the same lookup storage without copying the array. This is the “zero-copy reinterpretation” part of the stack.
Tiny Lists
For very small primitive lists, comparing directly can be cheaper than hashing.
For example:
can be checked like:
That is the branchless small-list path.
Larger Primitive Lists
Once primitive lists get larger, direct comparison is no longer the best fit. The stack then switches to a compact purpose-built lookup table for fixed-width primitive values.
This is still hash-table-style lookup, but it is simpler than the generic fallback because primitive values can be stored and probed directly.
Strings And Binary Views
Strings and binary values can be expensive to compare byte-for-byte. The string/view PRs first ask a cheaper question:
Length and prefix checks can reject many non-matches quickly. If a value still looks like a possible match, long values are verified with exact byte comparison before returning true.
Fixed-Size Binary
FixedSizeBinary(1),(2),(4),(8), and(16)have fixed-width byte layouts that match the primitive-width optimizations above.Those values can reuse the bitmap, branchless, and direct-probe paths internally. Other fixed-size binary widths stay on the generic fallback.
Expected Impact
Reference benchmarks from #19390 show the largest wins in areas where the old path did much more work than necessary:
UInt8/UInt16Utf8View/BinaryViewFixedSizeBinarysupported widthsThe detailed benchmark numbers live in #19390 and have been refreshed in the dedicated PRs as review changes the stack. #23017 was closed because its local benchmark snapshot did not justify the added regular
Utf8/LargeUtf8path.Implementation Plan
Utf8View/BinaryViewFixedSizeBinaryreinterpretation for supported widthsUtf8/LargeUtf8Additional Context
StaticFiltersfor different data types #18824INperformance with specialized implementations #19390