The idea is that now that we have some very optimized string builder APIs that generalize to the three different string types, we can reuse them in multiple kernels
Related: `append_with` to string builders, use in `replace` #22029
At the moment the code is all in the `datafusion-functions` crate: https://github.com/apache/datafusion/blob/7708aa2dc61271423a5c334bd2e2025b5e275133/datafusion/functions/src/strings.rs
However, that means the builders can't be used from other crates. I suggest we put the string code in `physical-expr-common`: https://github.com/apache/datafusion/blob/0dfcd97a37e083e48aefc5267539ac453cc07b44/datafusion/physical-expr-common
This is consistent with things like String/BinaryMap:
https://github.com/apache/datafusion/blob/0dfcd97a37e083e48aefc5267539ac453cc07b44/datafusion/physical-expr-common/src/binary_map.rs#L40-L39
This might make the builders easier to reuse across crates.
As @neilconway says:
Other places where these APIs should be useful:
- `initcap`
- `lower`, `upper`: at least for the Unicode code path; for ASCII, we might not beat the hand-optimized code added in "perf: Optimize `lower`, `upper` for ASCII inputs" #21980
- `translate`
- `reverse` (might need a slightly different API)
- `to_char` (might need a small API extension)
- `lpad`, `rpad` (needs a closer look)

If we make the builders accessible outside the current crate, some of the Spark functions could use these APIs, as well as `||` for `Utf8View` values.

Originally posted by @neilconway in #22029 (comment)
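
To make the reuse argument concrete, here is a rough, std-only sketch of how several kernels could share one builder interface. Everything in it is hypothetical: `StringSink`, `ToyBuilder`, and the `append_with` signature shown here are stand-ins invented for illustration, not the actual builders in `strings.rs` and not an Arrow API. The real builders would target the three Arrow string types rather than a `Vec<Option<String>>`.

```rust
/// Hypothetical shared interface: a kernel only needs to know how to
/// write one value (or a null) into *some* builder.
trait StringSink {
    /// Append one value by letting the caller write directly into the
    /// builder's buffer, avoiding an intermediate `String` per row.
    fn append_with<F: FnOnce(&mut String)>(&mut self, write: F);
    fn append_null(&mut self);
}

/// Toy builder (an assumption for this sketch): one contiguous data
/// buffer plus the end offset and validity of each value.
#[derive(Default)]
struct ToyBuilder {
    data: String,
    offsets: Vec<usize>,
    valid: Vec<bool>,
}

impl StringSink for ToyBuilder {
    fn append_with<F: FnOnce(&mut String)>(&mut self, write: F) {
        write(&mut self.data);
        self.offsets.push(self.data.len());
        self.valid.push(true);
    }

    fn append_null(&mut self) {
        self.offsets.push(self.data.len());
        self.valid.push(false);
    }
}

impl ToyBuilder {
    /// Materialize the values so the result is easy to inspect.
    fn finish(self) -> Vec<Option<String>> {
        let mut out = Vec::with_capacity(self.offsets.len());
        let mut start = 0;
        for (&end, &ok) in self.offsets.iter().zip(&self.valid) {
            out.push(if ok {
                Some(self.data[start..end].to_string())
            } else {
                None
            });
            start = end;
        }
        out
    }
}

/// `reverse`, written once against the trait. Reverses by `char`,
/// which is not grapheme-aware.
fn reverse_kernel<B: StringSink>(input: &[Option<&str>], out: &mut B) {
    for v in input {
        match v {
            Some(s) => out.append_with(|buf| buf.extend(s.chars().rev())),
            None => out.append_null(),
        }
    }
}

/// `upper` on the Unicode path, reusing the same builder interface.
fn upper_kernel<B: StringSink>(input: &[Option<&str>], out: &mut B) {
    for v in input {
        match v {
            Some(s) => out.append_with(|buf| {
                buf.extend(s.chars().flat_map(char::to_uppercase))
            }),
            None => out.append_null(),
        }
    }
}
```

The point of the sketch is only the shape: if the builder interface lives in a crate that both `datafusion-functions` and other crates depend on, kernels like these can be written once and work with any builder that implements it.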