-
Notifications
You must be signed in to change notification settings - Fork 1.6k
refactor character_length
impl by unifying null handling logic
#16877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @waynexia -- I think we should try and figure out why some of the bechmarks got slower (maybe it is because checking for nulls is faster than computing zero lengths strs 🤔 )
} else { | ||
let values: Vec<_> = (0..array.len()) | ||
.map(|i| { | ||
let value = array.value(i); | ||
// Safety: we are iterating with array.len() so the index is always valid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another idea here is if you know the values are always ascii, you can avoid making a &str
at all -- and instead simply compute the character lengths based on the offsets array (for StringArray and LargeStringArray) or the views for `StringViewArray)
@@ -136,56 +136,31 @@ where | |||
// string is ASCII only is relatively cheap. | |||
// If strings are ASCII only, count bytes instead. | |||
let is_array_ascii_only = array.is_ascii(); | |||
let array = if array.null_count() == 0 { | |||
let nulls = array.nulls().cloned(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the idea to remove the no-nulls optimization because it doesn't make things faster?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think avoiding the null check (array.is_null(i)
) makes things faster and it seems we don't need 0
as we always cloned array.nulls().cloned()
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
I do find revert the null checking logic on non-ascii branch will eliminate regression on those two cases, but it would slow down other cases in the same time. Still not sure why the get value operation is slower only for 32-length UTF-8... Maybe we can make it faster by branching using offsets (like check if start equals to end), but However I find checking if the value is empty (zero length) in advance will make it a bit faster, the regression changes from 10% to 5% while keeping other cases unchanged:
|
Which issue does this PR close?
character_length
function #13696, but I came up with this change when reading Optimize performance ofcharacter_length
function #13696,Rationale for this change
Simplify the implementation by unifying the non-null and nullable branches.
T::default_value()
is just 0, which is the same as get.len()
from an empty slice returned bystring_array.value()
. And in both way, the NULL in result array is marked by null buffer, not.push_null()
.And I also find a performance improvement by doing it this way (and I verified this is not from
value_unchecked
):But there are two outliers
character_length_StringArray_utf8_str_len_32
andcharacter_length_StringViewArray_utf8_str_len_32
that become 10% slower, for some reason I don't know 😞.What changes are included in this PR?
simplify
character_length
's implementationAre these changes tested?
with existing test case
Are there any user-facing changes?