-
Notifications
You must be signed in to change notification settings - Fork 4k
GH-48866: [C++][Gandiva] Truncate subseconds beyond milliseconds in castTIMESTAMP_utf8 and castTIME_utf8
#48867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
b377b59
6460d6f
a707b48
318ce7f
d233f19
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -122,15 +122,26 @@ TEST(TestTime, TestCastTimestamp) { | |
| "Not a valid time for timestamp value 2000-01-01 00:00:100"); | ||
| context.Reset(); | ||
|
|
||
| EXPECT_EQ(castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.0001", 24), 0); | ||
| EXPECT_EQ(context.get_error(), | ||
| "Invalid millis for timestamp value 2000-01-01 00:00:00.0001"); | ||
| context.Reset(); | ||
|
|
||
| EXPECT_EQ(castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.1000", 24), 0); | ||
| EXPECT_EQ(context.get_error(), | ||
| "Invalid millis for timestamp value 2000-01-01 00:00:00.1000"); | ||
| context.Reset(); | ||
| // Test truncation of subseconds to 3 digits (milliseconds) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you add some test cases with less than three digits?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We already have such test cases, see here: https://github.com/akravchukdremio/arrow/blob/GH-48866/cpp/src/gandiva/precompiled/time_test.cc#L100-L104 and https://github.com/akravchukdremio/arrow/blob/GH-48866/cpp/src/gandiva/precompiled/time_test.cc#L154-L158. Seems like enough, but if you think we need more - I can add them |
||
| // "2000-01-01 00:00:00.0001" should truncate to "2000-01-01 00:00:00.000" | ||
| EXPECT_EQ(castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.0001", 24), | ||
| castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.000", 23)); | ||
|
|
||
| // "2000-01-01 00:00:00.1000" should truncate to "2000-01-01 00:00:00.100" | ||
| EXPECT_EQ(castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.1000", 24), | ||
| castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.100", 23)); | ||
|
|
||
| // "2000-01-01 00:00:00.123456789" should truncate to "2000-01-01 00:00:00.123" | ||
| EXPECT_EQ(castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.123456789", 29), | ||
| castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.123", 23)); | ||
|
|
||
| // "2000-01-01 00:00:00.1999" should truncate to "2000-01-01 00:00:00.199" | ||
| EXPECT_EQ(castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.1999", 24), | ||
| castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.199", 23)); | ||
|
|
||
| // "2000-01-01 00:00:00.1994" should truncate to "2000-01-01 00:00:00.199" | ||
| EXPECT_EQ(castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.1994", 24), | ||
| castTIMESTAMP_utf8(context_ptr, "2000-01-01 00:00:00.199", 23)); | ||
| } | ||
|
|
||
| TEST(TestTime, TestCastTimeUtf8) { | ||
|
|
@@ -166,13 +177,26 @@ TEST(TestTime, TestCastTimeUtf8) { | |
| EXPECT_EQ(context.get_error(), "Not a valid time value 00:00:100"); | ||
| context.Reset(); | ||
|
|
||
| EXPECT_EQ(castTIME_utf8(context_ptr, "00:00:00.0001", 13), 0); | ||
| EXPECT_EQ(context.get_error(), "Invalid millis for time value 00:00:00.0001"); | ||
| context.Reset(); | ||
| // Test truncation of subseconds to 3 digits (milliseconds) | ||
| // "00:00:00.0001" should truncate to "00:00:00.000" | ||
| EXPECT_EQ(castTIME_utf8(context_ptr, "00:00:00.0001", 13), | ||
| castTIME_utf8(context_ptr, "00:00:00.000", 12)); | ||
|
|
||
| EXPECT_EQ(castTIME_utf8(context_ptr, "00:00:00.1000", 13), 0); | ||
| EXPECT_EQ(context.get_error(), "Invalid millis for time value 00:00:00.1000"); | ||
| context.Reset(); | ||
| // "00:00:00.1000" should truncate to "00:00:00.100" | ||
| EXPECT_EQ(castTIME_utf8(context_ptr, "00:00:00.1000", 13), | ||
| castTIME_utf8(context_ptr, "00:00:00.100", 12)); | ||
|
|
||
| // "9:45:30.123456789" should truncate to "9:45:30.123" | ||
| EXPECT_EQ(castTIME_utf8(context_ptr, "9:45:30.123456789", 17), | ||
| castTIME_utf8(context_ptr, "9:45:30.123", 11)); | ||
|
|
||
| // "00:00:00.1999" should truncate to "00:00:00.199" | ||
| EXPECT_EQ(castTIME_utf8(context_ptr, "00:00:00.1999", 13), | ||
| castTIME_utf8(context_ptr, "00:00:00.199", 12)); | ||
|
|
||
| // "00:00:00.1994" should truncate to "00:00:00.199" | ||
| EXPECT_EQ(castTIME_utf8(context_ptr, "00:00:00.1994", 13), | ||
| castTIME_utf8(context_ptr, "00:00:00.199", 12)); | ||
| } | ||
|
|
||
| #ifndef _WIN32 | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might use the regular 'inline'. I don't see ARROW_FORCE_INLINE really being used anywhere and I don't feel like I know enough that this function should be forced inline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that 'FORCE_INLINE' is used a lot, so maybe that is ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right,
ARROW_FORCE_INLINEis used only in this file: https://github.com/akravchukdremio/arrow/blob/GH-48866/cpp/src/arrow/compute/kernels/gather_internal.h#L55. Initially I've usedFORCE_INLINE, which is used in many places as far as I can see (around 10 files in gandiva), but then I've changed it toARROW_FORCE_INLINEto address this comment from committer: #48867 (comment).This is just a helper function, initially I've done my changes directly in
castTIMESTAMP_utf8andcastTIME_utf8functions: b377b59, but then there was a comment to factor out a common code: #48867 (comment) - so I decided to create small function for that.Why I've used
FORCE_INLINEinitially? Because there was already a function in this file with such macros, see here: https://github.com/akravchukdremio/arrow/blob/GH-48866/cpp/src/gandiva/precompiled/time.cc#L564-L568, and I've decided to use it too to make our code execution faster (gandiva is aiming for it, AFAIK), since invoking function is an action to put new variables in the call stack. But I agree, that probably justinlinekeyword would be enough, function is not too complex, so compiler seems anyway will inline it by just havinginlinekeyword.@kou what do you think, is it fine to just remove
ARROW_FORCE_INLINEand use onlyinlinekeyword instead?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was initially there, then I've changed it to
ARROW_FORCE_INLINEto address another review comment: #48867 (comment)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't need force inline,
inlineis OK.For
FORCE_INLINEandARROW_FORCE_INLINE: I want to removeFORCE_INLINEandGDV_FORCE_INLINEand useARROW_FORCE_INLINEinstead of them. We don't need multiple similar definitions in the same project.