HIVE-28728: fix STR_TO_MAP() returning garbled utf-8 characters when vectorization is enabled #6559

Open
cyanzheng2926 wants to merge 2 commits into apache:master from cyanzheng2926:HIVE-28728-str_to_map_udf8

@cyanzheng2926

What changes were proposed in this pull request?

This PR addresses HIVE-28728: when the default charset is not UTF-8, STR_TO_MAP() returns garbled characters if vectorization is enabled.

Why are the changes needed?

A non-UTF-8 default charset should not produce different behavior (garbled versus correct characters) depending on whether vectorization is on or off.
This change is also consistent with the Jira HIVE-28544.
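
As a rough illustration of the symptom (not the code changed in this PR), the sketch below shows how UTF-8 bytes turn into garbled text when they are decoded with a non-UTF-8 charset, and how naming UTF-8 explicitly avoids that. It assumes the garbling comes from a charset-dependent decode somewhere in the vectorized path, which this description does not confirm. The class `CharsetDecodeSketch` and its methods are hypothetical names, and ISO-8859-1 merely stands in for whatever non-UTF-8 default the JVM might have.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of Hive or of this patch.
public class CharsetDecodeSketch {

    // Decodes bytes with a caller-supplied charset. Passing a non-UTF-8
    // charset here mimics relying on a non-UTF-8 JVM default.
    public static String decodeWith(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    // Decodes bytes as UTF-8 regardless of the JVM default charset.
    public static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A key:value pair with non-ASCII characters, written with Unicode
        // escapes so the source file encoding does not matter.
        String original = "k\u00e9y:v\u00e4l";
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        String garbled = decodeWith(raw, StandardCharsets.ISO_8859_1);
        String correct = decodeUtf8(raw);

        System.out.println("matches with ISO-8859-1 decode: " + original.equals(garbled));
        System.out.println("matches with UTF-8 decode:      " + original.equals(correct));
    }
}
```

The point of the sketch is only that the result of decoding should not depend on the platform default; where exactly STR_TO_MAP() picks up the default charset is for the diff in this PR to show.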

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added str_to_map_utf8.q and ran it with TestMiniTezCliDriver and TestMiniLlapCliDriver under the corresponding settings.

@cyanzheng2926 changed the title from "Hive-28728: fix STR_TO_MAP() returning garbled utf-8 characters when vectorization is enabled" to "HIVE-28728: fix STR_TO_MAP() returning garbled utf-8 characters when vectorization is enabled" on Jun 23, 2026
@sonarqubecloud

2 participants