Fix reading columns with dot-separated names from Iceberg #1319

mkmkme · 2026-01-16T11:44:51Z

Fixes #1301. Upstream PR: ClickHouse#94335

Changelog category (leave one):

Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fixes an issue where Iceberg columns with dots in their names returned NULL values.

CI/CD Options

Exclude tests:

Regression jobs to run:

mkmkme · 2026-01-16T11:45:36Z

@codex review

chatgpt-codex-connector · 2026-01-16T11:52:47Z

Codex Review: Didn't find any major issues. You're on a roll.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

github-actions · 2026-01-16T12:03:23Z

Workflow [PR], commit [15a5d9a]

mkmkme · 2026-01-19T14:31:31Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d6b0a778de

chatgpt-codex-connector · 2026-01-19T14:45:45Z

src/Processors/Formats/Impl/Parquet/SchemaConverter.cpp

+    /// Strip "current_path." prefix to get child name (preserves dots in child names)
+    std::string_view mapped = it->second;
+    if (mapped.starts_with(current_path) && mapped.size() > current_path.size()
+        && mapped[current_path.size()] == '.')
+        return mapped.substr(current_path.size() + 1);


Normalize case before prefix stripping

When case_insensitive_column_matching is enabled, node.name is rewritten to the query’s casing in processSubtree, so current_path can differ in case from the Iceberg-mapped name. The starts_with check here is case-sensitive; if the user queries a tuple/struct column with different case, the prefix won’t be stripped and child names get built with the full mapped path (e.g., mystruct.MyStruct.child), which then fails tryGetPositionByName and can yield missing tuple elements or errors. Consider normalizing current_path/mapped with the same case-insensitive rules before comparing.
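To make the behaviour of the quoted prefix check concrete, here is a minimal standalone sketch. The function name `stripParentPrefix` and the fall-through (returning the mapped name unchanged when the prefix does not match) are assumptions for illustration; only the condition itself follows the diff, and it is written with `substr` instead of `starts_with` so that it also builds as C++17. The last assertion shows the case-sensitive comparison this comment is about.

```cpp
#include <string_view>

/// Hypothetical helper, not the actual SchemaConverter code: given the full
/// mapped name of a child field and the path of its parent, return the
/// child's own name. Only the "parent." prefix is removed, so dots inside
/// the child name survive.
constexpr std::string_view stripParentPrefix(std::string_view mapped, std::string_view current_path)
{
    if (mapped.substr(0, current_path.size()) == current_path
        && mapped.size() > current_path.size()
        && mapped[current_path.size()] == '.')
        return mapped.substr(current_path.size() + 1);

    /// Assumption: when the prefix does not match, hand the name back as is.
    return mapped;
}

// Dots inside the child name are preserved.
static_assert(stripParentPrefix("my.tuple.foo.bar", "my.tuple") == "foo.bar");
// "my.tuple2" is not a child of "my.tuple": the next character is not '.'.
static_assert(stripParentPrefix("my.tuple2.x", "my.tuple") == "my.tuple2.x");
// The comparison is case-sensitive, so a differently cased parent is not stripped.
static_assert(stripParentPrefix("MyStruct.child", "mystruct") == "MyStruct.child");
```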

mkmkme · 2026-01-20

The setting input_format_parquet_case_insensitive_column_matching does not apply to ColumnMapper. It covers differences in case between what the ClickHouse table schema expects and what the Parquet file contains (see https://github.com/Altinity/ClickHouse/blob/antalya-25.8/tests/queries/0_stateless/03285_orc_arrow_parquet_tuple_field_matching.sh). ColumnMapper, on the other hand, is part of the ClickHouse analyzer, which is always case-sensitive for identifiers.

That being said, this is what this setting affects:

    # Int64 on write is `foo.bar`
    :) INSERT INTO function file('dots.parquet', 'parquet', '`my.tuple` Tuple(`foo.bar` Int64, `bar.baz` String)') SELECT (1, '2') SETTINGS engine_file_truncate_on_insert = 1;

    # Int64 on read is `Foo.bar`, case_insensitive_column_matching = 0, no value returned
    :) SELECT `my.tuple.Foo.bar` FROM file('dots.parquet', 'parquet', '`my.tuple` Tuple(`Foo.bar` Int64, `bar.baz` String)') SETTINGS input_format_parquet_case_insensitive_column_matching = 0

    Query id: 622d67b9-06e0-44a3-9c43-1229cbf4eea4

       ┌─my.tuple.Foo.bar─┐
    1. │                0 │
       └──────────────────┘

    # Int64 on read is `Foo.bar`, case_insensitive_column_matching = 1, a proper value returned
    :) SELECT `my.tuple.Foo.bar` FROM file('dots.parquet', 'parquet', '`my.tuple` Tuple(`Foo.bar` Int64, `bar.baz` String)') SETTINGS input_format_parquet_case_insensitive_column_matching = 1

    Query id: 8f5f3d83-380e-46d5-97d5-af94d4b5ed17

       ┌─my.tuple.Foo.bar─┐
    1. │                1 │
       └──────────────────┘

The correct value here is 1. This is an example of reading it from the current antalya-25.8, not this branch.

Trying to SELECT my.tuple.Foo.bar when schema has my.tuple.foo.bar will always fail, because this setting doesn't affect what you're selecting vs what's in the schema. It affects what's in the schema vs what's in the file.
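The distinction can be pictured as two separate lookups. The sketch below is a toy model, not ClickHouse code: `resolveInSchema` and `findInFile` are hypothetical names, and the flat vectors of names stand in for the real schema and Parquet metadata. The first lookup (query identifier against the declared schema) is always exact; only the second (declared schema against the file) takes the case-insensitivity flag.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

/// Toy model only: every name in this block is hypothetical.

/// Case-insensitive comparison for ASCII names.
inline bool equalsIgnoreCase(const std::string & a, const std::string & b)
{
    return a.size() == b.size()
        && std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) { return std::tolower(x) == std::tolower(y); });
}

/// Stage 1: query identifier vs. declared schema. Always an exact match;
/// the case-insensitivity flag is not an input here.
inline bool resolveInSchema(const std::string & identifier, const std::vector<std::string> & schema)
{
    return std::find(schema.begin(), schema.end(), identifier) != schema.end();
}

/// Stage 2: declared schema name vs. names stored in the file.
/// Returns the index of the matching file column, or nullopt if none matches.
inline std::optional<std::size_t> findInFile(
    const std::string & schema_name, const std::vector<std::string> & file_columns, bool case_insensitive)
{
    for (std::size_t i = 0; i < file_columns.size(); ++i)
    {
        const bool match = case_insensitive ? equalsIgnoreCase(schema_name, file_columns[i])
                                            : schema_name == file_columns[i];
        if (match)
            return i;
    }
    return std::nullopt;
}
```

With a file that stores `foo.bar` and a declared schema of `Foo.bar`, `findInFile("Foo.bar", {"foo.bar"}, false)` finds nothing, while passing `true` finds index 0. `resolveInSchema("foo.bar", {"Foo.bar"})` is false in both cases, because the flag never reaches that stage.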

arthurpassos

LGTM

Fix reading columns with dot-separated names from Iceberg

5bc69af

mkmkme added the antalya, bugfix, antalya-25.8, antalya-25.8.14.20001, antalya-25.8.14, and antalya-in-upstream (PR exists in upstream) labels on Jan 16, 2026

Merge branch 'antalya-25.8' into mkmkme/dot-issue

ec6992f

SchemaConverter: fix handling the nested fields containing a dot in the name

d6b0a77

chatgpt-codex-connector bot reviewed Jan 19, 2026

mkmkme added 2 commits January 19, 2026 16:04

added test with nested structs having dot in the name

86b3b91

Merge branch 'antalya-25.8' into mkmkme/dot-issue

15a5d9a

arthurpassos approved these changes Jan 20, 2026
