Skip to content

Antalya-26.3: Resolve currentDatabase() in Hybrid segment metadata#1995

Open
mkmkme wants to merge 4 commits into
antalya-26.3from
mkmkme/antalya-26.3/hybrid-restart
Open

Antalya-26.3: Resolve currentDatabase() in Hybrid segment metadata#1995
mkmkme wants to merge 4 commits into
antalya-26.3from
mkmkme/antalya-26.3/hybrid-restart

Conversation

@mkmkme

@mkmkme mkmkme commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

A Hybrid table created in a non-default database with currentDatabase() in an additional segment persisted the unresolved currentDatabase() token in its metadata. On restart the startup ATTACH replays the definition with no session database, so currentDatabase() resolved to default and the segment pointed at a missing default table, failing to attach with UNKNOWN_TABLE.

registerStorageHybrid resolved currentDatabase() only on the clone used for schema validation and in-memory segment execution, never on the original engine_args[i] that gets serialized to metadata. Resolve it on the original argument too, so the persisted value is a concrete database name. This mirrors the base segment, which is already resolved in place by TableFunctionRemote::parseArguments, and the identifier branch, which writes the qualified name back into engine_args[i].

Add an integration test that creates a Hybrid table in a non-default database using currentDatabase(), restarts the server, and verifies the table still attaches and is queryable.

Fixes #1993

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fixes an issue where currentDatabase() used as a hybrid table argument would cause an UNKNOWN_TABLE error on restart

Documentation entry for user-facing changes

...

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • Aarch64 tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • OAuth (5m)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

A Hybrid table created in a non-default database with `currentDatabase()`
in an additional segment persisted the unresolved `currentDatabase()` token
in its metadata. On restart the startup ATTACH replays the definition with no
session database, so `currentDatabase()` resolved to `default` and the segment
pointed at a missing `default` table, failing to attach with `UNKNOWN_TABLE`.

`registerStorageHybrid` resolved `currentDatabase()` only on the clone used for
schema validation and in-memory segment execution, never on the original
`engine_args[i]` that gets serialized to metadata. Resolve it on the original
argument too, so the persisted value is a concrete database name. This mirrors
the base segment, which is already resolved in place by
`TableFunctionRemote::parseArguments`, and the identifier branch, which writes
the qualified name back into `engine_args[i]`.

Add an integration test that creates a Hybrid table in a non-default database
using `currentDatabase()`, restarts the server, and verifies the table still
attaches and is queryable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mkmkme

mkmkme commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@codex review

Comment thread tests/integration/test_hybrid_table_restart/test.py Outdated
Comment thread src/Storages/StorageDistributed.cpp Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7ba1a9d383

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/Storages/StorageDistributed.cpp Outdated
@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown

Workflow [PR], commit [1037c7d]

@mkmkme

mkmkme commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator Author

@codex review

@mkmkme mkmkme requested a review from filimonov July 1, 2026 16:36

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1037c7d712

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// Normalize arguments (evaluate `currentDatabase()`, expand named collections, etc.).
// TableFunctionFactory::get mutates the AST in-place inside TableFunctionRemote::parseArguments.
ASTPtr normalized_table_function_ast = table_function_ast->clone();
replaceCurrentDatabaseFunction(engine_args[i], local_context);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Resolve currentDatabase aliases in persisted segments

This new metadata rewrite only catches AST functions named exactly currentDatabase, but the same function is exposed through accepted aliases such as DATABASE, SCHEMA, and current_database. For an additional segment like remote('localhost:9000', DATABASE(), 'local_cold'), TableFunctionRemote::parseArguments still normalizes the clone used in memory, while the original engine_args[i] keeps the unresolved alias in the stored definition, so reattach/startup from a different current database can still point the segment at the wrong database and raise the same UNKNOWN_TABLE exception this patch is fixing. Please normalize the actual remote/cluster database argument with the existing database-name constant evaluator instead of relying on this exact-name tree walk.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hybrid engine — currentDatabase() in engine arguments breaks server restart

2 participants