Skip to content

fix: correct 'clustering_key' to 'clustering' in column kind filter#761

Open
mykaul wants to merge 1 commit intoscylladb:masterfrom
mykaul:fix/clustering-key-filter-typo
Open

fix: correct 'clustering_key' to 'clustering' in column kind filter#761
mykaul wants to merge 1 commit intoscylladb:masterfrom
mykaul:fix/clustering-key-filter-typo

Conversation

@mykaul
Copy link
Copy Markdown

@mykaul mykaul commented Mar 25, 2026

Summary

  • Fix a latent bug in SchemaParserV3._build_table_metadata() where the column kind filter at cassandra/metadata.py:2744 used 'clustering_key' instead of 'clustering', causing clustering columns to not be excluded from the "other columns" loop and thus processed/added twice.

Details

The system_schema.columns table uses 'clustering' as the kind value for clustering key columns (not 'clustering_key'). The correct value 'clustering' is already used 6 lines above (line 2735) when building the clustering key list. The filter on line 2744, which is supposed to skip partition key and clustering columns when iterating "other" columns, was using the wrong string.

This bug is harmless in most cases because _build_column_metadata() overwrites column objects in a dict, but it causes unnecessary duplicate processing and could cause subtle issues with column ordering in edge cases.

Testing

All 55 metadata unit tests pass (tests/unit/test_metadata.py). The 1 pre-existing test failure in tests/unit/test_types.py::TypeTests::test_lookup_casstype is unrelated (broken test on master, to be fixed by PR #690).

The column kind filter at line 2744 used 'clustering_key' but
system_schema.columns uses 'clustering' as the kind value. This caused
clustering columns to not be excluded from the 'other columns' loop,
resulting in them being processed twice (once as clustering key, once
as regular column). The correct value 'clustering' was already used
6 lines above in the clustering key extraction loop.
@mykaul mykaul marked this pull request as draft March 25, 2026 07:50
@mykaul mykaul requested a review from Copilot March 25, 2026 07:51
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a schema parsing bug in SchemaParserV3._build_table_columns() where clustering columns from system_schema.columns were not being filtered out of the “other columns” loop due to using the wrong kind value.

Changes:

  • Corrected the kind filter from 'clustering_key' to 'clustering' when skipping key columns during non-key column processing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@mykaul mykaul marked this pull request as ready for review March 25, 2026 15:58
@mykaul
Copy link
Copy Markdown
Author

mykaul commented Mar 26, 2026

It's not very clear to me it is the correct fix. In GoCQL we have both? https://github.com/scylladb/gocql/blob/2d8532c6f9e6047ef4c05e5705224dd618d76387/metadata_scylla.go#L509

AI says:

  • ScyllaDB reports column kind as "clustering_key"
  • Cassandra reports it as "clustering"

I think AI is wrong here. Nothing is sent over the write as such. Need to understand this and GoCQL better :-/

@mykaul
Copy link
Copy Markdown
Author

mykaul commented Mar 26, 2026

AI again:

Writing (what drivers see): In db/schema_tables.cc:2298-2306, the serialize_kind function maps column_kind::clustering_key to the string "clustering":
case column_kind::clustering_key: return "clustering";

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants