Context
PR #858 introduced dynamic column projection for topology monitor queries: the first query uses SELECT * to discover available columns, then subsequent queries use a projected SELECT col1, col2, ... string. However, those subsequent projected queries are still sent as plain Query messages, so the server re-parses the query string on every call.
Proposed improvement
After the first SELECT * populates a column cache (e.g. localColumns, peersColumns, peersV2Columns), immediately issue a PREPARE for the resulting projected query string and cache the returned statementId/resultMetadataId alongside the column list. All subsequent queries then use Execute instead of Query, sending only the prepared statement ID (~16 bytes) rather than the full query text (~130–160 bytes), and skipping server-side re-parsing on every call.
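To make the size argument concrete, here is a rough sketch that compares the UTF-8 length of one projected query with the ~16-byte statement ID quoted above. The column list and the MessageSizeEstimate class are assumptions for illustration only, and the protocol's length prefixes and query parameters are ignored.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the driver.
final class MessageSizeEstimate {
  // Statement ID size taken from the "~16 bytes" figure in the text above.
  static final int STATEMENT_ID_BYTES = 16;

  // Assumed column set for system.peers; the real list comes from the column cache.
  static final String SAMPLE_PEERS_QUERY =
      "SELECT peer, data_center, host_id, preferred_ip, rack, release_version, "
          + "rpc_address, schema_version, tokens FROM system.peers";

  private MessageSizeEstimate() {}

  // Bytes of query text that a Query message has to carry.
  static int queryTextBytes(String cql) {
    return cql.getBytes(StandardCharsets.UTF_8).length;
  }

  // Payload bytes saved per call when the ID is sent instead of the text.
  static int bytesSavedPerCall(String cql) {
    return queryTextBytes(cql) - STATEMENT_ID_BYTES;
  }
}
```

The byte saving is small per call; the larger gain is likely the skipped server-side parse.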
Scope
The cleanest targets are the three queries that take no bind values (two full scans plus one query with a fixed WHERE clause):
- SELECT col1, col2, ... FROM system.local WHERE key='local' (fixed WHERE, prepare once)
- SELECT col1, col2, ... FROM system.peers
- SELECT col1, col2, ... FROM system.peers_v2
The WHERE-clause single-node queries in refreshNode() already use named bind parameters (:address, :port) and pass null columns (i.e. SELECT *). These could also be extended to use prepared+projected form with positional bind values, but that is a separate concern.
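As an illustration of the three target strings, the sketch below assembles a projected query from a cached column list and falls back to SELECT * when nothing is cached yet. ProjectedQueries and its build method are hypothetical names, not existing driver code.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of the driver.
final class ProjectedQueries {
  private ProjectedQueries() {}

  // Builds "SELECT c1, c2 FROM table [WHERE ...]".
  // A null or empty column list means "not discovered yet", so SELECT * is used.
  static String build(String table, List<String> columns, String whereClause) {
    String projection =
        (columns == null || columns.isEmpty()) ? "*" : String.join(", ", columns);
    StringBuilder cql =
        new StringBuilder("SELECT ").append(projection).append(" FROM ").append(table);
    if (whereClause != null && !whereClause.isEmpty()) {
      cql.append(" WHERE ").append(whereClause);
    }
    return cql.toString();
  }
}
```

For example, build("system.local", columns, "key='local'") yields the fixed-WHERE form, and passing a null WHERE clause yields the two full-scan forms.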
Implementation sketch
In DefaultTopologyMonitor:
- Add three additional volatile cache fields: localStatementId, peersStatementId, peersV2StatementId (type ByteBuffer, matching AdminRequestHandler's existing Prepared handler return type).
- Add a new AdminRequestHandler.prepare(channel, queryString) factory method (the infrastructure for handling Prepared responses already exists in AdminRequestHandler at line ~188).
- After each SELECT * populates a column cache, immediately chain a PREPARE call for the projected query string and store the returned ID.
- In the query-building step, if a statementId is available, use Execute instead of Query.
- Extend resetColumnCaches() to also clear the prepared IDs, so that a reconnect re-issues SELECT * and re-prepares with whatever columns the new server exposes.
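The steps above can be sketched as follows for system.peers alone. This is a simplified model under stated assumptions, not the driver's implementation: AdminRequests, PeersQueryCache and RecordingRequests are hypothetical stand-ins, and the real AdminRequestHandler API may look quite different.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

// Hypothetical stand-in for the admin request layer; not the driver's real API.
interface AdminRequests {
  CompletionStage<List<String>> query(String cql); // QUERY; yields result column names
  CompletionStage<ByteBuffer> prepare(String cql); // PREPARE; yields the statement ID
  CompletionStage<List<String>> execute(ByteBuffer statementId); // EXECUTE by ID
}

// Sketch of the discover, prepare, execute flow for system.peers only.
final class PeersQueryCache {
  private final AdminRequests requests;
  private volatile List<String> peersColumns;
  private volatile ByteBuffer peersStatementId;

  PeersQueryCache(AdminRequests requests) {
    this.requests = requests;
  }

  CompletionStage<List<String>> queryPeers() {
    ByteBuffer id = peersStatementId;
    if (id != null) {
      // Steady state: send only the prepared ID.
      return requests.execute(id.duplicate());
    }
    List<String> columns = peersColumns;
    if (columns != null) {
      // Columns known but no ID (e.g. an earlier PREPARE failed): plain projected QUERY.
      return requests.query(projected(columns));
    }
    // First call: discover columns with SELECT *, then chain the PREPARE.
    return requests.query("SELECT * FROM system.peers").thenCompose(discovered -> {
      peersColumns = discovered;
      return requests.prepare(projected(discovered)).thenApply(preparedId -> {
        peersStatementId = preparedId;
        return discovered;
      });
    });
  }

  // Called on reconnect: forget both the columns and the prepared ID.
  void resetColumnCaches() {
    peersColumns = null;
    peersStatementId = null;
  }

  private static String projected(List<String> columns) {
    return "SELECT " + String.join(", ", columns) + " FROM system.peers";
  }
}

// In-memory fake that records which protocol messages would be sent.
final class RecordingRequests implements AdminRequests {
  final List<String> sent = new ArrayList<>();

  @Override
  public CompletionStage<List<String>> query(String cql) {
    sent.add("QUERY " + cql);
    return CompletableFuture.completedFuture(Arrays.asList("peer", "host_id"));
  }

  @Override
  public CompletionStage<ByteBuffer> prepare(String cql) {
    sent.add("PREPARE " + cql);
    return CompletableFuture.completedFuture(ByteBuffer.wrap(new byte[16]));
  }

  @Override
  public CompletionStage<List<String>> execute(ByteBuffer statementId) {
    sent.add("EXECUTE");
    return CompletableFuture.completedFuture(Arrays.asList("peer", "host_id"));
  }
}
```

Keeping the column list and the statement ID in separate fields lets a failed PREPARE degrade to the current behaviour (a projected Query) instead of forcing another SELECT *.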
Notes
- Prepared statements on system tables are supported by both Scylla and Cassandra (confirmed by an existing test in PreparedStatementTest).
- A cached prepared ID is only known to be valid on the node that returned it, so it must not be reused after the control connection moves to another node. Clearing the IDs in resetColumnCaches() (already called by ControlConnection on reconnect) is sufficient.
- The first query per connection still pays the full SELECT * cost; this optimization only affects steady-state repeated queries.
Follows up on #858. Parent epic: DRIVER-274.