From cecfffbfb4352204cf5c941e05761610096c75c3 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Mon, 30 Mar 2026 18:28:37 +0100 Subject: [PATCH 01/12] Add posting index and covering index documentation New page: concepts/deep-dive/posting-index.md - When to use, creating with INDEX TYPE POSTING and INCLUDE - Covering index: how it works, supported types, choosing columns - Verifying with EXPLAIN, comparison with bitmap index - All accelerated query patterns with examples - SQL optimizer hints (no_covering, no_index) - Trade-offs: storage, write performance, memory - Architecture: file types, generations, sealing, FSST compression - Limitations Updated pages: - indexes.md: added index type comparison table - create-table.md: added posting index and INCLUDE syntax - alter-table-alter-column-add-index.md: added posting + INCLUDE examples - sql-optimizer-hints.md: added no_covering and no_index hints - schema-design-essentials.md: added indexing decision guide - sidebars.js: added posting-index to Deep Dive navigation --- documentation/concepts/deep-dive/indexes.md | 10 + .../concepts/deep-dive/posting-index.md | 328 ++++++++++++++++++ .../concepts/deep-dive/sql-optimizer-hints.md | 34 ++ .../sql/alter-table-alter-column-add-index.md | 34 +- documentation/query/sql/create-table.md | 46 ++- documentation/schema-design-essentials.md | 38 ++ documentation/sidebars.js | 1 + 7 files changed, 487 insertions(+), 4 deletions(-) create mode 100644 documentation/concepts/deep-dive/posting-index.md diff --git a/documentation/concepts/deep-dive/indexes.md b/documentation/concepts/deep-dive/indexes.md index 68a865032..baa0f7d26 100644 --- a/documentation/concepts/deep-dive/indexes.md +++ b/documentation/concepts/deep-dive/indexes.md @@ -14,6 +14,16 @@ Indexing is available for [symbol](/docs/concepts/symbol/) columns in both table and [materialized views](/docs/concepts/materialized-views). 
Index support for other types will be added over time. +QuestDB supports two index types: + +| Index type | Syntax | Covering support | Best for | +|------------|--------|-----------------|----------| +| **Bitmap** (default) | `INDEX` or `INDEX TYPE BITMAP` | No | General-purpose, low write overhead | +| **Posting** | `INDEX TYPE POSTING` | Yes (via `INCLUDE`) | Read-heavy workloads, selective queries, wide tables | + +See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) +for the detailed guide on the posting index and its covering query capabilities. + ## Index creation and deletion The following are ways to index a `symbol` column: diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md new file mode 100644 index 000000000..e84aad5ac --- /dev/null +++ b/documentation/concepts/deep-dive/posting-index.md @@ -0,0 +1,328 @@ +--- +title: Posting index and covering index +sidebar_label: Posting index +description: + The posting index is a compact, high-performance index for symbol columns + that supports covering queries. Learn how it works, when to use it, and + how to optimize queries with INCLUDE columns. +--- + +The **posting index** is an advanced index type for +[symbol](/docs/concepts/symbol/) columns that provides better compression, +faster reads, and **covering index** support compared to the default bitmap +index. + +A **covering index** stores additional column values alongside the index +entries, so queries that only need those columns can be answered entirely from +the index without reading the main column files. 
+ +## When to use the posting index + +Use the posting index when: + +- You frequently filter on a symbol column (`WHERE symbol = 'X'`) +- Your queries select a small set of columns alongside the symbol filter +- You want to reduce I/O by reading from compact sidecar files instead of + full column files +- You need efficient `DISTINCT` queries on a symbol column +- You need efficient `LATEST ON` queries partitioned by a symbol column + +The posting index is especially effective for high-cardinality symbol columns +(hundreds to thousands of distinct values) and wide tables where reading full +column files is expensive. + +## Creating a posting index + +### At table creation + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING, + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +### With covering columns (INCLUDE) + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (exchange, price, timestamp), + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +The `INCLUDE` clause specifies which columns are stored in the index sidecar +files. Queries that only read these columns plus the indexed symbol column +can be served entirely from the index. + +### On an existing table + +```questdb-sql +ALTER TABLE trades + ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (exchange, price); +``` + +## Covering index + +The covering index is the most powerful feature of the posting index. When all +columns in a query's `SELECT` list are either: + +- The indexed symbol column itself (from the `WHERE` clause) +- Listed in the `INCLUDE` clause + +...the query engine reads data directly from the index sidecar files, bypassing +the main column files entirely. This is significantly faster for selective +queries on wide tables. 
+ +### Supported column types in INCLUDE + +| Type | Supported | Notes | +|------|-----------|-------| +| BOOLEAN, BYTE, SHORT, CHAR | Yes | Fixed-width, 1-2 bytes per value | +| INT, FLOAT, IPv4 | Yes | Fixed-width, 4 bytes per value | +| LONG, DOUBLE, TIMESTAMP, DATE | Yes | Fixed-width, 8 bytes per value | +| GEOBYTE, GEOSHORT, GEOINT, GEOLONG | Yes | Fixed-width, 1-8 bytes depending on precision | +| DECIMAL8, DECIMAL16, DECIMAL32, DECIMAL64 | Yes | Fixed-width, 1-8 bytes depending on precision | +| SYMBOL | Yes | Stored as integer key, resolved at query time | +| VARCHAR | Yes | Variable-width, FSST compressed in sealed partitions | +| STRING | Yes | Variable-width, FSST compressed in sealed partitions | +| BINARY | No | Not yet supported | +| UUID, LONG256 | No | Not yet supported (requires multi-long sidecar format) | +| DECIMAL128, DECIMAL256 | No | Not yet supported | +| Arrays (DOUBLE[][], etc.) | No | Not supported | + +### How to choose INCLUDE columns + +Include columns that you frequently select together with the indexed symbol: + +```questdb-sql +-- If your typical queries look like this: +SELECT timestamp, price, quantity FROM trades WHERE symbol = 'AAPL'; + +-- Then include those columns: +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (timestamp, price, quantity), + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE, + -- other columns not needed in hot queries + raw_data VARCHAR, + metadata VARCHAR +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +:::tip + +Only include columns that appear in your most frequent queries. Each included +column adds storage overhead and slows down writes slightly. Columns not in +the `INCLUDE` list can still be queried — they just won't benefit from the +covering optimization and will be read from column files. 
+ +::: + +### Verifying covering index usage + +Use `EXPLAIN` to verify that a query uses the covering index: + +```questdb-sql +EXPLAIN SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; +``` + +If the covering index is used, the plan shows `CoveringIndex`: + +``` +SelectedRecord + CoveringIndex on: symbol with: timestamp, price + filter: symbol='AAPL' +``` + +If you see `DeferredSingleSymbolFilterPageFrame` or `PageFrame` instead, the +query is reading from column files. This happens when the `SELECT` list +includes columns not in the `INCLUDE` list. + +## Comparison with bitmap index + +| Feature | Bitmap index | Posting index | +|---------|-------------|---------------| +| Storage size | 8-16 bytes/value | ~1 byte/value | +| Covering index (INCLUDE) | No | Yes | +| DISTINCT acceleration | No | Yes | +| Write overhead | Minimal | Minimal (without INCLUDE) | +| Write overhead with INCLUDE | N/A | Moderate (depends on INCLUDE columns) | +| LATEST ON optimization | Yes | Yes | +| Syntax | `INDEX` or `INDEX TYPE BITMAP` | `INDEX TYPE POSTING` | + +## Query patterns accelerated + +### Point queries (WHERE symbol = 'X') + +```questdb-sql +-- Reads from sidecar if price is in INCLUDE +SELECT price FROM trades WHERE symbol = 'AAPL'; +``` + +### Point queries with additional filters + +If the additional filter columns are also in INCLUDE, the covering index +is still used with a filter applied on top: + +```questdb-sql +-- Covering index + filter on covered column +SELECT price FROM trades WHERE symbol = 'AAPL' AND price > 100; +``` + +### IN-list queries + +```questdb-sql +-- Multiple keys, still uses covering index +SELECT price FROM trades WHERE symbol IN ('AAPL', 'GOOGL', 'MSFT'); +``` + +### LATEST ON queries + +```questdb-sql +-- Latest row per symbol, reads from sidecar +SELECT timestamp, symbol, price +FROM trades +WHERE symbol = 'AAPL' +LATEST ON timestamp PARTITION BY symbol; +``` + +### DISTINCT queries + +```questdb-sql +-- Enumerates keys from index 
metadata, O(keys x partitions) instead of full scan +SELECT DISTINCT symbol FROM trades; + +-- Also works with timestamp filters +SELECT DISTINCT symbol FROM trades WHERE timestamp > '2024-01-01'; +``` + +### COUNT queries + +```questdb-sql +-- Uses index to scan only matching rows instead of full table +SELECT COUNT(*) FROM trades WHERE symbol = 'AAPL'; +``` + +### Aggregate queries on covered columns + +```questdb-sql +-- Vectorized GROUP BY reads from sidecar page frames +SELECT count(*), min(price), max(price) +FROM trades +WHERE symbol = 'AAPL'; +``` + +## SQL optimizer hints + +Two hints control index usage: + +### no_covering + +Forces the query to read from column files instead of the covering index +sidecar. Useful for benchmarking or when the covering path has an issue. + +```questdb-sql +SELECT /*+ no_covering */ price FROM trades WHERE symbol = 'AAPL'; +``` + +### no_index + +Completely disables index usage, falling back to a full table scan with +filter. Also implies `no_covering`. + +```questdb-sql +SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; +``` + +## Trade-offs + +### Storage + +The posting index itself is very compact (~1 byte per indexed value). +The covering sidecar adds storage proportional to the included columns: + +- Fixed-width columns (DOUBLE, INT, etc.): exact column size, compressed + with ALP (Adaptive Lossless floating-Point) and Frame-of-Reference bitpacking +- Variable-width columns (VARCHAR, STRING): FSST compressed in sealed + partitions, typically 2-5x smaller than raw column data +- The sidecar is typically 0.5-5% of the total column file size for the + included columns + +### Write performance + +Write overhead depends on the number and type of INCLUDE columns. 
Typical +ranges (measured with 100K row inserts, 50 symbol keys): + +- **Posting index without INCLUDE**: ~15-20% slower than no index +- **Posting index with fixed-width INCLUDE** (DOUBLE, INT): ~40-50% slower +- **Posting index with VARCHAR INCLUDE**: ~2x slower + +Actual overhead varies with row size, cardinality, and hardware. Query +performance improvements typically far outweigh the write cost for +read-heavy workloads. + +### Memory + +The posting index uses native memory for encoding/decoding buffers. +The covering index's FSST symbol tables use ~70KB of native memory per +compressed column per active reader. + +## Architecture + +The posting index stores data in three file types per partition: + +- **`.pk`** — Key file: double-buffered metadata pages with generation + directory (32 bytes per generation entry) +- **`.pv`** — Value file: delta + Frame-of-Reference bitpacked row IDs, + organized into stride-indexed generations +- **`.pci` + `.pc0`, `.pc1`, ...** — Sidecar files: covered column values + stored alongside the posting list, one file per INCLUDE column + +### Generations and sealing + +Data is written incrementally as **generations** (one per commit). Each +generation contains a sparse block of key→rowID mappings. Periodically, +generations are **sealed** into a single dense generation with stride-indexed +layout for optimal read performance. + +Sealing happens automatically when the generation count reaches the maximum +(125) or when the partition is closed. Sealed data uses two encoding modes +per stride (256 keys): + +- **Delta mode**: per-key delta encoding with bitpacking +- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking + +The encoder trial-encodes both modes and picks the smaller one per stride. + +### FSST compression for strings + +VARCHAR and STRING columns in the INCLUDE list are compressed using FSST +(Fast Static Symbol Table) compression during sealing. 
FSST replaces +frequently occurring 1-8 byte patterns with single-byte codes, typically +achieving 2-5x compression on string data with repetitive patterns. + +The FSST symbol table is trained per stride block and stored inline in the +sidecar file. Decompression is transparent to the query engine. + +## Limitations + +:::warning + +- INCLUDE is only supported for POSTING index type (not BITMAP) +- Array columns (DOUBLE[][], etc.) cannot be included +- BINARY, UUID, LONG256, DECIMAL128, and DECIMAL256 columns cannot yet be included +- SAMPLE BY queries do not currently use the covering index + (they fall back to the regular index path) +- REINDEX on WAL tables requires dropping and re-adding the index + (this applies to all index types, not just posting) + +::: diff --git a/documentation/concepts/deep-dive/sql-optimizer-hints.md b/documentation/concepts/deep-dive/sql-optimizer-hints.md index 93f207598..7a989a54b 100644 --- a/documentation/concepts/deep-dive/sql-optimizer-hints.md +++ b/documentation/concepts/deep-dive/sql-optimizer-hints.md @@ -358,3 +358,37 @@ your symbol set is high-cardinality. - superseded by `asof_index` - `asof_memoized_search` - superseded by `asof_memoized` + +----- + +## Index hints + +These hints control whether the query optimizer uses indexes (bitmap or posting) +for symbol column lookups. + +### no_covering + +Disables the [covering index](/docs/concepts/deep-dive/posting-index/) +optimization, forcing the query to read from column files instead of the +index sidecar. The index is still used for row ID lookup, but column values +are read from the main column files. + +```questdb-sql +SELECT /*+ no_covering */ price FROM trades WHERE symbol = 'AAPL'; +``` + +This is useful for benchmarking covering index performance or working around +a specific issue with the covering path. + +### no_index + +Completely disables all index usage for the query, including bitmap index, +posting index, and covering index. 
The query falls back to a full table scan +with a filter applied to every row. Also implies `no_covering`. + +```questdb-sql +SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; +``` + +This is useful for benchmarking index effectiveness or forcing a table scan +when you know the filter selectivity is low (many rows match). diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md index 3ddea73cb..bff10995f 100644 --- a/documentation/query/sql/alter-table-alter-column-add-index.md +++ b/documentation/query/sql/alter-table-alter-column-add-index.md @@ -10,13 +10,41 @@ Indexes an existing [`symbol`](/docs/concepts/symbol/) column. ![Flow chart showing the syntax of the ALTER TABLE ALTER COLUMN ADD INDEX keyword](/images/docs/diagrams/alterTableAddIndex.svg) - Adding an [index](/docs/concepts/deep-dive/indexes/) is an atomic, non-blocking, and non-waiting operation. Once complete, the SQL optimizer will start using the new index for SQL executions. 
-## Example +## Examples + +### Adding a bitmap index (default) -```questdb-sql title="Adding an index" +```questdb-sql ALTER TABLE trades ALTER COLUMN instrument ADD INDEX; ``` + +### Adding a posting index + +```questdb-sql +ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING; +``` + +### Adding a posting index with covering columns + +The `INCLUDE` clause stores additional column values in the index sidecar +files, enabling covering queries that bypass column file reads: + +```questdb-sql +ALTER TABLE trades + ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity, timestamp); +``` + +After this, queries that only select columns from the `INCLUDE` list (plus the +indexed symbol column) are served from the index sidecar: + +```questdb-sql +-- This query reads from the index sidecar, not from column files +SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; +``` + +See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) +for supported column types and performance details. diff --git a/documentation/query/sql/create-table.md b/documentation/query/sql/create-table.md index ac77b4e20..8ec7a0299 100644 --- a/documentation/query/sql/create-table.md +++ b/documentation/query/sql/create-table.md @@ -475,6 +475,8 @@ must be of type [symbol](/docs/concepts/symbol/). ![Flow chart showing the syntax of the index function](/images/docs/diagrams/indexDef.svg) +### Bitmap index (default) + ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, @@ -484,13 +486,55 @@ CREATE TABLE trades ( ), INDEX(symbol) TIMESTAMP(timestamp); ``` +### Posting index + +The posting index offers better compression and read performance than the +default bitmap index. 
Use `INDEX TYPE POSTING`: + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING, + price DOUBLE, + amount DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +### Posting index with covering columns (INCLUDE) + +The `INCLUDE` clause stores additional column values in the index sidecar +files. Queries that only need these columns plus the indexed symbol can be +served entirely from the index, bypassing column files: + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, timestamp, exchange), + exchange SYMBOL, + price DOUBLE, + amount DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + +With this schema, the following query reads only from the index sidecar: + +```questdb-sql +SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; +``` + +See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) +for a comprehensive guide including supported column types, query patterns, +and performance characteristics. + :::warning - The **index capacity** and [**symbol capacity**](/docs/concepts/symbol/) are different settings. - The index capacity value should not be changed, unless a user is aware of all - the implications. ::: + the implications. + +::: See the [Index concept](/docs/concepts/deep-dive/indexes/#how-indexes-work) for more information about indexes. diff --git a/documentation/schema-design-essentials.md b/documentation/schema-design-essentials.md index 592e9d09f..ea585a933 100644 --- a/documentation/schema-design-essentials.md +++ b/documentation/schema-design-essentials.md @@ -75,6 +75,44 @@ TIMESTAMP(ts) PARTITION BY MONTH; See [Partitions](/docs/concepts/partitions/) for details. +## Indexing + +Index your primary filter columns to speed up `WHERE` clause queries. 
QuestDB +supports two index types for SYMBOL columns: + +```questdb-sql +-- Default bitmap index — low overhead, good for most cases +CREATE TABLE trades ( + ts TIMESTAMP, + symbol SYMBOL INDEX, + price DOUBLE +) TIMESTAMP(ts) PARTITION BY DAY WAL; + +-- Posting index with covering columns — best for read-heavy, selective queries +CREATE TABLE trades ( + ts TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, ts), + price DOUBLE, + raw_data VARCHAR -- not in INCLUDE, read from column files +) TIMESTAMP(ts) PARTITION BY DAY WAL; +``` + +**When to choose each:** + +| Scenario | Recommendation | +|----------|---------------| +| General purpose, write-heavy | Bitmap index (`INDEX`) | +| Read-heavy, filtering on symbol | Posting index (`INDEX TYPE POSTING`) | +| Frequent queries on a few columns | Posting with `INCLUDE` | +| Wide table, queries select subset | Posting with `INCLUDE` — biggest win | + +The covering index (`INCLUDE`) lets queries that only select covered columns +read from compact sidecar files instead of full column files. Use `EXPLAIN` to +verify your queries use the `CoveringIndex` plan. + +See [Indexes](/docs/concepts/deep-dive/indexes/) and +[Posting index](/docs/concepts/deep-dive/posting-index/) for details. 
+ ## Data types ### SYMBOL vs VARCHAR diff --git a/documentation/sidebars.js b/documentation/sidebars.js index 0a83522d1..9c3850465 100644 --- a/documentation/sidebars.js +++ b/documentation/sidebars.js @@ -538,6 +538,7 @@ module.exports = { collapsed: true, items: [ "concepts/deep-dive/indexes", + "concepts/deep-dive/posting-index", "concepts/deep-dive/interval-scan", "concepts/deep-dive/jit-compiler", "concepts/deep-dive/query-tracing", From 1527456806d2c81c31c171ff61e1d95ef002efb5 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:29:43 +0100 Subject: [PATCH 02/12] Update posting/covering index docs to match current codebase - All column types now supported in INCLUDE (UUID, LONG256, BINARY, DECIMAL128/256, arrays were added since initial docs) - Document encoding options: POSTING DELTA and POSTING EF syntax - Document designated timestamp auto-inclusion in covering index - Add out-of-line INDEX(col TYPE POSTING) syntax examples - Add SHOW COLUMNS indexType and indexInclude columns to show.md, meta.md (table_columns), and posting-index.md - Add SHOW CREATE TABLE example with posting index - Note CAPACITY restriction (bitmap only) across all relevant pages - Note INCLUDE restrictions (inline syntax only, cannot include indexed column itself) - Update storage/compression details per column type category Co-Authored-By: Claude Opus 4.6 --- documentation/concepts/deep-dive/indexes.md | 7 +- .../concepts/deep-dive/posting-index.md | 140 ++++++++++++++---- documentation/query/functions/meta.md | 16 +- .../sql/alter-table-alter-column-add-index.md | 15 +- documentation/query/sql/create-table.md | 29 +++- documentation/query/sql/show.md | 35 ++++- documentation/schema-design-essentials.md | 9 +- 7 files changed, 201 insertions(+), 50 deletions(-) diff --git a/documentation/concepts/deep-dive/indexes.md b/documentation/concepts/deep-dive/indexes.md index baa0f7d26..7f2b7d861 100644 --- 
a/documentation/concepts/deep-dive/indexes.md +++ b/documentation/concepts/deep-dive/indexes.md @@ -107,6 +107,9 @@ Consider the following query applied to the above table :::warning +Index capacity applies to **bitmap indexes only**. Posting indexes manage +their own storage layout and do not use this setting. + We strongly recommend to rely on the default index capacity. Misconfiguring this property might lead to worse performance and increased disk usage. @@ -124,8 +127,8 @@ When in doubt, reach out via the QuestDB support channels for advice. ::: -When a symbol column is indexed, an additional **index capacity** can be defined -to specify how many row IDs to store in a single storage block on disk: +When a symbol column has a bitmap index, an additional **index capacity** can be +defined to specify how many row IDs to store in a single storage block on disk: - Server-wide setting: `cairo.index.value.block.size` with a default of `256` - Column-wide setting: The diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index e84aad5ac..55309dc82 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -35,6 +35,8 @@ column files is expensive. 
### At table creation +Inline syntax (index defined alongside the column): + ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, @@ -45,12 +47,25 @@ CREATE TABLE trades ( ) TIMESTAMP(timestamp) PARTITION BY DAY WAL; ``` +Out-of-line syntax (index defined separately): + +```questdb-sql +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL, + exchange SYMBOL, + price DOUBLE, + quantity DOUBLE +), INDEX(symbol TYPE POSTING) +TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + ### With covering columns (INCLUDE) ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (exchange, price, timestamp), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (exchange, price), exchange SYMBOL, price DOUBLE, quantity DOUBLE @@ -61,6 +76,23 @@ The `INCLUDE` clause specifies which columns are stored in the index sidecar files. Queries that only read these columns plus the indexed symbol column can be served entirely from the index. +:::tip + +The designated timestamp column is automatically included in the covering +index when an `INCLUDE` clause is present — you do not need to list it +explicitly. This means timestamp-filtered covering queries work out of the +box. + +::: + +:::note + +The `INCLUDE` clause is only supported with inline column syntax and +`ALTER TABLE`. The out-of-line `INDEX(col TYPE POSTING)` syntax does not +support `INCLUDE`. + +::: + ### On an existing table ```questdb-sql @@ -68,6 +100,34 @@ ALTER TABLE trades ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (exchange, price); ``` +### Encoding options + +The posting index supports two internal row ID encoding strategies. 
In most +cases the default is optimal and no keyword is needed: + +| Syntax | Encoding | Description | +|--------|----------|-------------| +| `INDEX TYPE POSTING` | Adaptive (default) | Trial-encodes delta and flat modes per stride, picks the smaller | +| `INDEX TYPE POSTING EF` | Adaptive (explicit) | Same as above — `EF` makes the choice explicit | +| `INDEX TYPE POSTING DELTA` | Delta-only | Forces per-key delta encoding, skipping flat-mode trial | + +```questdb-sql +-- Default adaptive encoding (recommended) +CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING) + TIMESTAMP(ts) PARTITION BY DAY WAL; + +-- Force delta-only encoding +CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) + TIMESTAMP(ts) PARTITION BY DAY WAL; +``` + +:::note + +`CAPACITY` is only supported for bitmap indexes. Using `CAPACITY` with a +posting index will produce an error. + +::: + ## Covering index The covering index is the most powerful feature of the posting index. When all @@ -82,20 +142,20 @@ queries on wide tables. ### Supported column types in INCLUDE -| Type | Supported | Notes | -|------|-----------|-------| -| BOOLEAN, BYTE, SHORT, CHAR | Yes | Fixed-width, 1-2 bytes per value | -| INT, FLOAT, IPv4 | Yes | Fixed-width, 4 bytes per value | -| LONG, DOUBLE, TIMESTAMP, DATE | Yes | Fixed-width, 8 bytes per value | -| GEOBYTE, GEOSHORT, GEOINT, GEOLONG | Yes | Fixed-width, 1-8 bytes depending on precision | -| DECIMAL8, DECIMAL16, DECIMAL32, DECIMAL64 | Yes | Fixed-width, 1-8 bytes depending on precision | -| SYMBOL | Yes | Stored as integer key, resolved at query time | -| VARCHAR | Yes | Variable-width, FSST compressed in sealed partitions | -| STRING | Yes | Variable-width, FSST compressed in sealed partitions | -| BINARY | No | Not yet supported | -| UUID, LONG256 | No | Not yet supported (requires multi-long sidecar format) | -| DECIMAL128, DECIMAL256 | No | Not yet supported | -| Arrays (DOUBLE[][], etc.) 
| No | Not supported | +Columns of any type can be included, except the indexed symbol column itself: + +| Type | Compression | Notes | +|------|-------------|-------| +| BOOLEAN, BYTE, GEOBYTE, DECIMAL8 | Raw copy | 1 byte per value | +| SHORT, CHAR, GEOSHORT, DECIMAL16 | Frame-of-Reference | 2 bytes uncompressed | +| INT, FLOAT, IPv4, GEOINT, DECIMAL32 | FoR (int) / ALP (float) | 4 bytes uncompressed | +| LONG, DOUBLE, TIMESTAMP, DATE, GEOLONG, DECIMAL64 | FoR / ALP / linear prediction | 8 bytes uncompressed | +| SYMBOL | Frame-of-Reference | Stored as integer key, resolved at query time | +| UUID, DECIMAL128 | Raw copy | 16 bytes per value | +| LONG256, DECIMAL256 | Raw copy | 32 bytes per value | +| VARCHAR, STRING | FSST compressed | Variable-width, typically 2-5x compression | +| BINARY | Variable-width sidecar | Stored in offset-based format | +| Arrays (DOUBLE[], INT[], etc.) | Variable-width sidecar | Stored in offset-based format | ### How to choose INCLUDE columns @@ -105,10 +165,10 @@ Include columns that you frequently select together with the indexed symbol: -- If your typical queries look like this: SELECT timestamp, price, quantity FROM trades WHERE symbol = 'AAPL'; --- Then include those columns: +-- Then include those columns (timestamp is auto-included as the designated timestamp): CREATE TABLE trades ( timestamp TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (timestamp, price, quantity), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, quantity), exchange SYMBOL, price DOUBLE, quantity DOUBLE, @@ -127,6 +187,26 @@ covering optimization and will be read from column files. 
::: +### Inspecting indexes with SHOW COLUMNS + +`SHOW COLUMNS` displays index metadata for each column, including the index +type and covered columns: + +```questdb-sql +SHOW COLUMNS FROM trades; +``` + +| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey | +|--------|------|---------|-------------------|-----------|-------------|-------------|----------------|------------|-----------| +| timestamp | TIMESTAMP | false | 0 | | | false | 0 | true | false | +| symbol | SYMBOL | true | 256 | POSTING | exchange,price | true | 128 | false | false | +| exchange | SYMBOL | false | 0 | | | true | 128 | false | false | +| price | DOUBLE | false | 0 | | | false | 0 | false | false | +| quantity | DOUBLE | false | 0 | | | false | 0 | false | false | + +The `indexType` column shows `POSTING`, `BITMAP`, or is empty for +non-indexed columns. The `indexInclude` column lists covered column names. + ### Verifying covering index usage Use `EXPLAIN` to verify that a query uses the covering index: @@ -250,12 +330,16 @@ SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL'; The posting index itself is very compact (~1 byte per indexed value). 
The covering sidecar adds storage proportional to the included columns: -- Fixed-width columns (DOUBLE, INT, etc.): exact column size, compressed - with ALP (Adaptive Lossless floating-Point) and Frame-of-Reference bitpacking -- Variable-width columns (VARCHAR, STRING): FSST compressed in sealed +- **Floating-point columns** (DOUBLE, FLOAT): compressed with ALP (Adaptive + Lossless floating-Point) and Frame-of-Reference bitpacking +- **Integer columns** (INT, LONG, etc.): Frame-of-Reference bitpacking; + TIMESTAMP additionally uses linear-prediction encoding +- **Small fixed-width types** (BYTE, BOOLEAN, etc.): stored as raw copies +- **Wide fixed-width types** (UUID, LONG256, DECIMAL128/256): stored as + raw copies with a count header +- **Variable-width columns** (VARCHAR, STRING): FSST compressed in sealed partitions, typically 2-5x smaller than raw column data -- The sidecar is typically 0.5-5% of the total column file size for the - included columns +- **BINARY and arrays**: stored in an offset-based variable-width sidecar ### Write performance @@ -317,12 +401,14 @@ sidecar file. Decompression is transparent to the query engine. :::warning -- INCLUDE is only supported for POSTING index type (not BITMAP) -- Array columns (DOUBLE[][], etc.) 
cannot be included -- BINARY, UUID, LONG256, DECIMAL128, and DECIMAL256 columns cannot yet be included -- SAMPLE BY queries do not currently use the covering index +- `INCLUDE` is only supported for the posting index type (not bitmap) +- `INCLUDE` cannot list the indexed symbol column itself +- `INCLUDE` is not supported with out-of-line `INDEX(col ...)` syntax — + use inline column syntax or `ALTER TABLE` instead +- `CAPACITY` is not supported for posting indexes (bitmap only) +- `SAMPLE BY` queries do not currently use the covering index (they fall back to the regular index path) -- REINDEX on WAL tables requires dropping and re-adding the index +- `REINDEX` on WAL tables requires dropping and re-adding the index (this applies to all index types, not just posting) ::: diff --git a/documentation/query/functions/meta.md b/documentation/query/functions/meta.md index 832f56bb2..d7d450b9f 100644 --- a/documentation/query/functions/meta.md +++ b/documentation/query/functions/meta.md @@ -594,6 +594,10 @@ Returns a `table` with the following columns: - `indexed` - if indexing is applied to this column - `indexBlockCapacity` - how many row IDs to store in a single storage block on disk +- `indexType` - the [index type](/docs/concepts/deep-dive/indexes/) + (`POSTING`, `BITMAP`, or empty) +- `indexInclude` - comma-separated names of columns included in a + [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar - `symbolCached` - whether this `symbol` column is cached - `symbolCapacity` - how many distinct values this column of `symbol` type is expected to have @@ -611,12 +615,12 @@ For more details on the meaning and use of these values, see the table_columns('my_table'); ``` -| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | designated | upsertKey | -| ------ | --------- | ------- | ------------------ | ------------ | -------------- | ---------- | --------- | -| symb | SYMBOL | true | 1048576 | false | 256 | false | false 
| -| price | DOUBLE | false | 0 | false | 0 | false | false | -| ts | TIMESTAMP | false | 0 | false | 0 | true | false | -| s | VARCHAR | false | 0 | false | 0 | false | false | +| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey | +| ------ | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | ---------- | --------- | +| symb | SYMBOL | true | 1048576 | | | false | 256 | false | false | +| price | DOUBLE | false | 0 | | | false | 0 | false | false | +| ts | TIMESTAMP | false | 0 | | | false | 0 | true | false | +| s | VARCHAR | false | 0 | | | false | 0 | false | false | ```questdb-sql title="Get designated timestamp column" SELECT "column", type, designated FROM table_columns('my_table') WHERE designated = true; diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md index bff10995f..12d77467a 100644 --- a/documentation/query/sql/alter-table-alter-column-add-index.md +++ b/documentation/query/sql/alter-table-alter-column-add-index.md @@ -28,6 +28,13 @@ ALTER TABLE trades ALTER COLUMN instrument ADD INDEX; ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING; ``` +An encoding variant can be specified: + +```questdb-sql +-- Force delta-only encoding +ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING DELTA; +``` + ### Adding a posting index with covering columns The `INCLUDE` clause stores additional column values in the index sidecar @@ -35,11 +42,15 @@ files, enabling covering queries that bypass column file reads: ```questdb-sql ALTER TABLE trades - ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity, timestamp); + ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity); ``` +The designated timestamp column is automatically included in the covering +index — you do not need to list it explicitly. 
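The automatic inclusion of the designated timestamp can be read as simple set logic. The Python below is a rough sketch only: `effective_include` and `is_covered` are hypothetical helper names, not QuestDB functions, and the real planner applies further conditions (for example, a filter on the indexed symbol) that this sketch ignores.

```python
# Hypothetical helpers; a simplified view of the covering rule, not QuestDB code.

def effective_include(include, designated_ts, auto_include_ts=True):
    """Return the INCLUDE list with the designated timestamp appended if absent."""
    cols = list(include)
    if auto_include_ts and designated_ts not in cols:
        cols.append(designated_ts)
    return cols


def is_covered(selected, indexed_col, include, designated_ts):
    """True when every selected column is available from the index sidecar."""
    available = set(effective_include(include, designated_ts)) | {indexed_col}
    return set(selected) <= available
```

Under these assumptions, `INCLUDE (price, quantity)` on a table whose designated timestamp is `timestamp` behaves like a three-column list, so selecting `timestamp, price` counts as covered while selecting `amount` does not.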
+ After this, queries that only select columns from the `INCLUDE` list (plus the -indexed symbol column) are served from the index sidecar: +indexed symbol column and designated timestamp) are served from the index +sidecar: ```questdb-sql -- This query reads from the index sidecar, not from column files diff --git a/documentation/query/sql/create-table.md b/documentation/query/sql/create-table.md index 8ec7a0299..6705f7753 100644 --- a/documentation/query/sql/create-table.md +++ b/documentation/query/sql/create-table.md @@ -489,15 +489,26 @@ CREATE TABLE trades ( ### Posting index The posting index offers better compression and read performance than the -default bitmap index. Use `INDEX TYPE POSTING`: +default bitmap index. Use `INDEX TYPE POSTING` with either inline or +out-of-line syntax: ```questdb-sql +-- Inline syntax CREATE TABLE trades ( timestamp TIMESTAMP, symbol SYMBOL INDEX TYPE POSTING, price DOUBLE, amount DOUBLE ) TIMESTAMP(timestamp) PARTITION BY DAY WAL; + +-- Out-of-line syntax +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL, + price DOUBLE, + amount DOUBLE +), INDEX(symbol TYPE POSTING) +TIMESTAMP(timestamp) PARTITION BY DAY WAL; ``` ### Posting index with covering columns (INCLUDE) @@ -509,19 +520,29 @@ served entirely from the index, bypassing column files: ```questdb-sql CREATE TABLE trades ( timestamp TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, timestamp, exchange), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, exchange), exchange SYMBOL, price DOUBLE, amount DOUBLE ) TIMESTAMP(timestamp) PARTITION BY DAY WAL; ``` -With this schema, the following query reads only from the index sidecar: +The designated timestamp column is automatically included — you do not need +to list it in the `INCLUDE` clause. 
With this schema, the following query +reads only from the index sidecar: ```questdb-sql SELECT timestamp, price FROM trades WHERE symbol = 'AAPL'; ``` +:::note + +`INCLUDE` is only supported with inline column syntax (not out-of-line +`INDEX(col ...)`). Use `ALTER TABLE` to add covering columns to an existing +table. + +::: + See [Posting index and covering index](/docs/concepts/deep-dive/posting-index/) for a comprehensive guide including supported column types, query patterns, and performance characteristics. @@ -533,6 +554,8 @@ and performance characteristics. settings. - The index capacity value should not be changed, unless a user is aware of all the implications. +- `CAPACITY` is only supported for bitmap indexes — it cannot be used with + posting indexes. ::: diff --git a/documentation/query/sql/show.md b/documentation/query/sql/show.md index 5d14121fe..6bdcf9c12 100644 --- a/documentation/query/sql/show.md +++ b/documentation/query/sql/show.md @@ -57,13 +57,18 @@ SHOW TABLES; SHOW COLUMNS FROM trades; ``` -| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | -| --------- | --------- | ------- | ------------------ | ------------ | -------------- | --------------- | ---------- | --------- | -| symbol | SYMBOL | false | 0 | true | 256 | 42 | false | false | -| side | SYMBOL | false | 0 | true | 256 | 2 | false | false | -| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | -| amount | DOUBLE | false | 0 | false | 0 | 0 | false | false | -| timestamp | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | +| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | +| --------- | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | --------------- | ---------- | --------- | +| symbol | SYMBOL | false | 0 | | | true | 256 | 42 | false | false 
| +| side | SYMBOL | false | 0 | | | true | 256 | 2 | false | false | +| price | DOUBLE | false | 0 | | | false | 0 | 0 | false | false | +| amount | DOUBLE | false | 0 | | | false | 0 | 0 | false | false | +| timestamp | TIMESTAMP | false | 0 | | | false | 0 | 0 | true | false | + +The `indexType` column shows the index type (`POSTING`, `BITMAP`, or empty for +non-indexed columns). The `indexInclude` column lists the names of columns +included in a [posting index's](/docs/concepts/deep-dive/posting-index/) +covering sidecar, as a comma-separated string. ### SHOW CREATE TABLE @@ -88,6 +93,22 @@ CREATE TABLE trades ( WITH maxUncommittedRows=500000, o3MaxLag=600000000us; ``` +#### Posting index with covering columns + +When a symbol column has a posting index with `INCLUDE`, the DDL reflects +the index type and covered columns: + +```questdb-sql +CREATE TABLE trades ( + symbol SYMBOL CAPACITY 128 CACHE INDEX TYPE POSTING INCLUDE (price, exchange), + exchange SYMBOL CAPACITY 128 CACHE, + price DOUBLE, + amount DOUBLE, + timestamp TIMESTAMP +) timestamp(timestamp) PARTITION BY DAY WAL +WITH maxUncommittedRows=500000, o3MaxLag=600000000us; +``` + #### Per-column Parquet encoding When columns have per-column Parquet encoding or compression overrides, they diff --git a/documentation/schema-design-essentials.md b/documentation/schema-design-essentials.md index ea585a933..b88a6c929 100644 --- a/documentation/schema-design-essentials.md +++ b/documentation/schema-design-essentials.md @@ -91,10 +91,11 @@ CREATE TABLE trades ( -- Posting index with covering columns — best for read-heavy, selective queries CREATE TABLE trades ( ts TIMESTAMP, - symbol SYMBOL INDEX TYPE POSTING INCLUDE (price, ts), + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price), price DOUBLE, raw_data VARCHAR -- not in INCLUDE, read from column files ) TIMESTAMP(ts) PARTITION BY DAY WAL; +-- The designated timestamp (ts) is automatically included in the covering index. 
``` **When to choose each:** @@ -107,8 +108,10 @@ CREATE TABLE trades ( | Wide table, queries select subset | Posting with `INCLUDE` — biggest win | The covering index (`INCLUDE`) lets queries that only select covered columns -read from compact sidecar files instead of full column files. Use `EXPLAIN` to -verify your queries use the `CoveringIndex` plan. +read from compact sidecar files instead of full column files. The designated +timestamp is automatically included, so timestamp-filtered queries benefit +without explicit listing. Use `EXPLAIN` to verify your queries use the +`CoveringIndex` plan. See [Indexes](/docs/concepts/deep-dive/indexes/) and [Posting index](/docs/concepts/deep-dive/posting-index/) for details. From 72b98cada696fbbcbc1ccfbc0ba14ab5c345b054 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:36:29 +0100 Subject: [PATCH 03/12] Add posting index docs to explain, symbol, mat-view, config pages - explain.md: add CoveringIndex and PostingIndex plan node descriptions - symbol.md: add posting index example alongside bitmap in indexing section - alter-mat-view-alter-column-add-index.md: add TYPE POSTING syntax (INCLUDE not supported on materialized views) - _cairo.config.json: add cairo.posting.index.auto.include.timestamp and cairo.posting.index.row.id.encoding config keys; clarify bitmap-only scope of cairo.index.value.block.size and cairo.spin.lock.timeout Co-Authored-By: Claude Opus 4.6 --- documentation/concepts/symbol.md | 14 +++++++++++ .../configuration-utils/_cairo.config.json | 12 ++++++++-- .../alter-mat-view-alter-column-add-index.md | 24 ++++++++++++++++--- documentation/query/sql/explain.md | 6 +++++ 4 files changed, 51 insertions(+), 5 deletions(-) diff --git a/documentation/concepts/symbol.md b/documentation/concepts/symbol.md index 2e5bf740f..bfd7b0be5 100755 --- a/documentation/concepts/symbol.md +++ b/documentation/concepts/symbol.md @@ -117,6 +117,7 @@ ALTER TABLE 
trades ALTER COLUMN client_id CACHE; For columns frequently used in `WHERE` clauses, add an index: ```questdb-sql +-- Bitmap index (default) — low overhead, good for most cases CREATE TABLE trades ( timestamp TIMESTAMP, symbol SYMBOL INDEX, @@ -124,10 +125,23 @@ CREATE TABLE trades ( ) TIMESTAMP(timestamp) PARTITION BY DAY; ``` +For read-heavy workloads, a [posting index](/docs/concepts/deep-dive/posting-index/) +offers better compression and supports covering queries: + +```questdb-sql +-- Posting index with covering columns — reads from compact sidecar files +CREATE TABLE trades ( + timestamp TIMESTAMP, + symbol SYMBOL INDEX TYPE POSTING INCLUDE (price), + price DOUBLE +) TIMESTAMP(timestamp) PARTITION BY DAY WAL; +``` + Or add an index later: ```questdb-sql ALTER TABLE trades ALTER COLUMN symbol ADD INDEX; +-- or: ALTER TABLE trades ALTER COLUMN symbol ADD INDEX TYPE POSTING; ``` See [Indexes](/docs/concepts/deep-dive/indexes/) for more information. diff --git a/documentation/configuration/configuration-utils/_cairo.config.json b/documentation/configuration/configuration-utils/_cairo.config.json index 9fd0f8689..64f928008 100644 --- a/documentation/configuration/configuration-utils/_cairo.config.json +++ b/documentation/configuration/configuration-utils/_cairo.config.json @@ -81,7 +81,15 @@ }, "cairo.index.value.block.size": { "default": "256", - "description": "Approximation of number of rows for a single index key, must be power of 2." + "description": "Approximation of number of rows for a single index key, must be power of 2. Applies to bitmap indexes only; posting indexes manage their own block layout." + }, + "cairo.posting.index.auto.include.timestamp": { + "default": "true", + "description": "When `true`, the designated timestamp column is automatically added to the covering index when a [posting index](/docs/concepts/deep-dive/posting-index/) is created with an `INCLUDE` clause." 
+ }, + "cairo.posting.index.row.id.encoding": { + "default": "posting", + "description": "Default row ID encoding for posting indexes. Valid values: `posting` (adaptive delta/flat trial encoding) and `posting_delta` (delta-only encoding)." }, "cairo.max.swap.file.count": { "default": "30", @@ -105,7 +113,7 @@ }, "cairo.spin.lock.timeout": { "default": "1000", - "description": "Timeout when attempting to get BitmapIndexReaders in millisecond." + "description": "Timeout in milliseconds when attempting to acquire index readers (bitmap and posting)." }, "cairo.character.store.capacity": { "default": "1024", diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md index d8b5d2b27..d866e788b 100644 --- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md +++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md @@ -12,6 +12,7 @@ query performance for filtered lookups. ``` ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX [ CAPACITY n ] +ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX TYPE POSTING ``` ## Parameters @@ -20,7 +21,8 @@ ALTER MATERIALIZED VIEW viewName ALTER COLUMN columnName ADD INDEX [ CAPACITY n | --------- | ----------- | | `viewName` | Name of the materialized view | | `columnName` | Name of the `SYMBOL` column to index | -| `CAPACITY` | Optional index capacity (advanced; use default unless you understand implications) | +| `CAPACITY` | Optional index capacity for bitmap indexes (advanced; use default unless you understand implications) | +| `TYPE POSTING` | Use a [posting index](/docs/concepts/deep-dive/posting-index/) instead of the default bitmap index | ## When to use @@ -30,13 +32,29 @@ Add an index when: - The column has high cardinality (many distinct values) - Query performance on the materialized view needs improvement -## Example +## Examples -```questdb-sql title="Add index to symbol column" 
+### Adding a bitmap index (default) + +```questdb-sql title="Add bitmap index to symbol column" ALTER MATERIALIZED VIEW trades_hourly ALTER COLUMN symbol ADD INDEX; ``` +### Adding a posting index + +```questdb-sql title="Add posting index to symbol column" +ALTER MATERIALIZED VIEW trades_hourly + ALTER COLUMN symbol ADD INDEX TYPE POSTING; +``` + +:::note + +The `INCLUDE` clause for covering indexes is not supported on materialized +views. Use a posting index without `INCLUDE` for faster filtered lookups. + +::: + ## Behavior | Aspect | Description | diff --git a/documentation/query/sql/explain.md b/documentation/query/sql/explain.md index d3858d806..cabe99f72 100644 --- a/documentation/query/sql/explain.md +++ b/documentation/query/sql/explain.md @@ -76,6 +76,12 @@ The following list contains some plan node types: `INTERSECT`). - `Index forward/backward scan` - scans all row ids associated with a given `symbol` value from start to finish or vice versa. +- `CoveringIndex` - reads data from a + [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar + files instead of main column files. Appears when all selected columns are + covered by the `INCLUDE` clause. +- `PostingIndex` - uses a posting index for accelerated operations such as + `DISTINCT` on a symbol column. - `Limit` - standalone node implementing the `LIMIT` keyword. Other nodes can implement `LIMIT` internally, e.g. the `Sort` node. - `Row forward/backward scan` - scans data frame (usually partitioned) records From 630384d8ed85cf5485443b06dbb7747aa9e585c5 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:41:31 +0100 Subject: [PATCH 04/12] Clarify encoding trade-offs: delta vs adaptive modes Delta encoding compresses best for regular, evenly-distributed data and is faster for large scans. 
The adaptive (default) mode additionally trial-encodes a flat layout that compresses better for irregular distributions and is faster for point queries. Co-Authored-By: Claude Opus 4.6 --- .../concepts/deep-dive/posting-index.md | 47 +++++++++++++------ 1 file changed, 33 insertions(+), 14 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 55309dc82..9c13396d0 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -102,21 +102,36 @@ ALTER TABLE trades ### Encoding options -The posting index supports two internal row ID encoding strategies. In most -cases the default is optimal and no keyword is needed: - -| Syntax | Encoding | Description | -|--------|----------|-------------| -| `INDEX TYPE POSTING` | Adaptive (default) | Trial-encodes delta and flat modes per stride, picks the smaller | -| `INDEX TYPE POSTING EF` | Adaptive (explicit) | Same as above — `EF` makes the choice explicit | -| `INDEX TYPE POSTING DELTA` | Delta-only | Forces per-key delta encoding, skipping flat-mode trial | +The posting index supports two row ID encoding strategies with different +performance characteristics: + +| Syntax | Encoding | Best for | +|--------|----------|----------| +| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both modes per stride, picks the smaller | +| `INDEX TYPE POSTING DELTA` | Delta-only | Regular, evenly-distributed data — faster large scans | + +**Delta encoding** stores per-key deltas between consecutive row IDs with +Frame-of-Reference bitpacking. It compresses best when row IDs for each +symbol key are evenly spaced (e.g. round-robin or time-ordered ingestion +of a fixed set of symbols) and is faster for queries that scan large +ranges of matching rows. + +The **adaptive (default)** encoding additionally trial-encodes a +stride-wide flat layout and picks whichever is smaller. 
This mode +compresses better for irregular data distributions (e.g. bursty or +skewed symbol frequencies) and produces a layout that is faster for +point queries and selective lookups. + +For most workloads the default adaptive encoding is the best choice. +Use `DELTA` only when you know your data arrives in a regular pattern +and your queries predominantly scan large result sets. ```questdb-sql --- Default adaptive encoding (recommended) +-- Default adaptive encoding (recommended for most workloads) CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING) TIMESTAMP(ts) PARTITION BY DAY WAL; --- Force delta-only encoding +-- Delta-only encoding (regular data, large scans) CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) TIMESTAMP(ts) PARTITION BY DAY WAL; ``` @@ -379,13 +394,17 @@ generations are **sealed** into a single dense generation with stride-indexed layout for optimal read performance. Sealing happens automatically when the generation count reaches the maximum -(125) or when the partition is closed. Sealed data uses two encoding modes -per stride (256 keys): +(125) or when the partition is closed. With the default adaptive encoding, +sealed data uses two encoding modes per stride (256 keys): -- **Delta mode**: per-key delta encoding with bitpacking -- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking +- **Delta mode**: per-key delta encoding with bitpacking — compresses best + for regular, evenly-distributed row IDs and is faster for large scans +- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking — + compresses better for irregular distributions and is faster for point + queries The encoder trial-encodes both modes and picks the smaller one per stride. +With `POSTING DELTA`, only delta mode is used. 
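Both modes described above rest on Frame-of-Reference bitpacking. As a minimal sketch, assuming a toy layout of (reference, bit width, count, packed bits) that is not QuestDB's on-disk format, the idea is to subtract the smallest value and store only as many bits per entry as the largest remainder needs. The function names are illustrative.

```python
# Illustrative Frame-of-Reference bitpacking; the layout is assumed, not QuestDB's.

def for_pack(values):
    """Pack a non-empty list of ints as (reference, bit width, count, packed bits)."""
    ref = min(values)
    width = (max(values) - ref).bit_length()  # 0 when all values are equal
    packed = 0
    for i, v in enumerate(values):
        packed |= (v - ref) << (i * width)
    return ref, width, len(values), packed


def for_unpack(ref, width, count, packed):
    """Reverse for_pack: extract each fixed-width slot and add the reference back."""
    mask = (1 << width) - 1
    return [ref + ((packed >> (i * width)) & mask) for i in range(count)]
```

When every value in a block is identical, the width is zero and the block costs only its header, which is why constant deltas compress so well in this scheme.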
### FSST compression for strings From e187aac19b90b83cffb83097ddf1defe3d1bc7a3 Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Fri, 10 Apr 2026 14:42:37 +0100 Subject: [PATCH 05/12] Add EF encoding as distinct option, clarify all three modes - POSTING DELTA: regular data, better compression for even distributions, faster for large sequential scans - POSTING EF: Elias-Fano encoding, better compression for irregular distributions, faster for point queries - POSTING (default): adaptive, trial-encodes both per stride, picks smaller Co-Authored-By: Claude Opus 4.6 --- .../concepts/deep-dive/posting-index.md | 54 ++++++++++--------- 1 file changed, 30 insertions(+), 24 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index 9c13396d0..e6d53d900 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -102,13 +102,14 @@ ALTER TABLE trades ### Encoding options -The posting index supports two row ID encoding strategies with different -performance characteristics: +The posting index supports three row ID encoding options with different +compression and query performance characteristics: | Syntax | Encoding | Best for | |--------|----------|----------| -| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both modes per stride, picks the smaller | -| `INDEX TYPE POSTING DELTA` | Delta-only | Regular, evenly-distributed data — faster large scans | +| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both EF and delta per stride, picks the smaller | +| `INDEX TYPE POSTING EF` | Elias-Fano | Irregular data distributions, point queries and selective lookups | +| `INDEX TYPE POSTING DELTA` | Delta | Regular, evenly-distributed data, large sequential scans | **Delta encoding** stores per-key deltas between consecutive row IDs with 
Frame-of-Reference bitpacking. It compresses best when row IDs for each @@ -116,23 +117,27 @@ symbol key are evenly spaced (e.g. round-robin or time-ordered ingestion of a fixed set of symbols) and is faster for queries that scan large ranges of matching rows. -The **adaptive (default)** encoding additionally trial-encodes a -stride-wide flat layout and picks whichever is smaller. This mode -compresses better for irregular data distributions (e.g. bursty or -skewed symbol frequencies) and produces a layout that is faster for -point queries and selective lookups. +**Elias-Fano (EF) encoding** uses a stride-wide flat layout with +Frame-of-Reference bitpacking across all keys in a stride. It compresses +better for irregular data distributions (e.g. bursty or skewed symbol +frequencies) and is faster for point queries and selective lookups. -For most workloads the default adaptive encoding is the best choice. -Use `DELTA` only when you know your data arrives in a regular pattern -and your queries predominantly scan large result sets. +The **adaptive (default)** encoding trial-encodes both EF and delta modes +per stride and picks whichever produces the smaller output. This is the +best choice when you are unsure about your data distribution or have a +mixed query workload. ```questdb-sql -- Default adaptive encoding (recommended for most workloads) CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING) TIMESTAMP(ts) PARTITION BY DAY WAL; +-- EF encoding (irregular data, point queries) +CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING EF) + TIMESTAMP(ts) PARTITION BY DAY WAL; + -- Delta-only encoding (regular data, large scans) -CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) +CREATE TABLE t3 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA) TIMESTAMP(ts) PARTITION BY DAY WAL; ``` @@ -394,17 +399,18 @@ generations are **sealed** into a single dense generation with stride-indexed layout for optimal read performance. 
Sealing happens automatically when the generation count reaches the maximum -(125) or when the partition is closed. With the default adaptive encoding, -sealed data uses two encoding modes per stride (256 keys): - -- **Delta mode**: per-key delta encoding with bitpacking — compresses best - for regular, evenly-distributed row IDs and is faster for large scans -- **Flat mode**: stride-wide Frame-of-Reference with contiguous bitpacking — - compresses better for irregular distributions and is faster for point - queries - -The encoder trial-encodes both modes and picks the smaller one per stride. -With `POSTING DELTA`, only delta mode is used. +(125) or when the partition is closed. Sealed data uses two encoding modes +per stride (256 keys): + +- **Delta mode** (`POSTING DELTA`): per-key delta encoding with bitpacking — + compresses best for regular, evenly-distributed row IDs and is faster for + large sequential scans +- **Elias-Fano mode** (`POSTING EF`): stride-wide Frame-of-Reference with + contiguous bitpacking — compresses better for irregular distributions and + is faster for point queries + +With the default adaptive encoding (`POSTING`), the encoder trial-encodes +both modes per stride and picks the smaller one. ### FSST compression for strings From 96b89305abb0cf47747cc1ffb3d81733ae13000f Mon Sep 17 00:00:00 2001 From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com> Date: Tue, 5 May 2026 08:40:52 +0100 Subject: [PATCH 06/12] Correct posting/covering index facts verified against live instance MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit posting-index.md: - Auto-include of designated timestamp applies to any posting index, not only when an INCLUDE clause is present (verified in source and EXPLAIN). Document SHOW CREATE TABLE round-trip with the expanded list. - Note that bare INDEX INCLUDE (...) auto-promotes to POSTING. 
- Replace "max 125 generations" with the actual seal threshold of 16 (cairo.posting.seal.gen.threshold). - Distinguish the two seal-time sub-layouts (Delta sub-layout and Flat sub-layout, both internal to delta+FoR) from the SQL DELTA / EF encoding variants — they were previously conflated. - Note the native AVX2 fast path for 8/16/32-bit widths. - Bitmap storage size: ~15 B/value (PR benchmark figure) instead of the older 8-16 B/value range. - Write-perf comparison baselined against bitmap (~9% slower for the index path itself) instead of vs. no-index. - FSST symbol table is ~2.3 KB and L1-resident; drop the unverified ~70 KB per-reader figure. - Generalise the SAMPLE BY limitation: covering needs a filter on the indexed symbol, otherwise unfiltered LATEST ON / SAMPLE BY / GROUP BY fall back to a regular page-frame scan. - Refresh EXPLAIN snippets to match real output: IN-list filter rendering, LATEST ON without SelectedRecord wrapper, DISTINCT as PostingIndex op: distinct, Async Filter layered on top of CoveringIndex for AND filters on covered columns. - Tighten architecture: .pv encoding depends on variant (delta+FoR or EF); .pcN sidecars carry txn-segment suffixes on disk and the auto-included timestamp gets its own sidecar. _cairo.config.json: - cairo.posting.index.row.id.encoding: default is `adaptive`, valid values are `adaptive`, `delta`, `ef` (not the previous `posting`/`posting_delta`). - cairo.posting.index.auto.include.timestamp: clarify that it applies to any posting index, including bare INDEX TYPE POSTING. - Add cairo.posting.seal.gen.threshold (default 16). 
Co-Authored-By: Claude Opus 4.7 (1M context) --- .../concepts/deep-dive/posting-index.md | 173 ++++++++++++------ .../configuration-utils/_cairo.config.json | 10 +- 2 files changed, 129 insertions(+), 54 deletions(-) diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md index e6d53d900..653b136bc 100644 --- a/documentation/concepts/deep-dive/posting-index.md +++ b/documentation/concepts/deep-dive/posting-index.md @@ -79,9 +79,13 @@ can be served entirely from the index. :::tip The designated timestamp column is automatically included in the covering -index when an `INCLUDE` clause is present — you do not need to list it -explicitly. This means timestamp-filtered covering queries work out of the -box. +index — even when no explicit `INCLUDE` clause is given. So a bare +`INDEX TYPE POSTING` already covers `SELECT timestamp, sym FROM t WHERE +sym = 'X'`. The expanded list is what `SHOW CREATE TABLE` round-trips, so +`INCLUDE (exchange, price)` renders back as +`INCLUDE (exchange, price, timestamp)` after creation. Controlled by the +`cairo.posting.index.auto.include.timestamp` server property +(default `true`). ::: @@ -91,6 +95,10 @@ The `INCLUDE` clause is only supported with inline column syntax and `ALTER TABLE`. The out-of-line `INDEX(col TYPE POSTING)` syntax does not support `INCLUDE`. +Writing `INDEX INCLUDE (...)` (no explicit `TYPE`) is also accepted and +implicitly creates a posting index — `INCLUDE` is only valid with +`POSTING`, so the parser promotes the type for you. + ::: ### On an existing table @@ -243,22 +251,58 @@ SelectedRecord filter: symbol='AAPL' ``` +`IN`-list filters render as `filter: symbol IN ['AAPL','GOOGL','MSFT']`. 
+`LATEST ON` queries that hit the covering path show an `op: latest` +annotation and have no `SelectedRecord` wrapper: + +``` +CoveringIndex op: latest on: symbol with: timestamp, price + filter: symbol='AAPL' +``` + +`SELECT DISTINCT` does not need to read covered values, so it shows up as +`PostingIndex op: distinct` rather than `CoveringIndex`: + +``` +PostingIndex op: distinct on: symbol + Frame forward scan on: trades +``` + +When you add a filter on a covered column, an `Async Filter` is layered +above the covering index — the predicate values are read from the sidecar, +not the column file: + +``` +SelectedRecord + Async Filter workers: N + filter: 100 Date: Tue, 5 May 2026 08:47:57 +0100 Subject: [PATCH 07/12] Refine posting/covering index facts after deeper source review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit posting-index.md: - Encoding-options: Elias-Fano is per-key Elias-Fano coding (low/high bit split), not "stride-wide flat layout with FoR". Rewrite the EF description to match the actual algorithm and recharacterise the three SQL variants as choices that pick the per-key encoding the writer uses, with the explicit DELTA/EF variants positioned for benchmarking rather than tied to vague data-distribution claims. - INCLUDE type table: BOOLEAN/BYTE/etc. use Frame-of-Reference bitpacking, not raw copies. Split FLOAT/DOUBLE into their own rows (both ALP) and TIMESTAMP into its own row (linear-prediction + FoR). BINARY/arrays are length-prefixed raw bytes, not "offset-based sidecar". - Trade-offs storage section: same correction — small fixed-width types use FoR bitpacking; only UUID / LONG256 / DECIMAL128/256 are raw copies. - SHOW COLUMNS example: column order now matches live output (indexType / indexInclude come last, after upsertKey), and adds the symbolTableSize column. The indexInclude value shows exchange,price,timestamp to reflect auto-include of the timestamp. 

meta.md:
- table_columns(): description list adds symbolTableSize and reorders
  indexType / indexInclude to the end (where they actually appear).
- Example table column order matches live output and includes
  symbolTableSize.

show.md:
- SHOW COLUMNS example: column order corrected (indexType / indexInclude
  at end, symbolCached / symbolCapacity / symbolTableSize before
  designated). Mention POSTING DELTA / POSTING EF as possible indexType
  values.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../concepts/deep-dive/posting-index.md | 114 ++++++++++--------
 documentation/query/functions/meta.md | 22 ++--
 documentation/query/sql/show.md | 25 ++--
 3 files changed, 88 insertions(+), 73 deletions(-)

diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md
index 653b136bc..4d44e1246 100644
--- a/documentation/concepts/deep-dive/posting-index.md
+++ b/documentation/concepts/deep-dive/posting-index.md
@@ -113,27 +113,30 @@ ALTER TABLE trades
 The posting index supports three row ID encoding options with different
 compression and query performance characteristics:
 
-| Syntax | Encoding | Best for |
-|--------|----------|----------|
-| `INDEX TYPE POSTING` | Adaptive (default) | General purpose — trial-encodes both EF and delta per stride, picks the smaller |
-| `INDEX TYPE POSTING EF` | Elias-Fano | Irregular data distributions, point queries and selective lookups |
-| `INDEX TYPE POSTING DELTA` | Delta | Regular, evenly-distributed data, large sequential scans |
-
-**Delta encoding** stores per-key deltas between consecutive row IDs with
-Frame-of-Reference bitpacking. It compresses best when row IDs for each
-symbol key are evenly spaced (e.g. round-robin or time-ordered ingestion
-of a fixed set of symbols) and is faster for queries that scan large
-ranges of matching rows.
-
-**Elias-Fano (EF) encoding** uses a stride-wide flat layout with
-Frame-of-Reference bitpacking across all keys in a stride. It compresses
-better for irregular data distributions (e.g. bursty or skewed symbol
-frequencies) and is faster for point queries and selective lookups.
-
-The **adaptive (default)** encoding trial-encodes both EF and delta modes
-per stride and picks whichever produces the smaller output. This is the
-best choice when you are unsure about your data distribution or have a
-mixed query workload.
+| Syntax | Encoding | Notes |
+|--------|----------|-------|
+| `INDEX TYPE POSTING` | Adaptive (default) | Trials delta + Frame-of-Reference and Elias-Fano per key per stride and keeps the smaller output |
+| `INDEX TYPE POSTING EF` | Elias-Fano only | Forces Elias-Fano even when delta + FoR would be smaller — useful for benchmarking |
+| `INDEX TYPE POSTING DELTA` | Delta + Frame-of-Reference only | Forces delta + FoR even when Elias-Fano would be smaller — useful for benchmarking |
+
+**Delta + Frame-of-Reference encoding** stores each key's row IDs as
+per-key deltas, split into blocks of 64 with per-block Frame-of-Reference
+bitpacking. Round-robin or periodic distributions produce constant
+deltas (bitwidth 0), so this mode compresses them to near-zero. The
+trade-off is a per-key block-header overhead that hurts low-cardinality
+keys.
+
+**Elias-Fano (EF) encoding** is a classic monotonic-sequence encoding:
+each key's sorted row IDs are split into low and high bit halves, with
+the high half stored as a unary-coded bit array and the low half as a
+fixed-width packed array. This typically produces denser output for
+keys with few values per stride and avoids the block-header overhead.
+
+The **adaptive (default)** encoding trial-encodes each key with both
+delta + Frame-of-Reference and Elias-Fano per stride and picks whichever
+produces the smaller output. This is the right choice for almost all
+workloads — the explicit `DELTA` / `EF` variants exist mainly for
+benchmarking.
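Reviewer aside, not part of the patch: the two per-key encodings described in the added paragraphs can be sketched in a few lines of Python. This is only an illustration of the textbook techniques under simplifying assumptions (one key, one block, bits kept in Python lists); the function names are hypothetical and nothing here reflects QuestDB's actual on-disk layout or block headers.

```python
def ef_encode(row_ids, universe):
    """Textbook Elias-Fano: fixed-width low bits plus a unary-coded high part.

    row_ids must be sorted and every value must be < universe.
    """
    n = len(row_ids)
    # Number of low bits per value, roughly log2(universe / n).
    low_width = max(0, (universe // n).bit_length() - 1)
    lows = [x & ((1 << low_width) - 1) for x in row_ids]
    # Element i sets the bit at position (high part + i).
    high_bits = [0] * (n + (universe >> low_width) + 1)
    for i, x in enumerate(row_ids):
        high_bits[(x >> low_width) + i] = 1
    return low_width, lows, high_bits


def ef_decode(low_width, lows, high_bits):
    """Invert ef_encode by walking the set bits of the high part."""
    out = []
    i = 0  # index of the next element to emit
    for pos, bit in enumerate(high_bits):
        if bit:
            out.append(((pos - i) << low_width) | lows[i])
            i += 1
    return out


def delta_for_bit_width(row_ids):
    """Bits per delta after subtracting the smallest delta (the reference)."""
    deltas = [b - a for a, b in zip(row_ids, row_ids[1:])]
    if not deltas:
        return 0
    return (max(deltas) - min(deltas)).bit_length()


irregular = [3, 4, 7, 13, 14, 15, 21, 43]
round_robin = list(range(5, 1000, 100))  # one row ID every 100 rows

encoded = ef_encode(irregular, universe=64)
roundtrip = ef_decode(*encoded)
constant_delta_width = delta_for_bit_width(round_robin)
```

With evenly spaced row IDs every delta equals the reference, so the packed width collapses to zero bits, which is the "bitwidth 0" case the patch text mentions. An adaptive writer would compare the encoded sizes of both forms per key and keep the smaller one; the header costs that feed into that comparison are not modelled here.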
 
 ```questdb-sql
 -- Default adaptive encoding (recommended for most workloads)
@@ -174,16 +177,19 @@ All column types except the indexed symbol column itself can be included:
 
 | Type | Compression | Notes |
 |------|-------------|-------|
-| BOOLEAN, BYTE, GEOBYTE, DECIMAL8 | Raw copy | 1 byte per value |
-| SHORT, CHAR, GEOSHORT, DECIMAL16 | Frame-of-Reference | 2 bytes uncompressed |
-| INT, FLOAT, IPv4, GEOINT, DECIMAL32 | FoR (int) / ALP (float) | 4 bytes uncompressed |
-| LONG, DOUBLE, TIMESTAMP, DATE, GEOLONG, DECIMAL64 | FoR / ALP / linear prediction | 8 bytes uncompressed |
-| SYMBOL | Frame-of-Reference | Stored as integer key, resolved at query time |
+| BOOLEAN, BYTE, GEOBYTE, DECIMAL8 | Frame-of-Reference bitpacking | ≤1 byte per value (worst case) |
+| SHORT, CHAR, GEOSHORT, DECIMAL16 | Frame-of-Reference bitpacking | ≤2 bytes per value |
+| INT, IPv4, GEOINT, DECIMAL32 | Frame-of-Reference bitpacking | ≤4 bytes per value |
+| FLOAT | ALP (Adaptive Lossless floating-Point) | Lossless float compression |
+| LONG, DATE, GEOLONG, DECIMAL64 | Frame-of-Reference bitpacking | ≤8 bytes per value |
+| TIMESTAMP | Linear-prediction + Frame-of-Reference | Designed for monotonic timestamps |
+| DOUBLE | ALP (Adaptive Lossless floating-Point) | Lossless float compression |
+| SYMBOL | Frame-of-Reference bitpacking | Stored as integer key, resolved at query time |
 | UUID, DECIMAL128 | Raw copy | 16 bytes per value |
 | LONG256, DECIMAL256 | Raw copy | 32 bytes per value |
-| VARCHAR, STRING | FSST compressed | Variable-width, typically 2-5x compression |
-| BINARY | Variable-width sidecar | Stored in offset-based format |
-| Arrays (DOUBLE[], INT[], etc.) | Variable-width sidecar | Stored in offset-based format |
+| VARCHAR, STRING | FSST compressed (≥4 KB strides) | Typically 2–5× compression on repetitive text |
+| BINARY | Length-prefixed raw bytes | Variable-width, no compression |
+| Arrays (DOUBLE[], INT[], etc.) | Length-prefixed raw bytes | Variable-width, no compression |
 
 ### How to choose INCLUDE columns
 
@@ -224,16 +230,17 @@ type and covered columns:
 SHOW COLUMNS FROM trades;
 ```
 
-| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey |
-|--------|------|---------|-------------------|-----------|-------------|-------------|----------------|------------|-----------|
-| timestamp | TIMESTAMP | false | 0 | | | false | 0 | true | false |
-| symbol | SYMBOL | true | 256 | POSTING | exchange,price | true | 128 | false | false |
-| exchange | SYMBOL | false | 0 | | | true | 128 | false | false |
-| price | DOUBLE | false | 0 | | | false | 0 | false | false |
-| quantity | DOUBLE | false | 0 | | | false | 0 | false | false |
+| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | indexType | indexInclude |
+|-----------|-----------|---------|--------------------|--------------|----------------|-----------------|------------|-----------|-----------|---------------------------|
+| timestamp | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | | |
+| symbol | SYMBOL | true | 256 | true | 256 | 0 | false | false | POSTING | exchange,price,timestamp |
+| exchange | SYMBOL | false | 256 | true | 256 | 0 | false | false | | |
+| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | | |
+| quantity | DOUBLE | false | 0 | false | 0 | 0 | false | false | | |
 
-The `indexType` column shows `POSTING`, `BITMAP`, or is empty for
-non-indexed columns. The `indexInclude` column lists covered column names.
+The `indexType` column shows `POSTING`, `POSTING DELTA`, `POSTING EF`,
+`BITMAP`, or is empty for non-indexed columns. The `indexInclude` column
+lists covered column names — note the auto-included designated timestamp.
 
 ### Verifying covering index usage
 
@@ -393,19 +400,24 @@ SELECT /*+ no_index */ price FROM trades WHERE symbol = 'AAPL';
 
 ### Storage
 
-The posting index itself is very compact (~1 byte per indexed value).
-The covering sidecar adds storage proportional to the included columns:
-
-- **Numeric columns** (DOUBLE, FLOAT): compressed with ALP (Adaptive
-  Lossless floating-Point) and Frame-of-Reference bitpacking
-- **Integer columns** (INT, LONG, etc.): Frame-of-Reference bitpacking;
-  TIMESTAMP additionally uses linear-prediction encoding
-- **Small fixed-width types** (BYTE, BOOLEAN, etc.): stored as raw copies
-- **Wide fixed-width types** (UUID, LONG256, DECIMAL128/256): stored as
-  raw copies with a count header
-- **Variable-width columns** (VARCHAR, STRING): FSST compressed in sealed
-  partitions, typically 2-5x smaller than raw column data
-- **BINARY and arrays**: stored in an offset-based variable-width sidecar
+The posting index itself is very compact (~1 byte per indexed value, vs.
+~15 bytes per value for the bitmap index). The covering sidecar adds
+storage proportional to the included columns:
+
+- **DOUBLE, FLOAT**: ALP (Adaptive Lossless floating-Point), backed by
+  Frame-of-Reference bitpacking with an exception list for outliers.
+- **TIMESTAMP**: linear-prediction header with Frame-of-Reference residual
+  bitpacking — designed for monotonic timestamp data.
+- **Other fixed-width integer types** (BOOLEAN, BYTE, SHORT, CHAR, INT,
+  LONG, DATE, IPv4, GEO\*, DECIMAL8–DECIMAL64, SYMBOL keys):
+  Frame-of-Reference bitpacking sized to the column's natural width, so
+  the worst case is the column-file byte size and typical case is much
+  smaller.
+- **UUID, LONG256, DECIMAL128, DECIMAL256**: stored raw at full width
+  with a small count header.
+- **VARCHAR, STRING**: FSST-compressed once a stride exceeds 4 KB of raw
+  data; typically 2–5× smaller than the column file.
+- **BINARY and arrays**: length-prefixed raw bytes (no compression).
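Reviewer aside, not part of the patch: the Frame-of-Reference bitpacking named in the storage list above can be illustrated with a minimal Python sketch. It assumes a single run of integer values packed into one Python integer; `for_pack` and `for_unpack` are hypothetical names, and the real sidecar format (headers, strides, exception lists) is not modelled.

```python
def for_pack(values):
    """Store the minimum as the reference, then pack each offset in `width` bits."""
    ref = min(values)
    width = (max(values) - ref).bit_length()
    packed = 0
    for i, v in enumerate(values):
        packed |= (v - ref) << (i * width)
    return ref, width, len(values), packed


def for_unpack(ref, width, count, packed):
    """Recover the original values from the reference and the packed offsets."""
    mask = (1 << width) - 1
    return [ref + ((packed >> (i * width)) & mask) for i in range(count)]


prices_in_cents = [1000, 1003, 1001, 1007]  # e.g. an INT column, 4 bytes each raw
ref, width, count, packed = for_pack(prices_in_cents)
payload_bytes = (count * width + 7) // 8     # packed offsets only, headers excluded
restored = for_unpack(ref, width, count, packed)
```

The width depends only on the spread between the smallest and largest value, which is why the patch describes the column's natural width as the worst case and expects the typical case to be much smaller.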
 
 ### Write performance
 
diff --git a/documentation/query/functions/meta.md b/documentation/query/functions/meta.md
index d978206ab..e02f30f6b 100644
--- a/documentation/query/functions/meta.md
+++ b/documentation/query/functions/meta.md
@@ -356,17 +356,19 @@ Returns a `table` with the following columns:
 - `indexed` - if indexing is applied to this column
 - `indexBlockCapacity` - how many row IDs to store in a single storage block on
   disk (bitmap indexes only)
-- `indexType` - the [index type](/docs/concepts/deep-dive/indexes/)
-  (`POSTING`, `POSTING DELTA`, `POSTING EF`, `BITMAP`, or empty)
-- `indexInclude` - comma-separated names of columns included in a
-  [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar
 - `symbolCached` - whether this `symbol` column is cached
 - `symbolCapacity` - how many distinct values this column of `symbol` type is
   expected to have
+- `symbolTableSize` - current number of distinct values stored in this
+  `symbol` column's table
 - `designated` - if this is set as the designated timestamp column for this
   table
 - `upsertKey` - if this column is a part of UPSERT KEYS list for table
   [deduplication](/docs/concepts/deduplication)
+- `indexType` - the [index type](/docs/concepts/deep-dive/indexes/)
+  (`POSTING`, `POSTING DELTA`, `POSTING EF`, `BITMAP`, or empty)
+- `indexInclude` - comma-separated names of columns included in a
+  [posting index's](/docs/concepts/deep-dive/posting-index/) covering sidecar
 
 For more details on the meaning and use of these values, see the
 [CREATE TABLE](/docs/query/sql/create-table/) documentation.
@@ -377,12 +379,12 @@ For more details on the meaning and use of these values, see the
 table_columns('my_table');
 ```
 
-| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | designated | upsertKey |
-| ------ | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | ---------- | --------- |
-| symb | SYMBOL | true | 1048576 | BITMAP | | false | 256 | false | false |
-| price | DOUBLE | false | 0 | | | false | 0 | false | false |
-| ts | TIMESTAMP | false | 0 | | | false | 0 | true | false |
-| s | VARCHAR | false | 0 | | | false | 0 | false | false |
+| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | indexType | indexInclude |
+| ------ | --------- | ------- | ------------------ | ------------ | -------------- | --------------- | ---------- | --------- | --------- | ------------ |
+| symb | SYMBOL | true | 1048576 | false | 256 | 0 | false | false | BITMAP | |
+| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | | |
+| ts | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | | |
+| s | VARCHAR | false | 0 | false | 0 | 0 | false | false | | |
 
 ```questdb-sql title="Get designated timestamp column"
 SELECT "column", type, designated FROM table_columns('my_table') WHERE designated = true;
diff --git a/documentation/query/sql/show.md b/documentation/query/sql/show.md
index c1d59d0ae..77d432c1f 100644
--- a/documentation/query/sql/show.md
+++ b/documentation/query/sql/show.md
@@ -71,18 +71,19 @@ SHOW TABLES;
 SHOW COLUMNS FROM trades;
 ```
 
-| column | type | indexed | indexBlockCapacity | indexType | indexInclude | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey |
-| --------- | --------- | ------- | ------------------ | --------- | ------------ | ------------ | -------------- | --------------- | ---------- | --------- |
-| symbol | SYMBOL | false | 0 | | | true | 256 | 42 | false | false |
-| side | SYMBOL | false | 0 | | | true | 256 | 2 | false | false |
-| price | DOUBLE | false | 0 | | | false | 0 | 0 | false | false |
-| amount | DOUBLE | false | 0 | | | false | 0 | 0 | false | false |
-| timestamp | TIMESTAMP | false | 0 | | | false | 0 | 0 | true | false |
-
-The `indexType` column shows the index type (`POSTING`, `BITMAP`, or empty for
-non-indexed columns). The `indexInclude` column lists the names of columns
-included in a [posting index's](/docs/concepts/deep-dive/posting-index/)
-covering sidecar, as a comma-separated string.
+| column | type | indexed | indexBlockCapacity | symbolCached | symbolCapacity | symbolTableSize | designated | upsertKey | indexType | indexInclude |
+| --------- | --------- | ------- | ------------------ | ------------ | -------------- | --------------- | ---------- | --------- | --------- | ------------ |
+| symbol | SYMBOL | false | 0 | true | 256 | 42 | false | false | | |
+| side | SYMBOL | false | 0 | true | 256 | 2 | false | false | | |
+| price | DOUBLE | false | 0 | false | 0 | 0 | false | false | | |
+| amount | DOUBLE | false | 0 | false | 0 | 0 | false | false | | |
+| timestamp | TIMESTAMP | false | 0 | false | 0 | 0 | true | false | | |
+
+The `indexType` column shows the index type (`POSTING`, `POSTING DELTA`,
+`POSTING EF`, `BITMAP`, or empty for non-indexed columns). The
+`indexInclude` column lists the names of columns included in a
+[posting index's](/docs/concepts/deep-dive/posting-index/) covering
+sidecar, as a comma-separated string.
 
 ### SHOW CREATE TABLE
 
From e18de5694d2912c62e3c4e0797432b929cb33be2 Mon Sep 17 00:00:00 2001
From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com>
Date: Tue, 5 May 2026 08:51:25 +0100
Subject: [PATCH 08/12] Drop speculative data-distribution claims from encoding example block
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The table above the example block was already corrected to drop the
unverified "irregular data, point queries" / "regular data, large scans"
claims about EF and DELTA. Update the example block's inline comments to
match — both explicit variants are positioned as benchmarking-only.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 documentation/concepts/deep-dive/posting-index.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md
index 4d44e1246..539ba4077 100644
--- a/documentation/concepts/deep-dive/posting-index.md
+++ b/documentation/concepts/deep-dive/posting-index.md
@@ -143,11 +143,11 @@ benchmarking.
 CREATE TABLE t1 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING)
     TIMESTAMP(ts) PARTITION BY DAY WAL;
 
--- EF encoding (irregular data, point queries)
+-- Force Elias-Fano only (benchmarking)
 CREATE TABLE t2 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING EF)
     TIMESTAMP(ts) PARTITION BY DAY WAL;
 
--- Delta-only encoding (regular data, large scans)
+-- Force delta + Frame-of-Reference only (benchmarking)
 CREATE TABLE t3 (ts TIMESTAMP, s SYMBOL INDEX TYPE POSTING DELTA)
     TIMESTAMP(ts) PARTITION BY DAY WAL;
 ```
From a3f9cfb770de39d40d7a3ffe3c94e5f05f00c838 Mon Sep 17 00:00:00 2001
From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com>
Date: Tue, 5 May 2026 08:54:01 +0100
Subject: [PATCH 09/12] Clarify auto-include of timestamp on ALTER ADD INDEX
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

alter-table-alter-column-add-index.md:
- State explicitly that bare ALTER ... ADD INDEX TYPE POSTING (no INCLUDE
  clause) already covers timestamp + symbol queries because the
  designated timestamp is auto-included.
- Add the EF variant alongside DELTA in the encoding-variant example.

alter-mat-view-alter-column-add-index.md:
- Replace the "INCLUDE not supported, use posting without INCLUDE" note
  with a more accurate explanation: the parser rejects an explicit
  INCLUDE clause on materialized views, but the view's designated
  timestamp is still auto-added, so the bare form produces a covering
  index over timestamp. Verified live via table_columns().

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .../alter-mat-view-alter-column-add-index.md | 8 +++++--
 .../sql/alter-table-alter-column-add-index.md | 21 ++++++++++++-------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/documentation/query/sql/alter-mat-view-alter-column-add-index.md b/documentation/query/sql/alter-mat-view-alter-column-add-index.md
index d866e788b..eb98d3629 100644
--- a/documentation/query/sql/alter-mat-view-alter-column-add-index.md
+++ b/documentation/query/sql/alter-mat-view-alter-column-add-index.md
@@ -50,8 +50,12 @@ ALTER MATERIALIZED VIEW trades_hourly
 
 :::note
 
-The `INCLUDE` clause for covering indexes is not supported on materialized
-views. Use a posting index without `INCLUDE` for faster filtered lookups.
+An explicit `INCLUDE` clause for covering indexes is not currently
+accepted on materialized views — the parser rejects it. The view's
+designated timestamp is still auto-added, so `INDEX TYPE POSTING` on a
+view's symbol column produces a covering index over the timestamp,
+which is enough to accelerate `WHERE symbol = … LATEST ON ts` and
+similar timestamp-only covering queries.
 
 :::
 
diff --git a/documentation/query/sql/alter-table-alter-column-add-index.md b/documentation/query/sql/alter-table-alter-column-add-index.md
index 4c63811ca..2df5159f5 100644
--- a/documentation/query/sql/alter-table-alter-column-add-index.md
+++ b/documentation/query/sql/alter-table-alter-column-add-index.md
@@ -30,11 +30,18 @@ ALTER TABLE trades ALTER COLUMN side ADD INDEX;
 ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING;
 ```
 
-An encoding variant can be specified:
+The designated timestamp is auto-included as a covered column even when
+no explicit `INCLUDE` clause is given, so the bare form above already
+covers `SELECT timestamp, instrument FROM trades WHERE instrument = 'X'`.
+
+An encoding variant can also be forced:
 
 ```questdb-sql
--- Force delta-only encoding
+-- Force delta + Frame-of-Reference (benchmarking)
 ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING DELTA;
+
+-- Force Elias-Fano (benchmarking)
+ALTER TABLE trades ALTER COLUMN instrument ADD INDEX TYPE POSTING EF;
 ```
 
 ### Adding a posting index with covering columns
@@ -47,12 +54,10 @@ ALTER TABLE trades
     ALTER COLUMN symbol ADD INDEX TYPE POSTING INCLUDE (price, quantity);
 ```
 
-The designated timestamp column is automatically included in the covering
-index — you do not need to list it explicitly.
-
-After this, queries that only select columns from the `INCLUDE` list (plus the
-indexed symbol column and designated timestamp) are served from the index
-sidecar:
+The designated timestamp is appended to the `INCLUDE` list automatically.
+After this, queries that only select columns from the `INCLUDE` list (plus
+the indexed symbol column and designated timestamp) are served from the
+index sidecar:
 
 ```questdb-sql
 -- This query reads from the index sidecar, not from column files
From 3b0df234673bd5f79428d12b46f4e771644f89fd Mon Sep 17 00:00:00 2001
From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com>
Date: Tue, 5 May 2026 08:58:33 +0100
Subject: [PATCH 10/12] Distinguish filtered vs unfiltered LATEST ON in bitmap/posting comparison

Live verification: bitmap uses LatestByAllIndexed for unfiltered
LATEST ON (index-accelerated), while posting falls back to
LatestByDeferredListValuesFiltered for the unfiltered case. The
previous "LATEST ON | Yes | Yes" row hid this difference. Split into
two rows so readers see that bitmap retains the edge for unfiltered
LATEST ON, while posting wins on filtered LATEST ON via the covering
path.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 documentation/concepts/deep-dive/posting-index.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md
index 539ba4077..ee76081e4 100644
--- a/documentation/concepts/deep-dive/posting-index.md
+++ b/documentation/concepts/deep-dive/posting-index.md
@@ -300,7 +300,8 @@ doesn't filter on the indexed symbol.
 | Covering index (INCLUDE) | No | Yes |
 | DISTINCT acceleration | No | Yes |
 | Write overhead | Low | Low (without INCLUDE), moderate with INCLUDE |
-| LATEST ON optimization | Yes | Yes |
+| Filtered LATEST ON | Yes | Yes (covering) |
+| Unfiltered LATEST ON | Yes (`LatestByAllIndexed`) | Falls back to deferred-list scan |
 | `CAPACITY` clause | Yes | No (parse error) |
 | Syntax | `INDEX` or `INDEX TYPE BITMAP` | `INDEX TYPE POSTING` |
 
From 83f84f5e4ba6ccd80580485698dea727a4524522 Mon Sep 17 00:00:00 2001
From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com>
Date: Tue, 5 May 2026 09:01:44 +0100
Subject: [PATCH 11/12] Tighten .pci description and COUNT example comment

posting-index.md:
- .pci was described as "per-column header" but it's a single index-
  level header listing all covered columns by writer index (PCI1 magic,
  count, writerIndex array). Reword accordingly.
- COUNT example comment said "uses index" but the actual plan is Count
  over CoveringIndex with no column data read. Make the comment describe
  what the plan node actually says.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 documentation/concepts/deep-dive/posting-index.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/documentation/concepts/deep-dive/posting-index.md b/documentation/concepts/deep-dive/posting-index.md
index ee76081e4..cb4580d07 100644
--- a/documentation/concepts/deep-dive/posting-index.md
+++ b/documentation/concepts/deep-dive/posting-index.md
@@ -361,7 +361,7 @@ SELECT DISTINCT symbol FROM trades WHERE timestamp > '2024-01-01';
 ### COUNT queries
 
 ```questdb-sql
--- Uses index to scan only matching rows instead of full table
+-- Plan: Count over CoveringIndex, no column data read
 SELECT COUNT(*) FROM trades WHERE symbol = 'AAPL';
 ```
 
@@ -455,11 +455,12 @@ The posting index stores data in three file types per partition:
   Frame-of-Reference bitpacking or Elias-Fano (depending on the index's
   encoding variant), organised into stride-indexed generations.
 - **`.pci` + `.pc0`, `.pc1`, …** — Sidecar files: covered column values
-  stored alongside the posting list. `.pci` holds the per-column header
-  (including the `coverCount`); each `.pcN` (with txn-segment suffix on
-  disk, e.g. `s.pc0.0.0`) holds the encoded data for one `INCLUDE`
-  column. The auto-included designated timestamp counts as one of the
-  covered columns and gets its own `.pcN` file.
+  stored alongside the posting list. The single `.pci` header lists the
+  covered columns by writer index (`PCI1` magic, plus the `coverCount`
+  used by readers to size their sidecar mappings). Each `.pcN` (with
+  txn-segment suffix on disk, e.g. `s.pc0.0.0`) holds the encoded data
+  for one `INCLUDE` column. The auto-included designated timestamp
+  counts as one of the covered columns and gets its own `.pcN` file.
 
 ### Generations and sealing
 
From 12abb60d69a44e0d3ed6a86415fb0a40076d7eb7 Mon Sep 17 00:00:00 2001
From: Nick Woolmer <29717167+nwoolmer@users.noreply.github.com>
Date: Tue, 5 May 2026 09:53:24 +0100
Subject: [PATCH 12/12] Reflect auto-included timestamp in SHOW CREATE TABLE posting example

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 documentation/query/sql/show.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/documentation/query/sql/show.md b/documentation/query/sql/show.md
index 77d432c1f..b71252116 100644
--- a/documentation/query/sql/show.md
+++ b/documentation/query/sql/show.md
@@ -111,12 +111,15 @@ WITH maxUncommittedRows=500000, o3MaxLag=600000000us;
 #### Posting index with covering columns
 
 When a symbol column has a posting index with `INCLUDE`, the DDL reflects
-the index type and covered columns:
+the index type and covered columns. The designated timestamp is appended
+to the `INCLUDE` list automatically, so a table created with
+`INCLUDE (price, exchange)` round-trips as
+`INCLUDE (price, exchange, timestamp)`:
 
 ```questdb-sql
 CREATE TABLE trades (
-    symbol SYMBOL CAPACITY 128 CACHE INDEX TYPE POSTING INCLUDE (price, exchange),
-    exchange SYMBOL CAPACITY 128 CACHE,
+    symbol SYMBOL CAPACITY 256 CACHE INDEX TYPE POSTING INCLUDE (price, exchange, timestamp),
+    exchange SYMBOL CAPACITY 256 CACHE,
     price DOUBLE,
     amount DOUBLE,
     timestamp TIMESTAMP