[Feature] Support columnar-extend storage layout for MAP columns #342

@lxy-9602

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

In time-series, IoT, and observability workloads, a common pattern is to store schema-free fields in a MAP<STRING, T> column (e.g. metrics MAP<STRING, DOUBLE>). The default MAP storage (one key array and one value array) has the following limitations:

  • No per-key columnar access
  • No per-key statistics
  • No predicate pushdown on individual keys

As a result, a query like SELECT ext_map['usage'] FROM metrics WHERE ext_map['usage'] > 30 has to scan the entire MAP column, which is extremely inefficient when a query needs only 1–3 keys out of thousands.

The PIP "Columnar-Extend Storage Layout for MAP Columns" proposes a new extend storage layout that stores MAP values in K reusable physical columns within a Struct. This gives near-full columnar access with per-key statistics and predicate pushdown, without changing the logical type (MAP<STRING, T>).

Solution

Physical Layout

Each MAP<STRING, T> column marked with map-storage-layout = extend is physically stored as:

STRUCT<
  __field_mapping: FixedSizeList<Int32, K>,   -- per-row: which field_id each col holds
  __col_0: T, __col_1: T, ..., __col_{K-1}: T,  -- reusable typed columns
  __overflow: MAP<INT32, T>                    -- rare fallback for rows with > K fields
>

File metadata (footer) stores: field name↔id dictionary, field_id→physical column set S, overflow set O, K, and max row width.
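
To make the layout concrete, here is a small Python sketch of four rows in one file with K = 2, together with a footer that would be consistent with them. It is illustrative only: the `EMPTY = -1` sentinel for an unused slot, the `PhysicalRow` class, the footer key names, and the `to_logical` helper are all assumptions made for this example and are not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

EMPTY = -1  # assumed sentinel for an unused slot; the proposal does not name one


@dataclass
class PhysicalRow:
    field_mapping: List[int]               # __field_mapping: field_id held by each slot
    cols: List[Optional[float]]            # __col_0 .. __col_{K-1}
    overflow: Dict[int, float] = field(default_factory=dict)  # __overflow


# Footer metadata for one file with K = 2 (key names are hypothetical).
footer = {
    "k": 2,
    "dictionary": {0: "usage", 1: "temp", 2: "load"},  # field_id -> field name
    "columns_by_field": {0: {0}, 1: {1}, 2: {1}},      # S: slots each field ever used
    "overflow_fields": {2},                            # O: fields that ever overflowed
    "max_row_width": 3,
}

rows = [
    PhysicalRow([0, 1], [41.5, 20.0]),             # usage and temp
    PhysicalRow([0, 2], [12.0, 0.7]),              # slot 1 is reused for load
    PhysicalRow([0, 1], [55.0, 21.0], {2: 0.9}),   # 3 fields > K, so load overflows
    PhysicalRow([EMPTY, 1], [None, 19.5]),         # slot 0 is unused in this row
]


def to_logical(row, footer):
    """Rebuild the logical MAP<STRING, DOUBLE> value of one physical row."""
    names = footer["dictionary"]
    out = {}
    for fid, value in zip(row.field_mapping, row.cols):
        if fid != EMPTY:
            out[names[fid]] = value
    for fid, value in row.overflow.items():
        out[names[fid]] = value
    return out
```

Note that `__col_1` holds `temp` in some rows and `load` in others, which is why the per-row `__field_mapping` is needed to interpret a column value.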

Write Path

  1. Schema conversion utilities — Logical MAP → physical Struct schema rewriting; metadata serialization/deserialization; EXTEND column detection via field metadata marker.

  2. FormatWriter::AddMetadata — New virtual method (default no-op) for writing key-value metadata to file footer before Finish(). Parquet implementation calls AddKeyValueMetadata.

  3. Column allocator — Streaming per-row slot allocator (Hit/Evict/Retain/Overflow) maintaining K physical column assignments across batches within a file. LRU-based eviction. Accumulates file-level statistics (S, O, max row width).

  4. Logical→physical batch converter — Parses logical MAP, encodes field names to integer IDs (file-level dictionary), invokes allocator per row, assembles physical Struct array.

  5. Writer integration — Extended DataFileWriter that performs conversion before writing + injects metadata on close. AppendOnlyWriter detects EXTEND columns and routes accordingly. Cross-file K adaptation (P99 of recent max row widths, capped by K_max).
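
The allocator in step 3 and the K adaptation in step 5 could look roughly like the following Python sketch. The semantics of Hit/Evict/Retain/Overflow are my reading of the step names rather than something the issue spells out, and `SlotAllocator`, `allocate`, `next_k`, the `EMPTY = -1` sentinel, and the nearest-rank percentile are hypothetical choices made for illustration.

```python
from collections import OrderedDict

EMPTY = -1  # assumed sentinel for "slot unused in this row"


class SlotAllocator:
    """Streaming per-row allocator over K physical columns (hypothetical sketch)."""

    def __init__(self, k):
        self.k = k
        self.slot_of = OrderedDict()   # field_id -> slot, least recently used first
        self.columns_by_field = {}     # S: field_id -> set of slots it ever occupied
        self.overflow_fields = set()   # O: field_ids that ever went to __overflow
        self.max_row_width = 0

    def allocate(self, field_ids):
        """Return (__field_mapping, overflow field_ids) for one row.

        Assumes field_ids are unique within the row.
        """
        self.max_row_width = max(self.max_row_width, len(field_ids))
        in_row = set(field_ids)
        mapping = [EMPTY] * self.k
        pending = []
        for fid in field_ids:
            if fid in self.slot_of:            # Hit: reuse the existing slot
                mapping[self.slot_of[fid]] = fid
                self.slot_of.move_to_end(fid)
            else:
                pending.append(fid)
        overflow = []
        for fid in pending:
            slot = self._claim_slot(in_row)
            if slot is None:                   # Overflow: all slots serve this row
                overflow.append(fid)
                self.overflow_fields.add(fid)
            else:
                self.slot_of[fid] = slot
                mapping[slot] = fid
        for slot, fid in enumerate(mapping):
            if fid != EMPTY:
                self.columns_by_field.setdefault(fid, set()).add(slot)
        # Retain: slots whose field is absent from this row keep their assignment.
        return mapping, overflow

    def _claim_slot(self, in_row):
        if len(self.slot_of) < self.k:         # some slot has never been assigned
            used = set(self.slot_of.values())
            return next(s for s in range(self.k) if s not in used)
        # Evict: least recently used field that the current row does not need.
        victim = next((f for f in self.slot_of if f not in in_row), None)
        return None if victim is None else self.slot_of.pop(victim)


def next_k(recent_max_widths, k_max):
    """Nearest-rank P99 of recent max row widths (non-empty list), capped by k_max."""
    ordered = sorted(recent_max_widths)
    rank = (99 * len(ordered) + 99) // 100     # ceil(0.99 * n) in integer arithmetic
    return min(ordered[rank - 1], k_max)
```

Feeding the rows `[0, 1]`, `[0, 2]`, `[0, 1, 2]`, `[1]` through `SlotAllocator(k=2)` exercises all four cases: the second row evicts field 1 to make room for field 2, the third row overflows field 1 because both slots are already held by fields of that row, and the fourth row leaves slot 1 assigned to field 2 even though the row does not contain it.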

Read Path

  1. File metadata parsing — Parse EXTEND metadata from file footer (dictionary, S, O, K). New GetFileKeyValueMetadata() method on FileBatchReader with Parquet implementation.

  2. Predicate translation — Translate logical predicates on MAP keys into conservative OR predicates over physical sub-columns. Requires extending LeafPredicate to support nested field paths and updating PredicateConverter to emit nested FieldRef.

  3. Read planning — At SetReadSchema time: look up which physical columns to read (from S), decide whether __overflow is needed (from O), translate predicates, and pass the physical schema + physical predicate down to the inner FileBatchReader unchanged.

  4. Batch reconstruction — After NextBatch: read __field_mapping per row to identify which column holds which field (fine-grained filter), gather values into logical MAP<STRING, T>. Merge overflow when needed. Correctness relies on per-row __field_mapping, not on pushdown precision.

  5. Reader integration — A wrapper reader (implements FileBatchReader) sits between the upper layer and the format-level reader. Per-file instance. Compatible with varying K across files. Orthogonal to DataEvolutionFileReader (schema evolution).
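
Steps 2 to 4 can be sketched in Python as follows. This is a simplified model under stated assumptions, not the proposed implementation: the footer key names, the dict-based row representation, and the functions `plan_read`, `lookup`, and `matches` are hypothetical, and the choice to push nothing down when the key may live in `__overflow` is one conservative option that the issue does not prescribe.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "=": operator.eq}


def plan_read(footer, key, op, literal):
    """Plan `map[key] <op> literal` against one file's footer metadata."""
    fid = footer["ids"].get(key)
    if fid is None:
        return None  # key was never written to this file, so no row can match
    slots = sorted(footer["columns_by_field"].get(fid, ()))   # from S
    need_overflow = fid in footer["overflow_fields"]          # from O
    columns = ["__field_mapping"] + [f"__col_{s}" for s in slots]
    if need_overflow:
        columns.append("__overflow")
    # Conservative pushdown: a row may match if ANY candidate column passes.
    # Assumption: if the key may sit in __overflow, nothing is pushed down.
    pushdown = None if need_overflow else [(f"__col_{s}", op, literal) for s in slots]
    return {"field_id": fid, "slots": slots, "need_overflow": need_overflow,
            "columns": columns, "pushdown_any_of": pushdown,
            "op": op, "literal": literal}


def lookup(row, plan):
    """Fine-grained step: use __field_mapping to find the key's value in one row."""
    fid = plan["field_id"]
    for s in plan["slots"]:
        if row["__field_mapping"][s] == fid:
            return row[f"__col_{s}"]
    if plan["need_overflow"]:
        return row["__overflow"].get(fid)
    return None


def matches(row, plan):
    value = lookup(row, plan)
    return value is not None and OPS[plan["op"]](value, plan["literal"])


footer = {
    "ids": {"usage": 0, "temp": 1, "load": 2},        # field name -> field_id
    "columns_by_field": {0: {0}, 1: {1}, 2: {1}},     # S
    "overflow_fields": {2},                           # O
}
rows = [
    {"__field_mapping": [0, 1], "__col_0": 41.5, "__col_1": 20.0, "__overflow": {}},
    {"__field_mapping": [0, 2], "__col_0": 12.0, "__col_1": 35.0, "__overflow": {}},
    {"__field_mapping": [0, 1], "__col_0": 55.0, "__col_1": 21.0, "__overflow": {2: 0.9}},
]
temp_plan = plan_read(footer, "temp", ">", 15)
load_plan = plan_read(footer, "load", ">", 0.5)
```

In the second row `__col_1` holds `load`, not `temp`, so the pushed-down predicate `__col_1 > 15` lets the row through while `matches(rows[1], temp_plan)` rejects it. This is the sense in which correctness rests on the per-row `__field_mapping` rather than on pushdown precision.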

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Labels: enhancement