Skip to content

Feature request: argMax default handling of NULLs #1538

@wyattZarkLab

Description

@wyattZarkLab

Is your feature request related to a problem? Please describe.
We have been using the ReplacingMergeTree engine in production often, have used insert as update motifs. In my RMT pipelines, we frequently need to materialize most recent state and have done so under aggregation, using

WITH base_table_final AS (
    SELECT
        a.order_by_col_1,
        a.order_by_col_2,
        argMax(a.col_3, a.version_col) AS col_3
    FROM base_table
    GROUP BY a.order_by_col_1, a.order_by_col_2
)

instead of using the FINAL keyword. It was not clear that the default behavior, when multiple unmerged parts exist, is to skip NULLs (NULLs are a returnable value for argMax when and only when all part values are NULL), and took some time to discover this effect (silent bug) in production.

The argMax approach is written about frequently and has been our primary tool for materializing most recent state, and while it is an aggregation function (and its handling of NULLs is hence logically consistent with other aggregate functions), would imagine in most production use cases it is used for this 'materialize most recent state' purpose.

We have discovered there is a workaround motif where argMax value is wrapped as a tuple and unwrapped after the aggregation function, as

WITH base_table_final AS (
    SELECT
        a.order_by_col_1,
        a.order_by_col_2,
        argMax(tuple(a.col_3), a.version_col).1 AS col_3
    FROM base_table
    GROUP BY a.order_by_col_1, a.order_by_col_2
)

Describe the solution you'd like
We would propose to make the default behavior of argMax to respect NULLs and not skip NULLs.

Describe alternatives you've considered
Alternatively, we think an additional aggregate function, argMaxOrNull would suffice

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions