Add compatible timezone support for date_format expression #3202

@andygrove

Description

The date_format expression was added in PR #3201, but it is currently fully compatible with Spark only when the timezone is UTC. Non-UTC timezones are marked as Incompatible and fall back to Spark by default.

This issue tracks adding proper timezone conversion support so that date_format can be fully compatible with Spark for all timezones.

Current Behavior

  • UTC timezone: Compatible() - runs natively in Comet
  • Non-UTC timezones: Incompatible() - falls back to Spark by default
  • Users can enable native execution for non-UTC timezones by setting spark.comet.expr.DateFormatClass.allowIncompatible=true, but results may differ from Spark

Desired Behavior

All timezones should be Compatible() and produce results identical to Spark.

Technical Details

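The current implementation uses DataFusion's to_char function, which formats timestamps without timezone conversion. Spark's date_format, by contrast, applies the session timezone when formatting, so the two can produce different strings for the same timestamp whenever the session timezone is not UTC.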
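
To illustrate the mismatch, here is a small Python sketch (not Comet or DataFusion code) that formats one instant twice: once as-is in UTC, and once after shifting it into a session timezone. The fixed UTC-8 offset and the strftime pattern are assumptions chosen only for this example.

```python
from datetime import datetime, timedelta, timezone

# A single instant, expressed as a point in time in UTC.
instant = datetime(2024, 1, 1, 2, 30, 0, tzinfo=timezone.utc)

# Hypothetical session timezone: a fixed UTC-8 offset, for illustration only.
session_tz = timezone(timedelta(hours=-8))

pattern = "%Y-%m-%d %H:%M"

# Formatting with no timezone conversion (analogous to the current behavior).
without_conversion = instant.strftime(pattern)

# Formatting after applying the session timezone (analogous to Spark).
with_conversion = instant.astimezone(session_tz).strftime(pattern)

print(without_conversion)  # 2024-01-01 02:30
print(with_conversion)     # 2023-12-31 18:30
```

The two strings differ in both the time and the date, which is why non-UTC timezones cannot be treated as compatible without a conversion step.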

Possible approaches:

  1. Convert the timestamp to the target timezone before calling to_char
  2. Use a timezone-aware formatting function if available in DataFusion
  3. Implement custom Rust logic to handle timezone conversion
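
The first approach could look roughly like the following, sketched in Python rather than Rust to keep it self-contained. The function name format_in_timezone is hypothetical, and the input is assumed to be microseconds since the Unix epoch in UTC. This is a sketch of the idea, not the Comet implementation.

```python
from datetime import datetime, timedelta, timezone, tzinfo

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_in_timezone(micros_since_epoch: int, tz: tzinfo, pattern: str) -> str:
    """Shift a UTC instant into tz, then format the resulting local time."""
    # Step 1: rebuild the UTC instant from the raw microsecond count.
    utc_instant = EPOCH + timedelta(microseconds=micros_since_epoch)
    # Step 2: convert to the target timezone before any formatting happens.
    local = utc_instant.astimezone(tz)
    # Step 3: format the local wall-clock time.
    return local.strftime(pattern)


ts = 1_704_076_200_000_000  # 2024-01-01 02:30:00 UTC
india = timezone(timedelta(hours=5, minutes=30))  # fixed offset, illustration only

print(format_in_timezone(ts, timezone.utc, "%Y-%m-%d %H:%M:%S"))  # 2024-01-01 02:30:00
print(format_in_timezone(ts, india, "%Y-%m-%d %H:%M:%S"))         # 2024-01-01 08:00:00
```

Fixed offsets are used here only to keep the example deterministic. A real implementation would need named zones with daylight saving rules (in Python, for example, via zoneinfo.ZoneInfo), since the offset for a zone can change depending on the instant being formatted.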

Note: This issue was generated with AI assistance.
