[Feature] Support Variant type for PyPaimon #7655

@chenghuichen

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Paimon Java already supports the VARIANT type, but pypaimon has no equivalent implementation. This issue proposes adding VARIANT read/write support to pypaimon so that Python-based compute engines can integrate with it. For example, Daft is planning native VARIANT support at the engine level.

The ideal long-term solution is to wait for PyArrow's official Variant support (apache/arrow#45937), which would guarantee both correctness and performance from the upstream ecosystem. We should continue tracking that progress.

That said, a Python-based implementation now is a reasonable short-term step: users and compute engines can begin integrating VARIANT support early, and the underlying implementation can later be swapped for PyArrow's native one without breaking existing users.

Moreover, some aspects of VARIANT integration are the responsibility of pypaimon itself and cannot be delegated to PyArrow, such as shredded column pruning and shredded column predicate pushdown. These will remain part of pypaimon regardless of upstream changes.

Solution

This work is broken into three parts:

  1. VARIANT read/write support
  2. Shredded column pruning
  3. Shredded column predicate pushdown
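
To make parts 2 and 3 concrete, here is a minimal pure-Python sketch of the two ideas. It is only an illustration under stated assumptions, not pypaimon's actual design. The `RowGroup` class, the helper names, and the `payload` column are all hypothetical. The physical layout (a `typed_value` sub-column per shredded field, next to `metadata` and a residual `value`) is assumed for the example.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical in-memory stand-in for one row group of a data file."""
    columns: dict  # physical column path -> list of values
    stats: dict    # physical column path -> (min, max)


def shredded_path(variant_col, field):
    """Physical column assumed to hold the typed values of a shredded field."""
    return f"{variant_col}.typed_value.{field}.typed_value"


def prune_columns(row_group, variant_col, fields):
    """Column pruning: keep only the shredded sub-columns the query needs."""
    wanted = [shredded_path(variant_col, f) for f in fields]
    return {p: row_group.columns[p] for p in wanted if p in row_group.columns}


def may_match_gt(row_group, variant_col, field, literal):
    """Predicate pushdown for `field > literal` using min/max statistics."""
    stat = row_group.stats.get(shredded_path(variant_col, field))
    if stat is None:
        return True  # no statistics available, so the row group cannot be skipped
    _, max_value = stat
    return max_value > literal


# Illustrative data only: a VARIANT column `payload` with two shredded fields.
rg = RowGroup(
    columns={
        "payload.metadata": [b"", b"", b""],
        "payload.value": [None, None, None],
        "payload.typed_value.age.typed_value": [21, 35, 40],
        "payload.typed_value.city.typed_value": ["a", "b", "c"],
    },
    stats={"payload.typed_value.age.typed_value": (21, 40)},
)

pruned = prune_columns(rg, "payload", ["age"])    # reads one sub-column only
keep_for_30 = may_match_gt(rg, "payload", "age", 30)
keep_for_40 = may_match_gt(rg, "payload", "age", 40)
```

In this toy model, a query that only touches `payload.age` reads a single physical column, and a filter such as `age > 40` lets the reader skip the row group entirely. A real implementation would also have to handle values that fall back to the residual `value` column, where the typed statistics alone are not enough to skip data safely.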

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Labels: enhancement