Skip to content

feat: make DefaultLogicalExtensionCodec support serialisation of buil…#20638

Open
Acfboy wants to merge 5 commits intoapache:mainfrom
Acfboy:support-logicalextension-serialisztion
Open

feat: make DefaultLogicalExtensionCodec support serialisation of buil…#20638
Acfboy wants to merge 5 commits intoapache:mainfrom
Acfboy:support-logicalextension-serialisztion

Conversation

@Acfboy
Copy link
Contributor

@Acfboy Acfboy commented Mar 2, 2026

Which issue does this PR close?

Rationale for this change

Currently, the LogicalExtensionCodec implementation for DefaultLogicalExtensionCodec leaves try_decode_file_format / try_encode_file_format unimplemented (returning "not implemented" errors). However, the actual serialization logic for built-in file formats — arrow, parquet, csv, and json— already exists in their respective codec implementations. All we need to do is tag which format is being used, and delegate to the corresponding format-specific codec to handle the data.

What changes are included in this PR?

Added a FileFormatKind enum and a FileFormatProto message to datafusion.proto to identify the file format type during transmission. Implemented try_decode_file_format and try_encode_file_format for DefaultLogicalExtensionCodec, which dispatch serialization/deserialization to the corresponding format-specific codec based on the format kind. Note that Avro is not covered because the upstream repository has not yet implemented the corresponding Avro codec, so Avro support is not functional at this time.

Are these changes tested?

Yes. Roundtrip tests are included for csv, json, parquet, and arrow.

Are there any user-facing changes?

@github-actions github-actions bot added the proto Related to proto crate label Mar 2, 2026
@Acfboy
Copy link
Contributor Author

Acfboy commented Mar 4, 2026

@milenkovicm Looking forward to your advice!

@milenkovicm
Copy link
Contributor

Thanks @Acfboy will have a look today

Copy link
Contributor

@milenkovicm milenkovicm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @Acfboy it looks good, just a few little suggestions

use datafusion_datasource_json::file_format::JsonFormatFactory;
use prost::Message;

let any = node.as_any();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to extract variable here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean we should inline node.as_any() instead of binding it to let any? I don't quite follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, sorry for confusion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx for clarification!

FILE_FORMAT_KIND_ARROW = 4;
FILE_FORMAT_KIND_AVRO = 5;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add some comments desribing what this message does

use datafusion_datasource_json::file_format::JsonFormatFactory;
use prost::Message;

let any = node.as_any();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to extract it as a variable here

@Acfboy Acfboy force-pushed the support-logicalextension-serialisztion branch from 260c723 to 7364d53 Compare March 6, 2026 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

proto Related to proto crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

make DefaultLogicalExtensionCodec support serialisation of build in file formats

2 participants