
feat(encryption): write encrypted parquet data files#2701

Open
aarushigupta132 wants to merge 4 commits into
apache:mainfrom
aarushigupta132:feat/encrypted-parquet-data-files


@aarushigupta132 aarushigupta132 commented Jun 23, 2026


Which issue does this PR close?

What changes are included in this PR?

Adds build_encrypted on ParquetWriterBuilder, which takes an EncryptedOutputFile and produces a ParquetWriter that AES-GCM-encrypts each Parquet block as it streams.

This is the data-file write counterpart to the manifest-list work in #2677, using the same EncryptionManager::encrypt flow.

Are these changes tested?

Yes. The test writes an encrypted Parquet file via build_encrypted, reads it back through EncryptedInputFile, and asserts that the rows match and the key metadata round-trips.

@aarushigupta132 aarushigupta132 changed the title Feat/encrypted parquet data files feat(encryption): write encrypted parquet data files Jun 23, 2026
@aarushigupta132 aarushigupta132 marked this pull request as ready for review June 23, 2026 23:45

@xanderbailey xanderbailey left a comment

Contributor


Thanks for carrying the flame on the encryption work. I've made a couple of comments to get us started.

Encrypted(EncryptedOutputFile),
}

impl OutputTarget {

Contributor


We should be able to avoid this enum. Parquet doesn't use streaming encryption; it uses Parquet Modular Encryption (PME), so it should have a plaintext writer.
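
Roughly the shape I have in mind, as a minimal sketch rather than the real API: EncryptionConfig, WriterProps and SketchWriter are hypothetical stand-ins, not types from this crate or from the parquet crate. The point is only that encryption becomes an optional setting on the writer, while the output sink stays a single plaintext writer with no Plain/Encrypted variants. The sketch performs no cryptography.

```rust
use std::io::{self, Write};

// Hypothetical stand-in for the key material a PME-capable writer would receive.
#[derive(Clone, Debug, PartialEq)]
pub struct EncryptionConfig {
    pub key_metadata: Vec<u8>,
}

// Hypothetical stand-in for writer properties: encryption is an optional setting.
#[derive(Clone, Debug, Default)]
pub struct WriterProps {
    pub encryption: Option<EncryptionConfig>,
}

// Hypothetical writer, generic over one plaintext sink (no output-target enum).
pub struct SketchWriter<W: Write> {
    sink: W,
    props: WriterProps,
}

impl<W: Write> SketchWriter<W> {
    pub fn new(sink: W, props: WriterProps) -> Self {
        Self { sink, props }
    }

    // Reports whether the format layer was asked to encrypt.
    pub fn encrypts(&self) -> bool {
        self.props.encryption.is_some()
    }

    // With PME, page encryption would happen here, inside the format layer.
    // This sketch does no cryptography and forwards the bytes unchanged.
    pub fn write_page(&mut self, page: &[u8]) -> io::Result<()> {
        self.sink.write_all(page)
    }

    // Flushes the sink and hands it back to the caller.
    pub fn finish(mut self) -> io::Result<W> {
        self.sink.flush()?;
        Ok(self.sink)
    }
}
```

With this shape, the encrypted and unencrypted paths differ only in the properties passed to the writer, so the sink never needs to know which one it is serving.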

async fn writer(&self) -> Result<Box<dyn FileWrite>> {
match self {
Self::Plain(o) => o.writer().await,
Self::Encrypted(e) => e.writer().await,

Contributor


Here we don't want stream encryption.


2 participants