What is the problem the feature request solves?
Note: This issue was generated with AI assistance. The specification details have been extracted from Spark documentation and may need verification.
Comet does not currently support the Spark aes_encrypt function, causing queries using this function to fall back to Spark's JVM execution instead of running natively on DataFusion.
The AesEncrypt expression provides AES (Advanced Encryption Standard) encryption functionality in Spark SQL. It encrypts binary input data using a specified key, encryption mode, padding scheme, initialization vector (IV), and optional additional authenticated data (AAD). In Spark, this expression is runtime replaceable: it delegates to static implementation methods (in ExpressionImplUtils) for performance.
Supporting this expression would allow more Spark workloads to benefit from Comet's native acceleration.
Describe the potential solution
Spark Specification
Syntax:

```sql
aes_encrypt(input, key [, mode [, padding [, iv [, aad]]]])
```

```scala
// DataFrame API
import org.apache.spark.sql.catalyst.expressions.AesEncrypt
AesEncrypt(inputExpr, keyExpr, modeExpr, paddingExpr, ivExpr, aadExpr)
```
Arguments:

| Argument | Type | Description |
|----------|------|-------------|
| input | BinaryType | The binary data to encrypt |
| key | BinaryType | The encryption key as binary data |
| mode | StringType | The encryption mode (defaults to "GCM") |
| padding | StringType | The padding scheme (defaults to "DEFAULT") |
| iv | BinaryType | The initialization vector (defaults to empty) |
| aad | BinaryType | Additional authenticated data for GCM mode (defaults to empty) |
Return Type: Returns BinaryType - the encrypted data as a binary array.
Supported Data Types:
- Input data: Binary type only
- Key: Binary type only
- Mode: String type with collation support (trim collation supported)
- Padding: String type with collation support (trim collation supported)
- IV: Binary type only
- AAD: Binary type only
Edge Cases:
- Null inputs: Follows standard Spark null propagation - any null input produces null output
- Empty AAD: When AAD parameter is omitted, defaults to empty binary literal
- Empty IV: When IV parameter is omitted, defaults to empty binary literal
- Invalid key sizes: Behavior depends on underlying AES implementation in ExpressionImplUtils
- Mode/padding combinations: Some mode and padding combinations may not be supported
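To make the invalid-key-size edge case concrete, here is a minimal sketch using javax.crypto (the JCE layer that backs Spark's ExpressionImplUtils). It assumes the standard JCE behavior that AES accepts only 16-, 24-, or 32-byte keys; anything else is rejected at cipher initialization:

```scala
import javax.crypto.Cipher
import javax.crypto.spec.SecretKeySpec

// A 10-byte key is not a valid AES key size (valid sizes: 16, 24, 32 bytes),
// so Cipher.init rejects it with InvalidKeyException.
val badKey = new SecretKeySpec(Array.fill[Byte](10)(0), "AES")
val cipher = Cipher.getInstance("AES/GCM/NoPadding")
val failed =
  try { cipher.init(Cipher.ENCRYPT_MODE, badKey); false }
  catch { case _: java.security.InvalidKeyException => true }
println(failed) // true
```

A native Rust implementation would need to surface an equivalent error for these key sizes to stay compatible with Spark's behavior.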
Examples:

```sql
-- Basic encryption with default GCM mode
SELECT base64(aes_encrypt('Spark', 'abcdefghijklmnop12345678ABCDEFGH'));

-- Full specification with all parameters
SELECT base64(aes_encrypt(
  'Spark',
  'abcdefghijklmnop12345678ABCDEFGH',
  'GCM',
  'DEFAULT',
  unhex('000000000000000000000000'),
  'This is an AAD mixed into the input'
));
```

```scala
// DataFrame API usage
import org.apache.spark.sql.functions._
df.select(base64(expr("aes_encrypt(data, key, 'GCM', 'DEFAULT', iv, aad)")))

// Using the expression directly
import org.apache.spark.sql.catalyst.expressions._
val encrypted = AesEncrypt(col("data").expr, col("key").expr)
```
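The full-specification SQL example above can be reproduced outside Spark with javax.crypto. This sketch assumes Spark's GCM output layout (the 12-byte IV prepended to the ciphertext, with a 16-byte authentication tag appended by GCM); verify that layout against Spark's ExpressionImplUtils before relying on it:

```scala
import java.nio.charset.StandardCharsets.UTF_8
import javax.crypto.Cipher
import javax.crypto.spec.{GCMParameterSpec, SecretKeySpec}

val key = new SecretKeySpec("abcdefghijklmnop12345678ABCDEFGH".getBytes(UTF_8), "AES")
val iv  = Array.fill[Byte](12)(0) // unhex('000000000000000000000000')
val aad = "This is an AAD mixed into the input".getBytes(UTF_8)

// Encrypt "Spark" with AES-GCM and a 128-bit tag, then prepend the IV
// (the assumed Spark layout for GCM output).
val enc = Cipher.getInstance("AES/GCM/NoPadding")
enc.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv))
enc.updateAAD(aad)
val output = iv ++ enc.doFinal("Spark".getBytes(UTF_8))

// Round trip: strip the IV, decrypt the remainder with the same AAD.
val dec = Cipher.getInstance("AES/GCM/NoPadding")
dec.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, output.take(12)))
dec.updateAAD(aad)
val roundTrip = new String(dec.doFinal(output.drop(12)), UTF_8)
println(roundTrip)     // Spark
println(output.length) // 33 = 12 (IV) + 5 (plaintext) + 16 (GCM tag)
```

A Comet implementation should produce byte-identical output for the same inputs, since users may decrypt Comet-encrypted data with Spark's aes_decrypt (and vice versa).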
Implementation Approach
See the Comet guide on adding new expressions for detailed instructions.
- Scala Serde: Add expression handler in `spark/src/main/scala/org/apache/comet/serde/`
- Register: Add to the appropriate map in `QueryPlanSerde.scala`
- Protobuf: Add message type in `native/proto/src/proto/expr.proto` if needed
- Rust: Implement in `native/spark-expr/src/` (check if DataFusion has built-in support first)
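If a dedicated protobuf message turns out to be needed, it might look like the sketch below. This is illustrative only: the message and field names are hypothetical, not the actual Comet schema, and the final shape should follow the existing conventions in `expr.proto`:

```proto
// Hypothetical message; field names are illustrative, not Comet's definition.
message AesEncrypt {
  Expr input = 1;   // binary data to encrypt
  Expr key = 2;     // 16-, 24-, or 32-byte AES key
  Expr mode = 3;    // "GCM", etc.
  Expr padding = 4; // "DEFAULT", etc.
  Expr iv = 5;      // initialization vector (may be empty)
  Expr aad = 6;     // additional authenticated data (may be empty)
}
```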
Additional context
Difficulty: Large
Spark Expression Class: org.apache.spark.sql.catalyst.expressions.AesEncrypt
Related:
AesDecrypt - corresponding decryption function
base64/unbase64 - commonly used for encoding encrypted binary output
unhex/hex - for converting hexadecimal strings to binary data
This issue was auto-generated from Spark reference documentation.