API: Add ReadRestrictions Actions #16198

Open
singhpk234 wants to merge 6 commits into apache:main from singhpk234:feature/add-action-to-core

@singhpk234 singhpk234 commented May 3, 2026

About the change

Introduces Action<T> (a single type parameter, since the input and output types are the same) and the 10 concrete column-projection actions defined by the ReadRestrictions spec (#13879), plus an UnknownAction forward-compat carrier.

Scoped to just the type definitions - parser, REST wiring, and engine enforcement follow in separate PRs.

Note:

Shape and layout mirror org.apache.iceberg.transforms.*: one class per file, a generic interface that extends Serializable (Action<T> here, where Transform is <S, T>), and each concrete class owns its bind(Type), which returns a SerializableFunction. The spec invariant "null input → null output" is enforced via a shared NullSafeFunction base.
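
A minimal sketch of the shape described in this note, not the code in this PR. Every name ending in `Sketch` is hypothetical, the column type is reduced to a plain string where the real code takes an Iceberg `Type`, and the digit-replacing action is invented purely to have something to bind.

```java
import java.io.Serializable;
import java.util.function.Function;

// Stand-in for Iceberg's two-parameter SerializableFunction (assumption: a
// Function that is also Serializable).
interface SerializableFunctionSketch<S, T> extends Function<S, T>, Serializable {}

// Shared base that enforces "null input -> null output" once, so concrete
// functions only ever see non-null values.
abstract class NullSafeFunctionSketch<T> implements SerializableFunctionSketch<T, T> {
  @Override
  public final T apply(T value) {
    return value == null ? null : applyNonNull(value);
  }

  abstract T applyNonNull(T value);
}

// Single type parameter: input and output types are the same.
interface ActionSketch<T> extends Serializable {
  int fieldId();

  // The real bind takes an Iceberg Type; a type name is used here to keep
  // the sketch self-contained.
  SerializableFunctionSketch<T, T> bind(String typeName);
}

// Hypothetical action used only for illustration: replaces ASCII digits with '#'.
final class ReplaceDigitsSketch implements ActionSketch<String> {
  private final int fieldId;

  ReplaceDigitsSketch(int fieldId) {
    this.fieldId = fieldId;
  }

  @Override
  public int fieldId() {
    return fieldId;
  }

  @Override
  public SerializableFunctionSketch<String, String> bind(String typeName) {
    // Reject unsupported column types at bind time, not at apply time.
    if (!"string".equals(typeName)) {
      throw new IllegalArgumentException("Cannot bind to type: " + typeName);
    }
    return new NullSafeFunctionSketch<String>() {
      @Override
      String applyNonNull(String value) {
        return value.replaceAll("[0-9]", "#");
      }
    };
  }
}
```

Putting the null check in a final `apply` means a concrete action cannot accidentally violate the invariant; it only implements `applyNonNull`.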

Spec PR: #13879

Introduces the Action abstraction from the ReadRestrictions spec
(PR apache#13879) as a standalone addition, covering the 10 predefined
column projection actions:

- MaskAlphanum, ShowFirst4, ShowLast4, MaskToFixedValue,
  ReplaceWithNull, TruncateToYear, TruncateToMonth, Sha256Global,
  Sha256QueryLocal, ApplyExpression
- Plus an Unknown forward-compat carrier for unrecognized server-side
  action types so the client fails closed at bind time

Each action carries a field id plus any action-specific payload and
owns its own bind(Type) returning a SerializableFunction. This mirrors
the Transform<S, T> pattern used in partition transforms. Null input
produces null output for every action, per spec.

No REST wire-format plumbing or engine integration yet; those follow
in separate PRs.

Mirrors the org.apache.iceberg.transforms.* layout: one file per concrete
action instead of 10 nested classes in Action.java. Action.java now holds
only the interface plus the BaseAction abstract (fieldId carrier).

Algorithmic simplifications along the way:
- ShowLast4 rewritten single-pass with a 4-offset ring buffer
  (was two passes: a separate codePointCount, then a mask loop).
- Sha256 4 per-type subclasses (Sha256String/Integer/Long/Binary)
  collapsed into one Sha256Fn with a Codec enum carrying the update/
  encode pair.
- Truncate 3 per-storage subclasses (Date/Timestamp/TimestampNano)
  collapsed into one TruncateTemporalFn with Unit + Storage enums.
- mapCodePoint moved from Actions to MaskAlphanum.maskCodePoint;
  matches the spec phrasing "redacts the remainder using mask-alphanum
  rules" that ShowFirst4/ShowLast4 reference.
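
One way the single-pass ShowLast4 idea could look, as a sketch under stated assumptions and not the implementation in this PR. The class name and the masking rule are placeholders: letters become 'x', digits become 'n', everything else is kept. The spec's actual mask-alphanum rules may differ. Inputs with four or fewer code points are returned unchanged, which is also an assumption.

```java
final class ShowLast4Sketch {
  private ShowLast4Sketch() {}

  // Placeholder masking rule (assumption, not the spec's mask-alphanum table).
  static int maskCodePoint(int codePoint) {
    if (Character.isDigit(codePoint)) {
      return 'n';
    }
    if (Character.isLetter(codePoint)) {
      return 'x';
    }
    return codePoint;
  }

  static String showLast4(String input) {
    if (input == null) {
      return null;
    }
    StringBuilder out = new StringBuilder(input.length());
    // Ring buffers holding, for the 4 most recent code points, where each one
    // starts in the input and where its masked form starts in the output.
    int[] inStarts = new int[4];
    int[] outStarts = new int[4];
    int count = 0;
    for (int pos = 0; pos < input.length(); ) {
      int codePoint = input.codePointAt(pos);
      int slot = count % 4;
      inStarts[slot] = pos;
      outStarts[slot] = out.length();
      out.appendCodePoint(maskCodePoint(codePoint));
      pos += Character.charCount(codePoint);
      count++;
    }
    // The slot about to be overwritten holds the oldest of the last 4 code
    // points; with fewer than 4 code points that is simply slot 0.
    int oldest = count < 4 ? 0 : count % 4;
    // Drop the masked tail and copy the original last 4 code points instead.
    out.setLength(outStarts[oldest]);
    out.append(input, inStarts[oldest], input.length());
    return out.toString();
  }
}
```

Under the placeholder rule, `ShowLast4Sketch.showLast4("1234-5678-9012")` yields `nnnn-nnnn-9012`. Tracking both input and output offsets keeps the result correct even when masking changes the UTF-16 length of a code point.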

UnknownAction.bind() now throws IllegalArgumentException for consistency
with the other actions' bind-time type rejection.
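
A rough sketch of the fail-closed behaviour described above, assuming a carrier that only stores the raw discriminator and field id. The class name, the string-typed column argument, and the message format are illustrative, not taken from the PR.

```java
import java.util.function.Function;

// Hypothetical carrier for an action type the client does not recognize.
final class UnknownActionSketch {
  private final String actionType;
  private final int fieldId;

  UnknownActionSketch(String actionType, int fieldId) {
    this.actionType = actionType;
    this.fieldId = fieldId;
  }

  String actionType() {
    return actionType;
  }

  int fieldId() {
    return fieldId;
  }

  // Parsing succeeds so the restriction can be carried around, but binding
  // always fails: the client never reads the column unprotected.
  Function<Object, Object> bind(String typeName) {
    throw new IllegalArgumentException(
        String.format(
            "Cannot bind unknown action type '%s' for field %d (column type %s)",
            actionType, fieldId, typeName));
  }
}
```

Deferring the failure to bind time lets a client deserialize a response containing newer action types and reject only the reads that would actually need them.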

35 TestActions tests pass, spotless + revapi clean.

Rename Sha256Fn -> Sha256 and TruncateTemporalFn -> TruncateTemporal.
Both are package-private helpers that don't need the Fn abbreviation
to disambiguate from the public action classes (Sha256Global /
Sha256QueryLocal and TruncateToYear / TruncateToMonth respectively).
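
For illustration, a single truncation helper driven by Unit and Storage enums could look roughly like the following. This is a sketch, not the TruncateTemporal class from this PR: the enum constants and method signature are hypothetical, and it assumes values are stored as days, microseconds, or nanoseconds since the epoch and that truncation means snapping to the first day of the year or month at midnight.

```java
import java.time.LocalDate;

final class TruncateTemporalSketch {
  private TruncateTemporalSketch() {}

  enum Unit {
    YEAR,
    MONTH
  }

  // How many stored units make up one day for each physical representation.
  enum Storage {
    DATE_DAYS(1L),
    TIMESTAMP_MICROS(86_400_000_000L),
    TIMESTAMP_NANOS(86_400_000_000_000L);

    private final long unitsPerDay;

    Storage(long unitsPerDay) {
      this.unitsPerDay = unitsPerDay;
    }
  }

  static long truncate(long value, Unit unit, Storage storage) {
    // floorDiv keeps pre-1970 (negative) values on the correct calendar day.
    long epochDay = Math.floorDiv(value, storage.unitsPerDay);
    LocalDate date = LocalDate.ofEpochDay(epochDay);
    LocalDate truncated = unit == Unit.YEAR ? date.withDayOfYear(1) : date.withDayOfMonth(1);
    // Convert back to the original storage unit at midnight of the truncated day.
    return Math.multiplyExact(truncated.toEpochDay(), storage.unitsPerDay);
  }
}
```

With both enums as parameters, one method covers the six unit and storage combinations that would otherwise need separate subclasses.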
@github-actions github-actions Bot added the core label May 3, 2026
Iceberg checkstyle requires local variable names with at least 2
characters (pattern ^[a-z][a-zA-Z0-9]++$). The refactor introduced
five single-char locals that the CI build-checks job rejected:

- Sha256.java: int v / long v -> intVal / longVal
- ShowLast4.java: int o -> maskOffset
- TruncateTemporal.java: LocalDate d / LocalDateTime d -> date / truncated

Drops three explanatory paragraphs from the Action interface, Sha256
helper, and TruncateTemporal helper that restated what the signatures
already convey.

Adds equals/hashCode/toString to BaseAction (compares actionType +
fieldId) so concrete actions behave as value objects out of the box.
ApplyExpression overrides to include its Expression payload in the
comparison; UnknownAction inherits the base since its actionType()
already returns the raw discriminator string.
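
The value-object behaviour described above might be sketched as follows. All class names are hypothetical, and the expression payload is modelled as a plain string instead of an Iceberg Expression to keep the example self-contained.

```java
import java.util.Objects;

abstract class BaseActionSketch {
  private final int fieldId;

  BaseActionSketch(int fieldId) {
    this.fieldId = fieldId;
  }

  abstract String actionType();

  int fieldId() {
    return fieldId;
  }

  // Base comparison: action type plus field id.
  @Override
  public boolean equals(Object other) {
    if (this == other) {
      return true;
    }
    if (!(other instanceof BaseActionSketch)) {
      return false;
    }
    BaseActionSketch that = (BaseActionSketch) other;
    return fieldId == that.fieldId && actionType().equals(that.actionType());
  }

  @Override
  public int hashCode() {
    return Objects.hash(actionType(), fieldId);
  }

  @Override
  public String toString() {
    return actionType() + "(fieldId=" + fieldId + ")";
  }
}

// No payload, so the inherited equals/hashCode are sufficient.
final class MaskAlphanumSketch extends BaseActionSketch {
  MaskAlphanumSketch(int fieldId) {
    super(fieldId);
  }

  @Override
  String actionType() {
    return "mask-alphanum";
  }
}

// Carries a payload, so it extends the comparison to include it.
final class ApplyExpressionSketch extends BaseActionSketch {
  private final String expression; // stand-in for an Expression payload

  ApplyExpressionSketch(int fieldId, String expression) {
    super(fieldId);
    this.expression = expression;
  }

  @Override
  String actionType() {
    return "apply-expression";
  }

  @Override
  public boolean equals(Object other) {
    return super.equals(other)
        && other instanceof ApplyExpressionSketch
        && expression.equals(((ApplyExpressionSketch) other).expression);
  }

  @Override
  public int hashCode() {
    return Objects.hash(super.hashCode(), expression);
  }
}
```

Two actions of the same type on the same field then compare equal without per-class boilerplate, while payload-carrying actions stay distinguishable.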

Per spec, all predefined actions preserve the input column type
("For all predefined actions except apply-expression, the output type
matches the input column type"). Only apply-expression could differ,
and that path currently throws since Iceberg Expressions are
boolean-only.

Drop the source/target distinction: Action<T> with bind(Type) returning
SerializableFunction<T, T>. SerializableFunction itself stays two-param
since it's shared with Transforms which legitimately use S != T (Bucket,
Days, etc.).
@singhpk234 singhpk234 marked this pull request as ready for review May 5, 2026 19:56