Skip to content

[SPARK-56134][SQL] Make BufferedRowIterator.unsafeRow public to avoid IllegalAccessError.#54942

Open
chenhao-db wants to merge 1 commit intoapache:masterfrom
chenhao-db:SPARK-56134
Open

[SPARK-56134][SQL] Make BufferedRowIterator.unsafeRow public to avoid IllegalAccessError.#54942
chenhao-db wants to merge 1 commit intoapache:masterfrom
chenhao-db:SPARK-56134

Conversation

@chenhao-db
Copy link
Contributor

What changes were proposed in this pull request?

This is very similar to #20779. When a generated code has split classes and they try to access protected fields in BufferedRowIterator, an IllegalAccessError will happen.

I am not making partitionIndex public too because I cannot find a real example that can trigger an IllegalAccessError on it too. In the code base, partitionIndex is mostly accessed via addPartitionInitializationStatement. Since it is used in the partition initialization code, not in split classes, there should be no issue with it.

Why are the changes needed?

Fix runtime error in large queries.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test.

Was this patch authored or co-authored using generative AI tooling?

No.

@chenhao-db
Copy link
Contributor Author

@viirya @hvanhovell @rednaxelafx could you take a look? Thanks!

@pan3793 pan3793 changed the title [SPARK-56134] Make BufferedRowIterator.unsafeRow public to avoid IllegalAccessError. [SPARK-56134][SQL] Make BufferedRowIterator.unsafeRow public to avoid IllegalAccessError. Mar 23, 2026
@rednaxelafx
Copy link
Contributor

Looks good (not official Apache Spark reviewer).

The PR description mentioned that the partitionIndex is not included in this PR. Practically that's okay, because currently the partition initialization code is not yet subject to code splitting.
As can be seen from the init mutable state's case, when these init code grows big, there may be a future point in time that someone decides that partition init code also need to support splitting. But that might still be okay.
If I were to do it, I'd probably do specialized splitting so that the split functions specific to partition init would still take partitionIndex as the argument, then we still don't need to make it "public" since it's not a field to begin with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants