[VL] Hoist per-partition constants out of ColumnarCachedBatchSerializer.serialize hot path#12166
Merged
yaooqinn merged 1 commit intoMay 28, 2026
Conversation
…er.serialize hot path In convertInternalRowToCachedBatch, three values that are constant for the lifetime of the per-partition Iterator[CachedBatch] were being re-evaluated on every next() call: 1. BackendsApiManager.getBackendName (twice per batch) 2. GlutenConfig.get.getConf(COLUMNAR_TABLE_CACHE_PARTITION_STATS_ENABLED) -- GlutenConfig.get allocates a fresh GlutenConfig(SQLConf.get) on every call (GlutenConfig.scala L584-L586) 3. ColumnarBatchSerializerJniWrapper.create(Runtimes.contextInstance(...)) Hoist all three out of next() into the mapPartitions body, alongside the structSchema value that the same block already hoists for the same many-small-batch GC-pressure reason. Only the per-batch handle remains inside next() since it depends on the batch. Wire format is byte-identical. Pure refactor with no new test file; behavior fully covered by ColumnarCachedBatchKryoSuite and ColumnarCachedBatchKryoBoundaryProbeBugSuite (7 tests, all green locally on -Pspark-4.1 -Pscala-2.13). refs: todos/features/gluten-ccbs-iterator-hoist/docs/0002-decision.md refs: todos/features/gluten-ccbs-iterator-hoist/docs/0003-implementation-plan.md
jackylee-ch
approved these changes
May 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
In
ColumnarCachedBatchSerializer.convertInternalRowToCachedBatch,three values that are constant for the lifetime of the per-partition
Iterator[CachedBatch]were being re-evaluated on everynext()call:BackendsApiManager.getBackendName-- invoked twice per batch(once for the JNI runtime context, once for
getNativeHandle)GlutenConfig.get.getConf(COLUMNAR_TABLE_CACHE_PARTITION_STATS_ENABLED)--
GlutenConfig.getallocates a freshGlutenConfig(SQLConf.get)on every call (
GlutenConfig.scalaL584-L586) plus aSparkConfhashmap lookup
ColumnarBatchSerializerJniWrapper.create(Runtimes.contextInstance(...))-- JNI wrapper construction
This PR hoists all three out of
next()into themapPartitionsbody,alongside the
structSchemavalue that the same block already hoistsfor the identical many-small-batch GC-pressure reason. Only the
per-batch
handle = ColumnarBatches.getNativeHandle(backendName, batch)remains inside
next(), since it genuinely depends on the batch.Why are the changes needed?
The block immediately preceding
new Iterator[CachedBatch]alreadyhoists
structSchemawith a comment explicitly citing many-small-batchGC pressure. The three values listed above belong in the same hoist --
they are all partition-iterator-scoped constants whose per-batch
re-evaluation produces zero behavioral signal but consumes allocator /
JNI traffic on every batch.
GlutenConfig.getis especially worth hoisting because it eagerlyconstructs a new
GlutenConfig(SQLConf.get)on each invocation ratherthan returning a cached instance:
Does this PR introduce any user-facing change?
No. Wire format is byte-identical;
partitionStatsEnabledis nowcaptured once at partition start instead of re-read per batch, which
matches the existing semantics of
SQLConf.getin Spark taskexecution (already TaskContext-thread-snapshot) -- no observable
difference from the per-batch re-read.
How was this patch tested?
The change is a pure refactor with byte-identical wire output. The
behavior is fully covered by the existing serializer test suites:
ColumnarCachedBatchKryoSuite-- 6 tests (incl. V1 wire back-compat contract)ColumnarCachedBatchKryoBoundaryProbeBugSuite-- 1 test ([VL] Fix false-EOF probe in ColumnarCachedBatchSerializer.read trailing markers #12147)All 7 tests green locally on
-Pspark-4.1 -Pscala-2.13.No new test file is added. Cross-batch reuse of the hoisted JNI
wrapper is safe because (a)
ColumnarBatchSerializerJniWrapperholdsonly a
final Runtime runtimereference with no mutable per-batchstate, and (b) the native side (
cpp/core/jni/JniWrapper.ccL1280-L1325)constructs a fresh
ctx->createColumnarBatchSerializer(nullptr)onevery JNI call -- the wrapper is a stateless route to the runtime.
Was this patch authored or co-authored using generative AI tooling?
No