MINOR: Preserve empty list offset buffer #30
LGTM
While debugging the newly added TestFragmentWritableBatch.emptyListVectorOffsetBufferIsInconsistentAfterUnload in my Dremio PR https://github.com/dremio/dremio/pull/25327, it became clear that the VectorUnloader call sets the offset buffer writerIndex to 4, and this call in FragmentWritableBatch gets the buffers to be used by Netty.
The VectorUnloader calls private void appendNodes( The statement in bold calls }
A fix in setReaderAndWriterIndex is needed, and it is in sync with BaseVariableWidthVector.setReaderAndWriterIndex for VarCharVector. It is also possible that ListVector.empty() should have created the offsetBuffer with one entry. The empty() method is available in ListVector and StructVector only.
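The inconsistency described in this comment can be modeled without Arrow or Netty. The following is a minimal sketch, assuming a 4-byte offset width as used by ListVector; the class OffsetBufferSketch, the Buf type, and both setter methods are hypothetical stand-ins for illustration, not Arrow's ArrowBuf or the actual setReaderAndWriterIndex implementation.

```java
import java.nio.ByteBuffer;

// Hypothetical model; names below are illustrative, not Arrow APIs.
final class OffsetBufferSketch {
    static final int OFFSET_WIDTH = 4; // bytes per 32-bit offset entry

    /** Stand-in for a buffer that tracks a writer index separately from its capacity. */
    static final class Buf {
        ByteBuffer data;
        int writerIndex;

        Buf(int capacity) {
            this.data = ByteBuffer.allocate(capacity); // zero-filled
        }

        int capacity() {
            return data.capacity();
        }

        /** A buffer is well-formed only if the writer index fits in the capacity. */
        boolean isConsistent() {
            return writerIndex <= capacity();
        }
    }

    /** Sets the writer index from the value count without checking capacity. */
    static void setWriterIndexUnguarded(Buf offsets, int valueCount) {
        offsets.writerIndex = (valueCount + 1) * OFFSET_WIDTH;
    }

    /** Materializes a one-entry buffer for the empty case, then sets the writer index. */
    static void setWriterIndexGuarded(Buf offsets, int valueCount) {
        if (valueCount == 0 && offsets.capacity() == 0) {
            // Room for offset[0], which stays 0 because the buffer is zero-filled.
            offsets.data = ByteBuffer.allocate(OFFSET_WIDTH);
        }
        offsets.writerIndex = (valueCount + 1) * OFFSET_WIDTH;
    }

    public static void main(String[] args) {
        Buf unguarded = new Buf(0);
        setWriterIndexUnguarded(unguarded, 0);
        System.out.println("unguarded consistent: " + unguarded.isConsistent());

        Buf guarded = new Buf(0);
        setWriterIndexGuarded(guarded, 0);
        System.out.println("guarded consistent: " + guarded.isConsistent());
        System.out.println("offset[0] = " + guarded.data.getInt(0));
    }
}
```

In the unguarded case the writer index is 4 while the capacity is 0, which is the kind of state a consumer that slices up to the writer index cannot handle. The guarded variant allocates space for offset[0] first, so the same writer index is valid.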
What's Changed
Preserve the Arrow-required empty list offset buffer entry while avoiding malformed Netty buffer state.
ListVector and LargeListVector now materialize a one-entry offset buffer when an empty vector still needs to expose offset[0], then set the writer index from (valueCount + 1) * OFFSET_WIDTH.
This keeps the IPC serialization behavior from apache#967 without producing an ArrowBuf with writerIndex > capacity, which caused Netty unwrap failures in Dremio sender paths.
Testing
mvn -pl vector -am -Dmaven.gitcommitid.skip=true -Dsurefire.failIfNoSpecifiedTests=false -Dtest=TestListVector,TestLargeListVector test
Result: BUILD SUCCESS. TestListVector and TestLargeListVector passed under both Netty and Unsafe allocator executions.
Closes DX-122953.