HIVE-29488: KryoException: NullPointerException: Cannot invoke "java.util.Collection.isEmpty()" because "this.delegate" is null #6352

Open
thomasrebele wants to merge 1 commit into apache:master from thomasrebele:tr/HIVE-29488

@thomasrebele
Contributor

See HIVE-29488.

Thank you @nareshpr for providing an initial version of the q file test and a first version of the fix!

What changes were proposed in this pull request?

Put the children of ExprNodeGenericFuncDesc in their own list object.

Why are the changes needed?

Fixes an NPE due to the Kryo library when CBO is disabled.

Does this PR introduce any user-facing change?

No

How was this patch tested?

A q file test was added.

     assert (genericUDF != null);
     this.genericUDF = genericUDF;
    -this.children = children;
    +this.children = children == null ? new ArrayList<>() : new ArrayList<>(children);
Member

Why do you need a new ArrayList<>(children) here? Why can't it just be

    this.children = children == null ? List.of() : children;

Contributor Author

The new ArrayList<>(children) is required because the NPE occurs otherwise. I've seen that some callers of getChildren modify the list, e.g., DynamicPartitionPruningOptimization, so I opted for new ArrayList<>() instead of List.of().
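
The mutability trade-off mentioned here can be sketched with plain JDK collections. The class `MutableChildrenDemo` and its helper `acceptsAdd` are hypothetical names used only for illustration and are not part of Hive. The sketch shows that a list from `List.of()` rejects `add`, while an `ArrayList` accepts it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Hive.
class MutableChildrenDemo {

  // Returns true if the given list accepts an add() call, false if it is unmodifiable.
  static boolean acceptsAdd(List<String> list) {
    try {
      list.add("extra");
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    List<String> immutable = List.of();          // unmodifiable list
    List<String> mutable = new ArrayList<>();    // modifiable list

    System.out.println("List.of() accepts add: " + acceptsAdd(immutable));
    System.out.println("new ArrayList<>() accepts add: " + acceptsAdd(mutable));
  }
}
```

A caller that appends to the result of getChildren() would therefore fail at runtime if the field held a `List.of()` instance.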

Contributor

@ayushtkn if we don't explicitly convert it to an ArrayList, Kryo cannot determine the actual runtime List type of ExprNodeGenericFuncDesc.children and uses AbstractMapBasedMultimap$WrappedCollection, which throws the NPE in the deserializer in the Tez task.

Explicitly copying into an ArrayList ensures Kryo knows it's an ArrayList and won't use AbstractMapBasedMultimap$WrappedCollection, avoiding this NPE.
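
The point about the runtime list type can be illustrated without Kryo or Guava. The sketch below is only an analogy using JDK classes: a `subList` view stands in for a wrapper collection such as the one named above, and the hypothetical helper `normalize` mirrors the copy made in the constructor. It shows that a field declared as `List` can hold an object of a different runtime class, and that copying yields a plain `java.util.ArrayList`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; uses only JDK lists, no Kryo or Guava.
class RuntimeListTypeDemo {

  // Mirrors the copy in the constructor: the result is always a plain ArrayList.
  static <T> List<T> normalize(List<T> children) {
    return children == null ? new ArrayList<>() : new ArrayList<>(children);
  }

  public static void main(String[] args) {
    List<String> backing = new ArrayList<>(Arrays.asList("a", "b", "c"));
    // A view over the backing list; its runtime class is not java.util.ArrayList.
    List<String> view = backing.subList(0, 2);

    System.out.println(view.getClass().getName());
    System.out.println(normalize(view).getClass().getName());
  }
}
```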

Contributor

@nareshpr Mar 6, 2026

@thomasrebele I suspect it's more of a Kryo-Guava deserializer issue when the children object is not null. Do you think we need to convert null to empty?

Contributor Author

It is safer to avoid null for children, as there are several places without a null check, e.g., in getExprString. The children are exposed to other classes by getChildren(), so it's simpler to just use an empty list instead of adding null checks everywhere.
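
A minimal sketch of this reasoning follows. `FuncNode` is a hypothetical, heavily simplified stand-in for ExprNodeGenericFuncDesc, with strings in place of expression nodes, and its `getExprString` is an assumption about the general shape of such a method, not Hive's implementation. Once the constructor normalizes null to an empty list, code that iterates the children needs no null check.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified node; not the actual ExprNodeGenericFuncDesc.
class FuncNode {
  private final String name;
  private final List<String> children;

  FuncNode(String name, List<String> children) {
    this.name = name;
    // Normalize null to an empty list so that no caller needs a null check.
    this.children = children == null ? new ArrayList<>() : new ArrayList<>(children);
  }

  List<String> getChildren() {
    return children;
  }

  // Iterates the children without a null check, which is safe after normalization.
  String getExprString() {
    return name + "(" + String.join(", ", children) + ")";
  }
}
```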

…util.Collection.isEmpty()" because "this.delegate" is null

Based on a fix by Naresh Panchetty Ramanaiah.
@thomasrebele
Contributor Author

The test TestVectorizationContext had failed, because it changed the children list after creating the ExprNodeGenericFuncDesc. I checked the code, and this modification-after-instantiation seems to be limited to the test class. There are a few candidates that in principle could modify the list, but I don't think that happens in the code:

  • VectorizationContext#getWhenExpression: passes a sublist, which in principle could be modifiable. It seems it is only used for transforming the ExprNode to a VectorExpression
  • ExprNodeDescExprFactory#replaceFieldNamesInStruct passes the children of another ExprNodeGenericFuncDesc. The caller seems to transform the original expr node into a new one; I think the original expr will not be used afterwards
  • StatsRulesProcFactory.JoinStatsRule#process passes some object from JoinDesc#getResidualFilterExprs. AFAIK, the class StatsRulesProcFactory just visits but does not modify the expr nodes

I therefore propose to change TestVectorizationContext so that it takes into account that ExprNodeGenericFuncDesc makes a copy of the children list.
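
The effect on a test that changes the list after construction can be sketched as follows. `CopyIsolationDemo` and `copyChildren` are hypothetical names; `copyChildren` repeats the expression from the constructor. After the copy, additions to the caller's list are no longer visible in the copied list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; shows why code that mutates its input list afterwards needs adjusting.
class CopyIsolationDemo {

  // Same expression as in the constructor change: always returns a fresh ArrayList.
  static List<String> copyChildren(List<String> children) {
    return children == null ? new ArrayList<>() : new ArrayList<>(children);
  }

  public static void main(String[] args) {
    List<String> callerList = new ArrayList<>();
    callerList.add("col1");

    List<String> nodeChildren = copyChildren(callerList);

    // Added after the copy was made: visible in callerList only.
    callerList.add("col2");

    System.out.println("caller list size: " + callerList.size());
    System.out.println("copied list size: " + nodeChildren.size());
  }
}
```

Under this assumption, a test would either populate the list before constructing the node or add elements through the list returned by getChildren(). Which of the two the adjusted test uses is not stated here.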

@sonarqubecloud

sonarqubecloud bot commented Mar 9, 2026

     assert (genericUDF != null);
     this.genericUDF = genericUDF;
    -this.children = children;
    +this.children = children == null ? new ArrayList<>() : new ArrayList<>(children);
Member

@deniskuzZ Mar 10, 2026

@thomasrebele, why not use jdk21 List.of() instead?

Contributor Author

I've seen that some callers of getChildren modify the list, e.g., DynamicPartitionPruningOptimization, so I opted for new ArrayList<>() instead of List.of().
