Skip to content

Conversation

@sagiruthvik
Copy link

@sagiruthvik sagiruthvik commented Jan 11, 2026

Why?

What does this PR do?

Related issues

Does this PR introduce any user-facing change?

  • Does this PR introduce any public API change?
  • Does this PR introduce any binary protocol compatibility change?

Benchmark

private static final boolean isLittleEndian = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN;
private static final byte BITMAP = isLittleEndian ? isLittleEndianFlag : 0;
private static final short MAGIC_NUMBER = 0x62D4;
private static final AtomicInteger runtimeHashCounter = new AtomicInteger(0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need this, we could update configHash instance field direclty when invoke registerSerializer everytime. Andu compute new hash based hash of full classname of registered Serializer.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @chaokunyang
I think the intent was to keep with having the generated class suffix be a relatively small integer.
If going for a full hash of the registered serializers, then perhaps we implement hashCode() for Fory to include Config.hashCode()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suffix don't use hash, it use hash as key to store a counter for suffix.
And we should not impl hash code for Fory. Maybe we should not name this method as hash either

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. That's why we were following that pattern, but moved it to Fory to cover it being not just the initial config.
Now the incrementAndGet in the ctor and when modifying means that e.g. Fory#1 will get 1, and Fory#2 will get 2, but then Fory#2 gets a serializer added so it becomes 3.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, does this cover pooled Fory? Does that create multiple instances, so should still use the same trick, so we should use the Map in Fory but include the serializer hashes for computeIfAbsent

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pooled Fory will apply register callback, which will invoke registerSerializer API on every pooled Fory instance. All fory instance will compute to same hash value

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. So we cannot do what has been added on line 145. This approach is wrong.

You are agreeing with me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could update configHash instance field direclty when invoke registerSerializer everytime. And compute new hash based hash of full classname of registered Serializer

This will work, why don't you use this approach?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fory will never be invoked By multiple threads, we never need to use any concurrent data structure here. Please remove AtomicInteger here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the implementation to compute the hash based on registered serializers instead of using a counter. Now configHash is calculated from config.hashCode() plus the hash of all registered serializer class names. This way, pooled Fury instances with the same serializers will share the same hash, while instances with different serializers get different hashes.

@sagiruthvik sagiruthvik force-pushed the fix-codegen-config-hash branch from f19a744 to 23d71c7 Compare January 16, 2026 16:57
@sagiruthvik sagiruthvik marked this pull request as ready for review January 16, 2026 17:12
@chaokunyang chaokunyang changed the title Fix codegen config hash fix(java): fix codegen config hash when register serializers Jan 16, 2026
private final IdentityMap<Object, Object> originToCopyMap;
private int classDefEndOffset;
private int configHash;
private final Set<String> registeredSerializers = new HashSet<>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete this, we don't need to store registered serializer

return configHash;
}

private void invalidateCodegenCache() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also delete this method

Comment on lines +237 to +238
registeredSerializers.add(type.getName() + ":" + serializerClass.getName());
invalidateCodegenCache();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
registeredSerializers.add(type.getName() + ":" + serializerClass.getName());
invalidateCodegenCache();
this.configHash = Objects.hashCode(configHash, type, serializerClass);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And org.apache.fory.Fory#registerSerializer(java.lang.Class<?>, org.apache.fory.serializer.Serializer<?>) also needs updates

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants