Implement Provider Defined Types (PDT) #3433

Draft
Cole-Greer wants to merge 5 commits into master from PDT

@Cole-Greer
Contributor

Add CompositePDT (0xF0) support enabling graph providers to define custom types that serialize/deserialize seamlessly across all GLVs without driver-side configuration. Replaces the TP3 CustomTypeSerializer mechanism.

Core (gremlin-core):

  • Add @ProviderDefined annotation and immutable ProviderDefinedType POJO
  • Add ProviderDefinedTypeSerializer for GraphBinary with wire format: fully-qualified type string + fully-qualified fields map
  • Add PdtGraphSONSerializersV4 with g:CompositePdt type tag
  • Add ProviderDefinedTypeAdapter<T> and ProviderDefinedTypeRegistry with ServiceLoader discovery, recursive hydration, and graceful degradation on adapter failure
  • Integrate auto-hydration into GraphBinaryReader and GraphSONMapper
  • Add GraphBinaryWriter auto-conversion for @ProviderDefined objects
  • Cache reflection metadata per class for performance
  • Support inherited fields via superclass walking
  • Remove legacy CUSTOM(0x00) type mechanism entirely
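
For readers new to the idea, here is a minimal sketch of what annotation-based dehydration with superclass walking and a per-class reflection cache could look like. It is not taken from this PR: `DehydrationSketch`, the nested `ProviderDefined` stand-in with a single `name` attribute, and the `Point`/`Point3D` types are all hypothetical, and the real annotation and serializer may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; none of these names come from the PR's actual code.
final class DehydrationSketch {

    // Stand-in for the @ProviderDefined annotation; its real attributes are assumed.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ProviderDefined {
        String name();
    }

    // Per-class cache so the reflective field scan runs once per type.
    private static final Map<Class<?>, List<Field>> FIELD_CACHE = new ConcurrentHashMap<>();

    // Collects non-static fields from the class and every superclass below Object.
    private static List<Field> fieldsOf(Class<?> type) {
        return FIELD_CACHE.computeIfAbsent(type, t -> {
            List<Field> fields = new ArrayList<>();
            for (Class<?> c = t; c != null && c != Object.class; c = c.getSuperclass()) {
                List<Field> declared = new ArrayList<>();
                for (Field f : c.getDeclaredFields()) {
                    if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                        continue;
                    }
                    f.setAccessible(true);
                    declared.add(f);
                }
                fields.addAll(0, declared); // put superclass fields in front
            }
            return Collections.unmodifiableList(fields);
        });
    }

    // Flattens an annotated object into its type name plus a field-name to value map.
    static Map.Entry<String, Map<String, Object>> dehydrate(Object value) {
        ProviderDefined meta = value.getClass().getAnnotation(ProviderDefined.class);
        if (meta == null) {
            throw new IllegalArgumentException(value.getClass() + " is not @ProviderDefined");
        }
        Map<String, Object> props = new LinkedHashMap<>();
        try {
            for (Field f : fieldsOf(value.getClass())) {
                props.put(f.getName(), f.get(value));
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return new AbstractMap.SimpleImmutableEntry<>(meta.name(), props);
    }

    // Example types: Point3D inherits x and y from Point.
    static class Point {
        private final double x;
        private final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    @ProviderDefined(name = "x:Point3D")
    static class Point3D extends Point {
        private final double z;

        Point3D(double x, double y, double z) {
            super(x, y);
            this.z = z;
        }
    }
}
```

Calling `DehydrationSketch.dehydrate(new DehydrationSketch.Point3D(1.0, 2.0, 3.0))` yields the name `x:Point3D` and a map holding `x`, `y`, and `z`, with the inherited fields picked up by the superclass walk.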

Gremlin-lang:

  • Add PDT("name", ["key":value]) literal to ANTLR grammar
  • Server-side parser constructs ProviderDefinedType from PDT literals
  • All GLV translators emit PDT literal syntax
  • Registry-based and annotation-based auto-dehydration in translators
  • All TranslateVisitors handle PDT for cross-language translation
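
To make the literal syntax concrete, the following is a rough sketch of how a translator might render a type name and field map as a `PDT("name", ["key":value])` string. `PdtLiteralSketch` and its methods are hypothetical and are not the translators in this PR; the sketch only handles string, integer, and boolean values, and the backslash escaping and the `[:]` empty-map form are assumptions about what the grammar accepts.

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch; not the translator code from the PR.
final class PdtLiteralSketch {

    // Wraps a string in double quotes, escaping backslashes and embedded quotes.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Renders one field value; anything beyond these three types is out of scope here.
    private static String render(Object v) {
        if (v instanceof String) {
            return quote((String) v);
        }
        if (v instanceof Integer || v instanceof Boolean) {
            return String.valueOf(v);
        }
        throw new IllegalArgumentException("unsupported value in this sketch: " + v);
    }

    // Builds the PDT literal text for a type name and its fields.
    static String toLiteral(String name, Map<String, ?> fields) {
        if (fields.isEmpty()) {
            return "PDT(" + quote(name) + ", [:])"; // assumed empty-map form
        }
        StringJoiner entries = new StringJoiner(", ", "[", "]");
        for (Map.Entry<String, ?> e : fields.entrySet()) {
            entries.add(quote(e.getKey()) + ":" + render(e.getValue()));
        }
        return "PDT(" + quote(name) + ", " + entries + ")";
    }
}
```

With an insertion-ordered map of `x` to 10 and `y` to 20, `toLiteral("Point", fields)` produces `PDT("Point", ["x":10, "y":20])`, the same shape as the literal used in the integration tests.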

GLV support (all languages):

  • Python: ProviderDefinedType, serializer, registry with @provider_defined decorator, entry_points auto-discovery, registry wired through Client/DriverRemoteConnection
  • JavaScript: ProviderDefinedType, CompositePDTSerializer, registry with explicit function pair registration, client options wiring
  • Go: ProviderDefinedType struct, serializer/deserializer, PDTRegistry with struct tags, RegisterFuncs, PDTProvider interface, client wiring
  • .NET: ProviderDefinedType, CompositePDTSerializer, registry with IProviderDefinedTypeAdapter<T>, [ProviderDefined] attribute with IncludedFields/ExcludedFields, assembly scanning, IMessageSerializer.SetPdtRegistry() interface method, client/connection wiring

Server and testing:

  • PDT flows end-to-end through gremlin-server with TinkerGraph storing original objects and conversion at serialization boundary
  • Test-jar with Point, Address, Person test types for Docker server
  • Integration tests in all GLVs using gremlin-lang PDT literals
  • Traversal API tests covering raw PDT, registry hydration, and annotation-based auto-dehydration round-trips
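
As an illustration of the registry hydration these tests exercise, including the recursive hydration and graceful degradation listed under Core, here is a simplified sketch. Everything in it is hypothetical: `HydrationSketch`, `Pdt`, and `Registry` stand in for the real `ProviderDefinedType` and `ProviderDefinedTypeRegistry`, adapters are modelled as plain functions, and falling back to the generic carrier when an adapter is missing or throws is an assumption about the intended behaviour.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; not the registry implementation from the PR.
final class HydrationSketch {

    // Generic carrier: a type name plus an immutable view of its fields.
    static final class Pdt {
        final String name;
        final Map<String, Object> properties;

        Pdt(String name, Map<String, Object> properties) {
            this.name = name;
            this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(properties));
        }
    }

    static final class Registry {
        private final Map<String, Function<Map<String, Object>, Object>> adapters = new HashMap<>();

        // Registers a function that builds the concrete object from its fields.
        void register(String name, Function<Map<String, Object>, Object> fromProperties) {
            adapters.put(name, fromProperties);
        }

        // Hydrates nested PDT values first, then the value itself.
        Object hydrate(Object value) {
            if (!(value instanceof Pdt)) {
                return value;
            }
            Pdt pdt = (Pdt) value;
            Map<String, Object> hydrated = new LinkedHashMap<>();
            for (Map.Entry<String, Object> e : pdt.properties.entrySet()) {
                hydrated.put(e.getKey(), hydrate(e.getValue()));
            }
            Function<Map<String, Object>, Object> adapter = adapters.get(pdt.name);
            if (adapter == null) {
                return new Pdt(pdt.name, hydrated); // no adapter: keep the generic form
            }
            try {
                return adapter.apply(hydrated);
            } catch (RuntimeException ex) {
                return new Pdt(pdt.name, hydrated); // adapter failed: degrade gracefully
            }
        }
    }

    // Example concrete type a driver-side adapter might produce.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }
}
```

A caller would register an adapter for a name such as `x:Point` and pass deserialized values through `hydrate`; values whose type has no adapter, or whose adapter throws, come back as the generic `Pdt` instead of failing the whole response.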

Also: documentation (docs/src/dev/provider/pdt.asciidoc), CHANGELOG
entry, .dockerignore update for test-jar.
Comment thread docs/src/dev/provider/pdt.asciidoc Outdated
}
----

The `name` attribute is a unique identifier for the type, typically namespaced (e.g. `"x:Point"`). By default, all
Contributor

i think it would be good to at least suggest guidance on the namespacing we'd like. i assume, ":Point" or something.

Comment thread docs/src/dev/provider/pdt.asciidoc Outdated
Map<String, Object> props = pdt.getProperties(); // {x: 1.0, y: 2.0}
----

==== Optional Hydration
Contributor

So many tiny subsections - necessary?

Comment thread docs/src/dev/provider/pdt.asciidoc Outdated
Implement `ProviderDefinedTypeAdapter<T>` and register via ServiceLoader (same mechanism as the provider side). The
driver discovers adapters on the classpath and hydrates automatically.

===== Python (planned)
Contributor

"planned"? not actually implemented yet or is this just a mistake in the docs? same question for .NET...

if not implemented, why is it present for some languages but not others? when will that be available?

Comment thread docs/src/dev/provider/pdt.asciidoc Outdated
----

[[pdt-wire-format-graphbinary]]
=== Wire Format (GraphBinary V4)
Contributor

isn't this just GraphBinary documentation? link to that instead of placing it here?

Comment thread docs/src/dev/provider/pdt.asciidoc Outdated
The `fields` value is a typed map following standard GraphSON map encoding.

[[pdt-migration-from-tp3]]
=== Migrating from TP3 Custom Types
Contributor

i think we need Provider Upgrade Documentation that calls attention to PDTs.

Comment thread docs/src/dev/provider/pdt.asciidoc Outdated
[[pdt-for-graph-providers]]
=== For Graph Providers

==== Basic Usage
Contributor

I think examples from the Gremlin perspective would be useful here - basically using your example types in a query to help complete the full model.

Any reason there's no Reference Docs? It's probably up to providers to provide specifics about their custom types, but isn't there a use case with TinkerGraph? Maybe use Color as an example as we've done in the past.

client.submit(
"g.addV('location').property('point', PDT(\"Point\", [\"x\":10, \"y\":20])).iterate()").all().get();

// Retrieve via bytecode traversal (gremlin-lang)
Contributor

bytecode?

}
}

@Test
Contributor

i see we don't have a test outside of Groovy for using Point directly in Gremlin. In practice, can you not just do that? Is it some test environment limitations that prevents this? i've forgotten how this works maybe...
