Skip to content

fix(server): keep schema ~create_time userdata as Date after reload#3026

Open
dpol1 wants to merge 1 commit into
apache:masterfrom
dpol1:fix/3013-create-time-userdata-roundtrip
Open

fix(server): keep schema ~create_time userdata as Date after reload#3026
dpol1 wants to merge 1 commit into
apache:masterfrom
dpol1:fix/3013-create-time-userdata-roundtrip

Conversation

@dpol1
Copy link
Copy Markdown
Contributor

@dpol1 dpol1 commented May 14, 2026

  • close [Bug] schema ~create_time userdata round-trips Date as String after cache reload #3013

    Userdata.CREATE_TIME ("~create_time") is written as java.util.Date
    by SchemaTransaction.setCreateTimeIfNeeded(), then persisted through backend
    serializers as JSON. On reload, userdata is deserialized through raw
    Map.class paths, so Jackson cannot restore the value type and
    schemaElement.userdata().get(Userdata.CREATE_TIME) can come back as String.

    That breaks callers and existing tests that expect ~create_time to keep the
    server-side Date contract after a schema is reloaded from backend storage.

    Main Changes

    • Added a single SchemaElement.normalizeUserdataValue(key, value) helper.

      • Converts Userdata.CREATE_TIME string values back with DateUtil.parse(...).
      • Leaves existing Date values and all other userdata keys unchanged.
      • Throws a clearer IllegalArgumentException if a ~create_time string is invalid.
    • Routed both userdata setters through the helper:

      • userdata(String, Object) covers serializer reload paths such as
        TextSerializer, BinarySerializer, MysqlSerializer, and
        CassandraSerializer.
      • userdata(Userdata) covers schema fromMap() hydration paths for
        PropertyKey, VertexLabel, EdgeLabel, and IndexLabel.
    • Added a null check to userdata(Userdata) for consistent argument validation.

    • Added regression coverage for schema userdata normalization and serializer
      round trips.

    This PR is intentionally limited to the server-side SchemaElement. The mirror
    class in hugegraph-struct has a similar userdata shape and can be handled in a
    separate follow-up PR.

    Verifying these changes

    • Trivial rework / code cleanup without any test coverage. (No Need)
    • Already covered by existing tests, such as (please modify tests here).
    • Need tests and can be verified as follows:
      • SchemaElementTest covers single-key normalization, bulk normalization,
        idempotency for Date, invalid ~create_time strings, null bulk userdata,
        and non-CREATE_TIME keys.
      • TextSerializerTest#testPropertyKeyUserdataCreateTimeRoundTripsAsDate
        verifies the text serializer schema round trip keeps CREATE_TIME as Date.
      • BinarySerializerTest#testPropertyKeyUserdataCreateTimeRoundTripsAsDate
        verifies the binary serializer schema round trip keeps CREATE_TIME as Date.
      • The new test class is registered in UnitTestSuite.

    Does this PR potentially affect the following parts?

    The public SchemaElement.userdata(...) signatures are unchanged. The behavior
    change is limited to internal userdata normalization for Userdata.CREATE_TIME.

    Documentation Status

    • Doc - TODO
    • Doc - Done
    • Doc - No Need

Notes

  • This PR fixes the server-side schema reload path for Userdata.CREATE_TIME (~create_time) only.
  • ~create_time is a server-reserved userdata key; valid string values are normalized back to java.util.Date, while malformed values now fail with IllegalArgumentException.
  • Follow-up: [Bug] Preserve typed DEFAULT_VALUE and struct userdata after JSON reload #3028 tracks preserving typed Userdata.DEFAULT_VALUE (~default_value) reloads and checking the same raw userdata deserialization gap in hugegraph-struct.

Normalize server-side schema ~create_time userdata in SchemaElement so serializer reloads and fromMap paths keep the Date contract.

Add SchemaElement, TextSerializer, and BinarySerializer coverage.

Closes apache#3013
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working tests Add or improve test cases labels May 14, 2026
@imbajin imbajin requested a review from Copilot May 14, 2026 14:06
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a schema reload regression where internal schema userdata Userdata.CREATE_TIME ("~create_time") can round-trip through backend JSON serializers as a String and come back as String after reload, violating the server-side contract that it should be a java.util.Date.

Changes:

  • Added SchemaElement.normalizeUserdataValue(key, value) to re-type ~create_time string values back into Date (and throw a clearer error on invalid values).
  • Routed both SchemaElement.userdata(String, Object) and SchemaElement.userdata(Userdata) through the normalization logic (and added a null-argument check for the bulk setter).
  • Added regression tests for normalization behavior and for Text/Binary serializer round-trips, and registered them in UnitTestSuite.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/schema/SchemaElement.java Normalizes ~create_time userdata on set (single and bulk), ensuring reload paths restore Date instead of String.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/core/SchemaElementTest.java Unit coverage for normalization behavior, idempotency, invalid strings, and null bulk userdata validation.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/serializer/TextSerializerTest.java Regression test verifying TextSerializer schema round-trip keeps CREATE_TIME as Date.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/serializer/BinarySerializerTest.java Regression test verifying BinarySerializer schema round-trip keeps CREATE_TIME as Date.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/UnitTestSuite.java Registers the new unit tests in the suite.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed correctness, design, and tests. No blocking issues — fix is correct and narrowly scoped, all backend reload paths (Text / Binary / Mysql / Cassandra / HBase / Postgres / Palo / Hstore + the four fromMap paths) route through the new helper, and DateUtil.parse is symmetric with the "yyyy-MM-dd HH:mm:ss.SSS" format used by JsonUtil/HugeGraphSONModule. Inline comments cover important refinements (helper placement, public setter docs, test depth). A few additional minor notes below that don't have a clean inline target:

  • 🧹 removeUserdata(Userdata) (SchemaElement.java:114-118) lacks a null check — adding E.checkArgumentNotNull(userdata, ...) keeps the bulk setters symmetric with the new check on userdata(Userdata).
  • 🧹 Behavior change worth a release note: any REST client that posts ~create_time as a JSON string in user-data now silently receives a Date (or an IllegalArgumentException if malformed). ~create_time is a server-reserved key (~ prefix, written by SchemaTransaction.setCreateTimeIfNeeded), so risk is small, but flagging it in the PR body / changelog is helpful.
  • 🧹 Follow-up tracking: Userdata.DEFAULT_VALUE has the same Jackson type-erasure problem (e.g., a property key whose data type is Date will reload its default value as String). The PR description already defers the hugegraph-struct mirror — please add DEFAULT_VALUE to the same follow-up list, since the root cause and fix shape are identical.

* a raw {@code Map}. This method restores the original type after
* deserialization. Idempotent for values already of the expected type.
*/
private static Object normalizeUserdataValue(String key, Object value) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Consider moving normalizeUserdataValue onto Userdata itself (next to CREATE_TIME / DEFAULT_VALUE constants and the existing check() validator). Three benefits:

  1. Cohesion — the rule is about userdata semantics, not about SchemaElement.
  2. Reuse for the announced follow-up — the PR notes that hugegraph-struct has a mirror Userdata with the same shape; if normalization lives on Userdata, the struct fix becomes a one-line delegation.
  3. Future userdata readers — anything that hydrates a Userdata from raw JSON gets the fix for free, not just the two SchemaElement setters.

Not blocking — the current placement works. But shifting it would prevent re-implementing this for every userdata-bearing class.

@@ -99,11 +100,37 @@ public Map<String, Object> userdata() {
public void userdata(String key, Object value) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ The two public userdata(...) setters now silently coerce String → Date for ~create_time and throw IllegalArgumentException on a malformed string. The detailed Javadoc lives on the private helper, where IDE tooltips / public-API readers won't see it.

A one-line note on each public setter (or on the class header) calling out the implicit conversion would prevent surprise for callers reading the public API.


@Test
public void testSingleSetterLeavesOtherStringKeysUntouched() {
SchemaElement schema = newSchema();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ The PR specifically wraps DateUtil.parse(...)'s RuntimeException as the cause of the thrown IllegalArgumentException. Worth locking that contract in:

Suggested change
SchemaElement schema = newSchema();
}, e -> {
Assert.assertContains(Userdata.CREATE_TIME, e.getMessage());
Assert.assertContains("not-a-date", e.getMessage());
Assert.assertNotNull(e.getCause());
});

Protects against a future refactor that drops the cause chain (which would silently degrade error diagnostics).

}, e -> {
Assert.assertContains("userdata", e.getMessage());
});
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Consider one more test that asserts schema.userdata(Userdata.CREATE_TIME, null) still throws IllegalArgumentException from the existing E.checkArgumentNotNull(value, ...).

Reason: the new helper sits between the null check and the put. A future refactor that reorders these (e.g., parse first, null-check second) would silently break the null contract — currently nothing pins it down for the CREATE_TIME key specifically.

Assert.assertTrue("CREATE_TIME should be a Date after round-trip, " +
"was " + (value == null ? "null" : value.getClass()),
value instanceof Date);
Assert.assertEquals(created, value);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Two coverage gaps worth one small change each:

  1. Only PropertyKey is round-tripped. VertexLabel, EdgeLabel, and IndexLabel reach the helper through the bulk setter (fromMapuserdata(new Userdata(...))), which is a different control path than readPropertyKey's per-entry single setter. Adding even one assertion for any of those three would harden coverage cheaply.
  2. Near-duplicate test classes. This file and the new BinarySerializerTest#testPropertyKeyUserdataCreateTimeRoundTripsAsDate differ only in serializer type. A @Parameterized test or a shared helper in BaseUnitTest taking Function<HugeConfig, AbstractSerializer> would remove ~50 lines and stop the two from drifting.

Neither blocks merge; (1) is the more valuable of the two.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:L This PR changes 100-499 lines, ignoring generated files. tests Add or improve test cases

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] schema ~create_time userdata round-trips Date as String after cache reload

3 participants