[#10445] improvement(lance): Upgrade lance-core to v2.0.1 and lance-namespace-core to v0.4.5 #10637

Open
bbiiaaoo wants to merge 3 commits into apache:main from bbiiaaoo:lance-catalog

@bbiiaaoo bbiiaaoo commented Apr 1, 2026

What changes were proposed in this pull request?

  • Upgraded Lance dependencies to the new coordinates and versions:
    • org.lance:lance-core:2.0.1
    • org.lance:lance-namespace-core:0.4.5
  • Migrated Lance-related imports and model usages from com.lancedb.* to org.lance.*.
  • Adapted code paths to upstream API changes, including mode handling (Enum -> String) and updated builder-style APIs.
  • Updated Lance namespace/table operations, exception mapping, and serialization compatibility in Gravitino Lance modules.
  • Added internal compatibility utilities (ObjectIdentifier, PageUtil, CommonUtil, JsonArrowSchemaConverter) to align with the new Lance namespace SDK behavior.
  • Updated Lance REST service docs and examples to match new dependencies and request fields.
  • Updated related unit/integration tests to cover the new APIs.

Why are the changes needed?

  • Lance upstream migrated artifacts/packages from com.lancedb to org.lance and introduced breaking API changes in newer versions.
  • Without this upgrade, Gravitino Lance modules may face dependency incompatibility and runtime/API mismatch issues.
  • This patch keeps Gravitino Lance integrations compatible with the latest Lance ecosystem while preserving expected REST behaviors.

Fix: #10445

Does this PR introduce any user-facing change?

  • Documentation/examples are updated to use org.lance dependencies and the new request style (for example, lowercase mode values such as create).
  • No new Gravitino public API is introduced.

How was this patch tested?

  • Ran unit tests and integration tests for catalog-lakehouse-generic, lance-common, and lance-rest-server.
  • Completed end-to-end verification with curl -> Gravitino requests, covering Lance namespace and table workflows on the upgraded dependencies.
  • This round focused on the impacted modules; the full-repo test matrix was not executed.

github-actions Bot commented Apr 2, 2026

Code Coverage Report

Overall Project 65.04% -0.42% 🟢
Files changed 40.88% 🔴

| Module | Coverage |
| --- | --- |
| aliyun | 1.73% 🔴 |
| api | 47.14% 🟢 |
| authorization-common | 85.96% 🟢 |
| aws | 1.1% 🔴 |
| azure | 2.6% 🔴 |
| catalog-common | 10.2% 🔴 |
| catalog-fileset | 80.02% 🟢 |
| catalog-hive | 80.98% 🟢 |
| catalog-jdbc-clickhouse | 79.06% 🟢 |
| catalog-jdbc-common | 42.89% 🟢 |
| catalog-jdbc-doris | 80.28% 🟢 |
| catalog-jdbc-hologres | 54.03% 🟢 |
| catalog-jdbc-mysql | 79.23% 🟢 |
| catalog-jdbc-oceanbase | 78.38% 🟢 |
| catalog-jdbc-postgresql | 82.05% 🟢 |
| catalog-jdbc-starrocks | 78.27% 🟢 |
| catalog-kafka | 77.01% 🟢 |
| catalog-lakehouse-generic | 45.14% -12.63% 🟢 |
| catalog-lakehouse-hudi | 79.1% 🟢 |
| catalog-lakehouse-iceberg | 87.16% 🟢 |
| catalog-lakehouse-paimon | 77.71% 🟢 |
| catalog-model | 77.72% 🟢 |
| cli | 44.51% 🟢 |
| client-java | 77.63% 🟢 |
| common | 49.35% 🟢 |
| core | 81.42% +0.2% 🟢 |
| filesystem-hadoop3 | 76.97% 🟢 |
| flink | 40.55% 🟢 |
| flink-runtime | 0.0% 🔴 |
| gcp | 14.2% 🔴 |
| hadoop-common | 10.39% 🔴 |
| hive-metastore-common | 45.82% 🟢 |
| iceberg-common | 50.73% 🟢 |
| iceberg-rest-server | 65.82% 🟢 |
| integration-test-common | 0.0% 🔴 |
| jobs | 66.17% 🟢 |
| lance-common | 19.6% -46.8% 🔴 |
| lance-rest-server | 63.02% +63.02% 🟢 |
| lineage | 53.02% 🟢 |
| optimizer | 82.87% 🟢 |
| optimizer-api | 21.95% 🔴 |
| server | 85.89% 🟢 |
| server-common | 70.3% 🟢 |
| spark | 32.79% 🔴 |
| spark-common | 39.09% 🔴 |
| trino-connector | 33.83% 🔴 |
Files

| Module | File | Coverage |
| --- | --- | --- |
| catalog-lakehouse-generic | LanceTableOperations.java | 26.92% 🔴 |
| core | LancePartitionStatisticStorage.java | 96.68% 🟢 |
| lance-common | SerializationUtils.java | 100.0% 🟢 |
| lance-common | LanceNamespaceOperations.java | 0.0% 🔴 |
| lance-common | LanceTableOperations.java | 0.0% 🔴 |
| lance-common | CommonUtil.java | 0.0% 🔴 |
| lance-common | GravitinoLanceNameSpaceOperations.java | 0.0% 🔴 |
| lance-common | GravitinoLanceNamespaceWrapper.java | 0.0% 🔴 |
| lance-common | GravitinoLanceTableAlterHandler.java | 0.0% 🔴 |
| lance-common | GravitinoLanceTableOperations.java | 0.0% 🔴 |
| lance-common | JsonArrowSchemaConverter.java | 0.0% 🔴 |
| lance-common | ObjectIdentifier.java | 0.0% 🔴 |
| lance-common | PageUtil.java | 0.0% 🔴 |
| lance-rest-server | LanceTableOperations.java | 96.88% 🟢 |
| lance-rest-server | LanceNamespaceOperations.java | 88.89% 🟢 |
| lance-rest-server | LanceExceptionMapper.java | 74.42% 🟢 |

…ance-namespace-core to v0.4.5

- Bump com.lancedb:lance-core to org.lance:lance-core v2.0.1
- Bump com.lancedb:lance-namespace-core to org.lance:lance-namespace-core v0.4.5
- Refactor all package imports from com.lancedb to org.lance
- Adapt LanceTableOperations where ModeEnum was replaced by String
- Fix corresponding unit and IT tests to adapt to the new APIs
Contributor

Copilot AI left a comment

Pull request overview

Upgrades Gravitino’s Lance integration to the new org.lance artifact coordinates/versions and adapts REST server + common operations to upstream breaking API/protocol changes (models, builders, mode handling, error mapping), along with doc/test updates.

Changes:

  • Upgraded Lance dependencies to org.lance:lance-core:2.0.1 and org.lance:lance-namespace-core:0.4.5, migrating imports/packages from com.lancedb.* to org.lance.*.
  • Updated Lance REST server endpoints and exception mapping to the new namespace SDK semantics (string modes, new error codes, additional root routes).
  • Added/updated compatibility utilities and refactored tests/integration tests and docs to match the new client/server behavior.

Reviewed changes

Copilot reviewed 26 out of 26 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| lance/lance-rest-server/src/test/java/org/apache/gravitino/lance/service/rest/TestLanceNamespaceOperations.java | Updates REST resource tests for new Lance models/error codes/mode strings. |
| lance/lance-rest-server/src/test/java/org/apache/gravitino/lance/service/rest/TestGravitinoLanceTableOperations.java | Updates unit tests for alter/drop column request model changes. |
| lance/lance-rest-server/src/test/java/org/apache/gravitino/lance/integration/test/LanceRESTServiceIT.java | Refactors IT to new org.lance client APIs and avoids reflection for TableApi. |
| lance/lance-rest-server/src/main/java/org/apache/gravitino/lance/service/rest/LanceTableOperations.java | Adapts table REST endpoints to new SDK (string modes, new models, error behavior). |
| lance/lance-rest-server/src/main/java/org/apache/gravitino/lance/service/rest/LanceNamespaceOperations.java | Adapts namespace REST endpoints and adds “root” routes (/list, /describe). |
| lance/lance-rest-server/src/main/java/org/apache/gravitino/lance/service/LanceExceptionMapper.java | Remaps Gravitino exceptions to org.lance exception hierarchy + HTTP status mapping. |
| lance/lance-rest-server/build.gradle.kts | Updates excludes for new groupId and adds test dependency as needed. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/utils/SerializationUtils.java | Reimplements header properties JSON parsing using Gravitino’s Jackson utilities. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/LanceTableOperations.java | Updates common table ops interface to new SDK types/signatures (mode as string). |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/LanceNamespaceOperations.java | Updates common namespace ops interface to new SDK types/signatures (mode/behavior as string). |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/PageUtil.java | Adds internal pagination helper for list APIs. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/ObjectIdentifier.java | Adds internal identifier parsing helper compatible with new SDK behavior. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/JsonArrowSchemaConverter.java | Adds Arrow-schema-to-JSON conversion for new JsonArrowSchema models. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/GravitinoLanceTableOperations.java | Updates Gravitino-backed table ops implementation for new models/error types/mode normalization. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/GravitinoLanceTableAlterHandler.java | Updates alter-columns handling for AlterColumnsEntry model changes. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/GravitinoLanceNamespaceWrapper.java | Updates exception mapping when validating/loading catalogs with new SDK exceptions. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/GravitinoLanceNameSpaceOperations.java | Updates namespace operations + adds mode/behavior parsing compatible with string inputs. |
| lance/lance-common/src/main/java/org/apache/gravitino/lance/common/ops/gravitino/CommonUtil.java | Adds internal stacktrace formatting utility (replacement for upstream util). |
| lance/lance-common/build.gradle.kts | Updates excludes for new groupId and adds Lance core dependency. |
| gradle/libs.versions.toml | Bumps Lance/Lance-namespace versions and switches to org.lance coordinates. |
| docs/lance-rest-service.md | Updates docs/examples for new dependency coordinates and request fields. |
| core/src/main/java/org/apache/gravitino/stats/storage/LancePartitionStatisticStorage.java | Migrates to new org.lance dataset/fragment builder APIs. |
| core/build.gradle.kts | Updates dependency exclusion to new groupId. |
| catalogs/catalog-lakehouse-generic/src/test/java/org/apache/gravitino/catalog/lakehouse/lance/TestLanceTableOperations.java | Updates unit tests for new index API (IndexOptions). |
| catalogs/catalog-lakehouse-generic/src/test/java/org/apache/gravitino/catalog/lakehouse/lance/integration/test/CatalogGenericCatalogLanceIT.java | Updates ITs for new dataset open/write APIs. |
| catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java | Migrates table ops to new Lance dataset/index APIs. |
Comments suppressed due to low confidence (3)

docs/lance-rest-service.md:286

  • The create-empty-table curl example still sends properties in the JSON body, but the server implementation now reads table properties only from the x-lance-table-properties header for /create-empty. Update the documentation to match the actual API behavior, or (preferably) keep body properties supported for backward compatibility.
# Create a new empty table
curl -X POST http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table02/create-empty \
  -H 'Content-Type: application/json' \
  -d '{
    "id": ["lance_catalog", "schema", "table02"],
    "location": "/tmp/lance_catalog/schema/table02",
    "properties": { "description": "This is table02"  }
  }'  
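
In the curl example above, the path segment `lance_catalog%24schema%24table02` is the same identifier as the `id` array in the body, joined with `$` (URL-encoded as `%24`). A minimal sketch of that split follows. `TableIdSplitter` is a hypothetical name, not the PR's `ObjectIdentifier`, and treating an empty id as the root namespace is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not the PR's ObjectIdentifier): splits a delimited object id. */
final class TableIdSplitter {
  private TableIdSplitter() {}

  /**
   * Splits an already URL-decoded id on the literal delimiter.
   * Assumption: a null or empty id denotes the root namespace and yields no levels.
   */
  static List<String> split(String id, String delimiter) {
    if (id == null || id.isEmpty()) {
      return Collections.emptyList();
    }
    // Pattern.quote stops "$" from being read as a regex anchor;
    // the -1 limit keeps trailing empty segments instead of silently dropping them.
    return Arrays.asList(id.split(Pattern.quote(delimiter), -1));
  }
}
```

Calling `TableIdSplitter.split("lance_catalog$schema$table02", "$")` gives the three levels catalog, schema, and table.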

docs/lance-rest-service.md:354

  • In the Java example, CreateTableRequest sets id and mode but does not set a table location (or any equivalent). Since the REST server requires the location (via header or request field depending on client), this snippet is likely to fail as written. Please update the example to include the location in the supported way for the new SDK.
// Create a table with schema inferred from Arrow IPC file
CreateTableRequest createTableRequest = new CreateTableRequest();
createTableRequest.setId(Lists.newArrayList("lance_catalog", "schema", "table03"));
createTableRequest.setMode("create");
org.apache.arrow.vector.types.pojo.Schema schema =
        new org.apache.arrow.vector.types.pojo.Schema(
                Arrays.asList(
                        Field.nullable("id", new ArrowType.Int(32, true)),
                        Field.nullable("value", new ArrowType.Utf8())));
byte[] body = ArrowUtils.generateIpcStream(schema);
ns.createTable(createTableRequest, body);

docs/lance-rest-service.md:397

  • In the Python example, CreateTableRequest sets id and mode but omits location. Unless the new client implicitly supplies location another way, this example will fail to create a table. Update the snippet to include the required location (or document the correct mechanism for supplying it).
# Create a table with schema inferred from Arrow IPC file
create_table_request = ln.CreateTableRequest(
    id=['lance_catalog', 'schema', 'table03'],
    mode='create'
)
with open('schema.ipc', 'rb') as f:
    body = f.read()

ns.create_table(create_table_request, body)

Contributor

FANNG1 commented Apr 3, 2026

@bbiiaaoo could you fix the CI and address the review comment to make this PR ready for review?

Contributor

FANNG1 commented Apr 7, 2026

@yuqi1129 @roryqi could you help check whether this is heading in the right direction?

@bbiiaaoo bbiiaaoo force-pushed the lance-catalog branch 2 times, most recently from 261bb70 to 5f1ac2b Compare April 7, 2026 07:25
Contributor

yuqi1129 commented Apr 7, 2026

@bbiiaaoo
Is this PR ready for review?

…0.4.5 and polish validations/tests

- use explicit root namespace id ("") for /v1/namespace/list and /v1/namespace/describe
- keep create-empty backward compatibility: accept body properties and merge with header properties (header wins)
- clarify alter_columns validation message when rename is missing
- update Lance REST Java doc example to use "uri"/"delimiter" connection keys
- catch JsonProcessingException (instead of generic Exception) in SerializationUtils
- add TestSerializationUtils for valid/blank/invalid JSON cases
- replace FQN TableNotFoundException usages in TestLanceNamespaceOperations with import + simple name
- extend REST tests for root namespace endpoints and create-empty location/properties behaviors
@bbiiaaoo bbiiaaoo closed this Apr 8, 2026
@bbiiaaoo bbiiaaoo reopened this Apr 8, 2026
Author

bbiiaaoo commented Apr 8, 2026

Hi @yuqi1129 @FANNG1,

All previous review comments have been addressed, and I have verified that all CI workflows are now passing.
This PR is now ready for review. Thanks!

@bbiiaaoo bbiiaaoo marked this pull request as ready for review April 8, 2026 08:51
# Create a new empty table
curl -X POST http://localhost:9101/lance/v1/table/lance_catalog%24schema%24table02/create-empty \
-H 'Content-Type: application/json' \
-H "x-lance-table-properties: {\"description\":\"This is table02\"}" \
Contributor

Is the header x-lance-table-properties optional or required?

Author

x-lance-table-properties is optional for create-empty. If omitted, it defaults to an empty map. Request-body properties are still accepted for backward compatibility (header wins on key conflicts). I will clarify this in the documentation in the next commit.
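
The merge rule described in this reply (header optional, body properties still accepted, header wins on conflicts) can be sketched as follows. `TablePropertiesMerger` is a hypothetical name, and both maps are assumed to have been parsed into string maps already; this is not the PR's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of "header wins" merging for create-empty table properties. */
final class TablePropertiesMerger {
  private TablePropertiesMerger() {}

  /**
   * Merges legacy request-body properties with properties from the
   * x-lance-table-properties header. Either argument may be null (absent),
   * in which case it contributes nothing.
   */
  static Map<String, String> merge(
      Map<String, String> bodyProperties, Map<String, String> headerProperties) {
    Map<String, String> merged = new LinkedHashMap<>();
    if (bodyProperties != null) {
      merged.putAll(bodyProperties);
    }
    if (headerProperties != null) {
      // Applied last, so header values overwrite body values for the same key.
      merged.putAll(headerProperties);
    }
    return merged;
  }
}
```

With no header and no body properties the result is an empty map, which matches the default described above.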


private CommonUtil() {}

static String formatCurrentStackTrace() {
Contributor

What is the method used for? Can Throwables.getStackTraceAsString() meet your needs?

Author

formatCurrentStackTrace() is used to populate the Lance exception detail field. Good point on utility reuse; I will switch it to Throwables.getStackTraceAsString(...) in the next commit.
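
For reference, a helper of this kind needs only a few lines of standard library code, sketched below. `StackTraces` is a hypothetical name. Guava's `Throwables.getStackTraceAsString(Throwable)` returns the same kind of string, so in a codebase that already depends on Guava the local helper is unnecessary.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical stdlib-only helper that renders a throwable's stack trace as text. */
final class StackTraces {
  private StackTraces() {}

  /** Returns the output of printStackTrace() as a string. */
  static String asString(Throwable throwable) {
    StringWriter buffer = new StringWriter();
    // printStackTrace writes the throwable's toString() first, then one line per frame.
    throwable.printStackTrace(new PrintWriter(buffer, true));
    return buffer.toString();
  }
}
```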

private final GravitinoLanceNamespaceWrapper namespaceWrapper;
private final GravitinoClient client;

private enum CreateMode {
Contributor

Why do we need to add those three enums? Can't we use enums in the Lance API?

Author

In lance-namespace 0.4.5, mode/behavior are now plain strings in generated request models (no public enums to reuse). We keep local enums only as normalized internal states for type-safe branching after parsing input strings.
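
A minimal sketch of that pattern: the request carries a plain string, which is parsed once into a local enum so that later branching is type-safe. The enum constants other than `CREATE`, the default for a missing mode, and the `replacesExisting` helper are assumptions for illustration and are not taken from the PR.

```java
import java.util.Locale;

/** Hypothetical sketch: string mode from the request, local enum for internal branching. */
final class CreateModes {
  /** Assumed set of modes; only "create" is confirmed by the PR description. */
  enum CreateMode { CREATE, EXIST_OK, OVERWRITE }

  private CreateModes() {}

  /** Parses the request string once; a missing mode falls back to CREATE (assumption). */
  static CreateMode parse(String mode) {
    if (mode == null || mode.trim().isEmpty()) {
      return CreateMode.CREATE;
    }
    switch (mode.trim().toUpperCase(Locale.ROOT)) {
      case "CREATE":
        return CreateMode.CREATE;
      case "EXIST_OK":
        return CreateMode.EXIST_OK;
      case "OVERWRITE":
        return CreateMode.OVERWRITE;
      default:
        throw new IllegalArgumentException("Unsupported create mode: " + mode);
    }
  }

  /** Example of branching on the enum instead of comparing strings again. */
  static boolean replacesExisting(CreateMode mode) {
    switch (mode) {
      case OVERWRITE:
        return true;
      case CREATE:
      case EXIST_OK:
      default:
        return false;
    }
  }
}
```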

@PathParam("id") String tableId,
@QueryParam("delimiter") @DefaultValue(NAMESPACE_DELIMITER_DEFAULT) String delimiter,
CreateEmptyTableRequest request,
Map<String, Object> requestBody,
Contributor

Why is the value type of requestBody Object?

Author

Good question. We intentionally use Map<String, Object> here for backward compatibility.

In lance-namespace 0.4.5, CreateEmptyTableRequest does not include a properties field, but we still accept legacy request-body properties from existing clients. Using Object allows us to safely normalize non-string JSON values (for example numbers/booleans) into strings before passing them downstream.
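
The normalization step mentioned here could look roughly like the sketch below, which turns a deserialized `Map<String, Object>` into string properties. `LegacyProperties` is a hypothetical name, and skipping `null` values is an assumption; the PR may handle them differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: normalizes loosely typed JSON property values to strings. */
final class LegacyProperties {
  private LegacyProperties() {}

  /**
   * Converts each value with String.valueOf, so numbers and booleans from the
   * JSON body become "3", "true", and so on. Null values are skipped (assumption).
   */
  static Map<String, String> toStringMap(Map<String, Object> raw) {
    Map<String, String> normalized = new LinkedHashMap<>();
    if (raw == null) {
      return normalized;
    }
    for (Map.Entry<String, Object> entry : raw.entrySet()) {
      if (entry.getValue() != null) {
        normalized.put(entry.getKey(), String.valueOf(entry.getValue()));
      }
    }
    return normalized;
  }
}
```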

* @return the response of the create table operation
*/
@SuppressWarnings("deprecation")
CreateEmptyTableResponse createEmptyTable(
Contributor

I recall that createEmptyTable has been replaced by declareTable. Do you plan to support it?

Author

Thanks for the reminder, you are absolutely right.

createEmptyTable is deprecated in lance-namespace 0.4.5, and declareTable is the preferred API now. We plan to add declareTable support in the next commit, while keeping create-empty for backward compatibility with existing clients.

import org.lance.namespace.model.JsonArrowSchema;

/** Converts Arrow schema to Lance Namespace JsonArrowSchema model. */
class JsonArrowSchemaConverter {
Contributor

Is this class directly copied from Lance repo?

Author

It is not directly copied from the current Lance repo as-is.

During the upgrade, the old utility class we previously relied on was no longer available from the new Java artifacts, so we added a local converter in Gravitino to preserve compatible JsonArrowSchema behavior for this code path.

return Stream.concat(setPropertiesStream, removePropertiesStream).toArray(arrayCreator);
}

private static CreateMode parseCreateMode(String instance, String mode) {
Contributor

This one seems to be duplicated with normalizeCreateMode. Can you try to optimize it?

Author

Good catch. There is duplicated normalization logic between namespace parseCreateMode and table normalizeCreateMode.

They are not fully identical because namespace mode is converted to an internal enum, while table mode is normalized to a string property passed downstream (and register mode also accepts REGISTER as an alias).

I’ll optimize this by extracting the shared token-normalization logic and reusing it in both paths, while keeping the behavior-specific mappings unchanged.
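
One way to structure that refactoring is sketched below: a single token normalizer plus a generic `parse` helper, with the namespace path mapping to an enum and the table path returning a validated string. Everything here is hypothetical: the class name, the mode values, the uppercase canonical form, and the defaults are assumptions, and the `REGISTER` alias handling is left out because its mapping is not described in the thread.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of shared token normalization reused by two mode parsers. */
final class ModeTokens {
  /** Assumed namespace create modes. */
  enum NamespaceCreateMode { CREATE, EXIST_OK, OVERWRITE }

  /** Assumed table create modes, kept as strings because they are passed downstream. */
  private static final Set<String> TABLE_MODES =
      new HashSet<>(Arrays.asList("CREATE", "EXIST_OK", "OVERWRITE"));

  private ModeTokens() {}

  /** Shared step: trim and upper-case; null becomes the empty token. */
  static String normalizeToken(String raw) {
    return raw == null ? "" : raw.trim().toUpperCase(Locale.ROOT);
  }

  /** Shared step: apply the default for a missing token, otherwise delegate to the mapper. */
  static <T> T parse(String raw, T defaultValue, Function<String, T> mapper) {
    String token = normalizeToken(raw);
    return token.isEmpty() ? defaultValue : mapper.apply(token);
  }

  /** Namespace path: the token becomes an internal enum. */
  static NamespaceCreateMode namespaceMode(String raw) {
    return parse(raw, NamespaceCreateMode.CREATE, NamespaceCreateMode::valueOf);
  }

  /** Table path: the token stays a string after validation. */
  static String tableMode(String raw) {
    return parse(raw, "CREATE", token -> {
      if (!TABLE_MODES.contains(token)) {
        throw new IllegalArgumentException("Unsupported table mode: " + raw);
      }
      return token;
    });
  }
}
```

Both paths share the trimming, case folding, and default handling, while each keeps its own mapping.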

… mode normalization

- add `declareTable` to `LanceTableOperations` and implement it in
  `GravitinoLanceTableOperations` via metadata-only create flow
- add REST endpoint `POST /v1/table/{id}/declare` with request handling and validation
- deprecate `createEmptyTable` in common ops API and document `DeclareTable` as preferred
- update docs:
  - add `DeclareTable` endpoint entry and usage example
  - clarify `x-lance-table-properties` is optional for `create-empty`
- extract shared token normalization to `CommonUtil.normalizeToken(...)`
  and reuse it in namespace/table mode parsing to remove duplication
- replace manual stacktrace building with
  `Throwables.getStackTraceAsString(...)`
- extend REST unit tests and integration tests for declare-table behavior
@bbiiaaoo bbiiaaoo closed this Apr 9, 2026
@bbiiaaoo bbiiaaoo reopened this Apr 9, 2026
| DeregisterTable | Unregister a table from a namespace (metadata only, data remains) | POST | `/lance/v1/table/{id}/deregister` | 1.1.0 |
| CreateEmptyTable | Declare a table and store the metadata without touching lance table data, for more, please refer to [doc](https://docs.lancedb.com/api-reference/rest/table/create-an-empty-table) | POST | `/lance/v1/table/{id}/create-empty` | 1.1.0 |
| CreateEmptyTable | **Deprecated**: Use `DeclareTable` instead. Declare a table and store the metadata without touching lance table data, for more, please refer to [doc](https://docs.lancedb.com/api-reference/rest/table/create-an-empty-table) | POST | `/lance/v1/table/{id}/create-empty` | 1.1.0 |
| DeclareTable | Declare a table and store the metadata without touching lance table data. This is the preferred replacement for `CreateEmptyTable`. | POST | `/lance/v1/table/{id}/declare` | 1.1.0 |
Contributor

The version should be 1.3.0

Contributor

yuqi1129 commented Apr 9, 2026

@FANNG1 @roryqi
Would you also like to take a look?

@yuqi1129
Contributor

@bbiiaaoo

  1. Have you tested it with LanceSpark and LanceRay at the corresponding versions?
  2. You may also need to update the Compatibility Matrix part in the document lance-rest-integration.md.

return DropMode.FAIL;
}
String normalized = CommonUtil.normalizeToken(mode);
if ("FAIL".equals(normalized)) {
Contributor

Could you use DropMode.valueOf(normalized)?
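
A sketch of what the suggested `valueOf` variant might look like, replacing the chain of string comparisons. Only `FAIL` appears in the snippet under review; the `SKIP` constant, the default for a missing mode, and the inline normalization (standing in for `CommonUtil.normalizeToken`) are assumptions.

```java
import java.util.Locale;

/** Hypothetical sketch of drop-mode parsing built on Enum.valueOf. */
final class DropModes {
  /** FAIL is taken from the reviewed snippet; SKIP is assumed for illustration. */
  enum DropMode { FAIL, SKIP }

  private DropModes() {}

  static DropMode parse(String mode) {
    if (mode == null || mode.trim().isEmpty()) {
      return DropMode.FAIL; // assumed default when no mode is given
    }
    String normalized = mode.trim().toUpperCase(Locale.ROOT);
    try {
      return DropMode.valueOf(normalized);
    } catch (IllegalArgumentException e) {
      // valueOf's own message names the enum class; rethrow with the caller's input instead.
      throw new IllegalArgumentException("Unsupported drop mode: " + mode, e);
    }
  }
}
```

The trade-off is that `valueOf` accepts every constant of the enum, so adding a constant widens the accepted input without any change to the parser.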

Development

Successfully merging this pull request may close these issues.

[Improvement] Align Gravitino Lance REST module with latest lance-namespace and ensure backward compatibility

5 participants