Core, AWS, REST: Promote the S3 signing endpoint to the main spec #15112

Open

adutra wants to merge 17 commits into apache:main from adutra:promote-sign-endpoint

Conversation

Contributor

@adutra adutra commented Jan 22, 2026

Dev ML discussion: https://lists.apache.org/thread/2kqdqb46j7jww36wwg4txv6pl2hqq9w7

This commit promotes the S3 remote signing endpoint from an AWS-specific implementation to a first-class REST catalog API endpoint.

This enables other storage providers (GCS, Azure, etc.) to eventually reuse the same signing endpoint pattern without duplicating the API definition.

OpenAPI Specification changes:

  • Add /v1/{prefix}/namespaces/{namespace}/tables/{table}/sign/{provider} endpoint to the main REST catalog OpenAPI spec
  • Define RemoteSignRequest, RemoteSignResult and RemoteSignResponse schemas
  • Remove the separate s3-signer-open-api.yaml from the AWS module
  • Update the Python client

Core Module changes (iceberg-core):

  • Add RemoteSignRequest and RemoteSignResponse model classes, copied from the iceberg-aws module
  • Add RemoteSignRequestParser and RemoteSignResponseParser for JSON serialization, copied from the iceberg-aws module
  • Add SIGNER_URI and SIGNER_ENDPOINT properties to CatalogProperties for configuring the signing endpoint
  • Add V1_TABLE_REMOTE_SIGN field and remoteSign() method to ResourcePaths
  • Register the new endpoint in Endpoint.java
  • Add abstract RemoteSignerServlet base class for remote signing tests, copied from the iceberg-aws module

AWS Module changes (iceberg-aws):

  • Deprecate S3SignRequest and S3SignResponse for removal
  • Deprecate S3SignRequestParser and S3SignResponseParser for removal
  • Deprecate S3ObjectMapper for removal
  • Refactor S3SignerServlet to extend RemoteSignerServlet
  • Update S3V4RestSignerClient
  • Move relevant tests to iceberg-core
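
For illustration, here is a minimal sketch of what a client-side call to the promoted endpoint could look like. The endpoint path follows the layout added in this PR, while the catalog URI, prefix, namespace, table name and payload field names (method, uri, headers) are assumptions based on the RemoteSignRequest schema discussed below, not a verbatim copy of the spec:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: shows the shape of a sign call against the new REST endpoint.
// Host names, prefix/namespace/table and the JSON field names are illustrative.
public class RemoteSignSketch {

  public static void main(String[] args) throws Exception {
    String catalogUri = "https://catalog.example.com";            // hypothetical catalog
    String signPath = "/v1/prod/namespaces/db/tables/sales/sign"; // prefix/namespace/table are examples

    // Hypothetical payload: the storage request the client wants the catalog to sign.
    String payload =
        "{"
            + "\"method\": \"GET\","
            + "\"uri\": \"https://my-datalake.s3.us-east-1.amazonaws.com/warehouse/db/sales/data/file.parquet\","
            + "\"headers\": {\"Content-Type\": [\"application/octet-stream\"]}"
            + "}";

    HttpRequest request =
        HttpRequest.newBuilder(URI.create(catalogUri + signPath))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer <token>") // auth is handled by the REST client in practice
            .POST(HttpRequest.BodyPublishers.ofString(payload))
            .build();

    // The response would carry the signed URI and/or signed headers to replay against the object store.
    HttpResponse<String> response =
        HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.statusCode() + " " + response.body());
  }
}
```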

5XX:
$ref: '#/components/responses/ServerErrorResponse'

/v1/{prefix}/namespaces/{namespace}/tables/{table}/sign/{provider}:
Contributor

{provider}: why do we need that? A table would ideally be in one object store? If there are multiple, that's fine too; I believe we give the absolute path of the URI, right?

Contributor Author

I added this, because if/when a catalog server eventually has remote signing available for more than one object storage provider (say, S3 and Azure), it would be good if the server could determine how exactly to sign the request. Without this path parameter, the server would need to apply some heuristics to determine the right object store provider, and hence how to sign the request.

Contributor

the server would need to apply some heuristics to determine the right object store provider

I didn't get this part: we give the path we want signed from the client to the server as part of the payload of this request, right? Can't we extract that from there? (Are you concerned with s3 / s3a / s3n semantics?)

Contributor Author

It's not that easy.

As an example, a request to sign looks like the one below for S3:

PUT /warehouse/db/sales_table/data/date=2024-05/00022-44-55.parquet HTTP/1.1
Host: my-datalake.s3.us-east-1.amazonaws.com
Date: Fri, 24 May 2024 12:45:00 GMT
Content-Length: 134217728
Content-Type: application/octet-stream

A similar request to GCP would look like:

POST /upload/storage/v1/b/my-datalake-bucket/o?uploadType=media&name=warehouse/db/sales/data/file.parquet HTTP/1.1
Host: storage.googleapis.com
Date: Fri, 24 May 2024 12:45:00 GMT
Content-Length: 134217728
Content-Type: application/octet-stream

And for Azure:

PATCH /my-container/warehouse/db/sales/data/file.parquet?action=append&position=0 HTTP/1.1
Host: my-datalake.dfs.core.windows.net
x-ms-date: Fri, 24 May 2024 12:45:00 GMT
x-ms-version: 2023-11-03
Content-Length: 134217728
Content-Type: application/octet-stream

The question is: how do you know the object storage provider so that the server can pick the right signing algorithm? The only (heuristic) way is to inspect the Host header, but that's brittle. It's much simpler if the client tells the server what object storage provider to use.
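
As an aside, here is a tiny sketch of what such a Host-based heuristic could look like, purely to illustrate why it is brittle: the host suffixes below are illustrative, and S3-compatible gateways, VPC endpoints or private DNS names defeat the check entirely.

```java
import java.util.Locale;

// Sketch only: a naive provider guess from the Host header, not part of the proposal.
public class HostHeuristicSketch {

  static String guessProvider(String host) {
    String h = host.toLowerCase(Locale.ROOT);
    if (h.endsWith(".amazonaws.com")) {
      return "s3";
    } else if (h.endsWith("storage.googleapis.com")) {
      return "gcs";
    } else if (h.endsWith(".dfs.core.windows.net") || h.endsWith(".blob.core.windows.net")) {
      return "adls";
    }
    // MinIO, on-prem gateways, S3-compatible endpoints, etc. all end up here.
    return "unknown";
  }

  public static void main(String[] args) {
    System.out.println(guessProvider("my-datalake.s3.us-east-1.amazonaws.com")); // s3
    System.out.println(guessProvider("minio.internal.example.com"));             // unknown
  }
}
```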

Contributor

I am talking about the sign request from the IRC client to the IRC server; I believe what you are showing is the IRC server to object store signing? Am I missing something?
Like, the IRC client will do a POST to /v1/{prefix}/namespaces/{namespace}/tables/{table}/sign with the uri as a param:
https://github.com/apache/iceberg/pull/15112/changes#diff-02549ca620d020dc9ead80088cc14e311e12a69651fa8d394cd41a4308debb2eR4725

I think this would be an absolute path, right? s3:////table/data/a.parquet

Contributor Author

OK, I'm fine with that. Let's remove the provider parameter. @singhpk234 would that work for you?

Contributor

+1 from my end too to keep the provider in the body of the request rather than in the URL!

Contributor

Thanks a lot for working on this, @alaturqua.

I’m leaning toward option 3 since it covers more use cases without requiring a mandatory provider in the path. The catalog implementation can still decide how to behave based on whether provider is present or whether it can reliably infer it from the input uri.

When the provider is explicitly provided, I think the catalog should honor it. Making it optional also keeps things backward compatible while leaving room for more complex scenarios, like supporting multiple storage providers.

3. Include the provider parameter in the request body
a. the client impl is responsible for generating it
b. must be optional for backwards compatibility
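
A minimal sketch of option 3 from the catalog's point of view, assuming a hypothetical resolveProvider helper (not an Iceberg API): honor an explicitly supplied provider, otherwise try to infer one from the URI scheme.

```java
import java.net.URI;
import java.util.Optional;

// Sketch only: class and method names are hypothetical illustrations of option 3.
public class OptionalProviderSketch {

  static String resolveProvider(Optional<String> requestedProvider, URI storageUri) {
    // When the client explicitly names the provider, the catalog honors it.
    if (requestedProvider.isPresent()) {
      return requestedProvider.get();
    }
    // Otherwise fall back to inference, e.g. from the URI scheme.
    switch (storageUri.getScheme()) {
      case "s3":
      case "s3a":
      case "s3n":
        return "s3";
      case "gs":
        return "gcs";
      case "abfs":
      case "abfss":
        return "adls";
      default:
        throw new IllegalArgumentException("Cannot infer signing provider for: " + storageUri);
    }
  }

  public static void main(String[] args) {
    System.out.println(resolveProvider(Optional.of("s3"), URI.create("https://example.com/x")));
    System.out.println(resolveProvider(Optional.empty(), URI.create("s3://bucket/table/data/a.parquet")));
  }
}
```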

Contributor Author

OK let me implement option 3 then. Thank you all for the feedback!

Contributor Author

Option 3 implemented, FYI.


If remote signing for a specific storage provider is enabled, clients must respect the following configurations when creating a remote signer client:
- `signer.uri`: the base URI of the remote signer endpoint. Optional; if absent, defaults to the catalog's base URI.
- `signer.endpoint`: the path of the remote signer endpoint. Required. Should be concatenated with `signer.uri` to form the complete URI.
Contributor

SHOULD or MUST ?

Contributor Author

It's complicated 😄

The signer client impl uses org.apache.iceberg.rest.RESTUtil#resolveEndpoint to perform the concatenation of signer.uri and signer.endpoint.

So, signer.endpoint could also be an absolute URL, in which case, signer.uri would be ignored.

I will try to come up with a better wording.
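
For reference, here is a rough sketch of the resolution rule described above; this is not the actual RESTUtil#resolveEndpoint implementation, just the "absolute endpoint wins, otherwise concatenate with signer.uri (which defaults to the catalog URI)" behavior in isolation.

```java
import java.net.URI;

// Sketch only: a simplified stand-in for the resolution semantics, not Iceberg code.
public class SignerEndpointResolutionSketch {

  static URI resolve(String signerUri, String signerEndpoint) {
    URI endpoint = URI.create(signerEndpoint);
    if (endpoint.isAbsolute()) {
      // e.g. signer.endpoint=https://signer.example.com/v1/sign -> signer.uri is ignored
      return endpoint;
    }
    // e.g. signer.uri=https://catalog.example.com + signer.endpoint=/v1/.../sign
    String base = signerUri.endsWith("/") ? signerUri.substring(0, signerUri.length() - 1) : signerUri;
    String path = signerEndpoint.startsWith("/") ? signerEndpoint : "/" + signerEndpoint;
    return URI.create(base + path);
  }

  public static void main(String[] args) {
    System.out.println(resolve("https://catalog.example.com", "/v1/prefix/namespaces/db/tables/t/sign"));
    System.out.println(resolve("https://catalog.example.com", "https://signer.example.com/v1/sign"));
  }
}
```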

Contributor Author

Rephrased, lmk what you think!

allOf:
- $ref: '#/components/schemas/Expression'

MultiValuedMap:
Contributor

I believe this is the equivalent of the S3Headers section in the s3 signer spec? Can we call it something like ObjectStoreProviderHeader?

Contributor Author

I went for a more generic name because there is nothing specific to remote signing here. This component could just as well be used for something else in the spec.

- `s3.secret-access-key`: secret for credentials that provide access to data in S3
- `s3.session-token`: if present, this value should be used for as the session token
- `s3.remote-signing-enabled`: if `true` remote signing should be performed as described in the `s3-signer-open-api.yaml` specification
- `s3.remote-signing-enabled`: if `true` remote signing should be performed as described in the `RemoteSignRequest` schema section of this spec document.
Contributor Author

FYI I chose to keep this property specific to S3. I think that even if the signer endpoint is now generic, enablement should be performed for each specific object storage.

Contributor

You can actually access Google Cloud Storage via its S3 gateway; same signing algorithm, just a few different settings to change the listing version, endpoint, &c

https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/third_party_stores.md#google-cloud-storage-through-the-s3a-connector

public String baseSignerUri() {
return properties().getOrDefault(S3_SIGNER_URI, properties().get(CatalogProperties.URI));
return properties()
.getOrDefault(CatalogProperties.SIGNER_URI, properties().get(CatalogProperties.URI));
Contributor

Isn't this breaking existing behavior? One could have provided s3.signer.uri, but now we don't read that property anymore and rely on signer.uri. The same for the endpoint.

Contributor Author

I don't think so, since the new client introduced in 1.11 is resilient to older servers.

Let's break this into concrete situations:

| Server | Client | Signer config | Resulting URI |
| --- | --- | --- | --- |
| 1.10 | 1.11 | s3.signer.uri=..., s3.signer.endpoint=... | s3.signer.uri + s3.signer.endpoint |
| 1.10 | 1.11 | s3.signer.uri=... | s3.signer.uri + default endpoint |
| 1.10 | 1.11 | s3.signer.endpoint=... | catalog URI + s3.signer.endpoint |
| 1.10 | 1.11 | (empty) | catalog URI + default endpoint |
| 1.11 | 1.10 | signer.uri=..., signer.endpoint=..., s3.signer.uri=..., s3.signer.endpoint=... | s3.signer.uri + s3.signer.endpoint |
| 1.11 | 1.10 | signer.endpoint=..., s3.signer.endpoint=... | catalog URI + s3.signer.endpoint |
| 1.11 | 1.10 | signer.uri=..., signer.endpoint=... | catalog URI + default endpoint |
| 1.11 | 1.10 | signer.endpoint=... | catalog URI + default endpoint |

So in summary:

  • 1.11 Clients won't break older servers.
  • 1.11 Servers won't break older clients if and only if they include both signer.* and s3.signer.* properties for backwards compatibility (which they all should do).

However, once support for the deprecated s3.signer.* properties is removed (1.12), newer clients would break older servers (<= 1.10). If that's concerning, we could e.g. wait a few more minor releases before removal.

Contributor

@nastra nastra Jan 26, 2026

shouldn't we be reading S3_SIGNER_URI first and only then fall back to CatalogProperties.SIGNER_URI and then to CatalogProperties.URI?

Contributor Author

You are right, I planned for it and forgot to implement 🤦‍♂️

Contributor Author

Fixed, and also added a unit test to verify the precedence behavior.
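
For reference, a sketch of the agreed precedence using plain string keys rather than the real S3FileIOProperties / CatalogProperties constants: the deprecated s3.signer.uri is honored first, then signer.uri, then the catalog uri.

```java
import java.util.Map;

// Sketch only: illustrates the lookup order discussed above, not the actual Iceberg code.
public class SignerUriPrecedenceSketch {

  static String baseSignerUri(Map<String, String> properties) {
    String uri = properties.get("s3.signer.uri"); // deprecated, still wins for compatibility
    if (uri == null) {
      uri = properties.get("signer.uri");         // new generic property
    }
    if (uri == null) {
      uri = properties.get("uri");                // catalog URI as the final fallback
    }
    return uri;
  }

  public static void main(String[] args) {
    System.out.println(baseSignerUri(Map.of("uri", "https://catalog.example.com")));
    System.out.println(
        baseSignerUri(
            Map.of(
                "uri", "https://catalog.example.com",
                "signer.uri", "https://signer.example.com")));
  }
}
```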

* @deprecated since 1.11.0, will be removed in 1.12.0; use {@link CatalogProperties#SIGNER_URI}
* instead.
*/
@Deprecated public static final String S3_SIGNER_URI = CatalogProperties.SIGNER_URI;
Contributor

I don't think we can just change the value here as that would break backwards compatibility

"true",
CatalogProperties.URI,
uri,
CatalogProperties.SIGNER_ENDPOINT,
Contributor

this wasn't needed before but is needed now, which indicates that this is a breaking change for users?

Contributor Author

For now it's not required, but it will become required in a future release (1.12 or later).

There is a check + warning here:

https://github.com/adutra/iceberg/blob/f3fc09595bc47d4f0ba7a879d3c7eaf3d70c1e38/aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java#L219-L224

I proactively updated the tests so that they don't break when we make this property required.


paths:

/v1/aws/s3/sign:
Contributor

I don't think we would want to remove this spec yet. We should probably first deprecate it

}
}

public static class RemoteSignRequestSerializer<T extends RemoteSignRequest>
Contributor

these all should probably just be package-private and not public

gen.writeEndArray();
}
gen.writeEndObject();
public static void headersToJson(
Contributor

not sure whether we need to make this one and the one below public

default:
throw new UnsupportedOperationException("Unsupported grant_type: " + grantType);
protected void validateSignRequest(RemoteSignRequest request) {
if ("POST".equalsIgnoreCase(request.method()) && request.uri().getQuery().contains("delete")) {
Contributor

Minor: should we use REST constants instead of the raw string "POST"?

Contributor Author

Sorry, I went back to raw strings because org.apache.iceberg.rest.HttpMethod is missing constants for PUT, OPTIONS, etc., which was causing integration tests to fail.

5XX:
$ref: '#/components/responses/ServerErrorResponse'

/v1/{prefix}/namespaces/{namespace}/tables/{table}/sign/{provider}:
Contributor

that would require support for request properties

Not necessarily, mainly because it's something the client can infer on its own; for example, on the read/write path the client can just send back s3 as the provider while making the request? We just need to define this field in the POST then?

* @param response the HTTP response to add headers to
*/
protected void addSignResponseHeaders(RemoteSignRequest request, HttpServletResponse response) {
if (request.method().equalsIgnoreCase("GET") || request.method().equalsIgnoreCase("HEAD")) {
Contributor

We previously had this defined in a CACHEABLE_METHODS set, so it would be good to keep that for easier readability.
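
For illustration, the CACHEABLE_METHODS style being asked for here might look like the following sketch: a named constant instead of chained equalsIgnoreCase checks. The method set is only what this thread mentions.

```java
import java.util.Locale;
import java.util.Set;

// Sketch only: not the original S3SignerServlet constant, just the shape of it.
public class CacheableMethodsSketch {

  private static final Set<String> CACHEABLE_METHODS = Set.of("GET", "HEAD");

  static boolean isCacheable(String method) {
    return CACHEABLE_METHODS.contains(method.toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(isCacheable("get"));  // true
    System.out.println(isCacheable("POST")); // false
  }
}
```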

Preconditions.checkArgument(
properties().containsKey(S3_SIGNER_URI) || properties().containsKey(CatalogProperties.URI),
properties().containsKey(S3_SIGNER_URI)
|| properties().containsKey(RESTCatalogProperties.SIGNER_URI)
Contributor

please add some tests for these new properties to testS3RemoteSignerWithoutUri()

Contributor Author

Added, but the tests look very similar to the ones in TestS3V4RestSignerClient.legacySignerProperties().

Contributor

Sorry, I actually meant to only add tests to testS3RemoteSignerWithoutUri(), which verifies that the error message "S3 signer service URI is required" is properly thrown when any of these new props are missing. No need to duplicate TestS3V4RestSignerClient.legacySignerProperties() into TestS3FileIOProperties.

Contributor Author

But the error is only thrown when all of these props are missing. So the existing test is imho already testing sufficiently. Wdyt?

}

/**
* The base URI of the remote signer endpoint. Optional, defaults to {@linkplain
Contributor

nit: why not just {@link CatalogProperties#URI} here?


lint-spec:
uv run yamllint --strict rest-catalog-open-api.yaml
uv run yamllint --strict ../aws/src/main/resources/s3-signer-open-api.yaml
Contributor

I think we should still leave this in until we actually remove the openapi spec file

* @deprecated since 1.11.0, will be removed in 1.12.0; use {@link
* org.apache.iceberg.rest.requests.RemoteSignRequestParser} instead.
*/
@Deprecated
Contributor

I actually liked that you switched the impl here to using functionality from the new RemoteSignRequestParser; not sure why you decided to change that.

Contributor Author

OK 😅 I thought it was a bit too invasive. Let me go back to the previous version.

* @deprecated since 1.11.0, will be removed in 1.12.0; the serializers for S3 signing are now
* registered in {@link RESTSerializers}.
*/
@Deprecated
Contributor

I would probably just deprecate this class and not switch it to using RESTSerializers.registerAll(MAPPER).

Contributor

@nastra nastra left a comment

overall LGTM, but left a few minor comments that would be good to address

@adutra adutra force-pushed the promote-sign-endpoint branch from 7ee473b to 90e4eba Compare January 30, 2026 16:02
Contributor Author

adutra commented Jan 30, 2026

Heads up: I had to rebase because of a conflict with fec9800.

@adutra adutra force-pushed the promote-sign-endpoint branch from 0fcc2da to ee33e03 Compare January 30, 2026 16:46
Contributor Author

adutra commented Jan 30, 2026

Rebased again because of another conflict with b2f312f.

Contributor

@steveloughran steveloughran left a comment

Commented.

Looking at the signer servlet I'm wondering how over-enthusiastic about signing it is; it probably needs more validation of requests so the javadoc should warn of this.

properties: dict[str, str] | None = None
body: str | None = Field(
None,
description='Optional body of the request to send to the signing API. This should only be populated for requests which do not have the relevant data in the URI itself (e.g. DeleteObjects requests)',
Contributor

This implies that the body of PUT and multipart uploads is to be included. It doesn't have to be, provided the checksum information is supplied as a signed header. This avoids needlessly uploading many GB of data to the service.

Not sure about whether DeleteObjects qualifies here. I think technically it'd be OK to just sign the checksum, but given the damage the request can do, a production signing service SHOULD validate the entire request and every path in the body.

This'd argue for a list of the commands for which the body MUST be included (DeleteObjects), and those where it SHOULD be omitted (see the sketch after this list):

  • PUT
  • UploadPart
  • CompleteMultipartUpload
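
A sketch of that split, using the S3 operation names mentioned in this thread. Note this is the suggested policy, not the current client behavior, which only includes the body for DeleteObjects.

```java
// Sketch only: a policy helper illustrating "body MUST be included vs SHOULD be omitted".
public class SignBodyPolicySketch {

  enum S3Operation {
    DELETE_OBJECTS,
    PUT_OBJECT,
    UPLOAD_PART,
    COMPLETE_MULTIPART_UPLOAD
  }

  static boolean includeBodyInSignRequest(S3Operation op) {
    switch (op) {
      case DELETE_OBJECTS:
        // The body names every path being deleted, so the signer should see and validate it.
        return true;
      case PUT_OBJECT:
      case UPLOAD_PART:
      case COMPLETE_MULTIPART_UPLOAD:
        // Potentially many GB of data; sign a content checksum header instead of shipping the body.
        return false;
      default:
        return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(includeBodyInSignRequest(S3Operation.DELETE_OBJECTS)); // true
    System.out.println(includeBodyInSignRequest(S3Operation.PUT_OBJECT));     // false
  }
}
```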

Contributor Author

The current description is admittedly a bit too normative, and biased towards S3 on top of that. It should probably be revisited. But again, let's focus on promoting existing functionality from aws to core in this PR, and tackle enhancements to both the spec and test infra in follow-up PRs, wdyt?

Contributor Author

I'd also note that the S3 signer client currently only includes the body for DeleteObjects. PUT operations in particular do not have the body included.

Contributor

I agree the signer doesn't put the big binaries, but the spec implies they should, so should at least say "requests where the body of the message contains content which must be validated before a request is signed, such as the S3 DeleteObjects call".

FWIW I don't know which of the other stores have operations with large payloads other than their own equivalents of DeleteObjects, otherwise operations are mutating HTTP where custom headers are used.

Contributor Author

That's a fair point. I rephrased the description with your suggestion.

if (HttpMethod.POST.name().equalsIgnoreCase(request.method())
&& request.uri().getQuery().contains("delete")) {
String body = request.body();
Preconditions.checkArgument(
Contributor

needs to be escalated to a 400 response somehow

Contributor Author

This PR doesn't change the behavior of this servlet, just moves some portions of it from the aws module to the core module. I'm open to enhancements to this servlet, but I think they should be introduced separately.

public class S3SignerServlet extends RemoteSignerServlet {

static final Clock SIGNING_CLOCK = Clock.fixed(Instant.now(), ZoneId.of("UTC"));
static final Set<String> UNSIGNED_HEADERS =
Contributor

Good time to review the headers which mustn't be used in signing, to assist in caching.

Can you add "Referer" to this list; it's used in the s3a audit log, and if people work off this list they won't include it.

It's almost worth pulling this out into a list of headers to skip (a sketch follows below):

  • User-Agent
  • Referer (yes, that's the spelling)
  • any others people know of where they'd taint caching and are harmless
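
The pulled-out list might look like the following sketch. Only User-Agent and Referer come from this thread, and this is not the current UNSIGNED_HEADERS content in S3SignerServlet.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: a case-insensitive "headers to skip when signing" constant.
public class UnsignedHeadersSketch {

  static final Set<String> UNSIGNED_HEADERS = caseInsensitive("User-Agent", "Referer");

  private static Set<String> caseInsensitive(String... headers) {
    Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    Collections.addAll(set, headers);
    return set;
  }

  public static void main(String[] args) {
    System.out.println(UNSIGNED_HEADERS.contains("referer"));    // true, lookups are case-insensitive
    System.out.println(UNSIGNED_HEADERS.contains("x-amz-date")); // false, still part of the signature
  }
}
```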

Contributor

adding Referer would be something that should be done in a separate PR if necessary, but not here, since the functionality should stay the same

Contributor

@steveloughran steveloughran left a comment

thx for the change.

@adutra adutra force-pushed the promote-sign-endpoint branch from 914bb9b to 852f1cb Compare February 10, 2026 17:12
for requests where the body of the message contains content which must be validated before a request is
signed, such as the S3 DeleteObjects call.
provider:
type: string
Contributor

Should we define an ENUM for this? As s3a / s3e ... can create confusion for the catalog, can we say all of these are s3? Similarly for GCS / Azure; check ResolvingFileIO, I believe we do something similar there.

Contributor Author

An enum means we need a spec change for every new provider – unless we go ahead and create enums for all major cloud providers?

Contributor

unless we go ahead an create enums for all major cloud providers?

That's what we had in mind: let's say s3 / azure / gcp, and we can add more on a case-by-case basis. An open string still leaves room for interpretation in the catalog, I think, but I can be convinced otherwise :)

Contributor Author

OK, I'm fine with that! I just introduced the enum; the constants are s3, gcs and adls (these names match the naming convention used by ResolvingFileIO).
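
For illustration, a Java-side sketch of such an enum with the s3, gcs and adls constants. The spec itself presumably encodes these as lowercase string values in the request body, so the shape below is only illustrative.

```java
// Sketch only: not the spec's actual representation, just the three constants named above.
public class SignProviderSketch {

  enum Provider {
    S3("s3"),
    GCS("gcs"),
    ADLS("adls");

    private final String value;

    Provider(String value) {
      this.value = value;
    }

    /** The lowercase wire value sent in the sign request. */
    String value() {
      return value;
    }

    static Provider fromValue(String value) {
      for (Provider provider : values()) {
        if (provider.value.equalsIgnoreCase(value)) {
          return provider;
        }
      }
      throw new IllegalArgumentException("Unknown sign provider: " + value);
    }
  }

  public static void main(String[] args) {
    System.out.println(Provider.fromValue("s3")); // S3
    System.out.println(Provider.GCS.value());     // gcs
  }
}
```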

Contributor

@singhpk234 singhpk234 left a comment

Spec changes look good to me.
Added a suggestion for the provider being an ENUM.
