Skip to content

feat: add user metadata support for RLS policies#825

Merged
fenos merged 25 commits intomasterfrom
tyler/STORAGE-258/user-metadata-in-rls
Mar 26, 2026
Merged

feat: add user metadata support for RLS policies#825
fenos merged 25 commits intomasterfrom
tyler/STORAGE-258/user-metadata-in-rls

Conversation

@TylerHillery
Copy link
Copy Markdown
Contributor

@TylerHillery TylerHillery commented Jan 23, 2026

What kind of change does this PR introduce?

resolves #757 by adding support for referencing user_metadata field in RLS policies for inserts

What is the current behavior?

Currently user_metadata is not passed into canUpload checks, so RLS policies can't reference it during file uploads.

What is the new behavior?

Now user_metadata is passed through all upload paths (standard HTTP, S3, TUS, multipart) so RLS policies can use it.

Additional context

  • The main change I made was to canUpload to accept userMetatdata. I updated all callers to pass in userMetadata. Because upload gets called in prepareUpload which gets called in upload I also updated all upload calls to ensure userMetadata is passed through. I found it difficult to ensure I caught every cause though, e.g. it was very easy to miss uploadNewObject needed to be updated in the uploadFromRequest. I'm open to feedback to see if there is a more robust solution we could implement that could be caught with typescript types.
  • The part I found most confusing to update was the canUpload checks for multipart uploads. It required changing the order of certain operations around so that user_metadata was available. Wasn't entirely sure if these changes were okay.
  • Added some context as to why I moved userMetadata from the FileUpload interface to the UploadRequest but wasn't sure if this was the right approach so would appreciate some feedback

@coveralls
Copy link
Copy Markdown

coveralls commented Jan 23, 2026

Pull Request Test Coverage Report for Build 21612994975

Details

  • 135 of 139 (97.12%) changed or added relevant lines in 9 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage increased (+0.04%) to 75.596%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/http/routes/object/getSignedUploadURL.ts 18 20 90.0%
src/http/routes/tus/lifecycle.ts 22 24 91.67%
Files with Coverage Reduction New Missed Lines %
src/storage/protocols/s3/s3-handler.ts 1 83.59%
Totals Coverage Status
Change from base Build 21589555306: 0.04%
Covered Lines: 26008
Relevant Lines: 34122

💛 - Coveralls

Comment thread src/storage/uploader.ts
Comment thread src/http/routes/tus/lifecycle.ts Outdated
@TylerHillery TylerHillery marked this pull request as ready for review January 23, 2026 23:27
@TylerHillery TylerHillery requested a review from a team January 23, 2026 23:27
@fenos
Copy link
Copy Markdown
Contributor

fenos commented Jan 26, 2026

@TylerHillery awesome! Can we also pass in the metadata into the canUpload?
User metadata is metadata passed by the user where metadata is the file metadata such as size, content-type etc...

@TylerHillery
Copy link
Copy Markdown
Contributor Author

@TylerHillery awesome! Can we also pass in the metadata into the canUpload? User metadata is metadata passed by the user where metadata is the file metadata such as size, content-type etc...

I thought we couldn't do this because we don't get the objectMetadata until the object has been uploaded. We check canUpload before uploading to object storage to test RLS. Can you confirm if this is the case?

We could switch the order but then we have to ensure to delete the object if user can't upload and we would be do the "expensive" part uploading when we didn't have to. Much quicker to test RLS first and skip uploading if we don't have to.

Here is the upload method on the Uploader class with notes:

  /**
   * Extracts file information from the request and upload the buffer
   * to the remote storage
   * @param request
   * @param options
   */
  async upload(request: UploadRequest) {
   // NOTE: canUpload() gets called inside prepare upload
    const version = await this.prepareUpload(request)

    try {
      const file = request.file

      const s3Key = this.location.getKeyLocation({
        tenantId: this.db.tenantId,
        bucketId: request.bucketId,
        objectName: request.objectName,
      })
        // NOTE: we don't get object metadata until here
      const objectMetadata = await this.backend.uploadObject(
        storageS3Bucket,
        s3Key,
        version,
        file.body,
        file.mimeType,
        file.cacheControl,
        request.signal
      )

      if (request.file.xRobotsTag) {
        objectMetadata.xRobotsTag = request.file.xRobotsTag
      }

      if (file.isTruncated()) {
        throw ERRORS.EntityTooLarge()
      }

      return this.completeUpload({
        ...request,
        version,
        objectMetadata: objectMetadata,
        userMetadata: { ...request.userMetadata },
      })
    } catch (e) {
      await ObjectAdminDelete.send({
        name: request.objectName,
        bucketId: request.bucketId,
        tenant: this.db.tenant(),
        version: version,
        reqId: this.db.reqId,
      })
      throw e
    }
  }

@fenos
Copy link
Copy Markdown
Contributor

fenos commented Jan 27, 2026

I thought we couldn't do this because we don't get the objectMetadata until the object has been uploaded.
You are correct on this, but potentially we could rely on the:

  • content-length of the request for the file size
  • content-type for the mimeType

Other fields like etag, lastModified, it's fine if they are empty on the first RLS check

.from(bucketName)
.signUploadObjectUrl(objectName, urlPath as string, uploadSignedUrlExpirationTime, owner, {
upsert: request.headers['x-upsert'] === 'true',
userMetadata: userMetadata,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't actually support userMetadata on upload signed urls (which we should) so I'm glad you found this.

I believe we need to store the userMetadata in the returned JWT payload; otherwise, this information will be lost during the signedUpload request. We need to be careful, though, if we decide to store the metadata in the JWT, because it can be abused with large payloads. We can for now restrict the amount of data that we can store in the JWT to something like 1MB i think we already have a constant MAX_CUSTOM_METADATA_SIZE, which can be reused (re-refactored as a validation function, potentially in limits.ts)

But we also need to add the second part (on the upload route) to store the userMetadata received from the JWT. This way, the metadata can't be tempered.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do a follow up PR to handle the upload part.

@TylerHillery TylerHillery force-pushed the tyler/STORAGE-258/user-metadata-in-rls branch from 57b96fb to 05a322a Compare February 3, 2026 01:24
@TylerHillery
Copy link
Copy Markdown
Contributor Author

I thought we couldn't do this because we don't get the objectMetadata until the object has been uploaded.
You are correct on this, but potentially we could rely on the:

  • content-length of the request for the file size
  • content-type for the mimeType

Other fields like etag, lastModified, it's fine if they are empty on the first RLS check

I have updated canUpload to accept a metadata parameter that gets passed to the RLS check. Only content-length and content-typ get passed through.

Multipart uploads required a little more work as the metadata field was currently present on in s3_multipart_uploads so I had to create migration.

@TylerHillery TylerHillery requested a review from fenos February 3, 2026 01:46
Comment thread src/http/routes/object/getSignedUploadURL.ts Outdated
Comment thread src/http/routes/tus/lifecycle.ts Outdated
Comment thread src/storage/uploader.ts Outdated
bucketId: string
objectName: string
file: FileUpload
userMetadata: Record<string, unknown> | undefined
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we diverge from optionality? userMetadata?: Record<string, unknown>

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While implementing this change I found it very easy to forget to pass userMetadata somewhere when it should be passed so I thought I make it required but allow undefined if it's not needed.

I called this out in the PR as wasn't sure if there was a better way and wasn't 100% confident about this solution. If you think optionality is better here I am happy to change back.

calls to ensure userMetadata is passed through. I found it difficult to ensure I caught every cause though, e.g. it was very easy to miss uploadNewObject needed to be updated in the uploadFromRequest. I'm open to feedback to see if there is a more robust solution we could implement that could be caught with typescript type

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point but I would go with optional since more idiomatic. Maybe for cleanup afterwards

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted back to optional

Comment thread src/test/scanner.test.ts Outdated
Comment thread src/http/routes/tus/lifecycle.ts Outdated
Comment thread src/storage/database/knex.ts Outdated
owner_id: owner,
user_metadata: metadata,
user_metadata: userMetadata,
metadata: metadata,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need a guard rule for new column similar to user_metadata?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you're right. I added a guard rule but it feels a bit hacky. The reason I didn't add it to rules in normalizedColumns is because metadata column does exist in objects table. It's my understanding any metadata column would get remove regardless of the table it's selecting from.

So instead I checked if latest migration was >= so s3-multipart-uploads-metadata and added it to the list of columns to select.

I almost feel normalizedColumns should be table dependent. This might be better in a follow up PR thought.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally agree, also see #825 (comment)

Comment thread src/storage/protocols/s3/s3-handler.ts
@TylerHillery TylerHillery force-pushed the tyler/STORAGE-258/user-metadata-in-rls branch from 05a322a to 5d4edd0 Compare February 10, 2026 20:25
@TylerHillery TylerHillery force-pushed the tyler/STORAGE-258/user-metadata-in-rls branch from 5d4edd0 to 8c316f6 Compare February 23, 2026 19:06
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a nullable jsonb metadata column to storage.s3_multipart_uploads and updates code paths so client-provided user metadata and content metadata (mimetype, contentLength) are accepted, persisted, and propagated into upload authorization checks. Changes touch HTTP routes (signed upload headers), TUS lifecycle, uploader interfaces, storage handlers, DB adapter/knex, multipart schemas, and tests; the DB migration mapping is updated and tests for RLS and migration compatibility were added.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant HTTP_Route as HTTP Route / TUS Entry
    participant Uploader
    participant Storage
    participant DB
    participant RLS

    Client->>HTTP_Route: Upload request (headers: x-metadata, content-type, content-length)
    HTTP_Route->>Uploader: call canUpload/prepareUpload (includes userMetadata, metadata)
    Uploader->>Storage: create/prepare multipart (passes userMetadata, metadata)
    Storage->>DB: insert/find multipart row (user_metadata, metadata)
    DB->>RLS: evaluate policy with user_metadata & metadata
    RLS-->>DB: allow/deny
    DB-->>Storage: result
    alt allowed
        Storage->>Uploader: success (tokens/parts)
        Uploader-->>Client: signed URL / upload acceptance
    else denied
        Uploader-->>Client: error (403/400)
    end
Loading

Assessment against linked issues

Objective Addressed Explanation
Enable user_metadata access in RLS policies during uploads [#757]
Propagate user_metadata through upload operations to canUpload validation [#757]
Support user_metadata in multipart upload workflows [#757]
Support content metadata (mimetype, contentLength) in RLS policy checks [#757]

Out-of-scope changes

Code Change Explanation
New multipart resilience test and retry behavior (src/test/s3-protocol.test.ts) Focuses on in_progress_size mutation and retry semantics, not required by the RLS/user_metadata objective.
Extensive test harness additions including tus-js-client helpers (src/test/rls.test.ts) Adds broad test scenarios and client tooling to exercise flows; these expand coverage but are outside the single-issue objective.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

properties: {
'x-upsert': { type: 'string' },
'content-type': { type: 'string' },
'content-length': { type: 'string' },
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need x-metadata as well since we try to read below?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, added

const customMd = request.headers['x-metadata']

if (typeof customMd === 'string') {
userMetadata = parseUserMetadata(customMd)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this parser returns Record<string, string> by casting due to its expectation (and S3 compatibility) but there is no validation. Values could be anything.

This is an existing helper, feel free to add a TODO to be addressed later

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into supabase-js and s3 part of this, type is wrong but handling is correct and we return x-amz-meta-missing header for non-string/non-printable

Copy link
Copy Markdown
Member

@ferhatelmas ferhatelmas Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing to note is that when it's set when content-type is multipart/form-data, we check/limit its size but not when it comes from a header.

However, the value is 1kb and a header can definitely be larger than that as well saw value wrong, it's 1mb so this is a valid assumption. We could reuse the code, instead of two parsers as a cleaup but not in this PR.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added TODO

Comment thread src/storage/protocols/s3/s3-handler.ts Outdated

const uploader = new Uploader(this.storage.backend, this.storage.db, this.storage.location)

const multipart = await this.shouldAllowPartUpload(UploadId, ContentLength, maxFileSize)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this ordering can cause issues for valid uploads because it mutates the state.

For example (numbers are arbitrary)

  1. in progress 50mb and new part is 5mb
  2. in progress becomes 55mb
  3. RLS fails
  4. next request in progress becomes 60mb
  5. RLS checks but size fails because it was expecting at most 55mb

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go catch, changed the order to have canUpload called before shouldAllowPartUpload. What I don't like about this is that this adds an extra call to findMultipartUpload as this is also called in shouldAllowPartUpload I thought about moving canUpload within in shouldAllowPartUpload but it uses asSuperUser before starting the transaction which would bypass RLS.

Can you confirm my understanding? Is there anyway to do this that I'm missing that doesn't involve extra findMultipartUpload call?

const multipart = await this.storage.db
.asSuperUser()
.findMultipartUpload(UploadId, 'id,version')
.findMultipartUpload(UploadId, 'id,version,user_metadata,metadata')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need the column guard for these reads similar to create as well

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added the guard to the findMultipartUpload with a TODO comment to refactor once we change normalizeColumns to be table aware.

Comment thread src/storage/object.ts Outdated
objectName: destinationKey,
owner,
isUpsert: upsert,
userMetadata: userMetadata,
Copy link
Copy Markdown
Member

@ferhatelmas ferhatelmas Feb 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to metadata, aren't we supposed to use origin.user_metadata if copyMetadata is false because it's the written value eventually?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes we are, changed it to the same logic we use for metadata

Copy link
Copy Markdown
Contributor Author

@TylerHillery TylerHillery Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it back, using the same logic breaks some tests for userMetadata we don't want to merge the two. Either use all of the origin if copy or use the new destination metadata

@ferhatelmas
Copy link
Copy Markdown
Member

Can we also add more tests please?

  1. copy with copyMetadata=true uses correct user metadata for RLS check
  2. multipart s3 RLS with metadata
  3. multipart progress accounting on auth failure
  4. signed upload RLS with metadata
  5. migration compatibility

metadata checked at sign time but mutable at upload time (to be addressed separately by me when this is merged)

@TylerHillery TylerHillery force-pushed the tyler/STORAGE-258/user-metadata-in-rls branch from 8c316f6 to 6ea5123 Compare February 27, 2026 00:28
Copilot AI review requested due to automatic review settings March 24, 2026 13:12
@TylerHillery TylerHillery force-pushed the tyler/STORAGE-258/user-metadata-in-rls branch from a473d9b to f62ddfa Compare March 24, 2026 13:12
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR resolves #757 by ensuring user_metadata (and related request metadata) is available during pre-upload RLS permission checks across upload flows, so RLS policies can reference it during inserts.

Changes:

  • Extends Uploader.canUpload/prepareUpload to accept userMetadata and metadata and threads these through standard, signed URL, TUS, and S3/multipart paths.
  • Persists multipart upload metadata in s3_multipart_uploads with migration/version guards for backwards compatibility.
  • Expands/updates integration tests and RLS test spec to cover metadata-driven insert policies and multipart ordering.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/storage/uploader.ts Updates canUpload/prepareUpload API and threads metadata/userMetadata; adjusts request parsing.
src/storage/object.ts Passes userMetadata/metadata into upload/copy/sign flows for RLS checks.
src/storage/protocols/s3/s3-handler.ts Propagates metadata/userMetadata for S3 multipart + ordering changes around canUpload.
src/storage/database/adapter.ts Updates DB interface for multipart upload persistence.
src/storage/database/knex.ts Stores/fetches multipart metadata with migration guard for older schemas.
src/storage/schemas/multipart.ts Adds metadata to multipart upload schema.
src/internal/database/migrations/types.ts Registers migration ID for multipart metadata column.
migrations/tenant/0057-s3-multipart-uploads-metadata.sql Adds metadata jsonb column to storage.s3_multipart_uploads.
src/http/routes/tus/lifecycle.ts Ensures TUS requests can run canUpload with parsed metadata/user_metadata.
src/http/routes/tus/index.ts Wires datastore into onIncomingRequest callback.
src/http/routes/object/getSignedUploadURL.ts Allows signed-upload RLS checks to include metadata/user_metadata from headers.
src/test/rls_tests.yaml Adds RLS policies/tests for user_metadata + metadata-based checks.
src/test/rls.test.ts Extends test runner to exercise new upload variants and metadata inputs.
src/test/s3-protocol.test.ts Adds multipart ordering regression test + migration compatibility integration tests.
src/test/scanner.test.ts Updates test uploads for adjusted UploadRequest shape.
src/test/cdn.test.ts Updates uploadNewObject usage to pass userMetadata at request level.
Comments suppressed due to low confidence (1)

src/storage/uploader.ts:479

  • The return object from fileUploadFromRequest sets contentLength twice (first from fileContentLength, then overwritten by the raw content-length header). This both causes a duplicate-key/TS error and changes behavior by overwriting the intentionally-undefined multipart file size with the (unreliable) multipart request content-length. Keep a single contentLength value (likely fileContentLength) and, if the header value is needed for RLS, store it under a different name and validate it separately.
    body,
    mimeType,
    cacheControl,
    contentLength: fileContentLength,
    isTruncated,
    userMetadata,
    maxFileSize,
    xRobotsTag,
  }
}

export function parseUserMetadata(metadata: string) {
  try {
    const json = Buffer.from(metadata, 'base64').toString('utf8')
    return JSON.parse(json) as Record<string, string>

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/http/routes/tus/lifecycle.ts
Comment thread src/storage/uploader.ts
Comment thread src/storage/protocols/s3/s3-handler.ts
Comment thread src/test/rls.test.ts
Comment thread src/http/routes/object/getSignedUploadURL.ts
@fenos fenos merged commit 7459d65 into master Mar 26, 2026
3 checks passed
@fenos fenos deleted the tyler/STORAGE-258/user-metadata-in-rls branch March 26, 2026 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Can't use user_metadata in RLS policies when uploading a file

5 participants