Skip to content

HDDS-14660. Implement new table to store part information for multipart uploads#9886

Open
spacemonkd wants to merge 7 commits intoapache:masterfrom
spacemonkd:HDDS-14660
Open

HDDS-14660. Implement new table to store part information for multipart uploads#9886
spacemonkd wants to merge 7 commits intoapache:masterfrom
spacemonkd:HDDS-14660

Conversation

@spacemonkd
Copy link
Contributor

What changes were proposed in this pull request?

HDDS-14660. Implement new table to store part information for multipart uploads

Please describe your PR in detail:

  • This PR adds a new multipartPartsTable to handle storing of parts for Multipart Uploads.
  • The existing MultipartKeyInfo is updated to handle new fields like owner and ACLs, volume, bucket and key name.
  • The codec for PartKey and PartInformation has been added.
  • The new table is added to the list of compacted tables

How is the PartKey encoded?

Say for the below sample input (we are taking 47 as it has 2F in the lower byte):

uploadId = <abcd-1234...>
partNumber = 47 -> taking 47 because it has 2f in the byte representation

Encoding format:
keyBytes = UTF8(uploadId) + / + int32_be(partNumber)

For 47:
int32 big-endian bytes = 00 00 00 2f

So final tail is:

  • separator / = 2f
  • part bytes = 00 00 00 2f
  • Total tail bytes = 2f 00 00 00 2f

Now we run the following checks:

  • Check full-key first: if byte at len - 5 is /, decode as full key (used to identify full row)
  • Else, if byte at len - 1 is /, decode as prefix. (this is used for prefix scan for iterating)
  • Else invalid.

A logical key will look something like the following in the RocksDB entry:

[61 62 63 31 32 33 2d 75 75 69 64 2d 34 35 36 2f 00 00 00 2f]
[--------------uploadId UTF-8---------------][2f][-int32 BE-]

and will be represented as:

{
  uploadID: "abcd123-uuid-456",
  partNumber: 47
} : {
... 
}

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-14660

How was this patch tested?

Patch has been tested by running unit tests.

@spacemonkd spacemonkd marked this pull request as ready for review March 8, 2026 18:15
@spacemonkd spacemonkd self-assigned this Mar 8, 2026
@spacemonkd spacemonkd added the s3 S3 Gateway label Mar 8, 2026
@spacemonkd spacemonkd requested a review from ChenSammi March 8, 2026 18:16
OmMultipartKeyInfo.getCodec());

public static final String MULTIPART_PARTS_TABLE = "multipartPartsTable";
/** multipartPartsTable: /uploadId/partNumber :- PartKeyInfo. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key is "/volume/bucket/key/uploadId/partNumber" in Column Family comments, while it's "/uploadId/partNumber" here. What's the real key format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, need to update the comment.
It is actually uploadId/part number.

The multipartInfo table is one with vol/buck/key/uploadID

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove the /, /uploadId/partNumber -> uploadId/partNumber

@spacemonkd spacemonkd requested a review from ChenSammi March 12, 2026 18:29
Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @spacemonkd for the patch, overall LGTM. Left some comments,

OmMultipartKeyInfo.getCodec());

public static final String MULTIPART_PARTS_TABLE = "multipartPartsTable";
/** multipartPartsTable: /uploadId/partNumber :- PartKeyInfo. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can remove the /, /uploadId/partNumber -> uploadId/partNumber


private Table<String, OmKeyInfo> openKeyTable;
private Table<String, OmMultipartKeyInfo> multipartInfoTable;
private Table<OmMultipartPartKey, OmMultipartPartInfo> multipartPartTable;
Copy link
Contributor

@ivandika3 ivandika3 Mar 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Standardize to multipartPartsTable. Similarly for getMultipartPartTable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in the latest commit.

Comment on lines +1532 to +1549
if (omMultipartKeyInfo.getSchemaVersion() == 0) {
numParts += omMultipartKeyInfo.getPartKeyInfoMap().size();
} else {
OmMultipartPartKey prefix =
OmMultipartPartKey.prefix(expiredMultipartUpload.getUploadId());
try (TableIterator<OmMultipartPartKey, ? extends KeyValue<OmMultipartPartKey, OmMultipartPartInfo>>
partIterator = getMultipartPartTable().iterator(prefix)) {
while (partIterator.hasNext()) {
KeyValue<OmMultipartPartKey, OmMultipartPartInfo> partEntry =
partIterator.next();
if (!expiredMultipartUpload.getUploadId().equals(
partEntry.getKey().getUploadId())) {
break;
}
numParts++;
}
}
}
Copy link
Contributor

@ivandika3 ivandika3 Mar 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is not tested yet (we are still creating schemaVersion 0 MPU), let's exclude this first and add this once the MPU flow patches are done and listParts behaving normally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this in the latest commit.
I have commented out this part and added a TODO in case I forget about the changes 😅

Will just uncomment the section after the actual request and response modifications are done

Comment on lines +49 to +51
public static OmMultipartPartKey of(String uploadId, int partNumber) {
return new OmMultipartPartKey(uploadId, partNumber);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also use the boxed Integer instead? If we pass null Integer here, it might throw NPE. We should also always use this static constructor and not normal constructor (in fromPersistedFormat)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it to use Integer and validate that Integer is nonnull in the latest commit.
Also using prefix() and of() methods to return the values instead of direct constructor calls in fromPersistedFormat()

Comment on lines +96 to +97
private static final class OmMultipartPartKeyCodec
implements Codec<OmMultipartPartKey> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can support CodecBuffer to prevent buffer copying, should be able to be addressed here, but can raise in another ticket (under HDDS-2274) if you prefer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented this in the current change itself, added new tests to validate that the change is working fine.

@spacemonkd spacemonkd requested a review from ivandika3 March 15, 2026 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

s3 S3 Gateway

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants