add ArrayBuffer, ArrayBufferView, SharedArrayBuffer, and BackingStore to Rust JSG#6398
add ArrayBuffer, ArrayBufferView, SharedArrayBuffer, and BackingStore to Rust JSG#6398
Conversation
77dedb2 to
2ccf890
Compare
There was a problem hiding this comment.
I'm Bonk, and I've done a quick review of your PR.
This PR adds ArrayBuffer, ArrayBufferView, SharedArrayBuffer, and BackingStore Rust bindings to the JSG layer, with extensive test coverage.
Issues
-
[High]
SharedArrayBuffer::as_slice/as_mut_sliceandBackingStore::as_mut_slicefor shared backing stores are unsound —SharedArrayBuffermemory can be concurrently modified by other threads or by JS (via Atomics or typed array writes from another agent). Creating&[u8]/&mut [u8]from shared memory is undefined behavior in Rust because it violates the aliasing/provenance model. The&mut [u8]case is especially dangerous: Rust assumes exclusive access through&mutreferences. Even the&[u8]case is technically UB if another thread writes during the borrow. The doc comments mention this but "caller is responsible for synchronization" isn't enforceable. This may be acceptable as a pragmatic choice for now since workerd is single-threaded per-isolate, but the API should either (a) make these methodsunsafeto signal the invariant, or (b) return raw pointers / useUnsafeCellsemantics instead. Addingunsafewould be the minimal fix. See inline comment. -
[Medium]
new_uninitis misleading — the buffer IS zero-initialized —ArrayBuffer::new_uninitandSharedArrayBuffer::new_uninitcallNewBackingStore(isolate, length)which zero-initializes the memory. The namenew_uninitstrongly implies the contents are uninitialized (as inMaybeUninit), but the test itself asserts the memory is zeroed. This is a naming footgun that could lead callers to skip initialization thinking it's redundant, or worse, treat the memory as potentially uninitialized. Considernew_zeroedor justwith_byte_length. See inline comment. -
[Low] Doc comment references non-existent method names —
BackingStoredoc says "Obtained viaLocal<ArrayBuffer>::get_backing_store" but the actual method is calledbacking_store()(noget_prefix).
|
Review posted successfully. Here's a summary of the findings:
|
2ccf890 to
22dbb60
Compare
|
|
||
| size_t local_array_buffer_get_backing_store(Isolate* isolate, const Local& buffer) { | ||
| return reinterpret_cast<size_t>( | ||
| new BackingStorePtr(local_as_ref_from_ffi<v8::ArrayBuffer>(buffer)->GetBackingStore())); |
There was a problem hiding this comment.
The fact that this is heap allocating a std::shared_ptr gets lost behind the using alias. newing a std::shared_ptr is itself a bit of an unsafe antipattern and it's unfortunate that we're having to resort to that here, especially under the guise of making things safer by porting them to rust.
In nothing else tho, I'd drop the using alias and just use the std::shared_ptr<v8::BackingStore> directly just so it's clearer at a glance what is happening.
| auto* ptr = static_cast<uint8_t*>(ab->Data()); | ||
| for (size_t i = 0; i < byteLen; i++) { | ||
| result.push_back(ptr[i]); | ||
| } |
There was a problem hiding this comment.
is there no memcpy equivalent? Does the rust compiler optimize this away? For large array buffers this O(n) loop is going to be slow.
| result.push_back(ptr[i]); | ||
| } | ||
| return result; | ||
| } |
There was a problem hiding this comment.
There's a fair amount of duplication between these unwrap methods. Can they can condensed into one?
|
|
||
| ::rust::Vec<uint8_t> unwrap_array_buffer_view(Isolate* isolate, Local value) { | ||
| auto v8Val = local_from_ffi<v8::Value>(kj::mv(value)); | ||
| KJ_REQUIRE(v8Val->IsArrayBufferView()); |
There was a problem hiding this comment.
This will only work for Uint8Array, Uint8ClampedArray, and DataView. All the remaining TypedArray types are multi-byte, not to mention the Floating point and Bigint types that will have completely different intepretations.
| } | ||
|
|
||
| // Unwrappers | ||
| ::rust::Vec<uint8_t> unwrap_array_buffer(Isolate* isolate, Local value) { |
There was a problem hiding this comment.
This isn't really "unwrap". It's really "copy"
| kj::Maybe<Local> array_buffer_maybe_new( | ||
| Isolate* isolate, size_t byte_length, BackingStoreInitializationMode mode) { | ||
| auto maybe = v8::ArrayBuffer::MaybeNew( | ||
| isolate, byte_length, static_cast<v8::BackingStoreInitializationMode>(mode)); |
There was a problem hiding this comment.
We should actually drop our MaybeNew patch in favor of the newer version of v8::ArrayBuffer::New that accepts the flag telling it how to handle allocation failure. This MaybeNew is redundant now.
| // `ArrayBufferView`-specific implementations | ||
| // ============================================================================= | ||
|
|
||
| impl<'a> Local<'a, ArrayBufferView> { |
There was a problem hiding this comment.
For streams use cases, this will need element size and whether the underlying view is an integer type or not.
| pub unsafe fn buffer_data(&self) -> *mut u8 { | ||
| // SAFETY: handle is valid within the current HandleScope. | ||
| unsafe { ffi::local_array_buffer_view_buffer_data(self.isolate.as_ffi(), &self.handle) } | ||
| } |
There was a problem hiding this comment.
we'll need the ability to get a view of the elements, not just the bytes. If we can't do that reasonably in the ArrayBufferView, then we might need separate rust classes for each of the TypedArray kinds.
| } | ||
|
|
||
| impl Local<'_, ArrayBuffer> { | ||
| /// Returns the byte length of this `ArrayBuffer`. |
There was a problem hiding this comment.
For streams use cases we'll need a detach / is_detached / is_detachable API.
| } | ||
| } | ||
|
|
||
| impl Local<'_, SharedArrayBuffer> { |
There was a problem hiding this comment.
Honestly, I'm not sure there's enough value in this as a separate class given the duplication/overlap with ArrayBuffer. We could likely simplify and have ArrayBuffer encapsulate this and just add a is_shared API there.
9b527bb to
d88cf6b
Compare
| /// # Safety | ||
| /// For `SharedArrayBuffer` backing stores (`is_shared() == true`) the | ||
| /// underlying memory may be concurrently read or written by other agents | ||
| /// (JS threads, `Atomics` operations). The caller must ensure no concurrent | ||
| /// writes occur for the duration of the borrow, or use atomic operations. |
There was a problem hiding this comment.
This is returning a non-mutable slice, so why does the safety not mention mutability?
| /// | ||
| /// Zero-copy. The caller must ensure no other live slice aliases this region. | ||
| #[inline] | ||
| pub fn as_mut_slice(&mut self) -> &mut [u8] { |
|
|
||
| /// Returns a mutable byte slice view into the `ArrayBuffer`'s backing store. | ||
| /// | ||
| /// Zero-copy. The caller must ensure no other live slice aliases this region. |
There was a problem hiding this comment.
The caller must ensure no other live slice aliases this region.
This very much goes against the concept of singular ownership in Rust... do we not have any better guarantees ever?
| /// | ||
| /// Zero-copy. The caller must ensure no other live slice aliases this region. | ||
| #[inline] | ||
| pub fn as_mut_slice(&mut self) -> &mut [u8] { |
There was a problem hiding this comment.
Again, it is unclear to me why BackingStore::as_mut_slice is unsafe, while ArrayBuffer::as_mut_slice is? Would be good to understand better the difference in ownership between these modes.
guybedford
left a comment
There was a problem hiding this comment.
Shouldn't we require a lock to obtain a mutable slice, so that the mutable slice is only valid for the duration of the lock? It seems this is necessary to properly support ownership semantics?
Built on top of #6386 branch