Skip to content

Make GraphicsResource inherit from Buffer#1701

Open
rparolin wants to merge 4 commits intoNVIDIA:mainfrom
rparolin:rparolin/graphics_feedback_fixes_1
Open

Make GraphicsResource inherit from Buffer#1701
rparolin wants to merge 4 commits intoNVIDIA:mainfrom
rparolin:rparolin/graphics_feedback_fixes_1

Conversation

@rparolin
Copy link
Collaborator

@rparolin rparolin commented Feb 27, 2026

Summary

  • Refactors GraphicsResource to inherit from Buffer instead of wrapping a separate Buffer object, eliminating the _MappedBufferContext intermediary class
  • map() now populates the GraphicsResource itself with device pointer/size (since it IS-A Buffer) and returns self
  • Makes stream a required parameter for map() and unmap() (removes implicit default stream usage)
  • Adds stream parameter to from_gl_buffer() for a convenient register-and-map-in-one-step pattern
  • Renames handle property to resource_handle to avoid conflicting with Buffer.handle (which now exposes the mapped device pointer)

Motivation

Review feedback requested that GraphicsResource be a Buffer directly rather than producing a separate Buffer via map(). This simplifies the API surface: a mapped GraphicsResource can be passed directly anywhere a Buffer is accepted (e.g., StridedMemoryView.from_buffer()), without needing a wrapper context manager.

Key changes

  • GraphicsResource(Buffer) — inherits from Buffer; handle and size are valid while mapped
  • Removed _MappedBufferContext — no longer needed since GraphicsResource itself is the buffer
  • __enter__/__exit__ — moved onto GraphicsResource directly; __exit__ auto-unmaps using the stream from map()
  • stream requiredmap(stream=) and unmap(stream=) no longer default to stream 0
  • from_gl_buffer(stream=) — optional stream param to register + map in one call, enabling with GraphicsResource.from_gl_buffer(vbo, stream=s) as buf: pattern
  • close(stream=None) — accepts stream kwarg for compatibility with Buffer.close()
  • resource_handle — renamed from handle to avoid shadowing Buffer.handle

🤖 Generated with Claude Code

@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Feb 27, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rparolin rparolin changed the title Rparolin/graphics feedback fixes 1 Make GraphicsResource inherit from Buffer Feb 27, 2026
@rparolin rparolin requested a review from leofang February 27, 2026 19:48
@rparolin rparolin self-assigned this Feb 27, 2026
@rparolin rparolin added the enhancement Any code-related improvements label Feb 27, 2026
@rparolin rparolin added this to the cuda.core v0.7.0 milestone Feb 27, 2026
@rparolin rparolin marked this pull request as ready for review February 27, 2026 19:49
@copy-pr-bot
Copy link
Contributor

copy-pr-bot bot commented Feb 27, 2026

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rparolin
Copy link
Collaborator Author

/ok to test

@rparolin rparolin closed this Feb 27, 2026
@rparolin rparolin reopened this Feb 27, 2026
@github-actions
Copy link

@leofang leofang added P0 High priority - Must do! cuda.core Everything related to the cuda.core module labels Feb 28, 2026
@rparolin
Copy link
Collaborator Author

rparolin commented Mar 2, 2026

/ok to test

@rparolin rparolin enabled auto-merge (squash) March 3, 2026 16:11
return
if self._mapped:
# Best-effort unmap before unregister
# Best-effort unmap before unregister (use stream 0 as fallback)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential stream-ordering bug here: map()/unmap() now require an explicit stream, but close() always unmaps on stream 0 and ignores both _map_stream and the optional close(stream=...) arg.

If a resource was mapped on a non-default stream and then closed while mapped, unmap can be issued on the wrong stream, which may break ordering guarantees with in-flight work. Could we unmap using _map_stream (or the passed stream, with a clear fallback policy) instead of hard-coding stream 0?

self._map_stream = None

def __enter__(self):
return self
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__enter__ currently returns self even when the resource is not mapped. That allows patterns like with GraphicsResource.from_gl_buffer(vbo) as buf: (without stream=), where buf.handle/buf.size are not valid.

Could we either (a) raise in __enter__ when not self._mapped, or (b) require/perform mapping for the context-manager path to avoid this silent footgun?

Copy link
Contributor

@cpcloud cpcloud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the refactor here — the GraphicsResource(Buffer) direction looks good overall.

Requesting changes for two behavior issues called out inline:

  1. close() currently unmaps on stream 0 even when mapping happened on a non-default stream, which can break stream-ordering guarantees.
  2. __enter__ returns self even when unmapped, allowing context-manager usage where handle/size are invalid.

Please address these and I’m happy to re-review.

@cpcloud
Copy link
Contributor

cpcloud commented Mar 4, 2026

ping @rparolin. happy to fix this up if you want.

@leofang
Copy link
Member

leofang commented Mar 4, 2026

Thanks for the refactor here — the GraphicsResource(Buffer) direction looks good overall.

Yeah and thinking about it more, I suspect this is a must:

  • If GL buffer registration is context-independent: This PR makes the UX nicer
  • If GL buffer registration is locked to the current CUDA context: We need to track the underlying device/context, so we either inherit from Buffer (this PR) or add Buffer.from_gl_buffer() constructor (and keep GraphicsResource internal).
    • From the viewpoint of staying consistent across cuda.core, the latter might be worth considering too, but one challenge is that Buffer today does not have the map/unmap concept.
    • I would be happy for you @rparolin @Andy-Jost @cpcloud to brainstorm and I can review later 🙂

I will ask internally if it's context-independent or not, but it's not a blocker.

@Andy-Jost
Copy link
Contributor

Andy-Jost commented Mar 4, 2026

Thanks for the discussion here. I'd like to propose an alternative approach that I think addresses both of Phillip's concerns and Leo's observation about Buffer not having map/unmap.

Proposal: Use the resource_handles module instead of modifying Buffer

We already have a C++ handle layer (cuda/core/_cpp/resource_handles.hpp) that provides RAII-based lifetime management with std::shared_ptr custom deleters. This pattern already solves the exact problems raised here:

  1. Stream capture: DevicePtrHandle already captures the allocation stream in its deleter and uses it for cuMemFreeAsync. We can do the same for graphics resources - capture the map stream and use it for unmap.

  2. Dependency tracking: Handles can hold references to other handles, preventing out-of-order destruction. A mapped buffer handle can hold a reference to the GraphicsResourceHandle, ensuring the graphics resource outlives the mapped pointer.

  3. Arbitrary release actions: The shared_ptr deleter can chain actions - e.g., unmap then unregister.

Concrete design:

  • GraphicsResource.map(stream) returns a Buffer (or just the existing DevicePtrHandle under the hood)
  • The returned handle's deleter captures:
    • The GraphicsResourceHandle (keeps it alive)
    • The map StreamHandle (for stream-ordered unmap)
  • When the mapped buffer handle is released, the deleter calls cuGraphicsUnmapResources on the captured stream
  • GraphicsResource itself doesn't need to track _mapped state - the existence of mapped buffer handles implies mapped state

Benefits:

  • Buffer stays unchanged (no map/unmap concept needed)
  • Stream ordering is correct by construction (cpcloud's concern 1)
  • No invalid __enter__ state possible (cpcloud's concern 2) - you either have a valid mapped handle or you don't
  • Dependency tracking prevents use-after-free
  • Consistent with how we handle DeviceMemoryResource allocations

This is the same pattern we use for memory pool allocations, where the pointer handle captures the pool handle and stream for proper cleanup ordering.

Thoughts?

@cpcloud
Copy link
Contributor

cpcloud commented Mar 6, 2026

+1 to @Andy-Jost's design. Looks solid.

@leofang
Copy link
Member

leofang commented Mar 7, 2026

FYI

I will ask internally if it's context-independent or not

it is context-dependent.

Proposal: Use the resource_handles module instead of modifying Buffer

Modifying Buffer in what way? Sorry it's not clear to me.

Can we sketch a bit how the user code would look like with this proposal? The changes in this PR were motivated by the sketch in here: #1608 (comment). Rob asked the right question there. Inheritance is probably not necessary, as long as the same UX can be delivered.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cuda.core Everything related to the cuda.core module enhancement Any code-related improvements P0 High priority - Must do!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants