Add explicit CUDA graph construction API #1729
Draft: Andy-Jost wants to merge 25 commits into NVIDIA:main from Andy-Jost:explicit-graph-construction
Commits
478101c  Convert _graph.py to _graph/ package for explicit graph construction … (Andy-Jost)
ee55795  Added GraphHandle to RAII module. (Andy-Jost)
3b3a715  Add GraphDef and Node classes for explicit graph construction (Andy-Jost)
7559d2f  Add Node class hierarchy with type-specific properties and parameteri… (Andy-Jost)
ae5d706  Add MemsetNode with shared _parse_fill_value utility (Andy-Jost)
ba7d2f9  Add EventRecordNode and EventWaitNode with Event.from_handle support (Andy-Jost)
77a5dac  Replace GraphDef.root with forwarding methods and GraphDef.join (Andy-Jost)
1a790ce  Improve __repr__ for graph nodes, add Node.handle, use as_py for Grap… (Andy-Jost)
3e25c64  Add MemcpyNode with auto-detected memory types (Andy-Jost)
228de38  Add ChildGraphNode with embed() builder and non-owning graph handle (Andy-Jost)
506850b  Add HostCallbackNode with Python callable and ctypes CFUNCTYPE support (Andy-Jost)
4ee0ed7  Fix dangling child graph references by capturing parent handle (Andy-Jost)
82f0ec7  Add conditional graph nodes (IfNode, IfElseNode, WhileNode, SwitchNode) (Andy-Jost)
5cf85c4  Reconstruct conditional node subtypes on CUDA 13.2+ drivers (Andy-Jost)
d993f9c  Apply developer guide styling to _graphdef.pyx (Andy-Jost)
da73536  Fix conditional node body graphs and add integration tests (Andy-Jost)
d228830  Merge remote-tracking branch 'origin/main' into explicit-graph-constr… (Andy-Jost)
b463ccb  Add lifetime and error/edge-case tests for explicit graph construction (Andy-Jost)
f1cceb2  Add RAII NodeHandle, event/kernel lifetime via user objects, consolid… (Andy-Jost)
133719b  Move Event metadata fields to EventBox, access via get_box() pointer … (Andy-Jost)
e7ebe53  Add HandleRegistry template and event reverse-lookup (Andy-Jost)
f0bbf66  Add HandleRegistry template, event reverse-lookup, refactor IPC cache (Andy-Jost)
b830e6e  Add kernel reverse-lookup registry, fix create_kernel_handle_ref sema… (Andy-Jost)
b55782a  Rename NodeHandle to GraphNodeHandle for consistency with driver term… (Andy-Jost)
73ba7fe  Merge remote-tracking branch 'origin/main' into explicit-graph-constr… (Andy-Jost)
These properties are set at event creation time and cannot be queried through the driver API. Moreover, graph-attached events are returned from the driver as plain `CUevent` handles, and reconstructing the Cython `Event` object from one of those would lose this information.

The solution is to move the property metadata into C++ and set up a reverse look-up so that the driver-returned `CUevent` can be used to retrieve the managing `shared_ptr`, which holds this `EventBox`.

Graph-attached kernels are handled similarly.
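
For readers who want a concrete picture of the reverse look-up described above, here is a minimal sketch. It is not the code from this PR: `FakeEventHandle`, the fields of `EventBox`, `ReverseLookup`, `event_registry()`, and `make_event()` are all hypothetical names chosen for illustration, and a plain integer stands in for `CUevent` so the example compiles without the CUDA headers. The idea it illustrates is a map from raw handle to `std::weak_ptr`, so that a handle handed back by the driver can recover the owning `shared_ptr` and the creation-time metadata it carries.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Stand-in for the driver's opaque CUevent (assumption: the real type comes from cuda.h).
using FakeEventHandle = std::uintptr_t;

// Hypothetical metadata box: values fixed at creation that the driver cannot report back.
struct EventBox {
    FakeEventHandle handle;
    bool timing_disabled;
    bool busy_waited;
};

// Hypothetical registry: maps a raw handle to a weak reference to its owning box.
template <typename Handle, typename Box>
class ReverseLookup {
public:
    void add(Handle h, const std::shared_ptr<Box>& box) {
        std::lock_guard<std::mutex> lock(mu_);
        map_[h] = box;  // stored as weak_ptr, so the registry never extends the lifetime
    }

    void remove(Handle h) {
        std::lock_guard<std::mutex> lock(mu_);
        map_.erase(h);
    }

    // Returns the owning shared_ptr, or nullptr if the handle is unknown or already released.
    std::shared_ptr<Box> find(Handle h) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = map_.find(h);
        if (it == map_.end()) return nullptr;
        return it->second.lock();
    }

private:
    std::mutex mu_;
    std::unordered_map<Handle, std::weak_ptr<Box>> map_;
};

// Process-wide registry for events (illustrative only).
inline ReverseLookup<FakeEventHandle, EventBox>& event_registry() {
    static ReverseLookup<FakeEventHandle, EventBox> registry;
    return registry;
}

// Creates the box, registers it, and unregisters it when the last owner goes away.
inline std::shared_ptr<EventBox> make_event(FakeEventHandle h,
                                            bool timing_disabled,
                                            bool busy_waited) {
    std::shared_ptr<EventBox> box(
        new EventBox{h, timing_disabled, busy_waited},
        [](EventBox* p) {
            event_registry().remove(p->handle);
            delete p;
        });
    event_registry().add(h, box);
    return box;
}
```

With this shape, code that receives a bare handle from a graph node query could call `event_registry().find(handle)` and, if the result is non-null, read `timing_disabled` and `busy_waited` from the box instead of trying to ask the driver. Storing `weak_ptr` rather than `shared_ptr` is one possible design choice; it keeps the registry from becoming an owner, at the cost of a null check on every look-up.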