cuda.bindings latency benchmarks #1736

Open
danielfrg wants to merge 3 commits into main from cuda-bindings-bench

@danielfrg

Description

closes #1580

@leofang @mdboom I migrated one benchmark from the pytest suite to use pyperf and added a C++ equivalent.

  • Added a small benchmark discovery step that finds bench_*.py files containing bench_*() functions
  • Uses pyperf's bench_time_func
  • C++ benchmarks output pyperf-compatible JSON, so both sides can be analyzed with the same pyperf stats / pyperf hist commands
  • The README explains how to run it in the different environments using pixi
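
As a rough illustration of the discovery and timing approach in the list above, the sketch below uses only the standard library. It assumes each bench_*() function is a zero-argument callable that performs one operation. The helper names `discover_benchmarks` and `make_time_func`, and the `module.function` naming scheme, are hypothetical and not taken from this PR. The `time_func(loops)` shape (run the operation `loops` times and return the elapsed seconds) is what pyperf's `Runner.bench_time_func` expects.

```python
import importlib.util
import time
from pathlib import Path


def discover_benchmarks(bench_dir):
    """Yield (name, func) for each bench_*() function in bench_*.py files.

    Hypothetical helper: the naming scheme below is an assumption.
    """
    for path in sorted(Path(bench_dir).glob("bench_*.py")):
        # Import the file as a standalone module without touching sys.path.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for attr in sorted(vars(module)):
            obj = getattr(module, attr)
            if attr.startswith("bench_") and callable(obj):
                module_name = path.stem.removeprefix("bench_")
                func_name = attr.removeprefix("bench_")
                yield f"{module_name}.{func_name}", obj


def make_time_func(operation):
    """Wrap a zero-argument callable in the time_func(loops) shape."""

    def time_func(loops):
        # Time the whole loop once so timer overhead is amortized.
        t0 = time.perf_counter()
        for _ in range(loops):
            operation()
        return time.perf_counter() - t0

    return time_func
```

With pyperf installed, each discovered pair could then be registered on a single `pyperf.Runner()` instance via `runner.bench_time_func(name, make_time_func(func))`.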

The benchmark is cuPointerGetAttribute; both the Python and C++ versions call the same driver API with error checking.

Here is one set of results for Python and C++ on my system, so we are OK, under 1 us. The two sides do not yet use the same warmup and run counts; I still need to finish that, but this should give you an idea.

# Python (pyperf bench_time_func)
bindings.pointer_attributes.pointer_get_attribute: Mean +- std dev: 603 ns +- 25 ns

# C++ (driver API baseline)
cpp.pointer_attributes.pointer_get_attribute: Mean +- std dev: 29 ns +- 1 ns
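
To put the two means side by side, the short calculation below derives the per-call overhead and the ratio from the numbers reported above. It is only indicative, since the two runs do not yet share the same warmup and run settings; the variable names are illustrative.

```python
# Means copied from the results above, in nanoseconds.
python_mean_ns = 603.0  # bindings.pointer_attributes.pointer_get_attribute
cpp_mean_ns = 29.0      # cpp.pointer_attributes.pointer_get_attribute

# Extra time per call spent on the Python side relative to the C++ baseline.
overhead_ns = python_mean_ns - cpp_mean_ns
ratio = python_mean_ns / cpp_mean_ns

print(f"Binding overhead per call: {overhead_ns:.0f} ns")  # 574 ns
print(f"Python / C++ ratio: {ratio:.1f}x")                 # 20.8x
```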

I still need to work on matching parameters across all the benchmarks, among other things, but I wanted to get feedback first on whether this looks fine to keep going.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@copy-pr-bot
Contributor

copy-pr-bot bot commented Mar 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@leofang leofang requested review from leofang and mdboom March 7, 2026 02:40
@leofang leofang added this to the cuda.bindings next milestone Mar 7, 2026

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python latency testing & benchmarking

2 participants