Allow public tuning of cub::DeviceScan::*ByKey#9215
Conversation
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR refactors internal delay-constructor and scan policy types to public APIs. Renames ChangesPolicy Type Refactoring and API Introduction
Tuning Policy Updates
Dispatch Layer Updates
Kernel Implementation Updates
Tests and Supporting Updates
Assessment against linked issues
Possibly related PRs
Suggested labels
Suggested reviewers
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
cub/cub/device/dispatch/tuning/tuning_scan.cuh (1)
130-141: 💤 Low valuesuggestion: Missing
[[nodiscard]]onoperator==andoperator!=.Other policy structs in this PR (
ScanLookbackPolicy,ScanPolicy,ScanByKeyPolicy,LookbackDelayPolicy) have[[nodiscard]]on their comparison operators. Add it here for consistency.- _CCCL_HOST_DEVICE_API constexpr friend bool operator==(const ScanWarpspeedPolicy& lhs, const ScanWarpspeedPolicy& rhs) + [[nodiscard]] _CCCL_HOST_DEVICE_API constexpr friend bool operator==(const ScanWarpspeedPolicy& lhs, const ScanWarpspeedPolicy& rhs) { return lhs.num_reduce_and_scan_warps == rhs.num_reduce_and_scan_warps && lhs.look_ahead_items_per_thread == rhs.look_ahead_items_per_thread && lhs.items_per_thread == rhs.items_per_thread && lhs.lookahead_stages == rhs.lookahead_stages && lhs.block_idx_stages == rhs.block_idx_stages; } - _CCCL_HOST_DEVICE_API constexpr friend bool operator!=(const ScanWarpspeedPolicy& lhs, const ScanWarpspeedPolicy& rhs) + [[nodiscard]] _CCCL_HOST_DEVICE_API constexpr friend bool operator!=(const ScanWarpspeedPolicy& lhs, const ScanWarpspeedPolicy& rhs) { return !(lhs == rhs); }cub/test/catch2_test_device_scan_by_key_env.cu (1)
134-143: 💤 Low valuesuggestion: Consider adding
[[nodiscard]]andnoexceptto the operator(). As per coding guidelines, most functions with non-void return type should use[[nodiscard]], and functions that don't throw should usenoexcept. While this is test code, consistency helps maintain best practices.cub/cub/device/dispatch/kernels/kernel_radix_sort.cuh (1)
154-163: ⚡ Quick winsuggestion: The type alias
ScanPolicyshadows the publiccub::ScanPolicytype used at line 151, creating confusion within this function scope. For consistency with the pattern inkernel_scan.cuh:223, rename the alias toScanPolicyTorAgentScanPolicyTto avoid shadowing.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: fea37719-5a6b-4be1-918c-a245d8e327e7
📒 Files selected for processing (33)
c/parallel/src/radix_sort.cuc/parallel/src/scan.cuc/parallel/src/unique_by_key.cucub/benchmarks/bench/copy/memcpy.cucub/benchmarks/bench/scan/exclusive/by_key.cucub/benchmarks/bench/scan/policy_selector.hcub/cub/detail/delay_constructor.cuhcub/cub/device/device_scan.cuhcub/cub/device/dispatch/dispatch_radix_sort.cuhcub/cub/device/dispatch/dispatch_rle.cuhcub/cub/device/dispatch/dispatch_scan.cuhcub/cub/device/dispatch/dispatch_scan_by_key.cuhcub/cub/device/dispatch/dispatch_select_if.cuhcub/cub/device/dispatch/dispatch_three_way_partition.cuhcub/cub/device/dispatch/kernels/kernel_radix_sort.cuhcub/cub/device/dispatch/kernels/kernel_scan.cuhcub/cub/device/dispatch/kernels/kernel_scan_warpspeed.cuhcub/cub/device/dispatch/tuning/tuning_batch_memcpy.cuhcub/cub/device/dispatch/tuning/tuning_radix_sort.cuhcub/cub/device/dispatch/tuning/tuning_reduce_by_key.cuhcub/cub/device/dispatch/tuning/tuning_rle_encode.cuhcub/cub/device/dispatch/tuning/tuning_rle_non_trivial_runs.cuhcub/cub/device/dispatch/tuning/tuning_scan.cuhcub/cub/device/dispatch/tuning/tuning_scan_by_key.cuhcub/cub/device/dispatch/tuning/tuning_select_if.cuhcub/cub/device/dispatch/tuning/tuning_three_way_partition.cuhcub/cub/device/dispatch/tuning/tuning_unique_by_key.cuhcub/test/catch2_test_device_scan_by_key_env.cucub/test/catch2_test_device_scan_by_key_env_api.cucub/test/catch2_test_device_scan_env.cucub/test/catch2_test_device_scan_env_api.cunvbench_helper/nvbench_helper/look_back_helper.cuhthrust/testing/scan.cu
| .load_modifier = cub::LOAD_DEFAULT, | ||
| .store_algorithm = cub::BLOCK_STORE_WARP_TRANSPOSE, | ||
| .scan_algorithm = cub::BLOCK_SCAN_WARP_SCANS, | ||
| .lookback_delay = cub::LookbackDelayPolicy{LookbackDelayAlgorithm::fixed_delay, 832, 1165}}, |
There was a problem hiding this comment.
critical: Missing namespace qualification. LookbackDelayAlgorithm::fixed_delay should be cub::LookbackDelayAlgorithm::fixed_delay to match the fully-qualified style used elsewhere in this initializer (lines 428, 433-436).
Fix namespace qualification
- .lookback_delay = cub::LookbackDelayPolicy{LookbackDelayAlgorithm::fixed_delay, 832, 1165}},
+ .lookback_delay = cub::LookbackDelayPolicy{cub::LookbackDelayAlgorithm::fixed_delay, 832, 1165}},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| .lookback_delay = cub::LookbackDelayPolicy{LookbackDelayAlgorithm::fixed_delay, 832, 1165}}, | |
| .lookback_delay = cub::LookbackDelayPolicy{cub::LookbackDelayAlgorithm::fixed_delay, 832, 1165}}, |
😬 CI Workflow Results🟥 Finished in 2h 17m: Pass: 36%/351 | Total: 7d 18h | Max: 2h 16m | Hits: 35%/509618See results here. |
cub::DeviceScan#8853Fixes: #8852
New public entities
cub::ScanByKeyPolicythreads_per_block,items_per_thread,load_algorithm,load_modifier,store_algorithm,scan_algorithm,lookback_delay