Refactor: share dump arg selection state#1069
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughIntroduces ChangesProfiling Config Centralization and DumpArgSelection Extraction
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors profiling and tensor dump configurations across multiple runtime directories by introducing a centralized profiling_config.h header and a new DumpArgSelection helper class to manage argument selection and masking. The code review feedback highlights several critical safety concerns in the new DumpArgSelection class, specifically pointing out potential undefined behavior from out-of-bounds bitwise shifts and several instances of missing bounds checks on indices and counts that could lead to out-of-bounds memory reads or writes.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| uint64_t dump_arg_index_ambiguous_mask() const { return dump_arg_index_ambiguous_mask_; } | ||
| const uint8_t *scalar_dtypes() const { return scalar_dtypes_; } | ||
|
|
||
| void mark_index(int32_t index) { dump_arg_mask_ |= (uint64_t{1} << index); } |
There was a problem hiding this comment.
Shifting a 64-bit integer by an out-of-bounds index (negative or >= 64) is undefined behavior in C++. To prevent this, validate that index is within the safe range [0, 63] before performing the bitwise shift.
void mark_index(int32_t index) {
if (index >= 0 && index < 64) {
dump_arg_mask_ |= (uint64_t{1} << index);
}
}There was a problem hiding this comment.
Addressed in current head 9166ebb: mark_index now validates the argument with is_valid_arg_index(index) before doing the 64-bit shift.
| bool mark_scalar_by_ptr(uintptr_t ptr, int32_t scalar_count, int32_t tensor_offset) { | ||
| int32_t first_match = -1; |
There was a problem hiding this comment.
Validate that scalar_count is non-negative and does not exceed CORE_MAX_SCALAR_ARGS to prevent out-of-bounds reads from scalar_source_ptrs_.
bool mark_scalar_by_ptr(uintptr_t ptr, int32_t scalar_count, int32_t tensor_offset) {
if (scalar_count < 0 || scalar_count > CORE_MAX_SCALAR_ARGS) {
return false;
}
int32_t first_match = -1;References
- When using a signed integer for a count or size parameter in functions that perform bounds checks and memory operations, always validate that the count is non-negative to prevent bypassing bounds checks and causing out-of-bounds writes or signed-to-unsigned conversion overflows.
There was a problem hiding this comment.
Addressed in current head 9166ebb: mark_scalar_by_ptr now validates scalar_count, tensor_offset, the arg span, and the computed arg_index before scanning or marking.
| void record_scalar_source(int32_t slot, uintptr_t ptr, uint8_t dtype) { | ||
| scalar_source_ptrs_[slot] = ptr; | ||
| scalar_dtypes_[slot] = dtype; | ||
| } |
There was a problem hiding this comment.
Validate that slot is within the bounds of CORE_MAX_SCALAR_ARGS to prevent out-of-bounds writes to scalar_source_ptrs_ and scalar_dtypes_.
void record_scalar_source(int32_t slot, uintptr_t ptr, uint8_t dtype) {
if (slot >= 0 && slot < CORE_MAX_SCALAR_ARGS) {
scalar_source_ptrs_[slot] = ptr;
scalar_dtypes_[slot] = dtype;
}
}There was a problem hiding this comment.
Addressed in current head 9166ebb: record_scalar_source now returns unless slot is a valid single-slot scalar range.
| void copy_scalar_dtypes_from(const DumpArgSelection &src, int32_t dst_offset, int32_t src_offset, int32_t count) { | ||
| memcpy(&scalar_dtypes_[dst_offset], &src.scalar_dtypes_[src_offset], count * sizeof(uint8_t)); | ||
| clear_scalar_sources(dst_offset, count); | ||
| } |
There was a problem hiding this comment.
Validate that count, dst_offset, and src_offset are non-negative and within the bounds of CORE_MAX_SCALAR_ARGS to prevent out-of-bounds memory operations during memcpy and clear_scalar_sources.
| void copy_scalar_dtypes_from(const DumpArgSelection &src, int32_t dst_offset, int32_t src_offset, int32_t count) { | |
| memcpy(&scalar_dtypes_[dst_offset], &src.scalar_dtypes_[src_offset], count * sizeof(uint8_t)); | |
| clear_scalar_sources(dst_offset, count); | |
| } | |
| void copy_scalar_dtypes_from(const DumpArgSelection &src, int32_t dst_offset, int32_t src_offset, int32_t count) { | |
| if (count < 0 || dst_offset < 0 || src_offset < 0 || | |
| dst_offset + count > CORE_MAX_SCALAR_ARGS || | |
| src_offset + count > CORE_MAX_SCALAR_ARGS) { | |
| return; | |
| } | |
| memcpy(&scalar_dtypes_[dst_offset], &src.scalar_dtypes_[src_offset], count * sizeof(uint8_t)); | |
| clear_scalar_sources(dst_offset, count); | |
| } |
References
- When using a signed integer for a count or size parameter in functions that perform bounds checks and memory operations, always validate that the count is non-negative to prevent bypassing bounds checks and causing out-of-bounds writes or signed-to-unsigned conversion overflows.
There was a problem hiding this comment.
Addressed in current head 9166ebb: copy_scalar_dtypes_from now validates both destination and source scalar ranges before memcpy and source clearing.
| } | ||
|
|
||
| private: | ||
| void mark_index_ambiguous(int32_t index) { dump_arg_index_ambiguous_mask_ |= (uint64_t{1} << index); } |
There was a problem hiding this comment.
Shifting a 64-bit integer by an out-of-bounds index (negative or >= 64) is undefined behavior in C++. Validate that index is within the safe range [0, 63] before performing the bitwise shift.
void mark_index_ambiguous(int32_t index) {
if (index >= 0 && index < 64) {
dump_arg_index_ambiguous_mask_ |= (uint64_t{1} << index);
}
}There was a problem hiding this comment.
Addressed in current head 9166ebb: mark_index_ambiguous now validates the index before setting the ambiguity mask bit.
| void clear_scalar_sources(int32_t start, int32_t count) { | ||
| for (int32_t i = 0; i < count; i++) { | ||
| scalar_source_ptrs_[start + i] = 0; | ||
| } | ||
| } |
There was a problem hiding this comment.
Validate that start and count are non-negative and within the bounds of CORE_MAX_SCALAR_ARGS to prevent out-of-bounds writes to scalar_source_ptrs_.
void clear_scalar_sources(int32_t start, int32_t count) {
if (count < 0 || start < 0 || start + count > CORE_MAX_SCALAR_ARGS) {
return;
}
for (int32_t i = 0; i < count; i++) {
scalar_source_ptrs_[start + i] = 0;
}
}References
- When using a signed integer for a count or size parameter in functions that perform bounds checks and memory operations, always validate that the count is non-negative to prevent bypassing bounds checks and causing out-of-bounds writes or signed-to-unsigned conversion overflows.
There was a problem hiding this comment.
Addressed in current head 9166ebb: clear_scalar_sources(start, count) now validates the scalar range before writing scalar_source_ptrs_.
| void clear_scalar_dtypes(int32_t start, int32_t count) { | ||
| memset(&scalar_dtypes_[start], 0, count * sizeof(uint8_t)); | ||
| } |
There was a problem hiding this comment.
Validate that start and count are non-negative and within the bounds of CORE_MAX_SCALAR_ARGS to prevent out-of-bounds writes to scalar_dtypes_ during memset.
| void clear_scalar_dtypes(int32_t start, int32_t count) { | |
| memset(&scalar_dtypes_[start], 0, count * sizeof(uint8_t)); | |
| } | |
| void clear_scalar_dtypes(int32_t start, int32_t count) { | |
| if (count < 0 || start < 0 || start + count > CORE_MAX_SCALAR_ARGS) { | |
| return; | |
| } | |
| memset(&scalar_dtypes_[start], 0, count * sizeof(uint8_t)); | |
| } |
References
- When using a signed integer for a count or size parameter in functions that perform bounds checks and memory operations, always validate that the count is non-negative to prevent bypassing bounds checks and causing out-of-bounds writes or signed-to-unsigned conversion overflows.
There was a problem hiding this comment.
Addressed in current head 9166ebb: clear_scalar_dtypes(start, count) now validates the scalar range before memset.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h (1)
392-403:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReject negative
src_offsetin bothcopy_scalars_from()mirrors.Both implementations only check
count < 0andsrc_offset + count > src.scalar_count_; a negativesrc_offsetcan pass and then index beforesrc.scalars_andsrc.dump_arg_selection_metadata.
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h#L392-L403: addsrc_offset < 0to the source-range validation before thememcpyandcopy_scalar_dtypes_from()call.src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_types.h#L390-L401: apply the same lower-bound check in the mirror.Proposed fix pattern
void copy_scalars_from(const Arg &src, int src_offset, int count) { - if (count < 0 || src_offset + count > src.scalar_count_) { + if (src_offset < 0 || count < 0 || src_offset + count > src.scalar_count_) { set_error("Source scalar range out of bounds in copy_scalars_from"); return; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h` around lines 392 - 403, Both implementations of the copy_scalars_from() method have an incomplete bounds check that allows negative src_offset values to pass through and cause out-of-bounds memory access. In src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h (lines 392-403, anchor), add a check for src_offset < 0 in the initial validation condition alongside the existing count < 0 and src_offset + count > src.scalar_count_ checks. Apply the identical lower-bound check (src_offset < 0) to the mirror implementation in src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_types.h (lines 390-401, sibling) to ensure consistent validation across both copies of the method.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/common/platform/include/aicpu/dump_arg_selection.h`:
- Around line 34-95: Add bounds checking to all index and range-based mutators
in the DumpArgSelection struct to prevent undefined behavior. In mark_index and
mark_index_ambiguous, validate that index is within [0, 63] before performing
the bit-shift operation. In mark_all, validate that tensor_count and
scalar_count are non-negative and that their sum does not exceed valid argument
limits. In mark_scalar_by_ptr, validate that scalar_count is non-negative,
tensor_offset is valid, and the computed arg_index is within [0, 63]. In
record_scalar_source, validate that slot is within [0, CORE_MAX_SCALAR_ARGS). In
clear_scalar_sources and clear_scalar_dtypes (both overloads), validate that
start is non-negative, count is non-negative, and start + count does not exceed
CORE_MAX_SCALAR_ARGS. In copy_scalar_dtypes_from, validate that dst_offset,
src_offset, and count are all non-negative and that both source and destination
ranges fit within the scalar arrays without overflow.
---
Outside diff comments:
In `@src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h`:
- Around line 392-403: Both implementations of the copy_scalars_from() method
have an incomplete bounds check that allows negative src_offset values to pass
through and cause out-of-bounds memory access. In
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h (lines 392-403,
anchor), add a check for src_offset < 0 in the initial validation condition
alongside the existing count < 0 and src_offset + count > src.scalar_count_
checks. Apply the identical lower-bound check (src_offset < 0) to the mirror
implementation in src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_types.h
(lines 390-401, sibling) to ensure consistent validation across both copies of
the method.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0380cc61-e1fe-4642-a282-8ed187bf9bf9
📒 Files selected for processing (11)
src/a2a3/runtime/host_build_graph/runtime/pto_runtime2_types.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2_types.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.hsrc/a5/runtime/host_build_graph/runtime/pto_runtime2_types.hsrc/a5/runtime/host_build_graph/runtime/tensor_info.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2_types.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_types.hsrc/common/platform/include/aicpu/dump_arg_selection.hsrc/common/task_interface/profiling_config.h
e902a8d to
9166ebb
Compare
- Centralize PTO2 profiling macro defaults in profiling_config.h - Move tensor/scalar dump selection metadata into DumpArgSelection - Reuse the shared helper from a2a3 and a5 Arg implementations - Update profiling and tensor dump docs for the new macro source
9166ebb to
1266722
Compare
Summary
Testing
Closes #1043