docs(args): fix stale cap literals missed by #1056 by ChaoZheng109 · Pull Request #1064 · hw-native-sys/simpler

ChaoZheng109 · 2026-06-16T06:59:40Z

What

Fixes two stale cap literals left behind by #1056 (which raised
CORE_MAX_TENSOR_ARGS 16→32 and lowered CORE_MAX_SCALAR_ARGS 32→16).
#1056 updated the scalar-limit error strings but missed:

- pto_types.h (a2a3 + a5): the tensor-limit error string still read
  "exceeds MAX_TENSOR_ARGS=16". The cap is now 32, so a user who hits
  the tensor-arg cap would otherwise get a misleading message.
- pto_runtime2_types.h (a2a3 + a5): the PTO2TaskPayload::init
  memcpy comment said "Both arrays are 1024B" (dates from an old
  128-scalar cap). The scalar arrays are now MAX_SCALAR_ARGS * 8 =
  128B.

Comment/string only — no ABI or behavior change.
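
One way to keep a limit message from going stale is to build it from the constant rather than hardcoding the number. The following is a minimal sketch of that idea, not the code in pto_types.h: `kMaxTensorArgs` and `tensor_limit_message` are hypothetical names, and the value 32 is the cap after #1056.

```cpp
#include <string>

// Hypothetical stand-in for the real cap; 32 is the value after #1056.
constexpr int kMaxTensorArgs = 32;

// Build the limit message from the constant so the number in the text
// always matches the cap that is actually enforced.
inline std::string tensor_limit_message() {
    return "exceeds MAX_TENSOR_ARGS=" + std::to_string(kMaxTensorArgs);
}
```

With this shape, changing the cap changes the message in the same edit, at the cost of building the string at run time instead of using a plain literal.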

⚠️ Stacked on #1056

These literals are only correct on top of #1056's cap change. On
main, CORE_MAX_TENSOR_ARGS is still 16 and the scalar arrays are 256B,
so the current text is correct there. This branch is stacked on #1056:

- Do not merge before #1056 (Update: raise CORE_MAX_TENSOR_ARGS to 32, lower scalars to 16).
- Until #1056 merges, this PR's diff also includes #1056's commit; after #1056 lands, rebasing reduces it to the 4-line doc fix here.
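
The scalar array sizes quoted for main and for the stacked branch follow directly from the scalar cap. A small sketch of that arithmetic, assuming 8-byte scalar slots as implied by "MAX_SCALAR_ARGS * 8"; `ArgCaps` and `scalar_array_bytes` are hypothetical names used only for illustration:

```cpp
#include <cstddef>

// Hypothetical struct for illustration; not a type from the repository.
struct ArgCaps {
    int tensor_args;
    int scalar_args;
};

constexpr ArgCaps kMainCaps{16, 32};     // main, before #1056
constexpr ArgCaps kStackedCaps{32, 16};  // with #1056 applied

// Assumes 8-byte scalar slots, per "MAX_SCALAR_ARGS * 8" in the description.
constexpr std::size_t scalar_array_bytes(ArgCaps caps) {
    return static_cast<std::size_t>(caps.scalar_args) * 8;
}

static_assert(scalar_array_bytes(kMainCaps) == 256, "main: 32 scalars * 8 B");
static_assert(scalar_array_bytes(kStackedCaps) == 128, "stacked: 16 scalars * 8 B");
```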

Double the per-core kernel tensor-arg capacity (16 -> 32). The most tensor-hungry in-tree kernel (spmd_paged_attention_highperf) already uses 15 in-core tensors, leaving only one slot of headroom under the old cap of 16.

Offset the cost by lowering CORE_MAX_SCALAR_ARGS 32 -> 16 so the tensor+scalar sum stays 48. This keeps PTO2DispatchPayload at 512 B and the SPMD context indices at 48/49, so per-dispatch latency is unchanged. Repo-wide max in-core scalar usage is 8 (spmd_paged_attention), well under the new 16-scalar cap.

- arg_direction.h: CORE_MAX_TENSOR_ARGS 16->32, CORE_MAX_SCALAR_ARGS 32->16
- DepGenRecord (tensor-driven): size 2624->4672, _pad0 20->4, DEP_GEN_OVERFLOW_DEPS_PER_RECORD 326->582; docs/dfx/dep_gen.md updated
- PTO2TaskPayload: tensors region 2048->4096 B, scalar-region guard 256->128 B; stale cache-line layout comments corrected
- pto_types.h: scalar-limit error strings 32->16; guard add_scalars, add_scalars_i32, copy_scalars_from against negative count (signed count bypassed the bounds check -> oversized memcpy / negative scalar_count_)
- intrinsic.h / pto2_dispatch_payload.h: comment-only (indices and payload size return to their original 48/49 and 512 B)

Verified: a2a3 + a5 build, sim (vector_example, scalar_data) pass; hardware A/B perf-neutral on qwen3 decode (Device +0.3%, within noise).
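
The negative-count problem mentioned for pto_types.h can be illustrated with a small sketch. This is not the repository's implementation: `ScalarArgs` and its members are hypothetical, and only the general shape of the guard (reject a negative signed count before the capacity check and the memcpy) is taken from the commit message.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical cap and container for illustration; not the repository's class.
constexpr int kMaxScalarArgs = 16;

struct ScalarArgs {
    std::uint64_t scalars[kMaxScalarArgs] = {};
    int scalar_count = 0;

    // Returns false and leaves the state untouched if the request is invalid.
    bool add_scalars(const std::uint64_t* src, int count) {
        // Reject negative counts first: with a signed count, a check such as
        // `scalar_count + count > kMaxScalarArgs` passes for count < 0.
        if (src == nullptr || count < 0) return false;
        // Capacity check written as a subtraction so it cannot overflow.
        if (count > kMaxScalarArgs - scalar_count) return false;
        std::memcpy(scalars + scalar_count, src,
                    static_cast<std::size_t>(count) * sizeof(std::uint64_t));
        scalar_count += count;
        return true;
    }
};
```

Without the `count < 0` test, a negative count would pass the capacity check, be converted to a very large unsigned size for memcpy, and leave `scalar_count` negative, which matches the failure modes the commit message describes.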

hw-native-sys#1056 raised CORE_MAX_TENSOR_ARGS to 32 and lowered CORE_MAX_SCALAR_ARGS to 16. It updated the scalar-limit error strings but left two stale literals behind:

- pto_types.h: the tensor-limit error string still read "exceeds MAX_TENSOR_ARGS=16"; the cap is now 32 (a2a3 + a5).
- pto_runtime2_types.h: the PTO2TaskPayload::init memcpy comment said "Both arrays are 1024B" (dates from a 128-scalar cap). The scalar arrays are now MAX_SCALAR_ARGS * 8 = 128B (a2a3 + a5).

Comment/string only; no ABI or behavior change.

gemini-code-assist · 2026-06-16T06:59:43Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

coderabbitai · 2026-06-16T06:59:49Z

Warning

Review limit reached

@ChaoZheng109, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 22 minutes and 27 seconds. Learn how PR review limits work.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e27e8847-34e4-4186-9cab-893a4724c644

📥 Commits

Reviewing files that changed from the base of the PR and between 19b2c0b and d03deca.

📒 Files selected for processing (14)

docs/dfx/dep_gen.md
src/a2a3/platform/include/common/dep_gen.h
src/a2a3/runtime/tensormap_and_ringbuffer/common/intrinsic.h
src/a2a3/runtime/tensormap_and_ringbuffer/host/dep_gen_replay.cpp
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto2_dispatch_payload.h
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2_types.h
src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_types.h
src/a5/platform/include/common/dep_gen.h
src/a5/runtime/tensormap_and_ringbuffer/common/intrinsic.h
src/a5/runtime/tensormap_and_ringbuffer/host/dep_gen_replay.cpp
src/a5/runtime/tensormap_and_ringbuffer/runtime/pto2_dispatch_payload.h
src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2_types.h
src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_types.h
src/common/task_interface/arg_direction.h


ChaoZheng109 added 2 commits June 15, 2026 20:14

ChaoZheng109 wants to merge 2 commits into hw-native-sys:main from ChaoZheng109:fix-arg-cap-stale-strings
