fix(cua): use scroll notch count (wheel units) in all computer-use templates #129

Merged
dprevoznik merged 16 commits into main from fix-cua-templates-scroll-behavior on Mar 3, 2026

Conversation

@dprevoznik dprevoznik commented Mar 3, 2026

Updates scrolling logic across all computer-use templates in the CLI.

  • Branch: fix-cua-templates-scroll-behavior

Made with Cursor


Note

Medium Risk
Changes scroll semantics across multiple computer-use templates from pixel-based deltas to wheel-notch units, which can materially alter agent navigation behavior and task success. Also includes a few runtime guards (local execution gating, null-handling) that are low risk but touch invocation paths.

Overview
Unifies scroll behavior across computer-use templates by switching from pixel-based scroll deltas to wheel unit (notch) counts in Anthropic (TS/Python), Gemini (TS/Python), Yutori (TS/Python), and OpenAGI handlers, and updating tool outputs/prompts to reflect the new units.

Gemini templates now convert magnitude (px) to capped notch counts (via PX_PER_NOTCH/MAX_NOTCHES_PER_ACTION) and add a few safety checks (e.g., missing function names/content). Local test entrypoints in Gemini are gated on KERNEL_INVOCATION, and the TS Gemini session adds null-safe handling for returned URLs/IDs.
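
The pixel-to-notch conversion described above can be sketched as follows. This is a minimal illustration, not the template code: the constant values (60 and 17) are the ones quoted in the commit messages in this PR, while the function name `magnitude_to_notches`, the rounding mode, and the floor of one notch are assumptions made for the example.

```python
PX_PER_NOTCH = 60            # value quoted in the commit messages
MAX_NOTCHES_PER_ACTION = 17  # value quoted in the commit messages


def magnitude_to_notches(magnitude_px: float) -> int:
    """Convert a pixel magnitude to a capped, unsigned notch count.

    Hypothetical helper: rounding to nearest and the minimum of one
    notch are assumptions, not confirmed template behavior.
    """
    notches = round(abs(magnitude_px) / PX_PER_NOTCH)
    # Clamp so a single action never exceeds the cap and never becomes a no-op.
    return max(1, min(notches, MAX_NOTCHES_PER_ACTION))
```

Under these assumptions, a magnitude of 400 px maps to 7 notches, and anything above roughly 1000 px is capped at 17.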

Template/QA naming is aligned by changing the yutori template key from yutori-computer-use to yutori in docs (qa.md) and pkg/create/templates.go.

Written by Cursor Bugbot for commit 6f3501d.

…template

- Send delta_x/delta_y as signed notch count (kernel-images uses delta
  as wheel-event repeat count, not pixels)
- Return tool output: "Scrolled N wheel unit(s) direction."
- Add system prompt line: scroll_amount and result are in wheel units

Made-with: Cursor
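
As a rough sketch of what the commit message above describes, the snippet below builds signed `delta_x`/`delta_y` notch counts from a direction and formats the tool output string. The helper names `scroll_deltas` and `scroll_result` are hypothetical, and the sign convention (positive `delta_y` scrolls down, positive `delta_x` scrolls right) is an assumption for illustration only.

```python
# Assumed sign convention: positive y is down, positive x is right.
_DIRECTION_SIGNS = {
    "up": (0, -1),
    "down": (0, 1),
    "left": (-1, 0),
    "right": (1, 0),
}


def scroll_deltas(direction: str, notches: int) -> tuple[int, int]:
    """Return (delta_x, delta_y) as signed notch counts, not pixels."""
    if notches < 0:
        raise ValueError("notches must be non-negative")
    if direction not in _DIRECTION_SIGNS:
        raise ValueError(f"unknown scroll direction: {direction!r}")
    sign_x, sign_y = _DIRECTION_SIGNS[direction]
    return (sign_x * notches, sign_y * notches)


def scroll_result(direction: str, notches: int) -> str:
    """Format the tool output in the wording quoted in the commit message."""
    return f"Scrolled {notches} wheel unit(s) {direction}."
```

For example, `scroll_deltas("down", 3)` yields `(0, 3)` under the assumed convention, and the matching tool output reports wheel units rather than pixels.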
…late

- Send delta_x/delta_y as signed notch count (kernel-images uses delta
  as wheel-event repeat count, not pixels)
- Return tool output: "Scrolled N wheel unit(s) direction."
- Add system prompt line: scroll_amount and result are in wheel units

Made-with: Cursor
…plate

- Send delta_x/delta_y as signed notch count (kernel-images uses delta
  as wheel-event repeat count, not pixels)
- SCROLL_DOCUMENT: use 3 notches instead of 500 pixels
- SCROLL_AT: treat magnitude as notch count instead of denormalizing to pixels

Made-with: Cursor
- Send delta_x/delta_y as signed notch count (kernel-images uses delta
  as wheel-event repeat count, not pixels)
- SCROLL_DOCUMENT: use 3 notches instead of 500 pixels
- SCROLL_AT: treat magnitude as notch count instead of denormalizing to pixels

Made-with: Cursor
…plate

- Send delta_x/delta_y as signed notch count (kernel-images uses delta
  as wheel-event repeat count, not pixels)
- Return tool output: "Scrolled N wheel unit(s) direction."

Made-with: Cursor
- Send delta_x/delta_y as signed notch count (kernel-images uses delta
  as wheel-event repeat count, not pixels)
- Return tool output: "Scrolled N wheel unit(s) direction."

Made-with: Cursor
- Send delta_x/delta_y as signed notch count (kernel-images uses delta
  as wheel-event repeat count, not pixels)
- Support left/right scroll directions
- Default scroll_amount changed from 100 (pixels) to 3 (notches)

Made-with: Cursor
…7 notches)

- scroll_document and scroll_at: magnitude → notches with PX_PER_NOTCH=60,
  MAX_NOTCHES_PER_ACTION=17, single API call
- Remove chunking; default magnitude 400
- KERNEL_INVOCATION guard so invokes use payload query

Made-with: Cursor
- Scroll: same as Python (PX_PER_NOTCH 60, MAX_NOTCHES_PER_ACTION 17, single call)
- loop.ts: Environment.ENVIRONMENT_BROWSER, candidate?.content, fc.name guard,
  content check in pruneOldScreenshots
- session.ts: session_id/liveViewUrl/replayViewUrl ?? null for string | null
- index.ts: KERNEL_INVOCATION guard for payload query

Made-with: Cursor
- OpenAGI/Lux emits N scroll actions for amount N; treat each as 1 notch
- Document in handler docstring; no coalescing

Made-with: Cursor
@dprevoznik dprevoznik changed the title from "fix-cua-templates: add cua-scroll debug log for scroll action (yutori Python)" to "fix(cua): use scroll notch count (wheel units) in all computer-use templates" Mar 3, 2026
Scroll amount was stored but never used; the only call to _execute_scroll
hardcoded notches=1. Align with docstring: 1 scroll event = 1 notch, model
controls amount by emitting N scroll actions. Remove parameter and
assignment to fix misleading API.

Made-with: Cursor
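
The "one scroll event equals one notch, no coalescing" rule from the two commit messages above could look roughly like this. It is a sketch under stated assumptions: `execute_scroll_actions` and the `send_scroll` callback are hypothetical names, not the handler's real API.

```python
from typing import Callable, Iterable


def execute_scroll_actions(
    directions: Iterable[str],
    send_scroll: Callable[[str, int], None],
) -> int:
    """Forward each emitted scroll action as exactly one notch.

    The model controls the total distance by emitting N actions, so the
    handler takes no amount parameter and never merges actions together.
    Returns the number of scroll events sent.
    """
    sent = 0
    for direction in directions:
        send_scroll(direction, 1)  # always a single notch per action
        sent += 1
    return sent
```

With this shape, three emitted "down" actions produce three separate one-notch calls, which matches the docstring behavior the last commit aligns the code with.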
@dprevoznik dprevoznik requested a review from Sayan- March 3, 2026 05:06
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@Sayan- Sayan- left a comment

yeet

- Template key: yutori-computer-use -> yutori in pkg/create/templates.go
- Rename pkg/templates/python/yutori-computer-use -> python/yutori
- Rename pkg/templates/typescript/yutori-computer-use -> typescript/yutori
- Update .cursor/commands/qa.md to use -t yutori

Made-with: Cursor
@dprevoznik dprevoznik merged commit 8898a01 into main Mar 3, 2026
2 checks passed
@dprevoznik dprevoznik deleted the fix-cua-templates-scroll-behavior branch March 3, 2026 13:49