12 changes: 12 additions & 0 deletions WHATS_NEW.md
@@ -2,6 +2,18 @@

## What's new (2026-06-26)

### Trial and Force Action Modes (Playwright-style)

Dry-run "is this control ready?" without clicking, or force a click past the gate. Full reference: [`docs/source/Eng/doc/new_features/v222_features_doc.rst`](docs/source/Eng/doc/new_features/v222_features_doc.rst).

- **`act_with_mode`** (`AC_act_with_mode`): `actionability.act_when_ready` only waits-then-acts. Real flows need two more modes Playwright codified: **trial** (run every actionability check but *don't* act — a side-effect-free "would this work?" dry run) and **force** (skip the checks and act now — the escape hatch when the gate misjudges a control as occluded/disabled). `act_with_mode` adds both alongside the default `auto`, over the same injectable seams as the gate, returning `{mode, acted, actionable, reason, point, result}`. Reuses `actionability.wait_actionable`; fully testable without a screen. Completes the ROUND-15 input-fidelity lane (7/7). No `PySide6`.

### Act In View — Scroll to a Target, Then Act When Actionable

Click the row three pages down: scroll it into view, then gate on actionability before clicking. Full reference: [`docs/source/Eng/doc/new_features/v221_features_doc.rst`](docs/source/Eng/doc/new_features/v221_features_doc.rst).

- **`act_in_view` / `ScrollPlan`** (`AC_act_in_view`): two reliability primitives stayed separate — `scroll_find.scroll_until_visible` brings an off-screen target on-screen, and `actionability.act_when_ready` waits for it to be visible/stable/enabled/unoccluded before acting. A real "click the off-screen row" step needs both. `act_in_view` composes them: scroll until the target is located, then run the actionability gate at its point and perform the action. `ScrollPlan` bundles the scroll search + its `locator`/`scroller` seams so the call stays within the argument limit; the actionability probes (`region_sampler`/`enabled_probe`/`hit_tester`) and gate `config` are injectable too, so the whole flow is testable without a screen. Closes the input-fidelity lane's composition gap. No `PySide6`.

### Template-Free Element Proposal (Pixels to Elements)

Get a clean numbered element list straight from the screen when there's no accessibility tree. Full reference: [`docs/source/Eng/doc/new_features/v220_features_doc.rst`](docs/source/Eng/doc/new_features/v220_features_doc.rst).
49 changes: 49 additions & 0 deletions docs/source/Eng/doc/new_features/v221_features_doc.rst
@@ -0,0 +1,49 @@
Act In View — Scroll to a Target, Then Act When Actionable
==========================================================

Two reliability primitives stayed separate: ``scroll_find.scroll_until_visible``
brings an off-screen target on-screen, and ``actionability.act_when_ready`` waits
for a target to be visible / stable / enabled / unoccluded before acting. A real
"click the row three pages down" step needs *both* — scroll to it, then gate
before clicking. ``act_in_view`` composes them into one call.

* :class:`ScrollPlan`: bundles the scroll search (``kind`` / ``direction`` /
  ``max_scrolls`` / ``scroll_amount``) and its injectable ``locator`` /
  ``scroller`` seams, so the composed call stays within a sane argument count.
* :func:`act_in_view`: scroll until the target is found, then run the
  actionability gate at its location and perform ``action`` on it.

Every seam — the scroll locator / scroller, the action, the actionability probes
(``region_sampler`` / ``enabled_probe`` / ``hit_tester``) and the gate ``config``
— is injectable, so the whole flow is testable without a screen. Reuses
:func:`scroll_find.scroll_until_visible` and
:func:`actionability.act_when_ready`. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import act_in_view, ScrollPlan

    # click is a placeholder for your own click helper; the import above
    # does not provide it.
    # Scroll down to the "Submit" button image, then click it once it's actionable
    act_in_view("submit.png", lambda point: click(point[0], point[1]),
                scroll=ScrollPlan(kind="image", direction="down",
                                  max_scrolls=20))

``act_in_view`` returns ``{acted, coords, scrolls, result}`` (``result`` is the
action's return value) and raises ``AutoControlActionException`` if the target
never comes into view. Pass ``enabled_probe`` / ``hit_tester`` / ``config`` to
have the actionability gate actually wait for the control to be enabled and
unoccluded before the action fires — otherwise it acts as soon as the target is
located.
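
The "testable without a screen" claim can be illustrated with fake seams. The
sketch below is a simplified, hypothetical stand-in and does not import
``je_auto_control``: ``scroll_until_found`` and ``act_in_view_sketch`` are
illustrative names, the gate is reduced to a single ``enabled_probe`` check
instead of a wait loop, and the exceptions are plain built-ins rather than
``AutoControlActionException``.

```python
from typing import Any, Callable, Dict, List, Optional


def scroll_until_found(locate: Callable[[], Optional[List[int]]],
                       scroll_once: Callable[[], None],
                       max_scrolls: int) -> Dict[str, Any]:
    """Hypothetical stand-in for the scroll search: look, scroll, look again."""
    for scrolls in range(max_scrolls + 1):
        coords = locate()
        if coords is not None:
            return {"found": True, "coords": coords, "scrolls": scrolls}
        if scrolls < max_scrolls:
            scroll_once()
    return {"found": False, "coords": None, "scrolls": max_scrolls}


def act_in_view_sketch(locate: Callable[[], Optional[List[int]]],
                       scroll_once: Callable[[], None],
                       action: Callable[[List[int]], Any], *,
                       max_scrolls: int = 10,
                       enabled_probe: Optional[Callable[[], Optional[bool]]] = None,
                       ) -> Dict[str, Any]:
    """Scroll until located, run a one-shot gate check, then act."""
    found = scroll_until_found(locate, scroll_once, max_scrolls)
    if not found["found"]:
        raise LookupError(f"target not in view after {found['scrolls']} scrolls")
    # Simplified gate: one check here, whereas a real gate would keep waiting.
    if enabled_probe is not None and enabled_probe() is False:
        raise TimeoutError("target located but not enabled")
    result = action(found["coords"])
    return {"acted": True, "coords": found["coords"],
            "scrolls": found["scrolls"], "result": result}


# Fake seams: the target becomes locatable after three scroll steps.
state = {"offset": 0}
clicks: List[tuple] = []


def fake_locate() -> Optional[List[int]]:
    return [120, 340] if state["offset"] >= 3 else None


def fake_scroll() -> None:
    state["offset"] += 1


def fake_click(point: List[int]) -> str:
    clicks.append(tuple(point))
    return "clicked"


report = act_in_view_sketch(fake_locate, fake_scroll, fake_click,
                            max_scrolls=5, enabled_probe=lambda: True)
print(report)
```

Because the locator, scroller and action are plain callables, a unit test can
assert on the recorded clicks and the scroll count without touching a display.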

Executor commands
-----------------

``AC_act_in_view`` (``target`` + ``kind`` / ``direction`` / ``max_scrolls`` /
``scroll_amount`` / ``button`` → ``{acted, coords, scrolls}``) scrolls a template
or text target into view and clicks it. It is also exposed as the matching
``ac_act_in_view`` MCP tool and as a Script Builder command under **Flow**.
:func:`act_in_view`, which takes an arbitrary action and the actionability
probes, is the Python-API surface.
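
As a rough illustration of the command's fields, the sketch below assembles one
``AC_act_in_view`` step as plain data. The ``[name, {params}]`` step shape and
the helper ``build_act_in_view_step`` are assumptions made for this example;
the field names and defaults are the ones listed for the Script Builder
command.

```python
import json
from typing import Any, Dict, List

# Optional fields and their defaults, as listed for the Script Builder command.
DEFAULTS: Dict[str, Any] = {
    "kind": "image",
    "direction": "down",
    "max_scrolls": 10,
    "scroll_amount": 3,
    "button": "left",
}


def build_act_in_view_step(target: str, **overrides: Any) -> List[Any]:
    """Hypothetical helper: build one step, rejecting unknown field names."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise TypeError(f"unknown field(s): {sorted(unknown)}")
    return ["AC_act_in_view", {"target": target, **DEFAULTS, **overrides}]


# Assumed script shape: a list of [command_name, params] steps.
script = [build_act_in_view_step("submit.png", max_scrolls=20)]
print(json.dumps(script, indent=2))
```

Rejecting unknown keys early keeps a typo such as ``max_scroll`` from silently
falling back to the default.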
49 changes: 49 additions & 0 deletions docs/source/Eng/doc/new_features/v222_features_doc.rst
@@ -0,0 +1,49 @@
Trial and Force Action Modes (Playwright-style)
===============================================

``actionability.act_when_ready`` has one behaviour: wait for the target to be
actionable, then act (or raise on timeout). Real flows need two more modes that
Playwright codified:

* **trial**: run every actionability check but *don't* perform the action; just
  report whether it *would* have acted. The dry run for "is this control ready?"
  without side effects.
* **force**: skip the checks and act *now*, the deliberate escape hatch when the
  gate is wrong (a control the heuristics misjudge as occluded / disabled).

:func:`act_with_mode` adds both alongside the default gated (``auto``) behaviour,
over the same injectable seams as the gate, so each mode is testable without a
screen. Reuses :func:`actionability.wait_actionable`. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import act_with_mode

    # x, y, w, h and do_click are placeholders for the target's bounding box
    # and your own click helper; the import above does not provide them.
    bbox = lambda: (x, y, w, h)
    click = lambda point: do_click(point[0], point[1])

    act_with_mode(click, bbox, mode="auto")            # gate, then click if ready
    report = act_with_mode(click, bbox, mode="trial")  # dry run, never clicks
    if report["actionable"]:
        ...
    act_with_mode(click, bbox, mode="force")           # click now, no checks

Every mode returns ``{mode, acted, actionable, reason, point, result}``:
``acted`` says whether the action ran, ``actionable`` / ``reason`` come from the
gate (``trial`` reports these without acting), and ``result`` is the action's
return value. The actionability probes (``region_sampler`` / ``enabled_probe`` /
``hit_tester``) and ``config`` are forwarded to the gate as usual. An unknown
``mode`` raises ``ValueError``.
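
The three modes can be pictured as a small dispatch over one gate check. This
is a hypothetical sketch, not the library's code: ``act_with_mode_sketch`` and
its ``check`` callable are illustrative, and two behaviours are assumptions,
namely that ``auto`` returns ``acted=False`` when the check fails and that
``force`` reports ``actionable`` as ``None``.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

MODES = ("auto", "trial", "force")
Check = Callable[[List[int]], Tuple[bool, str]]


def act_with_mode_sketch(action: Callable[[List[int]], Any],
                         bbox: Callable[[], Tuple[int, int, int, int]], *,
                         mode: str = "auto",
                         check: Optional[Check] = None) -> Dict[str, Any]:
    """Run ``action`` at the bbox centre under auto / trial / force."""
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    x, y, w, h = bbox()
    point = [x + w // 2, y + h // 2]
    report: Dict[str, Any] = {"mode": mode, "acted": False, "actionable": None,
                              "reason": "checks skipped", "point": point,
                              "result": None}
    if mode != "force":
        # auto and trial both consult the gate; force never does.
        actionable, reason = check(point) if check else (True, "ok")
        report.update(actionable=actionable, reason=reason)
        if mode == "trial" or not actionable:
            return report  # trial never acts; auto acts only when ready
    report["result"] = action(point)
    report["acted"] = True
    return report


clicks: List[tuple] = []


def click(point: List[int]) -> str:
    clicks.append(tuple(point))
    return "clicked"


def bbox() -> Tuple[int, int, int, int]:
    return (10, 20, 100, 40)


def occluded(point: List[int]) -> Tuple[bool, str]:
    return (False, "occluded")


trial = act_with_mode_sketch(click, bbox, mode="trial", check=occluded)
auto = act_with_mode_sketch(click, bbox, mode="auto", check=occluded)
forced = act_with_mode_sketch(click, bbox, mode="force", check=occluded)
print(trial["reason"], auto["acted"], forced["acted"], clicks)
```

With a check that always reports "occluded", only the ``force`` call reaches
the action, which is the escape-hatch behaviour described above.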

Executor commands
-----------------

``AC_act_with_mode`` (``x`` / ``y`` + ``mode`` / ``button`` → ``{mode, acted,
actionable, reason, point}``) clicks a point under the chosen mode: ``trial``
is a dry-run probe that never clicks, and ``force`` clicks unconditionally. It is
also exposed as the matching ``ac_act_with_mode`` MCP tool and as a Script
Builder command under **Flow**. :func:`act_with_mode`, which takes an arbitrary
action, is the Python-API surface.
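
A step for this command might be assembled as shown below. The
``[name, {params}]`` step shape and the helper ``build_act_with_mode_step`` are
assumptions for illustration; the field names, the ``auto`` / ``left`` defaults
and the three mode names come from the command description.

```python
import json
from typing import Any, List

ALLOWED_MODES = ("auto", "trial", "force")


def build_act_with_mode_step(x: int, y: int, mode: str = "auto",
                             button: str = "left") -> List[Any]:
    """Hypothetical helper: validate the mode, then build one step."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    return ["AC_act_with_mode",
            {"x": x, "y": y, "mode": mode, "button": button}]


# Probe first with a side-effect-free trial, then issue the gated click.
script = [
    build_act_with_mode_step(640, 360, mode="trial"),
    build_act_with_mode_step(640, 360),
]
print(json.dumps(script))
```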
38 changes: 38 additions & 0 deletions docs/source/Zh/doc/new_features/v221_features_doc.rst
@@ -0,0 +1,38 @@
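Act In View: Scroll to a Target, Then Act When Actionable
==========================================================

Two reliability primitives used to stand apart: ``scroll_find.scroll_until_visible`` brings an off-screen target on-screen,
and ``actionability.act_when_ready`` waits until the target is visible / stable / enabled / unoccluded before acting. A real
"click the row three pages down" step needs *both*: scroll to it, then gate before clicking. ``act_in_view`` composes them into a single call.

* :class:`ScrollPlan`: bundles the scroll search (``kind`` / ``direction`` / ``max_scrolls`` /
  ``scroll_amount``) with its injectable ``locator`` / ``scroller`` seams, so the composed call keeps a reasonable argument count.
* :func:`act_in_view`: scrolls until the target is found, then runs the actionability gate at its location and performs ``action`` on it.

Every seam (the scroll locator / scroller, the action, the actionability probes ``region_sampler`` /
``enabled_probe`` / ``hit_tester``, and the gate ``config``) is injectable, so the whole flow can be tested without a screen.
Reuses :func:`scroll_find.scroll_until_visible` and :func:`actionability.act_when_ready`. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import act_in_view, ScrollPlan

    # Scroll down to the "Submit" button image, then click it once it is actionable
    act_in_view("submit.png", lambda point: click(point[0], point[1]),
                scroll=ScrollPlan(kind="image", direction="down",
                                  max_scrolls=20))

``act_in_view`` returns ``{acted, coords, scrolls, result}`` (``result`` is the action's return value) and raises
``AutoControlActionException`` if the target never comes into view. Pass ``enabled_probe`` / ``hit_tester`` /
``config`` to make the actionability gate actually wait until the control is enabled and unoccluded before the action fires; otherwise it acts as soon as the target is located.

Executor commands
-----------------

``AC_act_in_view`` (``target`` plus ``kind`` / ``direction`` / ``max_scrolls`` /
``scroll_amount`` / ``button`` → ``{acted, coords, scrolls}``) scrolls a template or text target into view and clicks it.
It is available as the matching ``ac_act_in_view`` MCP tool and as a Script Builder command (under the **Flow** category).
:func:`act_in_view`, which accepts an arbitrary action and the actionability probes, is the Python API surface.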
41 changes: 41 additions & 0 deletions docs/source/Zh/doc/new_features/v222_features_doc.rst
@@ -0,0 +1,41 @@
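Trial and Force Action Modes (Playwright-style)
===============================================

``actionability.act_when_ready`` has a single behaviour: wait until the target is actionable, then act (or raise on timeout). Real flows also need
the two other modes that Playwright defined:

* **trial**: runs every actionability check but does *not* actually act; it only reports whether it *would* have acted.
  A side-effect-free dry run for "is this control ready?".
* **force**: skips the checks and acts *immediately*; the deliberate escape hatch for when the gate misjudges a control as occluded / disabled.

:func:`act_with_mode` adds both alongside the default gated (``auto``) behaviour, over the same injectable seams as the gate,
so every mode can be tested without a screen. Reuses :func:`actionability.wait_actionable`. Imports no ``PySide6``.

Headless API
------------

.. code-block:: python

    from je_auto_control import act_with_mode

    bbox = lambda: (x, y, w, h)
    click = lambda point: do_click(point[0], point[1])

    act_with_mode(click, bbox, mode="auto")            # gate, then click if ready
    report = act_with_mode(click, bbox, mode="trial")  # dry run, never clicks
    if report["actionable"]:
        ...
    act_with_mode(click, bbox, mode="force")           # click immediately, no checks

Every mode returns ``{mode, acted, actionable, reason, point, result}``: ``acted`` says whether the action ran,
``actionable`` / ``reason`` come from the gate (``trial`` reports these without acting), and ``result`` is the action's return value.
The actionability probes (``region_sampler`` / ``enabled_probe`` / ``hit_tester``) and ``config`` are forwarded to the gate as usual.
An unknown ``mode`` raises ``ValueError``.

Executor commands
-----------------

``AC_act_with_mode`` (``x`` / ``y`` plus ``mode`` / ``button`` → ``{mode, acted,
actionable, reason, point}``) clicks a point under the chosen mode: ``trial`` is a dry-run probe that never clicks, and ``force`` clicks unconditionally.
It is available as the matching ``ac_act_with_mode`` MCP tool and as a Script Builder command (under the **Flow** category).
:func:`act_with_mode`, which accepts an arbitrary action, is the Python API surface.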
5 changes: 5 additions & 0 deletions je_auto_control/__init__.py
@@ -151,6 +151,10 @@
)
# Propose a clean element list from raw pixels (template-free)
from je_auto_control.utils.element_proposal import propose_elements, tag_kinds
# Scroll a target into view, then act on it once it is actionable
from je_auto_control.utils.act_in_view import ScrollPlan, act_in_view
# Trial / force action modes over the actionability gate
from je_auto_control.utils.act_modes import act_with_mode
# Rich clipboard formats — RTF + CSV/TSV codecs and Windows get / set
from je_auto_control.utils.clipboard_rich_formats import (
    build_rtf, csv_to_rows, get_clipboard_csv, get_clipboard_rtf, rows_to_csv,
@@ -1782,6 +1786,7 @@ def start_autocontrol_gui(*args, **kwargs):
    "localize_changes", "rank_changes",
    "classify_widget", "box_features", "classify_icon",
    "propose_elements", "tag_kinds",
    "act_in_view", "ScrollPlan", "act_with_mode",
    "build_rtf", "rtf_to_text", "rows_to_csv", "csv_to_rows",
    "set_clipboard_rtf", "get_clipboard_rtf",
    "set_clipboard_csv", "get_clipboard_csv",
29 changes: 29 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -4522,6 +4522,35 @@ def _add_work_queue_specs(specs: List[CommandSpec]) -> None:
        ),
        description="Index where a busy/idle series first settles idle.",
    ))
    specs.append(CommandSpec(
        "AC_act_in_view", "Flow", "Act In View (scroll + click)",
        fields=(
            FieldSpec("target", FieldType.STRING,
                      placeholder="template path or text"),
            FieldSpec("kind", FieldType.STRING, optional=True,
                      default="image", placeholder="image / text"),
            FieldSpec("direction", FieldType.STRING, optional=True,
                      default="down", placeholder="up / down"),
            FieldSpec("max_scrolls", FieldType.INT, optional=True, default=10),
            FieldSpec("scroll_amount", FieldType.INT, optional=True,
                      default=3),
            FieldSpec("button", FieldType.STRING, optional=True,
                      default="left"),
        ),
        description="Scroll a target into view, then click it when actionable.",
    ))
    specs.append(CommandSpec(
        "AC_act_with_mode", "Flow", "Click with Mode (auto/trial/force)",
        fields=(
            FieldSpec("x", FieldType.INT, placeholder="x"),
            FieldSpec("y", FieldType.INT, placeholder="y"),
            FieldSpec("mode", FieldType.STRING, optional=True, default="auto",
                      placeholder="auto / trial / force"),
            FieldSpec("button", FieldType.STRING, optional=True,
                      default="left"),
        ),
        description="Click a point under an action mode (gate / dry-run / force).",
    ))
    specs.append(CommandSpec(
        "AC_simulate_cvd", "Image", "Simulate Colour-Vision Deficiency",
        fields=(
6 changes: 6 additions & 0 deletions je_auto_control/utils/act_in_view/__init__.py
@@ -0,0 +1,6 @@
"""Scroll a target into view, then act on it once it is actionable."""
from je_auto_control.utils.act_in_view.act_in_view import (
    ScrollPlan, act_in_view,
)

__all__ = ["act_in_view", "ScrollPlan"]
70 changes: 70 additions & 0 deletions je_auto_control/utils/act_in_view/act_in_view.py
@@ -0,0 +1,70 @@
"""Scroll a target into view, then act on it only once it is actionable.

Two reliability primitives the framework already had stayed separate:
``scroll_find.scroll_until_visible`` brings an off-screen target on-screen, and
``actionability.act_when_ready`` waits for a target to be visible / stable /
enabled / unoccluded before acting. A real "click the row three pages down" step
needs *both* — scroll to it, then gate before clicking. ``act_in_view`` composes
them into one call.

* :class:`ScrollPlan` — bundles the scroll search (``kind`` / ``direction`` /
``max_scrolls`` / ``scroll_amount``) and its injectable ``locator`` /
``scroller`` seams, so the composed call stays within a sane argument count.
* :func:`act_in_view` — scroll until the target is found, then run the
actionability gate at its location and perform ``action`` on it.

All seams (locator / scroller / action / actionability probes / gate ``config``)
are injectable, so the whole flow is testable without a screen. Reuses
:func:`scroll_find.scroll_until_visible` and
:func:`actionability.act_when_ready`. Imports no ``PySide6``.
"""
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

from je_auto_control.utils.actionability import GateConfig, act_when_ready
from je_auto_control.utils.exception.exceptions import AutoControlActionException
from je_auto_control.utils.scroll_find import scroll_until_visible
from je_auto_control.utils.scroll_find.scroll_find import Locator, Scroller


@dataclass
class ScrollPlan:
    """How to scroll while searching for the target (with injectable seams)."""

    kind: str = "image"
    direction: str = "down"
    max_scrolls: int = 10
    scroll_amount: int = 3
    locator: Optional[Locator] = None
    scroller: Optional[Scroller] = None


def act_in_view(target: str, action: Callable[[List[int]], Any], *,
                scroll: Optional[ScrollPlan] = None,
                region_sampler: Optional[Callable[[Any], Any]] = None,
                enabled_probe: Optional[Callable[[], Optional[bool]]] = None,
                hit_tester: Optional[Callable[[List[int]], bool]] = None,
                config: Optional[GateConfig] = None) -> Dict[str, Any]:
    """Scroll ``target`` into view, gate on actionability, then ``action`` it.

    Scrolls per ``scroll`` (a :class:`ScrollPlan`) until ``target`` is located,
    then runs :func:`actionability.act_when_ready` at the found point and calls
    ``action(center_point)``. Raises ``AutoControlActionException`` if the target
    never comes into view. The actionability probes / ``config`` are injectable
    and forwarded to the gate. Returns ``{acted, coords, scrolls, result}``.
    """
    plan = scroll if scroll is not None else ScrollPlan()
    found = scroll_until_visible(
        target, kind=plan.kind, direction=plan.direction,
        max_scrolls=plan.max_scrolls, scroll_amount=plan.scroll_amount,
        locator=plan.locator, scroller=plan.scroller)
    if not found["found"]:
        raise AutoControlActionException(
            f"target {target!r} not in view after {found['scrolls']} scrolls")
    cx, cy = int(found["coords"][0]), int(found["coords"][1])
    result = act_when_ready(action, lambda: (cx, cy, 1, 1),
                            region_sampler=region_sampler,
                            enabled_probe=enabled_probe, hit_tester=hit_tester,
                            config=config)
    return {"acted": True, "coords": [cx, cy], "scrolls": found["scrolls"],
            "result": result}
4 changes: 4 additions & 0 deletions je_auto_control/utils/act_modes/__init__.py
@@ -0,0 +1,4 @@
"""Trial and force action modes over the actionability gate."""
from je_auto_control.utils.act_modes.act_modes import ACT_MODES, act_with_mode

__all__ = ["act_with_mode", "ACT_MODES"]