Replace Playwright with Kernel native API in OpenAI CUA templates by rgarcia · Pull Request #124 · kernel/cli

rgarcia · 2026-02-25T23:26:19Z

Summary

Replace Playwright (over CDP) with Kernel's native computer control API in both TypeScript and Python OpenAI CUA templates
Add batch_computer_actions function tool that executes multiple browser actions in a single API call, reducing latency
Add local test scripts (test.local.ts / test_local.py) that create remote Kernel browsers for testing without deploying a Kernel app

Details

New KernelComputer class (TS + Python) wraps the Kernel SDK for all computer actions:

captureScreenshot, clickMouse, typeText, pressKey, scroll, moveMouse, dragMouse
batch endpoint for batched actions
playwright.execute for navigation (goto, back, forward, getCurrentUrl)
CUA key name to X11 keysym translation map (ported from Go reference implementation)
Button normalization (CUA model sends numeric button values 1/2/3 in batch calls)
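
The key translation and button normalization listed above can be sketched roughly as follows. This is an illustration only: the CUA_TO_X11 entries and the translate_keys helper are hypothetical and cover a small subset (the template's real map is ported from the Go reference and may use different source key names), while normalize_button mirrors the _normalize_button logic quoted later in this review.

```python
from typing import List, Union

# Hypothetical subset of a CUA-key-name -> X11 keysym map. The keysym names on
# the right are standard X11 names; the CUA names on the left are assumptions.
CUA_TO_X11 = {
    "ENTER": "Return",
    "ESC": "Escape",
    "BACKSPACE": "BackSpace",
    "TAB": "Tab",
    "SPACE": "space",
    "CTRL": "Control_L",
    "ALT": "Alt_L",
    "SHIFT": "Shift_L",
    "ARROWLEFT": "Left",
    "ARROWRIGHT": "Right",
    "ARROWUP": "Up",
    "ARROWDOWN": "Down",
}


def translate_keys(keys: List[str]) -> List[str]:
    """Map CUA key names to X11 keysyms; unknown keys pass through unchanged."""
    return [CUA_TO_X11.get(k.upper(), k) for k in keys]


def normalize_button(button: Union[str, int, None]) -> str:
    """Normalize a CUA button value; numeric 1/2/3 map to left/middle/right."""
    if button is None or button == "":
        return "left"
    if isinstance(button, int):
        return {1: "left", 2: "middle", 3: "right"}.get(button, "left")
    return str(button)
```

For example, translate_keys(["ctrl", "a"]) yields ["Control_L", "a"], and normalize_button(2) yields "middle".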

Batch tool: System instructions guide the model to prefer batch_computer_actions for predictable sequences (e.g., click + type + enter).
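
A minimal sketch of how a predictable sequence such as click + type + enter might be translated into one batch payload. The click_mouse and press_key shapes follow the diff previews in this PR; the type_text shape and the helper names to_batch_action and build_batch are assumptions for illustration, not the template's confirmed API.

```python
from typing import Any, Dict, List


def to_batch_action(action: Dict[str, Any]) -> Dict[str, Any]:
    """Translate one CUA-style action dict into a batch action dict."""
    kind = action.get("type", "")
    if kind == "click":
        return {
            "type": "click_mouse",
            "click_mouse": {
                "x": action.get("x", 0),
                "y": action.get("y", 0),
                "button": action.get("button") or "left",
            },
        }
    if kind == "type":
        # Assumed payload shape; check the Kernel SDK for the real field names.
        return {"type": "type_text", "type_text": {"text": action.get("text", "")}}
    if kind == "keypress":
        return {"type": "press_key", "press_key": {"keys": list(action.get("keys", []))}}
    raise ValueError(f"unsupported action type: {kind!r}")


def build_batch(actions: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build the list of actions that would be sent in a single batch call."""
    return [to_batch_action(a) for a in actions]


batch = build_batch([
    {"type": "click", "x": 10, "y": 20},
    {"type": "type", "text": "laptop"},
    {"type": "keypress", "keys": ["Return"]},
])
```

Sending the three actions as one list is what saves the round trips: the model issues one tool call and the backend executes the sequence without a screenshot between steps.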

Removed dependencies: playwright-core, sharp (TS), playwright (Python). Bumped @onkernel/sdk to ^0.38.0 and kernel to >=0.38.0.

Test plan

TypeScript test.local.ts E2E: created remote Kernel browser, ran CUA agent (eBay search task), batch tool used successfully, browser cleaned up
Python test_local.py E2E: same test, batch tool used on first action (type + enter), agent completed successfully
TypeScript compiles cleanly (tsc --noEmit)

Made with Cursor

Note

Medium Risk
Moderate risk because it replaces the core browser-control implementation and tool-calling flow (including new batched actions) across both Python and TypeScript templates, plus dependency upgrades. Impact is limited to sample templates, but regressions could break local/deployed runs and logging/output formatting.

Overview
Migrates both the Python and TypeScript OpenAI CUA templates from Playwright-over-CDP to Kernel’s native computer control API, introducing new KernelComputer wrappers that implement screenshot/mouse/keyboard/scroll/drag via Kernel endpoints.

Adds a batch_computer_actions function tool and model instructions to encourage batching predictable action sequences, plus new event-based logging (text and jsonl) that emits prompt/reasoning/text deltas, action descriptions, screenshots, and backend SDK timing.
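
The text and jsonl logging modes might look roughly like the sketch below. The emit_event name, the record fields (event, data, elapsed_ms), and the text layout are all assumptions for illustration; the templates' actual event schema is not shown in this PR description.

```python
import io
import json
import sys
from typing import Any, Dict, Optional, TextIO


def emit_event(
    event: str,
    data: Dict[str, Any],
    mode: str = "text",
    stream: Optional[TextIO] = None,
    elapsed_ms: Optional[int] = None,
) -> None:
    """Write one log event as a JSON line (mode="jsonl") or a readable text line."""
    out = stream if stream is not None else sys.stdout
    if mode == "jsonl":
        record: Dict[str, Any] = {"event": event, "data": data}
        if elapsed_ms is not None:
            record["elapsed_ms"] = elapsed_ms
        # One JSON object per line keeps the output easy to parse downstream.
        out.write(json.dumps(record, sort_keys=True) + "\n")
    else:
        suffix = f" ({elapsed_ms}ms)" if elapsed_ms is not None else ""
        out.write(f"[{event}] {data.get('description', '')}{suffix}\n")


buf = io.StringIO()
emit_event("action", {"description": "click(1, 2)"}, mode="jsonl", stream=buf, elapsed_ms=12)
```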

Updates docs and env examples to require KERNEL_API_KEY, adds local runners (run_local.py, run_local.ts), makes app entrypoints runnable locally, and removes Playwright/sharp/pillow-related code while bumping Kernel SDK dependencies to >=0.38.0 (Python) / ^0.38.0 (TypeScript).

Written by Cursor Bugbot for commit dcb16c7. This will update automatically on new commits.

Both TypeScript and Python OpenAI CUA templates now use Kernel's native computer control API (screenshot, click, type, scroll, batch, etc.) instead of Playwright over CDP. This enables the batch_computer_actions tool, which executes multiple actions in a single API call for lower latency.

Key changes:
- New KernelComputer class wrapping Kernel SDK for all computer actions
- Added batch_computer_actions function tool with system instructions
- Navigation (goto/back/forward) via Kernel's playwright.execute endpoint
- Local test scripts create remote Kernel browsers without app deployment
- Removed playwright-core, sharp (TS) and playwright (Python) dependencies
- Bumped @onkernel/sdk to ^0.38.0 and kernel to >=0.38.0

Made-with: Cursor

cursor · 2026-02-25T23:40:32Z

pkg/templates/typescript/openai-computer-use/lib/agent.ts

      }
+
+      const currentUrl = await this.computer.getCurrentUrl();
+      utils.checkBlocklistedUrl(currentUrl);


TypeScript URL blocklist check return value silently ignored

Medium Severity

checkBlocklistedUrl returns a boolean, but agent.ts discards the return value, making the URL blocklist entirely non-functional. The Python counterpart correctly raises a ValueError to halt execution. Previously, Playwright's route-level route.abort() handler provided actual network-level blocking, but that was removed in this PR, leaving no working URL blocking in the TypeScript template.
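
For comparison, a raising blocklist check in the style attributed to the Python template could be sketched as below. The BLOCKED_DOMAINS entry and the function name check_blocklisted_url are placeholders; only the host-matching rule (exact match or subdomain suffix) and the ValueError behavior come from this review.

```python
from urllib.parse import urlparse

# Hypothetical blocklist entry, used only for illustration.
BLOCKED_DOMAINS = ["example-blocked.test"]


def check_blocklisted_url(url: str) -> None:
    """Raise ValueError if the URL's host is a blocked domain or a subdomain of one."""
    host = urlparse(url).hostname or ""
    if any(host == d or host.endswith(f".{d}") for d in BLOCKED_DOMAINS):
        raise ValueError(f"Blocked URL: {url}")
```

Because the function raises instead of returning a boolean, a caller that forgets to inspect the result still halts on a blocked URL, which is the failure mode this comment describes for the TypeScript version.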

Additional Locations (1)

pkg/templates/typescript/openai-computer-use/lib/utils.ts#L42-L49

cursor · 2026-02-25T23:40:32Z

pkg/templates/python/openai-computer-use/computers/config.py

+from .kernel_computer import KernelComputer

 computers_config = {
-    "local-playwright": LocalPlaywrightBrowser,


Updated config.py is now dead code

Low Severity

config.py was updated in this PR to reference KernelComputer, but computers/__init__.py no longer imports or exports computers_config. No other file references it either, making this entire file dead code.

cursor · 2026-02-25T23:40:32Z

pkg/templates/python/openai-computer-use/computers/kernel_computer.py

+        return "left"
+    if isinstance(button, int):
+        return {1: "left", 2: "middle", 3: "right"}.get(button, "left")
+    return str(button)


Missing handling for special click button values

Medium Severity

The CUA model can send click actions with button set to "back", "forward", or "wheel". The deleted Playwright code explicitly handled these by routing to self.back(), self.forward(), or mouse.wheel(). The new _normalize_button/normalizeButton functions pass these strings through unchanged to the Kernel click_mouse API, which only accepts "left", "right", or "middle" — causing an API error when the model uses these button types.

Additional Locations (1)

pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts#L92-L104

cursor · 2026-02-25T23:44:28Z

Bugbot Autofix prepared fixes for 3 of the 3 bugs found in the latest run.

✅ Fixed: TypeScript URL blocklist check return value silently ignored
- Changed checkBlocklistedUrl from returning a boolean to throwing an Error when a blocked URL is detected, matching the Python counterpart's ValueError behavior.
✅ Fixed: Updated config.py is now dead code
- Deleted the dead config.py file since computers_config is not imported or used anywhere in the codebase.
✅ Fixed: Missing handling for special click button values
- Added handling for 'back', 'forward', and 'wheel' button values in both Python and TypeScript KernelComputer.click() methods (routing to back/forward/scroll) and in batch translation functions (using Alt+Left/Right keypresses and scroll actions).

Or push these changes by commenting:

@cursor push a9e2870223

Preview (a9e2870223)

diff --git a/pkg/templates/python/openai-computer-use/computers/config.py b/pkg/templates/python/openai-computer-use/computers/config.py
deleted file mode 100644
--- a/pkg/templates/python/openai-computer-use/computers/config.py
+++ /dev/null
@@ -1,5 +1,0 @@
-from .kernel_computer import KernelComputer
-
-computers_config = {
-    "kernel": KernelComputer,
-}
\ No newline at end of file

diff --git a/pkg/templates/python/openai-computer-use/computers/kernel_computer.py b/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
--- a/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
+++ b/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
@@ -74,12 +74,22 @@
 def _translate_cua_action(action: Dict[str, Any]) -> Dict[str, Any]:
     action_type = action.get("type", "")
     if action_type == "click":
+        button = action.get("button")
+        if button == "back":
+            return {"type": "press_key", "press_key": {"keys": ["Alt_L", "Left"]}}
+        if button == "forward":
+            return {"type": "press_key", "press_key": {"keys": ["Alt_L", "Right"]}}
+        if button == "wheel":
+            return {
+                "type": "scroll",
+                "scroll": {"x": action.get("x", 0), "y": action.get("y", 0), "delta_x": 0, "delta_y": 0},
+            }
         return {
             "type": "click_mouse",
             "click_mouse": {
                 "x": action.get("x", 0),
                 "y": action.get("y", 0),
-                "button": _normalize_button(action.get("button")),
+                "button": _normalize_button(button),
             },
         }
     elif action_type == "double_click":
@@ -134,6 +144,15 @@
         return base64.b64encode(resp.read()).decode("utf-8")
 
     def click(self, x: int, y: int, button="left") -> None:
+        if button == "back":
+            self.back()
+            return
+        if button == "forward":
+            self.forward()
+            return
+        if button == "wheel":
+            self.scroll(x, y, 0, 0)
+            return
         self.client.browsers.computer.click_mouse(self.session_id, x=x, y=y, button=_normalize_button(button))
 
     def double_click(self, x: int, y: int) -> None:

diff --git a/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts b/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts
--- a/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts
+++ b/pkg/templates/typescript/openai-computer-use/lib/kernel-computer.ts
@@ -105,11 +105,18 @@
 
 function translateCuaAction(action: CuaAction): BatchAction {
   switch (action.type) {
-    case 'click':
+    case 'click': {
+      if (action.button === 'back')
+        return { type: 'press_key', press_key: { keys: ['Alt_L', 'Left'] } };
+      if (action.button === 'forward')
+        return { type: 'press_key', press_key: { keys: ['Alt_L', 'Right'] } };
+      if (action.button === 'wheel')
+        return { type: 'scroll', scroll: { x: action.x ?? 0, y: action.y ?? 0, delta_x: 0, delta_y: 0 } };
       return {
         type: 'click_mouse',
         click_mouse: { x: action.x ?? 0, y: action.y ?? 0, button: normalizeButton(action.button) },
       };
+    }
     case 'double_click':
       return {
         type: 'click_mouse',
@@ -168,6 +175,9 @@
   }
 
   async click(x: number, y: number, button: string | number = 'left'): Promise<void> {
+    if (button === 'back') { await this.back(); return; }
+    if (button === 'forward') { await this.forward(); return; }
+    if (button === 'wheel') { await this.scroll(x, y, 0, 0); return; }
     await this.client.browsers.computer.clickMouse(this.sessionId, {
       x,
       y,

diff --git a/pkg/templates/typescript/openai-computer-use/lib/utils.ts b/pkg/templates/typescript/openai-computer-use/lib/utils.ts
--- a/pkg/templates/typescript/openai-computer-use/lib/utils.ts
+++ b/pkg/templates/typescript/openai-computer-use/lib/utils.ts
@@ -40,12 +40,14 @@
   }
 }
 
-export function checkBlocklistedUrl(url: string): boolean {
+export function checkBlocklistedUrl(url: string): void {
   try {
     const host = new URL(url).hostname;
-    return BLOCKED_DOMAINS.some((d) => host === d || host.endsWith(`.${d}`));
-  } catch {
-    return false;
+    if (BLOCKED_DOMAINS.some((d) => host === d || host.endsWith(`.${d}`))) {
+      throw new Error(`Blocked URL: ${url}`);
+    }
+  } catch (e) {
+    if (e instanceof Error && e.message.startsWith('Blocked URL:')) throw e;
   }
 }

This adds CUA-style backend/action event rendering (with JSONL mode support), aligns dotenv/local-run behavior across TypeScript and Python templates, and renames local entry scripts to run_local for clearer usage. Made-with: Cursor

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

✅ Fixed: TS agent omits current_url from computer_call_output
- The TypeScript agent now conditionally fetches the browser URL, runs the blocklist check, and adds current_url to computer_call_output.output for browser environments.
✅ Fixed: Duplicated _describe_action functions across Python files
- The Python agent now imports and uses _describe_action and _describe_batch_actions from kernel_computer.py instead of duplicating that logic locally.

Or push these changes by commenting:

@cursor push 90332375b9

Preview (90332375b9)

diff --git a/pkg/templates/python/openai-computer-use/agent/agent.py b/pkg/templates/python/openai-computer-use/agent/agent.py
--- a/pkg/templates/python/openai-computer-use/agent/agent.py
+++ b/pkg/templates/python/openai-computer-use/agent/agent.py
@@ -1,7 +1,11 @@
 import json
 import time
 from typing import Any, Callable
-from computers.kernel_computer import KernelComputer
+from computers.kernel_computer import (
+    KernelComputer,
+    _describe_action,
+    _describe_batch_actions,
+)
 from utils import (
     create_response,
     show_image,
@@ -168,47 +172,6 @@
                 parts.append(text)
         return " ".join(parts) if parts else None
 
-    def _describe_action(self, action_type: str, action_args: dict[str, Any]) -> str:
-        if action_type == "click":
-            x = int(action_args.get("x", 0))
-            y = int(action_args.get("y", 0))
-            button = action_args.get("button", "left")
-            if button in ("", "left"):
-                return f"click({x}, {y})"
-            return f"click({x}, {y}, {button})"
-        if action_type == "double_click":
-            return f"double_click({int(action_args.get('x', 0))}, {int(action_args.get('y', 0))})"
-        if action_type == "type":
-            text = str(action_args.get("text", ""))
-            if len(text) > 60:
-                text = f"{text[:57]}..."
-            return f"type({text!r})"
-        if action_type == "keypress":
-            keys = action_args.get("keys", [])
-            return f"keypress({keys})"
-        if action_type == "scroll":
-            return (
-                f"scroll({int(action_args.get('x', 0))}, {int(action_args.get('y', 0))}, "
-                f"dx={int(action_args.get('scroll_x', 0))}, dy={int(action_args.get('scroll_y', 0))})"
-            )
-        if action_type == "move":
-            return f"move({int(action_args.get('x', 0))}, {int(action_args.get('y', 0))})"
-        if action_type == "drag":
-            return "drag(...)"
-        if action_type == "wait":
-            return f"wait({int(action_args.get('ms', 1000))}ms)"
-        if action_type == "screenshot":
-            return "screenshot()"
-        return action_type
-
-    def _describe_batch_actions(self, actions: list[dict[str, Any]]) -> str:
-        pieces: list[str] = []
-        for action in actions:
-            action_type = str(action.get("type", "unknown"))
-            action_args = {k: v for k, v in action.items() if k != "type"}
-            pieces.append(self._describe_action(action_type, action_args))
-        return "batch[" + " -> ".join(pieces) + "]"
-
     def _execute_computer_action(self, action_type, action_args):
         if action_type == "click":
             self.computer.click(**action_args)
@@ -256,7 +219,7 @@
                     typed_actions = [a for a in actions if isinstance(a, dict)]
                     payload = {
                         "action_type": "batch",
-                        "description": self._describe_batch_actions(typed_actions),
+                        "description": _describe_batch_actions(typed_actions),
                         "action": {"type": "batch", "actions": typed_actions},
                     }
                     if elapsed_ms is not None:
@@ -299,7 +262,7 @@
             elapsed_ms = self._current_model_elapsed_ms()
             payload = {
                 "action_type": action_type,
-                "description": self._describe_action(action_type, action_args),
+                "description": _describe_action(action_type, action_args),
                 "action": action,
             }
             if elapsed_ms is not None:

diff --git a/pkg/templates/python/openai-computer-use/computers/kernel_computer.py b/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
--- a/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
+++ b/pkg/templates/python/openai-computer-use/computers/kernel_computer.py
@@ -148,6 +148,8 @@
         return "drag(...)"
     if action_type == "wait":
         return f"wait({int(action_args.get('ms', 1000))}ms)"
+    if action_type == "screenshot":
+        return "screenshot()"
     return action_type
 
 

diff --git a/pkg/templates/typescript/openai-computer-use/lib/agent.ts b/pkg/templates/typescript/openai-computer-use/lib/agent.ts
--- a/pkg/templates/typescript/openai-computer-use/lib/agent.ts
+++ b/pkg/templates/typescript/openai-computer-use/lib/agent.ts
@@ -189,9 +189,6 @@
         if (!this.ackCb(msg)) throw new Error(`Safety check failed: ${msg}`);
       }
 
-      const currentUrl = await this.computer.getCurrentUrl();
-      utils.checkBlocklistedUrl(currentUrl);
-
       const out: Omit<ResponseComputerToolCallOutputItem, 'id'> = {
         type: 'computer_call_output',
         call_id: cc.call_id,
@@ -201,6 +198,11 @@
           image_url: `data:image/png;base64,${screenshot}`,
         },
       };
+      if (this.computer.getEnvironment() === 'browser') {
+        const currentUrl = await this.computer.getCurrentUrl();
+        utils.checkBlocklistedUrl(currentUrl);
+        (out.output as { current_url?: string }).current_url = currentUrl;
+      }
       return [out as ResponseItem];
     }

cursor · 2026-03-03T17:05:18Z

pkg/templates/typescript/openai-computer-use/lib/agent.ts

+          image_url: `data:image/png;base64,${screenshot}`,
+        },
+      };
+      return [out as ResponseItem];


TS agent omits current_url from computer_call_output

Medium Severity

The computer_call_output for browser environments is missing the current_url field. The Python agent correctly includes it via call_output["output"]["current_url"] = current_url, which is part of the OpenAI CUA protocol for browser environments. Without this field, the model may lose track of the browser's current page across turns, potentially degrading navigation accuracy.
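
A rough sketch of the output item with current_url attached, following the Python pattern quoted above. The helper name build_call_output is hypothetical, and the "computer_screenshot" output type is an assumption about the OpenAI schema rather than something shown in this PR.

```python
from typing import Any, Dict, Optional


def build_call_output(
    call_id: str,
    screenshot_b64: str,
    current_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a computer_call_output item, adding current_url for browser environments."""
    call_output: Dict[str, Any] = {
        "type": "computer_call_output",
        "call_id": call_id,
        "output": {
            "type": "computer_screenshot",  # assumed output type
            "image_url": f"data:image/png;base64,{screenshot_b64}",
        },
    }
    if current_url is not None:
        # Only browser environments have a URL to report.
        call_output["output"]["current_url"] = current_url
    return call_output
```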

cursor · 2026-03-03T17:05:18Z

pkg/templates/python/openai-computer-use/agent/agent.py

+            action_type = str(action.get("type", "unknown"))
+            action_args = {k: v for k, v in action.items() if k != "type"}
+            pieces.append(self._describe_action(action_type, action_args))
+        return "batch[" + " -> ".join(pieces) + "]"


Duplicated _describe_action functions across Python files

Low Severity

_describe_action and _describe_batch_actions are fully duplicated — once as module-level functions in kernel_computer.py and again as instance methods in agent.py. The TypeScript version correctly defines these once in log-events.ts and imports them in both agent.ts and kernel-computer.ts. The Python agent could import the existing functions from kernel_computer.py instead of re-implementing them.

Additional Locations (1)

pkg/templates/python/openai-computer-use/computers/kernel_computer.py#L124-L160

rgarcia wants to merge 2 commits into main from rgarcia/cua-native-kernel-api
