Skip to content

[v4]: add /elementInfo route stub#1877

Open
seanmcguire12 wants to merge 3 commits intomainfrom
seanmcguire/stg-1652-add-pageelementinfo-route-stub
Open

[v4]: add /elementInfo route stub#1877
seanmcguire12 wants to merge 3 commits intomainfrom
seanmcguire/stg-1652-add-pageelementinfo-route-stub

Conversation

@seanmcguire12
Copy link
Member

@seanmcguire12 seanmcguire12 commented Mar 23, 2026

why

  • to add a single /elementInfo route which encapsulates multiple locator functions

what changed

  • added an /elementInfo route, which accepts a Selector, and returns the following info:
    • count
    • isVisible
    • isChecked
    • inputValue
    • textContent
    • innerHTML
    • innerText
    • centroid

screenshot:

Screenshot 2026-03-23 at 5 10 55 PM

test plan

  • added a test which verifies that the /elementInfo route accepts a request

Summary by cubic

Adds the v4 page.elementInfo endpoint (POST /v4/page/elementInfo) to return common element details in one call. The handler is a stub that returns default values to unblock client work.

  • New Features
    • Implemented page.elementInfo route that accepts a Selector and returns: count, isVisible, isChecked, inputValue, textContent, innerHTML, innerText, centroid, plus optional selector, elementType, backendNodeId, ariaRole, ariaAttributes, attrs.
    • Wired the route into v4 page routes, added request/action/result/response schemas to OpenAPI, and added an integration test posting an XPath selector to assert a successful action.

Written for commit 03b7e08. Summary will update on new commits. Review in cubic

@changeset-bot
Copy link

changeset-bot bot commented Mar 23, 2026

⚠️ No Changeset found

Latest commit: 03b7e08

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 5 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant Client as External API Client
    participant API as Fastify Server (v4)
    participant Schema as Zod / OpenAPI Schemas
    participant Handler as ElementInfo Route Handler
    participant Action as Page Action Handler Wrapper

    Note over Client,Action: NEW: Request Flow for /page/elementInfo

    Client->>API: POST /v4/page/elementInfo
    activate API
    Note right of Client: Headers: Session-ID<br/>Body: Selector (XPath/CSS/Text)

    API->>Schema: NEW: Validate Request Body
    Schema-->>API: PageElementInfoRequest valid

    API->>Handler: Dispatch to elementInfoRoute
    activate Handler

    Handler->>Action: NEW: createPageActionHandler()
    activate Action

    Action->>Handler: execute() (Internal)
    
    Note over Handler: STUB: Currently returns<br/>placeholder values
    
    Handler->>Schema: NEW: Parse Result (PageElementInfoResultSchema)
    Schema-->>Handler: Validated Info (count, visibility, centroid, etc.)

    Handler-->>Action: Return Result
    deactivate Action

    Action-->>API: Wrap in PageElementInfoResponse
    deactivate Handler

    API-->>Client: 200 OK (JSON Result)
    deactivate API

    Note over Client,API: Result includes: count, isVisible, isChecked, <br/>inputValue, textContent, innerHTML, innerText, centroid
Loading

@pirate
Copy link
Member

pirate commented Mar 24, 2026

I would also add:

  • selector: fully hydrated selector {xpath,css,x,y,text,elementType,backendNodeId}
  • ariaRole
  • ariaAttributes
  • attrs (element attributes like {name,id,data-*,href,style,...})

@seanmcguire12 seanmcguire12 force-pushed the seanmcguire/stg-1652-add-pageelementinfo-route-stub branch from eda8d77 to 03b7e08 Compare March 24, 2026 19:00
Comment on lines +869 to +870
isChecked: z.boolean().default(false),
inputValue: z.string().default(""),
Copy link
Member

@pirate pirate Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are redundant with attrs below, I dont think we need to send both.

If you do want to break it out then I think it should just be value and value=1 or value=checked for checkboxes that are checked

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • isChecked needs to look at the checked property on the HTMLInputElement, so it wouldn't show in attrs
  • inputValue also reads from the value property, so it also wouldn't show in attrs

export const PageElementInfoResultSchema = z
.object({
count: z.number().int().nonnegative().default(0),
isVisible: z.boolean().default(false),
Copy link
Member

@pirate pirate Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
isVisible: z.boolean().default(false),
isInViewport: z.boolean().default(false), // e.g. scrolled out of view
isHidden: z.boolean().default(false), // e.g. display: hidden | none
isOccluded: z.boolean().default(false), // e.g. elem is behind a modal, not clickable until modal is dismissed

isVisible is misleadingly simple, we should split this up so consumers can actually determine what is needed in order to click an element (closing some occluding element, scrolling, un-hiding, etc. all require different steps)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

down to have multiple visibility related fields. should we have them all nested in an outer visibility object though?

backendNodeId: z.number().int().optional(),
ariaRole: z.string().optional(),
ariaAttributes: z.record(z.string(), z.string()).optional(),
attrs: z.record(z.string(), z.string()).optional(),
Copy link
Member

@pirate pirate Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
attrs: z.record(z.string(), z.string()).optional(),
attributes: z.record(z.string(), z.string()),
style: z.record(z.string(), z.unknown()),

We should add computed CSS styles so that the agent can check for things like display: none, display: hidden, opacity: 0, color: green, etc.).

Also attributes should never be none, we should just return {} to avoid more unions. Try to match the naming and shapes that the DOM Element API uses to minimize confusion for agents and users: https://developer.mozilla.org/en-US/docs/Web/API/Element

style should be the same object shape as the DOM API: CSSStyleDeclaration

.default({ x: 0, y: 0 }),
selector: ResultSelectorSchema.optional(),
elementType: z.string().optional(),
backendNodeId: z.number().int().optional(),
Copy link
Member

@pirate pirate Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
backendNodeId: z.number().int().optional(),
backendNodeId: z.bigint().positive(),

I think this is always available, no? Also it should never be negative.

isVisible: z.boolean().default(false),
isChecked: z.boolean().default(false),
inputValue: z.string().default(""),
textContent: z.string().default(""),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
textContent: z.string().default(""),

This is redundant with selector.text and innerText I think

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +874 to +880
centroid: z
.object({
x: z.number(),
y: z.number(),
})
.strict()
.default({ x: 0, y: 0 }),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
centroid: z
.object({
x: z.number(),
y: z.number(),
})
.strict()
.default({ x: 0, y: 0 }),
domrects: z.array(z
.object({
x: z.number(),
y: z.number(),
top: z.number().optional(),
bottom: z.number().optional(),
right: z.number().optional(),
left: z.number().optional(),
})
.strict())

This should be an array of DOMRects matching the shape returned by elem.getClientRects(). I has to be an array because many inline elements like <span>, <p> etc. often wrap across multiple lines and have multiple bounding boxes. Centroid-based clicking is useless for most inline elements subject to wrapping unless you provide the full list of bounding rects.

Copy link
Member

@pirate pirate Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BU tried a bunch of really intricate heuristics to try and solve this problem because it was a big issue with links that wrap across lines. e.g. try 2px in from the upper left corner of the first bounding box, then try 4px in from the bottom right corner of the last bounding box, etc. based on various conditions

a naiive centroid click would land in the middle of the space between the two bounding boxex, which is actually outside both of them, so the click does nothing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, centroid is a bit too leaky. can make it an array with multiple rects. do you think we need top, right, bottom, & left though? or should we just make it x, y, width, height ?

.strict()
.default({ x: 0, y: 0 }),
selector: ResultSelectorSchema.optional(),
elementType: z.string().optional(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
elementType: z.string().optional(),
elementType: z.string(),


export const PageElementInfoResultSchema = z
.object({
count: z.number().int().nonnegative().default(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we dont return the whole list of matching elements here then we need to a separate endpoint to get the whole list, or at least a param like idx that we can pass to "get the second matching element".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this route should only return details for a single element:

  • if there are multiple matches, this route will just take the first match
  • eg, if the user wants info on the third match, they must provide a selector that is specific enough to reach the third match

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats really difficult to do if you dont know the nesting level at which they are peers, e.g.

/div/div/div to get the second match thee is no way to know which is needed:

  • /div/div/div[2]
  • /div/div[2]/div
  • /div[2]/div/div

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats why the ElementSelectorSchema provides an idx param

elementType: z.string().optional(),
backendNodeId: z.number().int().optional(),
ariaRole: z.string().optional(),
ariaAttributes: z.record(z.string(), z.string()).optional(),
Copy link
Member

@pirate pirate Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ariaAttributes: z.record(z.string(), z.string()).optional(),
aria: z.record(z.string(), z.unknown()),

Just return {} if empty, not undefined. Also all aria-related stuff should be under a single key imo, there are lots of attrs we might want to add in the future and it's easier if they're nested under one type instead of flat at the top level:

https://developer.mozilla.org/en-US/docs/Web/Accessibility/ARIA/Reference/Attributes

selector: ResultSelectorSchema.optional(),
elementType: z.string().optional(),
backendNodeId: z.number().int().optional(),
ariaRole: z.string().optional(),
Copy link
Member

@pirate pirate Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ariaRole: z.string().optional(),

I think this should be under a single aria: {...} shape, see below.

Copy link
Member

@pirate pirate left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of small changes but in general I recommend just copying the native shape that DOM APIs provide. Lets not introduce novel concepts and names for things that have had stable shapes for 20yr+

https://developer.mozilla.org/en-US/docs/Web/API/Element

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants