Add CoW Protocol order-batch debug skill #4384

Draft
fleupold wants to merge 1 commit into main from skill/batch_debug

fleupold commented May 5, 2026

Description

Documents how to debug why a batch of orders failed to execute or executed slowly.

Changes

Skill file added

How to test

  1. Compile a list of slow or failed order UIDs as a CSV file.
  2. Ask Claude to analyze the failure reasons.

Note that for large files this skill may use significant credits.

Documents how to debug why a batch of orders failed to execute or executed
slowly. Companion to the single-order and quote-verification skills; aimed
at the case where you receive a CSV of order UIDs and want a per-order
classification plus per-quoter aggregates.

Per order, the skill produces:
  order_id, expired, expired_detail, quoter, quoter_name,
  did_bid, bid_layer, discard_reason
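
A minimal sketch of that per-order row and its CSV serialization. The column names come from the list above; the value types and the defaults (`"unknown"`, empty strings) are assumptions for illustration, not taken from the skill file.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class OrderRow:
    # Column names follow the list above; types and defaults are assumed.
    order_id: str
    expired: str = "unknown"
    expired_detail: str = ""
    quoter: str = ""
    quoter_name: str = ""
    did_bid: str = "unknown"
    bid_layer: str = ""
    discard_reason: str = ""


def rows_to_csv(rows):
    """Serialize rows to CSV text with the columns in declaration order."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=[f.name for f in fields(OrderRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buf.getvalue()
```

Keeping the column order in one dataclass means the header and the rows cannot drift apart when a column is added.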

The seven-step procedure:

1. Bulk-fetch order details from the orderbook API (status, quote.solver,
   validTo). Old orders that 404 are recorded as `unknown` rather than
   silently dropped.
2. Per-order lifecycle from `debug.cow.fi/api/orders/{uid}/events` —
   the last `OrderEventLabel` deterministically classifies an order as
   expired-at-validTo vs removed-early (invalid / filtered / cancelled /
   never-qualified).
3. Solver address ↔ name (and URL) mapping from autopilot's `Creating
   solver` log.
4. Autopilot `proposed solution` per quoter (OR-batched ≤30 UIDs per
   query — backtick-escape `parsed.spans./solve.solver`).
5. Driver-side `discarded solution: settlement encoding` for in-cluster
   solvers, with `parsed.fields.err` bucketed into solver-account-out-of-gas,
   simulation revert, simulation OOG, signature/permit failure.
6. Combine into a CSV; merge `proposed`/`discarded` sets so the same order
   can show `bid_layer = both` when multiple solutions for the same order
   land on different sides.
7. Per-quoter summary table + dominant-root-cause paragraph.
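
Steps 4 and 6 can be sketched roughly as follows. The chunk size of 30 and the `both` label come from the steps above; the helper names, the double-quoting of UIDs inside the OR clause, and the `proposed` / `discarded` labels for the single-side cases are assumptions, and the actual log-query syntax is deliberately not reproduced here.

```python
def chunk_uids(uids, size=30):
    """Yield consecutive slices of at most `size` UIDs (step 4 batching)."""
    for start in range(0, len(uids), size):
        yield uids[start:start + size]


def or_clause(chunk):
    """Join one chunk into a single OR expression (quoting style assumed)."""
    return "(" + " OR ".join(f'"{uid}"' for uid in chunk) + ")"


def bid_layer(uid, proposed, discarded):
    """Merge the autopilot `proposed` and driver `discarded` sets (step 6)."""
    in_proposed = uid in proposed
    in_discarded = uid in discarded
    if in_proposed and in_discarded:
        return "both"
    if in_proposed:
        return "proposed"   # assumed label
    if in_discarded:
        return "discarded"  # assumed label
    return ""
```

Each chunk would then be embedded in one query per quoter, so a CSV of N orders costs roughly N/30 queries per log source.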

Co-location is detected purely from logs (no infra-repo access required):
the autopilot's `Creating solver` log carries each solver's URL, and a
host suffix of `.svc.cluster.local` indicates an in-cluster (not co-located)
solver whose driver logs are queryable. A driver-pod log-presence stats query
is the fallback / cross-check: zero hits ⇒ assume co-located, regardless of URL.
Co-located solvers are opaque to us: `did_bid` becomes `unknown`, never
`no`, when only autopilot-side data is available.
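
One possible reading of these rules as code. The `.svc.cluster.local` suffix check, the zero-hits override, and the `unknown`-never-`no` rule are from the paragraph above; the function names, the `yes` / `no` labels, and the handling of URLs without a scheme (treated as co-located) are assumptions.

```python
from urllib.parse import urlparse

IN_CLUSTER_SUFFIX = ".svc.cluster.local"


def is_co_located(solver_url, driver_log_hits=None):
    """Classify a solver from its `Creating solver` URL.

    `driver_log_hits` is the optional result of the log-presence
    cross-check; zero hits wins over whatever the URL suggests.
    """
    if driver_log_hits == 0:
        return True
    host = urlparse(solver_url).hostname or ""
    return not host.endswith(IN_CLUSTER_SUFFIX)


def did_bid(co_located, proposed, discarded):
    """Derive `did_bid` for one order and its quoter."""
    if proposed or discarded:
        return "yes"
    # No evidence of a bid: only in-cluster solvers let us conclude "no".
    return "unknown" if co_located else "no"
```

For example, `is_co_located("http://baseline.solvers.svc.cluster.local/solve")` is false under this sketch (the hostname is made up), while the same URL with `driver_log_hits=0` is treated as co-located.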

Caveats called out: log retention windows, OR-chunk sizing and the
backticks-vs-quotes pitfall on slash-containing field paths, the
`parsed.fields.orders` debug-string format that needs regex extraction,
and the fact that solvers can be promoted/demoted between deploys (pull
`Creating solver` for a window overlapping the orders' time range, not
"now"). A pre-canned query reference at the end covers the common
follow-ups (any-bidder-on-order, risk-detector exclusion).
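
The last caveat (query `Creating solver` over the orders' time range rather than "now") could be sketched like this. The function name, the datetime inputs, and the one-hour padding are assumptions; the skill file may pick the window differently.

```python
from datetime import datetime, timedelta, timezone


def mapping_window(created_at, valid_to, pad=timedelta(hours=1)):
    """Return (start, end) covering every order's lifetime, plus padding.

    `created_at` and `valid_to` are sequences of timezone-aware datetimes,
    one entry per order. The padding value is an arbitrary assumption.
    """
    start = min(created_at) - pad
    end = max(valid_to) + pad
    return start, end
```

The resulting window is what the solver address ↔ name lookup would be restricted to, so that a solver promoted or demoted after the orders expired does not distort the mapping.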

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>