docs: consolidated Evals docs (Performance redesign + workforce evals) #662

Draft
jordanc-relevanceai wants to merge 6 commits into main from consolidate/evals

@jordanc-relevanceai
Collaborator

This PR consolidates two drafter PRs that both modify the agent Evals page. It is opened as a draft for review before the source PRs are closed.

Source PRs (being closed in favor of this one)

Why consolidated

Both edit build/agents/build-your-agent/evals.mdx. #645 is a near-total rewrite (renames the Monitor tab to Evaluate/Performance, "Evaluators" to "checks", documents multiple Performance dashboards). #542's additions to the same file would conflict badly with that rewrite and reintroduce the old "Monitor"/"Evaluators" terminology. Consolidating lets the rewrite land while keeping #542's genuinely new workforce-evals content.

Changes by source PR

#645 — TSP-1256

  • Rewrites evals.mdx: Monitor → Performance tab, dashboard list page, creating/configuring Performance dashboards, per-check charts, "Use cases for multiple dashboards", and replaces "evaluators" with "checks" throughout.

#542 — TSP-1095

  • Adds a workforce evals <Note> to the top of evals.mdx: evals also run against entire workforces, currently API-only via the Relevance AI MCP / eval API.
  • Adds an "Evaluate a workforce" example accordion to get-started/core-concepts/programmatic-gtm.mdx.

Reconciliation note

github-actions Bot and others added 6 commits May 6, 2026 16:28
Adds evals page for workforces covering generate-and-score and score-only
modes, evaluator types, key differences from agent evals, and when to use
each mode. Updates docs.json navigation to include the new page.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Workforce evals BE shipped (relevance-api-node #12943) but FE is still
in flight. Replacing the standalone workforce evals page with a small
note on the agent evals page and a workforce prompt example on the
Programmatic GTM intro — pointing users at the MCP/API path that
actually works today.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Both embeds were using kebab-case 'padding-top' (invalid in JSX style
objects), 56.75% instead of 56.25%, and a single-line wrapper that
didn't match the standard snippet. Swapped in the canonical wrapper
from the style guide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
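
For reference, the embed fix described in the commit above comes down to two details: JSX style objects take camelCase keys (`paddingTop`, not `'padding-top'`), and a 16:9 frame needs a padding-top of 9/16 = 56.25%. A minimal TypeScript sketch, assuming hypothetical helper names (`aspectPadding`, `embedWrapperStyle`) and a wrapper with only `position` and `paddingTop`. The canonical wrapper from the style guide is not shown in this PR and may carry more properties.

```typescript
// Illustrative only: these names are hypothetical, not taken from the docs repo.
type WrapperStyle = {
  position: "relative";
  paddingTop: string; // camelCase key, as JSX style objects expect
};

// Padding-top percentage that reserves vertical space for a width:height frame.
function aspectPadding(width: number, height: number): string {
  return `${(height / width) * 100}%`;
}

// Style object for a responsive embed wrapper; defaults to 16:9.
function embedWrapperStyle(width: number = 16, height: number = 9): WrapperStyle {
  return { position: "relative", paddingTop: aspectPadding(width, height) };
}

console.log(embedWrapperStyle().paddingTop); // 56.25%
```

Deriving the percentage from the ratio, rather than typing it by hand, is one way to avoid a transposition such as 56.75%.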
… redesign

Reflects the redesigned Performance tab (formerly Monitor):
- Renames Monitor section to Performance throughout
- Documents the new dashboard list page with preview cards
- Adds steps for creating and configuring dashboards via the settings drawer
- Describes individual dashboard views (charts, version markers, run history table)
- Adds instructions for updating dashboard settings post-creation
- Adds new "Use cases for multiple dashboards" section with CardGroup
- Replaces all "evaluators" references with "checks"
- Updates best practices and FAQs to cover Performance-specific questions

Linear: https://linear.app/relevance/issue/TSP-1256/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

645 rewrote evals.mdx (Monitor->Performance, Evaluators->checks). 542's only
substantive evals.mdx change was a workforce-evals Note; its other hunks were
cosmetic iframe reformatting, dropped in favor of 645's rewrite. Kept 542's
workforce Note (still accurate) and its MCP 'Evaluate a workforce' example.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mintlify
Contributor

mintlify Bot commented Jun 3, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| relevanceai | 🟢 Ready | View Preview | Jun 3, 2026, 9:51 AM |
