docs: consolidated Evals docs (Performance redesign + workforce evals) #662
Draft
jordanc-relevanceai wants to merge 6 commits into
Adds evals page for workforces covering generate-and-score and score-only modes, evaluator types, key differences from agent evals, and when to use each mode. Updates docs.json navigation to include the new page. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Workforce evals BE shipped (relevance-api-node #12943) but FE is still in flight. Replacing the standalone workforce evals page with a small note on the agent evals page and a workforce prompt example on the Programmatic GTM intro — pointing users at the MCP/API path that actually works today. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both embeds were using kebab-case 'padding-top' (invalid in JSX style objects), 56.75% instead of 56.25%, and a single-line wrapper that didn't match the standard snippet. Swapped in the canonical wrapper from the style guide. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
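The two embed fixes in the commit above can be sketched as follows. This is a minimal illustration, not the canonical wrapper from the style guide: `aspectPadding` and `wrapperStyle` are hypothetical names, and the 16:9 aspect ratio is inferred from the 56.25% value (9 / 16 = 0.5625).

```typescript
// Hypothetical helper (not from the style guide): padding-top percentage
// that reserves space for an embed with the given aspect ratio.
function aspectPadding(width: number, height: number): string {
  return `${(height / width) * 100}%`;
}

// Style objects passed to JSX use camelCase keys, so the property is
// `paddingTop`; the kebab-case CSS name 'padding-top' is not valid there.
// The 16:9 ratio is an assumption inferred from the 56.25% figure.
const wrapperStyle = {
  position: "relative",
  paddingTop: aspectPadding(16, 9),
  height: 0,
};

console.log(wrapperStyle.paddingTop);
```

Deriving the percentage from the ratio, rather than typing it by hand, avoids the kind of transposition that produced 56.75%.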
… redesign Reflects the redesigned Performance tab (formerly Monitor):
- Renames Monitor section to Performance throughout
- Documents the new dashboard list page with preview cards
- Adds steps for creating and configuring dashboards via the settings drawer
- Describes individual dashboard views (charts, version markers, run history table)
- Adds instructions for updating dashboard settings post-creation
- Adds new "Use cases for multiple dashboards" section with CardGroup
- Replaces all "evaluators" references with "checks"
- Updates best practices and FAQs to cover Performance-specific questions
Linear: https://linear.app/relevance/issue/TSP-1256/ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
645 rewrote evals.mdx (Monitor->Performance, Evaluators->checks). 542's only substantive evals.mdx change was a workforce-evals Note; its other hunks were cosmetic iframe reformatting, dropped in favor of 645's rewrite. Kept 542's workforce Note (still accurate) and its MCP 'Evaluate a workforce' example. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This was referenced Jun 3, 2026
This PR consolidates 2 drafter PRs that both modify the agent Evals page. Opened as draft for review before the source PRs are closed.
Source PRs (being closed in favor of this one)
Why consolidated
Both edit build/agents/build-your-agent/evals.mdx. #645 is a near-total rewrite (renames the Monitor tab to Evaluate/Performance, "Evaluators" to "checks", documents multiple Performance dashboards). #542's additions to the same file would conflict badly with that rewrite and reintroduce the old "Monitor"/"Evaluators" terminology. Consolidating lets the rewrite land while keeping #542's genuinely new workforce-evals content.
Changes by source PR
#645 — TSP-1256
- evals.mdx: Monitor → Performance tab, dashboard list page, creating/configuring Performance dashboards, per-check charts, "Use cases for multiple dashboards", and replaces "evaluators" with "checks" throughout.
#542 — TSP-1095
- Adds a <Note> to the top of evals.mdx: evals also run against entire workforces, currently API-only via the Relevance AI MCP / eval API.
- Adds the MCP "Evaluate a workforce" example to get-started/core-concepts/programmatic-gtm.mdx.
Reconciliation note
- Took #645's evals.mdx in full, then grafted in #542's workforce-evals <Note> (it's still accurate: it doesn't reference the old tab names).
- Dropped #542's other evals.mdx hunks: they were cosmetic iframe-embed reformatting that conflicted with #645's rewrite and carried no content.
- Heads-up: #542's PR description mentioned a separate workforce-features/evals.mdx page, but that file was not in its branch's actual diff. Its only real contributions were the Note and the MCP example, both preserved here.