
Cli install sdk evals #85

Open
ViaDézo1er / cedric (viadezo1er) wants to merge 3 commits into main from cedric/cli-install-sdk-evals


@viadezo1er

bt setup instrument: interactive mode, language selection, and scoped permissions

Adds three new flags to bt setup instrument and wires them end-to-end through agent invocation and task generation.

--interactive / -i opens the agent in its interactive TUI (Claude Code, etc.) so the user can review and approve each tool use.
--yolo runs the agent in the background with bypassPermissions — no approval prompts.
--language <LANG> restricts instrumentation to specific languages (python, typescript, go, java, ruby, csharp); repeatable; omit to let the agent auto-detect.
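The way the two run-mode flags combine can be sketched as a small resolver. This is only an illustration, not the code in this PR: `RunMode`, `resolve_run_mode`, and the choice to reject `--interactive` together with `--yolo` are assumptions, and the description does not say what happens with no flags on a non-interactive terminal, so that case is left as `Unspecified`.

```rust
/// How the agent is launched. Variant names are hypothetical.
#[derive(Debug, PartialEq, Eq)]
enum RunMode {
    /// `--interactive` / `-i`: open the agent's interactive TUI.
    InteractiveTui,
    /// `--yolo`: run in the background with `bypassPermissions`.
    BackgroundBypass,
    /// No run-mode flag on an interactive terminal: ask the user.
    AskUser,
    /// No run-mode flag on a non-interactive terminal: not covered by the description.
    Unspecified,
}

/// Maps the two flags plus terminal state to a run mode.
/// Treating the flags as mutually exclusive is an assumption of this sketch.
fn resolve_run_mode(interactive: bool, yolo: bool, is_tty: bool) -> Result<RunMode, String> {
    match (interactive, yolo) {
        (true, true) => Err("--interactive and --yolo cannot be combined".to_string()),
        (true, false) => Ok(RunMode::InteractiveTui),
        (false, true) => Ok(RunMode::BackgroundBypass),
        (false, false) if is_tty => Ok(RunMode::AskUser),
        (false, false) => Ok(RunMode::Unspecified),
    }
}
```

For example, `resolve_run_mode(false, false, true)` yields `Ok(RunMode::AskUser)`, which corresponds to the run-mode prompt described next.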

Run-mode prompt (interactive terminal, no flags)

When none of the above flags are passed and the terminal is interactive, the user is asked how to run the agent. Background mode uses acceptEdits with --allowedTools scoped to the package managers for the selected language(s) only (e.g. uv for Python, npm/yarn/pnpm for TypeScript, dotnet for C#). Interactive TUI mode opens the agent's terminal UI.
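A minimal sketch of the per-language scoping, assuming a simple lookup table. Only the three mappings named above (uv, npm/yarn/pnpm, dotnet) come from the description; the function names are hypothetical, the other languages are left empty because the description does not list their tools, and the exact pattern syntax passed to `--allowedTools` is not shown here.

```rust
/// Package managers named in the PR description for each language.
/// Languages whose tools the description does not list return an empty slice.
fn package_managers(language: &str) -> &'static [&'static str] {
    match language {
        "python" => &["uv"],
        "typescript" => &["npm", "yarn", "pnpm"],
        "csharp" => &["dotnet"],
        _ => &[],
    }
}

/// Builds the background allowlist for the selected languages,
/// keeping first-seen order and dropping duplicates.
fn allowed_tools(languages: &[&str]) -> Vec<&'static str> {
    let mut tools: Vec<&'static str> = Vec::new();
    for lang in languages {
        for tool in package_managers(lang) {
            if !tools.contains(tool) {
                tools.push(*tool);
            }
        }
    }
    tools
}
```

With this shape, selecting only Python would allow `uv` and nothing else, so the background agent cannot reach package managers for languages the user did not pick.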

Language selection prompt

A multi-select prompt is shown between the workflow and run-mode prompts. Selecting "All languages" (the default) lets the agent auto-detect; selecting specific languages also narrows the background tool allowlist.


**How to obtain the permalink:**

Most language SDKs print a direct URL to the emitted trace after the app runs. Capture that URL and print it.

I don't think this is true outside of our sample applications. Especially with auto-instrumentation, the app will usually not print a URL to the trace.

In MCP, the coding agent can fetch recent objects and get a permalink that way. Does the bt CLI agent have a similar means to do this?


Would be super useful if we could figure this out. Getting a valid URL was very helpful feedback to the agent when it was installing via MCP. Not sure if this is possible in the bt CLI, though.

write_text_file(&dir.join("java.md"), JAVA_DOCS)?;
write_text_file(&dir.join("csharp.md"), CSHARP_DOCS)?;
write_text_file(&dir.join("braintrust-url-formats.md"), URL_FORMATS_DOCS)?;
write_text_file(&dir.join("_index.md"), INDEX)?;

Should instrument-task.md be in this list too?

Author

It's passed as the first prompt to the agent (whether the agent is in the background or in Claude Code/Codex/...), and tweaked a bit depending on the options chosen.
See src/setup/mod.rs lines 27 and 1475

Adds an optional, repeatable `--language` flag to `bt setup instrument`
that lets callers specify the target language(s) directly, bypassing the
agent's language auto-detection step.

Accepted values (case-insensitive):
  python, typescript, javascript, go, csharp, c#, java, ruby

`typescript` and `javascript` are treated as the same language;
duplicate values are deduplicated before being passed to the agent.
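The normalization rules above could look roughly like the following. This is a sketch under stated assumptions rather than the code in this commit: the `Language` enum, both function names, the error text, and the whitespace trimming are hypothetical.

```rust
/// Canonical languages after alias folding. The enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    Python,
    TypeScript,
    Go,
    CSharp,
    Java,
    Ruby,
}

/// Parses one `--language` value case-insensitively.
/// `javascript` folds into `TypeScript`, and `c#` folds into `CSharp`.
fn parse_language(raw: &str) -> Option<Language> {
    // Trimming surrounding whitespace is an assumption of this sketch.
    match raw.trim().to_ascii_lowercase().as_str() {
        "python" => Some(Language::Python),
        "typescript" | "javascript" => Some(Language::TypeScript),
        "go" => Some(Language::Go),
        "csharp" | "c#" => Some(Language::CSharp),
        "java" => Some(Language::Java),
        "ruby" => Some(Language::Ruby),
        _ => None,
    }
}

/// Parses repeated flag values, dropping duplicates while keeping first-seen order.
fn parse_languages(raw: &[&str]) -> Result<Vec<Language>, String> {
    let mut out: Vec<Language> = Vec::new();
    for value in raw {
        let lang = parse_language(value)
            .ok_or_else(|| format!("unsupported language: {}", value))?;
        if !out.contains(&lang) {
            out.push(lang);
        }
    }
    Ok(out)
}
```

Deduplicating after alias folding means `--language javascript --language TypeScript` reaches the agent as a single language.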

When one or more languages are provided the rendered task prompt
includes a "Language Override" section telling the agent to skip Step 2
(auto-detection) and instrument the specified language(s) directly.
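As an illustration of that conditional section, a renderer might emit nothing when no languages are given and a short override block otherwise. The function name and the section wording below are hypothetical; the actual text lives in the task template and is not shown in this conversation.

```rust
/// Renders the optional "Language Override" section of the task prompt.
/// Returns an empty string when no languages were specified, so the agent
/// keeps its normal auto-detection step. The wording is a placeholder.
fn render_language_override(languages: &[&str]) -> String {
    if languages.is_empty() {
        return String::new();
    }
    format!(
        "## Language Override\n\nSkip Step 2 (auto-detection) and instrument the following language(s) directly: {}.\n",
        languages.join(", ")
    )
}
```

Returning an empty string for the no-flag case keeps the rendered prompt unchanged when the flag is not used.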

Also fixes a pre-existing compile error in tests where
`render_instrument_task` was already called with a `workflows` argument
that the implementation didn't accept, and adds the `{WORKFLOW_CONTEXT}`
placeholder so non-instrument workflows inject `bt` CLI guidance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@viadezo1er
Author

ViaDézo1er / cedric (viadezo1er) commented Mar 27, 2026

CI passes after `git rebase origin/main cedric/cli-install-sdk-evals`

@github-actions

Latest downloadable build artifacts for this PR commit bcec4727269d:

Available artifact names
  • `artifacts-build-global`
  • `artifacts-build-local-x86_64-pc-windows-msvc`
  • `artifacts-build-local-x86_64-apple-darwin`
  • `artifacts-build-local-x86_64-unknown-linux-musl`
  • `artifacts-build-local-aarch64-apple-darwin`
  • `artifacts-build-local-x86_64-unknown-linux-gnu`
  • `artifacts-build-local-aarch64-unknown-linux-musl`
  • `artifacts-build-local-aarch64-unknown-linux-gnu`
  • `artifacts-plan-dist-manifest`
  • `cargo-dist-cache`
