
Add local ART-E email agent example #675

Closed
poofeth wants to merge 3 commits into OpenPipe:main from poofeth:example/art-e-email-agent

Conversation


@poofeth poofeth commented May 11, 2026

Summary

  • add a local examples/art-e Python package for an ART-E style email research agent
  • include deterministic inbox fixtures, search/read tools, LangGraph rollout, ART training entrypoint, and scoring helpers
  • add provider-free rollout/train smoke tests that monkeypatch LangGraph model creation and verify ART-E env handling
  • make Weave tracing optional for this example so local smoke tests can run even if Weave's transitive gql import is incompatible in a fresh environment
  • document offline checks/training and link the local example from the root README
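The optional-tracing item above can be sketched as a guarded import. This is a minimal illustration, not the code in this PR: `load_tracer` and `_identity` are hypothetical names, and catching the broad `Exception` is an assumption made so that a failing transitive import (not only a missing package) falls back to a no-op decorator.

```python
import importlib
from typing import Any, Callable

Decorator = Callable[[Callable[..., Any]], Callable[..., Any]]


def _identity(fn: Callable[..., Any]) -> Callable[..., Any]:
    """No-op decorator used when tracing is unavailable."""
    return fn


def load_tracer(module_name: str = "weave") -> Decorator:
    """Return `<module>.op` if the module imports cleanly, else a no-op decorator.

    Hypothetical helper: the PR does not show how its optional import is written.
    """
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Covers a missing package as well as errors raised by a transitive import.
        return _identity
    op = getattr(module, "op", None)
    return op if callable(op) else _identity
```

A rollout function would then be decorated with the result of `load_tracer()`, so the same code path runs with or without tracing installed.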

Scope

This is a local deterministic ART-E-style example. The tests mock model calls and require no Gmail account, paid API, or private credentials. Real training uses art_e.train with a configured inference endpoint.
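To make "local deterministic" concrete, here is a small sketch of a fixture inbox with search and read tools. It is an assumption-laden illustration rather than the package's actual code: `Email`, `INBOX`, `search_inbox`, and `read_email` are hypothetical names, and the email bodies are invented placeholder data (only the subject "Team offsite logistics" appears in this PR's output).

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Email:
    message_id: str
    subject: str
    body: str


# Invented fixture data for illustration; no real mailbox is involved.
INBOX = (
    Email("m1", "Team offsite logistics", "Placeholder body about the offsite venue and date."),
    Email("m2", "Quarterly budget review", "Placeholder body about budget numbers."),
)


def search_inbox(keywords: Sequence[str], inbox: Sequence[Email] = INBOX) -> List[str]:
    """Return ids of emails whose subject or body contains every keyword."""
    lowered = [k.lower() for k in keywords]
    hits = []
    for email in inbox:
        text = f"{email.subject} {email.body}".lower()
        if all(k in text for k in lowered):
            hits.append(email.message_id)
    return hits


def read_email(message_id: str, inbox: Sequence[Email] = INBOX) -> Optional[Email]:
    """Return the email with the given id, or None if it is not in the inbox."""
    for email in inbox:
        if email.message_id == message_id:
            return email
    return None
```

Because the inbox is a fixed tuple, the same query always returns the same ids, which is what lets tests run without a Gmail account or credentials.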

Bounty

Validation

  • uv run --project examples/art-e pytest examples/art-e/tests -q (6 passed)
  • (cd examples/art-e && uv run python main.py)
  • uv run --project examples/art-e ruff check .
  • uv run --project examples/art-e ruff format --check .
  • python -m compileall examples/art-e/art_e examples/art-e/main.py examples/art-e/tests
  • git diff --check

Author

poofeth commented May 11, 2026

Added provider-free smoke coverage for the ART-E entrypoints: the new test monkeypatches LangGraph chat/model creation, drives the final-answer tool, verifies reward/scoring, and checks build_model reads the ART-E env vars. Validation rerun: uv run --project examples/art-e pytest examples/art-e/tests -q (6 passed), (cd examples/art-e && uv run python main.py), ruff check/format-check, compileall, and git diff --check.
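The shape of such a provider-free check might look like the sketch below. Everything here is hypothetical and standard-library only: the environment variable names `ART_E_MODEL_NAME` and `ART_E_BASE_URL`, the `build_model_config` helper, the `FakeChatModel` stub, and the exact-match reward are assumptions for illustration. The PR's real test patches LangGraph model creation and uses its own `build_model` and scoring.

```python
import os
from dataclasses import dataclass
from unittest import mock


@dataclass
class ModelConfig:
    name: str
    base_url: str


def build_model_config() -> ModelConfig:
    """Read model settings from the environment (variable names are hypothetical)."""
    return ModelConfig(
        name=os.environ.get("ART_E_MODEL_NAME", "local-stub"),
        base_url=os.environ.get("ART_E_BASE_URL", "http://localhost:8000/v1"),
    )


class FakeChatModel:
    """Stands in for a provider-backed model and always returns a canned final answer."""

    def __init__(self, answer: str) -> None:
        self.answer = answer

    def invoke(self, messages: list) -> dict:
        return {"tool": "final_answer", "answer": self.answer}


def exact_match_reward(predicted: str, expected: str) -> float:
    """Toy scoring rule: 1.0 on a case-insensitive match, else 0.0."""
    return 1.0 if predicted.strip().lower() == expected.strip().lower() else 0.0


def config_under_test_env() -> ModelConfig:
    """Build the config with temporary env vars, restored when the block exits."""
    overrides = {"ART_E_MODEL_NAME": "test-model", "ART_E_BASE_URL": "http://example.invalid/v1"}
    with mock.patch.dict(os.environ, overrides):
        return build_model_config()
```

Swapping the real model for a stub keeps the run free of network calls, and `mock.patch.dict` restores `os.environ` afterwards, so the check cannot leak settings into other tests.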

Author

poofeth commented May 11, 2026

Removed the generated per-example uv.lock from the PR to keep the example diff focused; this cuts 6,619 generated lines without changing the source package.

Reran validation after removal:

    $ uv run --project examples/art-e pytest examples/art-e/tests -q
    ...... [100%]
    6 passed, 7 warnings in 3.01s

    $ (cd examples/art-e && uv run python main.py)
    Scenario: Where and when is the team offsite?
    Retrieved: Team offsite logistics
    Reward: 0.93

    $ uv run --project examples/art-e ruff check .
    All checks passed!

    $ uv run --project examples/art-e ruff format --check .
    311 files already formatted

Also updated the PR body with a scope note: tests are deterministic/provider-free; real training requires a configured inference endpoint.

Author

poofeth commented May 11, 2026

@poofeth poofeth closed this May 12, 2026