Open
Description
The documentation is incomplete and vague: it does not show how to perform standalone runs for trace-based metrics. See the section https://deepeval.com/docs/metrics-step-efficiency#as-a-standalone. Please give a working sample for convo_test_case.
evals_iterator() currently runs trace-based metrics (e.g., StepEfficiency, TaskCompletion) internally and uploads them to a ConfidentAI test run, but there is no way to access these metric results from Python code.
This prevents users from using trace-based metrics inside custom evaluation workflows.
Expected behaviour
print(metric.score)  # after the run finishes, the metric results should be readable from Python
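To make the expected access pattern concrete, here is a minimal sketch that does not use deepeval at all. `StandInTraceMetric`, its `measure()` arguments, and the scoring rule are hypothetical stand-ins invented for illustration; the only point is that `score` (and a reason) remain readable on the metric object once the run has finished.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInTraceMetric:
    """Hypothetical stand-in for a trace-based metric; NOT a deepeval class."""

    threshold: float = 0.5
    score: Optional[float] = None
    reason: Optional[str] = None

    def measure(self, executed_steps: int, necessary_steps: int) -> float:
        # Toy scoring rule (an assumption): share of executed steps that were needed.
        self.score = necessary_steps / executed_steps if executed_steps else 0.0
        self.reason = f"{necessary_steps} of {executed_steps} steps were necessary"
        return self.score

    def is_successful(self) -> bool:
        # A result exists and meets the threshold.
        return self.score is not None and self.score >= self.threshold


metric = StandInTraceMetric()
metric.measure(executed_steps=4, necessary_steps=3)

# The behaviour requested in this issue: results stay on the object after the run.
print(metric.score)
print(metric.reason)
```

With the real trace-based metrics, the same attributes would ideally be populated after evals_iterator() completes, so they can feed a custom evaluation workflow.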