Commit ec509b0

Merge branch 'main' into feat/property-bindings-support
2 parents 1cd1151 + 10bc126 commit ec509b0

25 files changed: +3289 -4 lines changed

packages/uipath/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "uipath"
-version = "2.10.29"
+version = "2.10.30"
 description = "Python SDK and CLI for UiPath Platform, enabling programmatic interaction with automation services, process management, and deployment tools."
 readme = { file = "README.md", content-type = "text/markdown" }
 requires-python = ">=3.11"
Lines changed: 201 additions & 0 deletions
@@ -0,0 +1,201 @@
# Line-by-Line Evaluation Sample

This sample demonstrates the line-by-line evaluation feature for output evaluators.

## Overview

Line-by-line evaluation allows evaluators to:
- Split multi-line outputs by a configurable delimiter (e.g., `\n`)
- Evaluate each line independently
- Provide partial credit based on the percentage of correct lines
- Return detailed per-line feedback
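
The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the SDK's implementation: the function name `line_by_line_exact_match` and its return shape are hypothetical, and it assumes lines are compared by position, with a line missing on either side counted as a failure.

```python
from itertools import zip_longest


def line_by_line_exact_match(actual: str, expected: str, delimiter: str = "\n"):
    """Score `actual` against `expected` one line at a time (hypothetical helper)."""
    actual_lines = actual.split(delimiter)
    expected_lines = expected.split(delimiter)

    # Pair lines by position; a line missing on either side is paired with None
    # and therefore never matches.
    details = [
        {"line": i, "actual": a, "expected": e, "passed": a == e}
        for i, (a, e) in enumerate(zip_longest(actual_lines, expected_lines), start=1)
    ]
    # Partial credit: fraction of line pairs that match exactly.
    score = sum(d["passed"] for d in details) / len(details)
    return score, details


score, details = line_by_line_exact_match(
    "Item: apple\nItem: banana\nItem: cherry",
    "Item: apple\nItem: WRONG\nItem: cherry",
)
print(round(score, 2))                 # 0.67
print([d["passed"] for d in details])  # [True, False, True]
```

The per-line `details` list plays the role of the per-line feedback, and the score is the fraction of matching lines.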

## Features Demonstrated

- **Partial Credit Scoring**: Get 0.67 for 2/3 correct lines instead of 0.0
- **Per-Line Feedback**: See exactly which lines passed or failed
- **Configurable Delimiter**: Use `\n`, `|`, or any custom delimiter
- **Comparison**: Side-by-side comparison with regular evaluation

## Installation

This sample uses the UiPath package from TestPyPI:

```bash
# Install dependencies
uv sync
```

## Usage

### Run the agent

```bash
uv run uipath run main '{"items": ["apple", "banana", "cherry"]}'
```
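
For orientation, an agent that accepts this input and produces one `Item: ...` line per entry could look like the sketch below. The `Input` and `Output` dataclasses and the `main` signature are assumptions made for illustration; the sample's actual `main.py` may be written differently.

```python
from dataclasses import dataclass


@dataclass
class Input:
    items: list[str]


@dataclass
class Output:
    result: str


def main(data: Input) -> Output:
    # Emit one "Item: <name>" line per input item, joined by newlines.
    return Output(result="\n".join(f"Item: {item}" for item in data.items))


print(main(Input(items=["apple", "banana", "cherry"])).result)
```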

### Run evaluations

```bash
uv run uipath eval main evaluations/eval-sets/default.json --workers 1
```

## Evaluation Results

The sample includes three test cases with five evaluators:

### ExactMatch Evaluators
- **LineByLineExactMatch** - New evaluator with line-by-line support
- **RegularExactMatch** - New evaluator without line-by-line (for comparison)
- **LegacyLineByLineExactMatch** - Legacy evaluator with line-by-line support

### Contains Evaluators
- **LineByLineContains** - New evaluator with line-by-line support (checks if each line contains the search text)
- **RegularContains** - New evaluator without line-by-line (checks if the entire output contains the search text)

Test cases:
1. **All lines match exactly** - All ExactMatch evaluators score 1.0
2. **One line doesn't match** - Line-by-line ExactMatch: 0.67 (shown as 0.7 in the table), Regular ExactMatch: 0.0 (shows partial credit!)
3. **Single item** - All evaluators score 1.0

Expected output (showing ExactMatch evaluators):
```
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Evaluation                 ┃ LineByLineExactMatch ┃ RegularExactMatch ┃ LegacyLineByLineExactMatch ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Test all lines match       │ 1.0                  │ 1.0               │ 1.0                        │
│ Test when one line doesn't │ 0.7                  │ 0.0               │ 0.7                        │ ← Key difference!
│ Test with single item      │ 1.0                  │ 1.0               │ 1.0                        │
├────────────────────────────┼──────────────────────┼───────────────────┼────────────────────────────┤
│ Average                    │ 0.9                  │ 0.7               │ 0.9                        │
└────────────────────────────┴──────────────────────┴───────────────────┴────────────────────────────┘
```
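
The figures in the table follow from simple arithmetic. The snippet below reproduces them, assuming the report rounds scores to one decimal place (an inference from the 0.67 versus 0.7 figures, not confirmed behavior).

```python
line_by_line = [1.0, 2 / 3, 1.0]  # per-test scores with partial credit
regular = [1.0, 0.0, 1.0]         # all-or-nothing scores


def mean(scores):
    return sum(scores) / len(scores)


print(f"{2 / 3:.1f}")               # 0.7
print(f"{mean(line_by_line):.1f}")  # 0.9
print(f"{mean(regular):.1f}")       # 0.7
```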
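
Contains evaluators score 1.0 whenever the search text is "Item:", since every output line contains it. The first test case in the eval set searches for "apple" instead, which appears on only one of the three output lines, so LineByLineContains would be expected to give partial credit there.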

## Configuration

### Evaluator Configuration

#### New Evaluators (Version-based)

The line-by-line evaluator is configured in `evaluations/evaluators/line-by-line-exact-match.json`:

```json
{
  "version": "1.0",
  "evaluatorTypeId": "uipath-exact-match",
  "evaluatorConfig": {
    "name": "LineByLineExactMatch",
    "targetOutputKey": "result",
    "lineByLineEvaluator": true,
    "lineDelimiter": "\n"
  }
}
```

#### Legacy Evaluators (Category/Type-based)

Legacy evaluators also support line-by-line evaluation in `evaluations/evaluators/legacy-line-by-line-exact-match.json`:

```json
{
  "category": "Deterministic",
  "type": "Equals",
  "name": "LegacyLineByLineExactMatch",
  "targetOutputKey": "result",
  "lineByLineEvaluation": true,
  "lineDelimiter": "\n"
}
```

#### Contains Evaluators

The Contains evaluator checks if the output contains a specific search text. In line-by-line mode, it checks each line independently:

**Line-by-line Contains** (`evaluations/evaluators/line-by-line-contains.json`):
```json
{
  "version": "1.0",
  "evaluatorTypeId": "uipath-contains",
  "evaluatorConfig": {
    "name": "LineByLineContains",
    "target_output_key": "result",
    "line_by_line_evaluator": true,
    "line_delimiter": "\n",
    "case_sensitive": false,
    "negated": false
  }
}
```

**Regular Contains** (`evaluations/evaluators/regular-contains.json`):
```json
{
  "version": "1.0",
  "evaluatorTypeId": "uipath-contains",
  "evaluatorConfig": {
    "name": "RegularContains",
    "target_output_key": "result",
    "line_by_line_evaluator": false,
    "case_sensitive": false,
    "negated": false
  }
}
```

In evaluation criteria, specify the search text:
```json
{
  "LineByLineContains": {
    "searchText": "Item:"
  }
}
```

**Behavior difference**:
- **Line-by-line**: Checks if each line contains "Item:", gives partial credit (e.g., 2/3 if one line is missing it)
- **Regular**: Checks if the entire output contains "Item:" at least once, returns 1.0 or 0.0
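
The difference between the two modes can be illustrated with a small sketch. The helper names `contains` and `contains_line_by_line` are hypothetical, and applying `negated` to each line separately is an assumption rather than documented behavior.

```python
def contains(text: str, search: str, case_sensitive: bool = False, negated: bool = False) -> float:
    """Whole-text check: 1.0 if `search` occurs anywhere in `text`, else 0.0."""
    if not case_sensitive:
        text, search = text.lower(), search.lower()
    found = search in text
    # `negated` flips the outcome of the check.
    return float(found != negated)


def contains_line_by_line(text: str, search: str, delimiter: str = "\n", **options) -> float:
    """Apply the same check to each line and return the fraction that pass."""
    lines = text.split(delimiter)
    return sum(contains(line, search, **options) for line in lines) / len(lines)


output = "Item: apple\nbanana\nItem: cherry"
print(contains(output, "Item:"))                         # 1.0
print(round(contains_line_by_line(output, "Item:"), 2))  # 0.67
```

Here the middle line lacks the search text, so the whole-output check still passes while the line-by-line check reports two of three lines.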

Key options for all evaluator types:
- `lineByLineEvaluator`/`lineByLineEvaluation`: Enable line-by-line evaluation (default: `false`); written as `line_by_line_evaluator` in the Contains examples above
- `lineDelimiter`: Delimiter to split lines (default: `"\n"`); written as `line_delimiter` in the Contains examples above
- `case_sensitive`: Case-sensitive comparison (default: `false` for Contains, `true` for ExactMatch)
- `negated`: Invert the result (default: `false`, only for Contains)

### Custom Delimiters

You can use any delimiter, for example `|` for pipe-separated values:

```json
{
  "evaluatorConfig": {
    "lineByLineEvaluator": true,
    "lineDelimiter": "|"
  }
}
```
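
One thing to keep in mind with custom delimiters is that plain string splitting keeps surrounding whitespace and yields an empty trailing segment when the text ends with the delimiter. The snippet below shows only Python's own `str.split` behavior; whether the evaluator trims or drops such segments is not stated here, so it is worth checking before relying on either.

```python
clean = "apple|banana|cherry".split("|")
padded = "apple | banana".split("|")
trailing = "apple|banana|".split("|")

print(clean)     # ['apple', 'banana', 'cherry']
print(padded)    # ['apple ', ' banana']
print(trailing)  # ['apple', 'banana', '']
```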

## File Structure

```
line_by_line_test/
├── main.py  # Simple agent that outputs one item per line
├── uipath.json  # Agent configuration
├── pyproject.toml  # Dependencies (uses TestPyPI)
└── evaluations/
    ├── evaluators/
    │   ├── line-by-line-exact-match.json  # New line-by-line ExactMatch evaluator
    │   ├── regular-exact-match.json  # New regular ExactMatch evaluator
    │   ├── legacy-line-by-line-exact-match.json  # Legacy line-by-line ExactMatch evaluator
    │   ├── line-by-line-contains.json  # New line-by-line Contains evaluator
    │   └── regular-contains.json  # New regular Contains evaluator
    └── eval-sets/
        └── default.json  # Test cases with all 5 evaluators
```

## Learn More

- [UiPath Python SDK Documentation](https://docs.uipath.com/)
- [Evaluation Framework Guide](../../src/uipath/_resources/eval.md)
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
{
  "version": "2.0",
  "resources": []
}
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
{
  "$schema": "https://cloud.uipath.com/draft/2024-12/entry-point",
  "$id": "entry-points.json",
  "entryPoints": [
    {
      "filePath": "main",
      "uniqueId": "main",
      "type": "function",
      "input": {
        "type": "object",
        "properties": {
          "items": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        },
        "description": "Input schema.",
        "required": [
          "items"
        ]
      },
      "output": {
        "type": "object",
        "properties": {
          "result": {
            "type": "string"
          }
        },
        "description": "Output schema.",
        "required": [
          "result"
        ]
      }
    }
  ]
}
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
{
  "version": "1.0",
  "id": "line-by-line-test",
  "name": "Line-by-Line Evaluation Test",
  "evaluatorRefs": [
    "LineByLineExactMatch",
    "RegularExactMatch",
    "LegacyLineByLineExactMatch",
    "LineByLineContains",
    "RegularContains"
  ],
  "evaluations": [
    {
      "id": "test-all-lines-match",
      "name": "Test all lines match exactly",
      "inputs": {
        "items": [
          "apple",
          "banana",
          "cherry"
        ]
      },
      "evaluationCriterias": {
        "LineByLineExactMatch": {
          "expectedOutput": {
            "result": "Item: apple\nItem: banana\nItem: cherry"
          }
        },
        "RegularExactMatch": {
          "expectedOutput": {
            "result": "Item: apple\nItem: banana\nItem: cherry"
          }
        },
        "LegacyLineByLineExactMatch": {
          "expectedOutput": {
            "result": "Item: apple\nItem: banana\nItem: cherry"
          },
          "expectedAgentBehavior": ""
        },
        "LineByLineContains": {
          "searchText": "apple"
        },
        "RegularContains": {
          "searchText": "apple"
        }
      }
    },
    {
      "id": "test-partial-line-mismatch",
      "name": "Test when one line doesn't match",
      "inputs": {
        "items": [
          "apple",
          "banana",
          "cherry"
        ]
      },
      "evaluationCriterias": {
        "LineByLineExactMatch": {
          "expectedOutput": {
            "result": "Item: apple\nItem: WRONG\nItem: cherry"
          }
        },
        "RegularExactMatch": {
          "expectedOutput": {
            "result": "Item: apple\nItem: WRONG\nItem: cherry"
          }
        },
        "LegacyLineByLineExactMatch": {
          "expectedOutput": {
            "result": "Item: apple\nItem: WRONG\nItem: cherry"
          },
          "expectedAgentBehavior": ""
        },
        "LineByLineContains": {
          "searchText": "Item:"
        },
        "RegularContains": {
          "searchText": "Item:"
        }
      }
    },
    {
      "id": "test-single-item",
      "name": "Test with single item",
      "inputs": {
        "items": [
          "orange"
        ]
      },
      "evaluationCriterias": {
        "LineByLineExactMatch": {
          "expectedOutput": {
            "result": "Item: orange"
          }
        },
        "RegularExactMatch": {
          "expectedOutput": {
            "result": "Item: orange"
          }
        },
        "LegacyLineByLineExactMatch": {
          "expectedOutput": {
            "result": "Item: orange"
          },
          "expectedAgentBehavior": ""
        },
        "LineByLineContains": {
          "searchText": "Item:"
        },
        "RegularContains": {
          "searchText": "Item:"
        }
      }
    }
  ]
}
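
An eval set like the one above lists evaluators in `evaluatorRefs` and then supplies criteria per evaluation under `evaluationCriterias`. A quick sanity check one might run is sketched below. The helper `missing_criteria` is hypothetical and the inline JSON is a trimmed stand-in for a real file; whether the framework actually requires criteria for every referenced evaluator is an assumption.

```python
import json

# Trimmed stand-in for an eval-set file, using the same top-level keys.
eval_set = json.loads("""
{
  "evaluatorRefs": ["LineByLineExactMatch", "RegularContains"],
  "evaluations": [
    {"id": "ok", "evaluationCriterias": {"LineByLineExactMatch": {}, "RegularContains": {}}},
    {"id": "incomplete", "evaluationCriterias": {"LineByLineExactMatch": {}}}
  ]
}
""")


def missing_criteria(eval_set: dict) -> dict[str, list[str]]:
    """Map each evaluation id to the evaluator refs it has no criteria for."""
    refs = eval_set["evaluatorRefs"]
    gaps = {}
    for evaluation in eval_set["evaluations"]:
        missing = [r for r in refs if r not in evaluation["evaluationCriterias"]]
        if missing:
            gaps[evaluation["id"]] = missing
    return gaps


print(missing_criteria(eval_set))  # {'incomplete': ['RegularContains']}
```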
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
{
  "version": "1.0",
  "id": "simple-test",
  "name": "Simple Test",
  "evaluatorRefs": ["LineByLineExactMatch"],
  "evaluations": [
    {
      "id": "test-1",
      "name": "Single test",
      "inputs": {
        "items": ["apple"]
      },
      "evaluationCriterias": {
        "LineByLineExactMatch": {
          "expectedOutput": {
            "result": "Item: apple"
          }
        }
      }
    }
  ]
}
