Batch mode is a feature of UFO, the agent allows batch automation of tasks.
3
+
Batch mode allows automated execution of tasks on specific applications or files using predefined plan files. This mode is particularly useful for repetitive tasks on Microsoft Office applications (Word, Excel, PowerPoint).
4
4
5
5
## Quick Start
6
6
7
-
### Step 1: Create a Plan file
7
+
### Step 1: Create a Plan File
8
8
9
-
Before starting the Batch mode, you need to create a plan file that contains the list of steps for the agent to follow. The plan file is a JSON file that contains the following fields:
9
+
Create a JSON plan file that defines the task to be automated. The plan file should contain the following fields:
| object | The application or file to interact with. | String |
15
15
| close | Determines whether to close the corresponding application or file after completing the task. | Boolean |
16
16
17
-
Below is an example of a plan file:
17
+
Example plan file:
18
18
19
19
```json
{
    "task": "Type in a text of 'Test For Fun' with heading 1 level",
    "object": "draft.docx",
-   "close": False
+   "close": false
}
```
26
26
27
-
!!! note
28
-
The `object` field is the application or file that the agent will interact with. The object **must be active** (can be minimized) when starting the Batch mode.
29
-
The structure of your files should be as follows, where `tasks` is the directory for your tasks and `files` is where your object files are stored:
27
+
**Important:** The `close` field must be a JSON boolean (`true` or `false`). The Python spellings `True` and `False` are not valid JSON and will cause the plan file to fail to parse.
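One way to avoid the `True`/`false` mix-up is to generate the plan file from Python and let the standard `json` module handle serialization. The sketch below is only an illustration: `build_batch_plan` is a hypothetical helper, not part of UFO, and it assumes the plan contains just the `task`, `object`, and `close` fields shown in the example.

```python
import json


def build_batch_plan(task: str, obj: str, close: bool) -> str:
    """Serialize a batch plan (hypothetical helper, not a UFO API)."""
    if not isinstance(close, bool):
        raise TypeError("close must be a bool")
    plan = {"task": task, "object": obj, "close": close}
    # json.dumps writes Python False as the JSON literal false.
    return json.dumps(plan, indent=4)


plan_text = build_batch_plan(
    "Type in a text of 'Test For Fun' with heading 1 level",
    "draft.docx",
    False,
)
print(plan_text)
```

Writing `plan_text` to `tasks/plan.json` would then produce a file with a valid JSON boolean.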
30
28
31
-
- Parent
32
-
- tasks
33
-
- files
29
+
The file structure should be organized as follows:
34
30
31
+
```
Parent/
├── tasks/
│   └── plan.json
└── files/
    └── draft.docx
```
38
+
39
+
The `object` field in the plan file refers to files in the `files` directory. The plan reader will automatically resolve the full file path by replacing `tasks` with `files` in the directory structure.
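The path resolution described above can be sketched as follows. This is an assumption about the behaviour, not the actual `PlanReader` code: `resolve_object_path` is a hypothetical function, and it swaps every directory component named `tasks` for `files`, which may differ from how UFO performs the replacement.

```python
from pathlib import PurePosixPath


def resolve_object_path(plan_file: str, object_name: str) -> str:
    """Map <Parent>/tasks/plan.json + draft.docx to <Parent>/files/draft.docx.

    Hypothetical sketch; forward-slash paths are assumed.
    """
    plan_dir = PurePosixPath(plan_file).parent
    # Replace the 'tasks' directory component with 'files'.
    parts = ["files" if part == "tasks" else part for part in plan_dir.parts]
    return str(PurePosixPath(*parts) / object_name)


print(resolve_object_path("C:/Parent/tasks/plan.json", "draft.docx"))
```

Under these assumptions, a plan at `C:/Parent/tasks/plan.json` with `"object": "draft.docx"` resolves to `C:/Parent/files/draft.docx`.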
35
40
36
-
### Step 2: Start the Batch Mode
37
-
To start the Batch mode, run the following command:
Replace `{task_name}` with the name of the task and `{plan_file}` with the `Path_to_Parent/Plan_file`.
50
+
**Parameters:**
51
+
- `{task_name}`: Name for this task execution (used for logging)
52
+
- `{plan_file}`: Full path to the plan JSON file (e.g., `C:/Parent/tasks/plan.json`)
53
+
54
+
### Supported Applications
46
55
56
+
Batch mode currently supports the following Microsoft Office applications:
47
57
58
+
- **Word** (`.docx` files) - `WINWORD.EXE`
59
+
- **Excel** (`.xlsx` files) - `EXCEL.EXE`
60
+
- **PowerPoint** (`.pptx` files) - `POWERPNT.EXE`
61
+
62
+
The application will be automatically launched when the batch mode starts, and the specified file will be opened and maximized.
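The extension-to-application mapping listed above can be expressed as a small lookup. This is a minimal sketch for checking a plan's `object` before starting batch mode; `SUPPORTED_APPS` and `executable_for` are hypothetical names, and UFO's own lookup may be organized differently.

```python
from pathlib import PurePosixPath

# File extensions and executables taken from the list above.
SUPPORTED_APPS = {
    ".docx": "WINWORD.EXE",
    ".xlsx": "EXCEL.EXE",
    ".pptx": "POWERPNT.EXE",
}


def executable_for(object_name: str) -> str:
    """Return the Office executable for a plan's object (hypothetical helper)."""
    suffix = PurePosixPath(object_name).suffix.lower()
    if suffix not in SUPPORTED_APPS:
        raise ValueError(f"Unsupported file type for batch mode: {suffix!r}")
    return SUPPORTED_APPS[suffix]


print(executable_for("draft.docx"))
```

Failing early on an unsupported extension is a design choice of this sketch, not documented UFO behaviour.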
48
63
49
64
## Evaluation
50
-
You may want to evaluate the `task` is completed successfully or not by following the plan. UFO will call the `EvaluationAgent` to evaluate the task if `EVA_SESSION` is set to `True` in the `config_dev.yaml` file.
51
65
52
-
You can check the evaluation log in the `logs/{task_name}/evaluation.log` file.
66
+
UFO can automatically evaluate whether the task was completed successfully. To enable evaluation, ensure `EVA_SESSION` is set to `True` in the `config/ufo/system.yaml` file.
67
+
68
+
Check the evaluation results in `logs/{task_name}/evaluation.log`.
69
+
70
+
## References
71
+
72
+
The batch mode uses a `PlanReader` to parse the plan file and creates a `FromFileSession` to execute the plan.
53
73
54
-
# References
55
-
The batch mode employs a `PlanReader` to parse the plan file and create a `FromFileSession` to follow the plan.
74
+
### PlanReader
56
75
57
-
## PlanReader
58
-
The `PlanReader` is located in the `ufo/module/sessions/plan_reader.py` file.
76
+
The `PlanReader` is located at `ufo/module/sessions/plan_reader.py`.
59
77
60
78
:::module.sessions.plan_reader.PlanReader
61
79
62
-
<br>
63
-
## FollowerSession
80
+
### FromFileSession
64
81
65
-
The `FromFileSession` is also located in the `ufo/module/sessions/session.py` file.
82
+
The `FromFileSession` is located at `ufo/module/sessions/session.py`.
Sometimes, UFO may need additional context or information to complete a task. These information are important and customized for each user. UFO can ask the user for additional information and save it in the local memory for future reference. This customization feature allows UFO to provide a more personalized experience to the user.
3
+
UFO can ask users for additional context or information when needed and save it in local memory for future reference. This customization feature enables a more personalized user experience by remembering user-specific information across sessions.
4
4
5
-
## Scenario
5
+
## Example Scenario
6
6
7
-
Let's consider a scenario where UFO needs additional information to complete a task. UFO is tasked with booking a cab for the user. To book a cab, UFO needs to know the exact address of the user. UFO will ask the user for the address and save it in the local memory for future reference. Next time, when UFO is asked to complete a task that requires the user's address, UFO will use the saved address to complete the task, without asking the user again.
7
+
Consider a task where UFO needs to book a cab. To complete this task, UFO requires the user's address. UFO will:
8
8
9
+
1. Ask the user for their address
10
+
2. Save the address in local memory
11
+
3. Use the saved address automatically in future tasks that require it
9
12
10
-
## Implementation
11
-
We currently implement the customization feature in the `HostAgent` class. When the `HostAgent` needs additional information, it will transit to the `PENDING` state and ask the user for the information. The user will provide the information, and the `HostAgent` will save it in the local memory base for future reference. The saved information is stored in the `blackboard` and can be accessed by all agents in the session.
13
+
This eliminates the need to repeatedly provide the same information.
12
14
13
-
!!! note
14
-
The customization memory base is only saved in a **local file**. These information will **not** upload to the cloud or any other storage to protect the user's privacy.
15
+
## How It Works
16
+
17
+
The customization feature is implemented across multiple agent types (`HostAgent`, `AppAgent`, and `OpenAIOperatorAgent`). When an agent needs additional information:
18
+
19
+
1. The agent transitions to the `PENDING` state
20
+
2. The agent asks the user for the required information (if `ASK_QUESTION` is enabled)
21
+
3. The user's response is saved to the `blackboard` in the QA pairs file
22
+
4. All agents in the session can access this information from the shared `blackboard`
23
+
24
+
The saved QA pairs are stored locally as JSON lines in the file specified by `QA_PAIR_FILE`. Privacy is preserved as this information never leaves the local machine.
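Storing QA pairs as JSON lines means one JSON object per line, appended as answers arrive. The sketch below shows that pattern with the standard library only. The field names `question` and `answer`, the helper names, and the file name are assumptions for illustration; the actual schema UFO writes to `QA_PAIR_FILE` may differ.

```python
import json
import os
import tempfile


def save_qa_pair(path: str, question: str, answer: str) -> None:
    """Append one QA pair as a single JSON line (assumed schema)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"question": question, "answer": answer}) + "\n")


def load_qa_pairs(path: str, limit=None) -> list:
    """Read saved pairs; with a positive limit, keep only the most recent ones."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        pairs = [json.loads(line) for line in f if line.strip()]
    if limit is None:
        return pairs
    return pairs[-limit:] if limit > 0 else []


# Usage in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    qa_file = os.path.join(tmp, "historical_qa.jsonl")  # hypothetical file name
    save_qa_pair(qa_file, "What is your pickup address?", "1 Example Street")
    print(load_qa_pairs(qa_file))
```

Because each line is independent, appending a new answer never requires rewriting earlier entries.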
15
25
16
26
## Configuration
17
27
18
-
You can configure the customization feature by setting the following field in the `config_dev.yaml` file.
28
+
Configure the customization feature in `config/ufo/system.yaml`:
29
+
30
+
| Configuration Option | Description | Type | Default Value |
| `USE_CUSTOMIZATION` | Whether to enable the customization. | Boolean | True |
23
-
|`QA_PAIR_FILE`| The path for the historical QA pairs. | String | "customization/historical_qa.txt" |
24
-
|`QA_PAIR_NUM`| The number of QA pairs for the customization.| Integer | 20 |
37
+
**Note:** Both `ASK_QUESTION` and `USE_CUSTOMIZATION` need to be enabled for the full customization experience. `ASK_QUESTION` controls whether agents can prompt users for information, while `USE_CUSTOMIZATION` controls whether previously saved information is loaded.
The Follower mode is a feature of UFO that the agent follows a list of pre-defined steps in natural language to take actions on applications. Different from the normal mode, this mode creates an `AppAgent` that follows the plan list provided by the user to interact with the application, instead of generating the plan itself. This mode is useful for debugging and software testing or verification.
3
+
Follower mode enables UFO to execute a predefined list of steps written in natural language. Unlike normal mode, where the agent generates its own plan, follower mode creates an `AppAgent` that follows user-provided steps to interact with applications. This mode is particularly useful for debugging, software testing, and verification.
4
4
5
5
## Quick Start
6
6
7
-
### Step 1: Create a Plan file
7
+
### Step 1: Create a Plan File
8
8
9
-
Before starting the Follower mode, you need to create a plan file that contains the list of steps for the agent to follow. The plan file is a JSON file that contains the following fields:
9
+
Create a JSON plan file containing the steps for the agent to follow:
10
10
11
11
| Field | Description | Type |
12
12
| --- | --- | --- |
13
13
| task | The task description. | String |
14
14
| steps | The list of steps for the agent to follow. | List of Strings |
15
15
| object | The application or file to interact with. | String |
16
16
17
-
Below is an example of a plan file:
17
+
Example plan file:
18
18
19
19
```json
20
20
{
@@ -31,53 +31,54 @@ Below is an example of a plan file:
31
31
}
32
32
```
33
33
34
-
!!! note
35
-
The `object` field is the application or file that the agent will interact with. The object **must be active** (can be minimized) when starting the Follower mode.
34
+
The `object` field specifies the application or file the agent will interact with. This object should be opened and accessible before starting follower mode.
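A follower plan differs from a batch plan in that it carries an explicit `steps` list. The sketch below builds such a plan from Python using the three fields in the table above. `build_follower_plan` is a hypothetical helper, and the task, steps, and object values are invented for illustration; they are not the example from the UFO documentation.

```python
import json


def build_follower_plan(task: str, steps: list, obj: str) -> str:
    """Serialize a follower plan (hypothetical helper, not a UFO API)."""
    if not steps or not all(isinstance(s, str) and s.strip() for s in steps):
        raise ValueError("steps must be a non-empty list of non-empty strings")
    plan = {"task": task, "steps": list(steps), "object": obj}
    return json.dumps(plan, indent=4)


# Illustrative values only.
follower_plan = build_follower_plan(
    task="Add a title to the document",
    steps=["Click at the start of the document", "Type the title text"],
    obj="draft.docx",
)
print(follower_plan)
```

Validating that every step is a non-empty string catches malformed plans before the agent starts.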
36
35
36
+
### Step 2: Start Follower Mode
37
37
38
-
### Step 2: Start the Follower Mode
39
-
To start the Follower mode, run the following command:
Replace `{task_name}` with the name of the task and `{plan_file}` with the path to the plan file.
48
-
45
+
**Parameters:**
46
+
- `{task_name}`: Name for this task execution (used for logging)
47
+
- `{plan_file}`: Path to the plan JSON file
49
48
50
49
### Step 3: Run in Batch (Optional)
51
50
52
-
You can also run the Follower mode in batch mode by providing a folder containing multiple plan files. The agent will follow the plans in the folder one by one. To run in batch mode, run the following command:
51
+
To execute multiple plan files sequentially, provide a folder containing multiple plan files:
UFO will automatically detect the plan files in the folder and run them one by one.
60
-
61
-
!!! tip
62
-
Replace `{task_name}` with the name of the task and `{plan_folder}` with the path to the folder containing plan files.
58
+
UFO will automatically detect and execute all plan files in the folder sequentially.
63
59
60
+
**Parameters:**
61
+
- `{task_name}`: Name for this batch execution (used for logging)
62
+
- `{plan_folder}`: Path to the folder containing plan JSON files
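To preview which files a folder run would pick up, the plan folder can be listed ahead of time. The sketch below is an assumption about the detection rule: it treats every `.json` file directly inside the folder as a plan and sorts by name, whereas UFO's actual discovery logic and execution order are not specified here. `find_plan_files` is a hypothetical helper.

```python
from pathlib import Path


def find_plan_files(plan_folder: str) -> list:
    """List .json files directly inside plan_folder, sorted by name (assumed rule)."""
    folder = Path(plan_folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".json"
    )


# Example: list the plans in the current directory.
for plan in find_plan_files("."):
    print(plan.name)
```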
64
63
65
64
## Evaluation
66
-
You may want to evaluate the `task` is completed successfully or not by following the plan. UFO will call the `EvaluationAgent` to evaluate the task if `EVA_SESSION` is set to `True` in the `config_dev.yaml` file.
67
65
68
-
You can check the evaluation log in the `logs/{task_name}/evaluation.log` file.
66
+
UFO can automatically evaluate task completion. To enable evaluation, ensure `EVA_SESSION` is set to `True` in `config/ufo/system.yaml`.
67
+
68
+
Check the evaluation results in `logs/{task_name}/evaluation.log`.
69
+
70
+
## References
71
+
72
+
Follower mode uses a `PlanReader` to parse the plan file and creates a `FollowerSession` to execute the steps.
69
73
70
-
# References
71
-
The follower mode employs a `PlanReader` to parse the plan file and create a `FollowerSession` to follow the plan.
74
+
### PlanReader
72
75
73
-
## PlanReader
74
-
The `PlanReader` is located in the `ufo/module/sessions/plan_reader.py` file.
76
+
The `PlanReader` is located at `ufo/module/sessions/plan_reader.py`.
75
77
76
78
:::module.sessions.plan_reader.PlanReader
77
79
78
-
<br>
79
-
## FollowerSession
80
+
### FollowerSession
80
81
81
-
The `FollowerSession` is also located in the `ufo/module/sessions/session.py` file.
82
+
The `FollowerSession` is located at `ufo/module/sessions/session.py`.
UFO² supports **wrapping any third-party agent as an AppAgent**, allowing it to be invoked by the HostAgent within a multi-agent workflow. This section demonstrates how to run **Operator**, an OpenAI-based Conversational UI Agent (CUA), as an AppAgent inside the UFO² ecosystem.
3
+
UFO² supports wrapping third-party agents as AppAgents, enabling them to be orchestrated by the HostAgent in multi-agent workflows. This guide demonstrates how to run **Operator**, an OpenAI-based Computer-Using Agent (CUA), within the UFO² ecosystem.
Before proceeding, ensure that Operator has been properly configured. Follow the setup instructions in the [OpenAI CUA (Operator) guide](../../configuration/models/operator.md).
12
10
13
-
Before proceeding, please ensure that the Operator has been properly configured. You can follow the setup instructions in the [OpenAI CUA (Operator) guide](../../configuration/models/operator.md).
11
+
## Running the Operator
14
12
15
-
## 🚀 Running the Operator
13
+
UFO² provides two modes for running Operator:
16
14
17
-
UFO² provides two modes for running the Operator:
15
+
1. **Single Agent Mode (`operator`)**: Run Operator independently through UFO² as a launcher
16
+
2. **AppAgent Mode (`normal_operator`)**: Run Operator as an `AppAgent` orchestrated by the `HostAgent`
18
17
19
-
1.**Single Agent Mode** — Use UFO² as the launcher to run Operator in standalone mode.
20
-
2.**AppAgent Mode** — Run Operator as an `AppAgent`, enabling it to be orchestrated by the `HostAgent` as part of a broader task decomposition.
18
+
### Single Agent Mode
21
19
22
-
### 🔹 Single Agent Mode
20
+
In single agent mode, Operator functions independently but is launched through UFO². This mode is useful for debugging or quick prototyping.
23
21
24
-
In this mode, the Operator functions independently but is launched through UFO². This is useful for debugging or quick prototyping.
python -m ufo --mode operator --task test_operator --request "Open Notepad and type Hello World"
28
29
```
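When launching Operator from a script, the command above can be assembled as an argument list so that the request string needs no manual quoting. This is a minimal sketch: `operator_command` is a hypothetical helper, it uses only the `--mode`, `--task`, and `--request` flags shown in the command above, and it does not run anything itself.

```python
import shlex


def operator_command(task: str, request: str, mode: str = "operator") -> list:
    """Build the argv list for the command shown above (hypothetical helper)."""
    return [
        "python", "-m", "ufo",
        "--mode", mode,
        "--task", task,
        "--request", request,
    ]


cmd = operator_command("test_operator", "Open Notepad and type Hello World")
# shlex.join renders the list as a copy-pasteable shell command.
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` from the UFO repository root. Whether other modes accept the same flags is an assumption to verify against the UFO documentation.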
29
30
30
-
### 🔸 AppAgent Mode
31
+
### AppAgent Mode
32
+
33
+
In AppAgent mode, Operator is wrapped as an `AppAgent` and can be triggered as a sub-agent within the HostAgent workflow. This enables task decomposition where the HostAgent coordinates multiple agents including Operator.
31
34
32
-
This mode wraps Operator as an AppAgent (`normal_operator`) so that it can be triggered as a sub-agent within a full HostAgent workflow.