documents/docs/tutorials/creating_app_agent/demonstration_provision.md
Lines changed: 14 additions & 12 deletions
@@ -1,8 +1,8 @@
-## Provide Human Demonstrations to the AppAgent
+# Provide Human Demonstrations to the AppAgent

 Users or application developers can provide human demonstrations to the `AppAgent` to guide it in executing similar tasks in the future. The `AppAgent` uses these demonstrations to understand the context of the task and the steps required to execute it, effectively becoming an expert in the application.

-### How to Prepare Human Demonstrations for the AppAgent?
+## How to Prepare Human Demonstrations for the AppAgent?

 Currently, UFO supports learning from user trajectories recorded by [Steps Recorder](https://support.microsoft.com/en-us/windows/record-steps-to-reproduce-a-problem-46582a9b-620f-2e36-00c9-04e25d784e47) integrated within Windows. More tools will be supported in the future.

@@ -14,9 +14,10 @@ Follow the [official guidance](https://support.microsoft.com/en-us/windows/recor

 Include any specific details or instructions for UFO to notice by adding comments. Since Steps Recorder doesn't capture typed text, include any necessary typed content in the comments as well.

-<p align="center">
-<img src="../../img/add_comment.png" alt="Adding Comments in Steps Recorder"/>
-</p>
+<figure markdown>
+
+<figcaption>Adding comments in Steps Recorder for additional context</figcaption>
+</figure>


 ### Step 3: Review and Save the Recorded Demonstrations
@@ -57,14 +58,15 @@ Would you like to save any one of them as a future reference for the agent? Pres

 Press `1` to save the plan into its memory for future reference. A sample can be found [here](https://github.com/microsoft/UFO/blob/main/vectordb/demonstration/example.yaml).

-You can view a demonstration video below:
+You can view a demonstration video [here](https://github.com/yunhao0204/UFO/assets/59384816/0146f83e-1b5e-4933-8985-fe3f24ec4777).
-### How to Use Human Demonstrations to Enhance the AppAgent?
+## How to Use Human Demonstrations to Enhance the AppAgent?

 After creating the offline indexer, refer to the [Learning from User Demonstrations](../../ufo2/core_features/knowledge_substrate/learning_from_demonstration.md) section for guidance on how to use human demonstrations to enhance the AppAgent.

----
+## Related Documentation
+
+- [Overview: Enhancing AppAgent Capabilities](./overview.md) - Learn about all enhancement approaches
+- [Help Document Provision](./help_document_provision.md) - Provide knowledge through documentation
documents/docs/tutorials/creating_app_agent/help_document_provision.md
Lines changed: 12 additions & 8 deletions
@@ -2,9 +2,7 @@

 Help documents provide guidance to the `AppAgent` in executing specific tasks. The `AppAgent` uses these documents to understand the context of the task and the steps required to execute it, effectively becoming an expert in the application.

-## How to Provide Help Documents to the AppAgent?
-
-### Step 1: Prepare Help Documents and Metadata
+## Step 1: Prepare Help Documents and Metadata

 UFO currently supports processing help documents in `json` format. More formats will be supported in the future.

@@ -30,11 +28,11 @@ An example of a help document in `json` format is as follows:

 Save each help document in a `json` file of your target folder.

-### Step 2: Place Help Documents in the AppAgent Directory
+## Step 2: Place Help Documents in the AppAgent Directory

 Once you have prepared all help documents and their metadata, place them into a folder. Sub-folders for the help documents are allowed, but ensure that each help document and its corresponding metadata are placed in the same directory.

-### Step 3: Create a Help Document Indexer
+## Step 3: Create a Help Document Indexer

 After organizing your documents in a folder named `path_of_the_docs`, you can create an offline indexer to support RAG for UFO. Follow these steps:
 This command will create an offline indexer for all documents in the `path_of_the_docs` folder using Faiss and embedding with sentence transformer (additional embeddings will be supported soon). By default, the created index will be placed [here](https://github.com/microsoft/UFO/tree/main/vectordb/docs).

-!!! note
+!!! note "Application Name Requirement"
 Ensure the `app_name` is accurately defined, as it is used to match the offline indexer in online RAG.

+## How to Use Help Documents to Enhance the AppAgent?
+
+After creating the offline indexer, refer to the [Learning from Help Documents](../../ufo2/core_features/knowledge_substrate/learning_from_help_document.md) section for guidance on how to use the help documents to enhance the `AppAgent`.

-### How to Use Help Documents to Enhance the AppAgent?
+## Related Documentation

-After creating the offline indexer, you can find the guidance on how to use the help documents to enhance the `AppAgent` in the [Learning from Help Documents](../../ufo2/core_features/knowledge_substrate/learning_from_help_document.md) section.
+- [Overview: Enhancing AppAgent Capabilities](./overview.md) - Learn about all enhancement approaches
+- [User Demonstrations Provision](./demonstration_provision.md) - Teach through examples
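
Step 2 of the help-document guide above allows sub-folders and expects every help document to be a `json` file. Before building the indexer, it may be worth checking that every file actually parses. The following is a minimal sketch under stated assumptions: `find_invalid_help_docs` is a hypothetical helper, not part of UFO, and it checks JSON validity only, because the metadata naming scheme is not shown in this diff.

```python
import json
from pathlib import Path


def find_invalid_help_docs(docs_root: str) -> list[Path]:
    """Return paths of .json files under docs_root that fail to parse.

    Hypothetical helper, not part of UFO: it only checks JSON validity.
    """
    invalid = []
    # rglob descends into sub-folders, which the guide allows.
    for path in sorted(Path(docs_root).rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            invalid.append(path)
    return invalid
```

Calling `find_invalid_help_docs("path_of_the_docs")` returns an empty list when every document parses, so it can serve as a quick pre-flight check before the indexing command.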
documents/docs/tutorials/creating_app_agent/overview.md
Lines changed: 7 additions & 9 deletions
@@ -1,22 +1,20 @@
 # Enhancing AppAgent Capabilities

-UFO² provides a flexible framework for application developers and users to enhance `AppAgent` capabilities for specific applications. By providing additional knowledge and tools, you can significantly improve the `AppAgent`'s effectiveness in automating tasks within your applications.
+UFO² provides a flexible framework for application developers and users to enhance `AppAgent` capabilities for specific applications. AppAgent enhancement is about **augmenting** the existing AppAgent's capabilities through:

-!!!info "What is AppAgent Enhancement?"
-AppAgent enhancement is about **augmenting** the existing AppAgent's capabilities, not creating a new agent. You provide:
-- **Knowledge** (help documents, demonstrations) to guide decision-making
-- **Native API tools** (via MCP servers) for efficient automation
-- **Application-specific context** for better understanding
+- **Knowledge** (help documents, demonstrations) to guide decision-making
+- **Native API tools** (via MCP servers) for efficient automation
+- **Application-specific context** for better understanding

 ## Enhancement Components

 The `AppAgent` can be enhanced through three complementary approaches:
-|**[Help Documents](./help_document_provision.md)**| Provide application-specific guidance and instructions to help the agent understand tasks and workflows |This section|[Learning from Help Documents](../../ufo2/core_features/knowledge_substrate/learning_from_help_document.md)|
-|**[User Demonstrations](./demonstration_provision.md)**| Supply recorded user interactions to teach the agent how to perform specific tasks through examples |This section|[Learning from Demonstrations](../../ufo2/core_features/knowledge_substrate/learning_from_demonstration.md)|
-|**[Native API Tools](./warpping_app_native_api.md)**| Create custom MCP action servers that wrap application COM APIs or other native interfaces for efficient automation |This section|[Creating MCP Servers](../creating_mcp_servers.md)|
+|**[Help Documents](./help_document_provision.md)**| Provide application-specific guidance and instructions to help the agent understand tasks and workflows |[Provision Guide](./help_document_provision.md)|[Learning from Help Documents](../../ufo2/core_features/knowledge_substrate/learning_from_help_document.md)|
+|**[User Demonstrations](./demonstration_provision.md)**| Supply recorded user interactions to teach the agent how to perform specific tasks through examples |[Provision Guide](./demonstration_provision.md)|[Learning from Demonstrations](../../ufo2/core_features/knowledge_substrate/learning_from_demonstration.md)|
+|**[Native API Tools](./warpping_app_native_api.md)**| Create custom MCP action servers that wrap application COM APIs or other native interfaces for efficient automation |[Wrapping Guide](./warpping_app_native_api.md)|[Creating MCP Servers](../creating_mcp_servers.md)|
documents/docs/tutorials/creating_app_agent/warpping_app_native_api.md
Lines changed: 14 additions & 11 deletions
@@ -1,7 +1,6 @@
 # Wrapping Application Native APIs as MCP Action Servers

-!!!info "Modern Approach: MCP Servers"
-UFO² uses **MCP (Model Context Protocol) servers** to expose application native APIs to the AppAgent. This document shows you how to create custom MCP action servers that wrap your application's COM APIs, REST APIs, or other programmable interfaces.
+UFO² uses **MCP (Model Context Protocol) servers** to expose application native APIs to the AppAgent. This document shows you how to create custom MCP action servers that wrap your application's COM APIs, REST APIs, or other programmable interfaces.

 ## Overview

@@ -12,7 +11,7 @@ While AppAgent can automate applications through UI controls, providing **native
AppAgent combines both approaches - the LLM intelligently selects **GUI tools** (from UIExecutor) or **API tools** (from your custom MCP server) based on the task requirements.

 ## Prerequisites
@@ -27,8 +26,6 @@ Before creating a native API MCP server:

 ### Step 1: Create Your MCP Server File

-### Step 1: Create Your MCP Server File
-
 Create a new Python file in `ufo/client/mcp/local_servers/` for your application's MCP server:
documents/docs/tutorials/creating_mcp_servers.md
Lines changed: 34 additions & 39 deletions
@@ -2,11 +2,7 @@

 This tutorial teaches you how to create, register, and deploy custom MCP servers for UFO² agents. You'll learn to build **local**, **HTTP**, and **stdio** MCP servers, and how to register them with different agents.

-!!!info "Prerequisites"
-- Basic Python knowledge
-- Familiarity with [MCP Overview](../mcp/overview.md)
-- Understanding of [MCP Configuration](../mcp/configuration.md)
-- Review [Built-in Local Servers](../mcp/local_servers.md) as examples
+**Prerequisites**: Basic Python knowledge, familiarity with [MCP Overview](../mcp/overview.md) and [MCP Configuration](../mcp/configuration.md). Review [Built-in Local Servers](../mcp/local_servers.md) as examples.

 ---

@@ -43,11 +39,11 @@ All MCP servers fall into two categories:
 |**Action**| State-changing execution | ✅ Yes | ❌ No |

-!!!tip "Tool Selection"
-- **Data Collection tools**: Automatically invoked by the framework to build observation prompts
-- **Action tools**: LLM agent actively selects which tool to execute at each step
-
-**Write clear docstrings and type annotations** - they become LLM instructions!
+**Tool Selection:**
+- **Data Collection tools**: Automatically invoked by the framework to build observation prompts
+- **Action tools**: LLM agent actively selects which tool to execute at each step
+
+**Important**: Write clear docstrings and type annotations - they become LLM instructions!

 ---

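
The hunk above states that docstrings and type annotations become LLM instructions. As a rough illustration of why they matter, the sketch below uses only the standard-library `inspect` module to collect the information a tool-calling framework could pass to a model. Both `set_cell_value` and `describe_tool` are hypothetical names invented for this example; this is not how UFO² or any MCP SDK is confirmed to build its prompts.

```python
import inspect


def set_cell_value(sheet: str, cell: str, value: str) -> str:
    """Write a value into one spreadsheet cell.

    Use this when the task names an exact cell such as "B2".
    """
    # Hypothetical tool body: a real server would call the application here.
    return f"{sheet}!{cell} set to {value}"


def describe_tool(func) -> dict:
    """Collect the metadata a tool-calling framework could show to an LLM."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {
            # Use the annotation's class name when available (e.g. "str").
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }
```

A tool with no docstring or annotations would yield `None` for the description and uninformative parameter types, leaving the model with only the function name to reason about.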
@@ -199,7 +195,7 @@ if __name__ == "__main__":

 ### Example: Application-Specific Server

-Here's a real-world example - a server for Chrome browser automation:
+Here's a real-world example - a server for Chrome browser automation. For more details on wrapping application native APIs, see [Wrapping App Native API](creating_app_agent/warpping_app_native_api.md).
-Now your Windows UFO² agent can execute Linux commands remotely! The LLM will select `execute_command` or `get_system_info` as needed.
+**Cross-Platform Workflow**: Now your Windows UFO² agent can execute Linux commands remotely! The LLM will select `execute_command` or `get_system_info` as needed.

 ---

@@ -954,11 +948,10 @@ AppAgent:

 ### 1. Write Comprehensive Docstrings

-!!!tip "Docstrings → LLM Instructions"
-Your docstrings are **directly converted to LLM prompts**. The LLM uses them to understand:
-- **What** the tool does
-- **When** to use it
-- **How** to use it correctly
+Your docstrings are **directly converted to LLM prompts**. The LLM uses them to understand:
+- **What** the tool does
+- **When** to use it
+- **How** to use it correctly

 **Bad Example:**
 ```python
@@ -1259,12 +1252,13 @@ asyncio.run(test())

 ## Next Steps

-Now that you've learned to create MCP servers:
+Now that you've learned to create MCP servers, explore these related topics:

 1. **Review Built-in Servers**: See [Local Servers](../mcp/local_servers.md) for production examples
 2. **Explore HTTP Deployment**: Read [Remote Servers](../mcp/remote_servers.md) for cross-platform automation
 3. **Understand Agent Configuration**: Study [MCP Configuration](../mcp/configuration.md) for advanced setups
-4. **Create Your First Agent**: Follow [Creating App Agent](creating_app_agent/overview.md) to build custom agents
+4. **Learn about Computer Class**: Review [Computer](../client/computer.md) to understand the MCP client integration
+5. **Create Your First Agent**: Follow [Creating App Agent](creating_app_agent/overview.md) to build custom agents

 ---

@@ -1280,10 +1274,11 @@ Now that you've learned to create MCP servers:

 ---

-!!!quote "Best Practices Summary"
-- ✅ **Write clear docstrings** - they become LLM instructions
-- ✅ **Use descriptive names** - for tools, parameters, and namespaces