Commit 6492a17

Logo and docs
1 parent c55ae4f commit 6492a17

4 files changed

Lines changed: 23 additions & 8 deletions

README.md

Lines changed: 19 additions & 8 deletions
@@ -1,6 +1,9 @@
-<div align="center">
-<h1>ToolMaker</h1>
+<div align="center" style="font-weight: bold;">
+<img src="resources/logo_text.svg" width="400px" alt="ToolMaker" />
+
+<b>Turn GitHub repositories into LLM-compatible tools.</b>
 </div>
+<hr>

 <img src="resources/logo.svg" width="130px" align="right" />

@@ -15,23 +18,28 @@ This repository contains the official code for the paper:
 Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks by dynamically utilising external software components. However, these tools must be implemented in advance by human developers, hindering the applicability of LLM agents in domains which demand large numbers of highly specialised tools, like in life sciences and medicine. Motivated by the growing trend of scientific studies accompanied by public code repositories, we propose ToolMaker, a novel agentic framework that autonomously transforms papers with code into LLM-compatible tools. Given a short task description and a repository URL, ToolMaker autonomously installs required dependencies and generates code to perform the task, using a closed-loop self-correction mechanism to iteratively diagnose and rectify errors. To evaluate our approach, we introduce a benchmark comprising 15 diverse and complex computational tasks spanning both medical and non-medical domains with over 100 unit tests to objectively assess tool correctness and robustness. ToolMaker correctly implements 80% of the tasks, substantially outperforming current state-of-the-art software engineering agents. ToolMaker therefore is a step towards fully autonomous agent-based scientific workflows.
 </details>

-![Overview](resources/overview.png)

+## News
+- **[May 2025]** Our [paper](https://arxiv.org/abs/2502.11705) has been accepted at [ACL 2025](https://2025.aclweb.org/)! 🎉
+- **[Feb 2025]** Initial code release.

 > [!NOTE]
-> This is an experimental release of ToolMaker that is compatible with the [ToolArena](https://github.com/KatherLab/ToolArena) benchmark. ToolArena includes many more tools than the original TM-Bench which was released as part of ToolMaker. As such, the tasks are no longer defined in this repository, but in the ToolArena repository (though imported into this repository via the [`benchmark`](benchmark/) submodule, which points to ToolArena).
+> This is an experimental release of ToolMaker that is compatible with the [ToolArena](https://github.com/KatherLab/ToolArena) benchmark. ToolArena includes significantly more tools than the original TM-Bench which was released as part of ToolMaker. As such, the tasks are no longer defined in this repository, but in the ToolArena repository (though imported into this repository via the [`benchmark`](benchmark/) submodule, which points to ToolArena).
 >
 > You can still access the original code release of ToolMaker including the original TM-Bench benchmark in the [`original`](https://github.com/KatherLab/ToolMaker/tree/original) branch.

-## News

-- **[May 2025]** Our [paper](https://arxiv.org/abs/2502.11705) has been accepted at [ACL 2025](https://2025.aclweb.org/)! 🎉
-- **[Feb 2025]** Initial code release
+## Overview
+ToolMaker is an agentic workflow that turns GitHub repositories into LLM-compatible tools. Given a short task description and a repository URL, ToolMaker autonomously installs required dependencies and generates code to perform the task, using a closed-loop self-correction mechanism to iteratively diagnose and rectify errors.
+
+<div align="center">
+<img src="resources/tool_making.png" width="600px" alt="Tool making" align="center"/>
+</div>

 ## Installation
 First clone this repository, including submodules (note the `--recursive` flag):
 ```bash
-git clone --recursive ehttps://github.com/KatherLab/ToolMaker
+git clone --recursive https://github.com/KatherLab/ToolMaker
 ```

 Install [`uv`](https://docs.astral.sh/uv/getting-started/installation/) if you haven't already.
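
Reviewer note on the new Overview text in this hunk: the "closed-loop self-correction mechanism" it mentions can be pictured roughly as the loop below. This is only an illustrative sketch, not ToolMaker's actual implementation. Every name in it (`Attempt`, `self_correct`, `toy_generate`, `toy_execute`) is hypothetical, and the toy functions stand in for the LLM call and the sandboxed execution, which the README does not describe in detail.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    code: str
    error: Optional[str]  # None means the run succeeded


def self_correct(
    generate: Callable[[str, list[Attempt]], str],
    execute: Callable[[str], Optional[str]],
    task: str,
    max_iterations: int = 5,
) -> Optional[str]:
    """Return the first candidate that runs cleanly, or None if the budget runs out."""
    history: list[Attempt] = []
    for _ in range(max_iterations):
        code = generate(task, history)  # propose code, with earlier failures visible
        error = execute(code)           # run it; None signals success
        history.append(Attempt(code, error))
        if error is None:
            return code
    return None


# Hypothetical stand-ins so the sketch runs without an LLM or a sandbox.
def toy_generate(task: str, history: list[Attempt]) -> str:
    # "Repair" the code once a previous attempt has reported an error.
    return "result = 1 / 1" if history else "result = 1 / 0"


def toy_execute(code: str) -> Optional[str]:
    try:
        exec(code, {})
        return None
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


fixed = self_correct(toy_generate, toy_execute, task="divide")
print(fixed)
```

In this toy run the first candidate fails with a `ZeroDivisionError`, the error is appended to the history, and the second candidate succeeds. The iteration cap is an assumption added here so the loop always terminates.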
@@ -78,6 +86,9 @@ uv run python -m toolmaker.utils.visualize -i tool_output/tools/my_uni_tool/logs
 ```
 This will create a `my_uni_tool.html` file in the current directory which you can view in your browser.

+The diagram below provides an illustration of the ToolMaker workflow in action.
+![Overview](resources/overview.png)
+
 ## Benchmarking
 To run the unit tests that constitute the benchmark, use the following command (note that this requires the `benchmark` dependency group to be installed via `uv sync --group benchmark`):
 ```bash

resources/logo_text.svg

Lines changed: 4 additions & 0 deletions

resources/overview.png

-3.86 KB

resources/tool_making.png

157 KB
