This repository provides the artifact for MASEO, a research-oriented multi-agent system that automatically generates ontologies from competency questions (CQs), with a built-in focus on explainability. It aims to make ontology generation more transparent, modular, and intelligent by distributing tasks among specialized agents. Each specialized agent is designed to keep track of the logic behind each entity in the generated ontology.
The pipeline consists of four sequential stages:
| Agent | Responsibility | External Tools |
|---|---|---|
| Ontology Generation Agent | Generates the initial OWL ontology from CQs | None |
| Syntax Repair Agent | Fixes RDF/XML syntax errors reported by the parser | rdflib |
| Logical Consistency Agent | Repairs logical inconsistencies reported by HermiT | HermiT Reasoner |
| Pitfall Resolution Agent | Resolves ontology modeling pitfalls reported by OOPS! | OOPS! |
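
The sequential check-then-repair flow of the stages above can be sketched as follows. This is a minimal illustration, not MASEO's actual code: `RepairStage`, `run_stages`, and the toy check and repair functions are hypothetical names, and a real setup would call rdflib, HermiT, and OOPS! plus an LLM agent in their place.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RepairStage:
    """One check-then-repair stage (hypothetical structure, not MASEO's API)."""
    name: str
    check: Callable[[str], List[str]]        # tool: ontology text -> reported issues
    repair: Callable[[str, List[str]], str]  # agent: ontology text + issues -> new text


def run_stages(ontology: str, stages: List[RepairStage]) -> str:
    """Run the stages strictly in order, repairing only when issues are reported."""
    for stage in stages:
        issues = stage.check(ontology)
        if issues:
            ontology = stage.repair(ontology, issues)
    return ontology


# Toy stand-ins for a validator and a repairing agent (assumptions for the demo).
def toy_syntax_check(text: str) -> List[str]:
    return ["unclosed tag"] if text.count("<") != text.count(">") else []


def toy_syntax_repair(text: str, issues: List[str]) -> str:
    return text + ">"


stages = [RepairStage("Syntax Repair Agent", toy_syntax_check, toy_syntax_repair)]
fixed = run_stages("<owl:Class", stages)
```

Keeping the tool (`check`) separate from the agent (`repair`) mirrors the table: each agent only reacts to what its external tool reports.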
The illustration of the MASEO framework:
- End-to-end automation: from a list of CQs to a validated ontology
- Role-based agents: each stage is handled by a dedicated LLM agent with a specific instruction and responsibility
- Provenance tracking: every ontology entity carries an append-only `vaem:rationale` log attributed to the agent that made each change, and a `dc:source` log linking each change back to the CQ, pitfall, or error that motivated it
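
To make the append-only provenance idea concrete, here is a small in-memory sketch. It is an assumption-laden illustration, not MASEO's storage format: the class names, the field names, and the sample entity, CQ id, and pitfall code are all hypothetical, and the real system records this information as annotations inside the ontology.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ProvenanceEntry:
    agent: str      # which agent made the change
    rationale: str  # why the change was made
    source: str     # CQ id, pitfall code, or error that motivated it


@dataclass
class EntityProvenance:
    """Append-only log for one ontology entity (hypothetical helper)."""
    entity: str
    _entries: List[ProvenanceEntry] = field(default_factory=list)

    def append(self, agent: str, rationale: str, source: str) -> None:
        # Entries are only ever added; nothing is edited or removed.
        self._entries.append(ProvenanceEntry(agent, rationale, source))

    @property
    def entries(self) -> Tuple[ProvenanceEntry, ...]:
        # Expose an immutable copy so callers cannot rewrite history.
        return tuple(self._entries)


log = EntityProvenance("Wine")
log.append("Ontology Generation Agent", "Class needed to answer the question", "CQ1")
log.append("Pitfall Resolution Agent", "Added a missing label", "P08")
```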
| Tool | Purpose | Setup |
|---|---|---|
| HermiT Reasoner | Logical consistency checking | Download HermiT.jar and update the path in reason_ontology() |
| OOPS! REST API | Ontology pitfall detection | No local setup required — uses the public REST endpoint |
| Java (JRE 8+) | Required to run HermiT | sudo apt install default-jre |
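
Before a run, it can help to confirm that the local prerequisites in the table are in place. The following is a minimal sketch, assuming you know where you saved the HermiT jar; `preflight` and the example path are hypothetical and not part of MASEO.

```python
import shutil
from pathlib import Path
from typing import List


def preflight(hermit_jar: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # HermiT is a Java program, so a `java` executable must be on PATH.
    if shutil.which("java") is None:
        problems.append("java not found on PATH (needed to run HermiT)")
    # The jar location is whatever path you configured in reason_ontology().
    if not Path(hermit_jar).is_file():
        problems.append(f"HermiT jar not found at {hermit_jar}")
    return problems


for problem in preflight("./HermiT.jar"):  # example path, adjust to your setup
    print("WARNING:", problem)
```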
MASEO supports execution over a single set of competency questions with a specific LLM (CLI Execution), as well as batch runs over a selection of models and sets of competency questions across various domains (Batch Execution).
```bash
python -u cli.py \
  --config ./config.yaml \
  --cqs_file ./dataset/cqs/wine_cqs.json \
  --save_file ./wine.owl \
  --agent_method true
```

| Argument | Required | Description |
|---|---|---|
| `--config` | Yes | Path to `config.yaml`. Defaults to `./config.yaml`. |
| `--cqs_file` | Yes | JSON file with competency questions: `[{"id": "CQ1", "value": "..."}, ...]`. |
| `--save_file` | Yes | Where to write the produced OWL ontology. |
| `--agent_method` | No | `true` (default) runs the full multi-agent pipeline; `false` runs single-pass generation only. |
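
A competency-question file in the `[{"id": ..., "value": ...}]` shape expected by `--cqs_file` can be produced and sanity-checked with a few lines of Python. This is only a sketch: `write_cqs` and `load_cqs` are hypothetical helpers, and the sequential `CQ1`, `CQ2`, ... ids simply follow the example in the table.

```python
import json
from pathlib import Path
from typing import Dict, List


def write_cqs(path: str, questions: List[str]) -> None:
    """Write questions as a JSON list of {"id", "value"} objects."""
    cqs = [{"id": f"CQ{i}", "value": q} for i, q in enumerate(questions, start=1)]
    Path(path).write_text(json.dumps(cqs, indent=2, ensure_ascii=False), encoding="utf-8")


def load_cqs(path: str) -> List[Dict[str, str]]:
    """Load a CQ file and fail early if an entry lacks "id" or "value"."""
    cqs = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(cqs, list):
        raise ValueError("expected a JSON list of competency questions")
    for item in cqs:
        if not isinstance(item, dict) or not {"id", "value"} <= set(item):
            raise ValueError(f"malformed entry: {item!r}")
    return cqs
```

For example, `write_cqs("./my_cqs.json", ["Which wines are red?"])` creates a file that can then be passed to `--cqs_file`.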
To sweep multiple models and competency-question files in one command, use run_batch.py:
```bash
python -u run_batch.py --batch ./batch.yaml
```

`batch.yaml` contains only the list of models that you wish to run.

```yaml
models:
  - provider: openrouter
    id: qwen/qwen3.6-flash
  - provider: deepseek
    id: deepseek-chat
  - provider: ollama
    id: qwen3:32b
  ...
```

Place your competency-question files in `./dataset/cqs/`. For every (model, cqs_file) pair, the runner invokes both MASEO generation (`--agent_method true`) and single-pass generation (`--agent_method false`). Each generated ontology and log file is saved separately.
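
The size of a batch sweep follows directly from the description above: every model is paired with every CQ file, and each pair runs twice. The sketch below only enumerates that run matrix. It is not the logic of `run_batch.py`: `plan_runs` is a hypothetical helper, and it assumes the CQ files use a `.json` extension.

```python
from itertools import product
from pathlib import Path
from typing import Dict, List, Tuple


def plan_runs(models: List[Dict[str, str]], cqs_dir: str) -> List[Tuple[str, str, str, bool]]:
    """List (provider, model id, cqs file name, agent_method) for every run."""
    cqs_files = sorted(Path(cqs_dir).glob("*.json"))  # assumption: CQ files are *.json
    return [
        (model["provider"], model["id"], cqs_file.name, agent_method)
        for model, cqs_file in product(models, cqs_files)
        for agent_method in (True, False)  # multi-agent pipeline, then single-pass
    ]


models = [
    {"provider": "openrouter", "id": "qwen/qwen3.6-flash"},
    {"provider": "deepseek", "id": "deepseek-chat"},
    {"provider": "ollama", "id": "qwen3:32b"},
]
runs = plan_runs(models, "./dataset/cqs")
print(f"{len(runs)} runs planned")
```

With three models and, say, two CQ files, this yields 3 × 2 × 2 = 12 runs, which is useful to know before starting a long sweep.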
Additional documentation for the project is available at readthedocs:
- The full documentation for configuration can be found at: Configuration
- The full documentation for the input file structure can be found at: Input
- The full documentation for the output file structure can be found at: Output
This work was supported by the grant SOEL: Supporting Ontology Engineering with Large Language Models PID2023-152703NA-I00 funded by MCIN/AEI/10.13039/501100011033 and by ERDF/UE. The authors would also like to thank the EDINT (Espacios de Datos para las Infraestructuras Urbanas Inteligentes) ontology development team for sharing the project resources for evaluation purposes.
