This document describes the architecture of the Graph RAG (Retrieval-Augmented Generation) application that combines OpenAI GPT with GraphDB knowledge graphs for intelligent jaguar conservation queries. The application demonstrates:
- Conversational AI using Microsoft Agent Framework DevUI that queries structured data using SPARQL
- Ontology-Aware Knowledge Extraction that transforms unstructured text into RDF knowledge graphs
This is a true semantic Graph RAG implementation using formal ontologies (RDFS/OWL), not just labeled property graphs (LPG).
┌─────────────────────────────────────────────────────────────┐
│ Microsoft Agent Framework │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ DevUI Server │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Auto- │ │ Interactive │ │ Built-in │ │ │
│ │ │ opening │ │ Chat │ │ Debugging │ │ │
│ │ │ Browser │ │ Interface │ │ Tools │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────┬───────────────────────────────────┘
│
┌─────────────────────────▼───────────────────────────────────┐
│ Microsoft Agent Framework │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Jaguar Query Agent │ │
│ │ ┌─────────────────────────────────────────────┐ │ │
│ │ │ System Prompt │ │ │
│ │ │ - Graph RAG Instructions │ │ │
│ │ │ - SPARQL Guidelines │ │ │
│ │ │ - Response Formatting │ │ │
│ │ └─────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────┐ │ │
│ │ │ OpenAI Client │ │ │
│ │ │ - GPT-4 Integration │ │ │
│ │ │ - Function Calling │ │ │
│ │ │ - Thread Management │ │ │
│ │ └─────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────┐ │ │
│ │ │ GraphDB Tool │ │ │
│ │ │ - SPARQL Query Execution │ │ │
│ │ │ - Query Validation │ │ │
│ │ │ - Result Processing │ │ │
│ │ └─────────────────────────────────────────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────┬───────────────────────────────────┘
│
┌─────────────────────────▼───────────────────────────────────┐
│ External Systems │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ OpenAI API │ │ GraphDB │ │
│ │ - GPT-4 │ │ - RDF Triple Store │ │
│ │ - Responses │ │ - SPARQL Engine │ │
│ │ - Function │ │ - Jaguar Ontology │ │
│ │ Calling │ │ - Conservation │ │
│ └─────────────────┘ │ Data │ │
│ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
In addition to querying existing knowledge graphs, this system provides ontology-driven knowledge extraction from unstructured text:
┌─────────────────────────────────────────────────────────────┐
│ Ontology-Aware Knowledge Extraction │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ Jaguar │ │ Jaguar │ │
│ │ Ontology │ │ Corpus │ │
│ │ (.ttl) │ │ (.txt) │ │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌──────────▼──────────┐ │
│ │ GPT-5 │ │
│ │ Semantic Analysis │ │
│ │ - Domain Context │ │
│ │ - Disambiguation │ │
│ │ - Relationship │ │
│ │ Inference │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌──────────▼──────────┐ │
│ │ RDF Turtle │ │
│ │ Generation │ │
│ │ - Aligned to │ │
│ │ Ontology │ │
│ │ - Valid Structure │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌──────────▼──────────┐ │
│ │ GraphDB │ │
│ │ Import │ │
│ │ (Text Snippet) │ │
│ └─────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Key Innovation: The LLM uses the formal ontology to understand domain semantics, enabling it to:
- Disambiguate entities (wildlife jaguars vs. cars vs. guitars)
- Extract only relevant information aligned with the ontology
- Infer relationships between entities based on context
- Generate valid RDF that conforms to the ontology structure
Why This Requires RDF/Ontologies:
- LPG databases lack formal semantic definitions
- No standardized class hierarchies (RDFS/OWL)
- No property domain/range constraints
- No reasoning or inference capabilities
- LLMs need semantic structure, not just node/edge labels
- DevUI Server: Built-in development web interface
- Auto-opening Browser: Automatically launches at
http://localhost:8000 - Interactive Chat: Real-time conversation interface
- Built-in Debugging: Visual inspection of agent behavior
- Zero Configuration: No frontend code required
- Jaguar Query Agent: Graph RAG specialist agent
- System Prompt: Graph RAG instructions and SPARQL guidelines
- OpenAI Client: GPT-4 integration with function calling
- Thread Management: Conversation context preservation
- text2knowledge.ipynb: Jupyter notebook for ontology-aware extraction
- Loads formal ontology (jaguar_ontology.ttl)
- Processes unstructured text corpus
- Uses GPT-5 for semantic entity extraction
- Generates RDF Turtle aligned with ontology
- Demonstrates concept understanding vs. pattern matching
- GraphDB Tool: Core Graph RAG component
- Executes SPARQL queries against knowledge graph
- Validates query syntax and ontology compliance
- Processes and formats query results
- OpenAI API: GPT-4/GPT-5 for natural language processing and semantic extraction
- GraphDB: RDF triple store with jaguar conservation ontology
- DevUI Launch:
python3 main.pystarts DevUI server - User Interaction: User types queries in DevUI interface
- Agent Processing:
- LLM Analysis: OpenAI GPT analyzes natural language query
- SPARQL Generation: AI generates appropriate SPARQL query
- Tool Call Detection: Agent Framework detects need for GraphDB tool
- Knowledge Graph Query:
- SPARQL Execution:
query_jaguar_databasetool executes query - Result Processing: Raw SPARQL results processed and formatted
- SPARQL Execution:
- Response Generation:
- LLM Interpretation: OpenAI GPT interprets GraphDB results
- Natural Language Response: Generates human-readable response
- Markdown Formatting: Applies formatting with code blocks
- DevUI Display:
- Response Delivery: Shows formatted response in DevUI
- Context Preservation: Conversation history maintained automatically
DevUI provides automatic state management:
# DevUI handles all state management automatically
serve(entities=[query_agent], auto_open=True)Microsoft Agent Framework handles conversation management:
# Agent Framework manages threads and context automatically
response = asyncio.run(agent.run(user_message, thread=thread, store=True))Combines LLM with knowledge graph queries:
# LLM generates SPARQL, tool executes against GraphDB
tools=[query_jaguar_database]
tool_choice="auto"# DevUI handles all state management automatically
# No manual state configuration required
serve(entities=[query_agent], auto_open=True)- Agent Framework: Microsoft Agent Framework manages conversation context
- OpenAI Responses API: Server-side thread persistence
- DevUI Interface: Built-in conversation history and state management
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key
OPENAI_RESPONSES_MODEL_ID=gpt-4
# GraphDB Configuration
GRAPHDB_URL=http://localhost:7200
GRAPHDB_REPOSITORY=your_repo_name# Hardcoded in create_jaguar_query_agent() function
settings = OpenAISettings(
api_key=os.getenv("OPENAI_API_KEY", ""),
model_id=os.getenv("OPENAI_RESPONSES_MODEL_ID", "gpt-4")
)- API Keys: Stored in
.env, never committed to repository - Input Validation: User inputs validated before processing
- Error Handling: Errors logged but sanitized for user display
- SPARQL Injection Prevention: Tool validates SPARQL syntax
- Read-Only Access: Agent only reads from GraphDB
- No Data Storage: No user data persisted beyond conversation
- Secure APIs: Use HTTPS for all external communications
- Microsoft Agent Framework DevUI
- Automatic state management
- Single-user design
- OpenAI API rate limits
- Multi-User Support: Session-based user management
- Distributed State: Redis or database backend
- Horizontal Scaling: Multiple agent instances
- Query Caching: Cache frequent SPARQL patterns
- Async Processing: Non-blocking GraphDB queries
- DevUI request/response logging
- Agent Framework conversation logging
- GraphDB query execution logging
- Response Times: Graph RAG query processing time
- Query Success Rate: SPARQL execution success rate
- Tool Usage: GraphDB tool call frequency
- DevUI Interface: Test queries through the DevUI
- SPARQL Validation: Verify generated queries in GraphDB
- Response Quality: Check natural language response accuracy
- Microsoft Agent Framework: Agent management and DevUI
- OpenAI: GPT API client
- Requests: GraphDB HTTP communication
- python-dotenv: Environment variable management
- GraphDB: RDF triple store
- SPARQL: Query language for knowledge graphs
- RDF/Turtle: Ontology format
graph_RAG/
├── main.py # DevUI entry point
├── text2knowledge.ipynb # Ontology-aware knowledge extraction notebook
├── src/
│ └── agents/
│ ├── jaguar_query_agent.py # Agent creation with DevUI integration
│ └── jaguar_tool.py # GraphDB tool implementation
├── data/
│ ├── jaguar_ontology.ttl # Jaguar conservation ontology (RDFS/OWL)
│ ├── jaguars.ttl # Jaguar instance data (RDF)
│ └── jaguar_corpus.txt # Mixed-content text corpus for extraction
├── docs/
│ ├── agent_design.md # Agent design documentation
│ └── architecture.md # Architecture documentation
├── .env # Environment variables
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- True Semantic Graph RAG: Uses formal ontologies (RDFS/OWL), not just LPG
- Hybrid Intelligence: Combines LLM reasoning with structured data
- Ontology-Driven Extraction: Intelligent knowledge mining from unstructured text
- Real-time Queries: Live data from knowledge graphs
- Context Awareness: Maintains conversation context
- Standards-Based: W3C standards (RDF, SPARQL, RDFS, OWL)
- Extensible: Easy to add new tools and capabilities
- Formal Ontologies: Machine-readable domain semantics
- Class Hierarchies: RDFS/OWL inheritance and reasoning
- Property Definitions: Domain/range constraints
- Semantic Interoperability: Universal URIs and namespaces
- Reasoning Capabilities: Built-in inference engines
- LLM-Friendly: Rich semantic structure for AI understanding
- Data-Driven Insights: Access to structured conservation data
- Natural Language Interface: Easy querying of complex data
- Knowledge Mining: Extract conservation data from research papers
- Educational Tool: Demonstrates semantic Graph RAG capabilities
- Research Support: Facilitates conservation research queries