aaudevart/LLM_RAG_ETL_Showcase-Hostobot

πŸ₯ HOSTOBOT

Hospital Concept

LangChain Neo4j Gemini OpenAI MistralAI

🚀 Overview

HOSTOBOT is an agentic microservices demonstration that combines Large Language Models (LLMs) with a graph database (Neo4j) to deliver a hybrid RAG (Retrieval-Augmented Generation) system.

Traditional RAG systems often struggle with structured aggregation and relationship-heavy queries. HOSTOBOT addresses this limitation with a Router Agent that dynamically selects the best tool for each question:

  • βš™οΈ Python Tools β†’ real-time simulations and computations
  • πŸ”Ž Vector Search β†’ qualitative insights (patient sentiment)
  • 🧠 Graph Cypher Generation β†’ quantitative analytics

Users can request complex hospital-related data with natural language queries.

The code is written in Python 🐍.
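
The routing idea described above can be illustrated with a deliberately simplified sketch. In the project the choice of tool is made by an LLM-driven agent; here a keyword lookup stands in for that decision, and every function name and keyword below is a hypothetical placeholder rather than part of the repository.

```python
from typing import Callable, List, Tuple

Tool = Callable[[str], str]


# Hypothetical stand-ins for the three tool families listed above.
def python_tool(question: str) -> str:
    return f"[python tool] computing an answer for: {question}"


def vector_search(question: str) -> str:
    return f"[vector search] looking up similar reviews for: {question}"


def graph_cypher(question: str) -> str:
    return f"[graph cypher] generating a query for: {question}"


# Assumed keyword hints; a real router agent lets the LLM decide instead.
ROUTES: List[Tuple[Tuple[str, ...], Tool]] = [
    (("wait time", "available"), python_tool),
    (("describing", "reporting", "feel"), vector_search),
]


def route(question: str) -> Tool:
    """Pick a tool for the question, defaulting to graph analytics."""
    lowered = question.lower()
    for keywords, tool in ROUTES:
        if any(keyword in lowered for keyword in keywords):
            return tool
    return graph_cypher


if __name__ == "__main__":
    question = "How are patients describing the nursing team at Castaneda-Hardy?"
    print(route(question)(question))
```

The default branch mirrors the split in the list above: anything that is neither a live computation nor a sentiment question is treated as a quantitative graph query.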


πŸ—οΈ Architecture

The core of the system is the 🏥 Hospital RAG Agent, which routes each query to one of the following tools:

  • 🛠️ Current Hospitals (get_current_hospitals_tools): Retrieves the active list of hospitals directly from the Neo4j database.
  • 🛠️ Wait Times (get_current_wait_times): Generates a simulated current wait time for patient emergency visits.
  • 🛠️ Most Available Hospital (get_most_available_hospital): Suggests the hospital with the highest available capacity.
  • 🔗🛠️ Experiences & Reviews (get_reviews): Runs vector search over patient "Review" nodes via a Neo4jVectorSearchChain, bringing qualitative semantic search to the populated graph database.
  • 🔗🛠️ Graph Querying (get_graph): Translates natural language questions into Cypher queries using a CypherChain, enabling quantitative question answering against the graph.
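
To make the two simulated tools more concrete, here is a minimal sketch of how a wait-time simulation and a "most available hospital" lookup could fit together. It is not the repository's implementation: the function names, the 0 to 600 minute range, and the reading of "highest available capacity" as "shortest simulated wait" are all assumptions made for illustration.

```python
import random
from typing import Dict, Iterable, Optional


def simulate_wait_minutes(hospital: str, rng: random.Random) -> int:
    """Return a made-up wait time in minutes (assumed range: 0 to 600)."""
    return rng.randint(0, 600)


def most_available(
    hospitals: Iterable[str], rng: Optional[random.Random] = None
) -> Dict[str, int]:
    """Simulate a wait for each hospital and return the one with the shortest wait.

    Raises ValueError if no hospitals are given.
    """
    rng = rng or random.Random()
    waits = {name: simulate_wait_minutes(name, rng) for name in hospitals}
    best = min(waits, key=waits.get)
    return {best: waits[best]}


if __name__ == "__main__":
    print(most_available(["Wallace-Hamilton", "Castaneda-Hardy"]))
```

Passing a seeded `random.Random` makes the simulation repeatable, which is convenient when testing an agent that calls such a tool.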

βš™οΈ Configuration

To run the application, you need the following prerequisites:

  • 🔑 Neo4j Credentials: URI, Username, and Password.
  • 🔑 AI API Key: Choose between OpenAI, Gemini, or Mistral.

📝 Action: Provide these credentials in your .env file and make sure Docker Desktop 🐳 is running before proceeding.


📂 Project Structure


LLM_RAG_ETL_Showcase-Hostobot/
├── .env                        # Critical environment configurations
├── docker-compose.yml          # Orchestration for all 4 services
│
├── img/                        # Images to display in the Readme
│   └── dbscheme.png
│
├── hospital_neo4j_etl/         # DATA LAYER
│   ├── src/
│   │   ├── hospital_bulk_csv_write.py  # Bulk loader & relationship mapper
│   │   └── entrypoint.sh               # Execution script
│   └── Dockerfile
│
├── chatbot_api/                # LOGIC LAYER (FastAPI)
│   ├── src/
│   │   ├── main.py                     # API entry point & routes
│   │   ├── agents/
│   │   │   └── hospital_rag_agent.py   # Agent & Tool definitions
│   │   ├── chains/
│   │   │   ├── hospital_cypher_chain.py # Cypher generation logic
│   │   │   └── hospital_review_chain.py # Vector search logic
│   │   ├── tools/
│   │   │   └── wait_times.py            # Simulated real-time tool
│   │   ├── models/
│   │   │   └── hospital_rag_query.py    # Pydantic schemas
│   │   └── utils/
│   │       └── async_utils.py          # Retry decorators
│   └── Dockerfile
│
└── chatbot_frontend/           # PRESENTATION LAYER
    ├── src/
    │   └── main.py                     # Streamlit UI logic
    └── Dockerfile
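
The tree mentions retry decorators in async_utils.py. As a rough illustration of that pattern, and not a copy of the project's code, an async retry decorator can be written with the standard library alone. The name `async_retry` and its parameters are assumptions.

```python
import asyncio
import functools


def async_retry(max_retries: int = 3, delay: float = 1.0):
    """Retry an async function up to max_retries times, sleeping between attempts."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except Exception:
                    # Re-raise once the final attempt has failed.
                    if attempt == max_retries:
                        raise
                    await asyncio.sleep(delay)

        return wrapper

    return decorator


if __name__ == "__main__":
    attempts = {"count": 0}

    @async_retry(max_retries=3, delay=0)
    async def flaky_call() -> str:
        # Fails twice, then succeeds, to exercise the retry loop.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RuntimeError("temporary failure")
        return "ok"

    print(asyncio.run(flaky_call()), "after", attempts["count"], "attempts")
```

A decorator like this is typically useful around calls to an LLM provider or a database that may fail transiently.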



📊 Knowledge Graph Schema

The Neo4j database is modeled to handle multi-hop questions.

Nodes

  • πŸ₯ Hospital: {id, name, state_name}
  • πŸ’° Payer: {id, name}
  • πŸ‘¨β€βš•οΈ 🩺 Physician: {id, name, dob, school, salary}
  • πŸ§‘β€πŸ€β€πŸ§‘ πŸ€• Patient: {id, name, sex, blood_type}
  • πŸ“… Visit: {id, room_number, admission_type, status, diagnosis}
  • πŸ“ Review: {id, text, patient_name, physician_name}

Relationships

  • (Patient)-[:HAS]->(Visit)
  • (Physician)-[:TREATS]->(Visit)
  • (Visit)-[:AT]->(Hospital)
  • (Visit)-[:COVERED_BY]->(Payer)
  • (Visit)-[:WRITES]->(Review)
  • (Hospital)-[:EMPLOYS]->(Physician)
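
To show what a multi-hop question looks like against this schema, the sketch below builds a parameterized Cypher query that walks from a Hospital through its Visits to their Reviews. It uses only the labels, relationships, and properties listed above; the helper name and the choice of question are illustrative assumptions, not queries taken from the project.

```python
from typing import Any, Dict, Tuple


def reviews_per_physician_query(hospital_name: str) -> Tuple[str, Dict[str, Any]]:
    """Build a two-hop query: Hospital <- Visit -> Review, grouped by physician."""
    query = (
        "MATCH (h:Hospital {name: $hospital_name})<-[:AT]-(v:Visit)"
        "-[:WRITES]->(r:Review) "
        "RETURN r.physician_name AS physician, count(r) AS review_count "
        "ORDER BY review_count DESC"
    )
    # The hospital name travels as a parameter, never spliced into the query text.
    return query, {"hospital_name": hospital_name}


if __name__ == "__main__":
    cypher, params = reviews_per_physician_query("Castaneda-Hardy")
    print(cypher)
    print(params)
```

With the official Neo4j Python driver, a pair like this would typically be passed to `session.run(query, params)`.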

Neo4j Schema
Graph Ontology: Mapping relationships between Patients, Visits, and Providers.


πŸ› οΈ Installation & Execution

Start the entire microservices stack with a single command:

docker compose up --build

🌐 Local Access Points

Once the Docker containers are built and healthy, you can access the services locally:


❓ Example Questions to Ask

Try testing the agent with some of these complex, natural language questions:

  • Which hospitals are part of the hospital network?
  • What’s the current wait time at Wallace-Hamilton Hospital?
  • At which hospitals are patients reporting issues related to billing or insurance?
  • What’s the average length in days for completed emergency visits?
  • How are patients describing the nursing team at Castaneda-Hardy?
  • What was the total amount billed to each payer during 2023?
  • What is the average charge for visits covered by Medicaid?
  • Which doctor has the shortest average visit duration?
  • What is the total billed amount for patient 789's hospital stay?
  • Which state saw the biggest percentage increase in Medicaid visits from 2022 to 2023?
  • What’s the average daily billing amount for patients with Aetna coverage?
  • How many patient reviews have been submitted from Florida?
  • For visits that include a chief complaint, what percentage also have a review?
  • What percentage of visits at each hospital include patient reviews?
  • Which physician has received the highest number of reviews for their visits?
  • What is the unique identifier for Dr. James Cooper?
  • Show all reviews associated with visits handled by physician 270, and include every one.

About

A showcase of an agentic microservices architecture 🏗️: a LangChain agent 🧠, built on an LLM and a list of tools, accesses Graph RAG data (Neo4j) 🕸️ through a Python middle/back-end and ETL layer 🐍 so that hospital data can be requested in natural language.
