Reverse Engineering with Gilhari SDK

Transform existing relational databases into intuitive object-oriented models for LLM-powered applications

This document provides a comprehensive guide to the reverse engineering capabilities of the Gilhari SDK, which automatically converts existing relational database schemas into object-oriented models that are perfectly suited for LLM MCP (Model Context Protocol) tool interactions.

🎯 Why Reverse Engineering Matters for LLMs

The LLM Challenge

Large Language Models excel at understanding and manipulating object-oriented data structures, but struggle with complex SQL queries and relational database concepts. Traditional database interactions require:

Complex JOIN operations
Manual relationship management
SQL syntax knowledge
Database-specific query optimization

The Gilhari Solution

Reverse engineering bridges this gap by:

Automatic Relationship Detection: Infers foreign key relationships from database metadata
Object-Oriented Mapping: Creates intuitive Java classes with relationship arrays for object navigation
LLM-Friendly Structure: Generates models that LLMs can easily understand and manipulate
Zero SQL Required: Eliminates the need for complex SQL queries

🔍 How Reverse Engineering Works

1. Database Schema Analysis

The JDX reverse engineering tool analyzes your existing database schema by:

Examining table structures and column definitions
Identifying foreign key constraints
Mapping data types to Java equivalents
Detecting business relationships

2. Relationship Inference

The tool automatically infers relationships by analyzing:

Foreign Key Constraints: Direct database relationships
Column Naming Patterns: Common naming conventions
Data Type Compatibility: Matching key types
Business Logic Patterns: Real-world relationship patterns

3. Java Class Generation

For each relationship, the tool generates:

Entity Classes: Main business objects with relationship arrays
Constructors: Default and JSON-based constructors
JSON Support: Built-in serialization/deserialization through JDX_JSONObject

📊 E-commerce Example: Relationship Analysis

Database Schema

Our e-commerce example uses a typical PostgreSQL schema with these tables:

supplier - Product suppliers
product - Product catalog
customer - Customer information
address - Customer addresses
customerorder - Customer orders
orderitem - Order line items

Automatically Inferred Relationships

1. Supplier → Product (1:Many)

-- Database relationship
supplier.id → product.supplierid

Generated Java Structure: The Supplier class is generated with:

A relationship array field: Product[] listProduct
Default constructor and JSON-based constructor
Extends JDX_JSONObject for JSON serialization support

What This Enables:

The relationship array listProduct is generated in the Java class
At runtime, JDX automatically populates this array when you access it after loading a Supplier entity
Field attributes like name are defined as VIRTUAL_ATTRIB in the .revjdx file and accessed through JDX's dynamic attribute system
No SQL queries are required in your code - JDX handles the relationship queries automatically

2. Customer → CustomerOrder (1:Many)

-- Database relationship
customer.id → customerorder.customerid

Generated Java Structure: The Customer class is generated with:

Relationship array fields: CustomerOrder[] listCustomerOrder and Address[] listAddress
Default constructor and JSON-based constructor
Extends JDX_JSONObject for JSON serialization support

What This Enables:

The relationship arrays listCustomerOrder and listAddress are generated in the Java class
At runtime, JDX automatically populates these arrays when you access them after loading a Customer entity
Field attributes like totalamount are defined as VIRTUAL_ATTRIB in the .revjdx file and accessed through JDX's dynamic attribute system
JDX handles the relationship queries automatically without requiring SQL in your code

3. Customer → Address (1:Many)

-- Database relationship
customer.id → address.customerid

Note: This relationship is included in the Customer class shown above (see listAddress field).

4. CustomerOrder → OrderItem (1:Many)

-- Database relationship
customerorder.id → orderitem.orderid

Generated Java Structure: The CustomerOrder class is generated with:

A relationship array field: OrderItem[] listOrderItem
Default constructor and JSON-based constructor
Extends JDX_JSONObject for JSON serialization support

What This Enables:

The relationship array listOrderItem is generated in the Java class
At runtime, JDX automatically populates this array when you access it after loading a CustomerOrder entity
Field attributes like quantity are defined as VIRTUAL_ATTRIB in the .revjdx file and accessed through JDX's dynamic attribute system
JDX handles the relationship queries automatically

5. Product → OrderItem (1:Many)

-- Database relationship
product.id → orderitem.productid

Generated Java Structure: The Product class is generated with:

A relationship array field: OrderItem[] listOrderItem
Default constructor and JSON-based constructor
Extends JDX_JSONObject for JSON serialization support

What This Enables:

The relationship array listOrderItem is generated in the Java class
At runtime, JDX automatically populates this array when you access it after loading a Product entity
Field attributes like orderid are defined as VIRTUAL_ATTRIB in the .revjdx file and accessed through JDX's dynamic attribute system
JDX handles the relationship queries automatically

🚀 LLM MCP Tool Integration

Object Model Summary

The getObjectModelSummary tool provides LLMs with a complete overview of the generated object model. This tool returns information about all entities, their fields (as defined in the .revjdx file), and their relationships (as defined by the generated relationship arrays in the Java classes).

Natural Language Queries

With reverse engineering, LLMs can perform complex operations using natural language:

Instead of SQL:

SELECT p.*, s.name as supplier_name 
FROM product p 
JOIN supplier s ON p.supplierid = s.id 
WHERE s.name = 'TechSupply Inc'

LLM can use:

"Show me all products from TechSupply Inc"

What Reverse Engineering Actually Generates: The reverse engineering process generates only the Java model classes with relationship arrays. The actual retrieval of entities and population of relationship arrays is handled by the JDX framework at runtime when you use JDX API methods. The generated Supplier class contains the listProduct relationship array, which JDX automatically populates when you access it after loading a Supplier entity.

🔧 Technical Implementation

Configuration Files

The reverse engineering process creates several configuration files:

1. Database Configuration (`ecommerce_template_postgres.config`)

The reverse engineering process creates a configuration file containing:

Database connection string with JDBC URL, credentials, database type, and debug level
JDBC driver specification
Object model package name
Superclass name (typically JDX_JSONObject)
Settings for accessor method generation and JSON mappings

2. ORM Specification (`.revjdx` files)

The reverse engineering process generates JDX configuration files (.revjdx) that define the object model structure. These files contain:

CLASS definitions: Map each database table to a Java class with:
- VIRTUAL_ATTRIB entries for each column (with Java type mappings)
- PRIMARY_KEY specification
- RELATIONSHIP declarations linking to collection classes
- SQLMAP entries for nullable fields
COLLECTION_CLASS definitions: Define array-based collections for 1:Many relationships, specifying the element class, table, and foreign key

3. Class Name Mapping (`classnames_map_ecommerce.js`)

The reverse engineering process generates a JavaScript JSON file that maps Java class names to their fully qualified package names (e.g., "Supplier": "com.acme.ecommerce.model.Supplier").

Generated Java Classes

Entity Class Structure

The reverse engineering tool generates Java classes with the following characteristics:

Each class extends JDX_JSONObject from the JDX framework
Relationship arrays: Public array fields for 1:Many relationships (e.g., Product[] listProduct)
Constructors: Default constructor and JSON-based constructor for deserialization
Package structure: Classes are placed in the package specified in the configuration
Imports: Required imports for JSONObject, JSONException, and JDX_JSONObject

Important Note: The actual field attributes (like id, name, contactemail, rating) are defined in the .revjdx configuration file as VIRTUAL_ATTRIB entries using lowercase names matching the database column names. The JDX framework handles these attributes dynamically at runtime, so they don't appear as explicit fields in the generated Java class. The relationship arrays (like listProduct) are the primary generated code elements that enable object navigation.

Post Reverse-Engineering Customization

In general, reverse-engineering is supposed to give you a jump-start to create the Java (container) classes and a default ORM specification starting with the chosen database tables. After that, you are free to refine, change, massage, or drop any part of the specification for better comprehension and semantic correctness. For example:

Meaningful attribute names: Give a more meaningful attribute name for a column (e.g., product_id for the column name pid).
Naming conventions: Apply a consistent naming convention for the (JSON) object classes and their attributes (camelCase or snake_case).
Semantic data types: Use semantic knowledge of the data type (e.g., Integer vs. BigDecimal) where it matters for the application.
Exposing or hiding columns: Choose which column values to expose in the Java (JSON) object—for example, you may drop the salary attribute from an Employee class for a given use case.
Ordering for clarity: Rearrange the order of mapping specifications for classes, and the order of attribute specifications within a class, for better semantic understanding—e.g., mention more important attributes or relationships earlier.
Curated domain model: Expose only a curated domain model with a limited set of columns (attributes) and relationships for a particular application. The database table might have 20 columns but the application may need only 7; some related records in other tables need not be included in the object graph. This reduces object complexity and data transfer overhead.
All of the above help make an AI agent (LLM) more effective and efficient when using an object model summary based on the ORM specification.

Here is an example of a refined ORM specification that works with the same E-commerce database schema and existing data: ecommerce_postgres_orm_example.jdx.

🎯 Benefits for LLM Applications

1. Natural Language Processing

LLMs can understand business relationships intuitively
No need to translate between SQL and natural language
Complex queries become simple object navigation

2. Object-Oriented Thinking

Data structures match how humans think about business
Relationships are explicit and navigable through relationship arrays
Object navigation replaces complex SQL JOINs

3. Zero SQL Knowledge Required

LLMs don't need to understand database internals
No complex JOIN operations to manage
Automatic query optimization handled by JDX

4. Automatic Relationship Management

Foreign keys become object references
1:Many relationships become arrays (e.g., Product[] listProduct)
The ORM layer automatically handles relationship queries

5. JSON Integration

Built-in serialization/deserialization
Perfect for REST API responses
Seamless integration with web services

🔍 Debugging and Monitoring

SQL Statement Logging

To see the SQL statements generated by Gilhari/JDX, set DEBUG_LEVEL=3 (or higher) in the JDX_DATABASE line of your configuration file:

JDX_DATABASE JDX:jdbc:postgresql://127.0.0.1:5432/ecommerce;USER=postgres;PASSWORD=<password>;JDX_DBTYPE=POSTGRES;DEBUG_LEVEL=3

This will log all SQL operations, showing how the ORM translates object operations into database queries.

Example SQL Output

DEBUG: Executing SQL: SELECT * FROM supplier WHERE id = ?
DEBUG: Executing SQL: SELECT * FROM product WHERE supplierid = ?
DEBUG: Executing SQL: SELECT * FROM orderitem WHERE productid = ?

🚀 Getting Started with Reverse Engineering

1. Prepare Your Database

Ensure your database has:

Proper foreign key constraints
Consistent naming conventions
Complete table definitions
Sample data for testing

2. Run Reverse Engineering

./smart_reverse_engineer.sh

3. Review Generated Classes

Check the src/ directory for generated Java classes:

Entity classes for each table with relationship arrays
Constructors for object instantiation
Relationship arrays enable object navigation without SQL

4. Test with LLM

Use the getObjectModelSummary tool to verify the model structure:

curl -X GET "http://localhost:8081/gilhari/v1/getObjectModelSummary/now"

5. Monitor SQL Generation

Enable debug logging to see how object operations translate to SQL by setting DEBUG_LEVEL=3 in the JDX_DATABASE configuration line.

🎉 Conclusion

Reverse engineering with the Gilhari SDK transforms the complexity of relational databases into intuitive object-oriented models that LLMs can easily understand and manipulate. This approach:

Eliminates SQL complexity for LLM applications
Provides natural object navigation for business relationships
Automatically handles foreign key management
Generates clean Java code with proper relationships
Enables natural language queries without database knowledge

The result is a powerful foundation for AI-powered database applications that can understand and manipulate complex business data using natural language, making database interactions as intuitive as human conversation.

Ready to transform your database into an LLM-friendly object model? Start with the Quick Start Guide and see how reverse engineering can revolutionize your AI applications!

FilesExpand file tree

REVERSE_ENGINEERING.md

Latest commit

History

REVERSE_ENGINEERING.md

File metadata and controls

Reverse Engineering with Gilhari SDK

🎯 Why Reverse Engineering Matters for LLMs

The LLM Challenge

The Gilhari Solution

🔍 How Reverse Engineering Works

1. Database Schema Analysis

2. Relationship Inference

3. Java Class Generation

📊 E-commerce Example: Relationship Analysis

Database Schema

Automatically Inferred Relationships

1. Supplier → Product (1:Many)

2. Customer → CustomerOrder (1:Many)

3. Customer → Address (1:Many)

4. CustomerOrder → OrderItem (1:Many)

5. Product → OrderItem (1:Many)

🚀 LLM MCP Tool Integration

Object Model Summary

Natural Language Queries

🔧 Technical Implementation

Configuration Files

1. Database Configuration (ecommerce_template_postgres.config)

2. ORM Specification (.revjdx files)

3. Class Name Mapping (classnames_map_ecommerce.js)

Generated Java Classes

Entity Class Structure

Post Reverse-Engineering Customization

🎯 Benefits for LLM Applications

1. Natural Language Processing

2. Object-Oriented Thinking

3. Zero SQL Knowledge Required

4. Automatic Relationship Management

5. JSON Integration

🔍 Debugging and Monitoring

SQL Statement Logging

Example SQL Output

🚀 Getting Started with Reverse Engineering

1. Prepare Your Database

2. Run Reverse Engineering

3. Review Generated Classes

4. Test with LLM

5. Monitor SQL Generation

🎉 Conclusion

1. Database Configuration (`ecommerce_template_postgres.config`)

2. ORM Specification (`.revjdx` files)

3. Class Name Mapping (`classnames_map_ecommerce.js`)