An Agentic AI system that automatically analyzes Google Analytics (GA4) traffic data from Excel, computes YOY and Month-over-Month metrics, and generates strictly formatted, non-hallucinated business summaries using Groq LLM.
This project demonstrates safe, production-ready LLM integration by clearly separating deterministic computation from generative reasoning.
- ✅ Multi-table GA4 Excel parsing
- ✅ Automatic YOY % and LM % calculations
- ✅ State-driven agent workflow (INIT → COMPLETE)
- ✅ LLM-generated summaries in exact predefined format
- ✅ Anti-hallucination design (LLM reasoning only)
- ✅ Professional, styled Excel output
- ✅ Fallback logic if LLM is unavailable
GA4 (Google Analytics 4) traffic reports often:
- Contain multiple tables in a single Excel sheet
- Require manual calculations for YOY and MoM
- Depend on human-written summaries, leading to inconsistency
LLMs alone cannot be trusted with numerical accuracy.
👉 This project solves that by:
- Using Python for all calculations
- Using LLM only for explanation
- Enforcing strict output formats
- Producing business-ready reports
Excel Input (GA4 Data)
│
▼
Traffic Analysis Agent (State Machine)
│
┌──────┴────────┐
│ │
▼ ▼
Table Parser Metrics Engine
│ │
└──────┬────────┘
▼
LLM Summary Agent (Groq – LLaMA 3.3)
│
▼
Styled Excel Report (Output)
- Agentic orchestration via state machine
- Deterministic math, generative reasoning
- Fail-safe design with fallback summaries
| State | Description |
|---|---|
| INIT | Initialize agent and config |
| LOAD_DATA | Load Excel GA4 data |
| PARSE_TABLES | Detect and parse multiple tables |
| CALCULATE_METRICS | Compute YOY %, LM %, totals |
| GENERATE_SUMMARIES | LLM-based reasoning |
| CREATE_OUTPUT | Styled Excel report |
| COMPLETE | Successful execution |
(Current Year − Previous Year) / Previous Year × 100
(Current Month − Previous Month) / Previous Month × 100
- Totals calculated only till August
- YOY applied to Total row
- % Change row auto-updated
- All numeric values validated before use
- Model:
llama-3.3-70b-versatile - Purpose: Narrative reasoning only
- Temperature:
0.2 - No calculations performed by LLM
- Reasoning (Aug vs July – 1 sentence)
- Year-over-Year comparison
- Trend analysis
- Important notes
- Explicit data injection into prompt
- Strict formatting rules
- Low temperature
- Python-computed values only
The final Excel report includes:
- 📌 Calculated GA4 tables
- 📌 LLM summaries placed above each table
- 📌 Color-coded trends
- 📌 Business-ready formatting
Ideal for:
- Stakeholder reviews
- Management presentations
- Monthly analytics reporting
- Python
- Pandas / NumPy
- OpenPyXL
- Groq API
- LLaMA 3.3 (70B)
- Excel (GA4 Data)
- The agent loads a GA4 Excel file containing multiple traffic tables
- No assumptions about table count or placement
- The system progresses through explicit states
- Each state has one responsibility
- Failures are isolated and debuggable
This avoids tightly coupled scripts and makes the system predictable.
- Tables are detected dynamically using
"Month"headers - Supports inconsistent Excel formatting
- Duplicate columns handled safely
All calculations are done before invoking the LLM:
- YOY %
- Last Month %
- Totals till August
- % Change rows
✔ Reproducible ✔ Accurate ✔ Excel-equivalent
- LLM receives pre-computed data only
- Generates business summaries
- Enforced exact format
❌ No math ❌ No data invention
- Summaries placed above each table
- Styled Excel formatting
- Stakeholder-ready output
flowchart TD
Start([Start]) --> A[📊 GA4 Excel Input<br/>Multiple Traffic Tables]
A --> B{🤖 Traffic Analysis Agent<br/>State Machine}
B -->|State 1| C[📥 LOAD_DATA]
C -->|pandas.read_excel| C1[Validate Structure]
C1 --> C2[Load All Sheets]
C2 -->|State 2| D[🔍 PARSE_TABLES]
D --> D1[Detect Month Headers]
D1 --> D2[Extract Table Ranges]
D2 --> D3[Separate Multiple Tables]
D3 -->|State 3| E[📐 CALCULATE_METRICS]
E --> E1[🔢 YOY % Calculation<br/>Current vs Last Year]
E --> E2[📊 LM % Calculation<br/>Current vs Last Month]
E --> E3[➕ Total Rows<br/>Sum + % Change]
E1 & E2 & E3 -->|State 4| F[🧠 GENERATE_SUMMARIES]
F --> G[☁️ Groq API Call]
G --> G1[LLaMA 3.3 70B Model]
G1 --> G2[Prompt: Exact Format Instructions]
G2 --> H[📝 Structured Summary]
H --> H1[💡 Reasoning Section]
H --> H2[📈 YOY Analysis]
H --> H3[📉 Trend Observations]
H --> H4[📌 Key Notes]
H1 & H2 & H3 & H4 -->|State 5| I[📦 CREATE_OUTPUT]
I --> I1[Build Excel Workbook]
I1 --> I2[Apply Cell Styling]
I2 --> I3[Format Headers]
I3 --> I4[Add Summary Sections]
I4 --> J[💾 Styled Excel Output<br/>Professional Report]
J --> K([✅ COMPLETE])
%% Styling
classDef input fill:#E3F2FD,stroke:#1E88E5,stroke-width:3px
classDef orchestrator fill:#E8F5E9,stroke:#2E7D32,stroke-width:3px
classDef processing fill:#FFFDE7,stroke:#F9A825,stroke-width:2px
classDef metrics fill:#F3E5F5,stroke:#6A1B9A,stroke-width:2px
classDef llm fill:#FCE4EC,stroke:#C2185B,stroke-width:2px
classDef output fill:#E0F2F1,stroke:#00695C,stroke-width:2px
classDef terminal fill:#ECEFF1,stroke:#37474F,stroke-width:3px
class Start,A input
class B orchestrator
class C,C1,C2,D,D1,D2,D3 processing
class E,E1,E2,E3 metrics
class F,G,G1,G2,H,H1,H2,H3,H4 llm
class I,I1,I2,I3,I4,J output
class K terminal
| Responsibility | Component |
|---|---|
| Math & Logic | Python |
| Orchestration | Agent State Machine |
| Reasoning | LLM |
| Presentation | Excel Builder |
LLMs explain results — they do not compute them.
- Data is injected directly
- Strict formatting rules
- Low temperature
- Fallback summaries
- Dynamic table detection
- Context-aware summaries
- Multi-stage decision flow
- Failure-tolerant execution
- Deterministic execution
- Observable states
- Scalable to more tables
- Business-aligned output
pip install pandas numpy openpyxl python-dotenvexport GROQ_API_KEY=your_api_key_here(Windows)
set GROQ_API_KEY=your_api_key_herepython agentic_main.pyIf no Groq API key is provided:
- The agent still runs
- Summaries are generated using deterministic fallback logic
- No execution failure
- Real-world Agentic AI design
- Safe and responsible LLM usage
- Production-grade data automation
- Explainable AI workflows
- Clean separation of logic and reasoning
Have questions or suggestions?
- 📧 Email: harshitwaldia112@gmail.com
- 🐦 Twitter: @HarshitWaldia
- 💼 LinkedIn: Harshit Waldia
- ⚙️ GitHub: @HarshitWaldia
If you find this project helpful, please consider giving it a star! ⭐
Harshit Waldia
Ahaṁ Brahmāsmi | अहं ब्रह्मास्मि
The true self is not the body but an eternal, infinite part of the universe