BA Helper is a specialized impact analyzer for backend teams. It bridges the gap between changing business requirements and backend architecture. In research contexts, the engine is referred to as ReqImpact.
When a business requirement changes (e.g., "allow users to cancel paid bookings for a refund"), Technical Business Analysts (BAs) and QA Engineers must manually trace how that change cascades through the backend codebase. This process is slow, depends heavily on tribal knowledge, and leaves no immutable audit trail, which often results in missed edge cases and unhandled regression risks.
BA Helper automates the heavy lifting of traceability while enforcing strict human oversight. Given a requirement change and a codebase snapshot, the system provides a complete audited workflow:
- Extraction: Parses backend code and constructs an evidence-first impact graph.
- Analysis: Exposes unknowns, risks, and targeted QA scenarios.
- Human Review: Forces an analyst to explicitly accept or reject every proposed traceability link.
- Snapshot: Freezes the reviewed decisions into an immutable reviewed snapshot.
- Final Export: Generates a deterministic, audited markdown report directly from the locked snapshot.
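As a rough illustration of the Human Review and Snapshot steps above, the sketch below models review decisions and a frozen snapshot in memory. The names `TraceLink`, `ReviewDecision`, `decide`, and `freezeSnapshot` are hypothetical and not taken from the BA Helper codebase, which persists these entities in the database rather than in plain objects.

```typescript
// Hypothetical shapes for illustration only; not the project's real schema.
type ReviewDecision = "PENDING" | "ACCEPTED" | "REJECTED";

interface TraceLink {
  id: string;
  artifact: string; // e.g. a file path or handler name found by the scanner
  evidenceId: string; // every link points at a persisted evidence record
  decision: ReviewDecision;
}

interface ReviewedSnapshot {
  readonly createdAt: string;
  readonly links: readonly Readonly<TraceLink>[];
}

// Record an analyst decision without mutating the input list.
function decide(
  links: TraceLink[],
  id: string,
  decision: "ACCEPTED" | "REJECTED",
): TraceLink[] {
  return links.map((l) => (l.id === id ? { ...l, decision } : l));
}

// Copy and freeze the links so later live edits cannot alter the snapshot.
function freezeSnapshot(links: TraceLink[], createdAt: string): ReviewedSnapshot {
  const frozen = links.map((l) => Object.freeze({ ...l }));
  return Object.freeze({ createdAt, links: Object.freeze(frozen) });
}
```

Because `freezeSnapshot` copies each link before freezing it, changing a decision in the live list afterwards leaves the snapshot untouched, which is the property the Snapshot step relies on.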
Unlike generic AI coding assistants or repo chatbots:
- No Hallucinated Claims: Every insight must link to a persisted code Evidence record.
- Stateful & Persistent: It generates structured, queryable entities (Traceability Links, Evidence, Decisions).
- Human-in-the-Loop: It does not blindly trust AI output. The LLM acts as an analytical reader, and a human acts as the mandatory approver.
- Audit-Style Gating: You cannot download a final report until every single link is manually reviewed and a snapshot is locked.
Our analysis is strictly constrained to prevent hallucinations and fabricated claims:
- Immutable Snapshots: Once a snapshot is taken, the historical record cannot be altered by subsequent live edits.
- Gated Exports: The system strictly blocks final exports if unreviewed links exist or if the snapshot is missing.
- No AI in Final Export: The final markdown report is generated strictly from the frozen database payload, with zero active LLM calls or retrieval processes during the export phase.
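The gating and export rules above can be pictured with a small sketch. It assumes a hypothetical `FrozenSnapshot` payload and two illustrative functions, `checkExportGate` and `renderReport`; the actual API and report layout in BA Helper may look quite different.

```typescript
// Hypothetical types; the real payload lives in the database.
interface SnapshotLink {
  id: string;
  artifact: string;
  decision: "ACCEPTED" | "REJECTED";
}

interface FrozenSnapshot {
  id: string;
  requirement: string;
  links: readonly SnapshotLink[];
}

type GateResult =
  | { allowed: true }
  | { allowed: false; reason: "SNAPSHOT_MISSING" | "UNREVIEWED_LINKS" };

// Gate: block the export unless a snapshot exists and nothing is left unreviewed.
function checkExportGate(
  snapshot: FrozenSnapshot | null,
  unreviewedCount: number,
): GateResult {
  if (snapshot === null) return { allowed: false, reason: "SNAPSHOT_MISSING" };
  if (unreviewedCount > 0) return { allowed: false, reason: "UNREVIEWED_LINKS" };
  return { allowed: true };
}

// Pure rendering: same snapshot in, same Markdown out. No LLM or retrieval calls.
function renderReport(snapshot: FrozenSnapshot): string {
  const rows = [...snapshot.links]
    // Plain code-unit ordering avoids locale-dependent sorting.
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0))
    .map((l) => `| ${l.id} | ${l.artifact} | ${l.decision} |`);
  return [
    `# Impact Report (${snapshot.id})`,
    "",
    `Requirement: ${snapshot.requirement}`,
    "",
    "| Link | Artifact | Decision |",
    "| --- | --- | --- |",
    ...rows,
  ].join("\n");
}
```

Keeping `renderReport` a pure function of the snapshot is one simple way to guarantee that re-exporting the same snapshot yields the same report.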
The primary golden path demo validates the core evidence-first pipeline. The complete audited workflow involves:
scan → impact analysis → evidence → review → snapshot → async report → drift → rerun lineage.
You can run the definitive automated integration test for the focused TypeScript/NestJS demo path:
pnpm demo:golden-path
Visual Case Study: For a step-by-step visual walkthrough of this workflow, see the Demo Case Study, which features an 8-screen proof pack demonstrating the full end-to-end audit and lifecycle process.
Sample Requirement:
"When a paid booking is cancelled, the system must refund the tenant, prevent double refunds, update booking/payment state, and notify relevant parties."
Built as a TypeScript modular monolith to balance speed of development with eventual microservice readiness:
- Frontend: Next.js App Router, Tailwind CSS, Shadcn UI (React 19).
- Backend API: NestJS HTTP API serving frontend requests.
- Workers: NestJS BullMQ background processors for heavy analysis and extraction.
- Persistence: PostgreSQL (Prisma) for relational state and pgvector for embeddings. Redis for job queues.
- Contracts: Shared Zod API schemas bounding the frontend and backend.
This reviewed snapshot behavior is covered by invariant test suites:
- E17A Backend Tests: Asserts that missing snapshots and unreviewed links block the gate at the API level, and that final reports are derived purely from snapshot payloads.
- E17B Frontend Tests: MSW/JSDOM UI test suites assert that incomplete gate states visually disable export functionality, and complete states correctly dispatch the frozen markdown Blob to the user.
graph TD
A[Requirement Change] --> B(Repository Snapshot & Scan Health)
B --> C{Evidence-first Impact Analysis}
C -->|Domain Pack Hints| D[Evidence-backed Impacted Artifacts]
C -->|Missing Code| E[Unknowns / Risks / QA Scenarios]
D --> F[Human Review Gate]
E --> F
F --> G[Traceability Report]
G -.-> H[Drift / Freshness Warning]
graph BT
A[Scanned Code Evidence] -->|Base Truth| B[Human Review Finalization]
C[Domain Pack Hints] -.->|Guides Search| B
D[LLM Suggestions] -.->|Structures Claims| B
Note: EVIDENCED impacts require Scanned Code Evidence. Domain Packs and LLM Suggestions cannot fabricate evidence.
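One way to read the evidence hierarchy above is as a classification rule: only scanned code can make a claim EVIDENCED, while hints and suggestions never upgrade it. The sketch below is an assumption-laden illustration; `SupportSource`, `ImpactClaim`, and `classify` are hypothetical names, and the real system tracks more states (for example RISK) than the two shown here.

```typescript
// Hypothetical model of the evidence hierarchy; names are illustrative.
type SupportSource = "SCANNED_CODE" | "DOMAIN_PACK_HINT" | "LLM_SUGGESTION";

interface Support {
  source: SupportSource;
  ref: string; // e.g. an evidence record id, hint id, or suggestion id
}

interface ImpactClaim {
  artifact: string;
  support: Support[];
}

type ImpactStatus = "EVIDENCED" | "UNKNOWN";

// Only scanned code counts as evidence; hints and suggestions never upgrade a claim.
function classify(claim: ImpactClaim): ImpactStatus {
  const hasScannedEvidence = claim.support.some(
    (s) => s.source === "SCANNED_CODE",
  );
  return hasScannedEvidence ? "EVIDENCED" : "UNKNOWN";
}
```

A claim backed by any number of domain pack hints or LLM suggestions, but no scanned code, stays UNKNOWN under this rule and would still need human review.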
We designed this project to be highly reproducible locally. No real LLM or embedding API keys are required to run the automated demo test or spin up the platform.
Fresh clone validation:
[ ] pnpm install works
[ ] local DB starts
[ ] migrations apply
[ ] typecheck passes
[ ] golden path demo passes
[ ] no external AI keys required
- Docker & Docker Compose (for Postgres/pgvector and Redis)
- Node.js (v20+)
- pnpm (v9+)
git clone https://github.com/hungthinh1104/BA_Helper.git
cd BA_Helper
pnpm install
Create the environment files from their examples. The examples contain safe, pre-configured local placeholders (including a fake AI provider).
cp apps/api/.env.example apps/api/.env
cp apps/web/.env.example apps/web/.env.local
For containerized web runtime, keep two URLs straight:
- NEXT_PUBLIC_API_URL: browser-visible API origin, usually http://localhost:3001
- INTERNAL_API_URL: server-side API origin inside the web container, usually http://api:3001
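To make the distinction between the two URLs concrete, here is a minimal sketch of how a web app might pick between them. `ApiEnv` and `resolveApiBaseUrl` are illustrative names, and the fallback behaviour is an assumption rather than the project's actual logic.

```typescript
// Sketch only: `ApiEnv` and `resolveApiBaseUrl` are illustrative names.
interface ApiEnv {
  NEXT_PUBLIC_API_URL?: string;
  INTERNAL_API_URL?: string;
}

// Server-side code inside the web container prefers the internal origin;
// browser code can only reach the public one.
function resolveApiBaseUrl(env: ApiEnv, isServer: boolean): string {
  // Assumed default, taken from the "usually" value mentioned above.
  const publicUrl = env.NEXT_PUBLIC_API_URL ?? "http://localhost:3001";
  if (!isServer) return publicUrl;
  // Assumption: fall back to the public URL when no internal origin is set.
  return env.INTERNAL_API_URL ?? publicUrl;
}
```

In a Next.js app, `isServer` is commonly derived from `typeof window === "undefined"`. Passing it in as a parameter keeps the function easy to test.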
Launch the Postgres and Redis containers in the background:
docker compose up -d postgres redis
Apply the Prisma schema to your local Postgres database:
pnpm --dir apps/api exec prisma generate
pnpm --dir apps/api exec prisma migrate deploy --schema prisma/schema.prisma
We provide an idempotent seed script to populate a realistic "Booking Cancellation" scenario directly into the database. This is the fastest way to experience the Human Review Gate and Export workflow without external LLM keys.
- See the Local Demo Runbook for full setup.
- Run pnpm db:migrate and pnpm db:seed:demo.
- Follow the Demo Acceptance Checklist to walk through the UI.
Run the automated integration test to verify the deterministic, end-to-end impact analyzer flow programmatically using a fake LLM provider.
pnpm demo:golden-path
# Or explicitly: pnpm test tests/demo/golden-path-demo.spec.ts
Note: This automated command runs entirely locally using FakeLlmProvider and FakeEmbeddingProvider so CI stays deterministic. The manual UI demo uses Gemini when AI_PROVIDER=google and GEMINI_API_KEY or GOOGLE_API_KEY is set.
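The provider rule in the note above can be sketched as a small selection function. This is an illustration under assumptions: `ProviderEnv`, `LlmChoice`, and `selectLlmProvider` are hypothetical names, the silent fallback to the fake provider when no key is present is assumed, and so is the precedence of GEMINI_API_KEY over GOOGLE_API_KEY. The project's actual selection logic may differ.

```typescript
// Illustrative only; the project's actual selection logic may differ.
interface ProviderEnv {
  AI_PROVIDER?: string;
  GEMINI_API_KEY?: string;
  GOOGLE_API_KEY?: string;
}

type LlmChoice = { kind: "fake" } | { kind: "google"; apiKey: string };

// Use Gemini only when it is explicitly requested AND a key is present.
function selectLlmProvider(env: ProviderEnv): LlmChoice {
  // Assumption: GEMINI_API_KEY wins when both keys are set.
  const apiKey = env.GEMINI_API_KEY || env.GOOGLE_API_KEY;
  if (env.AI_PROVIDER === "google" && apiKey) {
    return { kind: "google", apiKey };
  }
  // Assumed fallback: anything else resolves to the deterministic fake provider.
  return { kind: "fake" };
}
```

Defaulting to the fake provider keeps CI and fresh clones deterministic even when no keys are configured.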
If you wish to test the retrieval and domain matching logic explicitly:
pnpm test tests/evaluation/impact-evaluation.spec.ts
If you wish to run the full UI and Backend locally:
# Start backend API (Port 3001)
pnpm dev:api
# Start background worker
pnpm dev:worker
# Start frontend web app (Port 3000)
pnpm dev:web
Open http://localhost:3000/login and sign in using the dev-login bypass.
The default CI and golden path stay on fake providers. Real-provider smoke is explicit and manual:
# Deterministic local smoke
pnpm --dir apps/api smoke:public-github
# Real Gemini LLM + fake embeddings
AI_PROVIDER=google EMBEDDING_PROVIDER=fake pnpm --dir apps/api smoke:public-github:real-llm
# Real Gemini LLM + Google embeddings
AI_PROVIDER=google EMBEDDING_PROVIDER=google pnpm --dir apps/api smoke:public-github:real-path
When running the containerized stack, use the dedicated migration owner first:
docker compose up -d --build migrate api worker web
This compose topology now matches the current project shape:
- migrate owns schema deployment
- api serves the backend on 3001
- worker handles queued jobs
- web serves the Next.js frontend on 3000
Avoid docker compose config in shared logs when real provider keys are loaded in your shell, because Compose expands current environment values into the resolved output.
- Database Connection Fails: Ensure Docker is running. The default .env.example points to postgresql://ba_helper:ba_helper@localhost:5432/ba_helper, which matches the docker-compose.yml credentials.
- Fixture Path Not Found: If you see "0 artifacts extracted" in the demo test, ensure you did not modify the tests/fixtures/nestjs-booking-with-payment directory structure.
- Prisma Client Issues: If types are out of sync or tests fail to compile, run pnpm --dir apps/api prisma generate to refresh the client.
- Port Conflicts: Ensure ports 3000 (Web), 3001 (API), 5432 (Postgres), and 6379 (Redis) are free on your host machine.
We prioritize keeping your proprietary code safe without overclaiming formal security certifications:
- No Remote Code Execution: The scanner performs static regex and AST-based extraction. It never executes your repository code.
- Production Failsafe: The application is hardened to fail fast if critical environment variables are missing or set to weak development defaults in production.
- No Raw Vectors: No raw embedding vectors are dumped in diagnostics or reports.
- Bounded Diagnostics: Scans are bounded by file size and count limits to prevent OOM errors.
- Evidence Hierarchy: Strict constraints to prevent orphaned AI claims.
- Review Gate: Manual human-in-the-loop review ensures safe outputs.
- Snapshot-Scoped Embedding Reuse: Vectors are tightly scoped to a specific repository snapshot commit; no old snapshot chunk leakage is permitted.
- Safe Fallback: Unrecognized domains fall back to the general@0.0.0 domain pack.
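The Production Failsafe item in the list above describes a fail-fast startup check. A minimal sketch of that idea follows; the variable names (`DATABASE_URL`, `JWT_SECRET`) and the list of weak values are hypothetical examples chosen for illustration, not the variables BA Helper actually validates.

```typescript
// Hypothetical variable names and weak-default list, for illustration only.
const REQUIRED_VARS = ["DATABASE_URL", "JWT_SECRET"] as const;
const WEAK_VALUES = new Set(["", "changeme", "secret", "dev-secret"]);

type Env = Record<string, string | undefined>;

// Collect every problem so the operator sees them all at once.
function validateProductionEnv(env: Env): string[] {
  if (env["NODE_ENV"] !== "production") return []; // only enforced in production
  const problems: string[] = [];
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (value === undefined) {
      problems.push(`${name} is missing`);
    } else if (WEAK_VALUES.has(value.trim())) {
      problems.push(`${name} uses a weak development default`);
    }
  }
  return problems;
}

// Fail fast: refuse to start rather than run with an unsafe configuration.
function assertProductionEnv(env: Env): void {
  const problems = validateProductionEnv(env);
  if (problems.length > 0) {
    throw new Error(`Refusing to start: ${problems.join("; ")}`);
  }
}
```

Calling `assertProductionEnv` once at process startup, before any server or worker is created, keeps a misconfigured deployment from ever accepting traffic.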
Built as a TypeScript modular monolith to balance speed of development with eventual microservice readiness:
- apps/web: Next.js App Router frontend (React, Tailwind, Shadcn).
- apps/api: NestJS HTTP API serving frontend requests.
- apps/worker: NestJS BullMQ background processors for heavy analysis and extraction.
- packages/analyzer: Headless static extraction utilities with explicit scanner capability metadata.
- packages/contracts: Shared Zod API schemas bounding the frontend and backend.
- Persistence: PostgreSQL (Prisma) for relational state and pgvector for embeddings. Redis for job queues.
For more details, see Architecture Documentation.
- Primary demo stack: TypeScript/NestJS is the strongest and STABLE scanner path.
- Pilot scanner adapters: Java/Spring Boot is PARTIAL; Go net/http, Go/Gin, Python/FastAPI, C#/ASP.NET Core, PHP/Laravel, and Ruby/Rails are EXPERIMENTAL capability proofs.
- Capability metadata: Every scan exposes SCANNER_CAPABILITY_SUMMARY so reviewers can see whether a result came from a STABLE, PARTIAL, or EXPERIMENTAL adapter.
- Output generation: Impact matrices, QA scenarios, unknown/risk tracking, human review gates, deterministic snapshot-sourced Markdown/PDF exports, and drift-aware lineage reports.
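The capability statuses listed above lend themselves to a simple lookup. The sketch below is illustrative: the adapter keys, the `CapabilitySummary` shape, and the `requiresExtraReview` flag are assumptions, and the real SCANNER_CAPABILITY_SUMMARY payload may be structured differently.

```typescript
type CapabilityStatus = "STABLE" | "PARTIAL" | "EXPERIMENTAL";

// Statuses as listed above; the adapter keys are illustrative identifiers.
const ADAPTER_STATUS: { [adapter: string]: CapabilityStatus | undefined } = {
  "typescript/nestjs": "STABLE",
  "java/spring-boot": "PARTIAL",
  "go/net-http": "EXPERIMENTAL",
  "go/gin": "EXPERIMENTAL",
  "python/fastapi": "EXPERIMENTAL",
  "csharp/aspnet-core": "EXPERIMENTAL",
  "php/laravel": "EXPERIMENTAL",
  "ruby/rails": "EXPERIMENTAL",
};

// Hypothetical summary shape attached to a scan result.
interface CapabilitySummary {
  adapter: string;
  status: CapabilityStatus;
  requiresExtraReview: boolean;
}

function summarizeCapability(adapter: string): CapabilitySummary {
  // Assumption: unknown adapters are treated as EXPERIMENTAL, the most cautious status.
  const status = ADAPTER_STATUS[adapter] ?? "EXPERIMENTAL";
  return { adapter, status, requiresExtraReview: status !== "STABLE" };
}
```

Surfacing the status next to each result lets a reviewer weigh a finding from an EXPERIMENTAL adapter differently from one produced by the STABLE TypeScript/NestJS path.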
- TypeScript/NestJS is the strongest scanner path.
- Multi-language adapters are bounded pilots. They demonstrate deterministic extraction contracts, not full compiler-level semantic analysis.
- Unsupported route patterns, file scan blind spots, artifact uncertainty, and dependency boundaries become diagnostics, UNKNOWN, or RISK items requiring review.
- Experimental scanners must not be presented as production-grade language support.
- Domain packs are hints, not evidence.
- LLM output is constrained by extracted evidence and human review; it is not allowed to finalize reports by itself.
- Evaluation metrics are internal quality signals, not public benchmarks.
- The automated CI golden path uses fake providers; the manual UI demo uses the real Gemini LLM when configured.
- Production SaaS concerns such as GitHub App auth, billing, and hosted multi-tenant deployment are not complete.
- Keep TypeScript/NestJS as the primary public demo story.
- Harden pilot scanner adapters while keeping capability status explicit.
- Improve visual review and traceability flows without weakening the evidence hierarchy.
- Native OAuth and GitHub App integrations.
- Golden Path Demo Guide
- Sample Requirement Change
- Public Beta Release Note
- Portfolio Proof Pack
- Public Demo Checklist
- Impact Evaluation Docs
- Domain Pack Architecture
- Security Policy
- Contributing Guide
Please see our agent rules and coding standards before submitting pull requests. All code must adhere to the modular monolith boundaries and state machine invariants.