Turn any topic into a live map of what the internet thinks — in under a minute.
Eye Data is a reference application showing what's possible when you combine Bright Data's web data infrastructure with NVIDIA's NeMo Agent Toolkit and a modern LLM pipeline.
Give it a topic — a company, a product, a brand, a trend — and in ~30 seconds it will:
- Discover conversations happening right now on Reddit, X, and LinkedIn (plus the open web)
- Analyze sentiment, themes, risks, and opportunities across ~90 real posts
- Synthesize them into a handful of narratives that tell you what's actually being said
- Visualize the evidence as a searchable graph + timeline so you can drill into every claim
It's a social-listening tool, a brand-intelligence tool, a competitive-intelligence tool, a market-research tool — depending on the topic you give it. The underlying engine doesn't care.
Social data is only as good as your ability to collect it at scale, reliably, and without getting blocked. Bright Data is the web data infrastructure this kind of agent needs to exist:
The web is the largest training-and-grounding corpus in the world. Bright Data is how agents reach it.
Everything that happens below the "the LLM decides what to search for" step in this app runs on Bright Data:
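To make that concrete, here is roughly what one site-scoped discovery query looks like from the pipeline's side. This is a sketch only: the zone name, endpoint, and parameter names below are illustrative assumptions, not Bright Data's documented API surface.

```python
import json
import urllib.parse

def build_serp_query(keyword: str, site: str) -> dict:
    """Build one site-scoped discovery query (hypothetical shape).

    The pipeline scopes each keyword to a platform with a `site:` operator,
    e.g. `site:reddit.com "nvidia nemo"`, so one keyword fans out into
    Reddit, X, LinkedIn, and open-web searches.
    """
    q = f"site:{site} {keyword}"
    return {
        "zone": "serp_api",  # assumed zone name, configured in the Bright Data dashboard
        "url": "https://www.google.com/search?" + urllib.parse.urlencode({"q": q}),
        "format": "json",    # ask for structured results instead of raw HTML
    }

query = build_serp_query("nvidia nemo agent toolkit", "reddit.com")
print(json.dumps(query, indent=2))
```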
The orchestration layer is NVIDIA's NeMo Agent Toolkit (NAT) — a production-grade runtime for LangGraph agents that gives us four things for free:
- OpenAI-compatible API surface (`/v1/chat/completions` with `stream: true`) — any OpenAI client works against it
- First-class WebSocket support for real-time streaming of intermediate steps
- Automatic OpenTelemetry instrumentation for every LLM call (Langfuse, LangSmith, Phoenix, Grafana Tempo — all one config line away)
- Plugin system for workflows — we register a `langgraph_wrapper` workflow that wraps the 8-stage pipeline as a single agent the toolkit can serve
The LLM itself is pluggable: OpenRouter, NVIDIA NIM, local inference — anything speaking the OpenAI API works.
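Because the surface is OpenAI-compatible, kicking off a run is just a chat-completions request with the topic as the user message. A minimal sketch of the wire format (the base URL and model name are assumptions for a local `nat serve` on port 8001):

```python
import json

def build_run_request(topic: str) -> tuple[str, bytes]:
    """Build an OpenAI-style chat-completions request that starts a run.

    Shown raw to make the wire format explicit; any OpenAI client builds
    this for you. URL and model name are illustrative, not canonical.
    """
    url = "http://localhost:8001/v1/chat/completions"
    payload = {
        "model": "eye-data",
        "messages": [{"role": "user", "content": topic}],
        "stream": True,  # stream intermediate pipeline events as they happen
    }
    return url, json.dumps(payload).encode()

url, body = build_run_request("NVIDIA NeMo")
```

With the official `openai` client the equivalent is `client.chat.completions.create(..., stream=True)` against `base_url="http://localhost:8001/v1"`.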
The 8-stage pipeline
Every run executes these stages as a LangGraph state machine, streaming progress events to the browser as it goes.
| # | Stage | What it does | Data in / out |
|---|---|---|---|
| 1 | plan | LLM generates 6 targeted search keywords from the topic. Falls back to heuristics if the LLM stumbles. | topic → keywords[] |
| 2 | search | Bright Data SERP API runs those keywords against Reddit, X, LinkedIn, and open-web discovery. | keywords[] → ~315 raw SERP results |
| 3 | collect | Deduplicates by URL; builds normalized CollectedPost objects. | SERP results → posts[] |
| 4 | rank | Heuristic scoring (topical relevance, recency, engagement, text richness) picks the best ~90 via round-robin across platforms. | posts[] → ranked_posts[] |
| 5 | analyze | LLM classifies each post in batches of 16 (up to 25 concurrent calls): sentiment, key takeaway, risk/opportunity tags, narrative candidates. | posts[] → analyses[] |
| 6 | synthesize | LLM clusters the analyses into 3–10 narratives with title/summary/momentum/sentiment lean. Produces an executive brief (risks, opportunities, recommended actions). | analyses[] → narratives[] + brief |
| 7 | render | Builds three visual artifacts: evidence graph (nodes + edges), timeline tape, sentiment clusters. | narratives[] → artifacts |
| 8 | persist | Parallel writes to Neon Postgres (posts, analyses, narratives, brief, session meta, full event log). | everything → DB |
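The rank stage's round-robin can be sketched in a few lines. Assume posts arrive already scored by the heuristics above; the field names and the exact picking order here are illustrative, not the pipeline's actual code:

```python
from collections import defaultdict

def round_robin_rank(posts: list[dict], limit: int = 90) -> list[dict]:
    """Pick the top `limit` posts, alternating across platforms so no
    single platform dominates. Each post: {"platform": str, "score": float}.
    """
    by_platform: dict[str, list[dict]] = defaultdict(list)
    for p in posts:
        by_platform[p["platform"]].append(p)
    for queue in by_platform.values():
        queue.sort(key=lambda p: p["score"], reverse=True)  # best first

    ranked: list[dict] = []
    queues = list(by_platform.values())
    while len(ranked) < limit and any(queues):
        for queue in queues:  # one pick per platform per pass
            if queue and len(ranked) < limit:
                ranked.append(queue.pop(0))
    return ranked

posts = [{"platform": p, "score": s}
         for p in ("reddit", "x", "linkedin") for s in (0.9, 0.5)]
top = round_robin_rank(posts, limit=4)
```

The first pass takes each platform's best post before any platform gets a second slot, which is what keeps the final ~90 balanced.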
Each stage emits CUSTOM_START / CUSTOM_END OpenTelemetry events with structured progress. The frontend consumes these via SSE and renders the live terminal view.
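On the wire those progress events arrive as standard SSE frames. A sketch of the filtering the frontend does (the payload field names are assumptions for illustration):

```python
import json

def parse_progress_events(sse_body: str) -> list[dict]:
    """Extract structured progress events from an SSE stream body.

    Keeps only frames whose payload is a CUSTOM_START / CUSTOM_END
    pipeline event; field names here are illustrative.
    """
    events = []
    for line in sse_body.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank keep-alives and comment lines
        raw = line[len("data: "):]
        if raw == "[DONE]":
            break
        payload = json.loads(raw)
        if payload.get("type") in ("CUSTOM_START", "CUSTOM_END"):
            events.append(payload)
    return events

sample = (
    'data: {"type": "CUSTOM_START", "stage": "plan"}\n'
    "\n"
    'data: {"type": "CUSTOM_END", "stage": "plan", "keywords": 6}\n'
    "data: [DONE]\n"
)
stages = [e["stage"] for e in parse_progress_events(sample)]
```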
What ends up in the database
Six tables in Neon Postgres capture the complete run — the session itself, every event it emitted, every post collected, the LLM's analysis of each post, the synthesized narratives, and the executive brief. This means any session is fully replayable after the fact; no need to re-scrape the web.
| Table | Contents |
|---|---|
| `sessions` | Session id, topic, status, progress, artifacts |
| `session_events` | Structured event log (every stage transition) |
| `session_posts` | Every collected post + scoring signals |
| `session_post_analysis` | LLM output per post (sentiment, tags, takeaway) |
| `session_narratives` | The clustered narratives for the session |
| `session_briefs` | Risks / opportunities / recommended actions |
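Because every stage transition lands in `session_events`, a run can be inspected offline. A toy sketch of computing per-stage durations from stored rows (the row shape and timestamp field are assumptions for illustration):

```python
def stage_durations(events: list[dict]) -> dict[str, float]:
    """Pair each stage's start/end events and return seconds per stage.

    `events` mimics rows from session_events: {"type", "stage", "ts"}.
    """
    starts: dict[str, float] = {}
    durations: dict[str, float] = {}
    for e in sorted(events, key=lambda e: e["ts"]):
        if e["type"] == "CUSTOM_START":
            starts[e["stage"]] = e["ts"]
        elif e["type"] == "CUSTOM_END" and e["stage"] in starts:
            durations[e["stage"]] = e["ts"] - starts.pop(e["stage"])
    return durations

log = [
    {"type": "CUSTOM_START", "stage": "plan", "ts": 0.0},
    {"type": "CUSTOM_END", "stage": "plan", "ts": 1.5},
    {"type": "CUSTOM_START", "stage": "search", "ts": 1.5},
    {"type": "CUSTOM_END", "stage": "search", "ts": 9.0},
]
```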
```
┌────────────────────────────────────────────────────────────────┐
│                 demos.brightdata.com/eye-data                  │
│                            (Vercel)                            │
│                                                                │
│  ┌──────────────────────┐          ┌───────────────────────┐   │
│  │ Next.js 16 App       │          │ Server-only /api/run  │   │
│  │ React 19 · Tailwind  │◄────────►│ (proxies backend,     │   │
│  │ Framer Motion        │   SSE    │ hides URL from        │   │
│  │ react-force-graph-2d │          │ the browser)          │   │
│  └──────────────────────┘          └───────────┬───────────┘   │
└────────────────────────────────────────────────┼───────────────┘
                                                 │
                                        HTTPS · stream=true
                                                 │
                                                 ▼
┌────────────────────────────────────────────────────────────────┐
│                AWS Lightsail Container Service                 │
│                   (us-east-1 · nano tier)                      │
│                                                                │
│  ┌────────────────────────────────────────────────────────┐    │
│  │ NVIDIA NeMo Agent Toolkit                              │    │
│  │ ├── FastAPI · /v1/chat/completions · /websocket        │    │
│  │ ├── OpenTelemetry instrumentation (opt-in exporters)   │    │
│  │ └── LangGraph: 8-stage pipeline (plan→…→persist)       │    │
│  └────────────────────────────────────────────────────────┘    │
└──────────┬──────────────────────┬──────────────────────┬───────┘
           │                      │                      │
           ▼                      ▼                      ▼
   ┌────────────────┐   ┌──────────────────┐   ┌──────────────────┐
   │  Bright Data   │   │ OpenRouter / NIM │   │  Neon Postgres   │
   │                │   │                  │   │                  │
   │  SERP API      │   │  LLM inference   │   │  Sessions, posts,│
   │  Web Unlocker  │   │  (pluggable)     │   │  narratives,     │
   │                │   │                  │   │  briefs, events  │
   └────────────────┘   └──────────────────┘   └──────────────────┘
```
```
.
├── .github/workflows/vercel.yml   # CI: deploy frontend to Vercel on push
├── eye-data/
│   ├── eye-data-app/              # Next.js 16 frontend (Vercel)
│   │   ├── src/app/               # Pages, API routes
│   │   ├── src/components/        # Terminal, dashboard, evidence graph
│   │   └── src/lib/               # run-transport, nat-protocol, env
│   │
│   └── pipeline/                  # Python backend (AWS Lightsail)
│       ├── pipeline/
│       │   ├── graph/             # LangGraph pipeline + per-stage nodes
│       │   ├── brightdata/        # SERP + Web Unlocker clients
│       │   ├── llm/               # OpenAI-compatible LLM client
│       │   ├── db/                # SQLAlchemy models + repos
│       │   └── prompts/           # Versioned prompt templates
│       ├── nat_config.yaml        # NeMo Agent Toolkit workflow config
│       ├── Dockerfile             # Multi-stage build for Lightsail
│       ├── deploy.sh              # Manual deploy to Lightsail via GHCR
│       └── requirements.lock.txt  # Pinned Python deps (no pip backtracking)
```
```bash
cd eye-data/eye-data-app
cp .env.local.example .env.local
# edit .env.local (see the file for every variable)
npm install
npm run dev
```

The app serves under `/eye-data` (configurable via `NEXT_PUBLIC_BASE_PATH`). Visit http://localhost:3000/eye-data.
```bash
cd eye-data/pipeline
pip install -r requirements.lock.txt
pip install -e . --no-deps
nat serve --config_file nat_config.yaml --port 8001
```

Or in a container that matches production:

```bash
docker build -t eye-data-api .
docker run --rm -p 8080:8080 --env-file ../.env eye-data-api
```

Full deployment uses:
- Frontend → Vercel (GitHub Actions workflow in `.github/workflows/vercel.yml`)
- Backend → AWS Lightsail Container Service, deployed via `eye-data/pipeline/deploy.sh`
- Database → Neon Postgres (managed)
The deploy script builds a linux/amd64 image, pushes it to GitHub Container Registry, then tells Lightsail to pull and run. Total first deploy: ~20 minutes. Subsequent deploys (cached layers): ~3 minutes.
See the comment header at the top of `deploy.sh` for the one-time prerequisites (AWS SSO login, `docker login ghcr.io`, QEMU binfmt registration for cross-builds).
The world's leading web data infrastructure. Fortune 500 companies, top universities, and every serious AI lab use Bright Data to turn the public web into structured data — compliantly, at massive scale, from any geography. For AI specifically, Bright Data provides:
If your AI needs to know what the world is doing right now, Bright Data is how it finds out.
Production runtime for LLM agents. NAT wraps LangGraph-style agents in a hardened serving layer with OpenAI-compatible APIs, WebSocket streaming, OpenTelemetry instrumentation, and a plugin system for tools, guardrails, and evaluators. For this app specifically, NAT gives us:
This is a reference implementation. See LICENSE for terms. The Bright Data and NVIDIA names and logos are trademarks of their respective owners.
Built with Bright Data + NVIDIA NeMo Agent Toolkit