brightdata/social-listening-agent

Bright Data     ×     NVIDIA



Eye Data

A social intelligence terminal powered by the world's best web data infrastructure.


Turn any topic into a live map of what the internet thinks — in under a minute.


Live demo · How it works · Architecture · Deploy your own



What is this?

Eye Data is a reference application showing what's possible when you combine Bright Data's web data infrastructure with NVIDIA's NeMo Agent Toolkit and a modern LLM pipeline.

Give it a topic — a company, a product, a brand, a trend — and in ~30 seconds it will:

  1. Discover conversations happening right now on Reddit, X, and LinkedIn (plus the open web)
  2. Analyze sentiment, themes, risks, and opportunities across ~90 real posts
  3. Synthesize them into a handful of narratives that tell you what's actually being said
  4. Visualize the evidence as a searchable graph + timeline so you can drill into every claim

It's a social-listening tool, a brand-intelligence tool, a competitive-intelligence tool, a market-research tool — depending on the topic you give it. The underlying engine doesn't care.


Why Bright Data is the backbone

Social data is only as good as your ability to collect it at scale, reliably, and without getting blocked.

Bright Data is the web data infrastructure this kind of agent needs to exist:

  • SERP API — programmatic Google/Bing/Yandex search, with filters that would take you weeks to replicate
  • Web Unlocker — scrape the sites that block everyone else (Reddit, X, LinkedIn, e-commerce platforms) without managing a single proxy or solving a single CAPTCHA
  • Dataset marketplace — 150+ pre-built datasets for LinkedIn, Amazon, Instagram, TikTok, and more
  • MCP + Agent-native APIs — built from day one for AI agents, not hacked onto a scraper

The web is the largest training-and-grounding corpus in the world. Bright Data is how agents reach it.

Everything that happens below the "LLM decides what to search for" step in this app runs on Bright Data:

┌─────────────────────────────────────┐
│  LLM generates 6 search keywords    │
└────────────────┬────────────────────┘
                 ▼
┌─────────────────────────────────────┐
│  Bright Data SERP API               │
│  → Reddit + X + LinkedIn + Web      │
│  → ~315 results in <15 seconds      │
└────────────────┬────────────────────┘
                 ▼
┌─────────────────────────────────────┐
│  Bright Data Web Unlocker           │
│  → Post-level content extraction    │
│  → Markdown-ready, no CAPTCHAs      │
└────────────────┬────────────────────┘
                 ▼
┌─────────────────────────────────────┐
│  Ranked down to 90 posts            │
│  Ready for the LLM to analyze       │
└─────────────────────────────────────┘
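The two Bright Data hops in the diagram boil down to authenticated HTTP calls. A minimal sketch of the request bodies, assuming the `api.brightdata.com/request` endpoint shape and illustrative zone names (`serp_zone`, `unlocker_zone`) that you would replace with your own account's zones — this is not the repo's actual client code:

```python
from urllib.parse import quote_plus

API_URL = "https://api.brightdata.com/request"  # assumed endpoint shape

def serp_payload(query: str, zone: str = "serp_zone") -> dict:
    """Request body for a SERP API Google search (zone name is illustrative)."""
    return {
        "zone": zone,
        "url": f"https://www.google.com/search?q={quote_plus(query)}",
        "format": "raw",
    }

def unlocker_payload(post_url: str, zone: str = "unlocker_zone") -> dict:
    """Request body for Web Unlocker to fetch one discovered post."""
    return {"zone": zone, "url": post_url, "format": "raw"}

# Sending either payload (token from your Bright Data dashboard):
#   requests.post(API_URL,
#                 headers={"Authorization": f"Bearer {token}"},
#                 json=serp_payload("site:reddit.com acme widgets"))
```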

Why NVIDIA NeMo Agent Toolkit

The orchestration layer is NVIDIA's NeMo Agent Toolkit (NAT) — a production-grade runtime for LangGraph agents that gives us four things for free:

  • OpenAI-compatible API surface (/v1/chat/completions with stream: true) — any OpenAI client works against it
  • First-class WebSocket support for real-time streaming of intermediate steps
  • Automatic OpenTelemetry instrumentation for every LLM call (Langfuse, Langsmith, Phoenix, Grafana Tempo — all one config line away)
  • Plugin system for workflows — we register a langgraph_wrapper workflow that wraps the 8-stage pipeline as a single agent the toolkit can serve

The LLM itself is pluggable: OpenRouter, NVIDIA NIM, local inference — anything speaking the OpenAI API works.
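Because the surface is OpenAI-compatible, kicking off a run is just a standard chat-completions request with `stream: true`. A hedged sketch — the model name and localhost port below are placeholders, not values from this repo:

```python
def chat_request(topic: str) -> dict:
    """Body for NAT's /v1/chat/completions; the model name is a placeholder."""
    return {
        "model": "eye-data",
        "messages": [{"role": "user", "content": topic}],
        "stream": True,
    }

def accumulate(sse_chunks: list[dict]) -> str:
    """Join streamed delta fragments into the final assistant message."""
    return "".join(
        c["choices"][0]["delta"].get("content", "") for c in sse_chunks
    )

# With the official client (pip install openai):
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8001/v1", api_key="not-needed")
#   stream = client.chat.completions.create(**chat_request("Acme Widgets"))
```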


How it works

The 8-stage pipeline

Every run executes these stages as a LangGraph state machine, streaming progress events to the browser as it goes.

| # | Stage | What it does | Data in / out |
|---|-------|--------------|---------------|
| 1 | plan | LLM generates 6 targeted search keywords from the topic. Falls back to heuristics if the LLM stumbles. | topic → keywords[] |
| 2 | search | Bright Data SERP API runs those keywords against Reddit, X, LinkedIn, and open-web discovery. | keywords[] → ~315 raw SERP results |
| 3 | collect | Deduplicates by URL; builds normalized CollectedPost objects. | SERP results → posts[] |
| 4 | rank | Heuristic scoring (topical relevance, recency, engagement, text richness) picks the best ~90 via round-robin across platforms. | posts[] → ranked_posts[] |
| 5 | analyze | LLM classifies each post in batches of 16 (up to 25 concurrent calls): sentiment, key takeaway, risk/opportunity tags, narrative candidates. | posts[] → analyses[] |
| 6 | synthesize | LLM clusters the analyses into 3–10 narratives with title/summary/momentum/sentiment lean. Produces an executive brief (risks, opportunities, recommended actions). | analyses[] → narratives[] + brief |
| 7 | render | Builds three visual artifacts: evidence graph (nodes + edges), timeline tape, sentiment clusters. | narratives[] → artifacts |
| 8 | persist | Parallel writes to Neon Postgres (posts, analyses, narratives, brief, session meta, full event log). | everything → DB |
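The table above is a strictly linear flow, so it can be illustrated without the LangGraph dependency. A dependency-free sketch in which every stage is a stub that threads a shared state dict through the same node order (the real LangGraph nodes do the work described in the table, with streaming and error handling):

```python
from typing import Callable

# Stage order mirrors the table; each node takes and returns the shared state.
STAGES: list[tuple[str, Callable[[dict], dict]]] = [
    ("plan",       lambda s: {**s, "keywords": [s["topic"]]}),        # stub
    ("search",     lambda s: {**s, "serp": []}),                      # stub
    ("collect",    lambda s: {**s, "posts": []}),                     # stub
    ("rank",       lambda s: {**s, "ranked_posts": s["posts"][:90]}),
    ("analyze",    lambda s: {**s, "analyses": []}),                  # stub
    ("synthesize", lambda s: {**s, "narratives": [], "brief": {}}),
    ("render",     lambda s: {**s, "artifacts": {}}),
    ("persist",    lambda s: s),
]

def run_pipeline(topic: str) -> dict:
    state: dict = {"topic": topic, "trace": []}
    for name, node in STAGES:
        state = node(state)
        state["trace"].append(name)  # stands in for CUSTOM_START/END events
    return state
```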

Each stage emits CUSTOM_START / CUSTOM_END OpenTelemetry events with structured progress. The frontend consumes these via SSE and renders the live terminal view.
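On the wire those progress events are ordinary `text/event-stream` frames. A minimal parser for the `data:` lines the frontend would see — the `stage` field name here is illustrative, not the app's actual event schema:

```python
import json

def parse_sse(raw: str) -> list[dict]:
    """Split an event-stream body into JSON payloads, skipping keep-alives."""
    events = []
    for block in raw.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                payload = line[len("data:"):].strip()
                if payload and payload != "[DONE]":
                    events.append(json.loads(payload))
    return events
```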

What ends up in the database

Six tables in Neon Postgres capture the complete run — the session itself, every event it emitted, every post collected, the LLM's analysis of each post, the synthesized narratives, and the executive brief. This means any session is fully replayable after the fact; no need to re-scrape the web.

| Table | Contents |
|-------|----------|
| sessions | Session id, topic, status, progress, artifacts |
| session_events | Structured event log (every stage transition) |
| session_posts | Every collected post + scoring signals |
| session_post_analysis | LLM output per post (sentiment, tags, takeaway) |
| session_narratives | The clustered narratives for the session |
| session_briefs | Risks / opportunities / recommended actions |
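Production uses Neon Postgres via SQLAlchemy and Drizzle, but the replay idea is easy to sketch against stdlib sqlite3. The column names below are illustrative guesses, not the repo's actual schema:

```python
import sqlite3

SCHEMA = """
CREATE TABLE sessions (
    id       TEXT PRIMARY KEY,
    topic    TEXT NOT NULL,
    status   TEXT NOT NULL DEFAULT 'running',
    progress REAL NOT NULL DEFAULT 0.0
);
CREATE TABLE session_events (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(id),
    stage      TEXT NOT NULL,
    payload    TEXT  -- JSON blob with structured progress
);
"""

def replay_events(db: sqlite3.Connection, session_id: str) -> list[str]:
    """Replay a finished run from the event log -- no re-scraping needed."""
    rows = db.execute(
        "SELECT stage FROM session_events WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [stage for (stage,) in rows]
```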

Architecture

        ┌────────────────────────────────────────────────────────────────┐
        │                    demos.brightdata.com/eye-data                │
        │                           (Vercel)                              │
        │                                                                 │
        │  ┌──────────────────────┐          ┌───────────────────────┐    │
        │  │ Next.js 16 App       │          │ Server-only /api/run  │    │
        │  │ React 19 · Tailwind  │◄────────►│ (proxies backend,     │    │
        │  │ Framer Motion        │   SSE    │  hides URL from       │    │
        │  │ react-force-graph-2d │          │  the browser)         │    │
        │  └──────────────────────┘          └───────────┬───────────┘    │
        └──────────────────────────────────────────────┬─┴────────────────┘
                                                       │
                                       HTTPS · stream=true
                                                       │
                                                       ▼
        ┌────────────────────────────────────────────────────────────────┐
        │               AWS Lightsail Container Service                  │
        │                    (us-east-1 · nano tier)                     │
        │                                                                │
        │   ┌────────────────────────────────────────────────────────┐   │
        │   │  NVIDIA NeMo Agent Toolkit                              │   │
        │   │  ├── FastAPI · /v1/chat/completions · /websocket       │   │
        │   │  ├── OpenTelemetry instrumentation (opt-in exporters)   │   │
        │   │  └── LangGraph: 8-stage pipeline (plan→…→persist)       │   │
        │   └────────────────────────────────────────────────────────┘   │
        └──────────┬──────────────────────┬──────────────────────┬───────┘
                   │                      │                      │
                   ▼                      ▼                      ▼
          ┌────────────────┐   ┌──────────────────┐   ┌──────────────────┐
          │   Bright Data  │   │ OpenRouter / NIM │   │  Neon Postgres   │
          │                │   │                  │   │                  │
          │  SERP API      │   │  LLM inference   │   │  Sessions, posts,│
          │  Web Unlocker  │   │  (pluggable)     │   │  narratives,     │
          │                │   │                  │   │  briefs, events  │
          └────────────────┘   └──────────────────┘   └──────────────────┘

Stack at a glance

Frontend

  • Next.js 16 (App Router)
  • React 19, TypeScript
  • Tailwind CSS v4
  • Framer Motion
  • Lucide icons
  • react-force-graph-2d
  • Drizzle ORM + Neon serverless

Backend

  • Python 3.11
  • NVIDIA NeMo Agent Toolkit
  • LangGraph (8-node state machine)
  • LangChain (OpenAI / NVIDIA)
  • NeMo Guardrails
  • FastAPI + SSE
  • SQLAlchemy (async) + asyncpg

Infrastructure

  • Bright Data SERP + Web Unlocker
  • NVIDIA NeMo Agent Toolkit
  • Vercel for the Next.js frontend
  • AWS Lightsail containers for NAT
  • GHCR for container images
  • Neon Postgres for persistence
  • Pluggable: Langfuse, Phoenix, Langsmith for observability

Repository layout

.
├── .github/workflows/vercel.yml   # CI: deploy frontend to Vercel on push
├── eye-data/
│   ├── eye-data-app/              # Next.js 16 frontend (Vercel)
│   │   ├── src/app/               # Pages, API routes
│   │   ├── src/components/        # Terminal, dashboard, evidence graph
│   │   └── src/lib/               # run-transport, nat-protocol, env
│   │
│   └── pipeline/                  # Python backend (AWS Lightsail)
│       ├── pipeline/
│       │   ├── graph/             # LangGraph pipeline + per-stage nodes
│       │   ├── brightdata/        # SERP + Web Unlocker clients
│       │   ├── llm/               # OpenAI-compatible LLM client
│       │   ├── db/                # SQLAlchemy models + repos
│       │   └── prompts/           # Versioned prompt templates
│       ├── nat_config.yaml        # NeMo Agent Toolkit workflow config
│       ├── Dockerfile             # Multi-stage build for Lightsail
│       ├── deploy.sh              # Manual deploy to Lightsail via GHCR
│       └── requirements.lock.txt  # Pinned Python deps (no pip backtracking)

Self-host

Run the frontend locally

cd eye-data/eye-data-app
cp .env.local.example .env.local
# edit .env.local (see the file for every variable)
npm install
npm run dev

The app serves under /eye-data (configurable via NEXT_PUBLIC_BASE_PATH). Visit http://localhost:3000/eye-data.

Run the backend locally

cd eye-data/pipeline
pip install -r requirements.lock.txt
pip install -e . --no-deps
nat serve --config_file nat_config.yaml --port 8001

Or in a container that matches production:

docker build -t eye-data-api .
docker run --rm -p 8080:8080 --env-file ../.env eye-data-api

Deploy

Full deployment uses deploy.sh: it builds a linux/amd64 image, pushes it to GitHub Container Registry, then tells Lightsail to pull and run it. Total first deploy: ~20 minutes. Subsequent deploys (cached layers): ~3 minutes.

See the comment header at the top of deploy.sh for the one-time prerequisites (AWS SSO login, docker login ghcr.io, QEMU binfmt registration for cross-builds).


About the companies behind this

Bright Data: the world's leading web data infrastructure.

Fortune 500 companies, top universities, and every serious AI lab use Bright Data to turn the public web into structured data — compliantly, at massive scale, from any geography.

For AI specifically, Bright Data provides:

  • Real-time web access for agents (MCP + native APIs)
  • SERP results from every major search engine
  • Web Unlocker that bypasses anti-bot systems on the hardest sites
  • 150+ prebuilt datasets for major platforms
  • Full compliance posture (GDPR, CCPA, SOC2, ISO)

If your AI needs to know what the world is doing right now, Bright Data is how it finds out.

NVIDIA NeMo Agent Toolkit: a production runtime for LLM agents.

NAT wraps LangGraph-style agents in a hardened serving layer with OpenAI-compatible APIs, WebSocket streaming, OpenTelemetry instrumentation, and a plugin system for tools, guardrails, and evaluators.

For this app specifically, NAT gives us:

  • /v1/chat/completions with streaming — every OpenAI client works
  • Real-time intermediate-step events (the stage-by-stage progress you see in the UI)
  • Opt-in tracing to Langfuse, Phoenix, Langsmith, and more
  • A LangGraph wrapper that deploys a multi-node pipeline as a single agent

License

This is a reference implementation. See LICENSE for terms. The Bright Data and NVIDIA names and logos are trademarks of their respective owners.


About

Live social intelligence for any topic - combining Bright Data SERP + Web Unlocker, NVIDIA NeMo Agent Toolkit, LangGraph, Next.js, and Neon Postgres.
