A Python CLI tool for enriching lead data from CSV files with company information, AI-powered ICP scoring, and export capabilities. Select automates the process of taking a simple list of company names and domains, then enriching each lead with descriptions, contact emails, ideal customer profile scores, and recommended outreach strategies.
The pipeline supports multiple enrichment sources (Clearbit API, Hunter.io, web scraping) with automatic fallback, concurrent processing for speed, and both CLI and Streamlit dashboard interfaces. It uses Pydantic models for data validation, httpx for async HTTP, and integrates with OpenAI or local Ollama models for intelligent lead scoring.
┌─────────┐ ┌────────────┐ ┌────────────┐ ┌─────────┐ ┌────────┐
│ CLI │───▶│ Ingestion │───▶│ Enrichment │───▶│ Scoring │───▶│ Export │
│ (Typer) │ │ (CSV) │ │ (API/LLM) │ │ (ICP) │ │ (CSV) │
└─────────┘ └────────────┘ └────────────┘ └─────────┘ └────────┘
│
▼
┌───────────┐
│ Dashboard │
│(Streamlit)│
└───────────┘
Pipeline stages:
- Ingestion — Reads CSV input, validates rows via Pydantic, skips malformed data
- Enrichment — Calls Clearbit/Hunter.io APIs, falls back to web scraping on failure
- Scoring — Uses OpenAI (gpt-4o-mini) or Ollama (llama3.2) to classify ICP score (1-5) and recommend outreach approach
- Export — Writes enriched leads to CSV (and optionally Notion)
- Dashboard (optional) — Streamlit UI for upload, view, filter, and download
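The first two stages above can be sketched in a few lines. This is an illustrative, stdlib-only stand-in, not the project's actual code: the `Lead` dataclass substitutes for the Pydantic model, and `read_leads`, `enrich`, and the `Source` type are hypothetical names. It shows malformed rows being skipped during ingestion and enrichment sources being tried in order until one succeeds.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Lead:
    # Hypothetical stand-in for the project's Pydantic model.
    company_name: str
    domain: Optional[str] = None
    industry: Optional[str] = None
    description: Optional[str] = None
    enrichment_source: Optional[str] = None
    error: Optional[str] = None


def _clean(value: Optional[str]) -> Optional[str]:
    """Strip whitespace and turn empty or missing cells into None."""
    value = (value or "").strip()
    return value or None


def read_leads(csv_text: str) -> list[Lead]:
    """Ingestion: keep rows with a non-empty company_name, skip the rest."""
    leads = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = _clean(row.get("company_name"))
        if name is None:
            continue  # malformed row: skipped rather than aborting the run
        leads.append(Lead(name, _clean(row.get("domain")), _clean(row.get("industry"))))
    return leads


# A source takes a lead and returns a description, or raises on failure.
Source = Callable[[Lead], str]


def enrich(lead: Lead, sources: list[tuple[str, Source]]) -> Lead:
    """Enrichment: try each named source in order, falling back on failure."""
    for name, fetch in sources:
        try:
            lead.description = fetch(lead)
            lead.enrichment_source = name
            lead.error = None
            return lead
        except Exception as exc:  # assumption: any failure triggers fallback
            lead.error = f"{name}: {exc}"
    return lead  # every source failed; the last error is kept on the lead
```

In a real run the `sources` list would hold the API clients followed by the scraper, so the scraper is only reached when the API calls fail.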
Prerequisites:
- Python 3.11+
- pip
# Clone the repository
git clone https://github.com/keyarr/Select.git
cd Select
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # Linux/macOS
# .venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
Create a .env file in the project root:
# Required for OpenAI scoring (optional if using Ollama)
OPENAI_API_KEY=sk-your-openai-key
# Optional: Clearbit API for company enrichment
CLEARBIT_API_KEY=your-clearbit-key
# Optional: Hunter.io API for contact email lookup
HUNTER_API_KEY=your-hunter-key
# Optional: Notion integration for export
NOTION_TOKEN=your-notion-token
NOTION_DATABASE_ID=your-database-id
# Basic usage: enrich leads from CSV
python sel.py --input leads.csv --output enriched.csv
# Specify number of concurrent workers
python sel.py --input leads.csv --output enriched.csv --workers 8
# Skip LLM scoring (enrichment only, no ICP classification)
python sel.py --input leads.csv --output enriched.csv --skip-scoring
# Use OpenAI for scoring (requires OPENAI_API_KEY)
python sel.py --input leads.csv --output enriched.csv --provider openai --model gpt-4o-mini
# Use local Ollama for scoring (default)
python sel.py --input leads.csv --output enriched.csv --provider ollama --model llama3.2
# Use the included sample data
python sel.py --input sample_leads.csv --output enriched.csv

| Flag | Type | Default | Description |
|---|---|---|---|
| --input / -i | str | (required) | Path to input CSV file containing leads |
| --output / -o | str | (required) | Path to output CSV file for enriched leads |
| --workers / -w | int | 5 | Number of concurrent workers (1-16) |
| --skip-scoring | flag | False | Skip LLM-based ICP scoring step |
| --provider / -p | str | ollama | LLM provider: openai or ollama |
| --model / -m | str | llama3.2 | Model name for the LLM provider |
| --max-retries | int | 3 | Maximum retry attempts per lead on transient errors |
| --help / -h | flag | n/a | Show help message and exit |
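To make the --workers and --max-retries flags concrete, here is a minimal sketch of bounded concurrency with retries using asyncio. It is not taken from sel.py: `process_all`, `with_retries`, and the `base_delay` parameter are hypothetical, and it assumes that "transient" means timeouts and connection errors and that the retry count is in addition to the first attempt.

```python
import asyncio


async def with_retries(task, max_retries=3, base_delay=0.5):
    """Await task(); retry transient errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return await task()
        except (TimeoutError, ConnectionError):  # assumed transient errors
            if attempt == max_retries:
                raise  # retries exhausted
            await asyncio.sleep(base_delay * 2 ** attempt)


async def process_all(leads, enrich_one, workers=5, max_retries=3, base_delay=0.5):
    """Run enrich_one over all leads with at most `workers` in flight."""
    semaphore = asyncio.Semaphore(workers)

    async def bounded(lead):
        async with semaphore:
            try:
                return await with_retries(
                    lambda: enrich_one(lead), max_retries, base_delay
                )
            except Exception as exc:
                # One bad lead is recorded as an error, not a crashed run.
                return {"lead": lead, "error": str(exc)}

    # gather returns results in the same order as the input leads.
    return await asyncio.gather(*(bounded(lead) for lead in leads))
```

Calling `asyncio.run(process_all(leads, enrich_one, workers=8))` would mirror the `--workers 8` example above, with `enrich_one` being whatever coroutine performs the per-lead HTTP calls.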
# Launch the Streamlit dashboard
streamlit run dashboard/app.py
The dashboard provides:
- CSV Upload — Drag-and-drop or file picker to upload lead CSVs
- Enrichment Trigger — Start the pipeline directly from the browser
- Results Table — Sortable, filterable view of enriched leads
- ICP Filtering — Filter by score range and industry
- CSV Download — Export enriched results as a downloadable CSV
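The ICP filter can be reproduced outside the dashboard on an enriched CSV. The sketch below is an assumption about how such a filter might work, not the dashboard's code: `filter_leads` is a hypothetical helper that relies only on the `icp_score` and `industry` columns of the output file, and it drops rows without a numeric score.

```python
def filter_leads(rows, min_score=1, max_score=5, industries=None):
    """Keep rows whose icp_score is in range and whose industry matches.

    rows: iterable of dicts, e.g. from csv.DictReader over the enriched CSV.
    industries: optional collection of industry names (case-insensitive).
    """
    wanted = {name.lower() for name in industries} if industries else None
    kept = []
    for row in rows:
        try:
            score = int(row.get("icp_score") or "")
        except ValueError:
            continue  # assumption: unscored rows are excluded
        if not min_score <= score <= max_score:
            continue
        if wanted is not None and (row.get("industry") or "").lower() not in wanted:
            continue
        kept.append(row)
    return kept
```

For example, `filter_leads(csv.DictReader(open("enriched.csv")), min_score=4, industries=["Technology"])` would keep only high-scoring technology leads.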
Clearbit is used for company description and industry enrichment.
- Sign up at clearbit.com
- Get your API key from the dashboard
- Set CLEARBIT_API_KEY in your .env
Hunter.io is used for contact email discovery.
- Sign up at hunter.io
- Get your API key from the API settings page
- Set HUNTER_API_KEY in your .env
OpenAI is used for AI-powered ICP scoring and outreach recommendation.
- Sign up at platform.openai.com
- Create an API key
- Set OPENAI_API_KEY in your .env
Note: If no OpenAI key is provided, the pipeline falls back to Ollama (local) for scoring. Install Ollama and pull a model with ollama pull llama3.2 to use the local fallback.
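The fallback described in the note can be pictured as a small decision function. This is a sketch under assumptions, not the logic in sel.py: `choose_provider` and `DEFAULT_MODELS` are hypothetical names, and the per-provider models are the ones listed in the pipeline stages.

```python
import os

# Default model per provider, as listed in the Scoring stage description.
DEFAULT_MODELS = {"openai": "gpt-4o-mini", "ollama": "llama3.2"}


def choose_provider(requested="ollama", env=None):
    """Return (provider, model), falling back to Ollama without an OpenAI key."""
    env = os.environ if env is None else env
    if requested not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {requested!r}")
    if requested == "openai" and not env.get("OPENAI_API_KEY"):
        requested = "ollama"  # no key available: use the local model instead
    return requested, DEFAULT_MODELS[requested]
```

Passing `env` explicitly keeps the function easy to test; in normal use it reads the process environment, which is assumed to have been populated from the .env file.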
The input CSV must have at minimum a company_name column. Optional columns include domain and industry:
company_name,domain,industry
TechNova Solutions,technovatech.com,Technology
GreenLeaf Capital,greenleafcapital.com,Finance
The enriched CSV includes all input fields plus enrichment results:
company_name,domain,industry,description,contact_email,icp_score,recommended_approach,timestamp,enrichment_source,error
MIT License. See LICENSE for details.