Get Discogsography up and running in minutes
This guide will help you get Discogsography running quickly, whether you're using Docker Compose for a simple setup or setting up a local development environment.
| Requirement | Details |
|---|---|
| Python | 3.13+ (install via uv) |
| Docker | 20.10+ with Docker Compose v2 |
| Storage | 200GB free SSD (~76GB databases + room for growth) |
| Memory | 16GB RAM |
For Docker Compose Setup:
- Docker Engine 20.10+
- Docker Compose v2
For Local Development:
- Python 3.13+
- uv package manager
- just task runner (optional but recommended)
- Rust toolchain (only if developing Extractor)
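The requirements above can be checked before starting. The following is a minimal sketch, not part of the project: the function names are hypothetical, the thresholds are copied from the requirements table, and the memory requirement is left out because the Python standard library has no portable way to read total RAM.

```python
import shutil
import sys

# Thresholds taken from the requirements table above.
MIN_PYTHON = (3, 13)
MIN_FREE_GB = 200


def check_python(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the interpreter meets the minimum major.minor version."""
    return tuple(version_info[:2]) >= minimum


def check_free_space(path=".", minimum_gb=MIN_FREE_GB):
    """Return (ok, free_gb) for the filesystem that contains `path`."""
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= minimum_gb, free_gb


def find_tools(names=("docker", "uv", "just")):
    """Map each tool name to its location on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python OK:", check_python())
    ok, free = check_free_space()
    print(f"Disk OK: {ok} ({free:.0f} GB free)")
    for tool, location in find_tools().items():
        print(f"{tool}: {location or 'not found'}")
```

Pass the directory that will hold the databases as `path` if it differs from the current directory. `just` is optional, so a missing entry for it is not an error.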
The fastest way to get started is using Docker Compose, which handles all service dependencies automatically.
git clone https://github.com/SimplicityGuy/discogsography.git
cd discogsography

The project includes sensible defaults, but you can customize settings:
# Copy the example environment file
cp .env.example .env
# Edit .env to customize (optional)
nano .env

See Configuration Guide for all available settings.
# Start all services
docker-compose up -d
# View logs to monitor progress
docker-compose logs -f

Open your browser and visit:
- Dashboard: http://localhost:8003 (System monitoring)
- Admin Panel: http://localhost:8003/admin (Extraction management; requires admin account)
- API: http://localhost:8004 (User auth, graph queries, sync)
- Neo4j Browser: http://localhost:7474 (Graph database UI)
- RabbitMQ Management: http://localhost:15672 (Queue monitoring)
| Service | URL | Default Credentials | Purpose |
|---|---|---|---|
| API | http://localhost:8004 | Register via /api/auth/register | User auth, graph queries, sync, OAuth |
| Dashboard | http://localhost:8003 | None (monitoring) / admin-setup CLI (admin panel) | System monitoring + admin panel |
| RabbitMQ | http://localhost:15672 | discogsography / discogsography | Queue management |
| Neo4j | http://localhost:7474 | neo4j / discogsography | Graph database UI |
| PostgreSQL | localhost:5433 | discogsography / discogsography | Database access |
For development, you'll want to run services locally with hot-reload capabilities.
uv is 10-100x faster than pip for package management:
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
# Verify installation
uv --version

just provides convenient task aliases:
# macOS
brew install just
# Linux (with cargo)
cargo install just
# Or use the installer script
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash
# Verify installation
just --version

# Install all project dependencies
just install
# Or using uv directly
uv sync --all-extras

# Set up pre-commit hooks
just init
# Or using uv directly
uv run pre-commit install

Start the required databases and message queue:
# Start only infrastructure services
docker-compose up -d neo4j postgres rabbitmq redis

Create a .env file or export variables:
# RabbitMQ settings (AMQP_CONNECTION is built automatically from these)
export RABBITMQ_HOST="localhost"
export RABBITMQ_USERNAME="discogsography"
export RABBITMQ_PASSWORD="discogsography"
# Neo4j settings
export NEO4J_HOST="localhost"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD="discogsography"
# PostgreSQL settings
export POSTGRES_HOST="localhost"
export POSTGRES_USERNAME="discogsography"
export POSTGRES_PASSWORD="discogsography"
export POSTGRES_DATABASE="discogsography"
# Redis settings
export REDIS_HOST="localhost"
# API settings
export JWT_SECRET_KEY="dev-secret-key-change-in-production"
export DISCOGS_USER_AGENT="Discogsography/1.0 +https://github.com/SimplicityGuy/discogsography"
# Optional: Set log level
export LOG_LEVEL="INFO" # or DEBUG for detailed output

Run any service using just commands:
# Dashboard (monitoring UI)
just dashboard
# Explore (graph exploration)
just explore
# Extractor (high-performance ingestion, requires Rust)
just extractor-run
# Graphinator (Neo4j builder)
just graphinator
# Tableinator (PostgreSQL builder)
just tableinator
# Brainzgraphinator (MusicBrainz → Neo4j enrichment)
just brainzgraphinator
# Brainztableinator (MusicBrainz → PostgreSQL)
just brainztableinator

Or run services directly with Python:
# API (user auth & Discogs OAuth)
uv run python -m api.api
# Dashboard
uv run python dashboard/dashboard.py
# Explore
uv run python explore/explore.py
# Graphinator
uv run python graphinator/graphinator.py
# Tableinator
uv run python tableinator/tableinator.py

All services expose health endpoints:
# Check each service
curl http://localhost:8000/health # Extractor
curl http://localhost:8001/health # Graphinator
curl http://localhost:8002/health # Tableinator
curl http://localhost:8003/health # Dashboard
curl http://localhost:8005/health # API
curl http://localhost:8007/health # Explore
curl http://localhost:8010/health # Brainztableinator
curl http://localhost:8011/health # Brainzgraphinator

Expected response:

{"status": "healthy"}

Watch the logs to see data processing:
# All services
docker-compose logs -f
# Specific service
docker-compose logs -f extractor-discogs
docker-compose logs -f extractor-musicbrainz
docker-compose logs -f graphinator
docker-compose logs -f tableinator

Look for log messages like:
- Service starting messages
- Download progress
- Processing progress
- Completion messages
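The health endpoints listed earlier in this section can also be polled from a script instead of running each `curl` by hand. This is a minimal sketch under a few assumptions: the ports are copied from the `curl` commands in this guide, the function names are hypothetical, and a service counts as healthy only if its JSON body contains `"status": "healthy"`.

```python
import http.client
import json
import urllib.request

# Health ports copied from the curl commands in this guide.
HEALTH_PORTS = {
    "Extractor": 8000,
    "Graphinator": 8001,
    "Tableinator": 8002,
    "Dashboard": 8003,
    "API": 8005,
    "Explore": 8007,
    "Brainztableinator": 8010,
    "Brainzgraphinator": 8011,
}


def is_healthy(body):
    """Return True if `body` is a JSON object whose status is 'healthy'."""
    try:
        data = json.loads(body)
    except ValueError:
        return False
    return isinstance(data, dict) and data.get("status") == "healthy"


def check_service(port, host="localhost", timeout=3.0):
    """Request /health on one port; any connection or HTTP error means False."""
    url = f"http://{host}:{port}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(response.read())
    except (OSError, http.client.HTTPException):
        return False


if __name__ == "__main__":
    for name, port in HEALTH_PORTS.items():
        state = "healthy" if check_service(port) else "unreachable or unhealthy"
        print(f"{name:<18} {port}  {state}")
```

Services that are still starting report as unreachable, so rerun the script after a short wait before treating a failure as real.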
Visit http://localhost:15672 and verify:
- All 4 Discogs queues are created (artists, labels, releases, masters)
- Messages are being published and consumed
- Consumer counts are appropriate
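The same queue checks can be scripted against the RabbitMQ management HTTP API, which lists queues at `/api/queues`. This is a rough sketch, not project code: it uses the default credentials from the table above and assumes each queue name contains its data type (for example, a name containing `artists`). The actual queue names are not stated in this guide, so adjust the matching if they differ.

```python
import base64
import json
import urllib.request

# Data types named in the checklist above.
DATA_TYPES = ("artists", "labels", "releases", "masters")


def fetch_queues(base_url="http://localhost:15672",
                 user="discogsography", password="discogsography",
                 timeout=5.0):
    """Fetch the queue list from the RabbitMQ management HTTP API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"{base_url}/api/queues",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize(queues, data_types=DATA_TYPES):
    """Total queue, message, and consumer counts per data type.

    Assumption: a queue belongs to a data type if its name contains it.
    """
    summary = {}
    for data_type in data_types:
        matching = [q for q in queues if data_type in q.get("name", "")]
        summary[data_type] = {
            "queues": len(matching),
            "messages": sum(q.get("messages", 0) for q in matching),
            "consumers": sum(q.get("consumers", 0) for q in matching),
        }
    return summary


if __name__ == "__main__":
    for data_type, stats in summarize(fetch_queues()).items():
        print(data_type, stats)
```

A data type with zero queues or zero consumers is the first thing to investigate if processing appears stalled.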
Neo4j:
# Open Neo4j Browser
open http://localhost:7474
# Run query to count nodes
MATCH (n) RETURN labels(n)[0] as type, count(n) as count

PostgreSQL:
# Connect to database
PGPASSWORD=discogsography psql -h localhost -p 5433 -U discogsography -d discogsography
# Count records
SELECT 'artists' as table_name, COUNT(*) FROM artists
UNION ALL SELECT 'labels', COUNT(*) FROM labels
UNION ALL SELECT 'releases', COUNT(*) FROM releases
UNION ALL SELECT 'masters', COUNT(*) FROM masters;

# Check if ports are already in use
netstat -an | grep -E "(5672|7474|7687|5433|6379|8003)"
# Stop and restart all services
docker-compose down
docker-compose up -d

# Check available space
df -h
# Clean up Docker resources
docker system prune -a

# Wait for services to fully start
docker-compose ps
# Check service logs
docker-compose logs [service-name]
# Restart specific service
docker-compose restart [service-name]

# Check internet connectivity
curl -I https://discogs-data-dumps.s3.us-west-2.amazonaws.com
# Check extractor logs
docker-compose logs extractor-discogs
# Verify DISCOGS_ROOT permissions
ls -la /discogs-data # or your configured path

For more detailed troubleshooting, see the Troubleshooting Guide.
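On systems without `netstat`, the port check from the troubleshooting commands can be approximated in Python. This is a small sketch with a hypothetical helper name: it reports a port as in use when something accepts a TCP connection on it, and the port list is copied from the `netstat` command above.

```python
import socket

# Ports copied from the netstat command in the troubleshooting steps.
PORTS = (5672, 7474, 7687, 5433, 6379, 8003)


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    for port in PORTS:
        state = "in use" if port_in_use(port) else "free"
        print(f"{port}: {state}")
```

Run it after `docker-compose down`: any port still reported as in use belongs to another process and will conflict when the stack starts.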
Now that you have Discogsography running:
- Explore the Dashboard: http://localhost:8003
  - Monitor system health
  - View processing metrics
  - Track queue depths
- Try Some Queries: See Usage Examples
  - Neo4j graph queries
  - PostgreSQL analytics
  - Full-text search
- Use the Explore API: http://localhost:8004/api/explore
  - Interactive graph exploration
  - Trend analysis and visualizations
  - Path finding and relationship queries
- Learn the Architecture: Read Architecture Guide
  - Understand component interactions
  - Learn about data flow
  - Explore scalability options
- Configure for Your Needs: See Configuration Guide
  - Tune performance settings
  - Adjust logging levels
  - Customize data paths
If you're contributing to the project:
# Before making changes
just lint # Run linters
just format # Format code
just test # Run tests
just security # Security scan
# Or run everything
uv run pre-commit run --all-files

See Development Guide and Contributing Guide for more information.
- Configuration Guide - All environment variables and settings
- Architecture Overview - System design and components
- Database Schema - Neo4j and PostgreSQL schemas
- Monitoring Guide - Observability and debugging
- Performance Guide - Optimization strategies
Last Updated: 2026-04-03