Pangolin supports multiple backend storage options for persisting catalog metadata. Choose the backend that best fits your deployment requirements.
Backend storage is where Pangolin stores catalog metadata including:
- Tenant information
- Warehouse configurations
- Catalog definitions
- Namespace hierarchies
- Asset (table/view) metadata
- Branch and tag information
- Audit logs
Note: Backend storage is separate from warehouse storage (S3, Azure, GCS), which holds the actual data files.
All backends (including Memory/SQLite) now feature:
- Metadata Cache: In-memory LRU cache (moka) for high-latency Iceberg metadata files (manifests, snapshots), defaulting to 5-minute TTL.
- Object Store Connection Pooling: Reuses S3/GCS/Azure connections to reduce handshake overhead.
- Unified Search: Optimized full-text search across Catalogs, Namespaces, and Tables regardless of the backing store.
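To make the metadata cache behaviour concrete, here is a minimal sketch of an LRU cache with a time-to-live, written in Python for illustration only. It is not Pangolin's implementation (which relies on the moka crate); the class name `TtlLruCache`, its parameters, and the injectable `clock` are assumptions made for this example.

```python
import time
from collections import OrderedDict


class TtlLruCache:
    """Tiny LRU cache with per-entry expiry (illustrative, hypothetical API)."""

    def __init__(self, max_entries=1024, ttl_seconds=300.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds  # 300 s mirrors the 5-minute default TTL
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None  # never cached, or already evicted
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self.ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict the least recently used entry
```

A caller would check `get(path)` before fetching a manifest or snapshot file from object storage and call `put(path, data)` after a miss, so repeated reads within the TTL avoid the remote round trip.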
| Feature | In-Memory | SQLite | PostgreSQL | MongoDB |
|---|---|---|---|---|
| Scalability | Low | Low | High | Very High |
| Transactions | ✅ ACID | ✅ ACID | ✅ ACID | ✅ ACID |
| Foreign Keys | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No |
| Schema | Strict | Strict SQL | Strict SQL | Flexible |
| Replication | ❌ No | ❌ Manual | ✅ Built-in | ✅ Built-in |
| Cloud Managed | ❌ No | ❌ No | ✅ RDS, Azure, GCP | ✅ Atlas, Azure, GCP |
| Resource Usage | Very Low | Very Low | Medium | Medium |
| Persistence | ❌ No | ✅ Yes | ✅ Yes | ✅ Yes |
| Multi-Tenant Isolation | ✅ Excellent | ✅ Excellent | ✅ Excellent | ✅ Excellent |
Use In-Memory when:
- ✅ You're developing locally
- ✅ You're running tests (unit or integration)
- ✅ You need instant setup with zero configuration
- ✅ You're prototyping or learning
- ✅ Data persistence is not required
- ✅ You're running in CI/CD pipelines

Use SQLite when:
- ✅ You're developing locally and need persistence
- ✅ You need an embedded database
- ✅ You're deploying to edge/IoT devices
- ✅ You want zero configuration with persistence
- ✅ You have low concurrent write needs
- ✅ You want minimal resource usage

Use PostgreSQL when:
- ✅ You need a proven, battle-tested SQL database
- ✅ You want strong consistency and ACID guarantees
- ✅ You're deploying to traditional infrastructure
- ✅ You need complex queries and joins
- ✅ You want managed cloud options (RDS, Azure Database, Cloud SQL)

Use MongoDB when:
- ✅ You prefer document-based storage
- ✅ You need horizontal scalability
- ✅ You're building cloud-native applications
- ✅ You want flexible schema evolution
- ✅ You're already using MongoDB in your stack
Set the backend using the DATABASE_URL environment variable:
# In-Memory (default - no DATABASE_URL needed)
# Just don't set DATABASE_URL
# SQLite
DATABASE_URL=sqlite:///path/to/pangolin.db
# PostgreSQL
DATABASE_URL=postgresql://user:password@localhost:5432/pangolin
# MongoDB
DATABASE_URL=mongodb://user:password@localhost:27017/pangolin

- In-Memory Setup Guide
- SQLite Setup Guide
- PostgreSQL Setup Guide
- MongoDB Setup Guide
- Detailed Comparison
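The DATABASE_URL schemes listed above imply a simple selection rule: no variable means In-Memory, otherwise the URL scheme picks the backend. The following Python sketch shows one way such a rule could be written, for example in a deployment script that validates configuration before starting the server. The function name `backend_from_env` and the returned labels are hypothetical and do not describe Pangolin's internal code.

```python
import os
from urllib.parse import urlsplit

# Hypothetical labels; only the URL schemes come from the examples above.
SCHEME_TO_BACKEND = {
    "sqlite": "sqlite",
    "postgresql": "postgres",
    "mongodb": "mongodb",
}


def backend_from_env(environ=os.environ):
    """Return a backend label derived from DATABASE_URL, defaulting to in-memory."""
    url = environ.get("DATABASE_URL", "").strip()
    if not url:
        return "memory"  # DATABASE_URL unset or empty: in-memory default
    scheme = urlsplit(url).scheme.lower()
    try:
        return SCHEME_TO_BACKEND[scheme]
    except KeyError:
        # Report only the scheme so credentials in the URL are never echoed.
        raise ValueError(f"unsupported DATABASE_URL scheme: {scheme!r}") from None
```

Failing fast on an unrecognised scheme catches typos such as a misspelled scheme before the service starts with the wrong storage.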
Currently, Pangolin does not provide automated migration tools between backends. If you need to migrate:
- Export metadata from source backend (custom script)
- Transform to target backend format
- Import into target backend
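The three steps above can be sketched as a small export, transform, import pipeline. This Python example uses a list of dicts and a plain dict as stand-ins for the source and target stores; the record fields (`id`, `kind`, `name`) and all function names are hypothetical, and a real script would read and write through the source and target database drivers instead.

```python
import json


def export_records(source):
    """Step 1: copy every metadata record out of the source (a list of dicts here)."""
    return [dict(record) for record in source]


def transform_record(record):
    """Step 2: reshape a row-style record into a document keyed by `_id`."""
    doc = {key: value for key, value in record.items() if key != "id"}
    doc["_id"] = record["id"]  # `_id` is the primary-key field MongoDB documents use
    return doc


def import_records(target, docs):
    """Step 3: write the documents into the target (a dict keyed by `_id` here)."""
    for doc in docs:
        target[doc["_id"]] = doc
    return len(docs)


def migrate(source, target):
    docs = [transform_record(record) for record in export_records(source)]
    json.dumps(docs)  # fail before importing if anything is not serializable
    return import_records(target, docs)
```

Keeping the three stages as separate functions makes it possible to dump the exported records to a file, inspect them, and rerun only the import if the target needs to be rebuilt.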
Tip: Start with SQLite for development, then migrate to PostgreSQL or MongoDB for production.