A high-performance cryptocurrency data ingestion and technical analysis pipeline built on TimescaleDB and Celery.
raghuLongTerm is a foundation for cryptocurrency market analysis and algorithmic trading research. It provides an automated, scalable system for:
- Multi-Exchange Data Ingestion: Seamlessly fetching OHLCV data from major exchanges (Binance, Bybit, OKX, Bitget, MEXC) via CCXT.
- Global Market Insights: Integrating CoinGecko data for market caps, rankings, and historical price milestones (ATH/ATL).
- Time-Series Optimized Storage: Leveraging TimescaleDB's hypertables and continuous aggregates for efficient storage and lightning-fast queries across multiple timeframes.
- Large-Scale Technical Analysis: Automatically calculating a wide array of technical indicators (EMA, SMA, RSI, Bollinger Bands, Supertrend, TD Sequential, etc.) as new data arrives.
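As a rough illustration of the ingestion step, the sketch below normalizes CCXT-style OHLCV rows, which arrive as `[timestamp_ms, open, high, low, close, volume]` lists, into dictionaries ready for insertion. The function name `normalize_ohlcv`, the output keys, and the sample values are assumptions made for this example, not the project's actual code or real market data.

```python
from datetime import datetime, timezone


def normalize_ohlcv(rows, exchange, symbol):
    """Turn CCXT-style [ms, open, high, low, close, volume] rows into dicts."""
    candles = []
    for ts_ms, o, h, l, c, v in rows:
        candles.append({
            "exchange": exchange,
            "symbol": symbol,
            # CCXT reports candle open times as milliseconds since the Unix epoch.
            "ts": datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc),
            "open": float(o),
            "high": float(h),
            "low": float(l),
            "close": float(c),
            "volume": float(v),
        })
    return candles


# With CCXT installed, rows would come from a call such as:
#   import ccxt
#   rows = ccxt.binance().fetch_ohlcv("BTC/USDT", timeframe="1h", limit=2)
# The rows below are synthetic placeholders, not real prices.
sample_rows = [
    [1704067200000, 100.0, 105.0, 99.0, 104.0, 10.0],
    [1704070800000, 104.0, 108.0, 103.0, 107.0, 12.0],
]
candles = normalize_ohlcv(sample_rows, "binance", "BTC/USDT")
for candle in candles:
    print(candle["ts"].isoformat(), candle["close"])
```

Keeping timestamps timezone-aware (UTC) avoids ambiguity when the rows are later written to a time-partitioned table.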
This repository is a completed body of work, developed in the following phases from initial design to feature maturity:
- Phase 1: Inception & Prototyping (Nov 2023): Initial repository setup and core architectural design.
- Phase 2: Core Data Engine (Feb 2025):
  - Implementation of the TimescaleDB hypertable system for high-throughput time-series data.
  - Integration of CCXT for multi-exchange OHLCV data ingestion.
  - Deployment of the Celery/Redis worker architecture for distributed task processing.
- Phase 3: Advanced Indicator Engineering (Feb - May 2025):
  - Integration of pandas-ta for scalable indicator processing.
  - Development of custom technical signals including EHMA and TD Sequential.
- Phase 4: Database Optimization & Multi-Timeframe Logic (June 2025):
  - Implementation of TimescaleDB continuous aggregates and automated refresh policies to optimize multi-timeframe (1h to 1m) query performance.
- Phase 5: Market Context & Completion (July 2025):
  - Finalized the CoinGecko global market data pipeline.
  - Implemented FDV and market cap ranking analytics.
  - Final stable release for demonstration and archival purposes.
- Language: Python 3.12+
- Database: TimescaleDB (PostgreSQL 17)
- Broker/Cache: Redis
- Task Orchestration: Celery & Celery Beat
- Data Ingestion: CCXT (CryptoCurrency eXchange Trading Library), CoinGecko API
- Data Analysis: Pandas, NumPy, Pandas-TA (Technical Analysis Library)
- Monitoring: Flower (Celery monitoring tool)
- DevOps: Docker, Docker Compose
- Environment Configuration: Create a .env file in the root directory with the following variables:

    POSTGRES_USER=your_user
    POSTGRES_PASSWORD=your_password
    POSTGRES_DB=your_db
    REDIS_PASSWORD=your_redis_password
    COINGLASS_API_KEY=your_key (optional)

- Deployment: Start the entire stack using Docker Compose:

    docker-compose up -d
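
Before starting the stack, it can help to confirm that the required variables are actually set. The following is a minimal sketch of such a pre-flight check; the helper names `parse_env` and `missing_required` are hypothetical and not part of the repository, and the parser handles only simple `KEY=value` lines.

```python
REQUIRED = ("POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DB", "REDIS_PASSWORD")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_required(values):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]


example = """
# database
POSTGRES_USER=your_user
POSTGRES_PASSWORD=your_password
POSTGRES_DB=your_db
REDIS_PASSWORD=
"""
env = parse_env(example)
print("missing:", missing_required(env))
```

In practice the text would be read from the .env file itself, for example with `open(".env").read()`.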
- CcxtInitTask (Hourly): Refreshes asset listings and fetches the latest hourly OHLCV data for all tracked pairs.
- CoingeckoTask (Every 30m): Updates market rankings and global crypto statistics.
- CalcIndicatorsTask: Automatically triggered after data ingestion to update technical indicators.
- DbCleanupTask (Weekly): Performs maintenance and pruning of old indicator data to optimize storage.
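
The schedule described above could be expressed as a Celery Beat configuration roughly like the sketch below. The entry names and dotted task paths (`tasks.CcxtInitTask` and so on) are assumptions for illustration; the repository's actual module layout and schedule definitions may differ. The dictionary is plain Python, so it runs without Celery installed.

```python
from datetime import timedelta

# Hypothetical beat schedule mirroring the documented intervals.
beat_schedule = {
    "ccxt-init-hourly": {
        "task": "tasks.CcxtInitTask",      # assumed task path
        "schedule": timedelta(hours=1),
    },
    "coingecko-every-30m": {
        "task": "tasks.CoingeckoTask",     # assumed task path
        "schedule": timedelta(minutes=30),
    },
    "db-cleanup-weekly": {
        "task": "tasks.DbCleanupTask",     # assumed task path
        "schedule": timedelta(weeks=1),
    },
    # CalcIndicatorsTask is not listed: it is triggered after ingestion
    # rather than run on a timer.
}


def runs_per_week(entry):
    """How many times a schedule entry fires in one week."""
    return timedelta(weeks=1) / entry["schedule"]


for name, entry in beat_schedule.items():
    print(name, runs_per_week(entry))

# In a Celery application this would be attached with:
#   app.conf.beat_schedule = beat_schedule
```

Celery also accepts `crontab` schedules, which would be the better fit if a task must run at a fixed wall-clock time rather than at a fixed interval.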
Raw data and indicators are stored in TimescaleDB. You can query the following tables:
- h1_data: Hourly OHLCV data.
- indicators: Technical indicators linked to price data.
- cg_data: CoinGecko market data.
- h2_data, h4_data, d1_data, etc.: Aggregated data for various timeframes.
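
To make the relationship between the hourly table and the aggregated tables concrete, the sketch below rolls hourly candles up into 4-hour candles in plain Python: first open, highest high, lowest low, last close, and summed volume per bucket. It is only an illustration of the usual OHLCV roll-up rule, with epoch-aligned buckets as an assumption; in the pipeline itself this work is done by TimescaleDB continuous aggregates, whose exact definitions are not shown here. The candle values are synthetic.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_start(ts, hours):
    """Floor a UTC timestamp to the start of its N-hour bucket."""
    width = timedelta(hours=hours)
    return EPOCH + ((ts - EPOCH) // width) * width


def aggregate(candles, hours):
    """Roll candles up into N-hour candles using the usual OHLCV rules."""
    buckets = {}
    for c in sorted(candles, key=lambda c: c["ts"]):
        key = bucket_start(c["ts"], hours)
        agg = buckets.get(key)
        if agg is None:
            # First candle in the bucket sets the open.
            buckets[key] = {"ts": key, "open": c["open"], "high": c["high"],
                            "low": c["low"], "close": c["close"],
                            "volume": c["volume"]}
        else:
            agg["high"] = max(agg["high"], c["high"])
            agg["low"] = min(agg["low"], c["low"])
            agg["close"] = c["close"]  # latest candle so far sets the close
            agg["volume"] += c["volume"]
    return [buckets[k] for k in sorted(buckets)]


# Eight synthetic hourly candles starting at 2024-01-01 00:00 UTC.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
hourly = [
    {"ts": start + timedelta(hours=i), "open": 100.0 + i, "high": 102.0 + i,
     "low": 99.0 + i, "close": 101.0 + i, "volume": 10.0}
    for i in range(8)
]
h4 = aggregate(hourly, 4)
for row in h4:
    print(row["ts"].isoformat(), row["open"], row["high"], row["low"],
          row["close"], row["volume"])
```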
The pipeline calculates several indicators including:
- Moving Averages: EMA (10, 36, 200), SMA (200), EHMA (180).
- Momentum: RSI (14).
- Volatility: Bollinger Bands (20, 2).
- Trend/Sequence: TD Sequential (Up/Down counts), Supertrend (7, 3).
- Custom Metrics: Range %, Volume in USD, High/Low Range %, etc.
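
For readers who want to see what a few of these indicators compute, here is a dependency-free sketch of EMA, RSI, and Bollinger Bands using their textbook definitions. It is not the pipeline's implementation, which relies on pandas-ta; seeding and rounding details there may differ. The SMA-seeded EMA, Wilder smoothing for RSI, and population standard deviation for the bands are all assumptions of this example.

```python
from statistics import pstdev


def ema(values, period):
    """EMA with alpha = 2 / (period + 1), seeded with the first SMA."""
    out = [None] * len(values)
    if len(values) < period:
        return out
    alpha = 2 / (period + 1)
    prev = sum(values[:period]) / period
    out[period - 1] = prev
    for i in range(period, len(values)):
        prev = alpha * values[i] + (1 - alpha) * prev
        out[i] = prev
    return out


def rsi(closes, period=14):
    """RSI with Wilder's smoothing of average gains and losses."""
    out = [None] * len(closes)
    if len(closes) <= period:
        return out

    def to_rsi(avg_gain, avg_loss):
        if avg_loss == 0:
            return 100.0
        return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)

    changes = [closes[i] - closes[i - 1] for i in range(1, len(closes))]
    avg_gain = sum(max(ch, 0.0) for ch in changes[:period]) / period
    avg_loss = sum(max(-ch, 0.0) for ch in changes[:period]) / period
    out[period] = to_rsi(avg_gain, avg_loss)
    for i in range(period + 1, len(closes)):
        ch = changes[i - 1]
        avg_gain = (avg_gain * (period - 1) + max(ch, 0.0)) / period
        avg_loss = (avg_loss * (period - 1) + max(-ch, 0.0)) / period
        out[i] = to_rsi(avg_gain, avg_loss)
    return out


def bollinger(closes, period=20, num_std=2.0):
    """Return (lower, middle, upper) tuples; None until enough data exists."""
    out = [None] * len(closes)
    for i in range(period - 1, len(closes)):
        window = closes[i + 1 - period:i + 1]
        mid = sum(window) / period
        spread = num_std * pstdev(window)  # population std (assumption)
        out[i] = (mid - spread, mid, mid + spread)
    return out


# Synthetic closes, for illustration only.
closes = [float(x) for x in range(1, 31)]
print(ema(closes, 10)[-1])
print(rsi(closes, 14)[-1])
print(bollinger(closes, 20, 2.0)[-1])
```

Each function returns a list aligned with its input, with `None` in the warm-up positions, so results can be stored alongside the candle they belong to.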