---
language: en
tags: []
license: apache-2.0
model-index: []
---
ML-powered 90-day weather forecasting trained on 150+ years of GHCN station data.
Runs on consumer GPUs. No proprietary data. No API dependencies.
Quick Start • Architecture • Training • Data Sources • API • Contributing
"The storm goddess sees all horizons."
Operational weather models (GFS, ECMWF IFS) require supercomputers, petabytes of assimilated data, and institutional access. ML weather models (GraphCast, Pangu-Weather, FourCastNet) train on ERA5 reanalysis — terabytes of gridded data that most researchers cannot access or afford to process.
L.I.L.I.T.H. takes a different approach:
- Station-native — Learns directly from GHCN ground observations, no reanalysis required
- Consumer hardware — Trains on a single GPU (RTX 3060 12 GB) in hours, not days
- 90-day horizon — Multi-scale temporal processing: synoptic (days 1–14), weekly (15–42), seasonal (43–90)
- Uncertainty quantification — Gaussian, quantile, or MC dropout ensemble heads
- Full stack — FastAPI backend + Next.js 14 frontend + Docker deployment
| Feature | Description |
|---|---|
| Station-Graph Temporal Transformer | GATv2 encoder → SFNO processor → multi-scale decoder |
| 100,000+ GHCN stations | 150+ years of quality-controlled daily observations |
| Climate embeddings | ENSO, NAO, PDO, MJO, AO indices for long-range skill |
| INT8 / INT4 quantization | 2–4× memory reduction for edge deployment |
| Real-time API | FastAPI with WebSocket support, 15-minute caching |
| Interactive frontend | Next.js 14 with Tailwind, forecast charts, uncertainty bands |
git clone https://github.com/consigcody94/L.I.L.I.T.H..git
cd L.I.L.I.T.H.
# Core dependencies
pip install -e .
# With training extras
pip install -e ".[train]"
# With development tools
pip install -e ".[dev]"

# Download GHCN station data (505 US stations, ~9.6M records)
python scripts/download_data.py --max-stations 500
# Process into training sequences
python scripts/process_data.py
# Train the model
python -m training.train_simple --epochs 50 --batch-size 64 --lr 1e-4

# Use a trained checkpoint
python scripts/run_inference.py \
--checkpoint checkpoints/lilith_best.pt \
--lat 40.7128 --lon -74.006 \
--days 90
# Start the API server
LILITH_CHECKPOINT=checkpoints/lilith_best.pt python -m uvicorn web.api.main:app --port 8000
# Query the API
curl -X POST http://localhost:8000/v1/forecast \
-H "Content-Type: application/json" \
-d '{"latitude": 40.7128, "longitude": -74.006, "days": 90}'

cd web/frontend
npm install
npm run dev
# Open http://localhost:3000

docker-compose -f docker/docker-compose.yml up -d

L.I.L.I.T.H. uses a Station-Graph Temporal Transformer (SGTT) architecture:
Station Observations (100K+ GHCN stations)
│
▼
┌──────────────────────────────────┐
│ ENCODER │
│ Station Embedding (3D + feat) │
│ → GATv2 (spatial correlations) │
│ → Temporal Transformer (RoPE) │
└──────────────┬───────────────────┘
│
▼
┌──────────────────────────────────┐
│ LATENT ATMOSPHERIC STATE │
│ 64 × 128 × 256 │
└──────────────┬───────────────────┘
│
▼
┌──────────────────────────────────┐
│ PROCESSOR │
│ SFNO (spherical harmonics) │
│ Multi-Scale Temporal: │
│ Days 1-14: 6h steps │
│ Days 15-42: 24h steps │
│ Days 43-90: 168h steps │
│ Climate Embedding (ENSO/NAO/..) │
└──────────────┬───────────────────┘
│
▼
┌──────────────────────────────────┐
│ DECODER │
│ Grid Decoder (global fields) │
│ Station Decoder (point fcsts) │
│ Ensemble Head (uncertainty) │
└──────────────────────────────────┘
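The processor's multi-scale schedule can be read as a lookup from forecast day to temporal step size. A minimal sketch (the function name is ours, not part of the codebase):

```python
def step_hours(lead_day: int) -> int:
    """Temporal step used by the processor at a given forecast day.

    Days 1-14 run at synoptic resolution (6 h), days 15-42 at daily
    resolution (24 h), and days 43-90 at weekly resolution (168 h).
    """
    if not 1 <= lead_day <= 90:
        raise ValueError("forecast horizon is 1-90 days")
    if lead_day <= 14:
        return 6
    if lead_day <= 42:
        return 24
    return 168

print(step_hours(7), step_hours(30), step_hours(60))  # 6 24 168
```

Coarsening the step with lead time keeps the 90-day rollout cheap: fine resolution is spent only where forecasts are still deterministic, while the seasonal range advances a week at a time.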
| Variant | Parameters | VRAM (FP16) | VRAM (INT8) | Use Case |
|---|---|---|---|---|
| LILITH-Tiny | 50M | 4 GB | 2 GB | Edge deployment, fast inference |
| SimpleLILITH | 1.87M | ~23 MB | — | Default training, consumer GPUs |
| LILITH-Base | 150M | 8 GB | 4 GB | Balanced accuracy / speed |
| LILITH-Large | 400M | 12 GB | 6 GB | High-accuracy forecasts |
| LILITH-XL | 1B | 24 GB | 12 GB | Research, maximum accuracy |
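As a rough sanity check on the VRAM columns, memory for the weights alone scales with parameter count times bits per weight; activations and framework overhead add substantially more. An illustrative helper (not part of the codebase):

```python
def weight_mib(params: float, bits: int) -> float:
    """Approximate memory footprint of the weights alone, in MiB."""
    return params * bits / 8 / 2**20

# SimpleLILITH (1.87M parameters) at FP16: raw weights are only a few MiB;
# the ~23 MB checkpoint presumably includes more than raw FP16 weights.
print(round(weight_mib(1.87e6, 16), 1))  # 3.6
# INT8 halves weight memory and INT4 quarters it: the 2-4x reduction above.
```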
| Component | Purpose | Details |
|---|---|---|
| `StationEmbedding` | Encode station features + 3D position | MLP with spherical coordinates |
| `GATEncoder` | Spatial relationships | Graph Attention Network v2 |
| `TemporalTransformer` | Time series processing | Flash Attention + RoPE |
| `SFNO` | Global atmospheric dynamics | Spherical Fourier Neural Operator, O(N log N) |
| `ClimateEmbedding` | Long-range climate indices | ENSO, MJO, NAO, seasonal cycles |
| `EnsembleHead` | Uncertainty quantification | Diffusion / Gaussian / Quantile / MC dropout |
| `SimpleLILITH` | Single-station encoder-decoder | Lightweight Transformer for training |
A pre-trained checkpoint is available in releases:
- 505 US GHCN stations, 9.6 million weather records
- 1.15 million training sequences
- Final RMSE: 3.88°C (temperature prediction)
# Download pre-trained checkpoint
curl -L -o checkpoints/lilith_best.pt \
https://github.com/consigcody94/L.I.L.I.T.H./releases/download/v1.0/lilith_best.pt
# Start API with trained model
LILITH_CHECKPOINT=checkpoints/lilith_best.pt python -m uvicorn web.api.main:app --port 8000

# Full pipeline
python scripts/download_data.py --max-stations 500
python scripts/process_data.py
python -m training.train_simple --epochs 50 --batch-size 64
# Resume from checkpoint
python -m training.train_simple \
--resume checkpoints/lilith_best.pt \
--epochs 100 --lr 5e-5

| GPU | Training (50 epochs, 1M samples) | Inference (single location) |
|---|---|---|
| RTX 3060 12 GB | ~5 hours | 0.8s |
| RTX 4090 24 GB | ~1.5 hours | 0.3s |
| CPU only | ~24 hours | 3s |
| Forecast Range | Metric | L.I.L.I.T.H. Target | Climatology Baseline |
|---|---|---|---|
| Days 1–7 | Temperature RMSE | < 2°C | ~5°C |
| Days 8–14 | Temperature RMSE | < 3°C | ~5°C |
| Days 15–42 | Skill Score | > 0.3 | 0.0 |
| Days 43–90 | Skill Score | > 0.1 | 0.0 |
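The skill scores above use the standard MSE-based definition against a climatology baseline (0.0 matches climatology, 1.0 is a perfect forecast). A minimal reference implementation:

```python
def skill_score(forecast, observed, climatology):
    """MSE skill score vs. a climatology baseline.

    1.0 is a perfect forecast, 0.0 matches climatology, negative is worse.
    """
    mse = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return 1.0 - mse(forecast, observed) / mse(climatology, observed)

obs  = [2.0, 4.0, 6.0, 8.0]   # observed daily-mean temperatures (degC)
clim = [5.0, 5.0, 5.0, 5.0]   # climatological mean for those dates
fcst = [2.5, 4.5, 5.5, 7.0]   # model forecast
print(round(skill_score(fcst, obs, clim), 2))  # 0.91
```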
# INT8 quantization (2× memory reduction)
python inference/quantize.py --checkpoint checkpoints/lilith_best.pt --bits 8
# INT4 quantization (4× memory reduction)
python inference/quantize.py --checkpoint checkpoints/lilith_best.pt --bits 4

from huggingface_hub import HfApi
api = HfApi()
api.upload_file(
path_or_fileobj="checkpoints/lilith_best.pt",
path_in_repo="lilith_base_v1.pt",
repo_id="your-username/lilith-weather",
repo_type="model"
)

L.I.L.I.T.H. is built entirely on freely available public data.
| Dataset | Coverage | Stations | Variables | Resolution |
|---|---|---|---|---|
| GHCN-Daily | 1763–present | 100,000+ | Temp, Precip, Snow | Daily |
| GHCN-Hourly | 1900s–present | 20,000+ | Wind, Pressure, Humidity | Hourly |
Source: NOAA NCEI
| Priority | Dataset | What It Adds |
|---|---|---|
| High | Climate Indices (ENSO, NAO, MJO, PDO, AO) | Long-range predictability drivers |
| High | ERA5 Reanalysis (ECMWF) | Full atmospheric state, gridded global |
| Medium | NOAA OISST | Sea surface temperatures, ocean influence |
| Medium | GFS Analysis | Physics-based ensemble blending |
| Optional | GOES/GPM Satellite | Real-time cloud cover and precipitation |
# Download climate indices (small, fast)
python -m data.download.climate_indices --indices enso,nao,pdo,mjo,ao
# Download ERA5 for a region (requires ECMWF CDS account)
python -m data.download.era5 --start-year 2000 --end-year 2024 --region north_america

Generate a point forecast.
{
"latitude": 40.7128,
"longitude": -74.006,
"days": 90,
"ensemble_members": 10,
"variables": ["temperature", "precipitation", "wind"]
}

Response:
{
"location": {"latitude": 40.7128, "longitude": -74.006, "name": "New York, NY"},
"generated_at": "2025-01-15T12:00:00Z",
"model_version": "SimpleLILITH v1",
"forecasts": [
{
"date": "2025-01-16",
"temperature": {"mean": 2.5, "min": -1.2, "max": 6.8},
"precipitation": {"probability": 0.35, "amount_mm": 2.1},
"wind": {"speed_ms": 5.2, "direction_deg": 270},
"uncertainty": {"temperature_std": 1.2, "confidence": 0.85}
}
]
}

Batch inference for multiple locations.
Historical observations for a GHCN station.
Health check and model status.
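Assuming the request/response schema shown above, the `/v1/forecast` endpoint can also be queried from Python with only the standard library (the helper names here are ours, not part of the API):

```python
import json
from urllib import request

API_URL = "http://localhost:8000/v1/forecast"

def fetch_forecast(latitude: float, longitude: float, days: int = 90) -> dict:
    """POST a forecast request and return the decoded JSON response."""
    body = json.dumps({"latitude": latitude, "longitude": longitude, "days": days})
    req = request.Request(API_URL, data=body.encode(),
                         headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

def daily_mean_temps(response: dict) -> list[float]:
    """Extract the mean temperature series from a forecast response."""
    return [day["temperature"]["mean"] for day in response["forecasts"]]
```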
L.I.L.I.T.H./
├── models/ Model architecture
│ ├── simple_lilith.py SimpleLILITH (shared train/inference)
│ ├── lilith.py Full LILITH model (SGTT)
│ ├── losses.py Multi-task loss functions
│ └── components/ Building blocks
│ ├── station_embed.py Station embedding (3D + features)
│ ├── gat_encoder.py GATv2 spatial encoder
│ ├── temporal_transformer.py Flash Attention + RoPE
│ ├── sfno.py Spherical Fourier Neural Operator
│ ├── climate_embed.py Climate index embedding
│ └── ensemble_head.py Uncertainty quantification
│
├── training/ Training infrastructure
│ ├── train_simple.py SimpleLILITH training loop
│ └── trainer.py Full trainer with DeepSpeed
│
├── inference/ Inference and serving
│ ├── simple_forecaster.py Checkpoint loading + forecast generation
│ ├── forecast.py High-level forecast API
│ └── quantize.py INT8/INT4 quantization
│
├── data/ Data pipeline
│ ├── download/ GHCN download scripts
│ ├── processing/ QC, normalization, gridding
│ └── loaders/ PyTorch datasets
│
├── web/
│ ├── api/ FastAPI backend
│ └── frontend/ Next.js 14 frontend
│
├── scripts/ CLI utilities
├── tests/ Test suite
├── docker/ Containerization
└── docs/ Documentation
| Variable | Required | Default | Description |
|---|---|---|---|
| `LILITH_CHECKPOINT` | No | Auto-detected | Path to model checkpoint |
| `OPENWEATHER_API_KEY` | No | — | OpenWeatherMap key (fallback forecasts only) |
The ML model works without any API keys. OpenWeatherMap is only used as a fallback when no trained model is loaded.
- GraphCast (Google DeepMind) — Pioneering ML weather prediction
- Pangu-Weather (Huawei) — Transformer architectures for weather
- FourCastNet (NVIDIA) — Fourier neural operators for atmospheric modeling
- FuXi (Fudan University) — Subseasonal forecasting advances
- NOAA NCEI — GHCN dataset, a public resource funded by U.S. taxpayers
- ECMWF — ERA5 reanalysis data
Contributions are welcome. L.I.L.I.T.H. is built on the principle that weather forecasting should be accessible to everyone.
# Development setup
git clone https://github.com/consigcody94/L.I.L.I.T.H..git
cd L.I.L.I.T.H.
pip install -e ".[dev]"
pre-commit install
pytest tests/ -v

- Code — Model improvements, new features, bug fixes
- Data — Additional data sources, quality control improvements
- Testing — Unit tests, integration tests, benchmarking
- Documentation — Tutorials, guides, architecture deep-dives
@software{lilith2025,
author = {Churchwell, Cody},
title = {L.I.L.I.T.H.: Long-range Intelligent Learning for Integrated Trend Hindcasting},
year = {2025},
url = {https://github.com/consigcody94/L.I.L.I.T.H.}
}