An end-to-end MLOps pipeline for predicting household energy consumption using time-based features, built with FastAPI, Streamlit, MLflow, DVC, and Prefect.
This project is fully containerized and reproducible via Docker.
🚀 Current setup: local deployment via Docker on `0.0.0.0`.
This project forecasts household hourly electricity usage based on:
- Hour of the day
- Day of the week
- Month
It features:
- XGBoost regression model
- A real-time FastAPI inference service
- A user-friendly Streamlit dashboard
- End-to-end experiment tracking with MLflow
- DVC for data/model versioning
- Prefect for pipeline automation
- Packaged & deployable with Docker
- Model type: `XGBoostRegressor`
- 17 features built by the shared pipeline in `src/features.py` (guarantees train/serve parity):
  - Calendar (10): hour, dayofweek, month, is_weekend + cyclical (sin/cos) encodings
  - Lags (3): consumption 1h / 24h / 168h ago
  - Rolling (4): mean & std over the past 24h and 168h
- Target: `Global_active_power` (kW)
- Because the model uses recent consumption, the API forecasts recursively: each predicted hour is fed back in to build the next hour's features.
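The feature groups listed above could be assembled roughly as follows. This is a minimal sketch, not the contents of `src/features.py`: the function name `build_features`, the column names, and the choice to shift by one hour before computing rolling statistics are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

TARGET = "Global_active_power"


def build_features(hourly: pd.DataFrame) -> pd.DataFrame:
    """Build 17 illustrative features from an hourly frame with a DatetimeIndex."""
    idx = hourly.index
    y = hourly[TARGET]
    out = pd.DataFrame(index=idx)

    # Calendar, raw values (4)
    out["hour"] = idx.hour
    out["dayofweek"] = idx.dayofweek
    out["month"] = idx.month
    out["is_weekend"] = (idx.dayofweek >= 5).astype(int)

    # Calendar, cyclical encodings (6): sin/cos keeps hour 23 close to hour 0
    for name, values, period in [
        ("hour", idx.hour, 24),
        ("dayofweek", idx.dayofweek, 7),
        ("month", idx.month - 1, 12),
    ]:
        angle = 2 * np.pi * np.asarray(values, dtype=float) / period
        out[f"{name}_sin"] = np.sin(angle)
        out[f"{name}_cos"] = np.cos(angle)

    # Lags (3): consumption 1h, 24h and 168h ago
    for lag in (1, 24, 168):
        out[f"lag_{lag}h"] = y.shift(lag)

    # Rolling (4): shift by one hour first so each window only sees past values
    past = y.shift(1)
    for window in (24, 168):
        out[f"roll_mean_{window}h"] = past.rolling(window).mean()
        out[f"roll_std_{window}h"] = past.rolling(window).std()

    # Rows without a full 168h history cannot be used for training
    return out.dropna()
```

Calling the same function at training and serving time is what gives train/serve parity: the API only needs to append the newest hour to its history and rebuild the last row.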
energy-forecasting/
├── api/ # FastAPI inference server
│ └── main.py
├── dashboard/ # Streamlit dashboard
│ └── app.py
├── data/
│ ├── raw/ # Original dataset (from UCI)
│ └── processed/ # Cleaned + resampled data
├── models/ # Trained model files (via DVC)
│ └── latest_model_path.txt
├── mlops/
│ ├── mlflow_config.yaml
│ └── register_model.py
├── pipelines/ # Prefect automation
│ └── prefect_flow.py
├── src/
│ ├── features.py # Shared feature pipeline (train/serve parity)
│ ├── data_loader.py # Preprocessing script
│ └── train_model.py # Model training & logging
├── tests/ # pytest unit tests
├── pyproject.toml # Deps + ruff / mypy / pytest config
├── Dockerfile # Container setup
├── docker-compose.yaml # Service orchestration
├── dvc.yaml # Pipeline stages
├── dvc.lock
├── requirements.txt
└── README.md
| Stage | Tool | Description |
|---|---|---|
| Data versioning | DVC | Tracks data and models (e.g. `energy_clean.csv`) |
| Training | XGBoost | 17 features: calendar + lags + rolling stats |
| Experiment tracking | MLflow | Logs parameters, metrics, model artifacts |
| Automation | Prefect | Defines retraining pipeline (data → train) |
| Serving | FastAPI | Recursive multi-step forecast on `/forecast` |
| Monitoring UI | Streamlit | Frontend to submit inputs & visualize results |
| Packaging | Docker | Full stack in one container |
# 1. Build and start
docker-compose up --build
# 2. Access:
FastAPI → http://localhost:8000
Streamlit → http://localhost:8502
MLflow UI → http://localhost:5050

The model is stateful: it forecasts forward from the latest observed data.
GET /forecast?horizon=24 — recursive multi-step forecast (1–168 hours):
{
"from_timestamp": "2010-11-26T20:00:00",
"horizon_hours": 24,
"forecast": [
{"timestamp": "2010-11-26T21:00:00", "predicted_energy_kW": 1.234}
]
}

Also: `GET /predict` (single next hour), `GET /model/info` (features, metrics, baseline skill), `GET /health` (liveness).
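The recursive loop behind `/forecast` can be illustrated with a small standard-library sketch. It is not the code in `api/main.py`: the function `recursive_forecast`, the `predict_one` callable standing in for the trained model, and the reduced feature set are hypothetical, and the sketch assumes `from_timestamp` is the last observed hour.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List


def recursive_forecast(
    history: List[float],
    last_timestamp: datetime,
    horizon: int,
    predict_one: Callable[[Dict[str, float]], float],
) -> dict:
    """Forecast `horizon` hours ahead, feeding each prediction back as history."""
    if not 1 <= horizon <= 168:
        raise ValueError("horizon must be between 1 and 168 hours")
    if len(history) < 168:
        raise ValueError("need at least 168 hours of history for the longest lag")

    series = list(history)  # copy so the caller's history is not mutated
    forecast = []
    ts = last_timestamp
    for _ in range(horizon):
        ts += timedelta(hours=1)
        # Only a subset of the 17 features is built here, for brevity
        features = {
            "hour": ts.hour,
            "dayofweek": ts.weekday(),
            "lag_1h": series[-1],
            "lag_24h": series[-24],
            "lag_168h": series[-168],
        }
        value = float(predict_one(features))
        series.append(value)  # the prediction becomes the next step's lag_1h
        forecast.append(
            {"timestamp": ts.isoformat(), "predicted_energy_kW": round(value, 3)}
        )

    return {
        "from_timestamp": last_timestamp.isoformat(),
        "horizon_hours": horizon,
        "forecast": forecast,
    }
```

Because later steps depend on earlier predictions rather than observations, errors can compound as the horizon grows, which is one reason to cap the horizon at a week.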
Access: http://localhost:8502
- Pick a forecast horizon (1–168 h) and run a recursive forward forecast
- View the predicted consumption curve
- See model metrics & baseline skill in the sidebar
- Compare against recent actual consumption
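To draw the predicted curve, a client first has to turn the `/forecast` response into plottable points. The helper below is a hypothetical sketch using only the standard library; the name `parse_forecast` is an assumption, and it relies only on the response fields shown in the example payload.

```python
import json
from datetime import datetime
from typing import List, Tuple


def parse_forecast(payload: str) -> List[Tuple[datetime, float]]:
    """Turn a /forecast JSON body into (timestamp, kW) pairs sorted by time."""
    body = json.loads(payload)
    points = [
        (datetime.fromisoformat(p["timestamp"]), float(p["predicted_energy_kW"]))
        for p in body["forecast"]
    ]
    return sorted(points)
```

The resulting pairs can be passed to any charting call, for example after loading them into a pandas DataFrame.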
# Run full pipeline
dvc repro
# Push data + model versions to remote (optional)
dvc push

python pipelines/prefect_flow.py

Runs:
- `data_loader.py` → preprocessing
- `train_model.py` → model training + MLflow logging
Visit: http://localhost:5050
Browse runs, parameters, metrics, models.
- Source: UCI - Individual household electric power consumption
- Resampled to hourly intervals
- Target: `Global_active_power`
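The hourly resampling step might look like the sketch below. It assumes the raw readings are already loaded into a frame with a `DatetimeIndex`, and it aggregates with the mean; the README does not state which aggregation `data_loader.py` uses, so both the function name `resample_hourly` and the use of `mean()` are assumptions.

```python
import numpy as np
import pandas as pd

TARGET = "Global_active_power"


def resample_hourly(raw: pd.DataFrame, target: str = TARGET) -> pd.DataFrame:
    """Aggregate sub-hourly readings into hourly means, dropping empty hours."""
    # mean() skips missing readings within an hour
    hourly = raw[[target]].resample("h").mean()
    # Hours with no readings at all come out as NaN and are removed
    return hourly.dropna()
```

Averaging kW readings keeps the target in kW; summing them instead would change the unit and the scale the model sees.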
- Python 3.10
- FastAPI, Streamlit
- xgboost, scikit-learn, pandas
- MLflow, Prefect, DVC
- Docker, docker-compose
👨‍💻 Taey Kim
📫 GitHub
💡 Passionate about MLOps, system automation, and real-time inference!
- Add CI/CD via GitHub Actions
- Deploy to Heroku / Fly.io
- Batch forecasting + scheduling
- User login for dashboard
MIT License | 2025

