An end-to-end MLOps pipeline for predicting household energy consumption using time-based features, built with FastAPI, Streamlit, MLflow, DVC, and Prefect.
This project is fully containerized and reproducible via Docker.
Current setup: local deployment via Docker on 0.0.0.0.
This project forecasts household hourly electricity usage based on:
- Hour of the day
- Day of the week
- Month
It features:
- XGBoost regression model
- A real-time FastAPI inference service
- A user-friendly Streamlit dashboard
- End-to-end experiment tracking with MLflow
- DVC for data/model versioning
- Prefect for pipeline automation
- Packaged & deployable with Docker
- Model type: XGBoostRegressor
- 17 features built by the shared pipeline in src/features.py (guarantees train/serve parity):
  - Calendar (10): hour, dayofweek, month, is_weekend + cyclical (sin/cos) encodings
  - Lags (3): consumption 1h / 24h / 168h ago
  - Rolling (4): mean & std over the past 24h and 168h
- Target: Global_active_power (kW)
- Because the model uses recent consumption, the API forecasts recursively: each predicted hour is fed back in to build the next hour's features.
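The feature groups listed above can be sketched with pandas as follows. This is a minimal illustration, not the contents of src/features.py: the function name `build_features`, the column names (`lag_1h`, `roll_mean_24h`, and so on), and the choice to shift the series by one hour before computing rolling statistics are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def build_features(hourly: pd.Series) -> pd.DataFrame:
    """Build 17 calendar, lag, and rolling features from an hourly kW series.

    Hypothetical helper: names and details are illustrative assumptions.
    """
    idx = hourly.index
    out = pd.DataFrame(index=idx)

    # Calendar (10): 4 raw fields + sin/cos encodings of the 3 periodic ones.
    out["hour"] = idx.hour
    out["dayofweek"] = idx.dayofweek
    out["month"] = idx.month
    out["is_weekend"] = (idx.dayofweek >= 5).astype(int)
    for name, period in (("hour", 24), ("dayofweek", 7), ("month", 12)):
        angle = 2 * np.pi * out[name] / period
        out[f"{name}_sin"] = np.sin(angle)
        out[f"{name}_cos"] = np.cos(angle)

    # Lags (3): consumption 1h, 24h, and 168h ago.
    for lag in (1, 24, 168):
        out[f"lag_{lag}h"] = hourly.shift(lag)

    # Rolling (4): shift by one hour first so each window only sees the past.
    past = hourly.shift(1)
    for window in (24, 168):
        out[f"roll_mean_{window}h"] = past.rolling(window).mean()
        out[f"roll_std_{window}h"] = past.rolling(window).std()

    # Rows without a full 168h history contain NaN and are dropped.
    return out.dropna()


# Usage with synthetic data (not real measurements).
index = pd.date_range("2010-01-01", periods=400, freq=pd.Timedelta(hours=1))
series = pd.Series(np.arange(400, dtype=float), index=index)
features = build_features(series)
```

Calling the same function from both the training script and the API is one simple way to get the train/serve parity mentioned above, since both sides then derive features from identical code.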
energy-forecasting/
├── api/                      # FastAPI inference server
│   └── main.py
├── dashboard/                # Streamlit dashboard
│   └── app.py
├── data/
│   ├── raw/                  # Original dataset (from UCI)
│   └── processed/            # Cleaned + resampled data
├── models/                   # Trained model files (via DVC)
│   └── latest_model_path.txt
├── mlops/
│   ├── mlflow_config.yaml
│   └── register_model.py
├── pipelines/                # Prefect automation
│   └── prefect_flow.py
├── src/
│   ├── features.py           # Shared feature pipeline (train/serve parity)
│   ├── data_loader.py        # Preprocessing script
│   └── train_model.py        # Model training & logging
├── tests/                    # pytest unit tests
├── pyproject.toml            # Deps + ruff / mypy / pytest config
├── Dockerfile                # Container setup
├── docker-compose.yaml       # Service orchestration
├── dvc.yaml                  # Pipeline stages
├── dvc.lock
├── requirements.txt
└── README.md
| Stage | Tool | Description |
|---|---|---|
| Data versioning | DVC | Tracks data and models (e.g. energy_clean.csv) |
| Training | XGBoost | 17 features: calendar + lags + rolling stats |
| Experiment tracking | MLflow | Logs parameters, metrics, model artifacts |
| Automation | Prefect | Defines retraining pipeline (data → train) |
| Serving | FastAPI | Recursive multi-step forecast on /forecast |
| Monitoring UI | Streamlit | Frontend to submit inputs & visualize results |
| Packaging | Docker | Full stack in one container |
# 1. Build and start
docker-compose up --build
# 2. Access:
FastAPI → http://localhost:8000
Streamlit → http://localhost:8502
MLflow UI → http://localhost:5050

The model is stateful: it forecasts forward from the latest observed data.
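Forecasting forward from the latest observed data means each predicted hour is appended to the history before the next hour's features are built. The loop below is a rough sketch of that idea using only the standard library. The function `recursive_forecast`, the `predict_one` callable standing in for the trained model, and the reduced feature set are hypothetical and are not taken from api/main.py.

```python
from collections import deque
from datetime import datetime, timedelta
from statistics import mean
from typing import Callable, Sequence


def recursive_forecast(
    history: Sequence[float],
    last_timestamp: datetime,
    horizon: int,
    predict_one: Callable[[dict], float],
) -> list[dict]:
    """Roll a one-step model forward `horizon` hours (illustrative sketch)."""
    if not 1 <= horizon <= 168:
        raise ValueError("horizon must be between 1 and 168 hours")
    if len(history) < 168:
        raise ValueError("need at least 168 hours of history for the 168h lag")

    # Keep exactly one week of values; old hours fall off as new ones arrive.
    window = deque(history[-168:], maxlen=168)
    results = []
    ts = last_timestamp
    for _ in range(horizon):
        ts += timedelta(hours=1)
        recent = list(window)
        # Reduced, assumed feature set; the real pipeline builds 17 features.
        features = {
            "hour": ts.hour,
            "lag_1h": recent[-1],
            "lag_24h": recent[-24],
            "lag_168h": recent[-168],
            "roll_mean_24h": mean(recent[-24:]),
        }
        y_hat = predict_one(features)
        # Feed the prediction back so the next step sees it as "observed".
        window.append(y_hat)
        results.append({"timestamp": ts.isoformat(), "predicted_energy_kW": y_hat})
    return results


# Usage with a toy stand-in model: "previous hour plus one".
demo = recursive_forecast(
    history=[1.0] * 168,
    last_timestamp=datetime(2010, 11, 26, 20),
    horizon=3,
    predict_one=lambda f: f["lag_1h"] + 1.0,
)
```

Because every step consumes earlier predictions, errors can compound over long horizons, which is one reason to cap the horizon at a fixed maximum.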
GET /forecast?horizon=24 → recursive multi-step forecast (1–168 hours):
{
  "from_timestamp": "2010-11-26T20:00:00",
  "horizon_hours": 24,
  "forecast": [
    {"timestamp": "2010-11-26T21:00:00", "predicted_energy_kW": 1.234}
  ]
}

Also: GET /predict (single next hour), GET /model/info (features, metrics, baseline skill), GET /health (liveness).
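A small client for the /forecast endpoint might look like the sketch below. It relies only on the URL, query parameter, and response fields shown above. The helper names (`forecast_url`, `parse_forecast`, `fetch_forecast`) are hypothetical, and the network call is defined but not executed here because it needs the stack to be running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def forecast_url(base_url: str, horizon: int = 24) -> str:
    """Build the /forecast URL, enforcing the documented 1-168 hour range."""
    if not 1 <= horizon <= 168:
        raise ValueError("horizon must be between 1 and 168 hours")
    return f"{base_url.rstrip('/')}/forecast?{urlencode({'horizon': horizon})}"


def parse_forecast(payload: dict) -> list[tuple[str, float]]:
    """Extract (timestamp, kW) pairs from a /forecast response body."""
    return [
        (point["timestamp"], float(point["predicted_energy_kW"]))
        for point in payload["forecast"]
    ]


def fetch_forecast(base_url: str, horizon: int = 24, timeout: float = 10.0):
    """Call the running API and return parsed points (requires the service)."""
    with urlopen(forecast_url(base_url, horizon), timeout=timeout) as resp:
        return parse_forecast(json.load(resp))


# Parsing the sample response shown above, without touching the network.
sample = {
    "from_timestamp": "2010-11-26T20:00:00",
    "horizon_hours": 24,
    "forecast": [
        {"timestamp": "2010-11-26T21:00:00", "predicted_energy_kW": 1.234}
    ],
}
points = parse_forecast(sample)
```

With the containers up, `fetch_forecast("http://localhost:8000", horizon=24)` would return the same kind of list for the live model.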
Access: http://localhost:8502
- Pick a forecast horizon (1–168 h) and run a recursive forward forecast
- View the predicted consumption curve
- See model metrics & baseline skill in the sidebar
- Compare against recent actual consumption
# Run full pipeline
dvc repro
# Push data + model versions to remote (optional)
dvc push

python pipelines/prefect_flow.py

Runs:
- data_loader.py → preprocessing
- train_model.py → model training + MLflow logging
Visit: http://localhost:5050
Browse runs, parameters, metrics, models.
- Source: UCI - Individual household electric power consumption
- Resampled to hourly intervals
- Target: Global_active_power
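The hourly resampling step can be illustrated with pandas. This is only a sketch: it assumes the raw readings are already loaded as a minute-level series indexed by timestamp, and it assumes the hourly value is the mean of the readings in that hour. The aggregation actually used in data_loader.py is not stated here, and `to_hourly` is a hypothetical name.

```python
import pandas as pd


def to_hourly(minute_kw: pd.Series) -> pd.Series:
    """Aggregate minute-level kW readings into hourly means (assumed rule).

    Hours with no readings come out as NaN and are left for later handling.
    """
    return minute_kw.resample(pd.Timedelta(hours=1)).mean()


# Synthetic example: one hour at 1.0 kW followed by one hour at 3.0 kW.
minutes = pd.date_range("2010-01-01", periods=120, freq=pd.Timedelta(minutes=1))
raw = pd.Series([1.0] * 60 + [3.0] * 60, index=minutes)
hourly = to_hourly(raw)
```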
- Python 3.10
- FastAPI, Streamlit
- xgboost, scikit-learn, pandas
- MLflow, Prefect, DVC
- Docker, docker-compose
Taey Kim
GitHub
Passionate about MLOps, system automation, and real-time inference!
- Add CI/CD via GitHub Actions
- Deploy to Heroku / Fly.io
- Batch forecasting + scheduling
- User login for dashboard
MIT License | 2025

