This is a small Flask app that simulates a simple ML inference workload and exposes endpoints for health, prediction, and generating artificial CPU load. A Kubernetes Deployment and Service are provided, along with a KEDA ScaledObject that creates an HPA to scale the app based on Prometheus (Mimir) metrics.
- `GET /health`: basic health check including current CPU usage
- `POST /predict`: accepts JSON `{ "data": [100 floats] }` and returns a mock prediction
- `GET /load?duration=10`: generates artificial CPU load for testing autoscaling (see the sketch below)
Requirements: Python 3.9+. To run locally:
```bash
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
python app.py
# then in another shell
curl http://localhost:8080/health
```

Example predict request (the array must contain exactly 100 numbers):
```bash
curl -X POST http://localhost:8080/predict \
  -H 'Content-Type: application/json' \
  -d '{"data": ['$(python - <<'PY'
print(','.join(['1']*100))
PY
)']}'
```

Generate load locally (default duration: 10s):
curl "http://localhost:8080/load?duration=15"The manifests are in deploy/manifests/.
```bash
kubectl create namespace ml-workloads
kubectl apply -n ml-workloads -f deploy/manifests/deployment.yaml
kubectl apply -f deploy/manifests/keda.yaml
```
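
The Deployment and Service in `deployment.yaml` roughly follow this shape; the image reference, replica count, labels, and resource details below are placeholders, not the repository's actual values.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ml-demo
  template:
    metadata:
      labels:
        app: ml-demo
    spec:
      containers:
        - name: ml-demo
          image: ml-demo:latest        # placeholder image reference
          ports:
            - containerPort: 8080      # the Flask app listens on 8080
---
apiVersion: v1
kind: Service
metadata:
  name: ml-demo
spec:
  selector:
    app: ml-demo
  ports:
    - port: 80            # Service port used by the port-forward example below
      targetPort: 8080    # forwarded to the app container
```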
Port-forward to test:

```bash
kubectl -n ml-workloads port-forward svc/ml-demo 8080:80
curl http://localhost:8080/health
curl "http://localhost:8080/load?duration=30"
```

- The KEDA `ScaledObject` (`deploy/manifests/keda.yaml`) targets the `ml-demo` `Deployment` and defines Prometheus queries against Mimir to drive scaling. `minReplicaCount` and `maxReplicaCount` bound the replica range, and `advanced.horizontalPodAutoscalerConfig.behavior` tunes HPA scale-up and scale-down behavior. A sketch of the manifest's shape follows this list.
- The Mimir server URL and basic-auth credentials are configured via a `Secret` and a `TriggerAuthentication` in the same file. Replace the default credentials and server URL with values for your environment before applying.
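
The sketch below shows the general shape of such a manifest; the query, threshold, replica bounds, behavior settings, and object names are illustrative assumptions, not the values in `deploy/manifests/keda.yaml`.

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: mimir-trigger-auth
  namespace: ml-workloads
spec:
  secretTargetRef:
    - parameter: username        # maps to the prometheus scaler's basic-auth username
      name: mimir-basic-auth     # Secret defined in the same file
      key: username
    - parameter: password
      name: mimir-basic-auth
      key: password
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: ml-demo
  namespace: ml-workloads
spec:
  scaleTargetRef:
    name: ml-demo                # the Deployment created by deployment.yaml
  minReplicaCount: 1             # lower bound on replicas (illustrative)
  maxReplicaCount: 10            # upper bound on replicas (illustrative)
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:                  # standard HPA scale-up/scale-down tuning
        scaleDown:
          stabilizationWindowSeconds: 300
  triggers:
    - type: prometheus
      metadata:
        serverAddress: https://mimir.example.com/prometheus   # Mimir endpoint (replace)
        query: sum(rate(flask_http_request_total{app="ml-demo"}[2m]))  # illustrative query
        threshold: "10"
        authModes: basic
      authenticationRef:
        name: mimir-trigger-auth
```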
To clean up:

```bash
kubectl delete -n ml-workloads -f deploy/manifests/deployment.yaml
kubectl delete -f deploy/manifests/keda.yaml
kubectl delete namespace ml-workloads
```