Commit 35de853

feat: Dynamically deploy models to match request targets (#1)
Allow configuring on-demand models through the API by providing the information needed to deploy CMS instances automatically when requests target a given model. This includes the model name, the tracking server artifact URI or the tracking ID of the run that generated the model, Docker resource requirements, and the time a deployed instance can stay idle before being removed.

Model names must be unique, since they determine request targets and serve as the Docker container names of the deployed CMS instances.

When a request is received for a model that does not have a deployed instance, a new instance is created automatically from the provided configuration, if one exists, and the request is routed to it. If no configuration exists for the requested model, an error is returned. From now on, the available models listed through the API include both running instances and on-demand models.

This commit also updates the client library to support creating, updating, deleting, and listing on-demand configurations, and extends the integration tests to cover the new endpoints as well as the on-demand deployment functionality.

Signed-off-by: Phoevos Kalemkeris <phoevos.kalemkeris@ucl.ac.uk>
1 parent a93558e commit 35de853
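The routing behaviour described in the commit message can be pictured with a small, self-contained sketch. It is not the Gateway's actual code: `OnDemandConfig`, `Registry`, `ModelNotConfiguredError`, the `idle_ttl` default, and the port `8000` are hypothetical names and values chosen for illustration, and the Docker deployment step is replaced by a placeholder.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OnDemandConfig:
    """Hypothetical stand-in for an on-demand model configuration."""

    model_name: str       # unique; doubles as the container name
    model_uri: str        # tracking server artifact URI
    idle_ttl: int = 3600  # seconds; illustrative default, not taken from the commit


class ModelNotConfiguredError(LookupError):
    """Raised when a model is neither running nor configured on demand."""


@dataclass
class Registry:
    running: dict[str, str] = field(default_factory=dict)  # model name -> address
    on_demand: dict[str, OnDemandConfig] = field(default_factory=dict)

    def deploy(self, config: OnDemandConfig) -> str:
        # Placeholder for starting a container named after the model.
        address = f"http://{config.model_name}:8000"  # assumed port
        self.running[config.model_name] = address
        return address

    def resolve(self, model_name: str) -> str:
        """Return the address to route a request to, deploying if needed."""
        if model_name in self.running:
            return self.running[model_name]
        config = self.on_demand.get(model_name)
        if config is None:
            # No running instance and no configuration: report an error.
            raise ModelNotConfiguredError(model_name)
        return self.deploy(config)
```

Calling `resolve` twice for the same configured model deploys it only once; the second call finds the instance in `running` and returns the same address.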

51 files changed

Lines changed: 9310 additions & 2489 deletions

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -85,7 +85,7 @@ ipython_config.py
 # pyenv
 # For a library or package, you might want to ignore these files since the code is
 # intended to run in multiple environments; otherwise, check them in:
-# .python-version
+.python-version
 
 # pipenv
 # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
@@ -170,3 +170,6 @@ cython_debug/
 
 # Mac
 .DS_Store
+
+# Tests
+tests/integration/assets/config.json

README.md

Lines changed: 24 additions & 24 deletions
@@ -41,10 +41,6 @@ through environment variables. Before deploying the Gateway, make sure to set th
 either by exporting them in the shell or by creating a `.env` file in the root directory of the
 project. The following variables are required:
 
-* `MLFLOW_TRACKING_URI`: The URI for the MLflow tracking server.
-* `CMS_PROJECT_NAME`: The name of the Docker project where the CogStack ModelServe stack is running.
-* `CMS_HOST_URL` (optional): Useful when running CogStack ModelServe instances behind a proxy. If
-omitted, the Gateway will attempt to reach the services directly over the internal Docker network.
 * `CMG_SCHEDULER_MAX_CONCURRENT_TASKS`: The max number of concurrent tasks the scheduler can handle.
 * `CMG_DB_USER`: The username for the PostgreSQL database.
 * `CMG_DB_PASSWORD`: The password for the PostgreSQL database.
@@ -65,37 +61,29 @@ not allowed in MinIO bucket names). The configuration should be saved in a `.env
 directory of the project before running Docker Compose (or sourced directly in the shell):
 
 ```shell
-CMS_PROJECT_NAME=<cms-docker-compose-project-name> # e.g. cms
-
-# (optional) Useful when running CMS behind a proxy
-CMS_HOST_URL=https://<proxy-docker-service-name>/cms # e.g. https://proxy/cms
-
 CMG_SCHEDULER_MAX_CONCURRENT_TASKS=1
 
 # Postgres
 CMG_DB_USER=admin
 CMG_DB_PASSWORD=admin
-CMG_DB_HOST=postgres
+CMG_DB_HOST=db
 CMG_DB_PORT=5432
 CMG_DB_NAME=cmg_tasks
 
 # RabbitMQ
 CMG_QUEUE_USER=admin
 CMG_QUEUE_PASSWORD=admin
-CMG_QUEUE_HOST=rabbitmq
+CMG_QUEUE_HOST=queue
 CMG_QUEUE_PORT=5672
 CMG_QUEUE_NAME=cmg_tasks
 
 # MinIO
 CMG_OBJECT_STORE_ACCESS_KEY=admin
 CMG_OBJECT_STORE_SECRET_KEY=admin123
-CMG_OBJECT_STORE_HOST=minio
+CMG_OBJECT_STORE_HOST=object-store
 CMG_OBJECT_STORE_PORT=9000
 CMG_OBJECT_STORE_BUCKET_TASKS=cmg-tasks
 CMG_OBJECT_STORE_BUCKET_RESULTS=cmg-results
-
-# MLflow (use container IP when running locally)
-MLFLOW_TRACKING_URI=http://<mlflow-docker-service-name>:<mlflow-port> # e.g. http://mlflow-ui:5000
 ```
 
 To install the CogStack Model Gateway, clone the repository and run `docker compose` inside the root
@@ -127,15 +115,27 @@ monitoring the state of submitted tasks. The following endpoints are available:
 
 * **Model Servers**: Interact with CMS model servers.
 
-* `GET /models`: List all available model servers (i.e. Docker containers with the
-"org.cogstack.model-serve" label and "com.docker.compose.project" set to `$CMS_PROJECT_NAME`).
+* `GET /models`: List all available model servers, returning both running containers and on-demand
+models that can be auto-deployed.
 
+* **Response**: Dictionary with `running` and `on_demand` keys each containing a list of models.
 * **Query Parameters**:
-* `verbose (bool)`: Include model metadata from the tracking server (if available).
+* `verbose (bool, default=false)`: When false, returns minimal info (name, uri, is_running).
+When true, includes description, model_type, deployment_type, idle_ttl, resources, tracking
+metadata, and runtime info (for running models).
+
+* `GET /models/{model_name}`: Get information about a specific model (running or on-demand)
+without triggering auto-deployment.
+
+* **Query Parameters**:
+* `verbose (bool, default=false)`: When false, returns minimal info (name, uri, is_running).
+When true, includes description, model_type, deployment_type, idle_ttl, resources, tracking
+metadata, and runtime info (for running models).
+
+* `GET /models/{model_name}/info`: Get detailed information about a running model server
+(equivalent to the CMS `/info` endpoint). May trigger auto-deployment for on-demand models.
 
-* `GET /models/{model_server_name}/info`: Get information about a specific model (equivalent to
-the `/info` CMS endpoint).
-* `POST /models/{model_server_name}`: Deploy a new model server from a previously trained model.
+* `POST /models/{model_name}`: Deploy a new model server from a previously trained model.
 
 * **Body**:
 * `tracking_id (str)`: The tracking ID of the run that generated the model to serve (e.g.
@@ -144,9 +144,9 @@ monitoring the state of submitted tasks. The following endpoints are available:
 * `ttl (int, default=86400)`: The deployed model will be deleted after TTL seconds (defaults
 to 1 day). Set -1 as the TTL value to protect the model from being deleted.
 
-* `POST /models/{model_server_name}/tasks/{task_name}`: Execute a task on the specified model
-server, providing any query parameters or request body required (follows the CMS API, striving
-to support the same endpoints).
+* `POST /models/{model_name}/tasks/{task_name}`: Execute a task on the specified model server,
+providing any query parameters or request body required (follows the CMS API, striving to
+support the same endpoints).
 
 * **Tasks**: Monitor the state of submitted tasks.
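As an illustration of the `GET /models` and `GET /models/{model_name}` endpoints documented in the README changes above, the following sketch builds request URLs and summarises a response with `running` and `on_demand` keys. The helper names `models_url` and `list_model_names`, the base URLs, and the sample payload are assumptions made for this example. Sending `verbose` as a lowercase `true`/`false` string is also an assumption about how the Gateway parses the flag. No request is sent.

```python
from urllib.parse import quote, urlencode


def models_url(base_url, model_name=None, verbose=False):
    """Build the URL for listing models or fetching a single model."""
    path = "/models"
    if model_name is not None:
        # Percent-encode the name so it is safe to place in the path.
        path += "/" + quote(model_name, safe="")
    query = urlencode({"verbose": str(verbose).lower()})
    return f"{base_url.rstrip('/')}{path}?{query}"


def list_model_names(payload):
    """Extract model names from a `GET /models` response dictionary."""
    return {
        "running": [m["name"] for m in payload.get("running", [])],
        "on_demand": [m["name"] for m in payload.get("on_demand", [])],
    }


# Hypothetical non-verbose response: each entry carries name, uri, is_running.
sample = {
    "running": [{"name": "medcat-a", "uri": "s3://models/a", "is_running": True}],
    "on_demand": [{"name": "medcat-b", "uri": "s3://models/b", "is_running": False}],
}
print(models_url("http://localhost:8888", verbose=True))
print(list_model_names(sample))
```

Keeping URL construction separate from response handling lets the same helpers be reused with whichever HTTP client is preferred.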
