18 changes: 17 additions & 1 deletion .dockerignore
@@ -1,7 +1,23 @@
.venv
.venv-style
**/.venv
.pytest_cache
.devcontainer
.vscode
.vs
.idea
.gdb_history
out
bazel-bin
bazel-model_server/
bazel-openvino-model-server/
bazel-out
bazel-ovms
bazel-ovms-c
bazel-testlogs
demos/continuous_batching
demos/embeddings
demos/common/export_models/models
*.log
*.img
models
4 changes: 2 additions & 2 deletions .gitignore
@@ -11,6 +11,8 @@ __pycache__/
report.json
trace.json
bazel-bin
bazel-model_server/
bazel-openvino-model-server/
bazel-out
bazel-ovms
bazel-ovms-c
@@ -28,8 +30,6 @@ tags
src/test/llm_testing
node_modules/
yarn.*
out
.user.bazelrc
*.log
225 changes: 208 additions & 17 deletions demos/image_generation/README.md
@@ -3,6 +3,12 @@
This demo shows how to deploy image generation models (Stable Diffusion/Stable Diffusion 3/Stable Diffusion XL/FLUX) to create and edit images with the OpenVINO Model Server.
Image generation pipelines are exposed via [OpenAI API](https://platform.openai.com/docs/api-reference/images/create) `images/generations` and `images/edits` endpoints.

Supported workloads:
- **Text-to-image** — generate an image from a text prompt (`/v3/images/generations`)
- **Image-to-image** — transform an existing image guided by a prompt (`/v3/images/edits`)
- **Inpainting** — repaint a masked region of an image (`/v3/images/edits` with `mask` field)
- **Outpainting** — extend an image beyond its original borders (`/v3/images/edits` with `mask` field and larger canvas)

Check [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).

> **Note:** FLUX models are not supported on NPU.
@@ -364,35 +370,32 @@ ovms --rest_port 8000 ^

Wait for the model to load. You can check the status with a simple command:
```console
curl http://localhost:8000/v3/models
```

```json
{
  "object": "list",
  "data": [
    {
      "id": "OpenVINO/stable-diffusion-v1-5-int8-ov",
      "object": "model",
      "created": 0,
      "owned_by": "openvinotoolkit"
    }
  ]
}
```

## Request Generation

A single servable exposes the following endpoints:
- **Text-to-image**: `images/generations` — JSON body with `prompt`
- **Image-to-image**: `images/edits` — multipart form with `image` + `prompt` (no mask)
- **Inpainting**: `images/edits` — multipart form with `image` + `mask` + `prompt`
- **Outpainting**: `images/edits` — multipart form with `image` + `mask` + `prompt` (image placed on larger canvas, mask marks the area to fill)

> **Note:** For inpainting/outpainting, dedicated inpainting models (e.g. `stable-diffusion-v1-5/stable-diffusion-inpainting`) only support the `images/edits` endpoint. Check [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).

All requests are processed in unary format, with no streaming capabilities.
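Under the hood, each `images/edits` call is a single `multipart/form-data` POST. As a rough stdlib-only sketch of how such a request body is assembled (the image bytes and filename below are placeholders, not files from this demo):

```python
import uuid


def build_multipart(fields, files):
    """Encode text fields and file parts as a multipart/form-data body.

    fields: {name: str_value}; files: {name: (filename, bytes)}.
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    for name, (filename, blob) in files.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: application/octet-stream\r\n\r\n".encode()
            + blob + b"\r\n"
        )
    parts.append(f"--{boundary}--\r\n".encode())  # closing boundary
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


body, ctype = build_multipart(
    {"model": "OpenVINO/stable-diffusion-v1-5-int8-ov", "prompt": "a sunny park"},
    {"image": ("cat.png", b"<png bytes go here>")},
)
```

The resulting `body` and `Content-Type: ctype` can then be sent with any HTTP client (e.g. `http.client` or `urllib.request`) to `/v3/images/edits`; `curl -F` and the OpenAI client shown below do the same encoding for you.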

@@ -519,6 +522,194 @@ image.save('edit_output.png')
Output file (`edit_output.png`):
![edit_output](./edit_output.png)

### Requesting inpainting with cURL

Inpainting replaces a masked region in an image based on the prompt. The `mask` is a black-and-white image where white pixels mark the area to repaint.

![cat](./cat.png) ![cat_mask](./cat_mask.png)
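A mask like the one above can be produced with any image editor. As a minimal stdlib-only sketch that writes an 8-bit grayscale PNG mask from scratch (the output filename and rectangle coordinates are placeholders, not the actual `cat_mask.png` used in this demo):

```python
import struct
import zlib


def write_gray_png(path, width, height, pixels):
    """Write 8-bit grayscale `pixels` (bytes, row-major) as a minimal PNG."""
    def chunk(tag, data):
        payload = tag + data
        return struct.pack(">I", len(data)) + payload + struct.pack(">I", zlib.crc32(payload))

    # Each scanline is prefixed with filter type 0 (None)
    raw = b"".join(b"\x00" + pixels[y * width:(y + 1) * width] for y in range(height))
    # IHDR: width, height, bit depth 8, color type 0 (grayscale), defaults
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n")
        f.write(chunk(b"IHDR", ihdr))
        f.write(chunk(b"IDAT", zlib.compress(raw)))
        f.write(chunk(b"IEND", b""))


W = H = 512
px = bytearray(W * H)  # all black: keep the original pixels
for yy in range(120, 300):  # placeholder rectangle marking the area to repaint
    for xx in range(150, 360):
        px[yy * W + xx] = 255  # white: repaint here
write_gray_png("mask_sketch.png", W, H, bytes(px))
```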

::::{tab-set}
:::{tab-item} Linux
:sync: linux
```bash

curl http://localhost:8000/v3/images/edits \
-F "model=diffusers/stable-diffusion-xl-1.0-inpainting-0.1" \
-F "prompt=a golden retriever dog sitting on a bench in a sunny park" \
-F "image=@cat.png" \
-F "mask=@cat_mask.png" \
-F "num_inference_steps=50" \
-F "size=1024x1024" | jq -r '.data[0].b64_json' | base64 --decode > inpaint_output.png
```
:::

:::{tab-item} Windows Command Prompt
:sync: windows
```bat
curl http://localhost:8000/v3/images/edits ^
-F "model=diffusers/stable-diffusion-xl-1.0-inpainting-0.1" ^
-F "prompt=a golden retriever dog sitting on a bench in a sunny park" ^
-F "image=@cat.png" ^
-F "mask=@cat_mask.png" ^
-F "num_inference_steps=50" ^
-F "size=1024x1024"
```
:::

::::

Expected output (`inpaint_output.png`):

![inpaint_output](./inpaint_output.png)

### Requesting inpainting with OpenAI Python package

```python
from openai import OpenAI
import base64
from io import BytesIO
from PIL import Image

client = OpenAI(
base_url="http://localhost:8000/v3",
api_key="unused"
)

response = client.images.edit(
model="diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
image=open("cat.png", "rb"),
mask=open("cat_mask.png", "rb"),
prompt="a golden retriever dog sitting on a bench in a sunny park",
extra_body={
"num_inference_steps": 50,
"size": "1024x1024"
}
)
base64_image = response.data[0].b64_json

image_data = base64.b64decode(base64_image)
image = Image.open(BytesIO(image_data))
image.save('inpaint_output.png')
```

### Requesting outpainting with cURL

Outpainting extends an image beyond its original borders. Prepare two images:
- **outpaint_input.png** — the original image centered on a larger canvas (e.g. 768×768) with black borders
- **outpaint_mask.png** — white where the new content should be generated (the borders), black where the original image is

![outpaint_input](./outpaint_input.png) ![outpaint_mask](./outpaint_mask.png)
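One way to prepare this input/mask pair is with Pillow (an assumption for illustration; any image tool works, and the synthetic solid-color source below stands in for your real image):

```python
from PIL import Image

CANVAS = (768, 768)

# Placeholder source image; in practice use Image.open() on your own file
src = Image.new("RGB", (512, 512), (180, 140, 90))

# Center the source on a black canvas of the target size
canvas = Image.new("RGB", CANVAS, (0, 0, 0))
offset = ((CANVAS[0] - src.width) // 2, (CANVAS[1] - src.height) // 2)
canvas.paste(src, offset)
canvas.save("outpaint_input.png")

# Mask: white (255) where new content is generated, black (0) over the original
mask = Image.new("L", CANVAS, 255)
mask.paste(0, (offset[0], offset[1], offset[0] + src.width, offset[1] + src.height))
mask.save("outpaint_mask.png")
```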

::::{tab-set}
:::{tab-item} Linux
:sync: linux
```bash
curl http://localhost:8000/v3/images/edits \
-F "model=stable-diffusion-v1-5/stable-diffusion-inpainting" \
-F "prompt=a cat sitting on a bench in a park" \
-F "image=@outpaint_input.png" \
-F "mask=@outpaint_mask.png" \
-F "num_inference_steps=50" \
-F "size=768x768" | jq -r '.data[0].b64_json' | base64 --decode > outpaint_output.png
```
:::

:::{tab-item} Windows Command Prompt
:sync: windows
```bat
curl http://localhost:8000/v3/images/edits ^
-F "model=stable-diffusion-v1-5/stable-diffusion-inpainting" ^
-F "prompt=a cat sitting on a bench in a park" ^
-F "image=@outpaint_input.png" ^
-F "mask=@outpaint_mask.png" ^
-F "num_inference_steps=50" ^
-F "size=768x768"
```
:::

::::

Expected output (`outpaint_output.png`):

![outpaint_output](./outpaint_output.png)

### Requesting outpainting with OpenAI Python package

```python
from openai import OpenAI
import base64
from io import BytesIO
from PIL import Image

client = OpenAI(
base_url="http://localhost:8000/v3",
api_key="unused"
)

response = client.images.edit(
model="stable-diffusion-v1-5/stable-diffusion-inpainting",
image=open("outpaint_input.png", "rb"),
mask=open("outpaint_mask.png", "rb"),
prompt="a cat sitting on a bench in a park",
extra_body={
"num_inference_steps": 50,
"size": "768x768"
}
)
base64_image = response.data[0].b64_json

image_data = base64.b64decode(base64_image)
image = Image.open(BytesIO(image_data))
image.save('outpaint_output.png')
```

### Using dedicated inpainting models

For best inpainting/outpainting quality, use a dedicated inpainting model. These models have a 9-channel UNet specifically trained for masked generation.

Example models for inpainting:
- `stable-diffusion-v1-5/stable-diffusion-inpainting` — SD 1.5 based, 512×512 native resolution
- `diffusers/stable-diffusion-xl-1.0-inpainting-0.1` — SDXL based, 1024×1024 native resolution

For the full list see [supported image generation models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).
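The 9-channel figure comes from what the inpainting UNet consumes at each denoising step: the noisy latents, the downsampled mask, and the latents of the masked source image are concatenated along the channel axis. A shape-only sketch (the 64×64 latent resolution is the usual Stable Diffusion convention, assumed here for illustration):

```python
# Per-step UNet input for inpainting pipelines (channels-first shapes):
noisy_latents = (4, 64, 64)  # regular denoising latents
mask          = (1, 64, 64)  # inpainting mask, downsampled to latent size
masked_image  = (4, 64, 64)  # VAE latents of the source image with the hole

# The three tensors are concatenated along the channel axis:
in_channels = noisy_latents[0] + mask[0] + masked_image[0]
print(in_channels)  # 4 + 1 + 4 = 9, hence the "9-channel" UNet
```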

> **Note:** Dedicated inpainting models only expose the `images/edits` endpoint (with mask). Text-to-image and image-to-image requests will return an error indicating the pipeline is not available for this model. Base models (e.g. `stable-diffusion-v1-5/stable-diffusion-v1-5`) support all endpoints including inpainting.

::::{tab-set}
:::{tab-item} Docker (Linux) — GPU
:sync: docker-gpu
```bash
mkdir -p models

docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models/:rw \
--user $(id -u):$(id -g) --device /dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy \
openvino/model_server:latest-gpu \
--rest_port 8000 \
--model_repository_path /models/ \
--task image_generation \
--source_model stable-diffusion-v1-5/stable-diffusion-inpainting \
--weight-format int8 \
--target_device GPU
```
:::

:::{tab-item} Bare metal (Windows)
:sync: bare-metal
```bat
mkdir models

ovms --rest_port 8000 ^
--model_repository_path ./models/ ^
--task image_generation ^
--source_model stable-diffusion-v1-5/stable-diffusion-inpainting ^
--weight-format int8 ^
--target_device GPU
```
:::

::::


### Strength influence on the final image

![strength](./strength.png)
Binary file added demos/image_generation/cat.png
Binary file added demos/image_generation/cat_mask.png
Binary file added demos/image_generation/inpaint_output.png
Binary file added demos/image_generation/outpaint_input.png
Binary file added demos/image_generation/outpaint_mask.png
Binary file added demos/image_generation/outpaint_output.png
2 changes: 1 addition & 1 deletion docs/model_server_rest_api_image_edit.md
@@ -53,7 +53,7 @@ curl -X POST http://localhost:8000/v3/images/edits \
|-----|----------|----------|---------|-----|
| model | ✅ | ✅ | string (required) | Name of the model to use. Name assigned to a MediaPipe graph configured to schedule generation using the desired image generation model. **Note**: This can also be omitted to fall back to URI based routing. Read more on routing topic **TODO** |
| image | ⚠️ | ✅ | string or array of strings (required) | The image to edit. Must be a single image (⚠️**Note**: Array of strings is not supported for now.) |
| mask | | ✅ | string | Triggers inpainting pipeline. An additional image where white pixels mark the area to repaint. Send as a multipart file field alongside `image`. |
| prompt | ✅ | ✅ | string (required) | A text description of the desired image(s). |
| size | ✅ | ✅ | string or null (default: auto) | The size of the generated images. Must be in WxH format, example: `1024x768`. Default model W/H will be used when using `auto`. |
| n | ✅ | ✅ | integer or null (default: `1`) | A number of images to generate. If you want to generate multiple images for the same combination of generation parameters and text prompts, you can use this parameter for better performance as internally computations will be performed with batch for Unet / Transformer models and text embeddings tensors will also be computed only once. |
11 changes: 11 additions & 0 deletions src/http_frontend/multi_part_parser_drogon_impl.cpp
@@ -50,6 +50,17 @@ std::string_view DrogonMultiPartParser::getFileContentByFieldName(const std::str
return it->second.fileContent();
}

std::vector<std::string_view> DrogonMultiPartParser::getFilesArrayByFieldName(const std::string& name) const {
const std::vector<drogon::HttpFile>& files = this->parser->getFiles();
std::vector<std::string_view> result;
for (const drogon::HttpFile& file : files) {
if (file.getItemName() == name) {
result.push_back(file.fileContent());
}
}
return result;
}

std::set<std::string> DrogonMultiPartParser::getAllFieldNames() const {
std::set<std::string> fieldNames;
auto fileMap = this->parser->getFilesMap();
1 change: 1 addition & 0 deletions src/http_frontend/multi_part_parser_drogon_impl.hpp
@@ -47,6 +47,7 @@ class DrogonMultiPartParser : public MultiPartParser {
std::string getFieldByName(const std::string& name) const override;
std::vector<std::string> getArrayFieldByName(const std::string& name) const override;
std::string_view getFileContentByFieldName(const std::string& name) const override;
std::vector<std::string_view> getFilesArrayByFieldName(const std::string& name) const override;
std::set<std::string> getAllFieldNames() const override;
};

2 changes: 1 addition & 1 deletion src/image_conversion.cpp
@@ -74,7 +74,7 @@ ov::Tensor loadImageStbi(unsigned char* image, const int x, const int y, const i
SharedImageAllocator(image, desiredChannels, y, x));
}

ov::Tensor loadImageStbiFromMemory(std::string_view imageBytes) {
int x = 0, y = 0, channelsInFile = 0;
constexpr int desiredChannels = 3;
unsigned char* image = stbi_load_from_memory(
3 changes: 2 additions & 1 deletion src/image_conversion.hpp
@@ -16,14 +16,15 @@
#pragma once

#include <string>
#include <string_view>
#include <vector>

#include <openvino/runtime/tensor.hpp>

namespace ovms {

ov::Tensor loadImageStbi(unsigned char* image, const int x, const int y, const int desiredChannels);
ov::Tensor loadImageStbiFromMemory(std::string_view imageBytes);
ov::Tensor loadImageStbiFromFile(const char* filename);
std::vector<std::string> saveImagesStbi(const ov::Tensor& tensor);

1 change: 1 addition & 0 deletions src/image_gen/BUILD
@@ -24,6 +24,7 @@ ovms_cc_library(
deps = [
"imagegenpipelineargs",
"//src:libovmslogging",
"//src:libovms_queue",
"//src:libovmsstring_utils",
"//third_party:genai",],
visibility = ["//visibility:public"],