Merged
Changes from 5 commits
18 changes: 17 additions & 1 deletion .dockerignore
Original file line number Diff line number Diff line change
@@ -1,7 +1,23 @@
.venv
.venv-style
**/.venv
.pytest_cache
.devcontainer
.vscode
.vs
.idea
.gdb_history
out
bazel-bin
bazel-model_server/
bazel-openvino-model-server/
bazel-out
bazel-ovms
bazel-ovms-c
bazel-testlogs
demos/continuous_batching
demos/embeddings
demos/common/export_models/models
*.log
*.img
models
4 changes: 2 additions & 2 deletions .gitignore
@@ -11,6 +11,8 @@ __pycache__/
report.json
trace.json
bazel-bin
bazel-model_server/
bazel-openvino-model-server/
bazel-out
bazel-ovms
bazel-ovms-c
@@ -28,8 +30,6 @@ tags
src/test/llm_testing
node_modules/
yarn.*
bazel-openvino-model-server/
bazel-model_server/
out
.user.bazelrc
*.log
206 changes: 201 additions & 5 deletions demos/image_generation/README.md
@@ -3,6 +3,12 @@
This demo shows how to deploy image generation models (Stable Diffusion/Stable Diffusion 3/Stable Diffusion XL/FLUX) to create and edit images with the OpenVINO Model Server.
Image generation pipelines are exposed via [OpenAI API](https://platform.openai.com/docs/api-reference/images/create) `images/generations` and `images/edits` endpoints.

Supported workloads:
- **Text-to-image** — generate an image from a text prompt (`/v3/images/generations`)
- **Image-to-image** — transform an existing image guided by a prompt (`/v3/images/edits`)
- **Inpainting** — repaint a masked region of an image (`/v3/images/edits` with `mask` field)
- **Outpainting** — extend an image beyond its original borders (`/v3/images/edits` with `mask` field and larger canvas)
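
For quick reference, the workload-to-endpoint mapping above can be sketched as a small lookup table. The dictionary itself is illustrative (not part of the server API); the `fields` entries mirror the request examples later in this demo:

```python
# Illustrative mapping of image generation workloads to OVMS endpoints
# and the request fields each one requires (see the examples below).
WORKLOADS = {
    "text-to-image":  {"endpoint": "/v3/images/generations", "fields": ("prompt",)},
    "image-to-image": {"endpoint": "/v3/images/edits",       "fields": ("prompt", "image")},
    "inpainting":     {"endpoint": "/v3/images/edits",       "fields": ("prompt", "image", "mask")},
    "outpainting":    {"endpoint": "/v3/images/edits",       "fields": ("prompt", "image", "mask")},
}
```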

Check [supported models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).

> **Note:** FLUX models are not supported on NPU.
@@ -387,12 +393,13 @@ curl http://localhost:8000/v1/config

## Request Generation

A single servable exposes following endpoints:
- text to image: `images/generations`
- image to image: `images/edits`
A single servable exposes the following endpoints:
- **Text-to-image**: `images/generations` — JSON body with `prompt`
- **Image-to-image**: `images/edits` — multipart form with `image` + `prompt` (no mask)
- **Inpainting**: `images/edits` — multipart form with `image` + `mask` + `prompt`
- **Outpainting**: `images/edits` — multipart form with `image` + `mask` + `prompt` (image placed on larger canvas, mask marks the area to fill)

Endpoints unsupported for now:
- inpainting: `images/edits` with `mask` field
> **Note:** For inpainting/outpainting, dedicated inpainting models (e.g. `stable-diffusion-v1-5/stable-diffusion-inpainting`) only support the `images/edits` endpoint. Base models (e.g. `stable-diffusion-v1-5/stable-diffusion-v1-5`) support all endpoints.

All requests are processed in unary format, with no streaming capabilities.

@@ -519,6 +526,195 @@ image.save('edit_output.png')
Output file (`edit_output.png`):
![edit_output](./edit_output.png)

### Requesting inpainting with cURL

Inpainting replaces a masked region in an image based on the prompt. The `mask` is a black-and-white image where white pixels mark the area to repaint.
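
A mask like `cat_mask.png` can be produced with a few lines of Pillow. This is a minimal sketch; the rectangle coordinates are illustrative, so adjust them to cover the region of your input image you want repainted:

```python
from PIL import Image, ImageDraw

# Black 512x512 canvas: 0 = keep original pixels, 255 = repaint.
mask = Image.new("L", (512, 512), 0)
draw = ImageDraw.Draw(mask)
# Illustrative repaint region; match it to your input image.
draw.rectangle([128, 160, 384, 480], fill=255)
mask.save("cat_mask.png")
```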

![cat](./cat.png) ![cat_mask](./cat_mask.png)

Linux

```bash
curl http://localhost:8000/v3/images/edits \
-F "model=OpenVINO/stable-diffusion-v1-5-int8-ov" \
-F "prompt=a dalmatian puppy sitting on a bench in a park, photorealistic" \
-F "image=@cat.png" \
-F "mask=@cat_mask.png" \
-F "num_inference_steps=50" \
-F "size=512x512" | jq -r '.data[0].b64_json' | base64 --decode > inpaint_output.png
```

Windows Command Prompt
```bat
curl http://localhost:8000/v3/images/edits ^
-F "model=OpenVINO/stable-diffusion-v1-5-int8-ov" ^
-F "prompt=a dalmatian puppy sitting on a bench in a park, photorealistic" ^
-F "image=@cat.png" ^
-F "mask=@cat_mask.png" ^
-F "num_inference_steps=50" ^
-F "size=512x512"
```

Expected output (`inpaint_output.png`):

![inpaint_output](./inpaint_output.png)

### Requesting inpainting with OpenAI Python package

```python
from openai import OpenAI
import base64
from io import BytesIO
from PIL import Image

client = OpenAI(
base_url="http://localhost:8000/v3",
api_key="unused"
)

response = client.images.edit(
model="OpenVINO/stable-diffusion-v1-5-int8-ov",
image=open("cat.png", "rb"),
mask=open("cat_mask.png", "rb"),
prompt="a dalmatian puppy sitting on a bench in a park, photorealistic",
extra_body={
"num_inference_steps": 50,
"size": "512x512"
}
)
base64_image = response.data[0].b64_json

image_data = base64.b64decode(base64_image)
image = Image.open(BytesIO(image_data))
image.save('inpaint_output.png')
```

### Requesting outpainting with cURL

Outpainting extends an image beyond its original borders. Prepare two images:
- **outpaint_input.png** — the original image centered on a larger canvas (e.g. 768×768) with black borders
- **outpaint_mask.png** — white where the new content should be generated (the borders), black where the original image is
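
The two input files above can be prepared with Pillow. The sketch below centers a source image on a black 768×768 canvas and builds the matching border mask; the synthetic gray image stands in for your real input (e.g. `cat.png`):

```python
from PIL import Image

def make_outpaint_pair(src: Image.Image, canvas_size=(768, 768)):
    """Center `src` on a black canvas and build the matching mask
    (white = area for the model to fill, black = original pixels)."""
    canvas = Image.new("RGB", canvas_size, (0, 0, 0))
    off = ((canvas_size[0] - src.width) // 2, (canvas_size[1] - src.height) // 2)
    canvas.paste(src.convert("RGB"), off)
    mask = Image.new("L", canvas_size, 255)
    mask.paste(0, (off[0], off[1], off[0] + src.width, off[1] + src.height))
    return canvas, mask

# Synthetic 512x512 image standing in for a real input such as cat.png:
src = Image.new("RGB", (512, 512), (200, 180, 160))
canvas, mask = make_outpaint_pair(src)
canvas.save("outpaint_input.png")
mask.save("outpaint_mask.png")
```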

![outpaint_input](./outpaint_input.png) ![outpaint_mask](./outpaint_mask.png)

Linux
```bash
curl http://localhost:8000/v3/images/edits \
-F "model=OpenVINO/stable-diffusion-v1-5-int8-ov" \
-F "prompt=a cat sitting on a bench in a park, photorealistic" \
-F "image=@outpaint_input.png" \
-F "mask=@outpaint_mask.png" \
-F "num_inference_steps=50" \
-F "size=768x768" | jq -r '.data[0].b64_json' | base64 --decode > outpaint_output.png
```

Windows Command Prompt
```bat
curl http://localhost:8000/v3/images/edits ^
-F "model=OpenVINO/stable-diffusion-v1-5-int8-ov" ^
-F "prompt=a cat sitting on a bench in a park, photorealistic" ^
-F "image=@outpaint_input.png" ^
-F "mask=@outpaint_mask.png" ^
-F "num_inference_steps=50" ^
-F "size=768x768"
```

Expected output (`outpaint_output.png`):

![outpaint_output](./outpaint_output.png)

### Requesting outpainting with OpenAI Python package

```python
from openai import OpenAI
import base64
from io import BytesIO
from PIL import Image

client = OpenAI(
base_url="http://localhost:8000/v3",
api_key="unused"
)

response = client.images.edit(
model="OpenVINO/stable-diffusion-v1-5-int8-ov",
image=open("outpaint_input.png", "rb"),
mask=open("outpaint_mask.png", "rb"),
prompt="a cat sitting on a bench in a park, photorealistic",
extra_body={
"num_inference_steps": 50,
"size": "768x768"
}
)
base64_image = response.data[0].b64_json

image_data = base64.b64decode(base64_image)
image = Image.open(BytesIO(image_data))
image.save('outpaint_output.png')
```

### Using dedicated inpainting models

For best inpainting/outpainting quality, use a dedicated inpainting model. These models have a 9-channel UNet specifically trained for masked generation.

Example models for inpainting:
- `stable-diffusion-v1-5/stable-diffusion-inpainting` — SD 1.5 based, 512×512 native resolution
- `diffusers/stable-diffusion-xl-1.0-inpainting-0.1` — SDXL based, 1024×1024 native resolution

For the full list see [supported image generation models](https://openvinotoolkit.github.io/openvino.genai/docs/supported-models/#image-generation-models).

> **Note:** Dedicated inpainting models only expose the `images/edits` endpoint (with mask). Text-to-image and image-to-image requests will return an error indicating the pipeline is not available for this model. Base models (e.g. `stable-diffusion-v1-5/stable-diffusion-v1-5`) support all endpoints including inpainting.

::::{tab-set}
:::{tab-item} Docker (Linux) — CPU
:sync: docker
```bash
mkdir -p models

docker run -d --rm --user $(id -u):$(id -g) -p 8000:8000 -v $(pwd)/models:/models/:rw \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy \
openvino/model_server:latest \
--rest_port 8000 \
--model_repository_path /models/ \
--task image_generation \
--source_model stable-diffusion-v1-5/stable-diffusion-inpainting \
--weight-format int8
```
:::

:::{tab-item} Docker (Linux) — GPU
:sync: docker-gpu
```bash
mkdir -p models

docker run -d --rm -p 8000:8000 -v $(pwd)/models:/models/:rw \
--user $(id -u):$(id -g) --device /dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) \
-e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy \
openvino/model_server:latest-gpu \
--rest_port 8000 \
--model_repository_path /models/ \
--task image_generation \
--source_model stable-diffusion-v1-5/stable-diffusion-inpainting \
--weight-format int8 \
--target_device GPU
```
:::

:::{tab-item} Bare metal (Windows)
:sync: bare-metal
```bat
mkdir models

ovms --rest_port 8000 ^
--model_repository_path ./models/ ^
--task image_generation ^
--source_model stable-diffusion-v1-5/stable-diffusion-inpainting ^
--weight-format int8
```
:::

::::


### Strength influence on final image

![strength](./strength.png)
Binary file added demos/image_generation/cat.png
Binary file added demos/image_generation/cat_mask.png
Binary file added demos/image_generation/inpaint_output.png
Binary file added demos/image_generation/outpaint_input.png
Binary file added demos/image_generation/outpaint_mask.png
Binary file added demos/image_generation/outpaint_output.png
11 changes: 11 additions & 0 deletions src/http_frontend/multi_part_parser_drogon_impl.cpp
@@ -50,6 +50,17 @@ std::string_view DrogonMultiPartParser::getFileContentByFieldName(const std::str
return it->second.fileContent();
}

std::vector<std::string_view> DrogonMultiPartParser::getFilesArrayByFieldName(const std::string& name) const {
const std::vector<drogon::HttpFile>& files = this->parser->getFiles();
std::vector<std::string_view> result;
for (const drogon::HttpFile& file : files) {
if (file.getItemName() == name) {
result.push_back(file.fileContent());
}
}
return result;
}

std::set<std::string> DrogonMultiPartParser::getAllFieldNames() const {
std::set<std::string> fieldNames;
auto fileMap = this->parser->getFilesMap();
1 change: 1 addition & 0 deletions src/http_frontend/multi_part_parser_drogon_impl.hpp
@@ -47,6 +47,7 @@ class DrogonMultiPartParser : public MultiPartParser {
std::string getFieldByName(const std::string& name) const override;
std::vector<std::string> getArrayFieldByName(const std::string& name) const override;
std::string_view getFileContentByFieldName(const std::string& name) const override;
std::vector<std::string_view> getFilesArrayByFieldName(const std::string& name) const override;
std::set<std::string> getAllFieldNames() const override;
};

58 changes: 53 additions & 5 deletions src/image_gen/http_image_gen_calculator.cc
@@ -49,6 +49,7 @@ static bool progress_bar(size_t step, size_t num_steps, ov::Tensor&) {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Image Generation Step: {}/{}", step + 1, num_steps);
return false;
}

// written out separately to avoid msvc crashing when using try-catch in process method ...
static absl::Status generateTensor(ov::genai::Text2ImagePipeline& request,
const std::string& prompt, ov::AnyMap& requestOptions,
@@ -94,6 +95,28 @@ static absl::Status generateTensorImg2Img(ov::genai::Image2ImagePipeline& reques
return absl::OkStatus();
}
// written out separately to avoid msvc crashing when using try-catch in process method ...
static absl::Status generateTensorInpainting(ov::genai::InpaintingPipeline& request,
const std::string& prompt, ov::Tensor image, ov::Tensor mask, ov::AnyMap& requestOptions,
std::unique_ptr<ov::Tensor>& images) {
try {
requestOptions.insert(ov::genai::callback(progress_bar));
images = std::make_unique<ov::Tensor>(request.generate(prompt, image, mask, requestOptions));
auto dims = images->get_shape();
std::stringstream ss;
for (const auto& dim : dims) {
ss << dim << " ";
}
ss << " element type: " << images->get_element_type().get_type_name();
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "ImageGenCalculator generated inpainting tensor: {}", ss.str());
} catch (const std::exception& e) {
SPDLOG_LOGGER_ERROR(llm_calculator_logger, "ImageGenCalculator Inpainting Error: {}", e.what());
return absl::InternalError("Error during inpainting generation");
} catch (...) {
return absl::InternalError("Unknown error during inpainting generation");
}
return absl::OkStatus();
}
// written out separately to avoid msvc crashing when using try-catch in process method ...
static absl::Status makeTensorFromString(const std::string& filePayload, ov::Tensor& imageTensor) {
try {
imageTensor = loadImageStbiFromMemory(filePayload);
@@ -140,10 +163,12 @@ class ImageGenCalculator : public CalculatorBase {
auto pipe = it->second;

auto payload = cc->Inputs().Tag(INPUT_TAG_NAME).Get<ovms::HttpPayload>();
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "ImageGenCalculator [Node: {}] Request URI: {}", cc->NodeName(), payload.uri);

std::unique_ptr<ov::Tensor> images; // output

if (absl::StartsWith(payload.uri, "/v3/images/generations")) {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "ImageGenCalculator [Node: {}] Routed to image generations path", cc->NodeName());
if (payload.parsedJson->HasParseError())
return absl::InvalidArgumentError("Failed to parse JSON");

@@ -154,13 +179,15 @@
SET_OR_RETURN(std::string, prompt, getPromptField(*payload.parsedJson));
SET_OR_RETURN(ov::AnyMap, requestOptions, getImageGenerationRequestOptions(*payload.parsedJson, pipe->args));

ov::genai::Text2ImagePipeline request = pipe->text2ImagePipeline->clone();

auto status = generateTensor(request, prompt, requestOptions, images);
// single request assumption - use pipeline instance directly
if (!pipe->text2ImagePipeline)
return absl::FailedPreconditionError("Text-to-image pipeline is not available for this model");
auto status = generateTensor(*pipe->text2ImagePipeline, prompt, requestOptions, images);
if (!status.ok()) {
return status;
}
} else if (absl::StartsWith(payload.uri, "/v3/images/edits")) {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "ImageGenCalculator [Node: {}] Routed to image edits path", cc->NodeName());
if (payload.multipartParser->hasParseError())
return absl::InvalidArgumentError("Failed to parse multipart data");

@@ -176,8 +203,29 @@

SET_OR_RETURN(ov::AnyMap, requestOptions, getImageEditRequestOptions(*payload.multipartParser, pipe->args));

ov::genai::Image2ImagePipeline request = pipe->image2ImagePipeline->clone();
status = generateTensorImg2Img(request, prompt, imageTensor, requestOptions, images);
SET_OR_RETURN(std::optional<std::string_view>, mask, getFileFromPayload(*payload.multipartParser, "mask"));
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "ImageGenCalculator [Node: {}] Mask present: {}", cc->NodeName(), mask.has_value() && !mask.value().empty());

if (mask.has_value() && !mask.value().empty()) {
if (!pipe->inpaintingPipeline)
return absl::FailedPreconditionError("Inpainting pipeline is not available for this model");
// Inpainting path — uses the pre-built InpaintingPipeline that was loaded from disk
// during initialization. Do NOT derive InpaintingPipeline from Image2ImagePipeline
// at request time — that derivation direction causes a SEGFAULT in GenAI.
ov::Tensor maskTensor;
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "ImageGenCalculator [Node: {}] Inpainting: decoding mask tensor", cc->NodeName());
status = makeTensorFromString(std::string(mask.value()), maskTensor);
if (!status.ok()) {
return status;
}
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "ImageGenCalculator [Node: {}] Inpainting: mask tensor decoded, invoking generate()", cc->NodeName());
status = generateTensorInpainting(*pipe->inpaintingPipeline, prompt, imageTensor, maskTensor, requestOptions, images);
} else {
if (!pipe->image2ImagePipeline)
return absl::FailedPreconditionError("Image-to-image pipeline is not available for this model");
// image-to-image path - single pipeline instance, no clone needed
status = generateTensorImg2Img(*pipe->image2ImagePipeline, prompt, imageTensor, requestOptions, images);
}
if (!status.ok()) {
return status;
}