[FR] Prompt Enhancer - API #41

@fszontagh

As a first step, I plan to implement only an Ollama / OpenAI REST API client for this feature.

For Ollama, I found an interesting model: https://ollama.com/brxce/stable-diffusion-prompt-generator

It gives some interesting results:

ollama run brxce/stable-diffusion-prompt-generator

REALISTIC

Ollama prompt:
>>> epic macro photo of a red rose, water drops, wallpaper, dynamic lights

Model's answer:
epic macro photo of a red rose, water drops, wallpaper, dynamic lights, by ilya kuvshinov and takashi kitamura and artem demura, trending on artstation, very detailed

Final prompt:
epic macro photo of a red rose, water drops, wallpaper, dynamic lights, by ilya kuvshinov and takashi kitamura and artem demura, trending on artstation, very detailed <lora:more_details:0.8>

Final negative prompt:
(worst quality, low quality:1.4, letterboxed), lowres, loli, child, bokeh, text, signature, sketch, watermark, artist name, speech bubble, blurry, pubic hair, pubes, (mole, mole under eye, mole on breast)

Image

schedule type: Karras
steps: 30
model hash: 4fe4ab5ef0
seed: 595707153
operations: txt2img
backend: stable-diffusion.cpp (ver.: fbd42b6 using cuda)
size: 768x512
scheduler: karras
cfg scale: 7.00
app: StableDiffusionGUI 0.2.7 3aab399
parser: stable-diffusion.cpp
sampler: Automatic
model: opiate_V50

ANIME

Ollama prompt:
>>> epic macro photo of a red rose, water drops, wallpaper, dynamic lights, anime

Model's answer:
epic macro photo of a red rose, water drops, wallpaper, dynamic lights, anime, trending on artstation, by greg rutkowski and takashi takahashi and artem demura, very detailed, 8k resolution

Final prompt:
epic macro photo of a red rose, water drops, wallpaper, dynamic lights, anime, trending on artstation, by greg rutkowski and takashi takahashi and artem demura, very detailed, 8k resolution

Final negative prompt:
(worst quality, low quality:1.4, letterboxed), lowres, loli, child, bokeh, text, signature, sketch, watermark, artist name, speech bubble, blurry, pubic hair, pubes, (mole, mole under eye, mole on breast)

Image

schedule type: Karras
steps: 30
model hash: da52bf6da0
seed: 778953140
operations: txt2img
backend: stable-diffusion.cpp (ver.: fbd42b6 using cuda)
size: 768x512
scheduler: karras
cfg scale: 7.00
app: StableDiffusionGUI 0.2.7 3aab399
parser: stable-diffusion.cpp
sampler: Automatic
model: mistoonAnime_v30

So the anime result is not the best, but that may just depend on what the SD model knows. Both are SD1.x models, so prompts work differently than with FLUX models, but I think this feature is worth having for these models too.

The plan is:

  • I will add a new tab to the settings window, where the user can configure the REST API settings for an Ollama endpoint.
  • I will add a model selector in the same place. The default and recommended model will be brxce/stable-diffusion-prompt-generator.
  • If the user has configured it and the endpoint works, an "Enhance prompt" button appears near the prompt area. It sends the content of the prompt input to the language model, and the response overwrites the original prompt.
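The three steps above can be sketched roughly as follows. This is only an illustration in Python, not the application's actual code: the function names and the fallback to the original prompt on an empty reply are assumptions made here. The endpoints used are Ollama's `GET /api/tags` (to list installed models for the selector) and `POST /api/generate` with `"stream": false` (to get one complete reply in the `response` field).

```python
import json
import urllib.request

# Default model named in the plan above.
DEFAULT_MODEL = "brxce/stable-diffusion-prompt-generator"


def parse_model_names(tags_body: bytes) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_body)
    return [m["name"] for m in data.get("models", [])]


def build_enhance_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request to Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_enhanced_prompt(generate_body: bytes, original: str) -> str:
    """Return the model's text; keep the original prompt if the reply is empty.

    The fallback is an assumption of this sketch, not part of the plan.
    """
    text = json.loads(generate_body).get("response", "").strip()
    return text or original


def enhance_prompt(base_url: str, model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt to the endpoint and return the text that replaces it."""
    req = build_enhance_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_enhanced_prompt(resp.read(), prompt)
```

With a local Ollama instance on its default port, the button handler would then amount to something like `enhance_prompt("http://localhost:11434", DEFAULT_MODEL, current_prompt)`. Extra tags such as `<lora:more_details:0.8>` are not produced by the model and would still have to be appended afterwards, as in the REALISTIC example.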

Originally posted by @fszontagh in #40

Status

Planning
