mflux-generate-z-image-turbo takes twice the VRAM at the end of generation #311

@vladyuhuu

First — thanks to filipstrand and contributors!

Question: is this expected? With mflux-generate-z-image-turbo (--model filipstrand/Z-Image-Turbo-mflux-4bit), generation uses about 9–10 GB of VRAM, but at the end it consumes much more, I'd say 1.5 to 2 times as much.
I'm using a MacBook Pro with an M1 Pro.
[Image: memory graph in Activity Monitor]

Also, I couldn't find whether mflux supports Lightning LoRAs. Does it?
One more question: right now, the model is loaded into and unloaded from VRAM for every prompt. Is it possible to keep the model loaded after generation instead of unloading it automatically, or is there maybe a UI that can do this (like LM Studio for mlx-lm)?
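
To make the last question concrete, here is a rough sketch of the "load once, reuse for many prompts" behaviour I have in mind. It does not use mflux's API: `StandInModel`, `get_model`, and `generate_many` are hypothetical names made up for illustration, and the "model" only returns a string instead of an image.

```python
from functools import lru_cache


class StandInModel:
    """Hypothetical placeholder for an expensive-to-load image model (not mflux's API)."""

    load_count = 0  # counts how many times the weights were "loaded"

    def __init__(self, name: str) -> None:
        StandInModel.load_count += 1
        self.name = name

    def generate(self, prompt: str, seed: int = 0) -> str:
        # A real model would return an image; this returns a label instead.
        return f"{self.name}:{seed}:{prompt}"


@lru_cache(maxsize=1)
def get_model(name: str) -> StandInModel:
    # The first call constructs the model; later calls with the same name reuse it.
    return StandInModel(name)


def generate_many(name: str, prompts: list[str]) -> list[str]:
    # All prompts share one resident model instead of reloading per prompt.
    model = get_model(name)
    return [model.generate(p) for p in prompts]
```

With something like this, calling `generate_many` repeatedly in the same process would construct the model only once, which is the behaviour I'm hoping a long-running mode or UI could provide.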
