[Feature Request]: new attention optimization #39

@Nidal890

Description

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits

What would your feature do?

Wondering if it is possible to incorporate metal-flash-attention into this project. The webui currently uses the MPS backend, while Metal FlashAttention (MFA) is an open-source alternative based on Dao-AILab's FlashAttention v2. It is built mainly for Apple Silicon GPUs and is much faster and less resource-hungry than MPS; I'm not sure how it behaves on AMD Radeon GPUs on macOS, though. I've used it in the Draw Things app and it's really good. I know it can't do fp64 calculations, among other limitations I don't fully understand, but I thought I should share. The linked repo has more information and benchmarks.
https://github.com/philipturner/metal-flash-attention
This article talks about MFA in more detail: https://engineering.drawthings.ai/integrating-metal-flashattention-accelerating-the-heart-of-image-generation-in-the-apple-ecosystem-16a86142eb18
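To illustrate the idea, here is a minimal sketch of how an optional attention backend could slot in behind a runtime availability check, so MFA is used when present and the code falls back to the existing MPS path otherwise. All names here (the registry, `pick_backend`, the availability predicates) are hypothetical illustrations, not webui's or MFA's actual API; in particular, the MFA backend is stubbed as unavailable since there are no official Python bindings assumed here.

```python
# Hypothetical sketch: a registry of attention backends with availability
# predicates, falling back to the current MPS path when the preferred
# backend (metal-flash-attention) cannot be used. Illustrative only.

ATTENTION_BACKENDS = {}

def register_backend(name, supported):
    """Register an attention implementation with an availability check."""
    def wrap(fn):
        ATTENTION_BACKENDS[name] = (supported, fn)
        return fn
    return wrap

# Stubbed as unavailable: stands in for "MFA bindings not installed"
# or an unsupported case (e.g. fp64 inputs, which MFA cannot handle).
@register_backend("metal-flash-attention", supported=lambda: False)
def mfa_attention(q, k, v):
    raise NotImplementedError("placeholder for an MFA kernel call")

# Always-available default, standing in for the existing MPS path.
@register_backend("mps", supported=lambda: True)
def mps_attention(q, k, v):
    return "mps-result"  # placeholder for the real attention output

def pick_backend(preferred="metal-flash-attention"):
    """Return (name, fn) for the preferred backend, else the MPS fallback."""
    supported, fn = ATTENTION_BACKENDS[preferred]
    if supported():
        return preferred, fn
    return "mps", ATTENTION_BACKENDS["mps"][1]
```

With this shape, enabling MFA later would only require flipping its `supported` predicate to a real capability check, leaving every call site unchanged.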

Proposed workflow

  1. Go to ....
  2. Press ....
  3. ...

Additional information

No response


Labels: enhancement (New feature or request)
