[AI] AI object mask tool #20378

Open
andriiryzhkov wants to merge 20 commits into darktable-org:master from andriiryzhkov:split/ai-object-mask

@andriiryzhkov
Contributor

andriiryzhkov commented Feb 21, 2026

Adds a new mask tool that lets users select objects in the image by clicking. Built on the AI subsystem from #20322.

How it works

AI object mask is an interactive single-object selection tool. The user activates the object mask tool, waits for the image to be encoded (this runs in a background thread), then clicks to place foreground/background point prompts. The model segments the object in real time. A right-click finalizes the selection by vectorizing the raster mask into Bézier path forms that integrate with darktable's existing mask system.
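
To illustrate how the clicks could be collected before they reach the decoder, here is a minimal Python sketch. It is not the code from this PR (which is written in C): `PromptSession`, `add_click`, and `decoder_inputs` are hypothetical names, and normalizing coordinates to the [0, 1] range is an assumption made for the example. Only the label values (1 for foreground, 0 for background) come from the description.

```python
from dataclasses import dataclass, field

FOREGROUND = 1  # label used for foreground clicks, per the PR description
BACKGROUND = 0  # label used for background clicks, per the PR description


@dataclass
class PromptSession:
    """Hypothetical container for the clicks placed on one encoded image."""
    points: list = field(default_factory=list)  # (x, y) in image pixels
    labels: list = field(default_factory=list)  # FOREGROUND or BACKGROUND

    def add_click(self, x, y, foreground=True):
        """Record one click and its label."""
        self.points.append((float(x), float(y)))
        self.labels.append(FOREGROUND if foreground else BACKGROUND)

    def decoder_inputs(self, width, height):
        """Return normalized coordinates and labels.

        Normalizing to [0, 1] is an assumption of this sketch; a real
        decoder may expect a different coordinate space.
        """
        coords = [(x / width, y / height) for x, y in self.points]
        return coords, list(self.labels)
```

Every new click would append to the same session, so the decoder always sees the full list of prompts for the current object.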

Architecture

  • Segmentation engine (src/ai/segmentation.c): Implements the two-stage encoder/decoder pipeline. Supports both SAM2.1 (multi-mask + IoU selection + low-res refinement) and SegNext (single mask, full-res refinement). Encoder outputs are cached so multiple clicks don't re-encode.

  • Object mask tool (src/develop/masks/object.c): Runs image encoding in a background thread to keep the UI responsive. Displays a "working..." overlay during encoding. Supports foreground clicks (label 1), background clicks (label 0), and box prompts (SAM only).

  • Raster-to-vector (src/common/ras2vect.c): Extended with cleanup (turdsize), smoothing (alphamax), and boundary sign output for hole detection.
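
The encoder caching and the multi-mask + IoU selection mentioned for the segmentation engine can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the implementation in src/ai/segmentation.c: `CachedSegmenter`, `set_image`, and `segment` are hypothetical names, and `encode` / `decode` are stand-in callables for the two model stages.

```python
class CachedSegmenter:
    """Hypothetical wrapper: encode once per image, decode once per click."""

    def __init__(self, encode, decode):
        self._encode = encode      # stand-in for the (slow) image encoder
        self._decode = decode      # stand-in for the (fast) prompt decoder
        self._key = None           # identifies the currently encoded image
        self._embedding = None     # cached encoder output
        self.encode_calls = 0      # counter, only to make the caching visible

    def set_image(self, key, image):
        """Run the encoder only when the image actually changes."""
        if key != self._key:
            self._embedding = self._encode(image)
            self._key = key
            self.encode_calls += 1

    def segment(self, points, labels):
        """Decode with the cached embedding and keep the best-scored mask."""
        if self._embedding is None:
            raise RuntimeError("set_image() must be called first")
        masks, iou_scores = self._decode(self._embedding, points, labels)
        # Multi-mask output: pick the candidate with the highest predicted IoU.
        best = max(range(len(masks)), key=lambda i: iou_scores[i])
        return masks[best], iou_scores[best]
```

With this split, adding or removing a point prompt only costs a decoder pass, which is what keeps the interaction responsive after the initial encoding.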

Models

The segmentation engine supports both SegNext and SAM model architectures. SegNext is the default — it produces good enough results and is compliant with the Open Source AI Definition. Models are downloaded on demand by the AI subsystem from the model repository: https://github.com/andriiryzhkov/darktable-ai

Depends on #20322
Fixes #12295

@TurboGit
Member

Thanks for this new implementation. I'll test soon.

@TurboGit
Member

@andriiryzhkov : I have created the darktable-org/darktable-ai repository. You should be able to clone this repository and create a PR to initialize it. If needed I can initialize it with the current content of your darktable-ai repo.

TurboGit added this to the 5.6 milestone Feb 22, 2026
TurboGit added the following labels Feb 22, 2026:
  • priority: low (core features work as expected, only secondary/optional features don't)
  • feature: new (new features to add)
  • difficulty: hard (big changes across different parts of the code base)
  • scope: image processing (correcting pixels)

@andriiryzhkov
Contributor Author

@TurboGit I will create a PR with the models that are ready. Some of my models are only for upcoming experiments, so I will keep those separate.

@TurboGit
Member

Sounds good to me.

@andriiryzhkov
Contributor Author

@TurboGit could you initialize the darktable-org/darktable-ai repository with an empty file so that I can fork it?

@TurboGit
Member

@andriiryzhkov : Done.

@andriiryzhkov
Contributor Author

Done: darktable-org/darktable-ai#1

@TurboGit
Member

Merged.

@TurboGit
Member

@andriiryzhkov : How do I test this? Every time I try, I get a message that the model is not available.

@andriiryzhkov
Contributor Author

@TurboGit: Oh, that's because of the repository reorganization I did yesterday. I renamed my original repo to dt-ai. Anyway, I created a release with packaged models on my fork of darktable-ai, so the download should work now.

Regarding testing, I just want to warn you that SegNext runs somewhat slower (it is simply a bigger model) and produces far too many paths at the end. I am testing it too and tweaking the post-processing parameters to get rid of unwanted artefacts.

@Donatzsky

Is this compatible with more "classic", un-prompted, approaches using CNNs trained to only mask specific classes (person, dog, sky etc.) of objects?

@andriiryzhkov
Contributor Author

@Donatzsky The AI subsystem is compatible with "classic" segmentation models and can easily handle them. However, the current PR does not include such functionality for masking. That is something for further consideration.
