[AI] Object mask: better boundaries, more consistent single-click results #20988

Open
andriiryzhkov wants to merge 6 commits into darktable-org:master from andriiryzhkov:mask_refinement_clean
@andriiryzhkov
Collaborator

I collected feedback in the pixls.us testing thread and the two issues people raised most consistently were that the mask boundary didn't track the image well, and that clicking felt hit-and-miss – repeated clicks didn't reliably improve the result. This PR addresses both.

What I changed

  • Joint-bilateral upsampling of the decoder output, using the encoded RGB as a guide. Mask edges now follow image edges instead of being bilinearly blurred when resizing from the encoder's internal resolution.
  • Optional DenseCRF edge refinement behind a new per-mask "refine mask boundary" checkbox. Snaps the soft mask to nearby color edges; especially helps with small notches and bumps the model missed. Adds a few hundred ms per click, off by default.
  • Automatic multi-pass refinement (inspired by SAMRefiner). Instead of relying on users to click more, I re-run the decoder with prompts auto-augmented from the previous mask (distance-transform peak point + bbox for SAM). The augmented prompts are more informative than additional human clicks, so the result is more consistent – one click typically gets you what 3–4 clicks used to. IoU-based early-stop keeps it from doing more work than needed.
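For reviewers who want a mental model of the multi-pass loop, here is a minimal sketch in plain Python. It is not the code in this PR: `decode` is a hypothetical stand-in for the mask decoder, the `iou_stop` threshold is an assumed value, and counting the user's click as pass 1 is also an assumption. It only illustrates the idea of deriving a distance-transform peak point and a bounding box from the previous mask, then stopping once consecutive masks agree.

```python
from typing import Callable, List, Optional, Tuple

Mask = List[List[int]]            # binary mask, mask[y][x] in {0, 1}
Point = Tuple[int, int]           # (x, y)
Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), inclusive
# Hypothetical decoder interface: prompt points plus an optional box -> mask.
Decoder = Callable[[List[Point], Optional[Box]], Mask]


def bbox_of(mask: Mask) -> Optional[Box]:
    """Tight bounding box of the foreground, or None for an empty mask."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return None
    xs, ys = zip(*pts)
    return min(xs), min(ys), max(xs), max(ys)


def interior_peak(mask: Mask) -> Optional[Point]:
    """Foreground pixel farthest from the background (city-block distance).

    Two-pass distance transform; pixels outside the image count as background.
    """
    h, w = len(mask), len(mask[0])
    inf = h + w
    d = [[inf if v else 0 for v in row] for row in mask]
    for y in range(h):                       # forward pass: up and left
        for x in range(w):
            if d[y][x]:
                up = d[y - 1][x] if y else 0
                left = d[y][x - 1] if x else 0
                d[y][x] = min(d[y][x], up + 1, left + 1)
    for y in reversed(range(h)):             # backward pass: down and right
        for x in reversed(range(w)):
            if d[y][x]:
                down = d[y + 1][x] if y < h - 1 else 0
                right = d[y][x + 1] if x < w - 1 else 0
                d[y][x] = min(d[y][x], down + 1, right + 1)
    best, best_d = None, 0
    for y in range(h):
        for x in range(w):
            if d[y][x] > best_d:
                best, best_d = (x, y), d[y][x]
    return best


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two equally sized binary masks."""
    inter = sum(p and q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    union = sum(p or q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return inter / union if union else 1.0


def refine(decode: Decoder, click: Point,
           max_passes: int = 3, iou_stop: float = 0.95) -> Tuple[Mask, int]:
    """Re-run the decoder with prompts derived from the previous mask."""
    mask = decode([click], None)             # pass 1: the user's click only
    passes = 1
    while passes < max_passes:
        peak, box = interior_peak(mask), bbox_of(mask)
        if peak is None:                     # empty mask: nothing to augment
            break
        new_mask = decode([click, peak], box)
        passes += 1
        converged = iou(mask, new_mask) >= iou_stop
        mask = new_mask
        if converged:                        # early stop: masks agree
            break
    return mask, passes
```

With a decoder whose output is already stable, `refine` returns after the second pass regardless of `max_passes`, which is the "no more work than needed" behavior described above.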

What users will notice

  • New "refine mask boundary" checkbox in the mask manager. The user's choice is saved to config and remembered for future masks; can be toggled per-mask at any time. I also registered it with the actions system so it's shortcut-bindable.
  • I raised the refine_passes default from 1 → 2 and lowered the max from 5 → 3.
  • render_size default 1024 → 1536 (the CRF needs more spatial detail than the encoder's internal 1024). I tested this at length; quality/speed feels right at 1536.
  • smoothing default changed from 1.0 → 0.0 – only affects new users' first vectorization output, but worth flagging in release notes.

Note for testers and existing users

Several default values changed in this PR. Existing darktablerc entries override defaults, so users who tested earlier nightlies won't pick up the new defaults automatically. To get the new behavior, delete all object-mask conf keys before launching darktable.
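To make the override behavior concrete, here is a small illustrative sketch. It is not darktable's conf code: the dictionary lookup is an assumed simplification, and the short key names are the ones used in this description rather than the full darktablerc paths.

```python
# Defaults introduced by this PR (values taken from the list above).
NEW_DEFAULTS = {"refine_passes": 2, "render_size": 1536, "smoothing": 0.0}


def effective(stored: dict, defaults: dict) -> dict:
    """A stored entry wins; the default is used only when no entry exists."""
    return {key: stored.get(key, default) for key, default in defaults.items()}


# Hypothetical profile that ran an earlier nightly and has old values stored.
old_profile = {"refine_passes": 1, "render_size": 1024, "smoothing": 1.0}

print(effective(old_profile, NEW_DEFAULTS))  # still the old values
print(effective({}, NEW_DEFAULTS))           # keys removed: new defaults apply
```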

One-liner to strip the keys (Linux/macOS):

sed -i.bak '/^plugins\/darkroom\/masks\/object\//d' ~/.config/darktable/darktablerc

This removes all plugins/darkroom/masks/object/* lines and keeps a .bak next to the original. On next launch, darktable repopulates them with the new defaults.
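For testers without `sed` (for example on Windows), the same cleanup can be sketched in Python. This is an untested-in-the-wild convenience, not part of the PR: the function name is made up, the file is assumed to be UTF-8, and you need to pass the path to your own darktablerc. Run it with darktable closed.

```python
import shutil
from pathlib import Path

# Key prefix taken from the sed one-liner above.
PREFIX = "plugins/darkroom/masks/object/"


def strip_object_mask_keys(rc_path, prefix: str = PREFIX) -> int:
    """Drop every line starting with `prefix`; return how many were removed.

    A copy of the original file is kept next to it with a .bak suffix.
    """
    rc = Path(rc_path)
    shutil.copy2(rc, rc.with_name(rc.name + ".bak"))   # backup first
    lines = rc.read_text(encoding="utf-8").splitlines(keepends=True)
    kept = [line for line in lines if not line.startswith(prefix)]
    rc.write_text("".join(kept), encoding="utf-8")
    return len(lines) - len(kept)
```

Calling `strip_object_mask_keys(Path.home() / ".config" / "darktable" / "darktablerc")` mirrors the Linux/macOS path used in the one-liner; adjust the path for your platform.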
