Only blend with transpose mask in Levenshtein functions if cheaper#19

Open
kristoisberg wants to merge 1 commit into Daniel-Liu-c0deb0t:master from kristoisberg:master
@kristoisberg

Fixes #17, where Restricted Damerau-Levenshtein distances were calculated incorrectly: transposition masks were used even when other operations were cheaper, resulting in higher-than-correct distances. The fix is applied to both distance calculation (including traces) and search functions.
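To illustrate the class of bug described above, here is a minimal scalar sketch of restricted Damerau-Levenshtein (optimal string alignment) with unit costs. It is not the SIMD code touched by this PR; the function name `rdamerau` and the `force_transpose` flag are hypothetical and exist only to contrast the two behaviours. With `force_transpose = true`, the transposition candidate overwrites the cell whenever the adjacent characters look swapped; with `false`, it is only taken if it is cheaper than the other operations.

```rust
/// Scalar reference for restricted Damerau-Levenshtein (OSA), unit costs.
/// `force_transpose` is a hypothetical switch used only for illustration:
/// when true, the transposition result replaces the cell unconditionally.
fn rdamerau(a: &[u8], b: &[u8], force_transpose: bool) -> usize {
    let (n, m) = (a.len(), b.len());
    let mut d = vec![vec![0usize; m + 1]; n + 1];

    // Base cases: transforming to or from the empty prefix.
    for i in 0..=n {
        d[i][0] = i;
    }
    for j in 0..=m {
        d[0][j] = j;
    }

    for i in 1..=n {
        for j in 1..=m {
            // Match/mismatch, deletion, insertion.
            let sub = d[i - 1][j - 1] + usize::from(a[i - 1] != b[j - 1]);
            let best = sub.min(d[i - 1][j] + 1).min(d[i][j - 1] + 1);
            d[i][j] = best;

            // Transposition is possible when the last two characters are swapped.
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                let transposed = d[i - 2][j - 2] + 1;
                d[i][j] = if force_transpose {
                    transposed // unconditional: can exceed `best`
                } else {
                    best.min(transposed) // only used if cheaper
                };
            }
        }
    }
    d[n][m]
}
```

For the inputs `b"aa"` and `b"aa"`, the swap condition holds trivially, so the unconditional variant reports a distance of 1, while the min-based variant correctly reports 0.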

I have to make a disclaimer that while I tested this fix in multiple ways:

  • all tests pass, including new test cases based on #17 (Wrong result for the case of SIMD enabled (rdamerau)) and on my own observations
  • there is no perceivable performance hit in the benchmarks
  • I tested it against a large set of names and couldn't find any anomalies compared to a known-good implementation of regular Levenshtein (PostgreSQL fuzzystrmatch), only the expected differences

...the fix itself is mostly AI-generated, as SIMD Rust is quite far out of my comfort zone, so approach with caution. 😄

Development

Successfully merging this pull request may close these issues.

Wrong result for the case of SIMD enabled (rdamerau)
