To set the scene: I have a proptest set up that compares two libraries. One of the libraries uses unicode-normalization under the hood, the other icu_normalizer.
I expected both to produce the same output, but my CI exploded at some point on the weird string "\u{11366}\u{113ce}".
When put through the NFC normalizer, you get two different outputs:
unicode-normalization: "\u{113ce}\u{11366}"
icu_normalizer: "\u{11366}\u{113ce}"
Just a fun little thing I thought I'd report since it's technically a correctness issue (I'm just not good enough with Unicode to determine whether it's an issue with ICU4X or this crate).
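One way the two results could come apart is the canonical reordering step of normalization, which stable-sorts runs of combining marks by their Canonical_Combining_Class (ccc). The sketch below is not code from either crate; it only implements that reordering step, with the ccc lookup passed in so two data tables can be compared. The ccc values in `ccc_table_a` and `ccc_table_b` are assumptions for illustration (both function names are hypothetical) and should be checked against UnicodeData.txt for the Unicode version each library ships.

```rust
/// Canonical reordering only (no decomposition or composition): within each
/// run of characters whose combining class is non-zero, stable-sort by class.
/// `ccc` is supplied by the caller so different data tables can be compared.
fn canonical_order(input: &str, ccc: impl Fn(char) -> u8) -> String {
    let mut chars: Vec<char> = input.chars().collect();
    let mut start = 0;
    while start < chars.len() {
        // Characters with class 0 (starters) never move.
        if ccc(chars[start]) == 0 {
            start += 1;
            continue;
        }
        // Find the end of the current run of non-starters.
        let mut end = start;
        while end < chars.len() && ccc(chars[end]) != 0 {
            end += 1;
        }
        // `sort_by_key` is stable, so marks with equal class keep their order.
        chars[start..end].sort_by_key(|&c| ccc(c));
        start = end;
    }
    chars.into_iter().collect()
}

/// ASSUMPTION: a table in which both code points are combining marks and
/// U+113CE has the lower class. Values are illustrative, not verified.
fn ccc_table_a(c: char) -> u8 {
    match c {
        '\u{11366}' => 230,
        '\u{113ce}' => 9,
        _ => 0,
    }
}

/// ASSUMPTION: a table that does not know U+113CE and so treats it as
/// class 0. Values are illustrative, not verified.
fn ccc_table_b(c: char) -> u8 {
    match c {
        '\u{11366}' => 230,
        _ => 0,
    }
}

fn main() {
    let input = "\u{11366}\u{113ce}";
    // Print both results as escapes so the code point order is visible.
    println!("table a: {}", canonical_order(input, ccc_table_a).escape_unicode());
    println!("table b: {}", canonical_order(input, ccc_table_b).escape_unicode());
}
```

If the two libraries really do disagree on the class of one of these code points (for example because they bundle different Unicode data versions), this would produce exactly the swapped and unswapped orders reported above. That is a guess, not a diagnosis.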