This appears to be an easy way to improve multiplication performance by avoiding the pure 64-bit path. `__mulh` and `__umulh` also work on x64, but I expect `_mul128` and `_umul128` to be superior there.
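For illustration, here is a minimal sketch of how the intrinsic selection could look. It is not the code from this change: the helper name `mul_64x64_to_128` is hypothetical, the `unsigned __int128` branch for GCC/Clang is an assumption added so the sketch builds outside MSVC, and the last branch is only meant to show the kind of pure 64-bit path being avoided.

```c
#include <stdint.h>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_ARM64))
#include <intrin.h>
#endif

/* Hypothetical helper: full 64x64 -> 128-bit unsigned multiply.
   Returns the low 64 bits and stores the high 64 bits in *hi. */
static uint64_t mul_64x64_to_128(uint64_t a, uint64_t b, uint64_t *hi)
{
#if defined(_MSC_VER) && defined(_M_X64)
    /* x64: one intrinsic yields both halves of the product. */
    return _umul128(a, b, hi);
#elif defined(_MSC_VER) && defined(_M_ARM64)
    /* ARM64: __umulh gives the high half, a plain multiply the low half. */
    *hi = __umulh(a, b);
    return a * b;
#elif defined(__SIZEOF_INT128__)
    /* Assumption: GCC/Clang with native 128-bit integer support. */
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    return (uint64_t)p;
#else
    /* Pure 64-bit path: four 32x32 partial products plus carry handling. */
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t ll = a_lo * b_lo;
    uint64_t lh = a_lo * b_hi;
    uint64_t hl = a_hi * b_lo;
    uint64_t hh = a_hi * b_hi;
    /* Sum of the middle 32-bit columns; cannot overflow 64 bits. */
    uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;
    *hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return (mid << 32) | (uint32_t)ll;
#endif
}
```

A signed variant would presumably follow the same shape, with `_mul128` on x64 and `__mulh` elsewhere.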