Skip to content

make RoundTripDoubleToBuffer locale-independent#2082

Open
dxbjavid wants to merge 2 commits into
abseil:masterfrom
dxbjavid:highprecision-locale-radix
Open

make RoundTripDoubleToBuffer locale-independent#2082
dxbjavid wants to merge 2 commits into
abseil:masterfrom
dxbjavid:highprecision-locale-radix

Conversation

@dxbjavid

Copy link
Copy Markdown

RoundTripDoubleToBuffer formats with snprintf %g, whose radix character follows the global C locale's LC_NUMERIC category, so in a process that has called setlocale() to something like de_DE or fr_FR absl::HighPrecision(d) comes out as e.g. "0,1". SimpleAtod only accepts '.', so the value no longer parses back to itself even though round-tripping through SimpleAtod is exactly what HighPrecision promises, and the rest of the float formatting here (RoundTripFloatToBuffer, SixDigitsToBuffer) is already locale-independent. This rewrites the radix character back to '.' in the produced buffer, and adds a regression test that exercises HighPrecision under a comma-radix locale (skipped when no such locale is installed).

@derekmauro derekmauro left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good catch, but I have a few comments. Let me know if you want me to make the suggested improvements, or if you would like to handle them.

Comment thread absl/strings/numbers.cc Outdated
// round-trippable through SimpleAtod().
const char radix = *localeconv()->decimal_point;
if (radix != '.') {
if (char* p = strchr(buffer, radix)) *p = '.';

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If localeconv()->decimal_point is empty (which might happen in some minimal environments), radix will be \0. In this case, strchr(buffer, '\0') will find the NUL terminator and replace it with ., corrupting the string.

I also wonder if there is any chance that localeconv()->decimal_point will be multibyte.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, handled both. it now skips an empty decimal_point so the NUL terminator can't get clobbered, and matches the whole radix string with strstr rather than a single char. for the multibyte case it rewrites the first byte to '.' and memmoves the trailing bytes away so the output collapses to a single '.'.

Comment thread absl/strings/numbers_test.cc Outdated
EXPECT_TRUE(
absl::SimpleAtod(absl::StrCat(absl::HighPrecision(0.1)), &parsed));
EXPECT_EQ(parsed, 0.1);
setlocale(LC_NUMERIC, old_locale.c_str());

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to use absl::Cleanup as an RAII to restore this around line 1728.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, switched to absl::MakeCleanup so the locale is restored even if an assertion bails out early.

Comment thread absl/strings/numbers.cc
snprintf_result < numbers_internal::kFastToBufferSize);
}

// snprintf() writes the radix character chosen by the global C locale's

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory we should be able to use std::to_chars for locale independence, but compilers were late to implement full floating-point support, and we still support some of these older compilers. Maybe a TODO would be nice here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, added a TODO noting we should move to std::to_chars once all the compilers we support have floating-point support for it.

@dxbjavid

Copy link
Copy Markdown
Author

thanks, took care of all three. pushed the radix guarding (empty + multibyte), the std::to_chars TODO, and the absl::Cleanup in the test. test still passes under a comma-radix locale here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants