fix wrong auto-detection of country by language#1791

Merged
kingthorin merged 1 commit into main from fix/language-code-collisions
Apr 5, 2026

Conversation

@asolntsev
Collaborator

@asolntsev asolntsev commented Apr 5, 2026

When the given Locale contains only a language, but no country, the phone generator tries to guess the country.

For a few specific languages, the guess was wrong. This happened for languages whose code happens to coincide with another country's code.

new Faker(new Locale("am")).phoneNumber(); // generated an Armenian phone instead of an Ethiopian one

new Faker(new Locale("ar")).phoneNumber(); // generated an Argentinian phone instead of a Saudi Arabian one

etc.

Ideally, users should always provide a locale with a country:

  String phoneNumber = new Faker(new Locale("am", "ET")).phoneNumber().phoneNumber();

But at least we don't predict the wrong country anymore.
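
To make the collision concrete, here is a minimal sketch that uses only `java.util.Locale`. The `naiveCountryGuess` helper is hypothetical: it assumes the old guess simply reused the language code as a country code, which is consistent with the description above but is not taken from the Datafaker source. The sketch uses `Locale.forLanguageTag` rather than the `Locale` constructor only to avoid deprecation warnings on recent JDKs.

```java
import java.util.Locale;

class LanguageOnlyLocaleSketch {

    // Hypothetical stand-in for the old behaviour (an assumption, not Datafaker code):
    // if the locale has no country, reuse the language code as a country code.
    static String naiveCountryGuess(Locale locale) {
        if (!locale.getCountry().isEmpty()) {
            return locale.getCountry();
        }
        return locale.getLanguage().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale amharic = Locale.forLanguageTag("am");            // language only
        Locale arabic = Locale.forLanguageTag("ar");             // language only
        Locale amharicEthiopia = Locale.forLanguageTag("am-ET"); // language + country

        // A language-only locale carries no country information at all.
        System.out.println(amharic.getCountry().isEmpty());     // true

        // Reusing the language code lands on an unrelated country.
        System.out.println(naiveCountryGuess(amharic));         // AM (ISO 3166 code of Armenia)
        System.out.println(naiveCountryGuess(arabic));          // AR (ISO 3166 code of Argentina)

        // With an explicit country there is nothing to guess.
        System.out.println(naiveCountryGuess(amharicEthiopia)); // ET
    }
}
```

Passing the country explicitly sidesteps the guess entirely, which is why it remains the recommended usage.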

Inspired by #1788

@what-the-diff

what-the-diff bot commented Apr 5, 2026

PR Summary

  • Enhanced Language Support in PhoneNumber Method
    The detectCountryByLanguage method in the phone number functionality has been updated. New mappings have been added for several languages including Afrikaans, Arabic, Amharic, Bengali, and others to correspond to the correct country codes. Existing mappings have also been refined for more accurate results. This change provides greater language support and ensures the accuracy of country codes associated with various languages.

  • Additional Testing for PhoneNumberValidityFinder
    A new test method named detectsCountryByLanguage has been introduced to verify that the correct country code is returned for different languages. The test uses language/country pairs to check the correctness of the language-to-country mapping, adding test coverage for this functionality.
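
A table-driven check of this kind could look roughly like the sketch below. It is an illustration only: the `detectCountryByLanguage` signature, the map, and the upper-casing fallback are assumptions, and only the `am` → `ET` and `ar` → `SA` pairs come from the PR description. The real method covers more languages, and the real test may use a test framework instead of a plain `main` method.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class DetectCountryByLanguageSketch {

    // Explicit overrides for language codes that collide with another country's code.
    // Only these two pairs are taken from the PR description.
    private static final Map<String, String> COLLIDING_LANGUAGES = Map.of(
        "am", "ET", // Amharic -> Ethiopia (not Armenia)
        "ar", "SA"  // Arabic  -> Saudi Arabia (not Argentina)
    );

    // Hypothetical signature: maps a language code to a country code.
    static String detectCountryByLanguage(String language) {
        String lang = language.toLowerCase(Locale.ROOT);
        String mapped = COLLIDING_LANGUAGES.get(lang);
        // Assumed fallback: reuse the language code when no override is registered.
        return mapped != null ? mapped : lang.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Language/country pairs to verify, in insertion order.
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("am", "ET");
        expected.put("ar", "SA");
        expected.put("de", "DE"); // no collision, handled by the assumed fallback

        expected.forEach((language, country) -> {
            String actual = detectCountryByLanguage(language);
            if (!country.equals(actual)) {
                throw new AssertionError(language + ": expected " + country + " but got " + actual);
            }
        });
        System.out.println("all language/country pairs matched");
    }
}
```

Keeping the pairs in one table makes it cheap to add a new entry whenever another colliding language code is reported.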

@asolntsev asolntsev force-pushed the fix/language-code-collisions branch from af4eca4 to 3726d4b Compare April 5, 2026 10:58
@asolntsev asolntsev added this to the 2.6.0 milestone Apr 5, 2026
@asolntsev asolntsev self-assigned this Apr 5, 2026
@asolntsev asolntsev added the enhancement New feature or request label Apr 5, 2026
@codecov-commenter

codecov-commenter commented Apr 5, 2026

Codecov Report

❌ Patch coverage is 73.07692% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 92.17%. Comparing base (4117c61) to head (8b576e6).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...java/net/datafaker/providers/base/PhoneNumber.java | 73.07% | 7 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1791      +/-   ##
============================================
- Coverage     92.34%   92.17%   -0.17%     
- Complexity     3447     3456       +9     
============================================
  Files           339      339              
  Lines          6794     6813      +19     
  Branches        670      670              
============================================
+ Hits           6274     6280       +6     
- Misses          354      364      +10     
- Partials        166      169       +3     

@asolntsev asolntsev requested review from bodiam and kingthorin April 5, 2026 11:01
When the locale contains only a language, but no country, the phone generator tries to guess the country.

For a few specific languages, the guess was wrong. This happened for languages whose code happens to coincide with another country's code.

```
new Faker(new Locale("am")).phoneNumber(); // generated an Armenian phone instead of an Ethiopian one

new Faker(new Locale("ar")).phoneNumber(); // generated an Argentinian phone instead of a Saudi Arabian one
```

etc.

Inspired by #1788
@asolntsev asolntsev force-pushed the fix/language-code-collisions branch from 3726d4b to 8b576e6 Compare April 5, 2026 11:12
Collaborator

@kingthorin kingthorin left a comment

LGTM

@kingthorin kingthorin merged commit de4a180 into main Apr 5, 2026
13 checks passed
@kingthorin kingthorin deleted the fix/language-code-collisions branch April 5, 2026 11:41
Labels

enhancement New feature or request

3 participants