Optimize stats() functions, fall back on IntegrityError by flodolo · Pull Request #4036 · mozilla/pontoon

flodolo · 2026-03-23T08:55:22Z

Going for a very long explanation, since folks will understand the code much better than me, and maybe there's a better approach to this.

The code in pontoon/base/models/translation.py takes a snapshot of the stats, at entity level, before saving the translation, then takes another after saving, and tries to store the delta via adjust_stats().
That breaks when a human translator and pretranslation work on the same entity at the same time.
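To make the failure mode concrete, here is a minimal sketch of one way stale snapshots could corrupt delta-based counters. It is a simplified model, not Pontoon's schema: the `entity_status` dict, the `resource_stats` counter, and the `snapshot()` / `apply_delta()` helpers are all hypothetical names, and the interleaving is simulated sequentially rather than with real concurrency.

```python
from collections import Counter

# Hypothetical, simplified model (not Pontoon's actual schema): one entity
# whose current status is reflected in resource-level counters.
entity_status = {"e1": "unreviewed"}
resource_stats = Counter(unreviewed=1, approved=0, pretranslated=0)


def snapshot(entity_id):
    """Entity-level stats, e.g. Counter({'unreviewed': 1})."""
    return Counter({entity_status[entity_id]: 1})


def apply_delta(before, after):
    """Add (after - before) to the resource-level counters."""
    for key in set(before) | set(after):
        resource_stats[key] += after[key] - before[key]


# Both writers take their "before" snapshot before either save lands.
before_human = snapshot("e1")
before_pretranslation = snapshot("e1")

entity_status["e1"] = "approved"  # human save lands first
apply_delta(before_human, snapshot("e1"))

entity_status["e1"] = "pretranslated"  # pretranslation save lands second
apply_delta(before_pretranslation, snapshot("e1"))  # "before" is now stale

print(dict(resource_stats))
```

In this model the second writer subtracts an `unreviewed` string that the first writer already removed, so the counter goes negative while `approved` keeps a string that no longer has that status. A database constraint on the counters would reject such an update, which is presumably where the IntegrityError comes from.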

A possible solution is to drop the delta approach and recalculate the stats for the entire resource after saving, using calculate_stats(). That's completely safe compared to the current approach, but costly for large resources. These are the top 20 resources in prod:

| Project | Resource Path | Strings |
| --- | --- | --- |
| sumo | LC_MESSAGES/django.po | 2611 |
| firefox-for-ios | firefox-ios.xliff | 1700 |
| firefox-for-android | mozilla-mobile/fenix/app/src/main/res/values/strings.xml | 1680 |
| amo | LC_MESSAGES/django.po | 1501 |
| seamonkey | suite/chatzilla/chrome/chatzilla.properties | 1154 |
| firefox | browser/browser/preferences/preferences.ftl | 1016 |
| mozilla-accounts | settings.ftl | 974 |
| thunderbirdnet | LC_MESSAGES/messages.po | 883 |
| amo-frontend | LC_MESSAGES/amo.po | 807 |
| thunderbird | mail/chrome/messenger/messenger.dtd | 750 |
| thunderbird | mail/messenger/preferences/preferences.ftl | 525 |
| firefox | browser/browser/browser.ftl | 514 |
| seamonkey | suite/chrome/mailnews/messenger.dtd | 511 |
| mozilla-accounts | LC_MESSAGES/client.po | 506 |
| mozilla-vpn-client | mozillavpn.xliff | 490 |
| common-voice | web/locales/common-voice/en/pages/common.ftl | 479 |
| thunderbird | calendar/chrome/calendar/timezones.properties | 443 |
| firefox | devtools/client/netmonitor.properties | 419 |
| mozilla-accounts | payments-next.ftl | 380 |
| firefox | browser/browser/newtab/newtab.ftl | 378 |

In the process of explaining the code, Claude pointed out that calculate_stats() can be made more efficient (reducing the number of queries), so that takes away part of the performance hit. But that's potentially still 5x worse in production :-(
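As a rough illustration of the "fewer queries" idea, the sketch below compares one COUNT query per status with a single conditional-aggregation query. It uses the standard-library sqlite3 module so it runs anywhere; the `translation` table, the status names, and both function names are assumptions made for the example, not the PR's actual code.

```python
import sqlite3

# Hypothetical status names and schema, for illustration only.
STATUSES = ("approved", "pretranslated", "errors", "warnings", "unreviewed")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE translation (id INTEGER PRIMARY KEY, status TEXT NOT NULL)"
)
sample = ["approved"] * 4 + ["pretranslated"] * 2 + ["warnings"] + ["unreviewed"] * 3
conn.executemany(
    "INSERT INTO translation (status) VALUES (?)", [(s,) for s in sample]
)


def stats_query_per_status(conn):
    """One COUNT(*) round trip per status: five queries in total."""
    return {
        status: conn.execute(
            "SELECT COUNT(*) FROM translation WHERE status = ?", (status,)
        ).fetchone()[0]
        for status in STATUSES
    }


def stats_single_aggregate(conn):
    """All counters in one query, using conditional aggregation."""
    columns = ", ".join(
        "COALESCE(SUM(CASE WHEN status = ? THEN 1 ELSE 0 END), 0)"
        for _ in STATUSES
    )
    row = conn.execute(f"SELECT {columns} FROM translation", STATUSES).fetchone()
    return dict(zip(STATUSES, row))


print(stats_single_aggregate(conn))
```

Both functions return the same counts; the second one scans the table once instead of five times. In Django the same shape is usually expressed with `aggregate()` and `Count(..., filter=Q(...))`, though whether the PR does exactly that is not stated here.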

In the end (last commit) I went for a middle ground: use the same optimization for get_stats(). The delta is still applied via adjust_stats(), but in case of an IntegrityError it falls back to a full calculate_stats(). Also added a UI error notification, because I don't think we're showing anything at the moment?
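The middle ground can be sketched as follows: try the cheap delta first, and only pay for a full recalculation when the database rejects the update. This is a self-contained sqlite3 model under loud assumptions: a single `approved` counter, a CHECK constraint standing in for whatever raises IntegrityError in Pontoon, and function bodies that only borrow the names adjust_stats() and calculate_stats() from the PR. `update_stats()` is a hypothetical wrapper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE translation (id INTEGER PRIMARY KEY, approved INTEGER NOT NULL);
    CREATE TABLE resource_stats (
        id INTEGER PRIMARY KEY CHECK (id = 1),
        approved INTEGER NOT NULL CHECK (approved >= 0)
    );
    INSERT INTO translation (approved) VALUES (1), (0), (0);
    -- Stored counter is already out of sync: it says 0, the table says 1.
    INSERT INTO resource_stats (id, approved) VALUES (1, 0);
    """
)


def adjust_stats(conn, approved_diff):
    """Cheap path: apply a delta to the stored counter."""
    conn.execute(
        "UPDATE resource_stats SET approved = approved + ? WHERE id = 1",
        (approved_diff,),
    )


def calculate_stats(conn):
    """Expensive path: recount from the translation table."""
    conn.execute(
        "UPDATE resource_stats SET approved = "
        "(SELECT COUNT(*) FROM translation WHERE approved = 1) WHERE id = 1"
    )


def update_stats(conn, approved_diff):
    """Try the delta; fall back to a full recount on IntegrityError."""
    try:
        with conn:  # commits on success, rolls back on exception
            adjust_stats(conn, approved_diff)
        return "delta"
    except sqlite3.IntegrityError:
        with conn:
            calculate_stats(conn)
        return "recalculated"


def stored_approved(conn):
    return conn.execute("SELECT approved FROM resource_stats").fetchone()[0]


# Unapprove translation 1: the delta would push the counter to -1.
with conn:
    conn.execute("UPDATE translation SET approved = 0 WHERE id = 1")
first = update_stats(conn, -1)

# Approve translation 2: the delta applies cleanly this time.
with conn:
    conn.execute("UPDATE translation SET approved = 1 WHERE id = 2")
second = update_stats(conn, +1)

print(first, second, stored_approved(conn))
```

The translation save is committed before the stats update so that a failed delta only rolls back the counter change. The recount also repairs counters that had drifted earlier, so later saves can go back to the cheap path.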

Performance benchmarks

I got Claude to come up with a couple of benchmark scripts.

calculate_stats() before and after

Script: https://gist.github.com/flodolo/187a9d7d497282eae4d3378dabd4953b

Analyzed Italian, largest 10 resources.

Locally I get roughly a 9x improvement; in prod it's closer to 5x.

Local Docker install

Top 10 resources:

firefox-for-ios|firefox-ios.xliff — 1700 strings (resource_id=39)
firefox-for-android|mozilla-mobile/fenix/app/src/main/res/values/strings.xml — 1680 strings (resource_id=38)
firefox|browser/browser/preferences/preferences.ftl — 1016 strings (resource_id=143)
firefox|browser/browser/browser.ftl — 514 strings (resource_id=105)
firefox|devtools/client/netmonitor.properties — 419 strings (resource_id=349)
firefox|browser/browser/newtab/newtab.ftl — 378 strings (resource_id=127)
firefox|devtools/client/debugger.properties — 373 strings (resource_id=332)
firefox|toolkit/toolkit/pdfviewer/viewer.ftl — 357 strings (resource_id=244)
firefox|dom/chrome/dom/dom.properties — 335 strings (resource_id=180)
firefox|toolkit/toolkit/neterror/nsserrors.ftl — 331 strings (resource_id=241)

| project | resource | strings | current (ms) | aggregate (ms) | speedup |
| --- | --- | --- | --- | --- | --- |
| firefox-for-ios | firefox-ios.xliff | 1700 | 24.0 | 2.8 | 8.5x |
| firefox-for-android | mozilla-mobile/fenix/app/src/main/res/values/strings.xml | 1680 | 24.0 | 2.8 | 8.5x |
| firefox | browser/browser/preferences/preferences.ftl | 1016 | 23.7 | 2.6 | 9.2x |
| firefox | browser/browser/browser.ftl | 514 | 24.1 | 2.4 | 9.8x |
| firefox | devtools/client/netmonitor.properties | 419 | 23.8 | 2.4 | 10.1x |
| firefox | browser/browser/newtab/newtab.ftl | 378 | 24.2 | 2.4 | 10.2x |
| firefox | devtools/client/debugger.properties | 373 | 23.7 | 2.4 | 10.1x |
| firefox | toolkit/toolkit/pdfviewer/viewer.ftl | 357 | 23.8 | 2.3 | 10.2x |
| firefox | dom/chrome/dom/dom.properties | 335 | 24.1 | 2.3 | 10.2x |
| firefox | toolkit/toolkit/neterror/nsserrors.ftl | 331 | 24.2 | 2.3 | 10.4x |

Overall totals (20 runs each):

Current (5 queries): 4.855s, 24.27ms avg per call
Aggregate (1 query): 0.496s, 2.48ms avg per call
Overall speedup: 9.8x

Production

Top 10 resources:

sumo|LC_MESSAGES/django.po — 2611 strings (resource_id=564)
marketplace|LC_MESSAGES/django.po — 1810 strings (resource_id=2614)
firefox-for-ios|firefox-ios.xliff — 1700 strings (resource_id=580)
firefox-for-android|mozilla-mobile/fenix/app/src/main/res/values/strings.xml — 1680 strings (resource_id=3436)
amo|LC_MESSAGES/django.po — 1501 strings (resource_id=578)
seamonkey|suite/chatzilla/chrome/chatzilla.properties — 1154 strings (resource_id=4291)
firefox|browser/browser/preferences/preferences.ftl — 1016 strings (resource_id=3124)
mozilla-accounts|settings.ftl — 974 strings (resource_id=4198)
thunderbirdnet|LC_MESSAGES/messages.po — 883 strings (resource_id=3168)
amo-frontend|LC_MESSAGES/amo.po — 807 strings (resource_id=2790)

| project | resource | strings | current (ms) | aggregate (ms) | speedup |
| --- | --- | --- | --- | --- | --- |
| sumo | LC_MESSAGES/django.po | 2611 | 56.3 | 14.6 | 3.8x |
| marketplace | LC_MESSAGES/django.po | 1810 | 46.8 | 15.8 | 3.0x |
| firefox-for-ios | firefox-ios.xliff | 1700 | 49.1 | 13.3 | 3.7x |
| firefox-for-android | mozilla-mobile/fenix/app/src/main/res/values/strings.xml | 1680 | 45.5 | 11.8 | 3.8x |
| amo | LC_MESSAGES/django.po | 1501 | 40.8 | 13.6 | 3.0x |
| seamonkey | suite/chatzilla/chrome/chatzilla.properties | 1154 | 35.4 | 9.6 | 3.7x |
| firefox | browser/browser/preferences/preferences.ftl | 1016 | 186.0 | 7.0 | 26.7x |
| mozilla-accounts | settings.ftl | 974 | 36.7 | 9.5 | 3.9x |
| thunderbirdnet | LC_MESSAGES/messages.po | 883 | 33.4 | 9.4 | 3.6x |
| amo-frontend | LC_MESSAGES/amo.po | 807 | 34.9 | 11.9 | 2.9x |

Overall totals (20 runs each):

Current (5 queries): 10.531s, 52.66ms avg per call
Aggregate (1 query): 2.238s, 11.19ms avg per call
Overall speedup: 4.7x
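The overall totals can be cross-checked with a few lines of arithmetic. The sketch assumes each total covers 10 resources times 20 runs, i.e. 200 calls per variant; that call count is inferred from the reported averages rather than stated by the script output.

```python
# Assumption: 10 resources x 20 runs each = 200 timed calls per variant.
CALLS = 10 * 20


def summarize(total_current_s, total_aggregate_s):
    """Return (avg current ms, avg aggregate ms, speedup factor)."""
    avg_current_ms = total_current_s / CALLS * 1000
    avg_aggregate_ms = total_aggregate_s / CALLS * 1000
    speedup = total_current_s / total_aggregate_s
    return avg_current_ms, avg_aggregate_ms, speedup


local = summarize(4.855, 0.496)
production = summarize(10.531, 2.238)

for name, (cur, agg, speedup) in (("local", local), ("production", production)):
    print(f"{name}: {cur:.2f} ms -> {agg:.2f} ms, {speedup:.1f}x")
```

The results agree with the reported averages to within rounding, and the speedups come out at about 9.8x locally and 4.7x in production.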

Delta vs calculate_stats()

Script: https://gist.github.com/flodolo/21e66cc03bc5e8ddcc8275db1375a26a

This benchmark was used to rule out calling calculate_stats() all the time as the solution.

Benchmark with 50 translations, 5 largest resources.

Local Docker install

  Old approach (get_stats x2 + adjust_stats):
    Avg queries : 8.0
    Avg time    : 2.25 ms

  New approach (calculate_stats):
    Avg queries : 5.0
    Avg time    : 3.98 ms

  Query reduction : 3.0 fewer queries per save (38%)
  Time change  : 1.73 ms per save (77%)

Production

  Old approach (get_stats x2 + adjust_stats):
    Avg queries : 8.2
    Avg time    : 7.32 ms

  New approach (calculate_stats):
    Avg queries : 9.1
    Avg time    : 49.07 ms

  Query reduction : -0.9 fewer queries per save (-11%)
  Time change  : 41.76 ms per save (570%)
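For reading the summary lines: both percentages are relative to the old approach, and a negative "query reduction" means the new approach issued more queries. The sketch below recomputes them from the reported averages; the `compare()` helper is hypothetical and not taken from the benchmark script.

```python
def compare(old_queries, new_queries, old_ms, new_ms):
    """Recompute the summary figures relative to the old approach."""
    query_reduction = old_queries - new_queries
    time_change_ms = new_ms - old_ms
    return {
        "query_reduction": query_reduction,
        "query_reduction_pct": query_reduction / old_queries * 100,
        "time_change_ms": time_change_ms,
        "time_change_pct": time_change_ms / old_ms * 100,
    }


local = compare(8.0, 5.0, 2.25, 3.98)
production = compare(8.2, 9.1, 7.32, 49.07)

for name, result in (("local", local), ("production", production)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

Locally the recount saves 3 queries (37.5%) but takes about 77% longer per save. In production it adds roughly 0.9 queries and about 570% more time per save, which is why recalculating on every save was ruled out. The 41.75 ms computed here differs from the reported 41.76 ms only because the inputs are already rounded.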

codecov-commenter · 2026-03-23T09:00:47Z

Codecov Report

❌ Patch coverage is 93.54839% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.83%. Comparing base (e04d98e) to head (6c68ea7).


flodolo · 2026-03-24T06:00:18Z

I ended up with a ton of code overlap between calculate_stats() and get_stats(), so I extracted it into a helper function, aggregate_translation_stats() ~~(had to put it in a separate file to avoid circular dependencies).~~

Actually, I should be able to put it in translations after getting rid of one import.

flodolo requested review from eemeli and mathjazz March 23, 2026 08:55

flodolo changed the title ~~Optimize and use calculate_stats() instead of delta when saving~~ Optimize calculate_stats(), fall back on IntegrityError Mar 23, 2026

flodolo added 3 commits March 23, 2026 19:23

Use resource.calculate_stats() when saving, instead of delta before/after

993bd0f

Optimize query for calculate_stats()

f16f23d

Only recalculate stats of IntegrityError

4126a98

flodolo force-pushed the issue2263 branch from 850b282 to 4126a98 Compare March 23, 2026 18:24

Refactor get_entity_stats() instead of reimplementing it

4d31a7b

flodolo changed the title ~~Optimize calculate_stats(), fall back on IntegrityError~~ Optimize stats() functions, fall back on IntegrityError Mar 23, 2026

Reduce code duplication

ed5c58b

flodolo force-pushed the issue2263 branch from d817496 to ed5c58b Compare March 24, 2026 06:04

Move aggregate_translation_stats() to models/translations

6c68ea7

flodolo wants to merge 6 commits into mozilla:main from flodolo:issue2263
