Skip to content

fix(converter): preserve theme rFonts on DOCX round-trip (SD-2894)#3225

Merged
luccas-harbour merged 7 commits into
mainfrom
tadeu/sd-2894-preserve-theme-rfonts-on-round-trip
May 14, 2026
Merged

fix(converter): preserve theme rFonts on DOCX round-trip (SD-2894)#3225
luccas-harbour merged 7 commits into
mainfrom
tadeu/sd-2894-preserve-theme-rfonts-on-round-trip

Conversation

@tupizz
Copy link
Copy Markdown
Contributor

@tupizz tupizz commented May 11, 2026

Summary

  • Extends the combineRunProperties font-slot dedup to cover all four <w:rFonts> script slots (was only ascii/asciiTheme), so a higher-priority style with hAnsiTheme / eastAsiaTheme / cstheme no longer leaves the lower-priority concrete name in place.
  • Adds the same theme-aware merge to calculateInlineRunPropertiesPlugin so theme references survive the mark-driven recompute that runs after import.

Why

Customer fixture in SD-2894 had every body run mapped to <w:rFonts w:asciiTheme="majorBidi" w:hAnsiTheme="majorBidi" w:cstheme="majorBidi"/>. The cascade only knew how to drop target.ascii when source.asciiTheme was present — the other three slots fell through, so the merged run carried { ascii: "Arial", hAnsi: "Arial", cs: "Arial", asciiTheme: "majorBidi", hAnsiTheme: "majorBidi", cstheme: "majorBidi" }. Word treats the concrete value as an override that defeats per-script theme resolution: Latin body text mapped to majorBidi rendered as Calibri Light instead of the per-script font (Times New Roman for Hebrew/Arab).

A second loss happened in calculateInlineRunPropertiesPlugin on initial load — the plugin computes fontFamily from textStyle marks (which only carry concrete CSS names) and overrode runProperties.fontFamily with concrete-only values, dropping the theme references that the import had stored.

Test plan

  • New integration test (sd-2894-theme-rfonts-roundtrip.test.js) that round-trips the customer fixture and asserts w:asciiTheme / w:hAnsiTheme / w:cstheme counts don't drop and that no source theme reference becomes a concrete font on export.
  • Existing ooxml-rFonts-rstyle-linked-combos-roundtrip test still passes.
  • Full pnpm test for super-editor (12 708 / 12 721 pass).
  • @superdoc/style-engine cascade unit tests pass.
  • Manual verification: re-exported customer fixture from the dev app, opened in Word. asciiTheme count: 65 → 65 (was 65 → ~25). No Calibri Light substitutions on body text. Per-script resolution intact.

Verified diffs on the customer fixture

File input before fix after fix
w:asciiTheme in document.xml 65 25 65
w:hAnsiTheme in document.xml 65 25 65
w:cstheme in document.xml 65 25 65
w:ascii="Calibri Light" runs in document.xml 0 1 (concrete substitution) 0

The cascade's font-slot dedup only handled the ascii/asciiTheme pair, so a
higher-priority style with hAnsiTheme/eastAsiaTheme/cstheme would still leave
the lower-priority concrete name in place. Word reads the concrete value as
an override that defeats per-script theme resolution — Latin body text mapped
to majorBidi rendered as Calibri Light instead of the per-script font
(Times New Roman for Hebrew/Arab).

Extend the dedup to all four <w:rFonts> slots, and add the same logic to
calculateInlineRunPropertiesPlugin so theme refs from the imported run survive
the mark-driven recompute that runs on initial load.

Verified with a round-trip integration test on the customer fixture: asciiTheme
count 65 → 65 (was dropping to ~25 before the fix), no concrete font
substitutions on runs that originally carried theme references.
@tupizz tupizz requested a review from a team as a code owner May 11, 2026 14:26
@linear
Copy link
Copy Markdown

linear Bot commented May 11, 2026

SD-2894

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@luccas-harbour luccas-harbour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey @tupizz!
I had a look at the code and found a few things. I will add them as inline comments.

The main problem I found was flagged by codex:

- [P1] Let explicit font changes replace stale theme refs — packages/super-editor/src/editors/v1/extensions/run/calculateInlineRunPropertiesPlugin.js:57-59
When text imported with w:*Theme rFonts later has a concrete font applied by the user, fromMarks contains the newly selected font while existing still contains the old theme reference. This branch unconditionally copies the old theme ref and deletes the concrete mark value, so the export keeps the original theme font instead of the user's font change for those runs.

I tested this in the browser and indeed it causes an issue. If you try to load the document below into the editor, then select the text that says "Change this text to Arial" and try to actually change the font, you'll see that nothing happens. This was working correctly in the main branch so the problem is being caused by the changes in this PR.

theme-rfonts-explicit-font-change-repro.docx

Another thing I will ask is that you run visual tests after making these adjustments since this change has the potential to impact the rendering of a lot of documents.

…894 review)

Addresses Luccas's PR review feedback on #3225.

P1: when text imported with w:*Theme rFonts is later given a concrete font by
the user, mergeFontFamilyPreservingThemeRefs unconditionally restored the old
theme reference and dropped the user's choice. setFontFamily('Arial') on themed
text appeared to do nothing on export.

Fix: gate the merge on whether the mark-derived value still encodes to the same
OOXML mark as the run's existing fontFamily. If they match, the marks are a
faithful re-derivation of the import and we preserve the theme. If they differ,
the user has overridden — drop the theme and keep the new concrete value.

DRY: FONT_FAMILY_THEME_PAIRS in the plugin duplicated FONT_SLOT_THEME_PAIRS in
style-engine cascade.ts. Export the cascade constant through
@superdoc/style-engine/ooxml and consume it from the plugin so the four-slot
mapping has a single source of truth.

Tests: two new AAA cases in calculateInlineRunPropertiesPlugin.test.js lock
both directions of the contract — preserve theme when marks re-encode the
import, drop theme on user override.
@tupizz
Copy link
Copy Markdown
Contributor Author

tupizz commented May 13, 2026

Thanks @luccas-harbour — both points addressed in f046df9c0.

P1 (user override silently reverted): I was unconditionally restoring theme refs from existingRunProperties whenever they were present. The new gate marksMatchExistingFontFamily() only preserves the theme when the mark-derived value still encodes to the same OOXML mark as the run's existing fontFamily — i.e., when marks are a re-derivation of the import. When marks diverge (the user picked a new font), we now drop the stale theme and respect the user's value.

Reproduced your theme-rfonts-explicit-font-change-repro.docx case in unit form: with existing = { asciiTheme: 'majorBidi' } and marks decoding to { ascii: 'Arial' }, the encoder produces 'Calibri Light' vs 'Arial' → mismatch → theme dropped, 'Arial' kept.

DRY (FONT_FAMILY_THEME_PAIRS duplication): Exported FONT_SLOT_THEME_PAIRS from cascade.ts, re-exported through @superdoc/style-engine/ooxml, and the plugin now imports it. Local copy deleted.

Safeguards: Two new AAA tests in calculateInlineRunPropertiesPlugin.test.js pin both directions of the contract:

  • preserves theme reference when marks re-encode the imported theme value — locks the SD-2894 happy path.
  • drops theme references when the user explicitly sets a different font — locks your P1 case.

Both fail loudly if either branch flips in the future.

Visual tests: queued — will post results in a follow-up comment once they finish.

Test results from this commit:

  • calculateInlineRunPropertiesPlugin.test.js: all new + existing tests pass
  • Full super-editor suite: 12,710 passed, 13 skipped, 0 failed
  • @superdoc/style-engine: 129/129 pass
  • tsc --noEmit on style-engine: clean

tupizz added 2 commits May 13, 2026 14:16
…ot (SD-2894)

The previous gate (f046df9) compared the full encoded textStyle attrs between
markFromMarks and markFromExisting. That over-fired on imported runs whose
existing.fontFamily mixed theme refs with a concrete per-script slot
(e.g. { asciiTheme, hAnsiTheme, cstheme, eastAsia: 'Arial' }): the mark
re-decode produces a full concrete shape including a `cs:` slot, which the
encoder turns into `csFontFamily` on markFromMarks but not on markFromExisting,
so JSON.stringify(attrs) mismatched and the theme was dropped.

Browser round-trip of the SD-2894 customer fixture: 65/65/65 -> 62/62/62
theme attrs, with 3 `w:ascii="Calibri Light"` substitutions. Regression of the
original PR.

Fix: compare only `markFromMarks.attrs.fontFamily` against
`markFromExisting.attrs.fontFamily`. That is the primary user-override signal;
per-script extras don't indicate user intent (the toolbar sets all four slots
to the same family). Confirmed in browser:
- customer fixture round-trip: 65/65/65, zero CL substitutions
- Luccas user-override repro: setFontFamily('Arial') -> exported
  <w:rFonts w:ascii="Arial" w:hAnsi="Arial" w:eastAsia="Arial" w:cs="Arial"/>

Adds a third regression test pinning the mixed theme+concrete-per-script case
that the suite previously missed (because the integration test uses
isHeadless: true which bypasses the plugin entirely).
13 parametrized scenarios pin every shape of existing.fontFamily x mark-decoded
shape that the plugin must handle on round-trip:

Group A — theme preservation when marks re-encode the imported value:
  A1: all 4 theme slots
  A2: 3 theme slots (ascii/hAnsi/cs) — SD-2894 customer fixture shape
  A3: 3 themes + concrete eastAsia — the regression case the gate now handles
  A4: ascii-only theme
  A5: hAnsi-only theme
  A6: cs-only theme (lowercase `cstheme` OOXML quirk)
  A7: eastAsia-only theme

Group B — user override drops theme refs:
  B1: all 4 themes + Arial
  B2: customer shape + Arial (Luccas's case)
  B3: mixed theme+concrete + Arial

Group C — no-theme baselines:
  C1: pure concrete unchanged
  C2: pure concrete + different concrete
  C3: empty existing

The universal encoder mock mirrors resolveDocxFontFamily's behaviour: any
theme ref wins for the primary slot, falling back to concrete ascii/hAnsi/etc.
Per-script extras (eastAsiaFontFamily/csFontFamily) attach only when concrete
slot value differs from primary fontFamily — same emission rule as the real
encoder in super-converter/styles.js.

These scenarios are the basis for catching any future drift in the gate
logic or merge helper — both the import-preservation path and the
user-override path are locked at every shape we ship.
@tupizz
Copy link
Copy Markdown
Contributor Author

tupizz commented May 13, 2026

Update: pushed e06c0b831 — a 13-scenario mutation matrix that walks the cross-product of (existing.fontFamily shape × mark-decoded shape) the plugin must handle.

Scenarios covered

Group A — theme preservation (marks re-encode imported value → keep theme refs)

  • A1: all 4 theme slots
  • A2: 3 theme slots (SD-2894 customer shape)
  • A3: 3 themes + concrete eastAsia (the regression case bec7e63ed fixed)
  • A4: ascii-only theme
  • A5: hAnsi-only theme
  • A6: cs-only theme (lowercase cstheme OOXML quirk)
  • A7: eastAsia-only theme

Group B — user override (marks diverge → drop theme refs)

  • B1: all 4 themes + user picks Arial
  • B2: customer shape + Arial (the case you originally reported)
  • B3: mixed theme+concrete + Arial

Group C — no-theme baselines

  • C1: pure concrete unchanged
  • C2: pure concrete + different concrete
  • C3: empty existing

Encoder mock fidelity

The universal encoder mirrors resolveDocxFontFamily's real behaviour: any theme ref wins for the primary slot, then fall back to concrete ascii/hAnsi/eastAsia/cs. Per-script extras (eastAsiaFontFamily/csFontFamily) attach only when the concrete slot value differs from the primary fontFamily — same emission rule as encodeMarksFromRPr in super-converter/styles.js. I had to make the mock encoder smarter to catch A5/A6/A7 (single non-ascii theme slots); the real resolver's 'major' fallback handles those identically.

Results

  • Full super-editor suite: 12,724 passed, 13 skipped, 0 failed (+13 from bec7e63ed).
  • Browser round-trip of the SD-2894 customer fixture: still 65/65/65 theme attrs preserved.
  • Browser repro of your case: still exports <w:rFonts w:ascii="Arial" .../> with zero theme attrs on the edited run.

The matrix will fail loudly if the gate logic or merge helper ever drift. Together with the smaller unit tests already on this PR, both the preservation path AND the user-override path are now locked at every shape we ship.

…limitation

Addresses two codex review findings on PR #3225:

(1) Cascade test coverage: the original SD-2894 bug was that combineRunProperties'
    slot dedup only handled (ascii, asciiTheme) correctly; hAnsi/eastAsia/cs leaked.
    Existing cascade tests didn't directly exercise the four-slot behaviour. Adds:
      - one test per slot for the "theme drops concrete" direction
      - one test per slot for the "concrete drops theme" direction
      - an Athenaintelligence-customer-shape test
      - an export assertion pinning FONT_SLOT_THEME_PAIRS so it stays in sync with
        the super-editor plugin that re-imports it
    These prevent any single-slot regression of the original SD-2894 fix.

(2) Per-script gate limitation: the plugin's gate + merge pair operate at the
    fontFamily-key level, not per-slot. A user action that mutates only
    eastAsiaFontFamily or csFontFamily without changing the primary fontFamily
    will pass the gate. Documented as an AIDEV-NOTE on mergeFontFamilyPreservingThemeRefs.
    We accept the trade-off because (a) the toolbar's setFontFamily always changes
    the primary, (b) no current API or paste path produces this shape against a
    themed import, (c) the prior attempt at increased precision (f046df9) regressed
    the customer fixture (3 of 65 runs); narrowing to per-slot would re-open that
    risk without a concrete reproduction. If a real customer scenario surfaces,
    file a follow-up with the fixture and rework into per-slot decisions then.
@tupizz tupizz requested a review from luccas-harbour May 13, 2026 18:09
@caio-pizzol caio-pizzol self-assigned this May 13, 2026
Copy link
Copy Markdown
Contributor

@luccas-harbour luccas-harbour left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@luccas-harbour luccas-harbour merged commit ef9821e into main May 14, 2026
69 checks passed
@luccas-harbour luccas-harbour deleted the tadeu/sd-2894-preserve-theme-rfonts-on-round-trip branch May 14, 2026 13:19
@superdoc-bot
Copy link
Copy Markdown
Contributor

superdoc-bot Bot commented May 14, 2026

🎉 This PR is included in @superdoc-dev/mcp v0.3.0-next.93

The release is available on GitHub release

@superdoc-bot
Copy link
Copy Markdown
Contributor

superdoc-bot Bot commented May 14, 2026

🎉 This PR is included in @superdoc-dev/react v1.2.0-next.137

The release is available on GitHub release

@superdoc-bot
Copy link
Copy Markdown
Contributor

superdoc-bot Bot commented May 14, 2026

🎉 This PR is included in vscode-ext v2.3.0-next.139

@superdoc-bot
Copy link
Copy Markdown
Contributor

superdoc-bot Bot commented May 14, 2026

🎉 This PR is included in superdoc-cli v0.8.0-next.108

The release is available on GitHub release

@superdoc-bot
Copy link
Copy Markdown
Contributor

superdoc-bot Bot commented May 14, 2026

🎉 This PR is included in superdoc v1.30.0-next.88

The release is available on GitHub release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants