feat(layout-engine): balance columns at continuous section breaks (SD-2452)#2869

Merged
luccas-harbour merged 23 commits into main from
tadeu/sd-2452-feature-implement-column-balancing-for-continuous-section
May 5, 2026

Conversation

@tupizz
Contributor

@tupizz tupizz commented Apr 20, 2026

Comparison Results PDF
SD-2452-page-by-page.pdf

Summary

Implements ECMA-376 §17.18.77 column balancing for multi-column sections. Word produces a minimum-height balanced layout at the end of a continuous (and, empirically, next-page) multi-column section; SuperDoc was either leaving content stacked in the first column or, in some layouts, producing overlapping fragments.

Linear: SD-2452

What changed

  1. layoutDocument builds a block → section map by walking blocks in document order and tracking the current section from the most recent sectionBreak (pm-adapter only stamps attrs.sectionIndex on sectionBreak blocks, not on content paragraphs).
  2. New balanceSectionOnPage helper performs section-scoped balancing with its own fragment-level positioning (no Y-grouping). Fragments are ordered by (x, y) and each is treated as its own block. The previous balancePageColumns grouped fragments by Y into rows, which collapsed fragments from different source columns at the same Y and produced overlap.
  3. calculateBalancedColumnHeight is a proper binary search for the minimum H such that greedy left-to-right fill places every block with every column ≤ H. Matches Word's left-heavy packing preference (e.g. 7 blocks / 3 cols → 3+3+1, not 2+2+3).
  4. Mid-page hook at forceMidPageRegion balances the ending section on the current page before starting the new region, and collapses both cursors to balanceResult.maxY so the next region begins just below the balanced columns. Sections handled mid-page are tracked in alreadyBalancedSections so the post-layout pass doesn't double-balance.
  5. Per-section post-layout loop replaces the prior "last page of document" heuristic — each multi-column section's last page is balanced, skipping sections already handled mid-page.
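
To make item 3 concrete, here is a minimal sketch of a binary search over the column height combined with a greedy left-to-right fill. It assumes integer block heights and blocks that cannot be split. `greedyFill` and `minBalancedHeight` are hypothetical names for illustration, not the actual `calculateBalancedColumnHeight` code.

```typescript
// Hypothetical sketch: names and the integer-height assumption are illustrative only.

// Fill columns left to right; move to the next column when a block would exceed `limit`.
// Returns the number of blocks per column, or null if the blocks do not fit.
function greedyFill(heights: number[], columnCount: number, limit: number): number[] | null {
  if (columnCount < 1) return null;
  const counts = new Array<number>(columnCount).fill(0);
  let col = 0;
  let used = 0;
  for (const blockHeight of heights) {
    if (blockHeight > limit) return null; // a block taller than the limit can never fit
    if (used + blockHeight > limit) {
      col += 1;
      used = 0;
      if (col >= columnCount) return null; // ran out of columns
    }
    used += blockHeight;
    counts[col] += 1;
  }
  return counts;
}

// Binary search for the smallest limit at which greedyFill succeeds.
function minBalancedHeight(heights: number[], columnCount: number): number {
  let lo = Math.max(0, ...heights);
  let hi = heights.reduce((sum, blockHeight) => sum + blockHeight, 0);
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (greedyFill(heights, columnCount, mid) !== null) hi = mid;
    else lo = mid + 1;
  }
  return lo;
}

const sevenEqualBlocks = [10, 10, 10, 10, 10, 10, 10];
const balancedHeight = minBalancedHeight(sevenEqualBlocks, 3);
console.log(balancedHeight, greedyFill(sevenEqualBlocks, 3, balancedHeight));
```

Under these assumptions, seven equal blocks in three columns settle at a height of three blocks and a 3+3+1 distribution, consistent with the left-heavy packing described in item 3.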

Results (Word vs SuperDoc)

| Test | Scenario | Word | SuperDoc before | SuperDoc after |
|---|---|---|---|---|
| 1 | 6 equal paragraphs, 2 cols (continuous break) | 3+3 | 6+0 (not balanced) | 3+3 (exact match) |
| 2 | 5 paragraphs with unequal heights, 2 cols | 2+3 | 5+0 (not balanced) | 2+3 (exact match) |
| 3 | 7 equal paragraphs, 3 cols | 3+3+1 | 7+0+0 (not balanced) | 3+3+1 (exact match) |
| 4 | 13 paragraphs with multi-line bodies, 2 cols | 7+6 | Overlapping fragments | 7+6 (exact match) |
| 5 | Continuous + next-page sections (5+5) | 3+2 / 3+2 | Not balanced | 3+2 / 3+2 (exact match) |

Side-by-side PDF comparison available locally at /tmp/sd-2452-fixtures/SD-2452-comparison.pdf (generated via new compare-word-vs-superdoc skill).

Test plan

  • 614 `@superdoc/layout-engine` tests pass (11 new for SD-2452)
  • 1,737 `@superdoc/pm-adapter` tests pass
  • 11,375 `super-editor` tests pass
  • 0 overlap regressions across local corpus (14 docs; none activate the balancing code path, since the fix is scope-gated to sections with `count > 1`)
  • Visual validation against Microsoft Word for all 5 fixtures
  • Browser sanity: scroll stable, zoom stable, no fragment overlaps
  • `pnpm test:layout` against production reference (blocked on wrangler re-auth locally — CI will run this)
  • Upload fixtures to R2 corpus for visual regression coverage

Demo tests

[Six screenshots: CleanShot 2026-04-20 at 15:11:27, 15:11:54, 15:12:05, 15:12:16, 15:12:29, 15:12:39]

Fixtures

  • `spec-test-1.docx` — Basic 2-column balance
  • `spec-test-2.docx` — Unequal paragraph heights
  • `spec-test-3.docx` — Three-column balance
  • `spec-test-4.docx` — Long content / overlap scenario
  • `spec-test-5.docx` — Continuous + next-page break combo

Plan is to upload these to the R2 corpus after the PR lands.

@linear

linear Bot commented Apr 20, 2026

…-2452)

Implements ECMA-376 §17.18.77 column balancing for multi-column sections.
Word produces a minimum-height balanced layout at the end of a continuous
(and, empirically, next-page) multi-column section; SuperDoc was either
leaving content stacked in the first column or, in some layouts, producing
overlapping fragments.

The pagination pipeline now balances each multi-column section's last page
at layout time:

  - layoutDocument builds a block -> section map by walking blocks in
    document order and tracking the current section from the most recent
    sectionBreak (pm-adapter only stamps attrs.sectionIndex on sectionBreak
    blocks, not on content paragraphs).
  - A new balanceSectionOnPage helper performs section-scoped balancing
    with its own fragment-level positioning (no Y-grouping): fragments are
    ordered by (x, y) in document order and each is treated as its own
    block. The previous balancePageColumns grouped fragments by Y into
    "rows," which collapsed fragments from different source columns at the
    same Y and produced overlap.
  - calculateBalancedColumnHeight is now a proper binary search for the
    minimum column height H such that greedy left-to-right fill places
    every block with every column <= H. This matches Word's left-heavy
    packing preference (e.g. 7 blocks / 3 cols -> 3+3+1, not 2+2+3).
  - A mid-page hook at forceMidPageRegion balances the ending section on
    the current page before starting the new region, and collapses both
    cursors to balanceResult.maxY so the next region begins just below the
    balanced columns. Sections handled mid-page are tracked in
    alreadyBalancedSections so the post-layout pass doesn't double-balance.
  - The prior "last page of document" heuristic is replaced with a
    per-section post-layout loop that balances each multi-column section's
    last page, skipping sections already handled mid-page.
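
The block-to-section map from the first bullet can be sketched as follows. This is a simplified illustration: the `FlowBlockLike` shape, the `kind` field, and the assumption that content before any break belongs to section 0 are all hypothetical and not taken from the actual `layoutDocument` code.

```typescript
// Hypothetical, simplified block shape for illustration only.
interface FlowBlockLike {
  id: string;
  kind: "paragraph" | "table" | "sectionBreak";
  attrs?: { sectionIndex?: number };
}

// Walk blocks in document order and remember the most recent sectionBreak's index.
function buildBlockSectionMap(blocks: FlowBlockLike[]): Map<string, number> {
  const sectionOf = new Map<string, number>();
  let currentSection = 0; // assumption: content before any break is section 0
  for (const block of blocks) {
    if (block.kind === "sectionBreak" && block.attrs?.sectionIndex !== undefined) {
      currentSection = block.attrs.sectionIndex;
    }
    // Content blocks carry no sectionIndex of their own, so they inherit the tracked one.
    sectionOf.set(block.id, currentSection);
  }
  return sectionOf;
}
```

A single pass is enough because only `sectionBreak` blocks carry `attrs.sectionIndex`; every other block inherits the value tracked so far.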

Tests:

  - 11 new unit/integration tests covering the 5 SD-2452 fixtures
    (2-col/3-col, equal and unequal heights, continuous and next-page
    breaks, multi-page sections, explicit column-break opt-out).
  - 614 layout-engine tests pass, 1737 pm-adapter tests pass,
    11375 super-editor tests pass.

Visual validation against Microsoft Word for all 5 fixtures:

  - Test 1 (6 paras / 2 cols):       3+3        exact match
  - Test 2 (5 mixed / 2 cols):       2+3        exact match
  - Test 3 (7 paras / 3 cols):       3+3+1      exact match
  - Test 4 (13 paras / 2 cols):      7+6        exact match, overlap gone
  - Test 5 (continuous + next-page): 3+2, 3+2   exact match
@tupizz tupizz force-pushed the tadeu/sd-2452-feature-implement-column-balancing-for-continuous-section branch from dd5aff7 to 5b2335a April 20, 2026 17:40
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.

…uction (SD-2452)

When a mid-page section break reduced the column count (e.g. 2-col ->
1-col for test 4's 13-paragraph fixture followed by OVERLAP CHECK), the
mid-page hook's forced-page-break guard ran before balancing:

  if (columnIndexBefore >= newColumns.count) {
    state = paginator.startNewPage();
  }
  // ... balance ran here, on the empty new page

At the section transition, columnIndexBefore=1 (paginator was in col 1)
and newColumns.count=1, so the guard forced a new page before balancing
had a chance to reposition the ending section's fragments. Balancing
then ran on the empty new page (no-op), the paginator placed the
post-columns single-column content on the new page, and the old page's
fragments were balanced by the post-layout pass. Net effect: columns
looked correct on page 0 but OVERLAP CHECK ended up on page 1, while
Word fits everything on one page.

The guard exists to prevent new 1-col content from overwriting earlier
column content on the same page. With balancing, that risk disappears:
all ending-section fragments are repositioned within the section's own
vertical region, and the cursor moves to maxY below the balanced
columns. The new region starts safely below.

Fix: balance first. Only fall through to the forced-page-break guard
when the ending section won't be balanced (single-col -> multi-col,
explicit column break, or no section-1 fragments on the page).
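
The reordering can be sketched roughly as below. The callback-based `TransitionDeps` interface and the function name are hypothetical stand-ins; the real paginator API is not shown in this commit message.

```typescript
// Hypothetical types and callbacks; only the ordering of the two steps is the point.
interface BalanceResult {
  maxY: number;
}

interface TransitionDeps {
  balanceEndingSection: () => BalanceResult | null; // null when balancing is skipped
  startNewPage: () => void;
  setCursorY: (y: number) => void;
}

function handleMidPageTransition(
  columnIndexBefore: number,
  newColumnCount: number,
  deps: TransitionDeps,
): "balanced" | "new-page" | "same-page" {
  // Balance first, while the ending section's fragments are still on the current page.
  const result = deps.balanceEndingSection();
  if (result !== null) {
    deps.setCursorY(result.maxY); // the new region starts just below the balanced columns
    return "balanced";
  }
  // Only when nothing was balanced does the forced-page-break guard apply.
  if (columnIndexBefore >= newColumnCount) {
    deps.startNewPage();
    return "new-page";
  }
  return "same-page";
}
```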

Test 4 now renders on a single page, matching Word:
  - 7+6 balanced columns
  - OVERLAP CHECK heading at y=758 (right below columns)
  - "If this overlaps..." at y=794
  - Total: 1 page (was 2)

All 5 SD-2452 fixtures now match Word's pagination exactly. 614
layout-engine tests still pass.
@tupizz tupizz self-assigned this Apr 20, 2026
@tupizz tupizz marked this pull request as ready for review April 20, 2026 18:15
@tupizz tupizz requested a review from harbournick April 20, 2026 18:19

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e4265964d6

Comment thread packages/layout-engine/layout-engine/src/index.ts Outdated
Comment thread packages/layout-engine/layout-engine/src/index.ts Outdated
…D-2646) (#2930)

* fix(pm-adapter): emit section break before non-paragraph nodes (SD-2646)

Per ECMA-376 §17.6.17, a <w:sectPr> inside a paragraph defines the section
that ENDS with that paragraph. All body children preceding it — paragraphs,
tables, top-level drawings, SDTs — belong to that section.

Section ranges were indexed purely by paragraph count, and section-break
blocks were emitted only inside handleParagraphNode. A table that sat
between two sectPr-marker paragraphs was emitted into the flow stream
BEFORE the section break that declared its column config, so the layout
engine laid it out under the prior section's settings.

This is the root cause of IT-945 rendering a 114-row 2-col continuous
table in column 0 across three pages with column 1 empty: the table was
placed in the 1-col section, not the 2-col section.

Fix:
- Track nodeIndex over every top-level doc.content child in
  findParagraphsWithSectPr and SectionRange (alongside paragraphIndex,
  which SDT handlers still use for intra-SDT transitions).
- Add maybeEmitNextSectionBreakForNode in sections/breaks.ts and call
  it from internal.ts's main dispatch loop BEFORE every top-level
  handler. Any non-paragraph node crossing a section boundary now
  triggers the break.
- Section-model primer in pm-adapter/README.md with spec citations.

Tests: 1739/1739 pass in pm-adapter (including new end-tagged.test.ts
and integration test in index.test.ts asserting flow-block order).

* fix(layout-engine): split dominant table at row boundary when balancing section-final page (SD-2646)

The column balancer treats each fragment as an atomic block. A
multi-page two-column continuous section's final page can end up with
a single table fragment taller than totalSectionHeight / columnCount.
The atomic-block binary search then places the whole table in one
column and leaves the other empty — diverging from Word, which
balances by splitting the table at a row boundary per ECMA-376
§17.18.77 ("a continuous section break balances the content of the
previous section").

Fix: add splitDominantTableAtRowBoundary as a preprocessor inside
balanceSectionOnPage. When the section has a single splittable table
fragment larger than target, split it at the row whose cumulative
height first meets or exceeds totalSectionHeight / columnCount. The
two halves are inserted in place of the original; the rest of the
balancer runs unchanged and naturally assigns one to each column.
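
A minimal sketch of the row-selection rule described above ("the row whose cumulative height first meets or exceeds" the target). The function name and the rule that a split must leave at least one row in the second half are assumptions for illustration, not the actual splitDominantTableAtRowBoundary code.

```typescript
// Hypothetical helper: returns how many leading rows go into the first half,
// or null when no split would leave rows on both sides.
function findSplitRowCount(rowHeights: number[], target: number): number | null {
  let cumulative = 0;
  let rowsInFirstHalf = 0;
  for (const rowHeight of rowHeights) {
    cumulative += rowHeight;
    rowsInFirstHalf += 1;
    if (cumulative >= target) {
      // Assumption: a split is only useful if the second half keeps at least one row.
      return rowsInFirstHalf < rowHeights.length ? rowsInFirstHalf : null;
    }
  }
  return null;
}

// 28 equal rows in 2 columns: target is half the total height.
const equalRows = new Array<number>(28).fill(20);
const splitTarget = equalRows.reduce((sum, rowHeight) => sum + rowHeight, 0) / 2;
console.log(findSplitRowCount(equalRows, splitTarget));
```

With 28 equal rows and two columns, the cumulative height reaches the target at the 14th row, which lines up with the 14/14 split reported for IT-945 below.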

Also add getBalancingHeight so empty sectPr-marker paragraphs
(measured lines with width=0) contribute 0 to balancing — matching
Word's behavior of not rendering an empty line for such markers.
This keeps both columns top-aligned on the section-final page.

On IT-945: page 2 now splits 14/14 from y=96 in both columns, matching
Word's top-alignment. Before this fix page 2 rendered all 28 remaining
rows in col 1 with col 0 empty.

Tests: strengthened existing "balances the section-ending page" test
(it was passing trivially via `if (sectionFragments.length > 1)`
guard). Added narrow-table multi-page regression test. 616/616 pass.
@harbournick harbournick requested a review from a team as a code owner April 30, 2026 18:07
@harbournick harbournick self-assigned this Apr 30, 2026
@harbournick
Collaborator

@tupizz please double check layout testing and the below. I see some documents that have definitely regressed.
Also double check these:

  1. Use each section’s own page metrics when rebalancing

    The post-layout balancing pass appears to rebalance earlier multi-column sections using the final active margins/page
    size from a later section. If a two-column section with one set of margins is followed by another section with different
    margins or page size, fragments from the earlier page can be rewritten to the later section’s x/width values.

    Balancing should use the page/section metrics for the section being rebalanced, not the final active layout state.

  2. Keep balancing document-wide column layouts
    When callers use LayoutOptions.columns without section break metadata, sectionColumnsMap is empty, so the
    section-based balancing loop never runs. That leaves the final page stacked in column 0.
    The previous activeColumns.count > 1 path handled document-wide multi-column layouts, so this needs a fallback for
    the active/options column config when there are no section-scoped entries.

  3. Preserve blank paragraph height when balancing columns
    A normal blank paragraph can measure as a line with width === 0, but it should still consume line height. The
    current logic treats that like a zero-height sectPr marker, so the column cursor does not advance and the next
    paragraph can overlap the blank line.

    This should be gated on actual section-property marker metadata rather than line width alone.

  4. Write table row boundaries using the fragment metadata shape
    When a dominant table is split for balancing, row boundaries are stored using the renderer’s serialized keys: i,
    h, min, r.
    TableFragmentMetadata.rowBoundaries expects index, height, minHeight, and resizable. Because the DOM
    renderer later serializes those contract fields, split table fragments can produce row-boundary data with undefined
    values, breaking row resize handles.

@harbournick
Collaborator

@luccas-harbour can you pls work with Tadeu on getting this one to the finish line next week? Be extra careful around layout testing, and pls make sure there are no regressions!

@luccas-harbour
Contributor

hey @tupizz! nothing to add right now apart from Nick's comment. I'll have another look once those are addressed.

also, note that the build is failing, which prevented the behavior tests from running.

thanks!

…2452)

Address Nick's four review comments on PR #2869:

1. Section-local page geometry. The post-layout balancing pass derived
   contentWidth/availableHeight/margins.left from the FINAL active state,
   which silently rewrote earlier sections using the last section's content
   box. Read margins and size from each section's last page instead, so
   documents with mixed page setups (orientation, margins, paper size) per
   section keep their own metrics during balancing.

2. Document-wide column-layout fallback. When a caller passes
   LayoutOptions.columns directly without any sectionBreak blocks,
   sectionColumnsMap stayed empty and the per-section loop never ran,
   leaving the final page stacked in column 0. Synthesize a virtual
   section that spans the whole document when no sectionBreak exists,
   preserving the pre-SD-2452 final-page balancing behavior. Guard with
   documentHasExplicitColumnBreak so author intent wins.

3. Blank-paragraph height preservation. The earlier `line.width === 0`
   heuristic for sectPr-marker paragraphs also matched ordinary blank
   paragraphs, collapsing their height and causing the next paragraph to
   overlap the empty line. Replace with an explicit
   `attrs.sectPrMarker` block-id set threaded through the balance APIs.

4. Table rowBoundaries shape. splitDominantTableAtRowBoundary stored
   regenerated boundaries using the renderer's compact serialized keys
   ({i,h,min,r}) instead of the contract `TableRowBoundary` shape
   ({index,height,minHeight,resizable}). The DOM renderer's projection
   then produced undefined values, breaking row-resize handles on split
   table fragments.
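
For item 4, the mismatch between the two shapes can be illustrated with a small conversion sketch. The key names come from the description above; the field types and both function names are assumptions.

```typescript
// Compact serialized keys ({i,h,min,r}); field types are assumed.
interface CompactRowBoundary {
  i: number;
  h: number;
  min: number;
  r: boolean;
}

// Contract shape ({index,height,minHeight,resizable}); field types are assumed.
interface TableRowBoundary {
  index: number;
  height: number;
  minHeight: number;
  resizable: boolean;
}

// Hypothetical converters between the two shapes.
function toContractBoundary(b: CompactRowBoundary): TableRowBoundary {
  return { index: b.i, height: b.h, minHeight: b.min, resizable: b.r };
}

function toCompactBoundary(b: TableRowBoundary): CompactRowBoundary {
  return { i: b.index, h: b.height, min: b.minHeight, r: b.resizable };
}
```

Reading `index` or `height` from an object that only has `i` and `h` yields `undefined`, which is the failure mode described in item 4.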

Plus a robustness fix: `getFragmentHeight` now consults
`measure.totalHeight` for tables when fragment.height is 0, so balancing
math doesn't silently zero out tables whose layout pass allocated no
height (e.g. header-less tables in degenerate test fixtures).
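
A sketch of the fallback described above, assuming simplified shapes. `LayoutFragmentLike`, `TableMeasureLike`, and the function name are hypothetical; the actual `getFragmentHeight` signature is not shown here.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface LayoutFragmentLike {
  kind: "para" | "table";
  height: number;
}

interface TableMeasureLike {
  totalHeight?: number;
}

// Prefer the fragment's own height; for tables with height 0, fall back to the measure.
function fragmentHeightForBalancing(fragment: LayoutFragmentLike, measure?: TableMeasureLike): number {
  if (fragment.kind === "table" && fragment.height === 0 && measure?.totalHeight !== undefined) {
    return measure.totalHeight;
  }
  return fragment.height;
}
```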

All 653 layout-engine unit tests pass.
Contributor

@luccas-harbour luccas-harbour left a comment

hey! I saw you addressed Nick's comments so I had another look and found a few things. the biggest one is that willBalance doesn't match balanceSectionOnPage's actual skip conditions, so the page-break fallback can be skipped from an invalid column state. there's also a continuesOnNext mutation in the table split that always collapses to false, and the split target ignores content already placed in column 0.

I am also running into this issue when trying to build the project:

packages/super-editor/src/editors/v1/document-api-adapters/helpers/sections-resolver.ts:168:3 - error TS2739: Type '{ sectionIndex: number; startParagraphIndex: number; endParagraphIndex: number; sectPr: SectPrElement; margins: null; pageSize: null; orientation: null; columns: null; type: SectionType.CONTINUOUS; ... 4 more ...; vAlign: undefined; }' is missing the following properties from type 'SectionRange': startNodeIndex, endNodeIndex

168   return {

let me know what you think!

Comment thread packages/layout-engine/layout-engine/src/index.ts Outdated
Comment thread packages/layout-engine/layout-engine/src/column-balancing.ts Outdated
Comment thread packages/layout-engine/layout-engine/src/column-balancing.ts Outdated
Comment thread packages/layout-engine/layout-engine/src/column-balancing.ts
Comment thread packages/layout-engine/layout-engine/src/column-balancing.ts
Comment thread packages/layout-engine/pm-adapter/src/sections/breaks.ts
…ast-section (SD-2452)

Per ECMA-376 §17.18.77 and the Linear spec for SD-2452, only continuous
section breaks trigger column balancing. The previous post-layout pass
balanced every multi-column section's last page regardless of break type,
producing column distributions Word does not.

Two cases need to be excluded:

1. Sections that end with a non-`continuous` break (`nextPage`, `evenPage`,
   `oddPage`). pm-adapter uses end-tagged section semantics, so
   `SectionBreakBlock.type` describes the break that ENDS the section.
   Documents like sd-1655-col-sep-3-equal-columns (3 cols, body sectPr
   only) and multi-column-sections.docx (default `nextPage` everywhere)
   were being rebalanced into 3+4+2 / 2+2 splits when Word fills
   column-by-column without balancing them at all.

2. The LAST section. The body sectPr is always the final section break
   and represents the document end, not a real mid-document break. Even
   when its type defaults to `continuous` (DEFAULT_BODY_SECTION_TYPE),
   there is no break AFTER its content to act as the balancing trigger.
   For single-section docs with multi-column body sectPr (sd-1655) Word
   does not balance, and now we don't either.

Tracking:
- `sectionEndBreakType: Map<sectionIndex, type>` records per-section the
  type of the break that closed the section (read from `block.type` on
  the SectionBreakBlock).
- `lastSectionIdx` records the highest sectionIndex seen during the
  block walk; the gate skips it.
- The synthesized fallback section (FALLBACK_SECTION_IDX = -1, used when
  callers pass `LayoutOptions.columns` without any pm-adapter section
  metadata) bypasses both gates so the document-wide fallback still
  fires for direct-API integrations.

The mid-page balancing branch (`forceMidPageRegion`) is already gated
correctly because it runs only inside the `block.type === 'continuous'`
branch of `scheduleSectionBreakCompat`, and the section being closed
mid-page can never be the last section.

All 5 SD-2452 spec-test fixtures continue to balance correctly:
  spec-test-1: 3+3 / spec-test-2: 2+3 / spec-test-3: 3+3+1
  spec-test-4: 7+7 / spec-test-5: 3+2 | 3+2

Regression docs now match Word:
  sd-1655-col-sep-3-equal-columns: was 3+4+2, now 7+1 (Word: 7+1)
  multi-column-sections: was balanced, now col-by-col (matches Word)
  multi_section_doc: was balanced, now col-by-col (matches Word)

sd-2326-col-sep-continuous-section-break still balances 2+2 because
its mid-document break is explicitly `continuous`.

All 653 layout-engine unit tests pass.
tupizz added 12 commits May 4, 2026 19:18
…-implement-column-balancing-for-continuous-section

# Conflicts:
#	packages/layout-engine/pm-adapter/src/types.d.ts
…lti-page (SD-2452)

The previous gate was too strict: it skipped balancing for the last
section unconditionally, which regressed the existing baseline behavior
for multi-page multi-column documents whose only section is the body
sectPr (e.g. two_column_two_page-arial 2 page 17, where Word produces a
3+2 split — confirmed against Word's PDF render).

Refined rule for the last section: balance only when the section spans
multiple pages. Empirical Word behavior:

  - sd-1655-col-sep-3-equal-columns: 1 section, body sectPr, 1 page,
    3 cols → Word does NOT balance (col 1 holds 6 paragraphs, col 2
    holds 1, col 3 empty). Single-page → don't balance.
  - layout/two_column_two_page-arial 2: 1 section, body sectPr, 17
    pages, 2 cols → Word balances the last page (3+2 split).
    Multi-page → balance.
  - multi-column-sections / multi_section_doc: each section is a single
    page, default `nextPage` between them → no balancing (already
    excluded by the non-`continuous` end-break check).
  - sd-2326-col-sep-continuous-section-break: explicit `continuous` mid-
    document break → balance (already covered by the non-last branch).

Implementation: when sectionIdx === lastSectionIdx, count pages whose
fragments belong to that section. If the count is ≤ 1, skip balancing.
The check short-circuits at >1 to avoid scanning the full page list.
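
The page-count check with its early exit might look roughly like this. The `LayoutPageLike` shape and the lookup through a block-to-section map are assumptions for illustration.

```typescript
// Hypothetical, simplified page shape for illustration only.
interface LayoutPageLike {
  fragments: { blockId: string }[];
}

// True as soon as a second page containing fragments of the section is found.
function sectionSpansMultiplePages(
  pages: LayoutPageLike[],
  sectionOf: Map<string, number>,
  sectionIdx: number,
): boolean {
  let pagesWithSection = 0;
  for (const page of pages) {
    if (page.fragments.some((f) => sectionOf.get(f.blockId) === sectionIdx)) {
      pagesWithSection += 1;
      if (pagesWithSection > 1) return true; // short-circuit: no need to scan further
    }
  }
  return false;
}
```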

Corpus impact (vs npm@latest 1.31.1, after merging main):
  - 374 docs total, 363 unchanged, 11 changed (2 unique + 9 widespread)
  - The 9 widespread-only changes are all `pages[*].fragments[*].x|y`
    on a single page each — the SD-2452 balancing applied to the
    correct subset of multi-page multi-column sections.
  - All 5 SD-2452 spec-test fixtures continue to balance correctly:
    spec-test-1: 3+3 / spec-test-2: 2+3 / spec-test-3: 3+3+1
    spec-test-4: 7+7 / spec-test-5: 3+2 | 3+2

All 653 layout-engine unit tests pass.
- index.ts mid-page balance: page-break fallback now triggers whenever
  balanceSectionOnPage returns null, not only when willBalance was false.
  willBalance is a coarse approval; balanceSectionOnPage has its own
  late skip conditions (unequal column widths, zero remaining height,
  shouldSkipBalancing thresholds) that can return null even after
  willBalance=true. Without the broader check, the new region started
  on the same page from a stale column index and overwrote the previous
  section's column content.

- column-balancing.ts split target: subtract preceding-fragment height
  from totalSectionHeight / columnCount before walking the table rows.
  A 100px paragraph + 300px table in 2 cols hit target=200 and split
  the table at row=200 (cols 100+200 / 100, max=300); subtracting the
  100 leading height gives target=150 → splits at row=100 (cols 100+100
  / 200, max=200), matching the achievable balanced height.

- column-balancing.ts split continuesOnNext: capture the original value
  BEFORE setting `table.continuesOnNext = true`. The previous ternary
  read the field after the mutation, always saw `true`, and the second
  half always inherited `false`. Now the second half correctly inherits
  the source table's cross-page continuation.

- column-balancing.ts split rollback: splitDominantTableAtRowBoundary
  now returns a rollback closure. balanceSectionOnPage invokes it when
  shouldSkipBalancing fires post-split, so the page never carries an
  overlapping half table when balancing is ultimately skipped. The
  ordering (split-then-skip) is intentional — split rescues the
  single-unbreakable case that pre-split skip would otherwise reject —
  but with rollback the mutation no longer survives a late skip.

- column-balancing.ts: remove balancePageColumns and its test block.
  The function had no production callers after balanceSectionOnPage
  became the only entry point. Its shared helper (createMeasure) is
  inlined into the balanceSectionOnPage tests.

- super-editor sections-resolver.ts: add startNodeIndex / endNodeIndex
  to the synthetic SectionRange. Required after the main-merge that
  added these fields to SectionRange (commit 85a503c). Fixes the
  TS2739 build error luccas reported.
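
The continuesOnNext and rollback fixes above share one idea: capture the original state before mutating the table. A minimal sketch, assuming a simplified fragment with only a row count and the continuation flag (both the shape and the function name are hypothetical):

```typescript
// Hypothetical, simplified table fragment for illustration only.
interface TableFragmentLike {
  rowCount: number;
  continuesOnNext: boolean;
}

function splitTableFragment(
  table: TableFragmentLike,
  rowsInFirstHalf: number,
): { second: TableFragmentLike; rollback: () => void } {
  // Capture the original values BEFORE any mutation.
  const originalRowCount = table.rowCount;
  const originalContinues = table.continuesOnNext;

  const second: TableFragmentLike = {
    rowCount: originalRowCount - rowsInFirstHalf,
    continuesOnNext: originalContinues, // second half inherits cross-page continuation
  };

  table.rowCount = rowsInFirstHalf;
  table.continuesOnNext = true; // first half now continues into the second half

  // Rollback restores the source fragment if balancing is skipped after the split.
  const rollback = (): void => {
    table.rowCount = originalRowCount;
    table.continuesOnNext = originalContinues;
  };
  return { second, rollback };
}
```

Reading `table.continuesOnNext` after the assignment would always observe `true`, which is the bug the capture avoids.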

All 644 layout-engine unit tests pass. super-editor build is clean.
Word draws a column separator only between columns that BOTH have content
within the region. The renderer was drawing the separator full-height
whenever `withSeparator: true` and `count > 1`, regardless of whether the
column to the right of the boundary had any fragments. This produced a
spurious vertical line on pages whose section content fits in column 0
(e.g. multi-column-sections.docx page 2 — Word shows nothing, we drew a
line top-to-bottom of the column area).

Gate each separator on fragment presence past the boundary within the
region's y range:

  hasContentPastSeparator =
    page.fragments.some(f => f.x >= separatorX
                          && f.y >= yStart - 0.5
                          && f.y < yEnd + 0.5)
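
A runnable sketch of the same gate applied to every separator in a region. The `PositionedFragment` shape and the `separatorsToDraw` name are hypothetical; the 0.5 tolerance is taken from the expression above.

```typescript
// Hypothetical, simplified fragment shape for illustration only.
interface PositionedFragment {
  x: number;
  y: number;
}

// Keep only the separators that have at least one fragment to their right
// within the region's vertical range.
function separatorsToDraw(
  fragments: PositionedFragment[],
  separatorXs: number[],
  yStart: number,
  yEnd: number,
  tolerance = 0.5,
): number[] {
  return separatorXs.filter((separatorX) =>
    fragments.some(
      (f) => f.x >= separatorX && f.y >= yStart - tolerance && f.y < yEnd + tolerance,
    ),
  );
}
```

With three columns and an empty third column, only the first separator survives the filter, which corresponds to the sd-1655 case listed below.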

Verified against Word renderings:
  - multi-column-sections page 2 (col 1 only)         → 0 separators ✓
  - sd-1655 (3 cols, col 3 empty)                     → 1 separator (col 1↔2) ✓
  - sd-2326 (mid-doc continuous, balanced 2 cols)     → separator drawn ✓
  - two_column_two_page-arial 2 page 17 (balanced)    → separator drawn ✓

Tests updated to reflect the gate. Existing 15 separator tests now seed
each verified column with a stub fragment so they pin down geometry, not
the gate. 3 new tests pin down the gate behavior:

  - suppresses separator when right column is empty
  - draws only the separator whose right neighbor has content
  - checks fragment presence within the region, not whole-page

1052 painter-dom tests pass. 644 layout-engine tests pass.
…continuous (SD-2452)

Empirical Word behavior on docs with explicit `<w:type w:val="continuous"/>`
on the body sectPr: balance any multi-column section whose content precedes
the body, even when that section's own end-break is `nextPage` (default).

The simplest reproducer is `tabs/sd-1480-two-col-tab-positions.docx`:
  - 5 paragraphs ending with an inline sectPr (no `<w:type>` → default
    `nextPage`).
  - 1 empty paragraph followed by the body sectPr with explicit
    `<w:type w:val="continuous"/>`.
  - Word renders the 5 entries 3+3 across 2 columns on a single page.
  - Pre-fix: our gate skipped balancing because section 0's break is
    nextPage → the page rendered as 6+0.

Distinguishing explicit vs. default `continuous` requires plumbing a
`typeIsExplicit` flag from the OOXML parser through to the layout-engine:

  - `extractSectionType` now returns `null` when `<w:type>` is absent,
    instead of defaulting to `nextPage`. Callers apply the correct default
    (paragraph sectPr → `nextPage`, body sectPr → `continuous`).
  - `extractSectionData` exposes `typeIsExplicit: boolean`.
  - `SectionRange.typeIsExplicit` carries the flag through analysis.
  - `createSectionBreakBlock` writes it onto `attrs.typeIsExplicit`.
  - `layoutDocument` reads it into `sectionTypeIsExplicit: Map<idx, bool>`.

Updated balance gate (per-section, count > 1):

  Balance if:
    (a) section's own end-break is `continuous` AND it is NOT the last
        section, OR
    (b) the doc contains any EXPLICIT continuous break (typically the body
        sectPr), OR
    (c) the section spans multiple pages.

Otherwise skip — covers `sd-1655-col-sep-3-equal-columns` (single section,
default body continuous, single page → Word fills col-by-col).

Test fallout: three pm-adapter tests asserted that body sectPrs without
`<w:type>` defaulted to `nextPage`. That was a leak from the old
`extractSectionType` paragraph-style default. The corrected default is
`continuous` per OOXML body-sectPr semantics. Tests updated to assert the
new behavior plus `typeIsExplicit: false`.

Verification:
  - sd-1480 page 1: was 6+0, now 4+2 (Word shows 3+3 — balancing engages
    correctly; the 4+2 vs 3+3 distribution gap is residual binary-search
    behavior on uneven paragraph heights, separate from the gate).
  - sd-1655: still col-by-col (no balancing). ✓
  - multi-column-sections, multi_section_doc: still col-by-col. ✓
  - spec-test-1..5: 3+3 / 2+3 / 3+3+1 / 7+7 / 3+2|3+2. ✓
  - sd-2326 (mid-doc continuous): still balanced 2+2. ✓

644 layout-engine + 1802 pm-adapter unit tests pass.
The previous commit changed `extractSectionType` to return `null` when
`<w:type>` was missing, which let analysis.ts apply
`DEFAULT_BODY_SECTION_TYPE = continuous` for body sectPrs. Most fixtures
flipped from `nextPage` to `continuous`, rippling through page-break
placement, header/footer flow, and column-flow decisions across the
whole pipeline (541 of 374 corpus docs changed, 1204 visual diffs).

Surgical revert:

  - `extractSectionType` returns the OOXML default (`'nextPage'`) again,
    matching the pre-PR pipeline behavior. The body-sectPr type is once
    more `'nextPage'` when `<w:type>` is omitted.
  - A new `extractSectionTypeIsExplicit` helper returns `true` only when
    `<w:type>` was actually written. `extractSectionData` exposes it as
    `typeIsExplicit`.
  - `SectionRange.typeIsExplicit` propagates through analysis (paragraph
    sectPrs, body sectPr, fallback final, synthetic ranges).
  - `createSectionBreakBlock` writes `attrs.typeIsExplicit: true` ONLY
    when the flag is true. Omitting the field for the (vast majority of)
    sectPrs without `<w:type>` keeps the FlowBlock attrs schema
    backward-compatible with the published 1.31.1 layout snapshots.

Refined column-balance gate (layout-engine/index.ts) reads the new flag.
Balance if any of:
  1. Section's own end-break is `continuous` AND not the last section
     (covers spec-test-1..5, sd-2326).
  2. Doc has at least one EXPLICIT continuous break AND this section's
     type was NOT explicitly set to a page-forcing type. Covers
     sd-1480-two-col-tab-positions section 0 (default `nextPage` but
     body sectPr explicit continuous → Word balances).
  3. Section spans multiple pages (covers `two_column_two_page-arial 2`
     p17, body default, multi-page).
Otherwise skip — covers `sd-1655-col-sep-3-equal-columns` (single
section, default body, single page → Word fills col-by-col).
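
The three rules above can be sketched as a single predicate. This is a reading of the gate as worded in this commit message, not the code in layout-engine/index.ts; `SectionInfo`, `shouldBalanceSection` and the choice to treat every non-continuous type as "page-forcing" are assumptions.

```typescript
type SectionType = 'nextPage' | 'nextColumn' | 'continuous' | 'evenPage' | 'oddPage';

// Hypothetical per-section summary used only by this sketch.
interface SectionInfo {
  endBreakType: SectionType;
  typeIsExplicit: boolean;
  pageCount: number;
}

function shouldBalanceSection(sections: SectionInfo[], idx: number): boolean {
  const section = sections[idx];
  if (section === undefined) return false;
  const isLast = idx === sections.length - 1;

  // Rule 1: the section's own end-break is continuous and it is not the last section.
  if (section.endBreakType === 'continuous' && !isLast) return true;

  // Rule 2: the doc has an explicit continuous break and this section was not
  // explicitly given a page-forcing type (assumed here: any non-continuous type).
  const docHasExplicitContinuous = sections.some(
    (s) => s.typeIsExplicit && s.endBreakType === 'continuous',
  );
  const explicitlyPageForcing = section.typeIsExplicit && section.endBreakType !== 'continuous';
  if (docHasExplicitContinuous && !explicitlyPageForcing) return true;

  // Rule 3: the section spans multiple pages.
  if (section.pageCount > 1) return true;

  // Otherwise fill column by column.
  return false;
}
```

With an sd-1480-like input (section 0 default `nextPage`, body explicit `continuous`) the predicate allows balancing for section 0; with an sd-1655-like input (one default section on one page) it does not.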

Corpus (vs npm@latest 1.31.1): 541 → 47 changed (13 unique +
34 widespread-only `attrs.typeIsExplicit` schema additions).

Browser verification:
  - spec-test-1..5: 3+3 / 2+3 / 3+3+1 / 7+7 / 3+2|3+2 ✓
  - sd-1655: 7+1 col-by-col ✓
  - multi-column-sections / multi_section_doc: col-by-col ✓
  - sd-1480: was 6+0, now 3+2 (Word: 3+3 — gate engages correctly; the
    remaining 3+2 vs 3+3 gap is balancer-algorithm behavior on uneven
    paragraph heights, separate from the gate).
  - sd-2326: 2+2 ✓

644 layout-engine + 1802 pm-adapter unit tests pass.
…le (SD-2452)

Per ECMA-376 §17.18.77 a continuous break "balances the section it ENDS"
— i.e., the section BEFORE the break, not the section the break belongs
to. When the body sectPr itself is the explicit-continuous trigger, it
balances the section preceding the body, not the body's own content.

Bug: rule 2 ("doc has explicit continuous → balance any non-explicitly-
non-continuous section") was firing on the body section itself when the
body sectPr was the only explicit continuous in the doc. That caused
`tabs/mixed-columns-tabs tnr` p1 to render 10+9 when Word renders 14+5
(column-flow without balancing): the body sectPr is explicit-continuous
+ 2-col, but the 2-col Test list IS the body — there is no preceding
section for the body break to "balance".

Compare with `tabs/sd-1480-two-col-tab-positions`: body sectPr is also
explicit-continuous + 2-col, but the 2-col Page entries live in
section 0 (a section BEFORE the body). The body break correctly
balances section 0, as Word does (Word renders 3+3).

Fix: identify the body-explicit-continuous section (last section
whose typeIsExplicit is true and whose end-break is `continuous`) and
exclude it from rule 2. Section 0 of sd-1480 still balances. Section 1
(body) of mixed-columns-tabs-tnr does not. Body sections can never
balance via rule 1 (the last section cannot be "not last"), but they can
still balance via rule 3 (multi-page check, e.g.
two_column_two_page-arial 2 p17).

Browser verification:
  - mixed-columns-tabs-tnr: was 10+9, now 14+5 (Word: 14+5) ✓ exact match
  - sd-1480: 3+2 unchanged (Word: 3+3, residual balancer-algorithm gap)
  - sd-1655: 7+1 col-by-col, unchanged ✓
  - multi-column-sections: col-by-col, unchanged ✓
  - sd-2326: 2+2, unchanged ✓
  - spec-test-1..5: 3+3 / 2+3 / 3+3+1 / 7+7 / 3+2|3+2 ✓

Corpus (vs npm@latest 1.31.1): 47 changed total (12 unique +
35 widespread-only attrs.typeIsExplicit schema-only). Previously 47
with 13 unique — mixed-columns-tabs-tnr moved from "structural diff
vs reference" to "matches reference behavior on the 2-col flow".

644 layout-engine + 1802 pm-adapter unit tests pass.

Word balances the LAST PAGE of a multi-page multi-column section only
when that section is the final/body section. Mid-doc multi-page
multi-column sections retain natural column-flow on every page,
including the last — Word doesn't rebalance the overflow remainder.

Verified:
  layout/ivosass-sub p3 — section 1 is mid-doc, 2-page, 2-col,
    explicit-continuous end-break. The last page has 4 overflow
    fragments. Word leaves them in column 0. Pre-fix our gate's
    rule 1 fired and balanced p3 to 2+2. Now mid-doc multi-page
    sections skip the gate and p3 stays in col 0.
  lists/saas_original p4 — same pattern: mid-doc 2-col section
    overflows to last page; Word doesn't rebalance.

Multi-page LAST sections (two_column_two_page-arial 2 p17 — 17 pages,
body default continuous) still balance via rule 3, matching Word's
3+2 split on the final page.

Implementation: page-count probe runs once per section
(short-circuits at >1) and feeds both the new mid-doc skip and the
existing rule 3 multi-page allow.
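
A rough sketch of such a probe, assuming each page exposes fragments that can be mapped back to a section index. The `Page` and `Fragment` shapes and both function names are hypothetical and are not taken from the layout engine.

```typescript
// Hypothetical shapes: the real engine resolves a fragment's section differently.
interface Fragment {
  sectionIndex: number;
}

interface Page {
  fragments: Fragment[];
}

// Counts pages that contain at least one fragment of the section and stops
// as soon as the count exceeds 1, since only "one page" vs "more" matters.
function sectionSpansMultiplePages(pages: Page[], sectionIndex: number): boolean {
  let pagesSeen = 0;
  for (const page of pages) {
    if (page.fragments.some((f) => f.sectionIndex === sectionIndex)) {
      pagesSeen += 1;
      if (pagesSeen > 1) return true; // short-circuit
    }
  }
  return false;
}

// The same probe result feeds both decisions named in the commit message.
function classifyMultiPage(isLastSection: boolean, multiPage: boolean) {
  return {
    skipMidDocMultiPage: !isLastSection && multiPage, // new mid-doc skip
    allowByRule3: isLastSection && multiPage, // existing multi-page allow
  };
}
```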

Browser:
  - ivosass-sub: was 2+2 on p3, now col 0 only (matches Word).
  - saas_original: was 2+2 on p4, now col 0 only (matches Word).
  - mixed-columns-tabs-tnr: 14+5 unchanged.
  - sd-1480: 3+2 unchanged.
  - sd-1655: 7+1 unchanged.
  - multi-column-sections: col-by-col unchanged.
  - sd-2326: 2+2 unchanged.
  - spec-test-1..5: 3+3 / 2+3 / 3+3+1 / 7+7 / 3+2|3+2.
  - two_column_two_page-arial p17: still balances.

Corpus (vs npm@latest 1.31.1): 9 unique structural changes (down
from 12) plus 38 widespread-only attrs.typeIsExplicit schema
additions. The 9 remaining are intentional SD-2452 differences.

644 layout-engine + 1802 pm-adapter unit tests pass.

…-implement-column-balancing-for-continuous-section

Tab leaders were missing on every line BEFORE the final <w:br/> in a
paragraph that uses a right-aligned dot-leader stop. The 'last N tabs of
the paragraph bind to the last N alignment stops' heuristic counted
across the whole paragraph, so only the trailing tab (after the final
soft line break) was bound. Earlier lines fell through to default grid
stops, dropping their leaders.

Two changes:

1. Scope the heuristic to per-line segments delimited by explicit
   <w:br/> runs. pPr/tabs apply per line, not per paragraph.

2. Strip trailing-empty <w:tab/> runs (a tab at the end of a segment
   with no content after it). Word emits these as authoring artifacts;
   if they consumed an alignment-stop slot, the meaningful tab earlier
   in the line would fall to a default grid stop.

Mirrored in measuring/dom and layout-bridge/remeasure (the two call
sites that share this heuristic).

Fixes the visible bug in tabs/sd-1480-two-col-tab-positions where
'Page<br/>Page<tab/>5<tab/>' rendered with leaders only on the last
line and 'Page  5' lost its leader entirely. Each line now matches
Word's 'Page........N' rendering.
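
To illustrate the per-line scoping, here is a small sketch that splits a paragraph's runs at explicit breaks and applies the "last N tabs bind to the last N alignment stops" rule inside each segment. The `Run` type and both function names are assumptions; the measuring/dom and layout-bridge/remeasure code will differ.

```typescript
// Hypothetical, simplified run model: text, <w:tab/>, or <w:br/>.
type Run = { kind: 'text'; text: string } | { kind: 'tab' } | { kind: 'break' };

// Splits a paragraph's runs into per-line segments at explicit breaks.
function splitIntoLineSegments(runs: Run[]): Run[][] {
  const segments: Run[][] = [];
  let current: Run[] = [];
  for (const run of runs) {
    if (run.kind === 'break') {
      segments.push(current);
      current = [];
    } else {
      current.push(run);
    }
  }
  segments.push(current);
  return segments;
}

// Returns the ordinals (0-based, counting only tabs within the segment)
// of the tabs that the heuristic would bind to alignment stops.
function tabsBoundToAlignmentStops(segment: Run[], alignmentStopCount: number): number[] {
  const tabCount = segment.filter((r) => r.kind === 'tab').length;
  const bound = Math.min(tabCount, alignmentStopCount);
  return Array.from({ length: bound }, (_, i) => tabCount - bound + i);
}
```

For a paragraph shaped like `Page<tab/>4<br/>Page<tab/>5` with one alignment stop, counting across the whole paragraph binds only the second tab, while counting per segment binds the single tab on each line.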
…2452)

The trailing-empty-tab guard from the previous commit was too
aggressive: a segment shaped like 'Label:\t' (single tab at the very
end) had its only tab stripped, falling through to greedy default
grid-stop matching. Form-field leaders ('By:_____', 'Name:_____') then
truncated to the next 0.5" grid stop instead of extending to the
right-aligned alignment stop.

Add a guard: if stripping trailing tabs would leave NO effective tabs,
treat all tabs in the segment as effective. The trailing-empty heuristic
only fires when there's at least one OTHER tab to bind.
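
The guard in this commit can be sketched as follows, under the assumption that a segment is a flat list of text and tab runs. `effectiveTabIndices` is a hypothetical name for illustration.

```typescript
// Hypothetical, simplified run model for one line segment.
type Run = { kind: 'text'; text: string } | { kind: 'tab' };

// Returns the indices (into `segment`) of tabs that take part in stop binding.
// Trailing tabs with no content after them are stripped, unless stripping
// would leave no tabs at all, in which case every tab stays effective.
function effectiveTabIndices(segment: Run[]): number[] {
  const allTabs = segment.flatMap((run, i) => (run.kind === 'tab' ? [i] : []));

  // Walk back over the trailing run of tabs.
  let end = segment.length;
  while (end > 0 && segment[end - 1]?.kind === 'tab') {
    end -= 1;
  }

  const withoutTrailing = allTabs.filter((i) => i < end);
  return withoutTrailing.length > 0 ? withoutTrailing : allTabs;
}
```

A `Label:\t` segment keeps its only tab, while `Page\t5\t` keeps the first tab and drops the trailing one.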

Verified visually:
- sd-1480 'Page........N' with Page+tab+N+trailing-tab still works
- 'By:____', 'Name:____', 'Title:____' form fields now extend to right
…essions

The trailing-empty-tab strip introduced in 315ab84 + 4b63d25 treated
tabs at the end of a segment as authoring artifacts. That broke patterns
where trailing tabs ARE meaningful and need to bind to alignment stops:

- HVY-25 Queensland Land Registry block 10 has '\t\t/[text]/\t\t'
  with 4 tabs and 4 authored stops (2 alignment). The strip walked back
  through the trailing tabs, marking them artifacts. Tabs 0+1 then ate
  the 2 alignment stops, putting the center+end binding on the FIRST
  two tabs instead of the last two — corrupting the layout.

- HVY-19 Commercial Lease TOC ('1.\tBUSINESS POINTS\t1') had similar
  reordering when paragraph layout placed the page number tab as part
  of a multi-tab segment.

The strip can't reliably distinguish authored trailing tabs (HVY-25,
form fields with multiple authored stops) from sd-1480-style artifacts
(Page\t5\t with one extra trailing tab). The heuristic was too aggressive.

Keep the per-line-segment scoping (the real fix that closes line 1 of
sd-1480 multi-line paragraphs). Drop the trailing-strip. sd-1480 line 2
('Page\t5\t' segment) reverts to baseline 'Page  5  ___' behavior,
which is what shipped before this branch, so no new regression there.
@tupizz
Contributor Author

tupizz commented May 5, 2026

hey @luccas-harbour, addressed all six in 50dc1f7:

- willBalance / fallback mismatch → page-break fallback now triggers whenever balanceSectionOnPage returns null
- table split continuesOnNext always false → captured original before mutating
- split target ignores preceding column-0 height → precedingHeight subtracted from per-column target
- skip-after-mutate ordering → splitDominantTableAtRowBoundary returns a rollback closure, invoked when the post-split skip fires
- balancePageColumns dead code → removed (function + test block, helpers inlined)
- maybeEmitNextSectionBreakForNode / emitPendingSectionBreakForParagraph extraction → holding off, replied in thread

build error in sections-resolver.ts was from the main merge adding startNodeIndex/endNodeIndex to SectionRange. fixed.

644 layout-engine unit tests pass. super-editor build is clean.
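
For readers unfamiliar with the rollback-closure pattern mentioned for `splitDominantTableAtRowBoundary`, a generic sketch follows. The `TableFragment` shape and the `splitAtRowBoundary` function are hypothetical and only illustrate "capture the original, mutate, hand back an undo"; they are not the function from 50dc1f7.

```typescript
// Hypothetical fragment shape for illustration only.
interface TableFragment {
  rows: number[]; // row heights
  continuesOnNext: boolean;
}

// Splits fragments[index] after `rowsInFirst` rows and returns a closure
// that restores the list to its pre-split state.
function splitAtRowBoundary(
  fragments: TableFragment[],
  index: number,
  rowsInFirst: number,
): () => void {
  const original = fragments[index];
  if (original === undefined) throw new RangeError(`no fragment at index ${index}`);

  // Read everything needed from the original BEFORE replacing it.
  const originalContinues = original.continuesOnNext;

  const first: TableFragment = { rows: original.rows.slice(0, rowsInFirst), continuesOnNext: true };
  const second: TableFragment = {
    rows: original.rows.slice(rowsInFirst),
    continuesOnNext: originalContinues,
  };
  fragments.splice(index, 1, first, second);

  // Rollback: put the untouched original object back in place of the two halves.
  return () => {
    fragments.splice(index, 2, original);
  };
}
```

A caller that later decides to skip balancing can invoke the returned closure and carry on as if the split never happened.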

The SD-2447 heuristic forces the last N tabs to bind to the last N
end/center/decimal stops. It was added because TOC styles often have
ONLY a right-aligned dot-leader stop, and tabStops gets seeded with
synthetic 0.5" defaults from origin (seedDefaultsFromZero=true).
Greedy then lands on a default 0.5" grid stop instead of the alignment
stop — hence the heuristic.

But for paragraphs with an EXPLICIT start-aligned stop ahead of the
alignment stop (TOC1 style with 'start@740, end@9360, end@10080':
template_format and similar Word lease templates), greedy correctly
lands on the start stop and the alignment stop downstream — no force
needed. The heuristic over-fires and binds tab 0 to the right
alignment stop, producing the broken render: leader BEFORE the title
with the page number jammed against it.

Fix: compute greedy first; only apply the heuristic when greedy would
land on a 'source: default' stop. When greedy already lands on an
explicit stop, use it. Mirrored in measuring/dom and remeasure.
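
The ordering of the fix (greedy first, heuristic only as a fallback) can be sketched like this. `TabStop`, `greedyStop` and `resolveStop` are assumed names, and `forced` stands in for whatever stop the SD-2447 heuristic would choose for the tab.

```typescript
// Hypothetical tab stop model; `source` mirrors the 'source: default' wording above.
interface TabStop {
  pos: number;
  align: 'start' | 'end' | 'center' | 'decimal';
  source: 'explicit' | 'default';
}

// Greedy choice: the nearest stop strictly to the right of the current x.
function greedyStop(stops: TabStop[], x: number): TabStop | undefined {
  return stops
    .filter((s) => s.pos > x)
    .sort((a, b) => a.pos - b.pos)[0];
}

// Use the greedy result when it is an explicit stop; otherwise fall back to
// the stop the alignment heuristic would force (if any).
function resolveStop(stops: TabStop[], x: number, forced: TabStop | undefined): TabStop | undefined {
  const greedy = greedyStop(stops, x);
  if (greedy !== undefined && greedy.source === 'explicit') {
    return greedy;
  }
  return forced ?? greedy;
}
```

With only a seeded default stop ahead of the right-aligned stop, the forced stop wins; with an explicit start stop ahead of it, greedy wins and the first tab no longer jumps to the right edge.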

Effect:
- template_format TOC: now renders '1.    BUSINESS POINTS........1'
  matching Word and the published baseline.
- HVY-25 / SD-2447 fixture / sd-1480 line 1: behavior preserved.
- All test suites pass (measuring-dom 332, layout-bridge 1192,
  layout-engine 644).
@luccas-harbour
Contributor

hey @tupizz!
codex found this one:

[P2] Restrict continuous balancing to the affected section — /Users/luccascorrea/Dev/superdoc-wt/review-tadeu-sd-2452-feature-implement-column-balancing-for-continuous-section/packages/layout-engine/layout-engine/src/index.ts:2831-2832
  When a document contains any explicit continuous section break, this condition allows balancing for every multi-column section whose own type was omitted, even if that section is unrelated to the continuous break. For example, a later single-page two-column body section with omitted <w:type> will now be balanced just because an earlier section was explicit-continuous, despite the existing gate intending to skip omitted-type single-page sections like sd-1655. The check should be tied to the section ended by that continuous break rather than a document-wide flag.

can you have a look?

…(SD-2452)

Address Luccas's [P2] review comment. Rule 2 of the continuous-balancing
gate previously fired whenever ANY section in the document had an
explicit continuous break, allowing balancing for every multi-column
section whose own type was omitted — even unrelated ones. A later
single-page two-column body section with omitted <w:type> would be
balanced just because an earlier section was explicit-continuous,
violating sd-1655's skip-omitted-single-page rule.

Per ECMA-376 §17.18.77, a continuous break balances the section it
ENDS. When the body sectPr authors an explicit continuous break, the
affected section is the one IMMEDIATELY preceding the body. Tighten
rule 2 from a doc-wide flag to bodyExplicitContinuousIdx − 1.

Verified:
- sd-1480: section 0 still balances (rule 2 fires for sectionIdx 0 ===
  bodyExplicitContinuousIdx 1 − 1).
- mixed-columns-tabs-tnr: body section (sectionIdx 1) does not balance
  (no longer matches bodyExplicitContinuousIdx − 1 = 0).
- sd-1655: not affected (no body-explicit-continuous in the doc).
- Hypothetical 'mid-doc explicit-continuous + body omitted single-page
  2-col': body now correctly skipped.

All 644 layout-engine tests pass.
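
A sketch of the tightened rule 2, under one reading of this commit: the body (final) section counts as "body-explicit-continuous" only if its own type is explicit and continuous, and rule 2 then applies solely to the section immediately before it. `SectionInfo` and both function names are hypothetical; `bodyExplicitContinuousIdx` is the only identifier taken from the message.

```typescript
// Hypothetical per-section summary used only by this sketch.
interface SectionInfo {
  endBreakType: string;
  typeIsExplicit: boolean;
}

// Index of the body section when it authors an explicit continuous break, else -1.
function findBodyExplicitContinuousIdx(sections: SectionInfo[]): number {
  const lastIdx = sections.length - 1;
  const body = sections[lastIdx];
  if (body !== undefined && body.typeIsExplicit && body.endBreakType === 'continuous') {
    return lastIdx;
  }
  return -1;
}

// Rule 2, tightened: only the section ended by the body's continuous break qualifies.
function rule2Applies(sections: SectionInfo[], sectionIdx: number): boolean {
  const bodyExplicitContinuousIdx = findBodyExplicitContinuousIdx(sections);
  return bodyExplicitContinuousIdx > 0 && sectionIdx === bodyExplicitContinuousIdx - 1;
}
```

Tying the rule to one index means an unrelated section elsewhere in the document can no longer switch balancing on.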
Contributor

@luccas-harbour luccas-harbour left a comment

LGTM

@luccas-harbour luccas-harbour enabled auto-merge May 5, 2026 14:53
@luccas-harbour luccas-harbour added this pull request to the merge queue May 5, 2026
@caio-pizzol caio-pizzol removed this pull request from the merge queue due to the queue being cleared May 5, 2026
@luccas-harbour luccas-harbour merged commit e95699b into main May 5, 2026
68 checks passed
@luccas-harbour luccas-harbour deleted the tadeu/sd-2452-feature-implement-column-balancing-for-continuous-section branch May 5, 2026 17:40
@superdoc-bot
Contributor

superdoc-bot Bot commented May 5, 2026

🎉 This PR is included in @superdoc-dev/mcp v0.3.0-next.56

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 5, 2026

🎉 This PR is included in vscode-ext v2.3.0-next.100

@superdoc-bot
Contributor

superdoc-bot Bot commented May 5, 2026

🎉 This PR is included in @superdoc-dev/react v1.2.0-next.98

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 5, 2026

🎉 This PR is included in superdoc-cli v0.8.0-next.73

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 5, 2026

🎉 This PR is included in superdoc v1.30.0-next.56

The release is available on GitHub release

@superdoc-bot
Contributor

superdoc-bot Bot commented May 5, 2026

🎉 This PR is included in superdoc-sdk v1.8.0-next.56

5 participants