
Escape leading colon markers in HtmlToDjot + roundtrip property net #204

Merged: dereuromark merged 1 commit into master from fix/htmltodjot-escape-colon-blocks on Jun 2, 2026


@dereuromark
Contributor

Follow-up to the HtmlToDjot escaping work (#202, #203).

Problem (G1)

A paragraph of external HTML beginning with a colon was re-parsed as a different block on the next Djot parse:

<p>: just text</p>     ->  : just text     ->  <dl><dt>just text</dt>...   (paragraph -> definition list)
<p>::: not a div</p>   ->  ::: not a div   ->  <div class="not a div"></div>   (text DESTROYED into a class)
<dd>: value</dd>       ->  ...             ->  nested <dl> corruption

The ::: case is the worst: the content is captured into the div class name and the text is lost.

This is the same class of bug as the leading -, #, > and 1. markers that were already handled; escapeLeadingBlockMarker simply did not cover the colon.

Fix

Add a leading colon to the block-marker escaping. Escaping the first colon neutralizes both a definition-list item (: term) and a fenced div (::: name), so the text stays literal: \: just text, \::: not a div.
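The idea behind the fix can be sketched in a few lines. This is an illustrative Python sketch only, not the library's code: the function name `escape_leading_block_marker`, the marker set, and the ordered-list pattern are assumptions modeled on the description above, and the real `escapeLeadingBlockMarker` may differ in detail. The sketch deliberately over-escapes (it does not check what follows the marker), which is harmless because a backslash before ASCII punctuation in Djot yields the literal character.

```python
import re

# Assumed set of single-character block markers; ":" is the one added here.
SINGLE_CHAR_MARKERS = "-+*#>:"

# Ordered-list opener such as "1." or "1)" followed by whitespace or end of line.
ORDERED_MARKER = re.compile(r"(\d+)([.)])(\s|$)")


def escape_leading_block_marker(text: str) -> str:
    """Backslash-escape a leading token that Djot could read as a block opener."""
    stripped = text.lstrip()
    indent = text[: len(text) - len(stripped)]
    if not stripped:
        return text

    # Escaping only the first character is enough: "\:::" is no longer a
    # fenced-div opener and "\: x" is no longer a definition-list item.
    if stripped[0] in SINGLE_CHAR_MARKERS:
        return indent + "\\" + stripped

    # For "1." / "1)" escape the delimiter, not the digits.
    match = ORDERED_MARKER.match(stripped)
    if match:
        return indent + match.group(1) + "\\" + stripped[match.start(2):]

    return text


print(escape_leading_block_marker(": just text"))    # \: just text
print(escape_leading_block_marker("::: not a div"))  # \::: not a div
print(escape_leading_block_marker("1. item"))        # 1\. item
```

Only the first colon of `:::` needs a backslash, which matches the `\::: not a div` output shown above.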

Property net (G4)

Adds a generated round-trip property test that, for externally-authored HTML, sweeps Djot-significant leading tokens (- + * # > : ::: 1. 1) --- ~~~ ``` | ...) and inline markers across paragraph and inline-wrapped (<span>) contexts, asserting:

  • no block is injected (no list, heading, blockquote, div, definition list, rule, table, or section), and
  • the text survives (alphanumeric content preserved, so data loss like the ::: case is caught).

This sweep is what surfaced the colon gap, and it guards the whole "a literal marker silently became a block" class going forward instead of one example at a time. It also re-covers the markers fixed in #202.
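The shape of such a sweep can be sketched as follows. This is a hedged Python illustration, not the test added in this PR: the names `generate_cases`, `check_roundtrip`, and `sweep`, the exact token list, and the set of block tags are assumptions, and `roundtrip` stands for any caller-supplied function that maps HTML to Djot and back to HTML.

```python
import html
import re
from itertools import product

# Assumed sample of Djot-significant leading tokens, taken from the list above.
LEADING_TOKENS = ["-", "+", "*", "#", ">", ":", ":::", "1.", "1)",
                  "---", "~~~", "```", "|"]

# Paragraph context and inline-wrapped (<span>) context.
CONTEXTS = ["<p>{}</p>", "<p><span>{}</span></p>"]

# Opening tags that would indicate a block was injected.
BLOCK_TAG = re.compile(
    r"<(ul|ol|li|h[1-6]|blockquote|div|dl|dt|dd|hr|table|section|pre)\b"
)


def generate_cases(word="sampletext"):
    """Yield (token, html) pairs: each token as literal text in each context."""
    for token, context in product(LEADING_TOKENS, CONTEXTS):
        yield token, context.format(html.escape(f"{token} {word}"))


def alnum_text(markup):
    """Alphanumeric characters of the visible text (tags stripped, entities decoded)."""
    text = html.unescape(re.sub(r"<[^>]+>", "", markup))
    return "".join(ch for ch in text if ch.isalnum())


def check_roundtrip(html_in, html_out):
    """Return a list of problems; an empty list means both properties hold."""
    problems = []
    injected = sorted(set(BLOCK_TAG.findall(html_out)))
    if injected:
        problems.append("block injected: " + ", ".join(injected))
    if alnum_text(html_in) != alnum_text(html_out):
        problems.append("text changed")
    return problems


def sweep(roundtrip):
    """Run every generated case through `roundtrip` and collect the failures."""
    failures = []
    for token, html_in in generate_cases():
        problems = check_roundtrip(html_in, roundtrip(html_in))
        if problems:
            failures.append((token, html_in, problems))
    return failures


# The ::: regression from the problem statement trips both assertions.
print(check_roundtrip("<p>::: not a div</p>", '<div class="not a div"></div>'))
# ['block injected: div', 'text changed']
```

Checking the alphanumeric text separately from the block structure is what catches the `:::` case, where the output contains a block and the original words survive only inside an attribute.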

A paragraph of external HTML that began with a colon was re-parsed as a
block on the next Djot parse: `: x` opened a definition list and `::: x`
opened a fenced div, the latter capturing the text into a class name and
dropping the content entirely.

Add a leading colon to the block-marker escaping alongside the existing
`-`, `+`, `#`, `>` and ordered-list handling, so `: x` and `::: x` stay
literal text.

Add a generated round-trip property test that sweeps Djot-significant
leading tokens and inline markers across paragraph and inline-wrapped
contexts, asserting the block structure and text are preserved. This net
is what surfaced the colon gap and guards the whole class of "a literal
marker silently became a block" regressions.
@codecov

codecov Bot commented Jun 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.80%. Comparing base (f8b5407) to head (ad81444).

Additional details and impacted files
@@            Coverage Diff            @@
##             master     #204   +/-   ##
=========================================
  Coverage     91.80%   91.80%           
  Complexity     3450     3450           
=========================================
  Files           104      104           
  Lines          9786     9786           
=========================================
  Hits           8984     8984           
  Misses          802      802           


@dereuromark dereuromark marked this pull request as ready for review June 2, 2026 11:51
@dereuromark dereuromark merged commit 637623e into master Jun 2, 2026
6 checks passed
@dereuromark dereuromark deleted the fix/htmltodjot-escape-colon-blocks branch June 2, 2026 11:51