[Repo Assist] Improve CommonMark compliance: setext headings, thematic breaks, and block interaction #1046

Merged
dsyme merged 6 commits into main from repo-assist/fix-191-commonmark-setext-thematic-25affc9f5e6ac001
Feb 27, 2026
@github-actions

🤖 This is a pull request created by Repo Assist, an automated AI assistant.

Closes #191 (partial progress)

Summary

This PR improves CommonMark spec compliance in the Markdown parser, building on the ATX heading improvements merged in #1028. It addresses correctness of thematic breaks and setext headings, with a particular focus on how these interact with paragraphs, list items, and blockquotes.


Changes to MarkdownParser.fs

1. Thematic breaks now interrupt paragraphs (TakeParagraphLines)

A thematic break (***, ---, ___, - - -, etc.) now correctly ends a running paragraph. Previously, Foo\n***\nbar was rendered as a single paragraph containing ***; it now produces <p>Foo</p><hr /><p>bar</p>.

CommonMark spec examples fixed: 23, 52
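
The interruption rule can be illustrated with a small Python sketch. This is not the F# code in MarkdownParser.fs; the names `is_thematic_break` and `split_paragraphs` are hypothetical, and the sketch deliberately ignores setext headings, lists, and every other block type.

```python
import re

# Up to 3 leading spaces, then 3 or more of the same marker (-, * or _),
# optionally separated by spaces or tabs, and nothing else on the line.
THEMATIC_BREAK = re.compile(r"^ {0,3}([-*_])(?:[ \t]*\1){2,}[ \t]*$")


def is_thematic_break(line: str) -> bool:
    return THEMATIC_BREAK.match(line) is not None


def split_paragraphs(text: str) -> list[tuple[str, str]]:
    """Split text into ("p", content) and ("hr", "") blocks."""
    blocks: list[tuple[str, str]] = []
    paragraph: list[str] = []

    def flush() -> None:
        # Close the running paragraph, if there is one.
        if paragraph:
            blocks.append(("p", " ".join(paragraph)))
            paragraph.clear()

    for line in text.split("\n"):
        if is_thematic_break(line):
            flush()  # the break ends the running paragraph
            blocks.append(("hr", ""))
        elif line.strip() == "":
            flush()
        else:
            paragraph.append(line.strip())
    flush()
    return blocks
```

With this sketch, `split_paragraphs("Foo\n***\nbar")` yields a paragraph, a break, and a second paragraph. Because setext headings are left out, a `---` line directly under paragraph text is also treated as a break here, which is not what CommonMark specifies for that case.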

2. Thematic breaks now interrupt list item continuations (LinesUntilListOrWhite)

A thematic break following a list item no longer gets absorbed as lazy continuation content. Previously, - foo\n--- created a setext heading inside the list item; it now produces a list followed by <hr />.

CommonMark spec examples fixed: 22, 57, 62
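
A rough Python sketch of the same idea for list items, under simplifying assumptions: it handles only bullet items, ignores indentation-based continuation, and `take_item_lines` is a hypothetical name rather than the parser's LinesUntilListOrWhite.

```python
import re

THEMATIC_BREAK = re.compile(r"^ {0,3}([-*_])(?:[ \t]*\1){2,}[ \t]*$")
BULLET = re.compile(r"^ {0,3}[-+*][ \t]+")


def take_item_lines(lines: list[str]) -> tuple[list[str], list[str]]:
    """Return (item_lines, remaining_lines) for a bullet item at lines[0]."""
    item = [lines[0]]
    i = 1
    while i < len(lines):
        line = lines[i]
        # Stop at a blank line, a thematic break, or the next bullet;
        # anything else is kept as lazy continuation of the item.
        if line.strip() == "" or THEMATIC_BREAK.match(line) or BULLET.match(line):
            break
        item.append(line)
        i += 1
    return item, lines[i:]
```

For `["- foo", "---"]` the item keeps only `- foo`, and `---` is left for the caller to parse as a top-level thematic break.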

3. Thematic breaks now interrupt lazy blockquote continuations (LinesUntilBlockquoteEnds)

A thematic break following a blockquote line no longer gets absorbed as lazy continuation content. Previously, > foo\n--- created a setext heading inside the blockquote; it now produces a blockquote followed by <hr />.

CommonMark spec examples fixed: 56, 64
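
The blockquote case can be sketched the same way in Python. The function name `take_blockquote_lines` is hypothetical, and the sketch treats any non-blank unquoted line as a lazy continuation, which is looser than CommonMark (where laziness applies only to paragraph continuation text).

```python
import re

THEMATIC_BREAK = re.compile(r"^ {0,3}([-*_])(?:[ \t]*\1){2,}[ \t]*$")
QUOTE_PREFIX = re.compile(r"^ {0,3}> ?")


def take_blockquote_lines(lines: list[str]) -> tuple[list[str], list[str]]:
    """Return (quoted_content, remaining_lines) for a blockquote at lines[0]."""
    quoted: list[str] = []
    i = 0
    while i < len(lines):
        line = lines[i]
        prefix = QUOTE_PREFIX.match(line)
        if prefix:
            # Explicitly quoted line: strip the marker and keep the content.
            quoted.append(line[prefix.end():])
        elif quoted and line.strip() != "" and not THEMATIC_BREAK.match(line):
            # Lazy continuation: an unquoted line that continues the quote.
            quoted.append(line)
        else:
            # Blank line or thematic break: the blockquote ends here.
            break
        i += 1
    return quoted, lines[i:]
```

For `["> foo", "---"]` the quote contains only `foo`, so the `---` line can no longer turn `foo` into a setext heading inside the blockquote.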

4. Thematic breaks with 4+ leading spaces are rejected (HorizontalRule)

The HorizontalRule pattern now rejects lines with 4 or more leading spaces, matching the CommonMark rule that 4+ spaces create an indented code block instead. Previously, a line such as *** indented by four spaces was mistakenly recognised as a thematic break.

CommonMark spec examples fixed (regressions prevented): 14, 51
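
A minimal Python sketch of the indentation limit, assuming a hypothetical `classify` helper that looks at a single line in isolation. It counts spaces only (no tab expansion) and ignores context, such as the rule that an indented line cannot interrupt a paragraph.

```python
import re

# The {0,3} quantifier is what rejects lines indented by four or more spaces.
THEMATIC_BREAK = re.compile(r"^ {0,3}([-*_])(?:[ \t]*\1){2,}[ \t]*$")


def classify(line: str) -> str:
    """Classify one line as blank, indented code, thematic break, or other."""
    if line.strip() == "":
        return "blank"
    if line.startswith("    "):
        return "indented code"
    if THEMATIC_BREAK.match(line):
        return "thematic break"
    return "other"
```

Three leading spaces still give a thematic break, while a fourth space moves the line into the indented-code category.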

5. Setext heading underlines allow 0–3 leading spaces (isSetextUnderline + Heading)

The setext heading detection now uses a new isSetextUnderline helper that accepts 0–3 leading spaces in the underline row (e.g. === or --- indented by up to three spaces), matching the CommonMark spec. The change also adds a non-empty content guard so that an empty line followed by a setext underline is not matched as a heading.

CommonMark spec examples fixed: 48, 50
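
As an illustration only, the underline check and the content guard might look like this in Python. The helper names are hypothetical analogues of isSetextUnderline, not a translation of the F# code, and the sketch considers a single content line.

```python
import re
from typing import Optional

# 0-3 leading spaces, a run of only "=" or only "-", optional trailing whitespace.
SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+)[ \t]*$")


def is_setext_underline(line: str) -> bool:
    return SETEXT_UNDERLINE.match(line) is not None


def setext_level(content: str, underline: str) -> Optional[int]:
    """Return 1 for an "=" underline, 2 for "-", or None if not a heading."""
    if content.strip() == "":
        return None  # non-empty content guard
    match = SETEXT_UNDERLINE.match(underline)
    if match is None:
        return None
    return 1 if match.group(1).startswith("=") else 2
```

An underline indented by three spaces is accepted, one indented by four is not, and an empty content line never produces a heading.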


CommonMark Spec Results

Section           Before    After
Thematic breaks   16 / 19   18 / 19
Setext headings   10 / 20   17 / 20

Remaining failures are more complex cases requiring further investigation:

  • Thematic breaks example 26: list item whose content * * * needs block-level (not inline) parsing
  • Setext headings examples 55, 65: HTML/backslash attribute quoting issues
  • Setext headings example 58: ambiguity resolution when multi-line paragraph precedes ---

Test Status

✅ All 222 Markdown tests pass (0 failures, 0 regressions).
✅ All 101 Literate tests pass.
✅ Build succeeds.
✅ Code formatted with Fantomas.

Generated by Repo Assist for issue #191


To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@afb00b92a9514fee9a14c583f059a03d05738f70

…aragraph/list/blockquote interactions

- Allow 0-3 leading spaces in setext heading underlines (fixes spec examples 48, 50)
- Thematic breaks now interrupt paragraphs (fixes spec examples 23, 52)
- Thematic breaks now interrupt lazy blockquote continuations (fixes spec examples 56, 64)
- Thematic breaks now interrupt list item continuations (fixes spec examples 22, 57, 62)
- Thematic breaks with 4+ leading spaces no longer recognised as HR (fixes spec examples 14, 51)

Net result: Thematic breaks 16→18/19 passing, Setext headings 10→17/20 passing.
All 222 existing Markdown tests and 101 Literate tests continue to pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dsyme dsyme marked this pull request as ready for review February 26, 2026 19:03

dsyme commented Feb 26, 2026

/repo-assist Show me the remote heads by running ls .git/refs/remotes/origin/ and indeed run ls -R .git/refs/ too


dsyme commented Feb 27, 2026

@nojaf I'll merge this as it's just increasing compliance step by step, which seems fine as long as more tests pass


dsyme commented Feb 27, 2026

Oh actually no new tests have been enabled!


dsyme commented Feb 27, 2026

/repo-assist please add or enable more commonmark tests corresponding to this change

…gs' sections

Add the two sections to enabledSections, covering 35 additional spec
examples (18 Thematic breaks + 17 Setext headings). The four known
remaining failures (examples 26, 55, 58, 65) are listed in a
skippedExamples set so they are excluded until the underlying parser
issues are resolved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
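
The filtering described in this commit message can be sketched in Python as follows. This is an assumption-laden illustration, not the project's test code: the names mirror enabledSections and skippedExamples, and the `section` and `example` keys are assumed to follow the JSON form of the CommonMark spec tests.

```python
# Sections whose spec examples are run, and examples excluded as known failures.
ENABLED_SECTIONS = {"Thematic breaks", "Setext headings"}
SKIPPED_EXAMPLES = {26, 55, 58, 65}


def select_examples(spec: list[dict]) -> list[dict]:
    """Keep examples from enabled sections that are not explicitly skipped."""
    return [
        example
        for example in spec
        if example["section"] in ENABLED_SECTIONS
        and example["example"] not in SKIPPED_EXAMPLES
    ]
```

Keeping the skipped examples in an explicit set means each one can be re-enabled individually once the underlying parser issue is fixed.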
@github-actions

Commit pushed: a4c11c6

Generated by Repo Assist

@dsyme dsyme merged commit ae0afe7 into main Feb 27, 2026
4 checks passed
@dsyme dsyme deleted the repo-assist/fix-191-commonmark-setext-thematic-25affc9f5e6ac001 branch February 27, 2026 11:55

Development

Successfully merging this pull request may close these issues.

Make F# Formatting compatible with CommonMark
