Fix #671 by Crozzers · Pull Request #678 · trentm/python-markdown2

Crozzers · 2026-02-12T21:18:23Z

This PR fixes #671.

There are a couple of examples from that issue that are valid GFM, so they aren't addressed here.

When deciding whether to close a span, the GFM parser did not check whether the delimiter run that followed used the same syntax; that is now fixed. Also fixed the code-friendly extra not triggering for more than 2 em chars in a row.

I've also made a number of performance tweaks to try to keep the issue_493 ReDoS test fast.
Previously we would parse but not process that test's input. With this PR we now attempt to process it, which has slowed that test case down a lot, even if it's not quite a DoS.

In most cases, that test case now takes between 2.4s (on my machine) and 3.5s (on some GH actions runners) to run. That's not unreasonable considering the test case is a 700k character input.

I added a couple of logic shortcuts: quicker checks that prevent more expensive function calls (see body_crosses_span_borders).
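To illustrate the idea of a cheap guard in front of a costly check, here is a minimal sketch. It is not the code from this PR: `expensive_overlap_check`, `crosses_any_span`, and the sorted, non-overlapping `(start, end)` span representation are all assumptions made for the example.

```python
def expensive_overlap_check(start, end, spans):
    """Hypothetical stand-in for a costly check: does (start, end)
    partially overlap any span in the list?"""
    for s, e in spans:
        # Partial overlap: exactly one endpoint lies strictly inside the span.
        if (s < start < e) != (s < end < e):
            return True
    return False


def crosses_any_span(start, end, spans):
    """Guarded version. Assumes start < end and that spans are sorted
    and non-overlapping (an assumption of this sketch)."""
    # Shortcut 1: with no spans recorded there is nothing to cross.
    if not spans:
        return False
    # Shortcut 2: a range entirely before the first span or entirely after
    # the last one cannot cross a border, so skip the full scan.
    if end <= spans[0][0] or start >= spans[-1][1]:
        return False
    return expensive_overlap_check(start, end, spans)
```

The shortcuts return the same answer the full scan would, so they only change how much work is done, not the result.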

I've also tried to reduce the number of calls to specific functions. When profiling this test case I found that len was being called over 3 million times, taking 0.3s in total.
I replaced a bunch of unnecessary len calls with direct string comparisons, which brought that down to 1.7 million calls.
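As a rough illustration of swapping a len call for a direct string comparison, consider the pair below. The function names are hypothetical and the actual call sites changed in this PR may look different.

```python
def is_single_star_len(run):
    # Calls len() on every invocation before looking at the content.
    return len(run) == 1 and run[0] == "*"


def is_single_star_direct(run):
    # One string comparison, no function call.
    return run == "*"


def is_empty_len(text):
    # Length-based emptiness check.
    return len(text) == 0


def is_empty_direct(text):
    # Equivalent check for str inputs, without the len() call.
    return text == ""
```

Both variants of each pair give the same answer for string inputs; the direct form simply avoids one function call per check, which adds up over millions of iterations.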

Finally, when processing em spans we were making a lot of calls to list.append. I've rewritten that code a bit to reduce them as well.
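The PR text does not say how the append calls were reduced. One common approach, shown here purely as a sketch with hypothetical function names, is to build the list in a single expression instead of appending inside a loop.

```python
def delimiter_positions_append(text, delim="*"):
    """Record the index of every delimiter character, one append per hit."""
    positions = []
    for i, ch in enumerate(text):
        if ch == delim:
            positions.append(i)
    return positions


def delimiter_positions_comprehension(text, delim="*"):
    """Same result, built by a list comprehension with no explicit
    list.append method calls."""
    return [i for i, ch in enumerate(text) if ch == delim]
```

Whether this matches what the PR did is an assumption; the point is only that the two forms are interchangeable in output while the second avoids a method lookup and call per element.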

Even so, I have had to increase the ReDoS test timeout from 3s to 4s. I couldn't find sane perf tweaks that would get it under 3s on GH actions runners, even though it was well under that on my machine. I think this is still a reasonable threshold.

This is for the issue_493 test case, which generates ~700k chars. Previously we would parse but not process that input, but recent changes mean that we now do. Nothing wrong with that; it just takes longer. 3s is slightly too short, it seems: on my machine it runs in 2.4s, but GH actions seems to be slower. I'm increasing the threshold to 4s, as that is still reasonable and means I don't have to find insane perf improvements to meet an arbitrary threshold.
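A time-budget check of this kind can be sketched as follows. This is not the project's test harness: `run_with_budget` is a hypothetical helper, and `str.upper` stands in for the real conversion call so the example stays self-contained.

```python
import time


def run_with_budget(func, arg, budget_seconds):
    """Run func(arg) and report whether it finished within the budget.

    Returns (result, elapsed_seconds, within_budget).
    """
    start = time.perf_counter()
    result = func(arg)
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed <= budget_seconds


# Usage: a ~700k character input and a 4 second budget, as in the text.
# str.upper is only a placeholder for the function under test.
payload = "*" * 700_000
result, elapsed, ok = run_with_budget(str.upper, payload, 4.0)
```

Because wall-clock time varies between machines, a threshold like this needs some headroom over the slowest runner it is expected to pass on.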

nicholasserra · 2026-02-15T20:01:57Z

LGTM thank you!

Fix trentm#671

a7ffb0a

Crozzers force-pushed the gfm-em-issue671 branch from f41e67b to a7ffb0a Compare February 12, 2026 21:20

Crozzers marked this pull request as draft February 12, 2026 21:21

Crozzers added 3 commits February 14, 2026 16:32

Fix ReDoS regression and optimise performance in GFM emphasis parser

3908fc5

Some further perf tweaks

32321c1

Crozzers marked this pull request as ready for review February 15, 2026 13:16

nicholasserra merged commit a1e159c into trentm:master Feb 15, 2026
15 checks passed

nicholasserra merged 4 commits into trentm:master from Crozzers:gfm-em-issue671
