Skip to content

Fix #671#678

Merged
nicholasserra merged 4 commits intotrentm:masterfrom
Crozzers:gfm-em-issue671
Feb 15, 2026
Merged

Fix #671#678
nicholasserra merged 4 commits intotrentm:masterfrom
Crozzers:gfm-em-issue671

Conversation

@Crozzers
Copy link
Contributor

@Crozzers Crozzers commented Feb 12, 2026

This PR fixes #671.

There are a couple of examples from that issue that are valid GFM, so they aren't addressed here.

Fixed issues with the GFM parser not checking if the delim run after was of the same syntax when deciding whether to close a span. Also fixed code-friendly extra not triggering for more than 2 em chars in a row.


I've also made a number of performance tweaks to try and keep the issue_493 ReDoS test fast.
Previously for this test we would parse but not process that input, but with this PR we now attempt to process it, which has slowed down that test case's execution alot, even if it's not quite a DoS.

In most cases, it now takes between 2.4s (on my machine) up to 3.5s (on some GH actions) to run that testcase. Not unreasonable considering that testcase is a 700k character input.

I added a couple of logic shortcuts, quicker checks that prevent more expensive function calls. (see body_crosses_span_borders

I've also tried to reduce the number of calls to specific functions. When profiling this test case I found that len was being called over 3 million times, taking 0.3s in total.
I replaced a bunch of unnecessary len calls with direct string comparisons and brought that down to 1.7 million calls.

Finally, when processing em spans we were making alot of calls to list.append. I've re-written it a bit to reduce that as well.

Even so, I have had to increase the ReDoS test timeout to 4s, up from 3s. I just couldn't find sane perf tweaks to do that would get it under 3s on GH actions runners, even though on my machine it was well under. I think this is still a reasonable threshold

@Crozzers Crozzers marked this pull request as draft February 12, 2026 21:21
This is for the issue_493 test case. That generates ~700k chars. Previously we would parse but not process that input, but recent
changes mean that we do. Nothing wrong with that, it just takes longer. 3s is slightly too short it seems. On my machine I can do it
in 2.4s but on GH actions it seems to be slower.

Increasing the threshold to 4s as that is still reasonable, but means I don't have to find insane perf improvements to meet this
arbitrary threshold
@Crozzers Crozzers marked this pull request as ready for review February 15, 2026 13:16
@nicholasserra
Copy link
Collaborator

LGTM thank you!

@nicholasserra nicholasserra merged commit a1e159c into trentm:master Feb 15, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Multiple regressions since #665

2 participants

Comments