Conversation
f41e67b to
a7ffb0a
Compare
This is for the issue_493 test case. That generates ~700k chars. Previously we would parse but not process that input, but recent changes mean that we do. Nothing wrong with that, it just takes longer. 3s is slightly too short it seems. On my machine I can do it in 2.4s but on GH actions it seems to be slower. Increasing the threshold to 4s as that is still reasonable, but means I don't have to find insane perf improvements to meet this arbitrary threshold
Collaborator
|
LGTM thank you! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This PR fixes #671.
There are a couple of examples from that issue that are valid GFM, so they aren't addressed here.
Fixed issues with the GFM parser not checking if the delim run after was of the same syntax when deciding whether to close a span. Also fixed code-friendly extra not triggering for more than 2 em chars in a row.
I've also made a number of performance tweaks to try and keep the
issue_493ReDoS test fast.Previously for this test we would parse but not process that input, but with this PR we now attempt to process it, which has slowed down that test case's execution alot, even if it's not quite a DoS.
In most cases, it now takes between 2.4s (on my machine) up to 3.5s (on some GH actions) to run that testcase. Not unreasonable considering that testcase is a 700k character input.
I added a couple of logic shortcuts, quicker checks that prevent more expensive function calls. (see
body_crosses_span_bordersI've also tried to reduce the number of calls to specific functions. When profiling this test case I found that
lenwas being called over 3 million times, taking 0.3s in total.I replaced a bunch of unnecessary
lencalls with direct string comparisons and brought that down to 1.7 million calls.Finally, when processing em spans we were making alot of calls to
list.append. I've re-written it a bit to reduce that as well.Even so, I have had to increase the ReDoS test timeout to 4s, up from 3s. I just couldn't find sane perf tweaks to do that would get it under 3s on GH actions runners, even though on my machine it was well under. I think this is still a reasonable threshold