-
Notifications
You must be signed in to change notification settings - Fork 42
add recursion limit #134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add recursion limit #134
Conversation
|
Have you run into this problem in practice? The recursion would only be unbounded if the input message were infinite bytes long, so I'm not sure this is much more than a theoretical concern |
|
Yes, actually mb. Should have mention this in the PR description 😄 |
|
Can you provide a test case? I'm not opposed to adding an absurdly high limit like 1000 and just hard-coding it in. |
|
Sure, but the point of stack overflow depends on the machine stack size. I tested this on my machine which has a stack size of 8mb. On windows it's typically much lower (1mb). So running the test on different machines will produce different results. |
324172f to
1efb9f3
Compare
|
played around a bit, 1000 is too much for github actions. 300 seems to work |
|
Sorry for the delay here. While looking at this PR I discovered that the library is not handling some kinds of multipart boundaries correctly. So now I gotta go fix that first. |
|
Actually the boundary matching is to spec. It's your test file that's wrong. From RFC 2046: When you have a boundary like |
|
Please make the following changes:
|
Ic, well that should be easy enough to fix.
Wdym? 😄 Could you elaborate how you think it works and how it doesnt?
Why the limit of 100? |
I just mean that with the test file you attached it wouldn't actually recurse the way you were expecting, because of the boundary prefix matching. If you print out the depth at the start of the recursive function it doesn't decrease linearly all the way. The B10 delimiter is treated as a part separator for the B1 boundary so it only recurses 10 times before moving to another "part". Sorry if that's confusing but you can see what I mean if you reduce the limit to 100, change the file to have 105 nested, and print out the depth at the start of the recursive function. |
|
Yeah no, ur explanation makes perfect sense. I'll apply the changes tomorrow. Btw. after #136, I might have an idea for rewriting the parser to be iterative rather than recursive. Would you be open to that kind of change, or would you prefer to stick with the current approach and enforce a maximum recursion limit? |
Sorry, forgot to reply to this. I prefer to have a single number for both debug and production code, and honestly any number bigger than 5 or so is probably big enough (as you noted the other mail parser lib sets it to 3). 100 seems like a nice round number.
Depends how readable it ends up being. If you can provide a high-level description of how you'd do it that might help. |
Okey. I updated the
Probably not very readable 😄 At a high level, I’d do another Vec containing partially read message, meaning messages that we know exist but haven’t parsed yet (so basically This essentially turns the parse function into a single while loop that keeps popping from the stack, while also pushing new items onto it inside the loop. The tricky part is pushing the parsed messages back into the correct subpart Vec. We’d need some way to keep track of the “parent” message. |
|
This looks good, thanks!
Ah so fundamentally instead of using the system stack you're making your own stack using a Vec. I'm not really sure it's worth doing, honestly. This recursion depth is unlikely to be hit in practice except in the case of malicious mails and in those cases bailing out with the recursion depth seems fine. |
This PR adds a recursion depth limit to prevent unbounded recursion when parsing deeply nested messages. I chose the limit of 255 rather arbitrarely. Let me know what you think.
I also wasn't sure how best to expose this to the public api. I was thinking maybe a new function
parse_mail_limit?