Merged
This was referenced Apr 4, 2026
Summary
- Addresses whose display name is itself an email address (e.g. `From: alice@example.com <bob@example.com>`) were silently dropped, returning `[["", ""]]` instead of the real values
- Absent address headers (`bcc`, `cc`, `reply-to`, `delivered-to`) appeared in the output as `[["", ""]]` instead of being omitted

Root cause
Both bugs trace back to Python's `email.utils.getaddresses`, which returns `[('', '')]` (a non-empty list containing one empty tuple) for input it cannot parse. This covers both absent headers (empty string input) and headers with RFC-non-compliant display names.

The CVE-2023-27043 security hardening (backported to Python 3.9+) made `strict=True` the default. This correctly rejects an unquoted `@` in display names per RFC 5322 §3.4. That is the right call for an MTA, but mail-parser is a security/forensics tool: hiding an address because its display name looks like an email address (a common pattern in phishing and impersonation mail) defeats the purpose of the tool.

Changes
Bug 1: Email address as display name is silently dropped (fixes #132)
New `get_addresses()` helper in `utils.py`:
- Tries `strict=True` first, so the CVE hardening is preserved for well-formed input
- If the result is `('', '')` on a non-empty header, falls back to a regex that extracts angle-bracket addresses and display names directly from the raw header value

Before:
After:
Bug 2: Absent address headers appear in output as `[["", ""]]`
One-line guard (`if email_addr`) added to the list comprehension in `__getattr__`. Any tuple with an empty address is filtered out, so absent headers produce `[]`, which is falsy and excluded from the output.

Before:
After: keys are absent from the output entirely.
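
The effect of the guard can be illustrated with a small sketch. The names `drop_empty_addresses` and `build_output` are hypothetical and only mirror the filtering idea described above; the actual change is the `if email_addr` condition inside the list comprehension in `__getattr__`.

```python
def drop_empty_addresses(pairs):
    """Keep only (name, address) tuples that carry a non-empty address."""
    return [(name, email_addr) for name, email_addr in pairs if email_addr]


def build_output(parsed_headers):
    """Hypothetical output step: headers with no usable address are left out."""
    output = {}
    for key, pairs in parsed_headers.items():
        cleaned = drop_empty_addresses(pairs)
        if cleaned:  # an empty list is falsy, so the key is skipped
            output[key] = cleaned
    return output


# "bcc" stands in for an absent header that parsed to [('', '')].
parsed = {
    "from": [("Alice Smith", "alice@example.com")],
    "bcc": [("", "")],
}
print(build_output(parsed))
```

With this input, only the `from` key should survive, since the `bcc` list is empty after filtering.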
Test plan
- `tests/mails/mail_test_19`: new synthetic email with email-as-display-name in `From`, `CC`, `Reply-To`, and mixed `To`
- `TestEmailAsDisplayName`: 14 tests covering both bugs and edge cases:
  - `alice@example.com <bob@example.com>` → `[("alice@example.com", "bob@example.com")]`
  - `bob@example.com <bob@example.com>` → `[("", "bob@example.com")]` (name == addr, suppressed)
  - `"alice@example.com" <bob@example.com>` → strict path, no change
  - `Alice Smith <alice@example.com>` → strict path, no change
  - `alice@example.com` (bare) → no change
  - Absent header → `[]`, key omitted from output
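
The first two cases in the test plan exercise the regex fallback. Below is a minimal sketch of what such a fallback might look like. The helper name `extract_angle_addresses` and the pattern itself are assumptions made for illustration; they are not taken from the PR's `get_addresses()` implementation, and the sketch covers only the fallback step, not the initial `strict=True` attempt.

```python
import re

# Illustrative pattern (an assumption, not the PR's regex):
# an optional display name followed by an address in angle brackets.
_ANGLE_ADDR = re.compile(r'([^<>,]*)<([^<>\s]+@[^<>\s]+)>')


def extract_angle_addresses(raw_header):
    """Hypothetical fallback: pull (name, address) pairs from a raw header value."""
    pairs = []
    for match in _ANGLE_ADDR.finditer(raw_header):
        # Trim whitespace and surrounding double quotes from the display name.
        name = match.group(1).strip().strip('"')
        addr = match.group(2)
        # Suppress a display name that merely repeats the address.
        if name.lower() == addr.lower():
            name = ""
        pairs.append((name, addr))
    return pairs


print(extract_angle_addresses("alice@example.com <bob@example.com>"))
print(extract_angle_addresses("bob@example.com <bob@example.com>"))
```

The two calls should reproduce the first two expected results listed in the test plan. Because this pattern treats every comma as a separator, a quoted display name containing a comma would be cut short; a production fallback would need to handle that case.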