| 1 | +# Changelog |
| 2 | + |
| 3 | +All notable changes to this project will be documented in this file. |
| 4 | + |
| 5 | +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), |
| 6 | +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). |
| 7 | + |
| 8 | +## [Unreleased] |
| 9 | + |
| 10 | +### Added |
| 11 | + |
| 12 | +#### New Unit Tests (48 tests across 3 files) |
| 13 | + |
| 14 | +**EmailSyntaxValidatorTests.swift** |
| 15 | +- `testLocalPartExactly63Characters` - Boundary test for 63-character local part |
| 16 | +- `testLocalPartExactlyOneCharacter` - Minimum valid local part |
| 17 | +- `testLocalPartEmptyString` - Empty local part rejection |
| 18 | +- `testUnicodeLocalPartCharacterVsByteCount` - 30 four-byte Unicode chars (120 bytes, 30 chars) |
| 19 | +- `testUnicodeLocalPartExceeds64Characters` - 65+ Unicode character rejection |
| 20 | +- `testEmojiInLocalPart` - Emoji validation in Unicode mode |
| 21 | +- `testCombiningMarksInLocalPart` - Diacritics and combining characters |
| 22 | +- `testHighUnicodeRanges` - Characters beyond BMP (U+1D400+) |
| 23 | +- `testZeroWidthCharacters` - ZWSP, ZWJ, ZWNJ handling |
| 24 | +- `testBidirectionalOverrideCharacters` - RTL/LTR control character rejection |
| 25 | +- `testC1ControlCharactersRejected` - C1 control character rejection (U+0080-U+009F) |
| 26 | +- `testRFC2047EncodedWithIPv4AddressLiteral` - RFC2047 with IPv4 literal |
| 27 | +- `testRFC2047EncodedWithIPv6AddressLiteral` - RFC2047 with IPv6 literal |
| 28 | +- `testQuotedStringWithMultipleAtSymbols` - Multiple @ in quoted strings |
| 29 | +- `testQuotedStringWithRFC2047Decoding` - RFC2047 decoded quoted strings |
| 30 | +- `testAutoEncodeToRfc2047WithAddressLiteral` - Combined options testing |
| 31 | +- `testCustomDomainValidatorAcceptsAnyDomain` - Permissive validator |
| 32 | +- `testCustomDomainValidatorRejectsAllDomains` - Restrictive validator |
| 33 | +- `testCustomDomainValidatorWithSpecificTLDs` - TLD-specific validation |
| 34 | +- `testCustomDomainValidatorReceivesCorrectDomain` - Domain parameter verification |
| 35 | +- `testCustomDomainValidatorWithUnicodeDomain` - IDN domain handling |
| 36 | +- `testMultipleDotsInVariousPositions` - Valid multi-dot local parts |
| 37 | +- `testSingleCharactersBetweenDots` - Minimal segments between dots |
| 38 | +- `testMaxConsecutiveSpecialCharacters` - Consecutive special characters |
| 39 | +- `testSpecialCharactersAtBoundaries` - Special chars at start/end of segments |
| 40 | +- `testExtremelyLongLocalPart` - 1000 character local part rejection |
| 41 | +- `testExtremelyLongDomain` - 500+ character domain handling |
| 42 | +- `testVeryLongRFC2047EncodedString` - Near 76-char limit RFC2047 |
| 43 | +- `testManyUnicodeCharactersInLocalPart` - 64 diverse Unicode characters |
| 44 | + |
| 45 | +**RFC2047CoderTests.swift** |
| 46 | +- `testDecodingUTF16B` - Base64 with UTF-16 charset |
| 47 | +- `testDecodingUTF32B` - Base64 with UTF-32 charset |
| 48 | +- `testDecodingUTF16InvalidData` - Malformed UTF-16 rejection |
| 49 | +- `testDecodingUTF32InvalidData` - Malformed UTF-32 rejection |
| 50 | +- `testEncodeDecodeRoundTripSimpleASCII` - ASCII round-trip |
| 51 | +- `testEncodeDecodeRoundTripUnicode` - Unicode round-trip |
| 52 | +- `testEncodeDecodeRoundTripSpecialCharacters` - Special character round-trip |
| 53 | +- `testDecodingLatin2QPolishCharacters` - Polish special characters |
| 54 | +- `testDecodingLatin2QCzechCharacters` - Czech special characters |
| 55 | +- `testDecodingLatin2InvalidControlCharacter` - Invalid byte handling |
| 56 | +- `testEncodeEmptyString` - Empty string encoding |
| 57 | +- `testDecodeWithMixedCaseCharset` - Case-insensitive charset |
| 58 | +- `testDecodeWithMixedCaseEncoding` - Case-insensitive encoding type |
| 59 | +- `testDecodeWithWhitespaceInEncodedWord` - Whitespace handling |
| 60 | + |
| 61 | +**IPAddressValidatorTests.swift** |
| 62 | +- `testIPv6ZoneIdentifiers` - Zone identifier rejection per RFC 5321 |
| 63 | +- `testIPv6LoopbackVariants` - `::1` variations |
| 64 | +- `testIPv4MappedIPv6Extended` - `::ffff:` mapped addresses |
| 65 | +- `testIPv4LeadingZeros` - Leading zeros handling |
| 66 | +- `testEmptyIPAddressStrings` - Empty/whitespace rejection |
| 67 | + |
| 68 | +### Changed |
| 69 | + |
| 70 | +- **EmailSyntaxValidator.swift**: Reordered `CharacterSet` construction to work around a Foundation bug in which `.subtracting()` corrupts supplementary Unicode plane data. Supplementary planes (U+10000-U+10FFFF) are now added last, after all subtractions. |
| 71 | + |
| 72 | +### Fixed |
| 73 | + |
| 74 | +#### RFC 5321 Compliance |
| 75 | +- **IPAddressSyntaxValidator.swift**: IPv6 zone identifiers (e.g., `fe80::1%eth0`) are now correctly rejected. Per RFC 5321 Section 4.1.3, zone identifiers are not valid in email address literals. |
| 76 | + |
| 77 | +#### RFC 5198 Compliance |
| 78 | +- **EmailSyntaxValidator.swift**: C1 control characters (U+0080-U+009F) are now rejected in Unicode mode. Per RFC 5198 Section 2, these control characters should be avoided in network interchange. |
| 79 | + |
| 80 | +#### RFC 6531 Compliance |
| 81 | +- **EmailSyntaxValidator.swift**: Fixed supplementary Unicode plane support (U+10000-U+10FFFF). Emoji, mathematical symbols, and other characters beyond the Basic Multilingual Plane now correctly validate in Unicode mode. |
| 82 | + |
| 83 | +#### Security Improvements |
| 84 | +- **EmailSyntaxValidator.swift**: Bidirectional formatting characters are now rejected: |
| 85 | + - Left-to-Right Mark / Right-to-Left Mark (U+200E-U+200F) |
| 86 | + - Directional embeddings and overrides (U+202A-U+202E) |
| 87 | + - Directional isolates (U+2066-U+2069) |
| 88 | + - Deprecated format characters (U+206A-U+206F) |
| 89 | + |
| 90 | + These characters change how an address is displayed without changing the characters it actually contains, which can be exploited for visual spoofing of email addresses. |
| 91 | + |
| 92 | +### Technical Notes |
| 93 | + |
| 94 | +#### CharacterSet Bug Workaround |
| 95 | +Foundation's `CharacterSet` has a bug where calling `.subtracting()` on a set that includes supplementary Unicode planes (U+10000+) corrupts the supplementary plane data, even when the subtracted characters don't overlap. The workaround is to add supplementary planes as the final `.union()` call, after all `.subtracting()` operations are complete. |
| 96 | + |
| 97 | +```swift |
| 98 | +// WRONG - supplementary planes get corrupted by subsequent subtractions |
| 99 | +let corruptedCharset = baseSet |
| 100 | + .union(supplementaryPlanes) // Added here... |
| 101 | + .subtracting(c1Controls) // ...corrupted here |
| 102 | + |
| 103 | +// CORRECT - add supplementary planes last |
| 104 | +let charset = baseSet |
| 105 | + .subtracting(c1Controls) // All subtractions first |
| 106 | + .union(supplementaryPlanes) // Add supplementary planes last |
| 107 | +``` |
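The ordering described in the technical note can be sketched as a small self-contained program. This is a minimal illustration, not the project's actual implementation: the `scalar` helper, the `baseSet` range, and the `isAllowed` function are hypothetical names introduced here, and the base range is chosen only to keep the example short. The rejected ranges are the C1 and bidirectional ranges listed in the changelog.

```swift
import Foundation

// Hypothetical helper (not from the project): build a scalar from a code point.
func scalar(_ codePoint: UInt32) -> Unicode.Scalar {
    // Force-unwrapping is safe because every literal used below is a valid scalar value.
    return Unicode.Scalar(codePoint)!
}

// Illustrative base set only: U+0021 up to the last BMP scalar below the surrogates.
let baseSet = CharacterSet(charactersIn: scalar(0x0021)...scalar(0xD7FF))

// Ranges the changelog lists as rejected.
let c1Controls = CharacterSet(charactersIn: scalar(0x0080)...scalar(0x009F))
let bidiControls = CharacterSet(charactersIn: scalar(0x200E)...scalar(0x200F))
    .union(CharacterSet(charactersIn: scalar(0x202A)...scalar(0x202E)))
    // U+2066-U+2069 (isolates) and U+206A-U+206F (deprecated) are contiguous.
    .union(CharacterSet(charactersIn: scalar(0x2066)...scalar(0x206F)))

let supplementaryPlanes = CharacterSet(charactersIn: scalar(0x10000)...scalar(0x10FFFF))

// All subtractions first, supplementary planes added as the final union.
let allowedScalars = baseSet
    .subtracting(c1Controls)
    .subtracting(bidiControls)
    .union(supplementaryPlanes)

// Hypothetical check: true only if every scalar in the text is in the allowed set.
func isAllowed(_ text: String) -> Bool {
    return text.unicodeScalars.allSatisfy { allowedScalars.contains($0) }
}
```

With this ordering, a string containing U+202E or U+0085 should be rejected by `isAllowed`, while a supplementary-plane scalar such as U+1F600 should be accepted, assuming the workaround behaves as the note describes.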