Convert MHTML files to standalone HTML with embedded CSS and resources.

Install with pip:

    pip install unmhtml

Basic usage:

    from unmhtml import MHTMLConverter

    # Convert MHTML file to HTML (secure by default)
    converter = MHTMLConverter()
    html_content = converter.convert_file('saved_page.mhtml')

    # Save as standalone HTML
    with open('output.html', 'w') as f:
        f.write(html_content)

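
The single-file usage above extends naturally to a whole folder. The following is a minimal sketch, not part of the library: `convert_directory` is a hypothetical helper, and it takes the conversion step as a plain callable so that it makes no assumptions about the library beyond the `convert_file(path)` call shown above. Writing the output as UTF-8 is also an assumption.

```python
from pathlib import Path
from typing import Callable, List


def convert_directory(src_dir: str, dst_dir: str,
                      convert: Callable[[str], str]) -> List[Path]:
    """Convert every *.mhtml file in src_dir and write .html files to dst_dir.

    `convert` is any callable that maps a file path to an HTML string,
    for example MHTMLConverter().convert_file (hypothetical wiring).
    """
    out_dir = Path(dst_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sorted(Path(src_dir).glob("*.mhtml")):
        html = convert(str(src))
        target = out_dir / (src.stem + ".html")
        # Assumption: UTF-8 is an acceptable output encoding.
        target.write_text(html, encoding="utf-8")
        written.append(target)
    return written


# Possible usage with the library (not executed here):
# from unmhtml import MHTMLConverter
# convert_directory("saved_pages", "html_out", MHTMLConverter().convert_file)
```

Passing the converter in as a callable keeps the helper independent of how the converter is configured, so the same loop works for both the default and the unsafe configuration.
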
To preserve the original content of a trusted file, disable the sanitization options:

    # Unsafe conversion preserving original content
    unsafe_converter = MHTMLConverter(
        remove_javascript=False,
        sanitize_css=False,
        remove_forms=False,
        remove_meta_redirects=False
    )
    html_content = unsafe_converter.convert_file('trusted_page.mhtml')

- Pure Python - No external dependencies, uses only the standard library
- Standalone HTML - Embeds CSS and converts resources to data URIs
- Secure by Default - All security sanitization enabled by default for safe processing
- Graceful degradation - Handles malformed MHTML files
- Memory efficient - Processes large files without excessive memory usage
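
To illustrate what "converts resources to data URIs" means in practice, here is a minimal sketch using only the standard library. It is not the library's implementation: `to_data_uri`, `embed_resources`, and `SAMPLE_MHTML` are hypothetical names, and the naive string replacement stands in for whatever matching the real converter performs. It relies on the fact that an MHTML file is a MIME multipart message, which the `email` package can parse.

```python
import base64
from email import message_from_bytes

# A tiny hand-written MHTML document (hypothetical sample data).
SAMPLE_MHTML = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; boundary="BOUNDARY"\r\n'
    b"\r\n"
    b"--BOUNDARY\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Location: http://example.com/\r\n"
    b"\r\n"
    b'<html><body><img src="http://example.com/dot.gif"></body></html>\r\n'
    b"--BOUNDARY\r\n"
    b"Content-Type: image/gif\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"Content-Location: http://example.com/dot.gif\r\n"
    b"\r\n"
    b"R0lGODlhAQABAAAAACw=\r\n"
    b"--BOUNDARY--\r\n"
)


def to_data_uri(content_type: str, data: bytes) -> str:
    """Encode raw bytes as a base64 data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{content_type};base64,{encoded}"


def embed_resources(mhtml_bytes: bytes) -> str:
    """Return the HTML part with resource URLs replaced by data URIs."""
    message = message_from_bytes(mhtml_bytes)
    html = ""
    resources = {}
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the multipart container itself
        body = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        content_type = part.get_content_type()
        if content_type == "text/html" and not html:
            html = body.decode("utf-8", errors="replace")
        else:
            location = part["Content-Location"]
            if location:
                resources[location] = to_data_uri(content_type, body)
    # Naive replacement: swap each resource URL for its data URI.
    for location, data_uri in resources.items():
        html = html.replace(location, data_uri)
    return html


standalone_html = embed_resources(SAMPLE_MHTML)
```

After this runs, the `<img>` tag in `standalone_html` points at a `data:image/gif;base64,...` URI instead of the original URL, so the page no longer depends on a separate image file.
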

The library is secure by default: all security features are enabled automatically so that untrusted content can be displayed safely.

- remove_javascript=True - Removes <script> tags, event handlers (onclick, onload, etc.), and converts javascript: URLs to safe # anchors
- sanitize_css=True - Removes external CSS URLs (http://, https://, //, /absolute) while preserving relative URLs (image.png, ../fonts/font.woff) and data URIs
- remove_forms=True - Removes form elements (<form>, <input>, <textarea>, <select>) that could submit data externally
- remove_meta_redirects=True - Removes dangerous meta tags (refresh redirects, set-cookie, dns-prefetch) that could be used maliciously

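
The sanitize_css rule above distinguishes external URLs from relative URLs and data URIs. The sketch below restates that rule as a small predicate so the boundary cases are explicit. It is an illustration derived from the description only, not the library's code, and `is_external_css_url` is a hypothetical name.

```python
def is_external_css_url(url: str) -> bool:
    """Return True for URLs the description treats as external.

    External: http://, https://, protocol-relative (//), and
    absolute paths (/...). Relative URLs and data URIs are preserved.
    """
    candidate = url.strip().lower()
    if candidate.startswith("data:"):
        return False  # data URIs are self-contained
    return candidate.startswith(("http://", "https://", "//", "/"))


urls = [
    "https://cdn.example.com/font.woff",
    "//cdn.example.com/bg.png",
    "/absolute/bg.png",
    "image.png",
    "../fonts/font.woff",
    "data:image/png;base64,AAAA",
]
# Keep only the URLs that would be preserved under this rule.
preserved = [u for u in urls if not is_external_css_url(u)]
```

Here `preserved` contains only the two relative URLs and the data URI, which matches the behavior listed for sanitize_css=True.
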
Requires Python 3.8+.

License: MIT