A simple and lightweight Python utility that converts Microsoft Word (.docx) files into clean, semantic HTML using the Mammoth library.
This tool is ideal for WordPress publishing, SEO content workflows, content migration, and automation pipelines.
- Convert
.docxfiles to clean HTML - Preserves headings, paragraphs, lists, and structure
- Fast and lightweight
- Easy integration with WordPress & CMS platforms
- Minimal dependencies
- Python 3
- Mammoth
git clone https://github.com/your-username/Word-to-html.git
cd Word-to-html
2๏ธโฃ Install Dependencies
pip install mammoth
Or using requirements file:
pip install -r requirements.txt
โถ๏ธ Usage
Python Example
import mammoth
def convert_docx_to_html(docx_file_path, output_html_file_path):
with open(docx_file_path, "rb") as docx_file:
result = mammoth.convert_to_html(docx_file)
with open(output_html_file_path, "w", encoding="utf-8") as html_file:
html_file.write(result.value)
print("Conversion completed successfully!")
# Usage
input_docx = r"C:\path\to\your\file.docx"
output_html = "output.html"
convert_docx_to_html(input_docx, output_html)
Run the Script
python convert.py
After execution, the converted HTML file will be available at the specified location.
๐ Project Structure
Word-to-html/
โ
โโโ convert.py # Main Python script
โโโ requirements.txt # Dependencies
โโโ README.md # Documentation
โโโ LICENSE # MIT License
โโโ .gitignore
๐ง Use Cases
WordPress blog publishing
Content migration from Word to CMS
SEO content formatting
Automation pipelines
Blog and article processing
โ ๏ธ Notes
Word-specific styling (fonts, colors) is not preserved
Best results with properly structured Word documents
Focused on content structure, not visual design
๐ฎ Future Improvements
Batch DOCX conversion
Command-line interface (CLI)
Custom HTML templates
WordPress REST API integration
๐ค Contributing
Contributions are welcome!
Fork the repository
Create a new branch
Commit your changes
Open a pull request
๐ License
This project is licensed under the MIT License.
You are free to use, modify, and distribute this software.