Commit 02283aa

JSv4 and claude committed
Restructure into 3 packages for optional, prebuilt engine distribution
Split the project into a monorepo of three separately-published packages so
engines can be installed optionally with binaries embedded in PyPI wheels:

- python-redlines (packages/core): pure-Python wrapper, py3-none-any wheel
- python-redlines-ooxmlpowertools: Open-XML-PowerTools binary, per-platform wheels
- python-redlines-docxodus: Docxodus binary, per-platform wheels

Engines install via extras: pip install python-redlines[docxodus|ooxmlpowertools|all].
Binaries are prebuilt per platform and embedded in each binary package's wheel,
so no .NET SDK or local compilation is needed to install or use the library.

- engines.py locates its binary package via importlib.resources, extracts the
  platform archive into the user cache dir (not site-packages), and raises
  EngineNotInstalledError with install guidance when a companion package is
  missing
- hatch_build.py stamps each binary wheel with the correct platform tag
- build_differ.py builds a given RID into each package's _binaries/ dir
- CI builds per-platform wheels across 3 OS runners; release publishes all three
- Drop hatch/hatchling runtime deps and the build-on-install hook

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 07d30b9 commit 02283aa

32 files changed

Lines changed: 830 additions & 487 deletions
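
Note on the commit message: it says `engines.py` raises `EngineNotInstalledError` with install guidance when a companion binary package is missing. The check might look roughly like the sketch below. This is an illustration under assumptions, not the code from this commit: the module names `python_redlines_docxodus` and `python_redlines_ooxmlpowertools` are hypothetical (only the PyPI names and extras appear in the message), the helper `require_engine` is invented for the example, and `importlib.util.find_spec` stands in for the `importlib.resources` lookup the message mentions.

```python
import importlib.util


class EngineNotInstalledError(ImportError):
    """Raised when an engine's companion binary package is not installed."""


# Extra names come from the commit message; the importable module names
# on the right are assumptions made for this sketch.
COMPANION_MODULES = {
    "docxodus": "python_redlines_docxodus",
    "ooxmlpowertools": "python_redlines_ooxmlpowertools",
}


def require_engine(extra_name: str) -> str:
    """Return the companion module name, or raise with a pip hint if missing."""
    module_name = COMPANION_MODULES[extra_name]
    # find_spec returns None for a top-level module that cannot be found.
    if importlib.util.find_spec(module_name) is None:
        raise EngineNotInstalledError(
            f"Engine '{extra_name}' is not installed. "
            f"Install it with: pip install python-redlines[{extra_name}]"
        )
    return module_name
```

Failing at instantiation time with the exact `pip install` command keeps the core wheel pure-Python while still telling users how to get an engine.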

.github/workflows/ci.yml

Lines changed: 47 additions & 25 deletions
@@ -33,43 +33,65 @@ jobs:
         with:
           dotnet-version: '8.0.x'

-      - name: Install dependencies
+      - name: Determine runtime identifier
+        id: rid
+        shell: bash
         run: |
-          python -m pip install --upgrade pip
-          pip install hatch
+          case "${{ runner.os }}" in
+            Linux) echo "rid=linux-x64" >> "$GITHUB_OUTPUT" ;;
+            Windows) echo "rid=win-x64" >> "$GITHUB_OUTPUT" ;;
+            macOS) echo "rid=osx-arm64" >> "$GITHUB_OUTPUT" ;;
+          esac

-      - name: Build engine binaries
-        run: python build_differ.py
+      - name: Build engine binaries for this platform
+        run: python build_differ.py ${{ steps.rid.outputs.rid }}
+
+      - name: Install packages (editable)
+        run: pip install -e packages/core -e packages/ooxmlpowertools -e packages/docxodus pytest

       - name: Run tests
-        run: hatch run test
+        run: python -m pytest tests/ -v

-  build:
+  build-core:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
         with:
-          submodules: recursive
+          python-version: '3.11'
+      - run: pip install build twine
+      - name: Build core sdist + wheel
+        run: python -m build packages/core --outdir dist
+      - name: Check distributions
+        run: twine check dist/*

-      - name: Set up Python
-        uses: actions/setup-python@v5
+  build-engine-wheels:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - { os: ubuntu-latest, rids: "linux-x64 linux-arm64" }
+          - { os: windows-latest, rids: "win-x64 win-arm64" }
+          - { os: macos-latest, rids: "osx-x64 osx-arm64" }
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          submodules: recursive
+      - uses: actions/setup-python@v5
         with:
           python-version: '3.11'
-
-      - name: Set up .NET
-        uses: actions/setup-dotnet@v4
+      - uses: actions/setup-dotnet@v4
         with:
           dotnet-version: '8.0.x'
-
-      - name: Install build dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install hatch hatchling
-
-      - name: Build package
-        run: hatch build
-
-      - name: Check package
+      - run: pip install build hatchling twine
+      - name: Build per-platform engine wheels
+        shell: bash
         run: |
-          pip install twine
-          twine check dist/*
+          for rid in ${{ matrix.rids }}; do
+            python build_differ.py "$rid"
+            python -m build --wheel --no-isolation packages/ooxmlpowertools --outdir dist
+            python -m build --wheel --no-isolation packages/docxodus --outdir dist
+          done
+      - name: Check wheels
+        run: twine check dist/*
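
Note on the test job: the `case` statement hard-codes one RID per runner OS (`linux-x64`, `win-x64`, `osx-arm64`). For local development, a similar choice could be derived from the running interpreter. The sketch below is only an illustration: `detect_rid` is a hypothetical helper that does not appear in this commit, and the architecture aliases it recognizes are assumptions about what `platform.machine()` commonly reports.

```python
import platform

# The six RIDs listed in the workflow matrix.
SUPPORTED_RIDS = {
    "linux-x64", "linux-arm64",
    "win-x64", "win-arm64",
    "osx-x64", "osx-arm64",
}

_OS_PARTS = {"linux": "linux", "windows": "win", "darwin": "osx"}
_ARCH_PARTS = {"x86_64": "x64", "amd64": "x64", "arm64": "arm64", "aarch64": "arm64"}


def detect_rid(system=None, machine=None):
    """Map an OS name and CPU architecture to a .NET runtime identifier."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    os_part = _OS_PARTS.get(system)
    arch_part = _ARCH_PARTS.get(machine)
    if os_part is None or arch_part is None:
        raise RuntimeError(f"Unsupported platform: {system}/{machine}")
    rid = f"{os_part}-{arch_part}"
    assert rid in SUPPORTED_RIDS  # every OS/arch pair above is in the matrix
    return rid
```

With such a helper, `python build_differ.py "$(python -c 'print(detect_rid())')"` style invocations would be possible, although the workflow's explicit `case` is easier to audit in CI.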
Lines changed: 63 additions & 16 deletions
@@ -1,5 +1,10 @@
 name: Upload Python Package

+# Builds and publishes all three packages on a tagged release:
+# python-redlines (core, pure-Python sdist + wheel)
+# python-redlines-ooxmlpowertools (per-platform engine wheels)
+# python-redlines-docxodus (per-platform engine wheels)
+
 on:
   release:
     types: [published]
@@ -8,28 +13,70 @@ permissions:
   contents: read

 jobs:
-  deploy:
-
+  build-core:
     runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - run: pip install build
+      - name: Build core sdist + wheel
+        run: python -m build packages/core --outdir dist
+      - uses: actions/upload-artifact@v4
+        with:
+          name: dist-core
+          path: dist/*

+  build-engine-wheels:
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - { os: ubuntu-latest, rids: "linux-x64 linux-arm64" }
+          - { os: windows-latest, rids: "win-x64 win-arm64" }
+          - { os: macos-latest, rids: "osx-x64 osx-arm64" }
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
         with:
           submodules: recursive
-      - name: Set up Python
-        uses: actions/setup-python@v3
+      - uses: actions/setup-python@v5
         with:
-          python-version: '3.x'
-      - name: Setup .NET
-        uses: actions/setup-dotnet@v3
+          python-version: '3.11'
+      - uses: actions/setup-dotnet@v4
         with:
           dotnet-version: '8.0.x'
-      - name: Install dependencies
-        run: |
-          python -m pip install --upgrade pip
-          pip install hatch hatchling
-      - name: Build package
-        run: hatch build
-      - name: Publish package
+      - run: pip install build hatchling
+      - name: Build per-platform engine wheels
+        shell: bash
         run: |
-          hatch publish -u "__token__" -a ${{ secrets.PYPI_API_TOKEN }}
+          for rid in ${{ matrix.rids }}; do
+            python build_differ.py "$rid"
+            python -m build --wheel --no-isolation packages/ooxmlpowertools --outdir dist
+            python -m build --wheel --no-isolation packages/docxodus --outdir dist
+          done
+      - uses: actions/upload-artifact@v4
+        with:
+          name: dist-${{ matrix.os }}
+          path: dist/*
+
+  publish:
+    needs: [build-core, build-engine-wheels]
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/download-artifact@v4
+        with:
+          path: dist
+          merge-multiple: true
+      - uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+      - run: pip install twine
+      - name: Check distributions
+        run: twine check dist/*
+      - name: Publish to PyPI
+        env:
+          TWINE_USERNAME: __token__
+          TWINE_PASSWORD: ${{ secrets.PYPI_API_TOKEN }}
+        run: twine upload dist/*
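
Note on the build loop: it builds one RID, then one wheel per binary package, and repeats, which matches the rule stated in this commit's CLAUDE.md that a binary-package wheel must contain exactly one platform archive. A pre-upload sanity check could look like the sketch below. It is an assumption-laden illustration: the helper names are hypothetical, and it assumes archives sit under a `_binaries/` directory inside the wheel with a `.tar.gz` or `.zip` suffix, as the `.gitignore` patterns and CLAUDE.md suggest.

```python
import zipfile

ARCHIVE_SUFFIXES = (".tar.gz", ".zip")


def platform_archives_in_wheel(wheel_path):
    """List archive members stored under a _binaries/ directory in a wheel."""
    # A wheel is a zip file, so the stdlib zipfile module can inspect it.
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            name
            for name in wheel.namelist()
            if "/_binaries/" in name and name.endswith(ARCHIVE_SUFFIXES)
        )


def check_single_archive(wheel_path):
    """Return the single embedded archive name, or raise if the count is not 1."""
    archives = platform_archives_in_wheel(wheel_path)
    if len(archives) != 1:
        raise ValueError(
            f"{wheel_path}: expected exactly one platform archive, found {archives}"
        )
    return archives[0]
```

Running a check like this over `dist/*.whl` before `twine upload` would catch a wheel that accidentally bundles archives from two RIDs.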

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -9,6 +9,10 @@ csproj/obj/*
 docxodus/**/bin/*
 docxodus/**/obj/*

+# Engine binary archives (built by build_differ.py, embedded in wheels by CI)
+packages/*/src/*/_binaries/*.tar.gz
+packages/*/src/*/_binaries/*.zip
+
 # C extensions
 *.so


CLAUDE.md

Lines changed: 71 additions & 37 deletions
@@ -4,68 +4,102 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

 ## Project Overview

-Python-Redlines is a Python wrapper around compiled C# binaries that generate `.docx` redline/tracked-changes documents by comparing two Word files. The Python layer handles platform detection, binary extraction, temp file management, and subprocess execution.
+Python-Redlines generates `.docx` redline/tracked-changes documents by comparing two Word files. A pure-Python wrapper drives compiled C# (.NET 8) engine binaries; the Python layer handles platform detection, binary extraction, temp file management, and subprocess execution.

 Two comparison engines are available:
 - **XmlPowerToolsEngine** — wraps Open-XML-PowerTools WmlComparer (original engine)
 - **DocxodusEngine** — wraps Docxodus, a modernized .NET 8.0 fork with better move detection

-## Commands
-
-```bash
-# Run tests
-hatch run test
+## Monorepo structure — three published packages

-# Run a single test
-hatch run test tests/test_openxml_differ.py::test_run_redlines_with_real_files
+This repo publishes **three** PyPI packages, each with its own `pyproject.toml` under `packages/`:

-# Run tests with coverage
-hatch run cov
+| Directory | PyPI name | Contents | Wheel |
+|---|---|---|---|
+| `packages/core` | `python-redlines` | Pure-Python wrapper (`engines.py`) | `py3-none-any` |
+| `packages/ooxmlpowertools` | `python-redlines-ooxmlpowertools` | Open-XML-PowerTools binary | per-platform |
+| `packages/docxodus` | `python-redlines-docxodus` | Docxodus binary | per-platform |

-# Type checking
-hatch run types:check
+Engine binaries are **optional dependencies**. Users install an engine via an extra:
+`pip install python-redlines[docxodus]`, `[ooxmlpowertools]`, or `[all]`. The core
+package has no binaries; each binary package ships one platform's compiled binary as a
+prebuilt wheel, so end users never compile anything.

-# Build C# binaries for all platforms (requires .NET 8.0 SDK)
-hatch run build
+The repo root is **not** an installable project — its `pyproject.toml` holds only
+shared pytest/coverage config.

-# Build Python package (triggers C# build via custom hook)
-hatch build
+## Commands

-# Initialize Docxodus submodule (required before building)
+```bash
+# Initialize the Docxodus submodule (required before building its engine)
 git submodule update --init --recursive
-```

-## Architecture
+# Build engine binaries for one or more platforms (requires .NET 8.0 SDK).
+# RIDs: linux-x64 linux-arm64 win-x64 win-arm64 osx-x64 osx-arm64
+python build_differ.py linux-x64
+python build_differ.py --all
+
+# Install all three packages editable for development
+pip install -e packages/core -e packages/ooxmlpowertools -e packages/docxodus pytest

-The system uses a two-layer wrapper pattern with a shared base class:
+# Run tests (from repo root)
+python -m pytest tests/
+python -m pytest tests/test_openxml_differ.py::test_run_redlines_with_real_files

-1. **Python layer** (`src/python_redlines/engines.py`):
-   - `BaseEngine` — shared logic for binary extraction, subprocess invocation, and temp file management
-   - `XmlPowerToolsEngine(BaseEngine)` — sets constants for the Open-XML-PowerTools binary (`dist/`, `bin/`, `redlines`)
-   - `DocxodusEngine(BaseEngine)` — sets constants for the Docxodus binary (`dist_docxodus/`, `bin_docxodus/`, `redline`)
+# Build a package wheel
+python -m build packages/core
+python -m build --wheel packages/docxodus # needs an archive in _binaries/ first
+```

-Both engines expose `run_redline(author_tag, original, modified, **kwargs)`. `DocxodusEngine` overrides `_build_command()` to translate kwargs (e.g. `detect_moves`, `detail_threshold`) into CLI flags for the Docxodus binary. `XmlPowerToolsEngine` uses the legacy 4-positional-arg format and ignores kwargs.
+## Architecture

-2. **C# binaries**:
+1. **Core Python layer** (`packages/core/src/python_redlines/engines.py`):
+   - `BaseEngine` — locates the engine binary in its companion package via
+     `importlib.resources`, extracts the platform archive once into a writable
+     user cache dir (`platformdirs.user_cache_dir`), and runs it via subprocess.
+   - `XmlPowerToolsEngine` / `DocxodusEngine` — subclasses declaring `BINARY_PACKAGE`,
+     `BINARY_BASE_NAME`, and `EXTRA_NAME`.
+   - `EngineNotInstalledError` — raised on instantiation if the companion binary
+     package is missing, with the `pip install` command to fix it.
+
+Both engines expose `run_redline(author_tag, original, modified, **kwargs)`.
+`DocxodusEngine` overrides `_build_command()` to translate kwargs (e.g. `detect_moves`,
+`detail_threshold`) into CLI flags. `XmlPowerToolsEngine` uses the legacy
+4-positional-arg format and ignores kwargs.
+
+2. **Binary packages** ship one platform archive under
+   `src/<pkg>/_binaries/<rid>.tar.gz` (or `.zip` for Windows). The archive is
+   gitignored; CI builds it. The hatchling build hook `hatch_build.py` reads which
+   RID archive is present and stamps the wheel's platform tag accordingly.
+
+3. **C# sources**:
    - `csproj/Program.cs` — Open-XML-PowerTools CLI tool
    - `docxodus/tools/redline/Program.cs` — Docxodus CLI tool (git submodule)

-Pre-compiled binaries for 6 platform targets (linux/win/osx x x64/arm64) are stored as archives in `src/python_redlines/dist/` and `src/python_redlines/dist_docxodus/`, included in the wheel. The build script `build_differ.py` compiles both engines using `dotnet publish`.
+`build_differ.py` compiles an engine for a given RID with `dotnet publish` and
+writes a single flat archive into the corresponding binary package's `_binaries/`.
+
+## Build & release flow
+
+- A binary-package wheel must contain **exactly one** platform archive. Each
+  `build_differ.py <rid>` invocation clears old archives, so build one RID, build
+  the wheel, repeat.
+- `.github/workflows/ci.yml` — tests on each OS (native RID) + builds all wheels.
+- `.github/workflows/python-publish.yml` — on release, builds per-platform engine
+  wheels across 3 OS runners, the core sdist+wheel, and publishes all three packages.

-## Key Files
+## Version management

-- `src/python_redlines/engines.py` — BaseEngine, XmlPowerToolsEngine, and DocxodusEngine classes
-- `src/python_redlines/__init__.py` — Exports all engine classes
-- `src/python_redlines/__about__.py` — Single source of truth for package version
-- `csproj/Program.cs` — Open-XML-PowerTools C# comparison utility
-- `docxodus/` — Docxodus git submodule (tools/redline/ contains the CLI)
-- `build_differ.py` — Cross-platform C# build orchestration for both engines
-- `hatch_run_build_hook.py` — Hatch build hook that triggers C# compilation
-- `tests/fixtures/` — Test `.docx` files (original, modified, expected_redline)
+`packages/core/src/python_redlines/__about__.py` is the single source of truth.
+The two binary packages read it via `[tool.hatch.version] path = "../core/..."`,
+so all three always share one version. Bump only that file.

 ## Testing Notes

-Tests must be run from the project root (fixtures use relative paths like `tests/fixtures/original.docx`). The XmlPowerToolsEngine integration test validates that comparing the fixture documents produces exactly 9 revisions. Docxodus uses a different stdout format (`"revision(s) found"` vs `"Revisions found: 9"`).
+Tests live in repo-root `tests/` and must be run from the repo root (fixtures use
+relative paths like `tests/fixtures/original.docx`). They require all three packages
+installed and the binaries built for the current platform. The XmlPowerToolsEngine
+integration test validates exactly 9 revisions on the fixture documents.

 ## Stdout Format Differences
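
Note on the CLAUDE.md changes: the new Architecture section says `BaseEngine` extracts the platform archive once into a writable user cache dir. One way such extract-once behavior could be arranged is sketched below. This is not the code from `engines.py`: `extract_once` and the `.extracted` marker file are hypothetical, and the cache root is passed in as a parameter (where the real code reportedly uses `platformdirs.user_cache_dir`) so the sketch stays standard-library only.

```python
import tarfile
import zipfile
from pathlib import Path


def extract_once(archive: Path, cache_root: Path) -> Path:
    """Extract archive into cache_root/<rid>, skipping work already done."""
    # "<rid>.tar.gz" or "<rid>.zip" -> "<rid>" (str.removesuffix needs Python 3.9+)
    rid = archive.name.removesuffix(".tar.gz").removesuffix(".zip")
    target = cache_root / rid
    marker = target / ".extracted"  # hypothetical marker written after success
    if marker.exists():
        return target

    target.mkdir(parents=True, exist_ok=True)
    if archive.name.endswith(".zip"):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
    else:
        with tarfile.open(archive, "r:gz") as tf:
            if hasattr(tarfile, "data_filter"):
                # Safer extraction filter where the interpreter supports it.
                tf.extractall(target, filter="data")
            else:
                tf.extractall(target)
    marker.touch()
    return target
```

Extracting into a cache directory rather than next to the installed package avoids writing into `site-packages`, which is often read-only; the marker file keeps a partially extracted directory from being treated as complete.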
