
Project Symmetry - Cross-Language Wikipedia Article Gap Analysis Tool


Project-Symmetry: Cross-Language Wikipedia Article Semantic Analysis Tool

A semantic analysis tool that compares Wikipedia articles across languages, section by section and paragraph by paragraph, to identify content gaps, missing information, and added content. It also provides word-level diffs, revision risk flagging, and language-lag detection.


Prerequisites

Start Everything (Local Development)

# Start both backend and frontend
./start.sh all

# Start with frontend in development mode
./start.sh all --dev

Using Docker Compose

./start.sh docker       # Foreground
./start.sh docker-up    # Detached
./start.sh docker-down  # Stop

Backend

cd symmetry-unified-backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

Frontend

cd desktop-electron-frontend
yarn install
yarn start

Access Points


Documentation


Tech Stack

  • Frontend: Electron 26 + React 18 + TypeScript + Vite + Tailwind CSS + shadcn/ui
  • Backend: Python + FastAPI + sentence-transformers + spaCy + MarianMT
  • Comparison Engine: LaBSE sentence embeddings (cosine similarity) + Levenshtein distance for disambiguation
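The two measures named for the comparison engine can be illustrated in plain Python. This is a minimal sketch, not the project's implementation: the short toy vectors stand in for LaBSE sentence embeddings, the function names are hypothetical, and how the backend combines the two scores for disambiguation is not described here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions, and substitutions."""
    previous = list(range(len(t) + 1))  # distances from "" to each prefix of t
    for i, cs in enumerate(s, start=1):
        current = [i]  # distance from s[:i] to ""
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(
                previous[j] + 1,         # delete cs
                current[j - 1] + 1,      # insert ct
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]
```

Cosine similarity compares meaning through embedding vectors, while Levenshtein distance compares surface spelling, so the second can separate candidates whose embeddings score almost the same.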

Section Comparison (Primary)

  • POST /symmetry/v1/articles/compare-sections — Compare two Wikipedia articles section-by-section with paragraph-level diffs

Structured Wiki

  • GET /symmetry/v1/wiki/structured-article — Parse article into sections/citations/references
  • GET /symmetry/v1/wiki/paragraph-diff — Word-level semantic diff between two article sections; returns aligned sentence pairs in which each token is tagged equal, insert, delete, or replace
  • GET /symmetry/v1/wiki/revision-history — Revision history with optional risk flags (include_flags=true)
  • GET /symmetry/v1/wiki/revision-diff — Diff between two revisions with section-level change breakdown
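The equal / insert / delete / replace tags used by the paragraph diff are the same four operations that Python's standard difflib reports. The sketch below shows what a word-level diff with those tags can look like. It is purely lexical and runs on two sentences that are already aligned, so it is only an illustration: the endpoint itself is described as semantic, and the `word_diff` name and the tuple layout are assumptions, not the API's response format.

```python
from difflib import SequenceMatcher


def word_diff(old, new):
    """Return (tag, old_words, new_words) triples for two sentences.

    tag is one of "equal", "insert", "delete", "replace".
    """
    old_tokens = old.split()
    new_tokens = new.split()
    # autojunk=False keeps frequent words from being treated as noise
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    return [
        (tag, old_tokens[i1:i2], new_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


if __name__ == "__main__":
    for tag, old_words, new_words in word_diff("the cat sat", "the dog sat"):
        print(tag, old_words, new_words)
```

A "replace" entry carries words on both sides, "insert" has an empty old side, and "delete" has an empty new side.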

Legacy Comparison

  • POST /symmetry/v1/articles/compare — Plain-text semantic comparison

Models Management

  • GET /models/comparison — List comparison models

Testing

# Backend unit tests (CI-equivalent)
cd symmetry-unified-backend
source venv/bin/activate
python -m pytest -m "not slow and not external" --tb=short

# Frontend E2E tests (requires backend + frontend running)
cd desktop-electron-frontend
npm run test:e2e              # Playwright headless
npm run test:e2e:report       # Open HTML report

CI/CD

The project uses GitHub Actions for continuous integration and automated releases.

Workflows

| Workflow | Trigger | What it does |
|---|---|---|
| CI (.github/workflows/ci.yml) | Push/PR to main or develop | Runs backend tests, builds frontend web bundle, builds & smoke-tests frontend Docker image, runs docker-compose integration |
| Release (.github/workflows/release.yml) | Push to main | Runs full CI → bumps version (semver) → creates git tag → creates GitHub release → publishes Docker images to GHCR |
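The release workflow bumps a semver version, and contributors are asked to use Conventional Commits, so one plausible link between the two is the common mapping sketched below: a breaking change bumps major, `feat` bumps minor, `fix` bumps patch. This mapping is an assumption and is not confirmed by the workflow file; `bump_level` and `bump` are hypothetical names.

```python
import re

# Conventional Commits header: type, optional (scope), optional "!", then ": "
HEADER = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?(?P<breaking>!)?: ")


def bump_level(messages):
    """Return "major", "minor", "patch", or None for a list of commit messages."""
    level = None
    for message in messages:
        match = HEADER.match(message)
        if not match:
            continue  # ignore messages that do not follow the convention
        if match.group("breaking") or "BREAKING CHANGE:" in message:
            return "major"
        if match.group("type") == "feat":
            level = "minor"
        elif match.group("type") == "fix" and level is None:
            level = "patch"
    return level


def bump(version, level):
    """Apply a bump level to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version  # no release-worthy commits


if __name__ == "__main__":
    commits = ["feat(api): add endpoint", "fix: handle empty section"]
    print(bump("1.1.0", bump_level(commits)))
```

Commits of other types, such as `docs` or `chore`, leave the version unchanged in this sketch.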

Docker Images

Released images are available from GHCR:

# Pull specific version
docker pull ghcr.io/grey-box/symmetry-project/backend:1.1.0
docker pull ghcr.io/grey-box/symmetry-project/frontend:1.1.0

# Pull latest
docker pull ghcr.io/grey-box/symmetry-project/backend:latest
docker pull ghcr.io/grey-box/symmetry-project/frontend:latest

Note: Frontend and backend are versioned independently. See CHANGELOG.md for the version history of each component.


Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Install dependencies (see the Backend and Frontend setup commands above)
  4. Make changes and run tests
  5. Use Conventional Commits for your commit messages
  6. Submit a pull request to develop
  7. After review, PRs are merged to develop, then promoted to main for release

Community


License

See the LICENSE file for the license terms that apply to this project.


Acknowledgments

  • Grey Box: Project development and maintenance
  • Wikipedia: Source content and API access
  • Open Source Community: Libraries and tools

Last Updated: March 2026
Version: 2.0.0
Maintainers: grey-box
