Code Review: Math-To-Manim

Date: 2026-03-13 Branch: claude/code-review-Vi9bL Reviewer: Claude (automated deep review)

Executive Summary

Math-To-Manim is a mature research/production hybrid project with a novel six-agent reverse knowledge tree pipeline across three AI backends (Claude, Gemini, Kimi). The codebase has strong fundamentals but accumulated technical debt: root directory clutter, broken showcase assets, stale references, duplicate code, and 52MB of tracked media artifacts.

Severity breakdown: 8 critical, 14 moderate, 16 low-priority findings across structure, code quality, security, and testing.

Critical Findings

1. All Showcase GIFs Are Broken (131-134 bytes)

Location: giffolder/, public/ Impact: The "See It In Action" section of the README — the most important part for attracting users — displays nothing. Every GIF is a Git LFS pointer that was never resolved or is a corrupt placeholder.

Files affected:

public/BrownianFinance.gif (131 bytes)
public/Rhombicosidodecahedron.gif (134 bytes)
public/TeachingHopf.gif (133 bytes)
public/WhiskeringExchangeScene.gif (132 bytes)
All 12 files in giffolder/ (131-134 bytes each)

Fix: Regenerate real GIF/MP4 previews from the working examples, or replace with static PNG screenshots. The current showcase is effectively invisible.

2. README Contains Multiple Broken References

Location: README.md

Line	Issue
33	Typo: "CLaude" → "Claude"
192	Pipeline 3 path: `KimiK2Thinking/` → `KimiK2.5Swarm/`
207-211	All Kimi file paths reference wrong directory
276	`skill/` directory doesn't exist; plugin is at `.claude/plugins/math-to-manim/`
289	Another `KimiK2Thinking/` reference
371	`skill/README.md` link is broken

Fix: Update all path references to match actual directory structure.

3. 52MB of Media Tracked in Git

Location: media/review_frames/, media/images/ Impact: Repository size bloat. The .gitignore catches media/videos/ and media/Tex/ but misses:

media/review_frames/ — 586 PNG frame files
media/images/ — miscellaneous images
media/*.html — 2 HTML player files

Fix: Add these patterns to .gitignore and remove from tracking.

4. `output/` Directory Contains Generated Artifacts

Location: output/ Impact: 352KB of JSON trees, narratives, Python scripts, and HTML files from pipeline test runs are committed. These are ephemeral outputs.

Fix: Add output/ to .gitignore and remove from tracking.

Moderate Findings

5. Root Directory Clutter

Files that don't belong at root level:

File	Size	Recommendation
`ClaudeCodeManim.jpeg`	2.7MB	Move to `public/hero.jpeg`, compress
`fourier_epicycles.py`	18KB	Move to `examples/mathematics/`
`EulerTest.md`	4KB	Move to `docs/specs/` or delete
`skilltest.md`	4KB	Move to `docs/specs/` or delete

6. `src/` Directory Is Cluttered With Scratch Files

Location: src/ 20+ loose files mixed with production code:

Prompt drafts: complex_prompt.txt, geodesic_prompt.txt, taylor_prompt.txt, whiskering_prompt.txt
Generated outputs: finance_pipeline_output.json, teaching_pipeline_output.json
Config files: codingagent.json, planningagent.json
LaTeX notes: Taylor_Topology_Notes.tex
Temp files: raw_output.txt, STATUS.txt
Standalone scenes: finance_scene.py, geodesic_scene.py, taylor_scene.py, whiskering_scene.py

Fix: Move prompts to src/prompts/, outputs to output/, scenes to examples/, delete temp files.

7. Duplicate Agent Implementations

Location: src/agents/ Four versions of the prerequisite explorer exist:

prerequisite_explorer.py — original
enhanced_prerequisite_explorer.py — enhanced
improved_prerequisite_explorer.py — improved
prerequisite_explorer_claude.py — Claude SDK version

Plus two orchestrators: agent_orchestrator.py and orchestrator.py.

Fix: Consolidate to one canonical version per component, archive or delete the rest.

8. Duplicate Mermaid Generators

Location: tools/

generate_mermaid.py
generate_mermaid_v2.py
generate_mermaid_v3.py

Fix: Keep only v3, delete older versions.

9. Two Separate `scripts/` Directories

Root scripts/ — check_env.py, debug_ffmpeg.py, regenerate_test.py
tools/scripts/ — generate_manim_from_tree.py, remove_emojis.py, run_pipeline_from_latex.py

Fix: Merge into a single scripts/ directory.

10. `giffolder/` and `public/` Serve Duplicate Purpose

Both directories store the same GIF files. giffolder/ appears to be a testing artifact.

Fix: Consolidate into public/ only, delete giffolder/.

11. `requirements.txt` Issues

claude-agent-sdk>=0.1.0 — package name may not match PyPI
nomic>=3.0.0 — commented as "optional" but will fail pip install -r for users who don't want it
latex>=0.7.0 — niche package that may conflict with system LaTeX
Separate src/requirements.txt exists — conflicting dependency specs
No pyproject.toml — modern Python projects should use one

Fix: Create pyproject.toml with optional dependency groups: pip install .[claude], pip install .[gemini], pip install .[kimi].

12. Documentation Sprawl in `docs/`

30+ files including:

14 PDF files (research papers, generated docs) — bloat the repo
Overlapping guides: ARCHITECTURE.md, AGENT_ARCHITECTURE.md, AGENT_PIPELINE_GUIDE.md, AGENT_INSPECTION_GUIDE.md, PROJECT_STRUCTURE.md, REPOSITORY_ORGANIZATION.md
Internal planning docs in a public repo: MIGRATION_TO_CLAUDE.md, COMMUNICATION_STRATEGY.md, ROADMAP.md
CLAUDE.md is in docs/ but Claude Code looks for it at repo root

Fix: Consolidate overlapping docs, move CLAUDE.md to repo root, consider moving PDFs to releases or an external host.

Low-Priority Findings

13. No CI/CD Pipeline

No .github/workflows/ directory exists. Tests are only runnable locally.

Fix: Add a basic GitHub Actions workflow for linting and testing.

14. `KimiK2.5Swarm/legacy/` Contains Old Code

Legacy code tracked in the main branch adds confusion.

Fix: Remove or archive in a separate branch.

15. `CONTRIBUTING.md` References DeepSeek and Placeholder Links

References Discord/Twitter links that appear to be placeholders
Mentions DeepSeek API setup (removed from active pipelines)

Fix: Update to current pipelines and real links.

16. `pytest.ini` Ignores `KimiK2Thinking` (Old Name)

Should ignore KimiK2.5Swarm instead.

Fix: Update norecursedirs in pytest.ini.

17. No Version Management

No VERSION file, no __version__ constant, no pyproject.toml version field.

Fix: Add version tracking.

18. Star History Chart Position

The star history SVG at the top of the README pushes actual content down.

Fix: Move to the bottom or into a collapsible section.

Branding & Visual Recommendations

Hero image: Replace the 2.7MB screenshot JPEG with a clean, compressed branded graphic (SVG or optimized PNG)
Showcase GIFs: Regenerate real animated previews — this is the #1 thing that sells the project to visitors
Logo: Consider adding a simple SVG logo for brand recognition
Badge cleanup: 10 badges across 2 rows is cluttered; keep Python, Manim, License, Stars
README flow: Lead with the showcase (working GIFs), then architecture, then installation

Proposed Implementation Phases

Phase 1: Root Cleanup & Fix Broken Assets

Move stray files out of root
Fix/regenerate showcase GIFs
Add missing .gitignore patterns
Remove tracked generated artifacts

Phase 2: Fix README & Documentation

Fix all broken path references
Fix typos
Consolidate overlapping documentation
Move CLAUDE.md to repo root
Update CONTRIBUTING.md

Phase 3: Code Organization

Clean up src/ scratch files
Consolidate duplicate agent implementations
Merge scripts/ and tools/scripts/
Delete old mermaid generator versions

Phase 4: Build & Dependencies

Create pyproject.toml with optional dependency groups
Remove duplicate src/requirements.txt
Add GitHub Actions CI workflow
Add version management

Python Code Quality Findings

Critical: Error Handling

19. Bare Exception Handlers Multiple files use bare except: or except Exception that silently swallow all errors including KeyboardInterrupt and SystemExit:

KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py — multiple bare except: statements
KimiK2.5Swarm/agents/prerequisite_explorer.py — bare except: statements
KimiK2.5Swarm/kimi_client.py:256-263 — catches Exception as e and detects auth errors via fragile string matching
KimiK2.5Swarm/kimi_client.py:433-435 — swallows all errors into a string dict without logging

Fix: Replace bare excepts with specific exception types. Add structured logging.

20. Silent Failures in Tool Calling Loop KimiK2.5Swarm/kimi_client.py:420-451 — tool execution errors are caught and converted to JSON silently. Tool failures won't bubble up, making debugging impossible.

Critical: Security

21. Unsafe API Key Handling

src/app.py:10 — api_key=os.getenv("DEEPSEEK_API_KEY") passed directly without validation
src/app_claude.py:29 — key used directly, could appear in error messages or logs
KimiK2.5Swarm/config.py:212-213 — MOONSHOT_API_KEY stored as module-level constant (accessible via import config; config.MOONSHOT_API_KEY)

22. Hardcoded Model Names

src/app_claude.py:32 — CLAUDE_MODEL = "claude-opus-4-5-20251101" hardcoded
src/app.py:84 — model="deepseek-reasoner" hardcoded in function call

Fix: Move model names to config or environment variables.

Critical: Incomplete Implementations

23. Unimplemented Pipeline Stages src/agents/agent_orchestrator.py:149-169 — four of seven pipeline stages are marked with # TODO:

Mathematical enrichment
Visual design
Narrative composition
Code generation

The pipeline appears complete from the outside but 4/7 stages don't actually work.

24. Placeholder Code

src/agents/threejs_code_generator.py:247 — pass in visualize_threejs() method
src/app.py:15-16 — smolagent commented out as unavailable
src/agents/claude_agent_runtime.py — incomplete implementation

Moderate: Code Architecture

25. Five Prerequisite Explorer Implementations Zero code sharing between them despite ~80% logic overlap:

src/agents/prerequisite_explorer.py (392 lines) — uses DeepSeek API
src/agents/prerequisite_explorer_claude.py (467 lines) — uses Anthropic Claude
src/agents/enhanced_prerequisite_explorer.py — enhanced version
src/agents/improved_prerequisite_explorer.py — improved version
KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py — Kimi K2.5 version

Fix: Consolidate into one implementation with a pluggable API client pattern.

26. sys.path Hacks Instead of Proper Packages

KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py:19-24 — direct sys.path.insert() manipulations
src/agents/agent_orchestrator.py:33-50 — try/except cascade for imports

Fix: Use proper package structure with __init__.py and relative imports.

27. Global Mutable State (Thread Safety)

KimiK2.5Swarm/kimi_client.py:581 — _kimi_client: Optional[KimiClient] = None global singleton
KimiK2.5Swarm/tools/tool_registry.py:321 — _default_registry global singleton
src/agents/prerequisite_explorer_claude.py:46 — global CLI_CLIENT
src/app_claude.py:108 — global video_review_agent in a Gradio web app

Fix: Use dependency injection or proper scoped instances, especially in the web app where concurrent requests will collide.

28. Mixed Sync/Async Patterns Some classes have both execute() and execute_async(), some have only sync, some only async. No consistency across the codebase.

Moderate: Testing

29. ~2% Test Coverage Only 6 test files for 159 Python files (43,242 lines of code):

tests/test_prerequisite_explorer.py — 100 lines, partial coverage
tests/test_agent_pipeline.py — script-style, not proper pytest
tests/test_tool.py — incomplete
tests/conftest.py — only 3 fixtures
tests/unit/ and tests/integration/ and tests/e2e/ — directories exist but appear empty

30. Tests Require Live API Keys tests/conftest.py skips tests if ANTHROPIC_API_KEY not set. No mocking or fixtures exist. All tests are effectively integration tests.

Fix: Add mocks for API calls, separate unit tests from integration tests.

31. No Error Case Testing No tests for:

Invalid inputs or malformed JSON responses
API failures, timeouts, rate limits
Empty prerequisites, circular dependencies, max depth reached

Low: Code Quality

32. Dead Imports and Commented-Out Code Multiple files have unused imports and commented-out code blocks throughout.

33. Magic Numbers

src/agents/prerequisite_explorer_claude.py:90 — max_depth=4 hardcoded
KimiK2.5Swarm/config.py:59 — max_tokens=4096 hardcoded

34. Long Functions Violating Single Responsibility

src/agents/orchestrator.py — 650+ lines
src/agents/narrative_composer.py — 450+ lines

35. Inconsistent Logging Some files use logger.debug(), others use print(). KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py has 15+ print statements mixed with logger calls.

36. No Caching Between API Calls KimiK2.5Swarm/agents/prerequisite_explorer.py:103 has a TTL cache but it's not used consistently. Exploring the same concept at different depths re-fetches from the API.

Full Findings Summary

Category	Critical	Moderate	Low	Total
Repository Structure	4	8	6	18
Error Handling	2	0	0	2
Security	2	0	0	2
Incomplete Code	2	0	0	2
Architecture	0	4	0	4
Testing	0	3	0	3
Code Quality	0	0	5	5
Total	8	14	14	36

Proposed Implementation Phases (Updated)

Phase 1: Root Cleanup & Fix Broken Assets

Move stray files out of root
Fix/regenerate showcase GIFs
Add missing .gitignore patterns
Remove tracked generated artifacts

Phase 2: Fix README & Documentation

Fix all broken path references and typos
Consolidate overlapping documentation
Move CLAUDE.md to repo root
Update CONTRIBUTING.md

Phase 3: Code Organization & Quality

Clean up src/ scratch files
Consolidate 5 prerequisite explorers into 1 with pluggable client
Fix bare exception handlers with specific types + logging
Replace sys.path hacks with proper package imports
Remove global mutable state in web app (thread safety)
Merge scripts/ and tools/scripts/
Delete old mermaid generator versions and dead code

Phase 4: Testing & CI

Add mocks for API calls
Write unit tests for core logic (knowledge tree, enrichment)
Add error case tests
Create GitHub Actions workflow for lint + test

Phase 5: Build & Dependencies

Create pyproject.toml with optional dependency groups
Remove duplicate src/requirements.txt
Add version management
Move model names to config/environment

This review was generated by analyzing the full repository structure (159 Python files, 43,242 lines), reading all key files, and cross-referencing documentation against actual code.

FilesExpand file tree

CODE_REVIEW.md

Latest commit

History

CODE_REVIEW.md

File metadata and controls

Code Review: Math-To-Manim

Executive Summary

Critical Findings

1. All Showcase GIFs Are Broken (131-134 bytes)

2. README Contains Multiple Broken References

3. 52MB of Media Tracked in Git

4. output/ Directory Contains Generated Artifacts

Moderate Findings

5. Root Directory Clutter

6. src/ Directory Is Cluttered With Scratch Files

7. Duplicate Agent Implementations

8. Duplicate Mermaid Generators

9. Two Separate scripts/ Directories

10. giffolder/ and public/ Serve Duplicate Purpose

11. requirements.txt Issues

12. Documentation Sprawl in docs/

Low-Priority Findings

13. No CI/CD Pipeline

14. KimiK2.5Swarm/legacy/ Contains Old Code

15. CONTRIBUTING.md References DeepSeek and Placeholder Links

16. pytest.ini Ignores KimiK2Thinking (Old Name)

17. No Version Management

18. Star History Chart Position

Branding & Visual Recommendations

Proposed Implementation Phases

Phase 1: Root Cleanup & Fix Broken Assets

Phase 2: Fix README & Documentation

Phase 3: Code Organization

Phase 4: Build & Dependencies

Python Code Quality Findings

Critical: Error Handling

Critical: Security

Critical: Incomplete Implementations

Moderate: Code Architecture

Moderate: Testing

Low: Code Quality

Full Findings Summary

Proposed Implementation Phases (Updated)

Phase 1: Root Cleanup & Fix Broken Assets

Phase 2: Fix README & Documentation

Phase 3: Code Organization & Quality

Phase 4: Testing & CI

Phase 5: Build & Dependencies

4. `output/` Directory Contains Generated Artifacts

6. `src/` Directory Is Cluttered With Scratch Files

9. Two Separate `scripts/` Directories

10. `giffolder/` and `public/` Serve Duplicate Purpose

11. `requirements.txt` Issues

12. Documentation Sprawl in `docs/`

14. `KimiK2.5Swarm/legacy/` Contains Old Code

15. `CONTRIBUTING.md` References DeepSeek and Placeholder Links

16. `pytest.ini` Ignores `KimiK2Thinking` (Old Name)