Skip to content

Latest commit

 

History

History
398 lines (276 loc) · 15.4 KB

File metadata and controls

398 lines (276 loc) · 15.4 KB

Code Review: Math-To-Manim

Date: 2026-03-13 Branch: claude/code-review-Vi9bL Reviewer: Claude (automated deep review)


Executive Summary

Math-To-Manim is a mature research/production hybrid project with a novel six-agent reverse knowledge tree pipeline across three AI backends (Claude, Gemini, Kimi). The codebase has strong fundamentals but accumulated technical debt: root directory clutter, broken showcase assets, stale references, duplicate code, and 52MB of tracked media artifacts.

Severity breakdown: 8 critical, 14 moderate, 16 low-priority findings across structure, code quality, security, and testing.


Critical Findings

1. All Showcase GIFs Are Broken (131-134 bytes)

Location: giffolder/, public/ Impact: The "See It In Action" section of the README — the most important part for attracting users — displays nothing. Every GIF is a Git LFS pointer that was never resolved or is a corrupt placeholder.

Files affected:

  • public/BrownianFinance.gif (131 bytes)
  • public/Rhombicosidodecahedron.gif (134 bytes)
  • public/TeachingHopf.gif (133 bytes)
  • public/WhiskeringExchangeScene.gif (132 bytes)
  • All 12 files in giffolder/ (131-134 bytes each)

Fix: Regenerate real GIF/MP4 previews from the working examples, or replace with static PNG screenshots. The current showcase is effectively invisible.

2. README Contains Multiple Broken References

Location: README.md

Line Issue
33 Typo: "CLaude" → "Claude"
192 Pipeline 3 path: KimiK2Thinking/KimiK2.5Swarm/
207-211 All Kimi file paths reference wrong directory
276 skill/ directory doesn't exist; plugin is at .claude/plugins/math-to-manim/
289 Another KimiK2Thinking/ reference
371 skill/README.md link is broken

Fix: Update all path references to match actual directory structure.

3. 52MB of Media Tracked in Git

Location: media/review_frames/, media/images/ Impact: Repository size bloat. The .gitignore catches media/videos/ and media/Tex/ but misses:

  • media/review_frames/ — 586 PNG frame files
  • media/images/ — miscellaneous images
  • media/*.html — 2 HTML player files

Fix: Add these patterns to .gitignore and remove from tracking.

4. output/ Directory Contains Generated Artifacts

Location: output/ Impact: 352KB of JSON trees, narratives, Python scripts, and HTML files from pipeline test runs are committed. These are ephemeral outputs.

Fix: Add output/ to .gitignore and remove from tracking.


Moderate Findings

5. Root Directory Clutter

Files that don't belong at root level:

File Size Recommendation
ClaudeCodeManim.jpeg 2.7MB Move to public/hero.jpeg, compress
fourier_epicycles.py 18KB Move to examples/mathematics/
EulerTest.md 4KB Move to docs/specs/ or delete
skilltest.md 4KB Move to docs/specs/ or delete

6. src/ Directory Is Cluttered With Scratch Files

Location: src/ 20+ loose files mixed with production code:

  • Prompt drafts: complex_prompt.txt, geodesic_prompt.txt, taylor_prompt.txt, whiskering_prompt.txt
  • Generated outputs: finance_pipeline_output.json, teaching_pipeline_output.json
  • Config files: codingagent.json, planningagent.json
  • LaTeX notes: Taylor_Topology_Notes.tex
  • Temp files: raw_output.txt, STATUS.txt
  • Standalone scenes: finance_scene.py, geodesic_scene.py, taylor_scene.py, whiskering_scene.py

Fix: Move prompts to src/prompts/, outputs to output/, scenes to examples/, delete temp files.

7. Duplicate Agent Implementations

Location: src/agents/ Four versions of the prerequisite explorer exist:

  • prerequisite_explorer.py — original
  • enhanced_prerequisite_explorer.py — enhanced
  • improved_prerequisite_explorer.py — improved
  • prerequisite_explorer_claude.py — Claude SDK version

Plus two orchestrators: agent_orchestrator.py and orchestrator.py.

Fix: Consolidate to one canonical version per component, archive or delete the rest.

8. Duplicate Mermaid Generators

Location: tools/

  • generate_mermaid.py
  • generate_mermaid_v2.py
  • generate_mermaid_v3.py

Fix: Keep only v3, delete older versions.

9. Two Separate scripts/ Directories

  • Root scripts/check_env.py, debug_ffmpeg.py, regenerate_test.py
  • tools/scripts/generate_manim_from_tree.py, remove_emojis.py, run_pipeline_from_latex.py

Fix: Merge into a single scripts/ directory.

10. giffolder/ and public/ Serve Duplicate Purpose

Both directories store the same GIF files. giffolder/ appears to be a testing artifact.

Fix: Consolidate into public/ only, delete giffolder/.

11. requirements.txt Issues

  • claude-agent-sdk>=0.1.0 — package name may not match PyPI
  • nomic>=3.0.0 — commented as "optional" but will fail pip install -r for users who don't want it
  • latex>=0.7.0 — niche package that may conflict with system LaTeX
  • Separate src/requirements.txt exists — conflicting dependency specs
  • No pyproject.toml — modern Python projects should use one

Fix: Create pyproject.toml with optional dependency groups: pip install .[claude], pip install .[gemini], pip install .[kimi].

12. Documentation Sprawl in docs/

30+ files including:

  • 14 PDF files (research papers, generated docs) — bloat the repo
  • Overlapping guides: ARCHITECTURE.md, AGENT_ARCHITECTURE.md, AGENT_PIPELINE_GUIDE.md, AGENT_INSPECTION_GUIDE.md, PROJECT_STRUCTURE.md, REPOSITORY_ORGANIZATION.md
  • Internal planning docs in a public repo: MIGRATION_TO_CLAUDE.md, COMMUNICATION_STRATEGY.md, ROADMAP.md
  • CLAUDE.md is in docs/ but Claude Code looks for it at repo root

Fix: Consolidate overlapping docs, move CLAUDE.md to repo root, consider moving PDFs to releases or an external host.


Low-Priority Findings

13. No CI/CD Pipeline

No .github/workflows/ directory exists. Tests are only runnable locally.

Fix: Add a basic GitHub Actions workflow for linting and testing.

14. KimiK2.5Swarm/legacy/ Contains Old Code

Legacy code tracked in the main branch adds confusion.

Fix: Remove or archive in a separate branch.

15. CONTRIBUTING.md References DeepSeek and Placeholder Links

  • References Discord/Twitter links that appear to be placeholders
  • Mentions DeepSeek API setup (removed from active pipelines)

Fix: Update to current pipelines and real links.

16. pytest.ini Ignores KimiK2Thinking (Old Name)

Should ignore KimiK2.5Swarm instead.

Fix: Update norecursedirs in pytest.ini.

17. No Version Management

No VERSION file, no __version__ constant, no pyproject.toml version field.

Fix: Add version tracking.

18. Star History Chart Position

The star history SVG at the top of the README pushes actual content down.

Fix: Move to the bottom or into a collapsible section.


Branding & Visual Recommendations

  1. Hero image: Replace the 2.7MB screenshot JPEG with a clean, compressed branded graphic (SVG or optimized PNG)
  2. Showcase GIFs: Regenerate real animated previews — this is the #1 thing that sells the project to visitors
  3. Logo: Consider adding a simple SVG logo for brand recognition
  4. Badge cleanup: 10 badges across 2 rows is cluttered; keep Python, Manim, License, Stars
  5. README flow: Lead with the showcase (working GIFs), then architecture, then installation

Proposed Implementation Phases

Phase 1: Root Cleanup & Fix Broken Assets

  • Move stray files out of root
  • Fix/regenerate showcase GIFs
  • Add missing .gitignore patterns
  • Remove tracked generated artifacts

Phase 2: Fix README & Documentation

  • Fix all broken path references
  • Fix typos
  • Consolidate overlapping documentation
  • Move CLAUDE.md to repo root
  • Update CONTRIBUTING.md

Phase 3: Code Organization

  • Clean up src/ scratch files
  • Consolidate duplicate agent implementations
  • Merge scripts/ and tools/scripts/
  • Delete old mermaid generator versions

Phase 4: Build & Dependencies

  • Create pyproject.toml with optional dependency groups
  • Remove duplicate src/requirements.txt
  • Add GitHub Actions CI workflow
  • Add version management

Python Code Quality Findings

Critical: Error Handling

19. Bare Exception Handlers Multiple files use bare except: or except Exception that silently swallow all errors including KeyboardInterrupt and SystemExit:

  • KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py — multiple bare except: statements
  • KimiK2.5Swarm/agents/prerequisite_explorer.py — bare except: statements
  • KimiK2.5Swarm/kimi_client.py:256-263 — catches Exception as e and detects auth errors via fragile string matching
  • KimiK2.5Swarm/kimi_client.py:433-435 — swallows all errors into a string dict without logging

Fix: Replace bare excepts with specific exception types. Add structured logging.

20. Silent Failures in Tool Calling Loop KimiK2.5Swarm/kimi_client.py:420-451 — tool execution errors are caught and converted to JSON silently. Tool failures won't bubble up, making debugging impossible.

Critical: Security

21. Unsafe API Key Handling

  • src/app.py:10api_key=os.getenv("DEEPSEEK_API_KEY") passed directly without validation
  • src/app_claude.py:29 — key used directly, could appear in error messages or logs
  • KimiK2.5Swarm/config.py:212-213MOONSHOT_API_KEY stored as module-level constant (accessible via import config; config.MOONSHOT_API_KEY)

22. Hardcoded Model Names

  • src/app_claude.py:32CLAUDE_MODEL = "claude-opus-4-5-20251101" hardcoded
  • src/app.py:84model="deepseek-reasoner" hardcoded in function call

Fix: Move model names to config or environment variables.

Critical: Incomplete Implementations

23. Unimplemented Pipeline Stages src/agents/agent_orchestrator.py:149-169 — four of seven pipeline stages are marked with # TODO:

  • Mathematical enrichment
  • Visual design
  • Narrative composition
  • Code generation

The pipeline appears complete from the outside but 4/7 stages don't actually work.

24. Placeholder Code

  • src/agents/threejs_code_generator.py:247pass in visualize_threejs() method
  • src/app.py:15-16 — smolagent commented out as unavailable
  • src/agents/claude_agent_runtime.py — incomplete implementation

Moderate: Code Architecture

25. Five Prerequisite Explorer Implementations Zero code sharing between them despite ~80% logic overlap:

  • src/agents/prerequisite_explorer.py (392 lines) — uses DeepSeek API
  • src/agents/prerequisite_explorer_claude.py (467 lines) — uses Anthropic Claude
  • src/agents/enhanced_prerequisite_explorer.py — enhanced version
  • src/agents/improved_prerequisite_explorer.py — improved version
  • KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py — Kimi K2.5 version

Fix: Consolidate into one implementation with a pluggable API client pattern.

26. sys.path Hacks Instead of Proper Packages

  • KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py:19-24 — direct sys.path.insert() manipulations
  • src/agents/agent_orchestrator.py:33-50 — try/except cascade for imports

Fix: Use proper package structure with __init__.py and relative imports.

27. Global Mutable State (Thread Safety)

  • KimiK2.5Swarm/kimi_client.py:581_kimi_client: Optional[KimiClient] = None global singleton
  • KimiK2.5Swarm/tools/tool_registry.py:321_default_registry global singleton
  • src/agents/prerequisite_explorer_claude.py:46global CLI_CLIENT
  • src/app_claude.py:108global video_review_agent in a Gradio web app

Fix: Use dependency injection or proper scoped instances, especially in the web app where concurrent requests will collide.

28. Mixed Sync/Async Patterns Some classes have both execute() and execute_async(), some have only sync, some only async. No consistency across the codebase.

Moderate: Testing

29. ~2% Test Coverage Only 6 test files for 159 Python files (43,242 lines of code):

  • tests/test_prerequisite_explorer.py — 100 lines, partial coverage
  • tests/test_agent_pipeline.py — script-style, not proper pytest
  • tests/test_tool.py — incomplete
  • tests/conftest.py — only 3 fixtures
  • tests/unit/ and tests/integration/ and tests/e2e/ — directories exist but appear empty

30. Tests Require Live API Keys tests/conftest.py skips tests if ANTHROPIC_API_KEY not set. No mocking or fixtures exist. All tests are effectively integration tests.

Fix: Add mocks for API calls, separate unit tests from integration tests.

31. No Error Case Testing No tests for:

  • Invalid inputs or malformed JSON responses
  • API failures, timeouts, rate limits
  • Empty prerequisites, circular dependencies, max depth reached

Low: Code Quality

32. Dead Imports and Commented-Out Code Multiple files have unused imports and commented-out code blocks throughout.

33. Magic Numbers

  • src/agents/prerequisite_explorer_claude.py:90max_depth=4 hardcoded
  • KimiK2.5Swarm/config.py:59max_tokens=4096 hardcoded

34. Long Functions Violating Single Responsibility

  • src/agents/orchestrator.py — 650+ lines
  • src/agents/narrative_composer.py — 450+ lines

35. Inconsistent Logging Some files use logger.debug(), others use print(). KimiK2.5Swarm/agents/prerequisite_explorer_kimi.py has 15+ print statements mixed with logger calls.

36. No Caching Between API Calls KimiK2.5Swarm/agents/prerequisite_explorer.py:103 has a TTL cache but it's not used consistently. Exploring the same concept at different depths re-fetches from the API.


Full Findings Summary

Category Critical Moderate Low Total
Repository Structure 4 8 6 18
Error Handling 2 0 0 2
Security 2 0 0 2
Incomplete Code 2 0 0 2
Architecture 0 4 0 4
Testing 0 3 0 3
Code Quality 0 0 5 5
Total 8 14 14 36

Proposed Implementation Phases (Updated)

Phase 1: Root Cleanup & Fix Broken Assets

  • Move stray files out of root
  • Fix/regenerate showcase GIFs
  • Add missing .gitignore patterns
  • Remove tracked generated artifacts

Phase 2: Fix README & Documentation

  • Fix all broken path references and typos
  • Consolidate overlapping documentation
  • Move CLAUDE.md to repo root
  • Update CONTRIBUTING.md

Phase 3: Code Organization & Quality

  • Clean up src/ scratch files
  • Consolidate 5 prerequisite explorers into 1 with pluggable client
  • Fix bare exception handlers with specific types + logging
  • Replace sys.path hacks with proper package imports
  • Remove global mutable state in web app (thread safety)
  • Merge scripts/ and tools/scripts/
  • Delete old mermaid generator versions and dead code

Phase 4: Testing & CI

  • Add mocks for API calls
  • Write unit tests for core logic (knowledge tree, enrichment)
  • Add error case tests
  • Create GitHub Actions workflow for lint + test

Phase 5: Build & Dependencies

  • Create pyproject.toml with optional dependency groups
  • Remove duplicate src/requirements.txt
  • Add version management
  • Move model names to config/environment

This review was generated by analyzing the full repository structure (159 Python files, 43,242 lines), reading all key files, and cross-referencing documentation against actual code.