Skip to content

OpenCut v1.4.0 — Full Competitive Upgrade

Choose a tag to compare

@SysAdminDoc SysAdminDoc released this 19 Mar 06:26
· 166 commits to main since this release

What's New in v1.4.0

Major release: 74 bug fixes + 25 competitive upgrades across every feature area. OpenCut is now using cutting-edge 2025-2026 AI models throughout.

Bug Fixes (Batches 24-27 — 74 fixes)

  • Batch 25 (25 fixes): Crash fix for batch denoise/normalize, rate limit slot leak on TooManyJobsError (4 routes), XSS via innerHTML, queue allowlist +15 endpoints, SSE CORS, CSS animation GPU optimization, MutationObserver leaks, event delegation, version sync expanded to 17 targets
  • Batch 24 (20 fixes): styled_captions FFmpeg timeout/kill, animated_captions VideoWriter+fps guard, motion_graphics ffprobe+semicolon injection, color_management merged filters, style_transfer intensity clamp, installer Close button + path/disk validation
  • Batch 26 (12 fixes): Dynamic button type (5 sites), ExtendScript project guards, showRecentClips race, ProcessKiller wmic→PowerShell CIM, job history delegation, captions rate limit async fix
  • Batch 27 (17 fixes): MCP server path injection + CSRF refresh, SeamlessM4T per-segment OOM fix, transcript cache Segment attrs, BasicVSR++ missing weights error, ACE-Step result validation, queue allowlist +4, TTS engine allowlist, LUT blend size check, PySceneDetect start boundary

Phase 1 — Quick Wins

  • Background removal default → BiRefNet — dramatically sharper edges
  • Whisper default → Turbo — 6x faster transcription out of the box
  • Distil-Whisper models — distil-large-v3.5 and distil-large-v3 (6x faster, 49% smaller)

Phase 2 — Dependency Upgrades

  • Audio separation: python-audio-separator with Mel-Band RoFormer, BS-RoFormer, SCNet (Demucs archived)
  • Speech enhancement: ClearerVoice-Studio (MossFormer2/FRCRN) — 48kHz, actively maintained
  • Style transfer: Arbitrary style from ANY reference image via AdaIN color transfer
  • Object removal: ProPainter video inpainting — temporally coherent
  • Face enhancement: CodeFormer alongside GFPGAN — tunable fidelity slider
  • Face detection: InsightFace buffalo_l — highest accuracy detector

Phase 3 — New Capabilities

  • ACE-Step 1.5 music generation — full songs WITH vocals+lyrics, 10x faster, <4GB VRAM, Apache 2.0
  • Chatterbox TTS + voice cloning — zero-shot from 5s audio, emotion control, 23 languages. Three-tier: edge-tts / Kokoro / Chatterbox
  • PySceneDetect — fast heuristic scene detection
  • SeamlessM4T v2 translation — 20% BLEU improvement, ~100 languages
  • BasicVSR++ video denoising — GPU temporal propagation across frames
  • NLP keyword auto-emphasis — TF-IDF frequency analysis for caption emphasis
  • AI color grading — LAB perceptual percentile matching for LUT generation
  • LUT blending — interpolate between any two .cube LUTs with a slider
  • Remotion motion graphics — premium animated titles via React/Remotion CLI

Phase 4 — Architecture

  • MCP server — 10-tool Model Context Protocol server for MCP-compatible editor integration
  • Vision-augmented highlights — keyframe sampling + LLM analysis
  • Transcript caching — transcribe once, reuse across all operations
  • Version syncsync_version.py covers 17 files

Full Changelog: v1.3.1...v1.4.0