feat: add ModelsLab as TTS and Video generation provider by adhikjoshi · Pull Request #3288 · simstudioai/sim

adhikjoshi · 2026-02-21T16:04:50Z

Summary

This PR adds ModelsLab as a new provider for both Text-to-Speech (TTS) and Video generation in Sim Studio.

Changes

New Files

apps/sim/tools/tts/modelslab.ts — TTS tool routing to /api/tools/tts/unified
apps/sim/tools/video/modelslab.ts — Video tool routing to /api/tools/video

Modified Files

apps/sim/tools/tts/index.ts — Export modelsLabTtsTool
apps/sim/tools/tts/types.ts — Add 'modelslab' to TtsProvider union type
apps/sim/tools/video/index.ts — Export modelsLabVideoTool
apps/sim/tools/video/types.ts — Add 'modelslab' to VideoParams.provider + new fields (imageUrl, width, height, num_frames)
apps/sim/blocks/blocks/tts.ts — Add ModelsLab to provider dropdown with voice, language, and speed sub-blocks
apps/sim/blocks/blocks/video_generator.ts — Add ModelsLab to provider dropdown with mode, imageUrl, width, height sub-blocks (both V1 and V2 blocks)
apps/sim/app/api/tools/tts/unified/route.ts — Add synthesizeWithModelsLab() with async polling
apps/sim/app/api/tools/video/route.ts — Add generateWithModelsLab() with async polling
apps/sim/tools/registry.ts — Register tts_modelslab and video_modelslab
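
As a rough illustration of the type changes listed above, the sketch below shows how the union and the new optional fields might look. Only the 'modelslab' member and the field names imageUrl, width, height, and num_frames come from this description; the other provider names ('openai', 'runway') and the prompt field are hypothetical placeholders, not the repository's actual definitions.

```typescript
// Hypothetical sketch: 'openai' and 'runway' stand in for whatever providers already exist.
type TtsProvider = 'openai' | 'modelslab'

interface VideoParams {
  provider: 'runway' | 'modelslab'
  prompt: string // assumed field, not named in the PR description
  // Fields the PR description says were added for ModelsLab:
  imageUrl?: string
  width?: number
  height?: number
  num_frames?: number
}

const ttsProvider: TtsProvider = 'modelslab'
const videoParams: VideoParams = {
  provider: 'modelslab',
  prompt: 'a red kite over a beach',
  width: 512,
  height: 512,
}
```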

ModelsLab API

Base URL: https://modelslab.com/api/v6/
Auth: { "key": "API_KEY" } in JSON request body
TTS endpoint: POST /voice/text_to_speech → async polling via POST /voice/fetch
Video (text2video): POST /video/text2video → async polling via POST /video/fetch/{id}
Video (img2video): POST /video/img2video → async polling via POST /video/fetch/{id}
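
The submit-then-poll flow described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: pollUntilDone, JobResponse, and the injected fetchStatus and sleep callbacks are hypothetical names, and the response shape (a status string plus a single output URL) is inferred from the review's sequence diagram rather than from ModelsLab documentation.

```typescript
type JobStatus = 'success' | 'processing' | 'error' | 'failed'

interface JobResponse {
  status: JobStatus
  output?: string // assumed: URL of the finished audio/video file
}

// Repeatedly asks for the job status until it succeeds, fails, or attempts run out.
async function pollUntilDone(
  fetchStatus: () => Promise<JobResponse>,
  maxAttempts: number,
  intervalMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await fetchStatus()
    if (job.status === 'success' && job.output) return job.output
    if (job.status === 'error' || job.status === 'failed') {
      throw new Error(`ModelsLab job ended with status "${job.status}"`)
    }
    // Still processing (or success without an output yet): wait, then retry.
    await sleep(intervalMs)
  }
  throw new Error(`ModelsLab job did not finish after ${maxAttempts} attempts`)
}

// Possible wiring (not executed here). The API key travels in the JSON body:
// const fetchStatus = () =>
//   fetch(`https://modelslab.com/api/v6/video/fetch/${id}`, {
//     method: 'POST',
//     headers: { 'Content-Type': 'application/json' },
//     body: JSON.stringify({ key: apiKey }),
//   }).then((res) => res.json() as Promise<JobResponse>)
```

Injecting fetchStatus and sleep keeps the loop independent of the network and of real timers, which makes the attempt limit easy to exercise in a unit test.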

TTS Features

Voice selection (Madison, Joanna, Matthew, Salli, etc.)
Language support (English, Spanish, French, German, Italian, Portuguese, Hindi, Japanese, Chinese)
Speed control (0.5–2.0)
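
One way to keep the speed value inside the documented 0.5–2.0 range is a small clamp, sketched below. Whether the PR clamps, rejects, or passes through out-of-range values is not stated here, so clampSpeed and the default of 1.0 are assumptions made for illustration.

```typescript
const MIN_SPEED = 0.5
const MAX_SPEED = 2.0
const DEFAULT_SPEED = 1.0 // assumed default, not stated in the PR

// Returns a speed inside [MIN_SPEED, MAX_SPEED], falling back to the default
// when no usable number is supplied.
function clampSpeed(speed: number | undefined): number {
  if (speed === undefined || Number.isNaN(speed)) return DEFAULT_SPEED
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, speed))
}
```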

Video Features

Text-to-Video mode
Image-to-Video mode (with image URL input)
Configurable width and height (256/512/768/1024px)
Async job polling with timeout
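
The four width and height options listed above lend themselves to a simple allow-list check. The sketch below is illustrative only: ALLOWED_SIZES and isAllowedSize are hypothetical names, and the PR may enforce the options purely through the block's dropdown rather than in code.

```typescript
const ALLOWED_SIZES = [256, 512, 768, 1024] as const
type VideoSize = (typeof ALLOWED_SIZES)[number]

// Type guard: narrows an arbitrary number to one of the supported pixel sizes.
function isAllowedSize(value: number): value is VideoSize {
  return (ALLOWED_SIZES as readonly number[]).includes(value)
}
```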

- Add tools/tts/modelslab.ts: TTS tool using ModelsLab voice API
- Add tools/video/modelslab.ts: Video generation tool (text2video/img2video)
- Update tools/tts/index.ts: export modelsLabTtsTool
- Update tools/video/index.ts: export modelsLabVideoTool
- Update tools/tts/types.ts: add 'modelslab' to TtsProvider union
- Update tools/video/types.ts: add 'modelslab' to VideoParams provider + ModelsLab fields
- Update blocks/blocks/tts.ts: add ModelsLab provider option + voice/language/speed sub-blocks
- Update blocks/blocks/video_generator.ts: add ModelsLab provider + mode/imageUrl/width/height sub-blocks
- Update app/api/tools/tts/unified/route.ts: add synthesizeWithModelsLab() with async polling
- Update app/api/tools/video/route.ts: add generateWithModelsLab() with async polling
- Update tools/registry.ts: register tts_modelslab and video_modelslab tools

ModelsLab API: https://modelslab.com/api/v6/

- TTS: POST /voice/text_to_speech with async polling via /voice/fetch
- Video: POST /video/text2video or /video/img2video with async polling via /video/fetch/{id}
- Auth: 'key' field in JSON body

vercel · 2026-02-21T16:04:55Z

@adhikjoshi is attempting to deploy a commit to the Sim Team on Vercel.

A member of the Team first needs to authorize it.

greptile-apps · 2026-02-21T16:07:40Z

Greptile Summary

Added ModelsLab as a new provider for both TTS and video generation. The integration follows established patterns with async polling for job completion.

TTS: supports voice selection, language, and speed control via /api/v6/voice/text_to_speech endpoint
Video: supports text-to-video and image-to-video modes via /api/v6/video/text2video and /api/v6/video/img2video
Both implementations use async polling with timeouts (30 attempts for TTS, 60 for video)
API keys correctly use user-only visibility per custom rule 2851870a
Properly registered in tool registry and exported from index files

Issue found: img2video mode missing required validation - when mode === 'img2video', imageUrl parameter is required but not validated before making the API call, which could result in API errors.

Confidence Score: 4/5

Safe to merge with one validation fix needed
Implementation follows existing patterns and adheres to style guides. API keys use correct visibility. One logical issue exists: missing validation for required imageUrl in img2video mode could cause runtime errors. Otherwise, code is well-structured with proper error handling and async polling.
Pay close attention to apps/sim/app/api/tools/video/route.ts - add validation for img2video mode

Important Files Changed

Filename	Overview
apps/sim/tools/tts/modelslab.ts	new ModelsLab TTS tool with correct API key visibility and proper error handling
apps/sim/tools/video/modelslab.ts	new ModelsLab video tool with proper structure and API key visibility
apps/sim/app/api/tools/tts/unified/route.ts	added ModelsLab TTS synthesis with async polling, proper error handling
apps/sim/app/api/tools/video/route.ts	added ModelsLab video generation but missing required validation for img2video mode
apps/sim/blocks/blocks/tts.ts	added ModelsLab provider option with voice, language, and speed configuration
apps/sim/blocks/blocks/video_generator.ts	added ModelsLab provider to both V1 and V2 blocks with mode, dimensions, and imageUrl config

Sequence Diagram

sequenceDiagram
    participant User
    participant Block as TTS/Video Block
    participant Tool as ModelsLab Tool
    participant API as API Route
    participant ModelsLab as ModelsLab API

    User->>Block: Configure provider=modelslab
    Block->>Tool: Execute with params
    Tool->>API: POST /api/tools/tts/unified or /api/tools/video
    API->>ModelsLab: POST text_to_speech or text2video/img2video
    ModelsLab-->>API: {status: processing, id: xxx}
    
    loop Poll until complete (30-60 attempts)
        API->>ModelsLab: POST /voice/fetch or /video/fetch/{id}
        alt Success
            ModelsLab-->>API: {status: success, output: url}
            API->>ModelsLab: Download audio/video from URL
            ModelsLab-->>API: Binary data
        else Still Processing
            ModelsLab-->>API: {status: processing}
        else Error
            ModelsLab-->>API: {status: error/failed}
        end
    end
    
    API-->>Tool: audioUrl/videoUrl
    Tool-->>Block: Response with file
    Block-->>User: Generated audio/video

Last reviewed commit: 30de635

greptile-apps

11 files reviewed, 1 comment

greptile-apps · 2026-02-21T16:07:43Z

apps/sim/app/api/tools/video/route.ts

+  if (isImg2Video && imageUrl) {
+    requestBody.init_image = imageUrl
+  }


Missing validation for img2video mode: imageUrl is required when mode === 'img2video', but it is not validated before the API call.

Suggested change

-  if (isImg2Video && imageUrl) {
-    requestBody.init_image = imageUrl
-  }
+  if (isImg2Video && !imageUrl) {
+    throw new Error('imageUrl is required for img2video mode')
+  }
+  if (isImg2Video && imageUrl) {

adhikjoshi · 2026-02-23T22:45:03Z

Thanks for the review! I will add the img2video validation fix shortly.

adhikjoshi · 2026-02-23T22:49:50Z

I created a fix PR for the img2video validation: adhikjoshi#2

The fix adds:

if (isImg2Video && !imageUrl) {
  throw new Error("imageUrl is required for img2video mode")
}

Would you like me to close that PR so you can cherry-pick or merge the commit, or would you prefer another approach?

adhikjoshi · 2026-02-23T22:51:20Z

Fix Applied

I have pushed the fix for the img2video validation to my fork: adhikjoshi#2

The fix adds this validation before the API call:

if (isImg2Video && !imageUrl) {
  throw new Error('imageUrl is required for img2video mode')
}

You can either:

Pull commit 77b846b from adhikjoshi/sim@fix/modelslab-img2video-validation
Or let me know if you'd like me to update the PR directly

Sorry for the delay - I had to work around GitHub branch restrictions.

When mode is set to 'img2video', the imageUrl parameter is required. This validation was flagged in the Greptile review and is now fixed.
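
To show where this check sits relative to building the request, here is a minimal sketch. The key and init_image field names and the error message come from this thread; buildVideoRequestBody, VideoRequestInput, and the prompt field are hypothetical and are not taken from the route's actual code.

```typescript
interface VideoRequestInput {
  apiKey: string
  prompt: string // assumed field name
  mode: 'text2video' | 'img2video'
  imageUrl?: string
}

// Validates the input first, then assembles the JSON body sent to ModelsLab.
function buildVideoRequestBody(input: VideoRequestInput): Record<string, unknown> {
  const isImg2Video = input.mode === 'img2video'

  // Fail fast, before any network call, when the required image is missing.
  if (isImg2Video && !input.imageUrl) {
    throw new Error('imageUrl is required for img2video mode')
  }

  const requestBody: Record<string, unknown> = {
    key: input.apiKey, // auth travels in the JSON body
    prompt: input.prompt,
  }
  if (isImg2Video) {
    requestBody.init_image = input.imageUrl
  }
  return requestBody
}
```

Running the check before the body is built means a misconfigured block fails with a clear local message instead of an opaque upstream API error.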

cursor · 2026-03-06T05:37:20Z

PR Summary

Medium Risk
Adds new third-party ModelsLab integrations in the TTS and video proxy APIs, including async job polling and downloading remote media, which can affect reliability/timeouts and error handling paths.

Overview
ModelsLab is now supported as a new provider for both Text-to-Speech and video generation.

The unified TTS and video API routes add modelslab branches that call ModelsLab endpoints, poll async jobs to completion, and download the resulting media before storing it like other providers.

Tooling/UI is extended to expose ModelsLab in the TTS and Video Generator blocks (voice/language/speed for TTS; text2video/img2video mode plus image URL and dimensions for video), and the new tts_modelslab/video_modelslab tools are added to exports and registered in the tool registry/types.

Written by Cursor Bugbot for commit 3f9cda1. This will update automatically on new commits.

vercel · 2026-03-06T05:37:22Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
docs	Skipped		Mar 6, 2026 5:37am

adhikjoshi · 2026-03-06T05:37:24Z

Fixed! I've pushed the img2video validation fix to the branch (commit 3f9cda1).

The validation now throws a clear error, "imageUrl is required for img2video mode", when mode is set to 'img2video' but no imageUrl is provided.

The fix is a simple validation check before making the API call:

if (isImg2Video && !imageUrl) {
  throw new Error('imageUrl is required for img2video mode')
}

Let me know if any other changes are needed!

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

cursor · 2026-03-06T05:39:46Z

apps/sim/app/api/tools/tts/unified/route.ts

+      }
+
+      attempts++
+    }


TTS polling timeout exceeds route max duration

High Severity

The synthesizeWithModelsLab polling loop can run for up to 90 seconds (30 attempts × 3-second intervals), but the TTS unified route has maxDuration = 60 seconds. The serverless function will be terminated before the polling loop completes, causing a silent failure or platform timeout error for any ModelsLab TTS request that goes async. The video route correctly uses maxDuration = 600 for its 5-minute polling window.
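
The arithmetic behind this finding can be sketched as below. The helper names and the 10-second reserve for the initial request and the download are assumptions for illustration; only the 30 attempts, the 3-second interval, and the 60-second maxDuration come from the comment above.

```typescript
// Worst-case time spent waiting between polls.
function worstCasePollSeconds(maxAttempts: number, intervalSeconds: number): number {
  return maxAttempts * intervalSeconds
}

// Largest attempt count whose worst case still fits the route's time budget,
// after setting aside reserveSeconds for non-polling work.
function maxAttemptsWithin(
  budgetSeconds: number,
  intervalSeconds: number,
  reserveSeconds = 0
): number {
  return Math.max(0, Math.floor((budgetSeconds - reserveSeconds) / intervalSeconds))
}

const ttsWorstCase = worstCasePollSeconds(30, 3) // 90 seconds, above the 60-second limit
const ttsSafeAttempts = maxAttemptsWithin(60, 3, 10) // 16 attempts with a 10-second reserve
```

Under these numbers, either the attempt count would need to drop or the route's maxDuration would need to rise for the polling window to fit.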

Additional Locations (1)

apps/sim/app/api/tools/tts/unified/route.ts#L23-L24

cursor · 2026-03-06T05:39:46Z

apps/sim/app/api/tools/tts/unified/route.ts

+      format: 'mp3',
+      mimeType: 'audio/mpeg',
+    }
+  }


Duplicate code in TTS and video success handlers

Low Severity

The audio/video download-and-return logic is fully duplicated in both the polling-success and immediate-success branches of synthesizeWithModelsLab (TTS) and generateWithModelsLab (video). Each function contains two identical blocks that fetch the output URL, convert to a Buffer, and return the result. Extracting a small helper within each function would eliminate this duplication.
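
A helper along the lines suggested above might look like the following sketch. downloadOutput, MinimalResponse, and Fetcher are hypothetical names; the fetcher is injected so the example runs without network access, and it returns a Uint8Array where the route reportedly converts to a Node Buffer.

```typescript
// Minimal slice of the fetch Response interface that the helper relies on.
interface MinimalResponse {
  ok: boolean
  status: number
  arrayBuffer(): Promise<ArrayBuffer>
}

type Fetcher = (url: string) => Promise<MinimalResponse>

// Downloads the generated media once, so both the immediate-success and the
// polling-success branches can share the same code path.
async function downloadOutput(url: string, fetcher: Fetcher): Promise<Uint8Array> {
  const res = await fetcher(url)
  if (!res.ok) {
    throw new Error(`Failed to download ModelsLab output: ${res.status}`)
  }
  return new Uint8Array(await res.arrayBuffer())
}
```

In a Node route, passing the global fetch as the fetcher and wrapping the result with Buffer.from would give the same behavior as the two duplicated blocks.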

Additional Locations (2)

apps/sim/app/api/tools/tts/unified/route.ts#L823-L836

apps/sim/app/api/tools/video/route.ts#L1018-L1034

adhikjoshi changed the title ~~feat: Add ModelsLab as TTS and Video generation provider~~ feat: add ModelsLab as TTS and Video generation provider Feb 21, 2026

greptile-apps bot reviewed Feb 21, 2026

adhikjoshi mentioned this pull request Feb 23, 2026

fix: add imageUrl validation for img2video mode adhikjoshi/sim#2

Open

adhikjoshi mentioned this pull request Feb 23, 2026

fix: add imageUrl validation for img2video mode #3314

Closed

fix: add imageUrl validation for img2video mode

3f9cda1

When mode is set to 'img2video', the imageUrl parameter is required. This validation was flagged in the Greptile review and is now fixed.

vercel bot temporarily deployed to Preview March 6, 2026 05:37 Inactive

cursor bot reviewed Mar 6, 2026
