
fix: bypass broken workflow package for self-hosted transcription and AI generation#1630

Open
oaris-dev wants to merge 2 commits into CapSoftware:main from oaris-dev:feature/fix-self-hosted-transcription

Conversation

oaris-dev commented Feb 24, 2026

Summary

Fixes self-hosted transcription and AI generation by bypassing the workflow package's broken [local world] mode. Resolves #1550.

  • lib/transcribe.ts: Replace start(transcribeVideoWorkflow, ...) with direct transcribeVideoDirect() that validates, extracts audio, calls Deepgram, saves VTT, and cleans up
  • lib/generate-ai.ts: Replace start(generateAiWorkflow, ...) with direct generateAiDirect() that fetches transcript, calls AI APIs (Groq → OpenAI fallback), and saves metadata
  • actions/videos/get-status.ts: Set PROCESSING in DB before firing transcription (prevents re-trigger loops on every 2s poll), add 3-minute stale PROCESSING → ERROR timeout
  • HomePage components: Fix TypeScript strict mode errors in 4 files (unrelated, split into separate commit)
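The Groq-first, OpenAI-fallback order described above can be sketched like this (the `ChatClient` interface and function names are illustrative stand-ins, not the real SDK types used in the PR):

```typescript
interface ChatClient {
	complete(prompt: string): Promise<string>;
}

// Try the primary client (Groq, when GROQ_API_KEY is set); on any failure,
// fall back to the secondary client (OpenAI).
async function generateWithFallback(
	primary: ChatClient | null,
	fallback: ChatClient,
	prompt: string,
): Promise<string> {
	if (primary) {
		try {
			return await primary.complete(prompt);
		} catch {
			// swallow the primary error and fall through to the fallback
		}
	}
	return fallback.complete(prompt);
}
```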

Context

The workflow package's (4.0.1-beta.42) [local world] mode crashes with TypeError: Cannot perform ArrayBuffer.prototype.slice on a detached ArrayBuffer on all Node versions tested (20, 22, 24). This breaks transcription and AI generation for every self-hosted Docker deployment (7+ users confirmed in #1550).

The root cause: start() resolves successfully, but the workflow crashes asynchronously in the background queue. The catch block in the caller never fires, so transcriptionStatus stays null → the 2-second polling loop re-triggers transcription indefinitely.
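A minimal sketch of that failure mode (names are illustrative, not the actual workflow package internals): start() resolves before the queued work runs, so a crash in the queue never reaches the caller's catch:

```typescript
// Stand-in for the broken local-world start(): it schedules the workflow
// in a background queue and resolves immediately.
async function start(workflow: () => Promise<void>): Promise<void> {
	queueMicrotask(() => {
		workflow().catch(() => {
			// the queue swallows the error; the caller never sees it
		});
	});
}

let callerCatchFired = false;
let transcriptionStatus: string | null = null; // stays null -> poll re-triggers

async function trigger(): Promise<void> {
	try {
		await start(async () => {
			throw new TypeError("detached ArrayBuffer"); // crashes after start() resolved
		});
	} catch {
		callerCatchFired = true; // unreachable: the rejection happens out-of-band
		transcriptionStatus = "ERROR";
	}
}
```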

Cap Cloud uses a distributed workflow runner (web-cluster via WORKFLOWS_RPC_URL) which doesn't have this issue. The workflow files are preserved so Cap Cloud continues to work unchanged.

We've tested this on a self-hosted staging instance (ghcr.io/oaris-dev/cap-web:staging) with Deepgram + OpenAI configured — transcription, AI generation, NO_AUDIO detection, and stale timeout all work correctly.

Test plan

  • Deploy self-hosted Docker image with DEEPGRAM_API_KEY configured
  • Upload a video with audio → verify transcription completes (VTT generated, status COMPLETE)
  • Verify AI generation completes if GROQ_API_KEY or OPENAI_API_KEY is set
  • Upload a video without audio → verify NO_AUDIO status (not stuck at PROCESSING)
  • Verify stale PROCESSING timeout triggers after 3 minutes
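The stale-timeout check in the last item can be expressed as a small pure predicate (a sketch; the PR implements this inline against video.updatedAt, and the 3-minute constant mirrors the description above):

```typescript
// A record stuck in PROCESSING longer than the threshold is considered stale.
const STALE_MS = 3 * 60 * 1000; // 3 minutes

function isStaleProcessing(
	status: string | null,
	updatedAt: Date,
	now: Date = new Date(),
): boolean {
	return status === "PROCESSING" && now.getTime() - updatedAt.getTime() > STALE_MS;
}
```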

🤖 Generated with Claude Code

Greptile Summary

Bypasses broken workflow package by replacing async workflow calls with direct synchronous implementations for transcription and AI generation in self-hosted deployments. The workflow package's [local world] mode crashes with ArrayBuffer detachment errors on all Node versions, breaking all self-hosted Docker instances.

Major changes:

  • transcribe.ts: Direct implementation validates video, extracts audio (via media server or FFmpeg), calls Deepgram API, saves VTT to S3, and updates DB status
  • generate-ai.ts: Direct implementation fetches VTT from S3, chunks transcript, calls Groq (with OpenAI fallback), generates title/summary/chapters, and updates metadata
  • get-status.ts: Sets PROCESSING status in DB before firing transcription (prevents infinite polling loops), adds 3-minute stale timeout for stuck PROCESSING records
  • HomePage components: TypeScript strict mode fixes for array access safety (unrelated to workflow fix)
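The VTT-parsing step mentioned for generate-ai.ts can be sketched minimally (the cue shape and function name are illustrative; the PR's actual parseVttWithTimestamps may differ):

```typescript
interface Cue { start: string; end: string; text: string; }

// Split a WebVTT document into cue blocks, keeping only blocks that carry a
// "start --> end" timing line, and join their payload lines into one string.
function parseVtt(vtt: string): Cue[] {
	const cues: Cue[] = [];
	for (const block of vtt.split(/\n\n+/)) {
		const lines = block.trim().split("\n");
		const timing = lines.find((l) => l.includes("-->"));
		if (!timing) continue; // header or NOTE block
		const [start, end] = timing.split("-->").map((s) => s.trim());
		const text = lines.slice(lines.indexOf(timing) + 1).join(" ").trim();
		if (text) cues.push({ start: start!, end: end!, text });
	}
	return cues;
}
```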

Issues found:

  • Missing cleanup in error path for extracted audio file (transcribe.ts:198-203)
  • Race condition with duplicate PROCESSING status writes between get-status and transcribe function
  • Timeout logic uses generic updatedAt which could trigger incorrectly if video metadata changes
  • Synchronous processing blocks server actions for 30-60+ seconds (acceptable for self-hosted, but consider background jobs)

Confidence Score: 3/5

  • Safe for self-hosted deployments but has minor resource cleanup and race condition issues
  • Core workflow bypass logic is sound and solves the critical self-hosted breakage. However, missing error cleanup in audio extraction could leak temp files, race condition creates redundant DB writes, and timeout logic could misfire. Synchronous processing is acceptable for self-hosted but may cause connection timeouts on slow networks. TypeScript fixes are clean.
  • Pay close attention to apps/web/lib/transcribe.ts (cleanup leak) and apps/web/actions/videos/get-status.ts (race condition, timeout logic)

Important Files Changed

  • apps/web/lib/transcribe.ts: Replaces workflow with direct transcription; missing error cleanup for audio buffer in media server path
  • apps/web/lib/generate-ai.ts: Replaces workflow with direct AI generation; comprehensive implementation with proper fallback handling
  • apps/web/actions/videos/get-status.ts: Adds PROCESSING status before transcription trigger and 3-minute timeout; potential race condition on status update

Sequence Diagram

sequenceDiagram
    participant Client as Client (2s poll)
    participant GetStatus as get-status.ts
    participant Transcribe as transcribe.ts
    participant AI as generate-ai.ts
    participant Deepgram
    participant Groq
    participant S3
    participant DB

    Client->>GetStatus: getVideoStatus(videoId)
    GetStatus->>DB: Check transcriptionStatus
    
    alt Status is null
        GetStatus->>DB: Set PROCESSING
        GetStatus-->>Client: Return PROCESSING
        GetStatus->>Transcribe: transcribeVideo(..., _isRetry=true)
        Note over GetStatus,Transcribe: Fire-and-forget (catch)
    end
    
    Transcribe->>DB: Set PROCESSING (redundant)
    Transcribe->>S3: getSignedObjectUrl(video)
    Transcribe->>Transcribe: Extract audio (FFmpeg/MediaServer)
    Transcribe->>Deepgram: transcribeFile(audioBuffer)
    Deepgram-->>Transcribe: DeepgramResult
    Transcribe->>Transcribe: formatToWebVTT
    Transcribe->>S3: putObject(transcription.vtt)
    Transcribe->>DB: Set COMPLETE
    
    alt aiGenerationEnabled
        Transcribe->>AI: startAiGeneration(videoId)
        AI->>DB: Set aiGenerationStatus=PROCESSING
        AI->>S3: getObject(transcription.vtt)
        AI->>AI: parseVTT, chunkTranscript
        
        loop for each chunk
            AI->>Groq: chat.completions.create
            alt Groq fails
                AI->>Groq: Fallback to OpenAI
            end
            Groq-->>AI: AI summary chunk
        end
        
        AI->>AI: Merge chunks, dedupe chapters
        AI->>DB: Update metadata (title, summary, chapters)
        AI->>DB: Set aiGenerationStatus=COMPLETE
    end
    
    Client->>GetStatus: Poll again (2s later)
    alt PROCESSING > 3 minutes
        GetStatus->>DB: Set ERROR (timeout)
        GetStatus-->>Client: Return ERROR
    else
        GetStatus-->>Client: Return current status
    end

Last reviewed commit: ce2819d


oaris-dev and others added 2 commits February 24, 2026 11:24
… AI generation

The `workflow` package (4.0.1-beta.42) `[local world]` mode crashes with
`TypeError: Cannot perform ArrayBuffer.prototype.slice on a detached ArrayBuffer`
on all Node versions (20, 22, 24), breaking transcription and AI generation
for every self-hosted Docker deployment. See CapSoftware#1550.

This replaces `start(transcribeVideoWorkflow, ...)` and
`start(generateAiWorkflow, ...)` with direct async function calls that
perform the same operations without workflow/step directives.

Changes:
- lib/transcribe.ts: Replace workflow dispatch with transcribeVideoDirect()
  that validates, extracts audio, calls Deepgram, saves VTT, and cleans up
- lib/generate-ai.ts: Replace workflow dispatch with generateAiDirect()
  that fetches transcript, calls AI APIs, and saves metadata
- actions/videos/get-status.ts: Set PROCESSING before firing transcription
  to prevent re-trigger loops, add 3-minute stale PROCESSING timeout

The workflow files are preserved so Cap Cloud's distributed execution
(via web-cluster/WORKFLOWS_RPC_URL) continues to work unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix 'possibly undefined' errors in IntersectionObserver callbacks and
array indexing across four HomePage components:

- InstantModeDetail: guard IntersectionObserver entry and TABS indexing
- RecordingModePicker: guard IntersectionObserver entry and modes indexing
- ScreenshotModeDetail: guard IntersectionObserver entry and AUTO_CONFIGS
- StudioModeDetail: guard AUTO_CONFIGS indexing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 24, 2026 10:32

@greptile-apps greptile-apps bot left a comment


7 files reviewed, 7 comments


Comment on lines +198 to +203
```typescript
const extracted = await extractAudioFromUrl(videoUrl);
try {
	audioBuffer = await fs.readFile(extracted.filePath);
} finally {
	await extracted.cleanup();
}
```

missing try-finally for cleanup if error occurs after extractAudioFromUrl but before VTT upload

Suggested change:

```typescript
const extracted = await extractAudioFromUrl(videoUrl);
try {
	audioBuffer = await fs.readFile(extracted.filePath);
} catch (error) {
	await extracted.cleanup();
	throw error;
}
await extracted.cleanup();
```

Comment on lines +81 to +84
```typescript
await db()
	.update(videos)
	.set({ transcriptionStatus: "PROCESSING" })
	.where(eq(videos.id, videoId));
```

race condition: transcribeVideo on line 86 also sets PROCESSING at line 143, creating duplicate DB writes. Since transcribeVideo is called with _isRetry=true, it will skip the PROCESSING check anyway

Suggested change:

```diff
-await db()
-	.update(videos)
-	.set({ transcriptionStatus: "PROCESSING" })
-	.where(eq(videos.id, videoId));
+transcribeVideo(videoId, video.ownerId, false, true).catch((error) => {
```

Comment on lines +104 to +105
```typescript
const threeMinutesAgo = new Date(Date.now() - 3 * 60 * 1000);
if (video.updatedAt < threeMinutesAgo) {
```

timeout based on updatedAt will incorrectly trigger if video record is updated for unrelated reasons (e.g. metadata changes, view count). Consider using a dedicated processingStartedAt timestamp or checking transcriptionStatus update time specifically


Comment on lines +140 to +143
```typescript
await db()
	.update(videos)
	.set({ transcriptionStatus: "PROCESSING" })
	.where(eq(videos.id, videoId as Video.VideoId));
```

redundant DB write - caller in get-status.ts:81-84 already sets PROCESSING before calling this function with _isRetry=true


Comment on lines +213 to +215
```typescript
const hasDatePattern = /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/.test(
	video.name || "",
);
```

regex could match partial timestamp patterns in user-provided video names. Consider anchoring the regex or adding word boundaries to avoid false positives
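One way to act on this suggestion (a sketch, not the PR's code): require whitespace or string boundaries around the timestamp so digits embedded in a longer run no longer match:

```typescript
// Pattern as written in the PR: matches anywhere in the name.
const loose = /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/;
// Anchored variant: the timestamp must stand alone between whitespace
// or the start/end of the string.
const anchored = /(?:^|\s)\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\s|$)/;
```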



```diff
-		aiGenerationEnabled,
-	},
-]);
+await transcribeVideoDirect(videoId, userId, aiGenerationEnabled);
```

synchronous call blocks server action completion. If transcription takes 30+ seconds, this will hold the connection open. Consider moving the await to background or using a job queue pattern



```diff
 	.where(eq(videos.id, videoId));

-await start(generateAiWorkflow, [{ videoId, userId }]);
+await generateAiDirect(videoId, userId);
```

synchronous call blocks server action completion for entire AI generation (potentially 60+ seconds for multi-chunk processing). Consider using background processing


```typescript
	.set({ transcriptionStatus: "PROCESSING" })
	.where(eq(videos.id, videoId));

transcribeVideo(videoId, video.ownerId, false, true).catch((error) => {
```

transcribeVideo(videoId, video.ownerId, false, true) is pretty hard to read/maintain (easy to swap booleans accidentally). Would you be open to at least naming the flags at the callsite?

Suggested change:

```diff
-transcribeVideo(videoId, video.ownerId, false, true).catch((error) => {
+const aiGenerationEnabled = false;
+const isRetry = true;
+transcribeVideo(videoId, video.ownerId, aiGenerationEnabled, isRetry).catch((error) => {
```


```typescript
if (video.transcriptionStatus === "PROCESSING") {
	const threeMinutesAgo = new Date(Date.now() - 3 * 60 * 1000);
	if (video.updatedAt < threeMinutesAgo) {
```

Quick sanity check: does updatedAt definitely bump when you update transcriptionStatus? If it doesn’t, this timeout might never trip for a stuck PROCESSING row.

```typescript
	method: "GET",
	headers: { range: "bytes=0-0" },
});
if (!headResponse.ok) {
```

Since this is a GET with a body (even if it’s 1 byte), it might be worth cancelling the response body to avoid keeping sockets open longer than needed.

Suggested change:

```diff
+headResponse.body?.cancel();
 if (!headResponse.ok) {
```

Comment on lines +316 to +326
```typescript
const aiRes = await fetch("https://api.openai.com/v1/chat/completions", {
	method: "POST",
	headers: {
		"Content-Type": "application/json",
		Authorization: `Bearer ${serverEnv().OPENAI_API_KEY}`,
	},
	body: JSON.stringify({
		model: "gpt-4o-mini",
		messages: [{ role: "user", content: prompt }],
	}),
});
```

Consider adding a timeout here. If the OpenAI request hangs, aiGenerationStatus will stay PROCESSING indefinitely (and there’s no stale timeout like transcription has).

Suggested change:

```diff
 const aiRes = await fetch("https://api.openai.com/v1/chat/completions", {
 	method: "POST",
+	signal: AbortSignal.timeout(60_000),
 	headers: {
 		"Content-Type": "application/json",
 		Authorization: `Bearer ${serverEnv().OPENAI_API_KEY}`,
 	},
 	body: JSON.stringify({
 		model: "gpt-4o-mini",
 		messages: [{ role: "user", content: prompt }],
 	}),
 });
```


Copilot AI left a comment


Pull request overview

This pull request addresses critical failures in self-hosted Docker deployments by replacing the broken workflow package (local world mode) with direct function calls for transcription and AI generation. The PR also adds a 3-minute stale processing timeout and fixes TypeScript strict mode errors in HomePage components.

Changes:

  • Replace workflow-based async queue with synchronous direct transcription and AI generation functions
  • Add timeout detection to prevent videos from being stuck in PROCESSING state indefinitely
  • Fix TypeScript strict mode errors in HomePage animation components

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

Summary per file:

  • apps/web/lib/transcribe.ts: Replaces workflow call with direct transcribeVideoDirect function that handles audio extraction, Deepgram transcription, and VTT generation
  • apps/web/lib/generate-ai.ts: Replaces workflow call with direct generateAiDirect function that fetches transcripts and calls AI APIs (Groq with OpenAI fallback)
  • apps/web/actions/videos/get-status.ts: Sets PROCESSING status before firing transcription to prevent re-trigger loops; adds 3-minute timeout for stale PROCESSING states
  • apps/web/components/pages/HomePage/StudioModeDetail.tsx: Adds null check for AUTO_CONFIGS array access and IntersectionObserver entry
  • apps/web/components/pages/HomePage/ScreenshotModeDetail.tsx: Adds null checks for array access and IntersectionObserver entry
  • apps/web/components/pages/HomePage/RecordingModePicker.tsx: Adds null checks for modes array access and IntersectionObserver entry
  • apps/web/components/pages/HomePage/InstantModeDetail.tsx: Adds null checks for IntersectionObserver entry and TABS array access


Comment on lines +81 to +91
```diff
 await db()
 	.update(videos)
 	.set({ transcriptionStatus: "PROCESSING" })
 	.where(eq(videos.id, videoId));

 transcribeVideo(videoId, video.ownerId, false, true).catch((error) => {
 	console.error(
-		`[Get Status] Error triggering transcription for video ${videoId}:`,
+		`[Get Status] Error starting transcription for video ${videoId}:`,
 		error,
 	);
 });
```

Copilot AI Feb 24, 2026


Race condition: Setting PROCESSING status in the database before firing the async transcription creates a window where multiple concurrent calls to getVideoStatus could all pass the check at line 59 (where transcriptionStatus is null), then all set PROCESSING and trigger multiple transcription attempts.

Consider using a database-level constraint (e.g., optimistic locking with a version field, or a compare-and-set operation) to ensure only one transcription is triggered. Alternatively, check the status again after the update to verify this instance "won" the race.
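An in-memory sketch of that compare-and-set idea (the real fix would be a single conditional UPDATE ... WHERE transcriptionStatus IS NULL plus an affected-row check in Drizzle; all names here are illustrative):

```typescript
type Status = "PROCESSING" | "COMPLETE" | "ERROR" | null;
const statuses = new Map<string, Status>();

// Only the caller that observes a null status "wins" and may start work.
// In single-threaded JS this check-then-set is atomic; against a real DB it
// must be one conditional UPDATE so concurrent pollers cannot both claim it.
function claimProcessing(videoId: string): boolean {
	if (statuses.get(videoId) != null) return false;
	statuses.set(videoId, "PROCESSING");
	return true;
}
```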

Comment on lines +140 to +144
```typescript
await db()
	.update(videos)
	.set({ transcriptionStatus: "PROCESSING" })
	.where(eq(videos.id, videoId as Video.VideoId));
```

Copilot AI Feb 24, 2026


Timeout logic depends on updatedAt, but transcribeVideoDirect sets PROCESSING status again at line 140-143, which will update the updatedAt timestamp and reset the timeout window. This means the 3-minute timeout won't work correctly - it will keep getting extended each time the status update happens.

The timeout should either: (1) use a separate startedAt timestamp that's only set once, or (2) only set PROCESSING in get-status and skip setting it again in transcribeVideoDirect.

Suggested change (remove these lines):

```diff
-await db()
-	.update(videos)
-	.set({ transcriptionStatus: "PROCESSING" })
-	.where(eq(videos.id, videoId as Video.VideoId));
```
Comment on lines +140 to +143
```typescript
await db()
	.update(videos)
	.set({ transcriptionStatus: "PROCESSING" })
	.where(eq(videos.id, videoId as Video.VideoId));
```

Copilot AI Feb 24, 2026


The _isRetry parameter is passed as true from get-status.ts line 86, which allows retrying even if status is PROCESSING. However, transcribeVideoDirect immediately sets the status back to PROCESSING at line 140-143. This means the retry logic doesn't provide any actual benefit - if the original transcription is genuinely stuck at PROCESSING, the retry will just set it to PROCESSING again and get stuck in the same state.

Consider removing this redundant status update in transcribeVideoDirect, or revising the retry logic to handle PROCESSING states differently.

Comment on lines +103 to 120
```diff
 if (video.transcriptionStatus === "PROCESSING") {
 	const threeMinutesAgo = new Date(Date.now() - 3 * 60 * 1000);
 	if (video.updatedAt < threeMinutesAgo) {
 		await db()
 			.update(videos)
 			.set({ transcriptionStatus: "ERROR" })
 			.where(eq(videos.id, videoId));

 		return {
 			transcriptionStatus: "ERROR",
 			aiGenerationStatus:
 				(metadata.aiGenerationStatus as AiGenerationStatus) || null,
 			aiTitle: metadata.aiTitle || null,
 			summary: metadata.summary || null,
 			chapters: metadata.chapters || null,
-			error: "Failed to start transcription",
+			error: "Transcription timed out",
 		};
 	}
```

Copilot AI Feb 24, 2026


The timeout check uses video.updatedAt from the database query at the beginning of getVideoStatus, but updatedAt is automatically updated by the database via onUpdateNow() whenever the video row is modified. Between the SELECT and this timeout check, other operations (like status updates in transcribeVideoDirect) can modify the row and update the timestamp. This creates a time-of-check-to-time-of-use (TOCTOU) race condition where the timeout logic may use stale data.

Consider re-querying the video record immediately before this timeout check to get the current updatedAt value, or use a dedicated processingStartedAt timestamp field.

Comment on lines 109 to 132
```diff
 try {
 	console.log(
-		`[transcribeVideo] Triggering transcription workflow for video ${videoId}`,
+		`[transcribeVideo] Starting direct transcription for video ${videoId}`,
 	);

-	await start(transcribeVideoWorkflow, [
-		{
-			videoId,
-			userId,
-			aiGenerationEnabled,
-		},
-	]);
+	await transcribeVideoDirect(videoId, userId, aiGenerationEnabled);

 	return {
 		success: true,
-		message: "Transcription workflow started",
+		message: "Transcription completed",
 	};
 } catch (error) {
-	console.error("[transcribeVideo] Failed to trigger workflow:", error);
+	console.error("[transcribeVideo] Transcription failed:", error);

 	await db()
 		.update(videos)
-		.set({ transcriptionStatus: null })
+		.set({ transcriptionStatus: "ERROR" })
 		.where(eq(videos.id, videoId));

 	return {
 		success: false,
-		message: "Failed to start transcription workflow",
+		message: "Transcription failed",
 	};
 }
```

Copilot AI Feb 24, 2026


If transcribeVideo throws an exception before or during transcribeVideoDirect (e.g., from validation checks at lines 31-107), the catch block won't execute and the transcriptionStatus in get-status.ts will remain stuck at PROCESSING. The fire-and-forget pattern with .catch() at line 86 in get-status.ts only logs errors but doesn't update the database status.

Consider either: (1) wrapping the entire transcribeVideo call (including validation) in a try-catch that sets ERROR status, or (2) ensuring transcribeVideo always returns a result instead of throwing exceptions for validation failures.

Comment on lines +135 to +239
```typescript
async function transcribeVideoDirect(
	videoId: string,
	userId: string,
	aiGenerationEnabled: boolean,
): Promise<void> {
	await db()
		.update(videos)
		.set({ transcriptionStatus: "PROCESSING" })
		.where(eq(videos.id, videoId as Video.VideoId));

	const query = await db()
		.select({
			bucket: s3Buckets,
		})
		.from(videos)
		.leftJoin(s3Buckets, eq(videos.bucket, s3Buckets.id))
		.where(eq(videos.id, videoId as Video.VideoId));

	const row = query[0];
	if (!row) {
		throw new Error("Video does not exist");
	}

	const bucketId = (row.bucket?.id ?? null) as S3Bucket.S3BucketId | null;

	const [s3Bucket] = await S3Buckets.getBucketAccess(
		Option.fromNullable(bucketId),
	).pipe(runPromise);

	const videoKey = `${userId}/${videoId}/result.mp4`;
	const videoUrl = await s3Bucket.getSignedObjectUrl(videoKey).pipe(runPromise);

	const headResponse = await fetch(videoUrl, {
		method: "GET",
		headers: { range: "bytes=0-0" },
	});
	if (!headResponse.ok) {
		throw new Error("Video file not accessible");
	}

	const useMediaServer = isMediaServerConfigured();
	let hasAudio: boolean;
	let audioBuffer: Buffer;

	if (useMediaServer) {
		hasAudio = await checkHasAudioTrackViaMediaServer(videoUrl);
		if (!hasAudio) {
			await db()
				.update(videos)
				.set({ transcriptionStatus: "NO_AUDIO" })
				.where(eq(videos.id, videoId as Video.VideoId));
			return;
		}
		audioBuffer = await extractAudioViaMediaServer(videoUrl);
	} else {
		hasAudio = await checkHasAudioTrack(videoUrl);
		if (!hasAudio) {
			await db()
				.update(videos)
				.set({ transcriptionStatus: "NO_AUDIO" })
				.where(eq(videos.id, videoId as Video.VideoId));
			return;
		}
		const extracted = await extractAudioFromUrl(videoUrl);
		try {
			audioBuffer = await fs.readFile(extracted.filePath);
		} finally {
			await extracted.cleanup();
		}
	}

	const deepgram = createClient(serverEnv().DEEPGRAM_API_KEY as string);

	const { result, error } = await deepgram.listen.prerecorded.transcribeFile(
		audioBuffer,
		{
			model: "nova-3",
			smart_format: true,
			detect_language: true,
			utterances: true,
			mime_type: "audio/mpeg",
		},
	);

	if (error) {
		throw new Error(`Deepgram transcription failed: ${error.message}`);
	}

	const transcription = formatToWebVTT(result as unknown as DeepgramResult);

	await s3Bucket
		.putObject(`${userId}/${videoId}/transcription.vtt`, transcription, {
			contentType: "text/vtt",
		})
		.pipe(runPromise);

	await db()
		.update(videos)
		.set({ transcriptionStatus: "COMPLETE" })
		.where(eq(videos.id, videoId as Video.VideoId));

	if (aiGenerationEnabled) {
		await startAiGeneration(videoId as Video.VideoId, userId);
	}
}
```

Copilot AI Feb 24, 2026


Synchronous call makes the function block until transcription completes, which can take minutes for long videos. The calling code at line 86 in get-status.ts uses fire-and-forget with .catch(), but if the server or process restarts during transcription, the work is lost and the status will remain stuck at PROCESSING until the 3-minute timeout.

For production resilience, consider using a persistent job queue (like BullMQ, pg-boss, or similar) instead of in-memory async execution, especially for long-running operations like video transcription.

Comment on lines +138 to +226
```typescript
async function generateAiDirect(
	videoId: string,
	userId: string,
): Promise<void> {
	const query = await db()
		.select({ video: videos, bucket: s3Buckets })
		.from(videos)
		.leftJoin(s3Buckets, eq(videos.bucket, s3Buckets.id))
		.where(eq(videos.id, videoId as Video.VideoId));

	if (query.length === 0 || !query[0]?.video) {
		throw new Error("Video does not exist");
	}

	const { video, bucket } = query[0];
	const metadata = (video.metadata as VideoMetadata) || {};
	const bucketId = (bucket?.id ?? null) as S3Bucket.S3BucketId | null;

	if (video.transcriptionStatus !== "COMPLETE") {
		throw new Error("Transcription not complete");
	}

	const vtt = await Effect.gen(function* () {
		const [s3Bucket] = yield* S3Buckets.getBucketAccess(
			Option.fromNullable(bucketId),
		);
		return yield* s3Bucket.getObject(`${userId}/${videoId}/transcription.vtt`);
	}).pipe(runPromise);

	if (Option.isNone(vtt)) {
		await db()
			.update(videos)
			.set({ metadata: { ...metadata, aiGenerationStatus: "SKIPPED" } })
			.where(eq(videos.id, videoId as Video.VideoId));
		return;
	}

	const segments = parseVttWithTimestamps(vtt.value);
	const text = segments
		.map((s) => s.text)
		.join(" ")
		.trim();

	if (text.length < 10) {
		await db()
			.update(videos)
			.set({ metadata: { ...metadata, aiGenerationStatus: "SKIPPED" } })
			.where(eq(videos.id, videoId as Video.VideoId));
		return;
	}

	const transcript: TranscriptData = { segments, text };
	const groqClient = getGroqClient();
	const chunks = chunkTranscriptWithTimestamps(transcript.segments);

	let aiResult: AiResult;
	if (chunks.length === 1) {
		aiResult = await generateSingleChunk(transcript.text, groqClient);
	} else {
		aiResult = await generateMultipleChunks(chunks, groqClient);
	}

	const updatedMetadata: VideoMetadata = {
		...metadata,
		aiTitle: aiResult.title || metadata.aiTitle,
		summary: aiResult.summary || metadata.summary,
		chapters: aiResult.chapters || metadata.chapters,
		aiGenerationStatus: "COMPLETE",
	};

	await db()
		.update(videos)
		.set({ metadata: updatedMetadata })
		.where(eq(videos.id, videoId as Video.VideoId));

	const hasDatePattern = /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/.test(
		video.name || "",
	);

	if (
		(video.name?.startsWith("Cap Recording -") || hasDatePattern) &&
		aiResult.title
	) {
		await db()
			.update(videos)
			.set({ name: aiResult.title })
			.where(eq(videos.id, videoId as Video.VideoId));
	}
}
```

Copilot AI Feb 24, 2026


Similar to transcription, this synchronous call blocks until AI generation completes. If the server restarts during execution, the work is lost and the status remains PROCESSING. The fire-and-forget pattern at line 158 in get-status.ts provides no recovery mechanism.

Consider using a persistent job queue for resilience against server restarts and for better observability of long-running AI operations.



Development

Successfully merging this pull request may close these issues.

Transcription workflow fails on Docker image with Node 24
