feat(ai-groq): transcription #649
📝 Walkthrough
This PR adds Groq as a transcription provider to the TanStack AI library. It introduces a …
Changes: Groq Transcription Support
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 ESLint: install failed: lockfile failed supply-chain policy check.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/ai-groq/src/adapters/transcription.ts`:
- Around line 161-170: The adapter currently forwards
TranscriptionOptions.responseFormat (effectiveFormat) including 'srt'/'vtt' to
form.append('response_format', effectiveFormat) and later calls response.json()
for non-verbose_json/text branches; change this to explicitly reject unsupported
formats for Groq: add a check after computing effectiveFormat (the variable
named effectiveFormat in transcription.ts) and if effectiveFormat === 'srt' ||
effectiveFormat === 'vtt' throw a clear Error like "responseFormat 'srt'/'vtt'
not supported by Groq" (or alternatively map them to a supported value such as
'text' or 'json' if you prefer), and ensure form.append('response_format', ...)
and the subsequent response handling (the branches that call response.text() or
response.json(), e.g., the block that currently uses response.json() and
generateId(this.name)) only run for supported formats
('json','text','verbose_json').
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f2e8fc62-7779-4e78-9f21-0e407ea9246f
📒 Files selected for processing (7)
- .changeset/feat-groq-ai-transcription.md
- packages/ai-groq/src/adapters/transcription.ts
- packages/ai-groq/src/audio/transcription-provider-options.ts
- packages/ai-groq/src/index.ts
- packages/ai-groq/src/model-meta.ts
- testing/e2e/src/lib/feature-support.ts
- testing/e2e/src/lib/media-providers.ts
    } else if (effectiveFormat === 'text') {
      const text = await response.text()
      return {
        id: generateId(this.name),
        model,
        text,
        ...(language !== undefined && { language }),
      }
    } else {
      const data = (await response.json()) as GroqJsonTranscriptionResponse
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Verify which transcription response formats are exposed in shared types
rg -nP --type=ts "interface\\s+TranscriptionOptions|responseFormat|verbose_json|srt|vtt|json|text" packages
# Compare how other transcription adapters branch on response formats
rg -nP --type=ts "responseFormat|effectiveFormat|await response\\.text\\(|await response\\.json\\(" packages/ai-*/src/adapters/*transcription*.ts
Repository: TanStack/ai
Length of output: 50368
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Show the exact response format union/type and any mapping for transcription
rg -n "responseFormat" packages/ai/src/types.ts packages/ai/src/**/transcription* packages/ai/src/**/transcrib* || true
# 2) Inspect Groq transcription adapter around effectiveFormat + parsing
FILE="packages/ai-groq/src/adapters/transcription.ts"
if [ -f "$FILE" ]; then
  echo "---- $FILE (relevant section) ----"
  # Print the function/section around the line numbers mentioned in the original comment
  nl -ba "$FILE" | sed -n '130,230p'
else
  echo "Missing $FILE"
fi
# 3) Find any other adapters for transcription and how they handle srt/vtt/json/text
rg -n "effectiveFormat|responseFormat" packages/ai-*/src/adapters/*transcription*.ts || true
Repository: TanStack/ai
Length of output: 435
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate Groq transcription adapter and inspect around the reported section without using nl
GROQ_FILE="packages/ai-groq/src/adapters/transcription.ts"
echo "=== Groq adapter file exists? ==="
test -f "$GROQ_FILE" && echo "yes: $GROQ_FILE" || (echo "missing: $GROQ_FILE" && exit 0)
echo "=== Groq adapter around lines 145-210 ==="
cat -n "$GROQ_FILE" | sed -n '145,210p'
echo "=== Groq adapter response_format request parameter and effectiveFormat derivation ==="
rg -n "response_format|effectiveFormat|useVerbose|verbose_json|srt|vtt" "$GROQ_FILE"
echo "=== Groq adapter parsing branches (json/text) ==="
rg -n "await response\\.json\\(|await response\\.text\\(|if \\(effectiveFormat|else if \\(effectiveFormat|else \\{" "$GROQ_FILE"
echo "=== Groq adapter non-ok (error handling) branch ==="
rg -n "if \\(!response\\.ok\\)|response\\.ok|throw" "$GROQ_FILE"
echo "=== Find other transcription adapters to compare srt/vtt handling ==="
fd -t f -e ts -e tsx "*transcription*.ts" packages \
| sed -n '1,200p'
# For each adapter found, quickly extract how it branches on responseFormat/effectiveFormat
fd -t f -e ts -e tsx "*transcription*.ts" packages \
| xargs -I{} sh -c 'echo "---- {} ----"; rg -n "responseFormat|effectiveFormat|verbose_json|srt|vtt" "{}" || true'
Repository: TanStack/ai
Length of output: 4412
🏁 Script executed:
#!/bin/bash
set -euo pipefail
GROQ_FILE="packages/ai-groq/src/adapters/transcription.ts"
echo "=== Groq adapter occurrences of srt/vtt ==="
rg -n "srt|vtt" "$GROQ_FILE" || true
echo "=== Groq adapter full response_format send (near form.append) ==="
rg -n "form\\.append\\('response_format'|effectiveFormat|useVerbose" "$GROQ_FILE" -n
echo "=== Other transcription adapters: files matching *transcription*.ts ==="
FILES=$(fd -t f "*transcription*.ts" packages || true)
echo "$FILES" | head -n 50
echo "=== Other transcription adapters: look for srt/vtt/text/json handling ==="
echo "$FILES" | xargs -I{} sh -c 'echo "---- {} ----"; rg -n "responseFormat|effectiveFormat|verbose_json|srt|vtt|await response\\.json\\(|await response\\.text\\(" "{}" || true'
Repository: TanStack/ai
Length of output: 1002
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect the responseFormat type definition around the reported line
TFILE="packages/ai/src/types.ts"
echo "=== responseFormat type in $TFILE ==="
test -f "$TFILE" && sed -n '1680,1710p' "$TFILE" || echo "missing $TFILE"
# 2) See if Groq adapter validates/filters srt/vtt anywhere besides parsing branch
GROQ_FILE="packages/ai-groq/src/adapters/transcription.ts"
echo "=== Groq adapter: validate/map srt/vtt/response_format ==="
rg -n "responseFormat|effectiveFormat|srt|vtt" "$GROQ_FILE"
# 3) Check other transcription adapter files (fix fd usage with --glob)
echo "=== Other transcription adapter files ==="
FILES=$(fd -t f --glob '*transcription*.ts' packages)
echo "$FILES" | head -n 80
echo "=== Other transcription adapters: parsing branches ==="
echo "$FILES" | xargs -I{} sh -c '
  f="{}";
  echo "---- $f ----";
  rg -n "responseFormat|effectiveFormat|verbose_json|await response\.json\(|await response\.text\(|srt|vtt" "$f" || true
' | head -n 200
Repository: TanStack/ai
Length of output: 5060
🏁 Script executed:
#!/bin/bash
set -euo pipefail
GROQ_FILE="packages/ai-groq/src/adapters/transcription.ts"
echo "=== Groq adapter error handling branch (around !response.ok) ==="
cat -n "$GROQ_FILE" | sed -n '90,150p'
echo "=== Groq transcription provider options docs ==="
PROV_OPTS="packages/ai-groq/src/audio/transcription-provider-options.ts"
test -f "$PROV_OPTS" && cat -n "$PROV_OPTS" | sed -n '1,120p' || echo "missing $PROV_OPTS"
echo "=== Groq transcription tests mentioning responseFormat ==="
fd -t f --glob '*groq*transcription*test*.ts' packages | head -n 50
fd -t f --glob '*transcription*test*.ts' packages/ai-groq | xargs -I{} sh -c 'echo "---- {} ----"; rg -n "responseFormat|response_format|srt|vtt|verbose_json|text|json" "{}" || true'
Repository: TanStack/ai
Length of output: 3659
🌐 Web query:
Groq transcription API response_format supported values json text verbose_json srt vtt
💡 Result:
Supported Groq transcription API response_format values are: json, text, verbose_json.[1][2]
Notes:
- response_format defaults to json.[2]
- verbose_json includes timestamps for audio segments; timestamp_granularities requires response_format = verbose_json.[1][2]
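The constraint in the query result (timestamp granularities only work with verbose_json, and json is the default) could be captured in a small resolver. This is a minimal sketch, not the adapter's actual code: the function name `resolveEffectiveFormat`, the `TimestampGranularity` values, and the choice to silently upgrade to verbose_json are all assumptions made for illustration.

```typescript
// Formats the Groq endpoint accepts, per the query result above.
type GroqResponseFormat = 'json' | 'text' | 'verbose_json'

// Assumed granularity values; not confirmed by the source.
type TimestampGranularity = 'word' | 'segment'

// Hypothetical helper: picks the response_format to send to Groq.
function resolveEffectiveFormat(
  requested: GroqResponseFormat | undefined,
  timestampGranularities: ReadonlyArray<TimestampGranularity> | undefined,
): GroqResponseFormat {
  const wantsTimestamps =
    timestampGranularities !== undefined && timestampGranularities.length > 0
  // timestamp_granularities requires response_format = verbose_json.
  if (wantsTimestamps) return 'verbose_json'
  // Otherwise honour the caller's choice, falling back to Groq's default.
  return requested ?? 'json'
}
```

Upgrading the format silently is one possible design; an adapter could equally throw when a caller asks for timestamps together with a non-verbose format.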
Fix Groq responseFormat handling for srt/vtt (don’t fall back to JSON parsing / don’t forward unsupported formats).
TranscriptionOptions.responseFormat includes 'srt' | 'vtt', but the Groq transcription endpoint only supports response_format values json, text, and verbose_json. The Groq adapter forwards effectiveFormat directly to form.append('response_format', effectiveFormat) and then uses response.json() for all non-verbose_json/non-text cases—so requesting srt/vtt should be rejected (or mapped to a supported format) instead of relying on a “text-like formats” parse fallback.
Update the adapter to explicitly handle 'srt'/'vtt' (e.g., throw a clear “unsupported for Groq” error, or map to a supported format), and remove/avoid the proposed srt/vtt→response.text() parsing change since Groq won’t return those formats successfully.
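The rejection suggested above could look roughly like the following. It is a minimal sketch under stated assumptions: the full `TranscriptionResponseFormat` union is inferred from the formats named in this review, and `assertGroqResponseFormat` is a hypothetical helper, not a function from the PR.

```typescript
// Assumed shape of the shared responseFormat union (the review only confirms
// that it includes 'srt' | 'vtt' alongside the Groq-supported values).
type TranscriptionResponseFormat =
  | 'json'
  | 'text'
  | 'verbose_json'
  | 'srt'
  | 'vtt'

// Formats the Groq transcription endpoint supports, per the review.
type GroqResponseFormat = 'json' | 'text' | 'verbose_json'

// Hypothetical guard: narrows the format, or throws before any request is sent.
function assertGroqResponseFormat(
  format: TranscriptionResponseFormat,
): GroqResponseFormat {
  switch (format) {
    case 'json':
    case 'text':
    case 'verbose_json':
      return format
    default:
      // Only 'srt' and 'vtt' reach this branch.
      throw new Error(
        `responseFormat '${format}' is not supported by Groq; use 'json', 'text', or 'verbose_json'`,
      )
  }
}
```

Calling the guard right after effectiveFormat is computed would mean form.append('response_format', ...) and the response.text() / response.json() branches only ever see supported values.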
🎯 Changes
Adds Groq as a transcription provider. Groq's API is mostly OpenAI SDK-compatible, but its transcription endpoint additionally accepts HTTP URLs as input, so this is implemented as a custom integration rather than going through the SDK.
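To illustrate why URL input pushes this toward a custom integration, here is a hedged sketch of building the multipart request by hand. The endpoint URL, the `url` and `file` field names, and the helper names `buildTranscriptionForm` and `sendTranscription` are assumptions for illustration; they are not taken from the PR's code.

```typescript
// Assumed endpoint for Groq's OpenAI-compatible transcription API.
const GROQ_TRANSCRIPTION_URL =
  'https://api.groq.com/openai/v1/audio/transcriptions'

// Hypothetical input shape: audio is either an HTTP(S) URL or binary data.
interface GroqTranscriptionInput {
  model: string
  audio: string | Blob
  responseFormat?: 'json' | 'text' | 'verbose_json'
}

// Builds the multipart body, choosing the field based on the input kind.
function buildTranscriptionForm(input: GroqTranscriptionInput): FormData {
  const form = new FormData()
  form.append('model', input.model)
  if (typeof input.audio === 'string') {
    // Assumed field name for remote audio.
    form.append('url', input.audio)
  } else {
    // 'audio' is a placeholder filename.
    form.append('file', input.audio, 'audio')
  }
  form.append('response_format', input.responseFormat ?? 'json')
  return form
}

// Sends the request; the caller decides how to parse the response.
function sendTranscription(
  input: GroqTranscriptionInput,
  apiKey: string,
): Promise<Response> {
  return fetch(GROQ_TRANSCRIPTION_URL, {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(input),
  })
}
```

Keeping form construction separate from the network call makes the URL-versus-file branching testable without hitting the API.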
✅ Checklist
pnpm run test:pr.
🚀 Release Impact
Summary by CodeRabbit
Release Notes
New Features