feat(fetchers): enhance YouTubeFetcher with transcript extraction by chaliy · Pull Request #88 · everruns/fetchkit

chaliy · 2026-04-03T02:55:39Z

What

Enhance YouTubeFetcher with transcript/captions extraction via the timedtext API.

Why

Closes #56. Agents encounter YouTube links but cannot watch the video; extracting transcripts turns video content into LLM-consumable text. The existing implementation only fetched oEmbed metadata and had no transcript support.

How

Add transcript extraction via the YouTube timedtext XML API (English captions)
Parse timedtext XML segments and join them into continuous text
Truncate very long transcripts (>15k chars) with an indicator
Gracefully handle videos without transcripts
Add mobile URL support (m.youtube.com)
Comprehensive tests: XML parsing, entity decoding, truncation, formatting
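
The parsing and truncation steps above can be sketched roughly as follows. This is a minimal illustration and not the code merged in this PR: the function names (`decode_entities`, `parse_timedtext`, `truncate_transcript`) and the wording of the truncation indicator are hypothetical, and the sketch assumes the timedtext response is a flat sequence of `<text ...>...</text>` elements with no self-closing `<text/>` entries. Only the 15k-character limit comes from the description.

```rust
/// Decode the small set of XML entities that typically appear in caption text.
/// (Hypothetical helper; the entity list is an assumption, not taken from the PR.)
fn decode_entities(s: &str) -> String {
    s.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&#39;", "'")
        .replace("&apos;", "'")
        // Decode `&amp;` last so "&amp;lt;" becomes "&lt;" rather than "<".
        .replace("&amp;", "&")
}

/// Extract the body of every `<text ...>...</text>` element and join the
/// segments with single spaces. Returns an empty string when there are none.
fn parse_timedtext(xml: &str) -> String {
    const CLOSE_TAG: &str = "</text>";
    let mut segments: Vec<String> = Vec::new();
    let mut rest = xml;

    while let Some(start) = rest.find("<text") {
        let from_tag = &rest[start..];
        // Skip past the end of the opening tag (attributes such as start/dur).
        let Some(open_end) = from_tag.find('>') else { break };
        let body = &from_tag[open_end + 1..];
        let Some(close) = body.find(CLOSE_TAG) else { break };

        // Decode entities, then collapse newlines and repeated whitespace.
        let decoded = decode_entities(&body[..close]);
        let cleaned = decoded.split_whitespace().collect::<Vec<_>>().join(" ");
        if !cleaned.is_empty() {
            segments.push(cleaned);
        }
        rest = &body[close + CLOSE_TAG.len()..];
    }

    segments.join(" ")
}

/// Keep at most `max_chars` characters and append an indicator when the
/// transcript was cut. Counts characters, not bytes, so multi-byte UTF-8
/// text is never split in the middle of a character.
fn truncate_transcript(text: &str, max_chars: usize) -> String {
    match text.char_indices().nth(max_chars) {
        // A character exists at position `max_chars`, so the text is too long.
        Some((byte_idx, _)) => format!("{}... [transcript truncated]", &text[..byte_idx]),
        None => text.to_string(),
    }
}
```

With these helpers, a caller might run `truncate_transcript(&parse_timedtext(&xml), 15_000)` and treat an empty parse result as the "no transcript" case. A production parser would more likely use a real XML library than string searching.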

Risk

Low
Transcript API is undocumented but widely used; graceful fallback when unavailable

Checklist

Unit tests pass
Smoke tests pass
Specs are up to date and not in conflict

- Add transcript extraction via YouTube timedtext API
- Parse timedtext XML format into joined transcript text
- Truncate very long transcripts (>15k chars) with indicator
- Show "No transcript available" when captions are unavailable
- Add mobile URL support (m.youtube.com)
- Add comprehensive tests: timedtext parsing, entity decoding, transcript truncation, formatting with/without all fields

Closes #56

chaliy merged commit 0960c35 into main Apr 3, 2026
11 checks passed

chaliy deleted the fix/issue-56-youtube-fetcher branch April 3, 2026 03:01
