Add per-frame timestamp embedding to the VLM video path by amazloumi · Pull Request #128 · KempnerInstitute/KempnerForge

amazloumi · 2026-06-26T14:03:46Z

Summary

Add FrameTimeEmbedding (kempnerforge/model/frame_time.py): sinusoidal features of a frame's timestamp (seconds) at log-spaced periods, fed through a zero-initialized projection, so the module is a no-op on outputs at step 0.
decode_video_frames returns (frames, times); WebVidVideoDataset emits frame_times (F,), VideoCollator stacks to (B, F).
Applied per frame in _project_visual_features as a VLMWrapper sibling submodule (video only; None for the image path); built, FSDP-sharded, and meta-materialized at both build sites.
scripts/train.py threads frame_times; docs + CHANGELOG updated.
Make the time embedding registry-driven: [time_embedding] selects the implementation (type = "sinusoidal" default, "none" disables) via @registry.register_time_embedding; new techniques drop in as small additions. Sequence-modifying encodings (Molmo2-style text time-tokens) are flagged as a separate future hook (needs interleaved-sequence support).
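The embedding described in the summary can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in kempnerforge/model/frame_time.py: the helper names, the period range (0.1 s to 100 s), the number of periods, the sin/cos pairing, and the bias-free projection are all hypothetical, and nested lists stand in for tensors.

```python
import math


def log_spaced_periods(num_periods, min_period, max_period):
    """Return periods (seconds) spaced geometrically between the two bounds."""
    if num_periods == 1:
        return [min_period]
    ratio = (max_period / min_period) ** (1.0 / (num_periods - 1))
    return [min_period * ratio**i for i in range(num_periods)]


def sinusoidal_time_features(t_seconds, periods):
    """Return a sin/cos pair of 2*pi*t/period for every period."""
    features = []
    for period in periods:
        angle = 2.0 * math.pi * t_seconds / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def project(features, weight):
    """Bias-free linear projection; weight has shape (dim, len(features))."""
    return [sum(w * f for w, f in zip(row, features)) for row in weight]


# Assumed hyperparameters, chosen only for illustration.
periods = log_spaced_periods(num_periods=4, min_period=0.1, max_period=100.0)
features = sinusoidal_time_features(2.5, periods)

# Zero-initialized projection: every output component starts at 0.
zero_weight = [[0.0] * len(features) for _ in range(8)]
embedding = project(features, zero_weight)

# Adding the embedding to a frame's visual token leaves it unchanged at step 0.
frame_token = [0.3, -1.2, 0.7, 0.0, 2.0, -0.5, 1.1, 0.9]
conditioned = [x + e for x, e in zip(frame_token, embedding)]
```

With zero weights the output is unchanged, but the gradient with respect to the projection weights is generally nonzero because it depends on the sinusoidal features, so the module can still learn from the first step.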

Testing

`uv run ruff check kempnerforge/ tests/`: passes
`uv run ruff format --check kempnerforge/ tests/ scripts/`: passes
`uv run pyright kempnerforge/`: passes (0 errors)
`uv run pytest tests/unit/ -v --timeout=60`: passes (1527 passed, 2 skipped)
Distributed (parallel.py changed): `uv run torchrun --nproc_per_node=4 -m pytest tests/distributed/ -v` (running now)
2-GPU FSDP smoke on vlm_video_webvid.toml (random encoder): trains; the +33,792 parameter increase confirms the module is sharded and trainable

Closes #127

codecov · 2026-06-26T22:57:50Z

Codecov Report

❌ Patch coverage is 95.16129% with 6 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
kempnerforge/distributed/parallel.py	58.33%	5 Missing ⚠️
kempnerforge/data/video_io.py	88.88%	1 Missing ⚠️

Files with missing lines	Coverage Δ
kempnerforge/config/job.py	`88.79% <100.00%> (+0.19%)`	⬆️
kempnerforge/config/registry.py	`100.00% <100.00%> (ø)`
kempnerforge/config/schema.py	`100.00% <100.00%> (ø)`
kempnerforge/config/time_embedding.py	`100.00% <100.00%> (ø)`
kempnerforge/data/video_dataset.py	`93.16% <100.00%> (+0.17%)`	⬆️
kempnerforge/model/frame_time.py	`100.00% <100.00%> (ø)`
kempnerforge/model/vlm.py	`99.15% <100.00%> (+0.08%)`	⬆️
kempnerforge/data/video_io.py	`81.96% <88.88%> (+1.96%)`	⬆️
kempnerforge/distributed/parallel.py	`58.79% <58.33%> (-0.28%)`	⬇️


Copilot

Pull request overview

This PR adds absolute per-frame timestamp conditioning to the VLM video pathway by propagating decoded frame presentation times through the data pipeline and injecting a registry-configurable, zero-initialized time embedding into each frame’s visual tokens.

Changes:

decode_video_frames now returns (frames, times); datasets/collator propagate frame_times as (F,) / (B, F) and training threads it into the model forward.
Introduces a registry-driven time-embedding module (FrameTimeEmbedding default; "none" disables) and applies it per-frame in the VLM visual-token projection (video-only).
Ensures distributed/FSDP build paths materialize and shard the new submodule; adds unit coverage for config, embedding behavior, and build wiring.
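The registry-driven selection summarized above might look roughly like the following. This is a self-contained sketch, not the project's registry: the module-level dictionary, the function signatures, the placeholder return values, and the choice to model "none" as a registered builder returning None are all assumptions made for illustration. Only the names register/get/list_time_embedding and build_time_embedding are taken from the file summary.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for the project's registry storage.
_TIME_EMBEDDINGS: Dict[str, Callable[..., Optional[object]]] = {}


def register_time_embedding(name: str):
    """Decorator that records a builder function under `name`."""

    def decorator(builder):
        if name in _TIME_EMBEDDINGS:
            raise ValueError(f"time embedding {name!r} is already registered")
        _TIME_EMBEDDINGS[name] = builder
        return builder

    return decorator


def list_time_embedding() -> List[str]:
    """Return the registered names in sorted order."""
    return sorted(_TIME_EMBEDDINGS)


def get_time_embedding(name: str):
    """Look up a builder, reporting the available names on failure."""
    if name not in _TIME_EMBEDDINGS:
        raise KeyError(
            f"unknown time embedding {name!r}; available: {list_time_embedding()}"
        )
    return _TIME_EMBEDDINGS[name]


@register_time_embedding("sinusoidal")
def build_sinusoidal(dim: int, **kwargs):
    # Placeholder: a real builder would return a module, not a dict.
    return {"kind": "sinusoidal", "dim": dim, **kwargs}


@register_time_embedding("none")
def build_none(dim: int, **kwargs):
    # Disabled: no module is attached to the wrapper.
    return None


def build_time_embedding(type_name: str = "sinusoidal", **kwargs):
    """Build whichever implementation the config's `type` field selects."""
    return get_time_embedding(type_name)(**kwargs)
```

Under this layout a new technique is one decorated builder function, and callers only need to handle a possible None result.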

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.


File	Description
tests/unit/test_vlm.py	Adds unit tests ensuring video wrappers attach the module, image wrappers do not, and forward wiring/shape checks behave as expected.
tests/unit/test_video_io.py	Updates tests for `(frames, times)` return and basic timestamp properties.
tests/unit/test_video_dataset.py	Updates dataset/collator tests to validate `frame_times` padding/stacking behavior.
tests/unit/test_time_embedding_config.py	Adds coverage for `TimeEmbeddingConfig` defaults, validation, and kwargs.
tests/unit/test_frame_time.py	Adds coverage for embedding shape, zero-init behavior, gradient flow, dtype behavior, and registry builder.
tests/unit/test_distributed.py	Verifies distributed build attaches/casts `frame_time_embed` for video and omits it for images.
scripts/train.py	Threads `time_embedding_config` into model build and passes `frame_times` into VLM forward.
kempnerforge/model/vlm.py	Adds `frame_times` plumbing and applies per-frame timestamp embeddings in `_project_visual_features`.
kempnerforge/model/frame_time.py	Introduces `TimeEmbedding` interface, `FrameTimeEmbedding`, and registry-driven `build_time_embedding`.
kempnerforge/distributed/parallel.py	Builds/materializes/casts `frame_time_embed` in meta/CPU paths and FSDP-shards it when present.
kempnerforge/data/video_io.py	Changes `decode_video_frames` to also return matched presentation timestamps.
kempnerforge/data/video_dataset.py	Emits `frame_times` per sample and stacks it in `VideoCollator`.
kempnerforge/config/time_embedding.py	Adds `[time_embedding]` config with validation and builder kwargs.
kempnerforge/config/schema.py	Exposes `TimeEmbeddingConfig` in the config schema surface.
kempnerforge/config/registry.py	Adds time-embedding registry hooks (`register/get/list_time_embedding`).
kempnerforge/config/job.py	Adds optional `time_embedding` field to `JobConfig`.
docs/how-to/train-on-video.md	Documents per-frame timestamp embedding and registry semantics.
CHANGELOG.md	Records the new per-frame timestamp feature and affected components.
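The `frame_times` shapes referenced in the table, (F,) per sample and (B, F) per batch, can be illustrated with a small stand-in collator. This is only a sketch: `collate_frame_times` and the dict-based sample layout are hypothetical, plain lists stand in for tensors, and it assumes every sample already has the same number of frames, whereas the actual VideoCollator may pad.

```python
from typing import Dict, List, Sequence


def collate_frame_times(
    samples: Sequence[Dict[str, List[float]]],
) -> List[List[float]]:
    """Stack per-sample (F,) timestamps into a (B, F) nested list."""
    if not samples:
        raise ValueError("cannot collate an empty batch")
    num_frames = len(samples[0]["frame_times"])
    batch = []
    for index, sample in enumerate(samples):
        times = [float(t) for t in sample["frame_times"]]
        if len(times) != num_frames:
            raise ValueError(
                f"sample {index} has {len(times)} frames, expected {num_frames}"
            )
        batch.append(times)
    return batch


# Two samples with F = 3 timestamps (seconds) each give a (2, 3) batch.
batch = collate_frame_times(
    [
        {"frame_times": [0.0, 0.5, 1.0]},
        {"frame_times": [0.0, 2.0, 4.0]},
    ]
)
```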



amazloumi marked this pull request as draft June 26, 2026 14:46

amazloumi marked this pull request as ready for review June 26, 2026 18:29

Base automatically changed from worktree-video-pipeline to main June 26, 2026 21:05

amazloumi added 2 commits June 26, 2026 18:49

Add per-frame timestamp embedding to the VLM video path

5c87c3d

Make the per-frame time embedding registry-based and config-selectable

1ea9c4b

amazloumi force-pushed the video/per-frame-timestamps branch from bcb435f to 1ea9c4b June 26, 2026 22:54

amazloumi requested review from Naeemkh, Copilot and mmshad June 26, 2026 22:55

Copilot started reviewing on behalf of amazloumi June 26, 2026 22:56

Copilot AI reviewed Jun 26, 2026


Comment thread kempnerforge/model/frame_time.py Outdated

amazloumi requested a review from camilobrownpinilla June 27, 2026 00:50

correct FrameTimeEmbedding zero-init note: no-op on outputs, not zero gradient

c6ff522

Copilot started work on behalf of amazloumi June 27, 2026 01:05

Copilot finished work on behalf of amazloumi June 27, 2026 01:06

Copilot started work on behalf of amazloumi June 27, 2026 01:26

Copilot finished work on behalf of amazloumi June 27, 2026 01:26

amazloumi wants to merge 3 commits into main from video/per-frame-timestamps
