Consistent output shape from `get_image_features` by zucchini-nlp · Pull Request #46405 · huggingface/transformers

zucchini-nlp · 2026-06-04T09:19:20Z

What does this PR do?

As per the title; this branches off from #45783.

The pooled image (or video, but not yet audio) feature output will now always satisfy three conditions, where `image_features` can be either a list or a 3D tensor:

len(image_features) == len(input_images)  # NOTE: or num_videos, not the total number of video frames
image_features[0].ndim == 2
image_features[0].shape == (actual seq length of this image, LM hidden size)
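
For readers who want to check these conditions in their own code, here is a minimal sketch. It is not part of the PR: `check_image_features`, `num_inputs`, and `hidden_size` are hypothetical names, and the tensors are replaced by simple stand-in objects so the snippet runs without torch. The checker only relies on `len()`, iteration, `.ndim`, and `.shape`, so it should behave the same for a list of 2D tensors and for a single 3D tensor.

```python
from types import SimpleNamespace


def check_image_features(image_features, num_inputs, hidden_size):
    """Assert the three conditions listed above (hypothetical helper).

    `image_features` may be a list of 2D tensors or one 3D tensor.
    `num_inputs` is the number of images (or videos, not video frames).
    """
    # 1. One entry per input image or video.
    assert len(image_features) == num_inputs
    for feats in image_features:
        # 2. Each entry is 2D: (sequence length, hidden size).
        assert feats.ndim == 2
        # 3. The last dimension is the LM hidden size; the sequence
        #    length may differ from one image to the next.
        assert feats.shape[-1] == hidden_size
    return [tuple(f.shape) for f in image_features]


# Stand-ins for tensors (assumed shapes, chosen only for illustration).
fake = [
    SimpleNamespace(ndim=2, shape=(64, 4096)),
    SimpleNamespace(ndim=2, shape=(256, 4096)),
]
shapes = check_image_features(fake, num_inputs=2, hidden_size=4096)
```

With real model outputs, the same call would take whatever `get_image_features` returns for pooled features, plus the number of input images and the language model's hidden size.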

We could make this fully backward compatible (BC) and return the "correct" output only when a certain flag is passed, but for now I decided to update the smaller utility functions directly, in a breaking way.

HuggingFaceDocBuilderDev · 2026-06-04T09:31:25Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp · 2026-06-04T10:00:15Z

run-slow: aya_vision, cohere2_vision, deepseek_ocr2, gemma4, paddleocr_vl, qwen2_5_omni, qwen3_omni_moe

github-actions · 2026-06-04T10:01:37Z

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/aya_vision", "models/cohere2_vision", "models/deepseek_ocr2", "models/gemma4", "models/paddleocr_vl", "models/qwen2_5_omni", "models/qwen3_omni_moe"]
quantizations: []

github-actions · 2026-06-04T11:40:20Z

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
| --- | --- | --- |
| RUN | 5e5f7bbe | workflow commit (merge commit) |
| PR | 52788433 | branch commit (from PR) |
| main | b07d99be | base commit (on `main`) |

✅ No failing test specific to this PR 🎉 👏 !

github-actions · 2026-06-04T12:34:48Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: aya_vision, cohere2_vision, deepseek_ocr2, gemma4, paddleocr_vl, qwen2_5_omni, qwen3_omni_moe

github-actions · 2026-06-04T12:50:04Z

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=46405&sha=80f4bb

github-actions · 2026-06-04T12:52:49Z

CI Dashboard: View test results in Grafana

done

6130e27

fix repo

5278843

and videos

80f4bba

zucchini-nlp requested a review from vasqu June 4, 2026 12:42

zucchini-nlp wants to merge 3 commits into huggingface:main from zucchini-nlp:image-output-shapes

2 participants
