InternVideo-NeXt clip_projector training procedure: eval only? from scratch? #314

@MichaelRamamonjisoa

Dear authors,

Thanks for your interesting paper!
Section B.2 of the appendix mentions that you train an attention pooling head for evaluation.

At the same time, the model you released on HuggingFace has a clip_projector module (AttentionPoolingBlock) attached to it, and the model's forward pass sets projected=True (HF code here).

What I would like to know is:

  1. on what data was the released clip_projector trained?
  2. if I want to evaluate your model on action recognition, should I:
    a. train the attentive probe from scratch on the target dataset
    b. finetune the attentive probe on the target dataset (if so, from which weights?)
  3. which setup did you use in your paper: 2.a., 2.b., or neither?

You released some action recognition code for InternVideo2. For linear probing / attention probing, it has an --open_clip_projector parameter that controls whether the head is finetuned or not. But this doesn't say on which data the head was trained before the finetuning.

In section 4.2 (Video Classification) of the paper, you mention:

 We test the model in an ‘Attentive Probing’ setting where the encoders are frozen and a single-layer attention pooling head is trained. Such Frozen Encoder settings can test representation’s quality in an unbiased way. Our methods achieve the best results with only public data and less computation cost on these foundation tasks.

What would make sense to me is that you are no longer using the InternVideo2 evaluation script and are instead training the attentive probe from scratch (2.a. above). If so, I'd like to know from which evaluation experiment the weights of the released clip_projector were obtained.
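
To make 2.a. concrete, here is a minimal PyTorch sketch of what I understand by an attentive probe on a frozen encoder. Everything in it is my own assumption: AttentiveProbe, build_probe_optimizer and train_step are hypothetical names, the encoder is assumed to return token features of shape (batch, num_tokens, embed_dim), and none of it is taken from your released code or your AttentionPoolingBlock.

```python
import torch
import torch.nn as nn


class AttentiveProbe(nn.Module):
    """Hypothetical single-layer attention pooling head followed by a classifier."""

    def __init__(self, embed_dim: int, num_classes: int, num_heads: int = 4):
        super().__init__()
        # One learnable query token that attends over all encoder tokens.
        self.query = nn.Parameter(torch.zeros(1, 1, embed_dim))
        nn.init.trunc_normal_(self.query, std=0.02)
        self.norm = nn.LayerNorm(embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, embed_dim), produced by the frozen encoder.
        tokens = self.norm(tokens)
        q = self.query.expand(tokens.size(0), -1, -1)
        pooled, _ = self.attn(q, tokens, tokens)  # (batch, 1, embed_dim)
        return self.fc(pooled.squeeze(1))         # (batch, num_classes)


def build_probe_optimizer(encoder: nn.Module, probe: nn.Module, lr: float = 1e-3):
    """Freeze the encoder and optimise only the randomly initialised probe (2.a.)."""
    for p in encoder.parameters():
        p.requires_grad_(False)
    encoder.eval()
    return torch.optim.AdamW(probe.parameters(), lr=lr)


def train_step(encoder, probe, optimizer, video, labels) -> float:
    """One supervised step on the target dataset; gradients reach the probe only."""
    with torch.no_grad():
        tokens = encoder(video)
    loss = nn.functional.cross_entropy(probe(tokens), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, 2.b. would differ only in initialising the head from pretrained weights instead of the random init above, which is why I'm asking where the released clip_projector weights come from.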

Thanks in advance for your clarifications!
