feat: support video nsa #463
Conversation
/test
import os


def normalize_hf_config_for_infinilm(config_dict, model_path):
Is the model_path parameter of normalize_hf_config_for_infinilm(config_dict, model_path) actually unused?
from infinilm.llm.llm import LLM


def decode_video_frames(video_path, num_frames):
decode_video_frames is defined twice. Could it be imported from examples/bench_videonsa.py instead?
I don't think test_infer should depend on that script.
struct MultiModalMetadata {
    std::optional<std::vector<size_t>> image_req_ids;
    // Flattened [start, end) token ranges in the current packed language sequence.
    std::optional<std::vector<size_t>> visual_token_ranges;
Until now, attnmeta has always been computed on the Python side, then passed to C++ and stored in the global attnmeta.
This visual_token_ranges, however, is assigned in C++, which differs from the existing default behavior.
auto batched_grids = grid_tensors.size() == 1 ? grid_tensors.front() : infinicore::op::cat(grid_tensors, 0);
auto batched_vision_hidden = visual_->forward(batched_pixels, batched_grids);

std::vector<size_t> visual_token_ranges;
Please double-check whether visual_token_ranges should be computed in python/infinilm/processors/videonsa_processor.py and passed in, rather than computed on the fly in C++ via to_cpu.
If so, the bound Input struct needs to be modified, with the new variable added in the last position.
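To make the suggestion above concrete, here is a minimal sketch of how flattened [start, end) ranges could be derived on the Python side before being handed to C++. The function name `compute_visual_token_ranges` and the `visual_token_id` argument are hypothetical and do not come from the PR; the actual processor may derive the ranges differently (for example from grid sizes rather than by scanning token ids).

```python
from typing import List, Optional, Sequence


def compute_visual_token_ranges(
    input_ids: Sequence[int], visual_token_id: int
) -> List[int]:
    """Return flattened [start, end) ranges of consecutive visual tokens.

    Hypothetical helper: `visual_token_id` is whatever placeholder id the
    tokenizer uses for visual tokens (an assumption, not taken from the PR).
    """
    ranges: List[int] = []
    start: Optional[int] = None
    for pos, tok in enumerate(input_ids):
        if tok == visual_token_id:
            if start is None:
                start = pos  # a new run of visual tokens begins here
        elif start is not None:
            ranges.extend((start, pos))  # close the run; end is exclusive
            start = None
    if start is not None:
        # The sequence ended while still inside a visual run.
        ranges.extend((start, len(input_ids)))
    return ranges


# Two runs of the placeholder id 9: positions [1, 3) and [4, 5).
example = compute_visual_token_ranges([1, 9, 9, 2, 9], visual_token_id=9)
```

With this shape, the C++ side would only need to consume a plain list of integers, which matches how the other metadata is described as being passed in.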
def normalize_hf_config_for_infinilm(config_dict, model_path):
    model_type = config_dict.get("model_type")

    if model_type == "qwen2_5_vl" and config_dict.get("architectures") == [
What does this mean? Why is "model_type" being changed from qwen2_5_vl to videonsa?
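For readers following this thread, the pattern being questioned can be sketched roughly as below. This is not the PR's implementation: `remap_model_type` and the architecture name `ExampleArchitecture` are hypothetical stand-ins, since the architecture list is cut off in the diff hunk above. The sketch only shows the general idea of rewriting `model_type` when both the type and the architecture list match, without mutating the caller's dict.

```python
import copy
from typing import Any, Dict, List


def remap_model_type(
    config_dict: Dict[str, Any],
    source_type: str,
    architectures: List[str],
    target_type: str,
) -> Dict[str, Any]:
    """Return a copy with model_type rewritten if the config matches.

    Hypothetical helper illustrating the pattern under review; configs
    that do not match are returned unchanged.
    """
    if (
        config_dict.get("model_type") == source_type
        and config_dict.get("architectures") == architectures
    ):
        normalized = copy.deepcopy(config_dict)  # leave the input untouched
        normalized["model_type"] = target_type
        return normalized
    return config_dict


# "ExampleArchitecture" is a placeholder, not the value used in the PR.
cfg = {"model_type": "qwen2_5_vl", "architectures": ["ExampleArchitecture"]}
out = remap_model_type(cfg, "qwen2_5_vl", ["ExampleArchitecture"], "videonsa")
other = {"model_type": "llama", "architectures": ["ExampleArchitecture"]}
```

Note that nothing in this sketch needs a model path, which is consistent with the earlier question about whether model_path is used.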
        normalized["text_config"] = text_config
        return normalized

    return config_dict
Once this model is adapted, will a later qwen2_5_vl adaptation be able to reuse most of the files in the csrc/models/videonsa/ folder?
pengcheng888 left a comment
For the revised code: (1) provide the test commands and test screenshots; (2) confirm that the existing multimodal models still run.
Force-pushed from 443e44c to 92d2ee4
/retest
⛔ Only repository members can run
Force-pushed from 4121275 to 24c8323
/test
✅ Started CI workflow run 28448940484 for commit
/// Target patch sizes for each image (MiniCPM-V).
std::optional<std::vector<infinicore::Tensor>> tgt_sizes;
/// Flattened [start, end) visual token ranges in the packed language sequence.
std::optional<std::vector<size_t>> visual_token_ranges;
image_bound, tgt_sizes, and image_req_ids used to be adjacent.
visual_token_ranges is a newly added variable, so I think it would be better to put it in the last position.
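One reason for the ordering suggestion can be illustrated with a Python analogue. This is only a sketch with a hypothetical `MultiModalInputs` dataclass, not the actual C++ struct or its binding: appending a new optional field after the existing ones keeps callers that pass arguments positionally working unchanged.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MultiModalInputs:
    """Hypothetical Python stand-in for the struct under review."""

    # Pre-existing fields stay adjacent and in their original order.
    image_bound: Optional[List[object]] = None
    tgt_sizes: Optional[List[object]] = None
    image_req_ids: Optional[List[int]] = None
    # The newly added field goes last, so positional callers are unaffected.
    visual_token_ranges: Optional[List[int]] = None


# A call written before the new field existed still binds as intended.
legacy = MultiModalInputs(["bound"], ["size"], [0])

# New callers opt in explicitly by keyword.
updated = MultiModalInputs(image_req_ids=[0], visual_token_ranges=[1, 3])
```

Had the new field been inserted between tgt_sizes and image_req_ids, the third positional argument in the legacy call would have been bound to the wrong field.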
std::optional<std::vector<infinicore::Tensor>> pixel_values,
std::optional<std::vector<infinicore::Tensor>> image_bound,
std::optional<std::vector<infinicore::Tensor>> tgt_sizes,
std::optional<std::vector<size_t>> visual_token_ranges,
Same as above: should visual_token_ranges be moved further back?
Force-pushed from 24c8323 to 04ba524