Fix tensor type mismatch by pre-scanning file metadata before allocation #6
Draft
…ation

This change addresses the issue where allocated device memory doesn't match the tensor type from the file. The fix implements a two-pass approach:

1. Pre-scan: Read all tensor metadata and data from the file upfront
2. Create tensors using the actual types from the file (not assumed types)
3. Load weights from the pre-scanned buffer

This ensures that memory allocation on the device corresponds to the correct tensor type from the file, fixing mixed precision model loading.

Co-authored-by: lhpqaq <63844184+lhpqaq@users.noreply.github.com>
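To make the mismatch concrete, here is a minimal sketch of why allocating with an assumed type goes wrong. The enum and helper names (`sketch_type`, `tensor_nbytes`, `allocation_matches`) are hypothetical and are not taken from the PR; only the element sizes of 32-bit and 16-bit floats are facts.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the tensor element types found in a model file.
enum class sketch_type { f32, f16 };

// Bytes per element for each type (4 for 32-bit floats, 2 for 16-bit floats).
size_t type_size(sketch_type t) {
    return t == sketch_type::f32 ? 4 : 2;
}

// Bytes needed to hold n_elements of the given type.
size_t tensor_nbytes(sketch_type t, size_t n_elements) {
    return type_size(t) * n_elements;
}

// True when a buffer sized for the assumed type has exactly the size
// required by the type actually stored in the file.
bool allocation_matches(sketch_type assumed, sketch_type in_file, size_t n_elements) {
    return tensor_nbytes(assumed, n_elements) == tensor_nbytes(in_file, n_elements);
}
```

In a mixed precision model, a tensor assumed to be 16-bit but stored as 32-bit needs twice the bytes that were reserved, which is why the types have to be known before allocation.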
- Move EOF check before reading partial header data
- Add comment explaining temporary memory doubling tradeoff
- Add warning when tensor not found in pre-scanned data
- Use name.c_str() consistently instead of name.data()

Co-authored-by: lhpqaq <63844184+lhpqaq@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor memory management for tensor types" to "Fix tensor type mismatch by pre-scanning file metadata before allocation" on Jan 18, 2026
Owner
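@copilot Scanning the file twice is definitely not acceptable, since it would double the model loading time. Please consider a different approach. Could the "Allocated device memory" step be deferred until later?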
Instead of pre-scanning all tensor data into memory (which doubles memory usage), this change:

1. Adds skip, seek, and tell functions to whisper_model_loader interface
2. First pass: scans tensor metadata and skips data using seek (no memory)
3. Creates tensors with correct types from file before buffer allocation
4. Seeks back to tensor section for second pass data loading
5. Maintains backwards compatibility for loaders without seek support

This addresses the reviewer feedback to avoid doubling memory usage while still fixing the tensor type mismatch issue.

Co-authored-by: lhpqaq <63844184+lhpqaq@users.noreply.github.com>
Author
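Implemented a better approach that defers device memory allocation. Instead of loading all tensor data into memory (which doubles memory usage), this solution scans only the tensor metadata in a first pass, skips the data using seek, and creates tensors with the types from the file before the buffer is allocated.
This avoids the memory doubling and maintains backwards compatibility for loaders without seek support. Commit: 9422867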
skip, seek, tell function pointers to whisper_model_loader interface