fix(hub): prevent path traversal in sharded checkpoint loading#46392
fix(hub): prevent path traversal in sharded checkpoint loading#46392goingforstudying-ctrl wants to merge 1 commit into
Conversation
…ngface#46097) Add validation in to sanitize shard filenames from the checkpoint index JSON before constructing filesystem paths. Changes: - Reject filenames containing '..' or starting with '/' or '\' - After os.path.join, resolve paths with os.path.realpath and verify they remain within the model directory - Add comprehensive tests for path traversal, absolute paths, symlink escapes, and valid local shards This fixes a security vulnerability (CWE-22) where a malicious could reference arbitrary files outside the model directory.
|
Hi @vasqu, this PR fixes a path traversal vulnerability (CWE-22) in sharded checkpoint loading. The issue is that |
|
cc @Cyrilvallez But tbh it cannot lead to any exploit afaik #46097 (comment) so three shouldnt be any need to do anything |
|
@vasqu Thanks for the feedback and for linking the discussion. I understand the argument that malicious paths will simply fail at load time rather than succeed. However, I still believe this fix has value for a few reasons:
That said, if the maintainers' consensus is that this is unnecessary noise, I'm happy to close the PR. Just wanted to make the case for the fix. |
|
@vasqu @Cyrilvallez Thanks for the feedback. I understand the position that the current behavior is safe in practice since malicious paths will fail at load time. However, I'd still argue for merging this as a defense-in-depth measure:
That said, I'm happy to close this if the maintainers' consensus is that it's unnecessary. Just wanted to make the case. |
Fixes #46097
This PR addresses a path traversal vulnerability (CWE-22) in
get_checkpoint_shard_fileswhere a maliciousmodel.safetensors.index.jsoncould reference arbitrary files outside the model directory.Changes
..or starting with path separators (/or\)os.path.join, useos.path.realpathto resolve symlinks and verify the resolved path stays within the model directory..Security Impact
This prevents arbitrary file reads when loading models from untrusted local directories. An attacker who can provide a malicious checkpoint index could previously access:
/etc/passwd, etc.)~/.cache/huggingface/tokenTesting
All new tests pass:
4 passed in 0.28s