[rust] Prevent path traversal in tar and pkg extraction#17668
[rust] Prevent path traversal in tar and pkg extraction#17668bonigarcia wants to merge 1 commit into
Conversation
Code Review by Qodo
1. PKG empty name accepted
|
PR Summary by QodoPrevent path traversal in tar and pkg extraction WalkthroughsDescription• Reject archive entries containing ".." path components during tar/cpio extraction. • Centralize traversal validation in a shared check_path_traversal() helper. • Fail fast with a clear error before writing any unsafe paths to disk. Diagramgraph TD
A["uncompress_tar()"] --> B["check_path_traversal()"] --> D[("Filesystem write")]
C["uncompress_pkg()"] --> B --> D
High-Level AssessmentThe following are alternative approaches to this PR: 1. Validate by containment (canonicalize joined path)
2. Use library-provided “safe unpack” APIs where available
3. Harden checks beyond ParentDir
Recommendation: The current approach (explicitly rejecting File ChangesBug fix (1)
|
| while let Some(next) = cpio_reader.next() { | ||
| let entry = next?; | ||
| let name = entry.name(); | ||
| check_path_traversal(Path::new(name))?; |
There was a problem hiding this comment.
1. check_path_traversal lacks tests 📘 Rule violation ☼ Reliability
New validation rejects archive entries containing .. components in uncompress_pkg() and uncompress_tar(), but this behavior change is not accompanied by targeted tests. This risks regressions and gaps in coverage for a security-sensitive extraction path.
Agent Prompt
## Issue description
The PR adds path traversal validation during archive extraction, but does not add tests to confirm malicious paths are rejected and safe paths still extract.
## Issue Context
The behavior change occurs in `uncompress_pkg()` and `uncompress_tar()` via `check_path_traversal()`. Because this is security-sensitive and error-path driven, tests should cover both allowed and rejected cases.
## Fix Focus Areas
- rust/src/files.rs[95-103]
- rust/src/files.rs[219-236]
- rust/src/files.rs[371-386]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| while let Some(next) = cpio_reader.next() { | ||
| let entry = next?; | ||
| let name = entry.name(); | ||
| check_path_traversal(Path::new(name))?; | ||
| let mut file = Vec::new(); | ||
| cpio_reader.read_to_end(&mut file)?; | ||
| let target_path_buf = target_path.join(name); |
There was a problem hiding this comment.
3. Absolute path escape 🐞 Bug ⛨ Security
check_path_traversal() only rejects ParentDir (..) components, so absolute/prefixed entry paths pass validation. In uncompress_pkg(), an absolute name can cause target_path.join(name) to ignore target_path and write outside the extraction directory.
Agent Prompt
## Issue description
`check_path_traversal()` currently only rejects `..` path components. Absolute paths (Unix `/etc/passwd`) and Windows-style prefixed paths (e.g., `C:\Windows\...`, UNC) can still pass validation and escape the intended extraction root when joined with `target`.
## Issue Context
`uncompress_pkg()` validates `entry.name()` with `check_path_traversal(Path::new(name))` and then writes to `target_path.join(name)`. If `name` is absolute/prefixed, the joined path may resolve outside `target`.
## Fix Focus Areas
- rust/src/files.rs[95-103]
- rust/src/files.rs[219-237]
### Suggested fix
- Update `check_path_traversal()` to reject any non-relative components, not just `ParentDir`, e.g. fail on `Component::RootDir` and `Component::Prefix(_)` as well.
- Optionally add a defensive post-join check in `uncompress_pkg()` (and any similar extraction code): after building the output path, ensure it is not absolute and is lexically under `target`.
- Add unit/integration tests (if available in this repo) covering entry names like `/tmp/evil`, `\\server\share\evil` (Windows), and `C:\evil`.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
eeda520 to
2f8fd28
Compare
|
Code review by qodo was updated up to the latest commit 2f8fd28 |
| for entry in archive.entries()? { | ||
| let mut entry_decoder = entry?; | ||
| let entry_path: PathBuf = entry_decoder.path()?.iter().skip(1).collect(); | ||
| check_path_traversal(&entry_path)?; | ||
| let entry_target = target.join(entry_path); | ||
| fs::create_dir_all(entry_target.parent().unwrap())?; | ||
| entry_decoder.unpack(entry_target)?; |
There was a problem hiding this comment.
1. Tar symlink escape 🐞 Bug ⛨ Security
uncompress_tar() only rejects .. components, then unconditionally calls entry_decoder.unpack(entry_target), which can create symlinks/hardlinks from the archive. A malicious tar can first create a symlink inside the target (e.g., link -> /etc) and then write link/passwd, escaping the extraction directory despite passing the .. check.
Agent Prompt
## Issue description
`uncompress_tar()` currently validates paths only by rejecting `..` components, but then unpacks every tar entry as-is. Tar archives can contain symlinks and hardlinks; extracting them allows later entries to write outside the intended extraction directory by writing through those links.
## Issue Context
- `check_path_traversal()` only checks for `Component::ParentDir` and does not account for link-based escapes.
- `uncompress_tar()` uses `entry_decoder.unpack(entry_target)` without checking the tar entry type or preventing symlink/hardlink creation.
## Fix Focus Areas
- rust/src/files.rs[95-103]
- rust/src/files.rs[371-386]
## What to change
1. In `uncompress_tar()`, inspect each tar entry’s type (via the tar header) before unpacking.
2. Reject (return an error) for symlink and hardlink entry types (and any other non-regular types you don’t explicitly support).
- If directory entries are expected, allow directories.
- If only regular files are expected for these tarballs, allow only regular files.
3. (Optional but stronger) Add a defense-in-depth check that the extraction path does not traverse through symlinks already present on disk (walk parent components, using `symlink_metadata`, and fail if any component is a symlink).
## Suggested tests
Add a unit test that builds a tar containing:
- an entry that creates a symlink inside the extraction directory
- a subsequent entry that writes through that symlink
and assert extraction fails (or the escaped path is not written).
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
2f8fd28 to
020978d
Compare
|
Code review by qodo was updated up to the latest commit 020978d |
| pub fn check_path_traversal(entry_path: &Path) -> Result<(), Error> { | ||
| if entry_path | ||
| .components() | ||
| .any(|c| { | ||
| matches!( | ||
| c, | ||
| std::path::Component::ParentDir | ||
| | std::path::Component::RootDir | ||
| | std::path::Component::Prefix(_) | ||
| ) | ||
| }) | ||
| { | ||
| return Err(anyhow!("Unsafe entry (path traversal): {:?}", entry_path)); | ||
| } | ||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
2. Tar.gz bypasses validation 🐞 Bug ⛨ Security
check_path_traversal() is never applied to the .gz extraction path: uncompress() dispatches GZ to untargz(), which calls Archive::unpack() directly without validating archive entry paths. As a result, tar.gz remains outside the PR’s “reject unsafe entries” behavior, undermining the stated traversal prevention for tar extraction.
Agent Prompt
### Issue description
The PR introduces `check_path_traversal()` and uses it in `uncompress_pkg()` and `uncompress_tar()`, but `.tar.gz` files are extracted via `untargz()` which does not validate entry paths.
### Issue Context
`uncompress()` routes `GZ` to `untargz()`, and `untargz()` uses `tar::Archive::unpack()` without calling `check_path_traversal()` on each entry. This means the security behavior added in this PR is not applied to tar.gz.
### Fix Focus Areas
- rust/src/files.rs[149-167]
- rust/src/files.rs[360-375]
### Suggested fix approach
- Refactor `untargz()` to iterate `archive.entries()?` and, for each entry:
- read `entry.path()?` (or equivalent),
- apply `check_path_traversal()` to that path,
- unpack the entry into the intended destination (e.g., by joining to `parent_path` after validation, or using an unpack-into-directory API if available).
- Optionally add a unit test similar to `uncompress_tar_rejects_path_traversal_entry`, but covering the `.gz` path to prevent regressions.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
020978d to
58424fb
Compare
|
Code review by qodo was updated up to the latest commit 58424fb |
| let entry_path: PathBuf = entry_decoder.path()?.iter().skip(1).collect(); | ||
| check_path_traversal(&entry_path)?; | ||
| let entry_target = target.join(entry_path); | ||
| fs::create_dir_all(entry_target.parent().unwrap())?; | ||
| entry_decoder.unpack(entry_target)?; |
There was a problem hiding this comment.
1. Tar first segment always dropped 🐞 Bug ≡ Correctness
uncompress_tar() always does iter().skip(1) on every entry path, so single-component entries (e.g., "file.txt") become an empty path and are unpacked to target itself. This can overwrite the intended extraction directory path with a file or cause extraction failures for tarballs that don't include a top-level folder.
Agent Prompt
### Issue description
`uncompress_tar()` unconditionally strips the first path component (`iter().skip(1)`), which can turn a valid tar entry path with a single component into an empty path. That causes `entry_target` to resolve to the extraction root (`target`) and `unpack()` may write a file at the `target` path or fail when the code expects `target` to be a directory.
### Issue Context
`unzip()` already has logic to strip the first component only when the archive paths actually contain a parent folder; tar extraction should behave similarly.
### Fix Focus Areas
- rust/src/files.rs[375-390]
- rust/src/files.rs[419-432]
### Suggested fix
- Mirror the `unzip()` approach:
- Count components (or otherwise detect whether there is a top-level folder).
- Only `skip(1)` when there is more than one component.
- Otherwise, keep the original path.
- Additionally, consider explicitly rejecting an empty `entry_path` for regular-file entries to avoid writing directly to `target`.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
58424fb to
35a1ad3
Compare
|
Code review by qodo was updated up to the latest commit 35a1ad3 |
| let name = entry.name(); | ||
| check_path_traversal(Path::new(name))?; | ||
| let mut file = Vec::new(); | ||
| cpio_reader.read_to_end(&mut file)?; | ||
| let target_path_buf = target_path.join(name); |
There was a problem hiding this comment.
1. Pkg empty name accepted 🐞 Bug ≡ Correctness
uncompress_pkg() validates ../absolute/prefixed paths but does not reject an empty
entry.name(), so target.join("") resolves to the extraction root and fs::write() may
create/overwrite the target path as a file instead of a directory. This can cause extraction
failures or clobber the intended extraction directory when the target path does not yet exist.
Agent Prompt
## Issue description
`uncompress_pkg()` calls `check_path_traversal(Path::new(name))`, but `check_path_traversal()` does not reject empty paths. If a CPIO entry name is `""`, then `target_path.join(name)` resolves to `target` itself, and `fs::write(&target_path_buf, file)` may create/overwrite the extraction root path as a file (especially when `target` doesn’t exist yet).
## Issue Context
The tar extraction path already explicitly rejects an empty entry path; pkg/cpio should match that safety behavior.
## Fix Focus Areas
- rust/src/files.rs[223-253]
- rust/src/files.rs[95-107]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Add validation to reject archive entries containing ParentDir (..) components in uncompress_tar() and uncompress_pkg(). This prevents malicious archives from writing files outside the intended extraction directory. The shared validation logic is refactored into check_path_traversal() to avoid duplication (DRY principle). - CVE-2025-XXXX: Path Traversal via uncompress_tar - CVE-2025-YYYY: Path Traversal via uncompress_pkg
35a1ad3 to
9169379
Compare
|
Code review by qodo was updated up to the latest commit 9169379 |
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
| } else { | ||
| path.to_path_buf() | ||
| }; | ||
| if entry_path.as_os_str().is_empty() { |
There was a problem hiding this comment.
Do we need this since 2 lines later we do it again in check_path_traversal?
🔗 Related Issues
NA
💥 What does this PR do?
Add validation to reject archive entries containing
ParentDir(..)components inuncompress_tar()anduncompress_pkg(). This prevents malicious archives from writing files outside the intended extraction directory.The shared validation logic is refactored into
check_path_traversal()to avoid duplication (DRY principle).uncompress_taruncompress_pkg🔧 Implementation Notes
🤖 AI assistance
💡 Additional Considerations
🔄 Types of changes