When creating CPIO archives, symlinks are discarded.
How to reproduce?
I created this simple test:
import libarchive
from ctypes import c_char_p, c_void_p


def print_archive_info(test_name, archive_bytes):
    print(test_name)
    with libarchive.memory_reader(archive_bytes) as archive:
        for entry in archive:
            print(f" entry.path: {entry.path}")
            print(f" entry.size: {entry.size}")
            print(f" entry.filetype: 0o{entry.filetype:o}")
            print(f" entry.linkpath: {entry.linkpath}")


packed_chunks = []


def write_func(data):
    packed_chunks.append(bytes(data))
    return len(data)


path = "sbin"
linkpath = "usr/sbin"
entry_size = len(linkpath)

with libarchive.custom_writer(write_func, "cpio_newc") as archive:
    archive.add_file_from_memory(
        entry_path=path,
        entry_size=len(linkpath),
        entry_data=b"",
        filetype=libarchive.entry.FileType.SYMBOLINK_LINK,
        linkpath=linkpath.encode("utf-8"),
    )
packed_bytes = b"".join(packed_chunks)
print_archive_info("libarchive standard method:", packed_bytes)

# fix that works:
packed_chunks = []
with libarchive.custom_writer(write_func, "cpio_newc") as archive:
    new_entry = libarchive.entry.ArchiveEntry(
        pathname=path,
        size=len(linkpath),
        filetype=libarchive.entry.FileType.SYMBOLINK_LINK,
        linkpath=linkpath.encode("utf-8"),
    )
    # Declare archive_entry_set_symlink(entry, linkname) for ctypes.
    libarchive.ffi.ffi('entry_set_symlink', [c_void_p, c_char_p], None)
    libarchive.ffi.entry_set_symlink(new_entry._entry_p, linkpath.encode("utf-8"))
    libarchive.ffi.write_header(archive._pointer, new_entry._entry_p)
    libarchive.ffi.write_finish_entry(archive._pointer)
packed_bytes = b"".join(packed_chunks)
print_archive_info("libarchive fix method:", packed_bytes)
Which outputs the following:
$ python3 libarchive_symlink.py
libarchive standard method:
entry.path: sbin
entry.size: 0
entry.filetype: 0o120000
entry.linkpath:
libarchive fix method:
entry.path: sbin
entry.size: 8
entry.filetype: 0o120000
entry.linkpath: usr/sbin
As you can see, when add_file_from_memory is used with the linkpath argument, the resulting symlink has size 0 and doesn't actually point to anything.
Investigations
I spent some time investigating this. Here is what happens when using add_file_from_memory:
- in the @linkpath.setter, ffi.entry_update_link_utf8(self._entry_p, value) is called
- in libarchive, this corresponds to archive_entry_update_link_utf8. As some other comments hint: "Set symlink if symlink is already set, else set hardlink". No symlink has been set on the entry yet, so at this point the link target is recorded as a hardlink.
- when the entry header is created, the CPIO writer runs into this in write_header:
      /* Non-regular files don't store bodies. */
      if (archive_entry_filetype(entry) != AE_IFREG)
          archive_entry_set_size(entry, 0);
- after that, only if the entry is a symlink does the writer take the link target's length as the size and write the linkpath
- as mentioned, our entry is a hardlink at this point, so its size gets set to 0, and no linkpath is inserted
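The steps above can be sketched as a small pure-Python model. This is only an illustration of the reasoning, not libarchive's actual code: ToyEntry, update_link, set_symlink and toy_cpio_header are hypothetical names invented for this sketch, and the real writer does much more.

```python
from dataclasses import dataclass
from typing import Optional

AE_IFREG = 0o100000  # regular file
AE_IFLNK = 0o120000  # symbolic link


@dataclass
class ToyEntry:
    """Hypothetical stand-in for a libarchive entry (not the real struct)."""
    filetype: int
    size: int = 0
    symlink: Optional[str] = None
    hardlink: Optional[str] = None

    def update_link(self, target: str) -> None:
        # Models "Set symlink if symlink is already set, else set hardlink".
        if self.symlink is not None:
            self.symlink = target
        else:
            self.hardlink = target

    def set_symlink(self, target: str) -> None:
        # Models the workaround: record the target explicitly as a symlink.
        self.symlink = target
        self.hardlink = None


def toy_cpio_header(entry: ToyEntry) -> dict:
    """Simplified model of the write_header steps described above."""
    # Non-regular files don't store bodies.
    if entry.filetype != AE_IFREG:
        entry.size = 0
    # Only a symlink target is written; its length becomes the entry size.
    linkpath = entry.symlink or ""
    if linkpath:
        entry.size = len(linkpath)
    return {"size": entry.size, "linkpath": linkpath}


broken = ToyEntry(filetype=AE_IFLNK, size=8)
broken.update_link("usr/sbin")          # ends up as a hardlink
print(toy_cpio_header(broken))          # {'size': 0, 'linkpath': ''}

fixed = ToyEntry(filetype=AE_IFLNK, size=8)
fixed.update_link("usr/sbin")
fixed.set_symlink("usr/sbin")           # converted to a symlink
print(toy_cpio_header(fixed))           # {'size': 8, 'linkpath': 'usr/sbin'}
```

The two printed results mirror the "standard method" and "fix method" outputs of the test script.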
Fix
As you can see in the test script above, I found a fix for this issue: if we call ffi.entry_set_symlink before writing the header, the entry is converted from a hardlink to a symlink, and the CPIO writer writes the right size and linkpath, as you would expect.
This entry_set_symlink function needs to be imported from the C libarchive library. entry_set_link_to_symlink is also available in the latest libarchive and would avoid having to copy the linkpath again, but older libarchive binaries don't have it (I was testing on macOS 15.7.1 and the system libarchive lacked it), which is why I used entry_set_symlink.
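One way to use the newer function when it is present and fall back otherwise could look roughly like the sketch below. The helper name bind_symlink_fixer is hypothetical, and the C symbol names are assumed to be the names above with the usual archive_ prefix; lib is assumed to be a ctypes-loaded libarchive.

```python
from ctypes import c_char_p, c_void_p


def bind_symlink_fixer(lib):
    """Return fixer(entry_p, linkpath_bytes) for a ctypes-loaded libarchive.

    Prefers archive_entry_set_link_to_symlink when the binary exports it
    (assumed symbol name), else falls back to archive_entry_set_symlink.
    """
    # Looking up a missing symbol on a ctypes library raises AttributeError,
    # so getattr with a default acts as a feature test.
    newer = getattr(lib, "archive_entry_set_link_to_symlink", None)
    if newer is not None:
        newer.argtypes = [c_void_p]
        newer.restype = None
        # The link target is already stored on the entry; no copy needed.
        return lambda entry_p, linkpath: newer(entry_p)

    older = lib.archive_entry_set_symlink
    older.argtypes = [c_void_p, c_char_p]
    older.restype = None
    return lambda entry_p, linkpath: older(entry_p, linkpath)
```

The returned callable would be invoked with the entry pointer and the encoded link target just before write_header, in place of the direct entry_set_symlink call in the test script.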
I am not sure whether this issue also affects format writers other than CPIO. If so, I think the call to entry_set_symlink could happen at the end of @linkpath.setter. If the issue is specific to the CPIO writers, then a more targeted fix should be developed.