Native ELF binaries shipped with .NET have non-deterministic GNU build-id #128157

@mthalman

Summary

Every native ELF file (.so, executable, statically linked apphost) shipped in the .NET runtime contains a .note.gnu.build-id section whose value changes on every link. This makes the binaries non-reproducible: two builds of the same source produce different bytes in this section (and in the sections derived from it, described under Cascading effects).

Why this matters

A reproducibility validator that wants byte equality for shipped binaries today has only one option: pin the toolchain (clang/LLD) version that produced the original build and use the exact same version for the rebuild. That works, but it's a workaround with a structural failure mode:

  • Whenever the .NET runtime's toolchain rotates, there is a synchronization window during which the validator either (a) reproduces with the new toolchain and diffs against the original, or (b) reproduces with the old toolchain and diffs against the next batch of builds. Either way, validation breaks until the validator catches up, and the only signal that anything is wrong is a re-emergence of these well-known sections in the diff.

Evidence

Two builds of the same dotnet/dotnet VMR commit produce different .note.gnu.build-id contents in every native ELF (~24 files).

Sample: section bytes from libcoreclr.so

Both builds place .note.gnu.build-id at file offset 0x2a8, section size 36 bytes:

Build A: 04 00 00 00 14 00 00 00 03 00 00 00 47 4e 55 00
         f6 22 0d 25 fa 8d eb cb cc fd b0 52 47 4f e8 44 23 2b d0 95

Build B: 04 00 00 00 14 00 00 00 03 00 00 00 47 4e 55 00
         6c 5d 92 f6 51 5f 51 1a 0b 4b 93 6e 10 d7 df c6 03 98 8d 10

The first 16 bytes are the standard ELF Note header (namesz=4, descsz=20, type=NT_GNU_BUILD_ID=3, name="GNU\0"). The trailing 20 bytes are the build-id descriptor — completely different across the two builds, despite identical source.
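As a minimal sketch of the layout just described, the following Python decodes the two 36-byte dumps above. The function name `parse_gnu_build_id` is made up for illustration, and little-endian byte order is an assumption that fits the sample bytes.

```python
import struct

def parse_gnu_build_id(note: bytes) -> str:
    """Return the build-id stored in a .note.gnu.build-id section as hex."""
    # Note header: namesz, descsz, type as little-endian 32-bit words
    # (little-endian is an assumption that fits the dumps in this issue).
    namesz, descsz, note_type = struct.unpack_from("<III", note, 0)
    name = note[12:12 + namesz]
    if name != b"GNU\x00" or note_type != 3:  # 3 == NT_GNU_BUILD_ID
        raise ValueError("not a GNU build-id note")
    # The name is padded to a 4-byte boundary before the descriptor starts.
    desc_start = 12 + ((namesz + 3) & ~3)
    desc = note[desc_start:desc_start + descsz]
    if len(desc) != descsz:
        raise ValueError("truncated note")
    return desc.hex()

# Section bytes copied from the libcoreclr.so sample.
BUILD_A = bytes.fromhex(
    "04 00 00 00 14 00 00 00 03 00 00 00 47 4e 55 00"
    "f6 22 0d 25 fa 8d eb cb cc fd b0 52 47 4f e8 44 23 2b d0 95"
)
BUILD_B = bytes.fromhex(
    "04 00 00 00 14 00 00 00 03 00 00 00 47 4e 55 00"
    "6c 5d 92 f6 51 5f 51 1a 0b 4b 93 6e 10 d7 df c6 03 98 8d 10"
)

print(parse_gnu_build_id(BUILD_A))  # f6220d25fa8debcbccfdb052474fe844232bd095
print(parse_gnu_build_id(BUILD_B))  # 6c5d92f6515f511a0b4b936e10d7dfc603988d10
```

The two headers decode identically; only the 20-byte descriptor differs.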

All build-ids differ across two same-source builds

| File | Build A build-id | Build B build-id |
| --- | --- | --- |
| libcoreclr.so | f6220d25fa8debcbccfdb052474fe844232bd095 | 6c5d92f6515f511a0b4b936e10d7dfc603988d10 |
| libclrjit.so | 8dd92fab6e507e1b2972ff4721d31b40537471e0 | b943004add0b171196bee07c44b04b58a12eccfd |
| libhostfxr.so | 409df3ef5a63b4436c30df983a245589741cfca7 | 07a65e15afdb7a14a1cb1989ac9863c9a3be6215 |
| libhostpolicy.so | d16c0bc440b943a58fb725d32532329e004a1ec4 | d8ef29d6dfab3092839c947764a32aa0c18d650f |
| libnethost.so | 4fef6da818c5e88c49084dc8467c47b0cde6aa0d | 0392820295d7bfed90769378695a0a2ab8273705 |
| createdump | 4dfcec79088227c1919a1ebc50da4a8cac2a1162 | 1907c9bbb808b1bfa26bff591288f204ff3c311b |
| apphost | ddab5e9549858d73e9758b463953b89cb3171342 | 547041754334b02d858b44a81d1efefd12d3fd65 |
| singlefilehost | dce101890cbe88cb3a6eecef93d3c6e98442af9f | 7259ca0c659bd8bd9cac3a5952267b59194c069e |

The same pattern holds for the rest of the shipped native ELFs: libSystem.IO.Compression.Native.so, libSystem.Native.so, libSystem.Net.Security.Native.so, libSystem.Globalization.Native.so, libSystem.Security.Cryptography.Native.OpenSsl.so, libcoreclrtraceptprovider.so, libdbgshim.so, libmscordaccore.so, libmscordbi.so, and dotnet.
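A comparison like the one above could be scripted roughly as follows. This is a sketch, not the validator's actual code: the helper names (`read_elf64_section`, `build_id`, `compare_build_ids`) are hypothetical, and it assumes little-endian ELF64 files with an ordinary section header table.

```python
import struct
from pathlib import Path
from typing import Iterable, Iterator, Optional, Tuple

def read_elf64_section(data: bytes, wanted: bytes) -> Optional[bytes]:
    """Return the raw contents of the named section, or None if absent."""
    # Assumption: 64-bit (EI_CLASS == 2), little-endian (EI_DATA == 1) ELF.
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)

    def header(i: int) -> Tuple[int, int, int]:
        # Pick sh_name, sh_offset and sh_size out of section header i.
        base = e_shoff + i * e_shentsize
        (sh_name,) = struct.unpack_from("<I", data, base)
        sh_offset, sh_size = struct.unpack_from("<QQ", data, base + 0x18)
        return sh_name, sh_offset, sh_size

    # Section names live in the string table indexed by e_shstrndx.
    _, str_off, str_size = header(e_shstrndx)
    strtab = data[str_off:str_off + str_size]
    for i in range(e_shnum):
        sh_name, sh_offset, sh_size = header(i)
        end = strtab.index(b"\x00", sh_name)
        if strtab[sh_name:end] == wanted:
            return data[sh_offset:sh_offset + sh_size]
    return None

def build_id(data: bytes) -> Optional[str]:
    """Return the GNU build-id of an ELF image as hex, or None if missing."""
    note = read_elf64_section(data, b".note.gnu.build-id")
    if note is None:
        return None
    namesz, descsz, _ = struct.unpack_from("<III", note, 0)
    desc_start = 12 + ((namesz + 3) & ~3)  # name is padded to 4 bytes
    return note[desc_start:desc_start + descsz].hex()

def compare_build_ids(
    dir_a: Path, dir_b: Path, names: Iterable[str]
) -> Iterator[Tuple[str, Optional[str], Optional[str]]]:
    """Yield (name, id_a, id_b) for each file whose build-id differs."""
    for name in names:
        id_a = build_id((dir_a / name).read_bytes())
        id_b = build_id((dir_b / name).read_bytes())
        if id_a != id_b:
            yield name, id_a, id_b
```

Calling `compare_build_ids(Path("build-a"), Path("build-b"), ["libcoreclr.so", "apphost"])`, where the directory names are placeholders, would yield one tuple per file whose build-id differs between the two trees.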

Cascading effects

The build-id non-determinism cascades into two further sections:

  1. .gnu_debuglink CRC32: This section embeds a CRC32 of the separately-shipped debug file. Because the debug file's build-id differs, its CRC differs, and the live binary's .gnu_debuglink section reflects that.

    Example (libcoreclr.so, .gnu_debuglink section size 24 bytes, name libcoreclr.so.dbg):

    • Build A CRC32: 0x71609212
    • Build B CRC32: 0xe7fb9678
  2. singlefilehost embedded build-ids: singlefilehost bundles other runtime libraries and embeds their build-ids, so any drift in the bundled libraries appears here too.
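To illustrate the first cascade, here is a rough sketch of how a .gnu_debuglink section is laid out and why any byte change in the debug file changes it. Several things here are assumptions: that the checksum is the standard CRC-32 computed by `zlib.crc32`, that the CRC field is little-endian, and the stand-in "debug files", which are placeholder bytes rather than real .dbg contents. The function names are hypothetical.

```python
import struct
import zlib
from typing import Tuple

def make_debuglink(name: str, debug_file: bytes) -> bytes:
    """Build .gnu_debuglink contents: name, NUL, padding, then CRC32."""
    raw = name.encode("ascii") + b"\x00"
    raw += b"\x00" * (-len(raw) % 4)  # pad so the CRC is 4-byte aligned
    # Assumption: the checksum is the standard CRC-32 over the whole file.
    crc = zlib.crc32(debug_file) & 0xFFFFFFFF
    return raw + struct.pack("<I", crc)

def parse_debuglink(section: bytes) -> Tuple[str, int]:
    """Split .gnu_debuglink contents into (debug file name, CRC32)."""
    name_end = section.index(b"\x00")
    name = section[:name_end].decode("ascii")
    # The CRC sits at the next 4-byte boundary after the NUL terminator.
    crc_off = (name_end + 1 + 3) & ~3
    (crc,) = struct.unpack_from("<I", section, crc_off)
    return name, crc

# Placeholder debug files that differ in a single byte, standing in for
# two .dbg files whose embedded build-ids differ.
dbg_a = b"placeholder debug info \xf6"
dbg_b = b"placeholder debug info \x6c"

sec_a = make_debuglink("libcoreclr.so.dbg", dbg_a)
sec_b = make_debuglink("libcoreclr.so.dbg", dbg_b)

print(len(sec_a))                 # 24, matching the section size reported above
print(parse_debuglink(sec_a)[0])  # libcoreclr.so.dbg
print(parse_debuglink(sec_a)[1] != parse_debuglink(sec_b)[1])  # True
```

The name and padding are identical in both sections; only the trailing 4-byte CRC changes, which is the pattern seen in the Build A and Build B values.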

Context

Found while building the SDK reproducibility validation test for the dotnet/dotnet VMR (dotnet/source-build#5486). Resolution of this issue is important to meet the goal of reproducible builds: dotnet/source-build#4963
