linux-user: Avoid PT_LOAD overlap on 16K hosts #304
Open
LaurenIsACoder wants to merge 1 commit into
On hosts whose page size is larger than TARGET_PAGE_SIZE, keep using host-page ELF load granularity only when widening a PT_LOAD down to the host page boundary does not overlap another PT_LOAD.

The Claude Code x86_64 binary places the tail of an RX PT_LOAD and the start of a later RW PT_LOAD in the same 16K host page. Widening the RW PT_LOAD backward to the host page start overwrites the earlier .plt/.iplt bytes with zeros before guest execution begins. Execution then reaches 0x00 0x00, decodes it as 'add byte ptr [rax], al', and faults when the translated access touches address 0.

Fix this by selecting TARGET_PAGE_SIZE only for PT_LOADs whose host-page widening would overlap another PT_LOAD. This preserves the older host-page behavior for aligned non-overlapping data segments while preventing a later writable segment from clobbering an earlier executable tail page.

Add unit tests covering the overlap fallback, the aligned non-overlapping host-page case, and the small-p_align fallback.
Contributor
Author
@xiangzhai This PR should address the 16K-host-page Claude Code startup crash. The fix is narrowed so that we only fall back to guest-page granularity when host-page widening would overlap another PT_LOAD. If convenient, could you help verify it on your setup?
Contributor
Thanks,
Contributor
Tested: it now runs on 16K pages.
Summary
This PR fixes an ELF PT_LOAD loading bug on 16K-page LoongArch hosts running 4K-page x86_64 guests.
The issue was reproduced with the official Claude Code Linux x64 release asset claude-linux-x64.tar.gz from:
- https://claude.ai/install.sh
- https://github.com/anthropics/claude-code/releases
The root cause is not incorrect instruction translation. A later writable PT_LOAD could be widened backward to the host-page boundary and overlap the tail page of an earlier executable PT_LOAD, zeroing .plt/.iplt bytes before guest execution begins.
This PR keeps host-page behavior for aligned non-overlapping segments, and falls back to guest-page granularity only when host-page widening would overlap another PT_LOAD.
It also adds focused regression tests for:
- the overlap fallback;
- the aligned non-overlapping host-page case;
- the small-p_align fallback: a segment with a small p_align must still use TARGET_PAGE_SIZE.
English Details
Problem
The issue was reproduced with the official Claude Code x86_64 Linux native binary release asset claude-linux-x64.tar.gz.
Source:
- https://claude.ai/install.sh
- https://github.com/anthropics/claude-code/releases
On a 16K-page LoongArch kernel, the following minimal command crashed immediately:
    export LATX_AOT=0 LATX_KZT=0
    ./build64/latx-x86_64 /tmp/claude-code-test/claude --help
Before this patch:
- SIGSEGV
- 139
After this patch:
- 0
Symptom
At the crash site, execution reached guest bytes 00 00, which decode as:
    add byte ptr [rax], al
With rax == 0, the translated LoongArch memory access faulted on address 0.
That observation was correct, but it was only the symptom. The more important question was why the guest reached zero-filled bytes there at all.
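To make the decoding step concrete, here is a minimal sketch in C of why the byte pair 00 00 is a valid instruction. It relies only on the standard x86 encoding (opcode 0x00 is ADD r/m8, r8, followed by a ModRM byte); the type and function names are hypothetical and do not come from the LATX tree.

```c
#include <stdint.h>

/* Fields of an x86 ModRM byte. */
struct modrm {
    uint8_t mod; /* bits 7..6: addressing mode */
    uint8_t reg; /* bits 5..3: register operand */
    uint8_t rm;  /* bits 2..0: register or memory operand */
};

/* Split a ModRM byte into its three fields. */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = (uint8_t)(byte >> 6);
    m.reg = (uint8_t)((byte >> 3) & 7);
    m.rm = (uint8_t)(byte & 7);
    return m;
}

/*
 * Return 1 if the two bytes encode "add byte ptr [rax], al" in 64-bit
 * mode without prefixes: opcode 0x00 (ADD r/m8, r8) with mod=0 (memory
 * operand, no displacement), reg=0 (al) and rm=0 ([rax]).
 * Hypothetical helper, for illustration only.
 */
static int is_add_rax_al(const uint8_t insn[2])
{
    struct modrm m = decode_modrm(insn[1]);
    return insn[0] == 0x00 && m.mod == 0 && m.reg == 0 && m.rm == 0;
}
```

A zero-filled page is therefore a long run of this one instruction, and the first one executed with rax == 0 writes through a null pointer.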
Root Cause
The Claude binary contains two adjacent PT_LOAD segments:
- PT_LOAD (R E) ending at 0x60ec1a0
- PT_LOAD (RW) starting at 0x60ed1a0
From the guest 4K-page point of view, these are distinct guest pages:
- 0x60ec000 - 0x60ecfff
- 0x60ed000 - 0x60edfff
On a 16K host, both guest pages fall into the same host page:
- 0x60ec000 - 0x60effff
The old loader logic could choose host-page ELF loading granularity for a segment if its p_align matched the host page size. For the later writable PT_LOAD, p_align = 0x4000, so the loader considered it eligible for 16K widening and computed a segment page start of 0x60ec000 instead of the guest-page-granular 0x60ed000.
That caused the later writable segment to be loaded starting from the same 16K host page that already contained the tail of the earlier executable segment.
As a result, the earlier .plt/.iplt bytes were overwritten during ELF loading.
This was verified at a breakpoint in do_init_thread, before guest execution began: the bytes at 0x60ec080 and 0x60ec120 were already all zeros before the guest ran, proving this was a loader bug rather than a runtime corruption bug.
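The page arithmetic behind this can be sketched in a few lines of C. The helper names below are hypothetical (they are not the loader's own macros); the addresses are the ones quoted above.

```c
#include <stdint.h>

/* Round addr down to a multiple of page_size (must be a power of two). */
static uint64_t page_start(uint64_t addr, uint64_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Return 1 if addresses a and b fall into the same page of the given size. */
static int same_page(uint64_t a, uint64_t b, uint64_t page_size)
{
    return page_start(a, page_size) == page_start(b, page_size);
}
```

With the RW segment starting at 0x60ed1a0, page_start(0x60ed1a0, 0x1000) is 0x60ed000 while page_start(0x60ed1a0, 0x4000) is 0x60ec000. The last byte of the RX segment, 0x60ec19f, shares a page with 0x60ed1a0 only at the 16K size.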
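Why 4K Hosts Worked but 16K Hosts Failed
On a 4K host, the two guest pages above are also different host pages, so they do not interfere.
On a 16K host, they share one host page:
- 0x60ec000 - 0x60effff
If the later writable segment is widened backward to 0x60ec000, it overwrites the earlier executable tail page.
So:
- on a 4K host, the two segments never share a host page;
- on a 16K host, they do, and widening the later segment clobbers the earlier one.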
Design History / Risk Assessment
This area already has a non-trivial history in upstream QEMU.
In 2014, linux-user: Tell guest about big host page sizes (a70daba3771) exposed larger host page granularity via AT_PAGESZ so the guest would not assume mappings smaller than the host can honor.
In 2018, linux-user: fix ELF load alignment error (33143c446e) added the fallback to TARGET_PAGE_SIZE when a PT_LOAD's p_align is smaller than the host page size.
Later in 2018, linux-user: elf: mmap all the target-pages of hostpage for data segment (94894ff2d13) extended host-page behavior for aligned data segments so glibc could consume the remainder of the last host page.
However, the current LATX tree no longer fully preserves those original upstream assumptions:
- AT_PAGESZ is forced back to TARGET_PAGE_SIZE, rather than exposing the larger host page size;
- PT_LOADs are no longer mapped by rounding the mapping length up to the widened host-page length.
So the current LATX tree is not implementing the full original 2014/2018 upstream behavior anymore.
To reduce risk, this patch does not globally force all host > guest cases back to guest-page granularity. Instead:
- if widening a PT_LOAD down to the host-page boundary does not overlap another PT_LOAD, host-page behavior is preserved;
- only when widening would overlap another PT_LOAD, we fall back to TARGET_PAGE_SIZE for that segment.
Fix
This patch extracts page-granularity selection into:
- linux-user/elfload-pagesize.h
- linux-user/elfload-pagesize.c
The new logic is:
1. Determine whether the current PT_LOAD is eligible for host-page alignment based on p_align.
2. If it is eligible, check whether widening it downward to the host page start would overlap the memory range of any other PT_LOAD.
3. If widening would overlap another PT_LOAD, use TARGET_PAGE_SIZE. Otherwise, keep host-page granularity.
This does not remove host-page behavior in general. It only prevents a later segment from extending into bytes that belong to an earlier one.
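The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in linux-user/elfload-pagesize.c: the struct, the function names, and the exact eligibility rule (p_align at least as large as the host page) are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one PT_LOAD program header. */
struct load_seg {
    uint64_t vaddr; /* p_vaddr */
    uint64_t memsz; /* p_memsz */
    uint64_t align; /* p_align */
};

/* Round addr down to a multiple of page_size (a power of two). */
static uint64_t round_down(uint64_t addr, uint64_t page_size)
{
    return addr & ~(page_size - 1);
}

/* Choose the load granularity for segs[idx] (illustrative sketch only). */
static uint64_t pick_load_page_size(const struct load_seg *segs, size_t n,
                                    size_t idx, uint64_t host_page,
                                    uint64_t target_page)
{
    const struct load_seg *cur = &segs[idx];
    uint64_t widened_start;
    size_t i;

    /* Step 1 (assumed rule): small p_align is never eligible. */
    if (host_page <= target_page || cur->align < host_page) {
        return target_page;
    }

    /* Step 2: widening adds the bytes [widened_start, cur->vaddr). */
    widened_start = round_down(cur->vaddr, host_page);
    for (i = 0; i < n; i++) {
        uint64_t other_start = segs[i].vaddr;
        uint64_t other_end = segs[i].vaddr + segs[i].memsz;

        if (i == idx || segs[i].memsz == 0) {
            continue;
        }
        /* Step 3: another segment owns some of those bytes, so fall back. */
        if (other_start < cur->vaddr && other_end > widened_start) {
            return target_page;
        }
    }

    /* No overlap: keep host-page granularity. */
    return host_page;
}
```

For an RX segment ending at 0x60ec1a0 and an RW segment at 0x60ed1a0 with p_align = 0x4000, this sketch returns 0x1000 for the RW segment on a 16K host. When the previous segment ends below the widened start, it keeps 0x4000.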
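16K Host Page Layout
Affected host page: 0x60ec000 - 0x60effff
Before the fix: the RW PT_LOAD was widened down to 0x60ec000, covering the tail page of the RX PT_LOAD.
Observed broken bytes before guest execution: the bytes at 0x60ec080 and 0x60ec120 were all zeros.
After the fix: the RW PT_LOAD is loaded at guest-page granularity, starting at 0x60ed000.
Observed correct bytes at do_init_thread after the fix: the bytes match the .plt/.iplt contents of the binary file.
Tests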
The current LATX tree did not contain an existing automated regression test for this historical design area, so this PR adds a focused unit test:
- tests/unit/test-elfload-pagesize.c
It covers three cases:
- the overlap fallback;
- the aligned non-overlapping host-page case;
- the small-p_align fallback: a segment with a small p_align must still use TARGET_PAGE_SIZE.
Test result:
The original reproducer was also rerun successfully:
Result
This PR fixes the Claude Code startup crash on 16K-page LoongArch hosts caused by overlapping PT_LOAD widening during ELF loading.
After the fix:
- .plt/.iplt is no longer zeroed during load;
- execution no longer reaches 00 00 -> add [rax], al;
- Claude --help runs successfully;
- the older non-overlapping host-page-aligned behavior is preserved.
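Chinese Details (translated)
Problem Overview
This PR fixes incorrect handling of adjacent PT_LOAD segments during the linux-user ELF loading stage on LoongArch hosts with 16K pages.
The problem was first reproduced with the official Claude Code x86_64 Linux native binary. The release asset is named claude-linux-x64.tar.gz.
Source:
- https://claude.ai/install.sh
- https://github.com/anthropics/claude-code/releases
On a LoongArch Linux kernel with 16K pages, the following minimal command segfaults immediately:
    export LATX_AOT=0 LATX_KZT=0
    ./build64/latx-x86_64 /tmp/claude-code-test/claude --help
Before the fix:
- SIGSEGV
- 139
After the fix:
- 0
Immediate Symptom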
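The crash occurred while executing guest bytes 00 00, which decode as:
    add byte ptr [rax], al
At that point rax == 0, and the translated LoongArch memory access faulted at address 0.
The observation itself is correct, but it is only the symptom. The real question is why the guest executes a region that should not contain zeros.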
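Root Cause Analysis
The Claude binary has two adjacent PT_LOADs:
- PT_LOAD (R E), ending at 0x60ec1a0
- PT_LOAD (RW), starting at 0x60ed1a0
From the guest 4K-page point of view, they are:
- 0x60ec000 - 0x60ecfff
- 0x60ed000 - 0x60edfff
This is a valid layout.
On a 16K host, however, both guest 4K pages fall into the same host page:
- 0x60ec000 - 0x60effff
When p_align satisfied host-page alignment, the old logic computed the PAGESTART of a PT_LOAD at host-page granularity. For the later RW PT_LOAD, p_align = 0x4000, so it obtained 0x60ec000 rather than 0x60ed000, the more reasonable value under guest 4K semantics.
As a result, loading the later writable segment stepped back onto the first part of the same 16K page that holds the earlier executable segment, and overwrote .plt/.iplt.
I checked at a breakpoint in do_init_thread and confirmed that, before the guest started executing, the locations
- 0x60ec080
- 0x60ec120
which belong to .plt/.iplt, were already all 0x00 before the fix.
In other words, this is a loader bug rather than runtime corruption.
Execution then entered this page of zero bytes (00 00), which decode normally under x86_64 semantics as add byte ptr [rax], al, and finally triggered the segfault with rax == 0.
Why the 4K Kernel Works but the 16K Kernel Fails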
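On a 4K host:
- 0x60ec000 - 0x60ecfff
- 0x60ed000 - 0x60edfff
are two different host pages to begin with, so they cannot overwrite each other.
On a 16K host:
- 0x60ec000 - 0x60effff
is a single host page.
If the later PT_LOAD is wrongly extended downward to 0x60ec000, it overwrites the tail page of the earlier segment.
Therefore:
- on a 4K host, the two segments never share a host page;
- on a 16K host, they do, and the later segment overwrites the earlier one.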
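Design History / Risk Assessment
This area already has a fairly complicated history in upstream QEMU.
In 2014, linux-user: Tell guest about big host page sizes (a70daba3771) told the guest about the larger host page granularity through AT_PAGESZ.
In 2018, linux-user: fix ELF load alignment error (33143c446e) added the fallback to TARGET_PAGE_SIZE when a PT_LOAD's p_align is smaller than the host page size.
Later in 2018, linux-user: elf: mmap all the target-pages of hostpage for data segment (94894ff2d13) went on to preserve host-page behavior for aligned data segments, so that glibc does not fail when consuming the last host page.
However, the current LATX tree no longer fully keeps the premises upstream relied on at the time:
- AT_PAGESZ is now fixed to TARGET_PAGE_SIZE and no longer tells the guest about the larger host page;
- the PT_LOAD loading path no longer mmaps the whole widened host-page length.
So the current LATX tree is not fully following the 2014/2018 upstream "large host page semantics" design.
To reduce risk, this PR does not bluntly push every host > guest scenario back to guest pages. It handles the problem more narrowly:
- if a PT_LOAD, once extended downward to the host page, does not overlap any other PT_LOAD, host-page behavior is kept;
- only when it would overlap another PT_LOAD does it fall back to TARGET_PAGE_SIZE.
Fix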
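This fix extracts the page-granularity decision into a standalone helper:
- linux-user/elfload-pagesize.h
- linux-user/elfload-pagesize.c
The new logic is:
1. First decide, by the existing rule, whether the current segment is eligible for the large host page, that is, whether p_align satisfies the host-page alignment requirement.
2. If it is eligible, additionally check whether the widened range intersects the memory range of another PT_LOAD.
3. If they intersect, use TARGET_PAGE_SIZE. If they do not, keep host-page granularity.
In other words, this PR does not change "large host page behavior itself". It prevents host-page alignment from wrongly extending a later segment into an earlier one.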
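Layout Within the 16K Page
The affected host page is: 0x60ec000 - 0x60effff
Before the fix: the later RW PT_LOAD was extended down to 0x60ec000.
Contents actually observed before the fix: the bytes at 0x60ec080 and 0x60ec120 were all 0x00.
After the fix: the RW PT_LOAD is loaded at guest-page granularity.
After the fix, the bytes inspected before do_init_thread match .plt/.iplt in the binary file, showing that the loading stage no longer wrongly zeroes this page.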
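Tests
The current tree had no existing automated test covering this historical design issue, so this PR adds a minimal unit test:
- tests/unit/test-elfload-pagesize.c
It covers 3 scenarios:
- the overlap fallback;
- the aligned non-overlapping host-page case;
- when p_align is smaller than the host page size, the loader still falls back to the guest page.
Run result:
The original reproducer was also re-verified successfully: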
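Result
This PR fixes the startup crash of the Claude Code x86_64 binary on 16K LoongArch hosts, caused by a later adjacent PT_LOAD stepping back onto an earlier one.
After the fix:
- .plt/.iplt is no longer zeroed during load;
- execution no longer reaches 00 00 -> add [rax], al;
- Claude --help runs normally.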