Conversation
3f51fdf to
f37bb13
Compare
f37bb13 to
d8d82b1
Compare
|
#21 concerns the system call overhead in kbox, which has been observed to be as high as 33× compared to native (bare-metal) execution. Addressing this overhead is a prerequisite for improving overall system performance. To this end, I plan to prioritize closing #21 by exploring a solution based on binary rewriting. Given current infrastructure constraints and to ensure efficient validation, the initial implementation will focus on x86-64 and arm64 architectures, both of which are fully supported by the machines provided through GitHub Actions. After the proposed solution has been thoroughly implemented and validated on these architectures, we can proceed to extend support to RISC-V. |
src/seccomp-defs.h
Outdated
| { \ | ||
| (unsigned short) (c), (unsigned char) (t), (unsigned char) (f), \ | ||
| (unsigned int) (val) \ | ||
| } |
There was a problem hiding this comment.
Looks like unrelated style changes?
Please keep the changes to a minimum.
Otherwise, this will make git blame messy.
There was a problem hiding this comment.
This change was made because the source code wasn't properly formatted with clang-format, which caused it to be blocked by the new commit-hook. Did you mean I have to separate it to another commit?
There was a problem hiding this comment.
Ideally, a single patch should contain only one logical change.
However, it looks like this patch is doing two things:
- Fixing a pre-existing clang-format error.
- Adding RISC-V support.
So, yes, I would prefer splitting this into two patches.
src/seccomp-supervisor.c
Outdated
| const struct kbox_host_nrs *host_nrs = &HOST_NRS_X86_64; | ||
| #elif defined(__aarch64__) | ||
| #elif defined(__aarch64__) || (defined(__riscv) && __riscv_xlen == 64) | ||
| const struct kbox_host_nrs *host_nrs = &HOST_NRS_AARCH64; |
There was a problem hiding this comment.
Reusing the HOST_NRS_AARCH64 struct directly for RISC-V is semantically confusing for future maintenance. Maybe consider adding a preparatory refactoring patch to rename it to something more generic before wiring up RISC-V support?
d8d82b1 to
0000234
Compare
|
I have implemented the seccomp mode for riscv64 in this commit, and trap mode and rewrite mode are left unimplemented. This can be build with Or using cross-compile toolchain on the x86-64 machine: The seccomp mode is tested for all architectures(aarch64, x86-64, riscv64), and the result is as same as the result before this commit: If the trap mode is specified when running on riscv64 architecture, the following error message will be shown currently: I am currently working on the trap and rewrite modes for riscv64. Would you prefer I include them in this PR, or should I submit them in a follow-up PR? |
0000234 to
0000de4
Compare
This introduces the riscv64 architecture to kbox and just implements seccomp mode first. The trap mode and the rewrite mode are left unimplemented at this moment. Change-Id: Ie95e7d74dbef45a05df8c81c37362b9d119520c4
0000de4 to
0000846
Compare
This implements the trap mode for riscv64 architecture. The rewrite mode is left unimplemented at this moment. Change-Id: I3509e98008c4f7831c086d387d139d4dbab056f3
The are some arch-related implementations in the unit test. This updates them. Change-Id: Ic97b0639aa70f4499d6f5e4b95ff6b512baf0db1
|
@jserv I have implemented both the seccomp mode and the trap mode for riscv64 architecture and update the unit test. Also, I have run the unit tests, integration tests and stress tests for riscv64 architecture. For the unit test, all of the 276 tests passed. For the integration tests, there are two FAILs, signal-test and rewrite-smoke. For the stress tests, there is one FAIL, concurrent-io. The signal-test and concurrent-io failed before these commits (Even on x86-64 machine), so it should be fixed in another PR. The reason that rewrite-smoke failed is that the rewrite mode is unimplemented for riscv64. For not making this PR too big, I think it should be included in another PR. My test process is as follows. $ make ARCH=riscv64 CC=riscv64-linux-gnu-gcc
$ make ARCH=riscv64 CC=riscv64-linux-gnu-gcc check-unitPut $ ./run-tests.sh ./kbox alpine.ext4Use the following command to run the stress tests: $ ./run-stress.sh ./kbox alpine.ext4 |
|
|
||
| - x86_64 | ||
| - aarch64 | ||
| - riscv64 |
There was a problem hiding this comment.
Add notable limitation about RISC-V 64-bit target.
Agree. Update docs/ directory accordingly. |
This PR did the following things:
According to the kernel document[1], syscalls are specified in
arch/*/kernel/Makefile.syscallsandscripts/syscall.tbl.It is found that all the aarch64 host syscalls used in kbox has either common, 64 or rlimit tag in
scripts/syscall.tbl[2], which means riscv64 share the same host syscall table.Also, there are two errors found in the host syscall table. (Also according to
scripts/syscall.tbl[2]):The build script of rootfs is updated, so the following command can produce a usable alpine.ext4:
For the
fetch-lkl.shscript, the riscv64 support is added. However, the pre-built lkl binary has to be updated on GitHub (The github action has to be updated), so for now, it can only be compiled by explicitly assigning the LKL path in the config. The LKL build process is as follows (on x86-64 ubuntu 24.04):With the LKL path set in the config, the following command can successfully build the kbox:
The compiled kbox is successfully run on QEMU and execute /bin/sh.[3]
[1] https://docs.kernel.org/process/adding-syscalls.html#since-6-11
[2] https://github.com/torvalds/linux/blob/master/scripts/syscall.tbl
[3] https://hackmd.io/@rota1001/kbox-rv64#Run-it-on-riscv64
Summary by cubic
Adds riscv64 support for both seccomp and trap modes by sharing a generic 64-bit host syscall table with
aarch64and updating the Alpineriscv64build. Rewrite mode remains unsupported on riscv64.New Features
HOST_NRS_GENERIC; reuse theaarch64BPF deny list; set audit arch0xc00000f3; reportuname -mas "riscv64"; add loader entry/transfer; implement trap syscall6 and fast paths forfutexwait/wake,exit_group,execve/execveat,clone/clone3, andrt_sigprocmaskunblock.riscv64SHA256; pass-Wl,--no-relaxto speed riscv64 linking; listriscv64in README.Bug Fixes
sigaltstack(notrt_sigaltstack) in dispatch and host syscall tables.aarch64syscall numbers:getrlimit=163,statfs=43.p_memszis non-zero and smaller thanp_filesz.riscv64trap decoding, syscall IP ranges, and result handling.Written for commit 000023a. Summary will update on new commits.