This document explores alternative approaches to kernel/user mode interaction beyond traditional syscalls, as implemented in RustOS.
Traditional operating systems rely on syscalls for kernel/user mode interaction, which have several limitations:
- Context Switch Overhead: Each syscall requires expensive context switching
- Serialization: Syscalls are inherently synchronous, blocking the calling thread
- Validation Overhead: Kernel must validate every parameter on each call
- Cache Pollution: Context switches pollute CPU caches
- Large Attack Surface: Hundreds of syscalls provide many potential attack vectors
- Parameter Validation: Complex validation logic is error-prone
- Time-of-Check-Time-of-Use (TOCTOU): Race conditions between validation and use
- Privilege Escalation: Bugs in syscall handlers can lead to privilege escalation
- Rigid Interface: Difficult to extend or modify syscall interfaces
- Global Namespace: All processes share the same syscall namespace
- Coarse-Grained Permissions: File descriptor-based permissions are insufficient
RustOS introduces Capability Channels as a revolutionary alternative to syscalls:
// Instead of: fd = open("/path/to/file", O_RDWR)
// RustOS uses:
let file_cap: FileCapability = capability_runtime
.request_capability(FileSystemCapability::Open {
path: "/path/to/file".into(),
mode: OpenMode::ReadWrite,
})
.await?;┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Application │ │ Capability │ │ Kernel │
│ │ │ Runtime │ │ │
│ ┌───────────┐ │ │ ┌───────────┐ │ │ ┌───────────┐ │
│ │Capability │ │◄──►│ │ Channel │ │◄──►│ │ Service │ │
│ │ Handle │ │ │ │ Manager │ │ │ │ Provider │ │
│ └───────────┘ │ │ └───────────┘ │ │ └───────────┘ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
Replace syscalls with message passing between user processes and kernel services.
// Example: L4-style IPC
pub struct Message {
sender: ProcessId,
data: [u64; 8],
capabilities: Vec<Capability>,
}
// Send message to file system service
let response = ipc_send(filesystem_service_id, Message {
sender: current_process_id(),
data: [OPEN_FILE, path_ptr, flags, 0, 0, 0, 0, 0],
capabilities: vec![],
}).await?;- Clean separation between kernel and services
- Well-studied approach (L4, Minix, etc.)
- Flexible and extensible
- High message passing overhead
- Complex error handling
- Difficult to achieve zero-copy semantics
Influences RustOS design but pure message passing is too slow for modern workloads.
Use shared memory regions with lock-free queues for kernel/user communication.
// Shared memory ring buffer
pub struct SharedRingBuffer<T> {
buffer: *mut T,
head: AtomicUsize,
tail: AtomicUsize,
capacity: usize,
}
// Usage
let ring = SharedRingBuffer::new(kernel_shared_memory)?;
ring.push(FileOperation::Open("/path/to/file"))?;
let result = ring.pop_response().await?;- Very high performance
- No context switching for fast operations
- Can batch multiple operations
- Complex synchronization required
- ABA problems and memory ordering issues
- Difficult to debug
- Security implications of shared memory
Used internally in RustOS for high-performance subsystems but not exposed to applications.
Allow user code to run safely in kernel context using sandboxing.
// User-defined packet filter running in kernel
#[ebpf_program]
fn packet_filter(packet: &NetworkPacket) -> bool {
packet.dest_port() == 80 || packet.dest_port() == 443
}
// Register program with kernel
kernel.register_ebpf_program(PacketFilterProgram(packet_filter))?;- Eliminates kernel/user boundary for performance-critical code
- Proven approach (Linux eBPF, Windows Kernel Callback Functions)
- Safe execution through verification
- Limited expressiveness
- Complex verification requirements
- Difficult toolchain
- Not suitable for general-purpose programming
Planned for future RustOS extensions but not the primary interface.
Use modern CPU features to accelerate traditional syscalls.
// Intel CET (Control-flow Enforcement Technology) assisted syscalls
#[intel_cet_syscall]
fn fast_read(fd: i32, buffer: &mut [u8]) -> Result<usize> {
// Hardware-assisted parameter validation
// Reduced context switch overhead
kernel_read_implementation(fd, buffer)
}- Maintains familiar programming model
- Leverages hardware acceleration
- Can be optimized by CPU vendors
- Still requires context switching
- Platform-specific optimizations
- Limited by hardware capabilities
- Doesn't solve security issues
Useful optimization but doesn't address fundamental syscall limitations.
Compile application and OS into single address space, eliminating kernel/user boundary.
// Application directly links with OS services
use rustos_lib::filesystem::File;
fn main() {
// Direct function call, no syscall
let file = File::open("/path/to/file")?;
let content = file.read_to_string()?;
}- Ultimate performance - no syscalls needed
- Simple programming model
- Excellent for specialized applications
- No isolation between application and OS
- Single address space limitations
- Not suitable for multi-tenant environments
- Debugging complexity
Influences RustOS for trusted applications but not the general solution.
RustOS implements a three-tier approach combining the best aspects of multiple alternatives:
For trusted, performance-critical code:
// Zero-copy, direct invocation
let result = file_capability.read_direct(buffer).await?;For untrusted or complex operations:
let response = capability_channel
.send(FileSystemOperation::Read { offset: 0, size: 1024 })
.await?;Minimal set for bootstrapping and debugging:
// Only for initial capability acquisition
unsafe {
syscall!(ACQUIRE_INITIAL_CAPABILITIES, capability_spec_ptr)
}| Approach | Latency | Throughput | Security | Complexity |
|---|---|---|---|---|
| Traditional Syscalls | High | Low | Medium | Low |
| Pure Message Passing | High | Medium | High | High |
| Shared Memory | Low | High | Low | High |
| eBPF Sandboxing | Very Low | Very High | High | Very High |
| Hardware Fast Syscalls | Medium | Medium | Medium | Medium |
| Unikernel | Very Low | Very High | Very Low | Low |
| RustOS Capability Channels | Low | High | High | Medium |
- Principle of Least Privilege: Each capability grants only specific permissions
- Unforgeable: Capabilities cannot be created or modified by user code
- Revocable: Capabilities can be revoked at any time
- Delegatable: Capabilities can be safely shared between processes
- Type-Safe: Rust's type system prevents capability misuse
// Traditional syscall approach
let fd = unsafe { libc::open(path.as_ptr(), O_RDONLY) };
if fd < 0 { return Err(Error::from_errno()); }
let bytes_read = unsafe { libc::read(fd, buffer.as_mut_ptr(), buffer.len()) };
unsafe { libc::close(fd); }
// RustOS capability channel approach
let file_cap = capability_runtime
.request_file_capability(path, FileAccess::ReadOnly)
.await?;
let content = file_cap.read_to_end().await?;
// Capability automatically revoked when dropped- Intel MPK (Memory Protection Keys): Hardware-enforced capability boundaries
- ARM Pointer Authentication: Unforgeable capability tokens
- RISC-V Capability Hardware: Native capability support in hardware
- Capability Correctness: Mathematical proofs of capability system properties
- Information Flow Control: Formal verification of information flow policies
- Temporal Logic: Verification of capability lifecycle properties
- Batched Operations: Multiple operations in single capability invocation
- Speculative Execution: Predictive capability pre-loading
- NUMA-Aware Capabilities: Capability placement for NUMA systems
This document represents ongoing research into kernel/user mode interaction alternatives. The RustOS approach continues to evolve based on implementation experience and performance analysis.