v0.3.0: Multi-Kernel Dispatch, Memory Pools, Global Reductions
RingKernel v0.3.0
GPU-native persistent actor model framework for Rust. This release adds multi-kernel dispatch, memory pools, global reduction primitives, and two new crates.
Highlights
- 21 crates published to crates.io - Full workspace now available
- 825+ tests across the workspace
- cudarc 0.18.2 and wgpu 27.0 support
New Features
Multi-Kernel Dispatch and Persistent Message Routing
- `#[derive(PersistentMessage)]` macro for GPU kernel dispatch
- `KernelDispatcher` component with builder pattern and metrics
- CUDA handler dispatch code generator (`CudaDispatchTable`)
- Queue tiering system (`QueueTier`, `QueueFactory`, `QueueMonitor`)
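To illustrate the routing idea behind handler dispatch, here is a minimal host-side sketch of a table keyed by message type ID, with builder-style registration and two counters as metrics. It is not the RingKernel API: `DispatchTable`, `Handler`, `with_handler`, `payload_len`, and `first_byte` are hypothetical names chosen for this example, and the real `KernelDispatcher` and `CudaDispatchTable` may be structured quite differently.

```rust
use std::collections::HashMap;

/// Hypothetical handler signature: takes a raw payload, returns a status value.
pub type Handler = fn(&[u8]) -> u32;

/// Hypothetical CPU-side analogue of a dispatch table keyed by message type ID.
#[derive(Default)]
pub struct DispatchTable {
    handlers: HashMap<u32, Handler>,
    dispatched: u64, // messages routed to a handler
    unroutable: u64, // messages with no registered handler
}

impl DispatchTable {
    /// Builder-style registration so calls can be chained.
    pub fn with_handler(mut self, type_id: u32, handler: Handler) -> Self {
        self.handlers.insert(type_id, handler);
        self
    }

    /// Route a payload to the handler registered for `type_id`, if any.
    pub fn dispatch(&mut self, type_id: u32, payload: &[u8]) -> Option<u32> {
        match self.handlers.get(&type_id).copied() {
            Some(handler) => {
                self.dispatched += 1;
                Some(handler(payload))
            }
            None => {
                self.unroutable += 1;
                None
            }
        }
    }

    /// Returns (dispatched, unroutable) counts.
    pub fn metrics(&self) -> (u64, u64) {
        (self.dispatched, self.unroutable)
    }
}

/// Example handler: reports the payload length.
pub fn payload_len(payload: &[u8]) -> u32 {
    payload.len() as u32
}

/// Example handler: reports the first payload byte, or 0 for an empty payload.
pub fn first_byte(payload: &[u8]) -> u32 {
    payload.first().copied().unwrap_or(0) as u32
}
```

Chaining `DispatchTable::default().with_handler(1, payload_len).with_handler(2, first_byte)` yields a table in which unknown type IDs are counted rather than treated as errors, which is one plausible way to surface routing problems through metrics.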
Memory Pool Management
- `StratifiedMemoryPool` with 5 size buckets (256B to 64KB)
- `AnalyticsContext` for grouped buffer lifecycle
- `PressureHandler` for memory pressure monitoring
- CUDA `ReductionBufferCache` and WebGPU `StagingBufferPool`
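A size-bucketed pool can be sketched in a few lines of host-side Rust. The example below assumes the five buckets are spaced 4x apart (256 B, 1 KB, 4 KB, 16 KB, 64 KB); the notes only give the count and the endpoints, so the intermediate sizes are a guess. `BucketPool` and its methods are hypothetical and are not the `StratifiedMemoryPool` API.

```rust
/// Assumed bucket sizes: five classes spaced 4x apart from 256 B to 64 KB.
const BUCKET_SIZES: [usize; 5] = [256, 1024, 4096, 16 * 1024, 64 * 1024];

/// Hypothetical host-side pool with one free list per size class.
pub struct BucketPool {
    free: [Vec<Vec<u8>>; 5],
    hits: u64,   // requests served from a free list
    misses: u64, // requests that needed a fresh allocation
}

impl BucketPool {
    pub fn new() -> Self {
        Self { free: Default::default(), hits: 0, misses: 0 }
    }

    /// Index of the smallest bucket that fits `len`, or None if `len` exceeds 64 KB.
    pub fn bucket_for(len: usize) -> Option<usize> {
        BUCKET_SIZES.iter().position(|&size| len <= size)
    }

    /// Hand out a buffer of the bucket's full size, reusing a freed one if possible.
    pub fn acquire(&mut self, len: usize) -> Option<Vec<u8>> {
        let idx = Self::bucket_for(len)?;
        match self.free[idx].pop() {
            Some(buf) => {
                self.hits += 1;
                Some(buf)
            }
            None => {
                self.misses += 1;
                Some(vec![0u8; BUCKET_SIZES[idx]])
            }
        }
    }

    /// Return a buffer to its bucket; buffers whose size matches no bucket are dropped.
    pub fn release(&mut self, buf: Vec<u8>) {
        if let Some(idx) = BUCKET_SIZES.iter().position(|&size| size == buf.len()) {
            self.free[idx].push(buf);
        }
    }

    /// Returns (hits, misses).
    pub fn stats(&self) -> (u64, u64) {
        (self.hits, self.misses)
    }
}
```

Rounding every request up to its bucket size wastes some memory per allocation but lets any freed buffer satisfy any later request in the same class, which is the usual trade-off behind stratified pools.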
Global Reduction Primitives
- `ReductionOp` enum: Sum, Min, Max, And, Or, Xor, Product
- `ReductionBuffer<T>` using mapped memory (zero-copy host read)
- Multi-phase kernel execution with `SyncMode` (Cooperative, SoftwareBarrier, MultiLaunch)
- PageRank example with dangling node handling
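The shape of a multi-phase reduction can be shown on the CPU. In the sketch below the enum variants mirror the operator names listed above, but the `identity`, `combine`, and `reduce` functions are hypothetical and fixed to `u32`; chunks of the input stand in for thread blocks, and the second fold stands in for the phase that runs after a grid-wide synchronization point.

```rust
/// Variant names follow the release notes; the methods below are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ReductionOp {
    Sum,
    Min,
    Max,
    And,
    Or,
    Xor,
    Product,
}

impl ReductionOp {
    /// Identity element for u32: combining it with any x yields x.
    pub fn identity(self) -> u32 {
        match self {
            ReductionOp::Sum | ReductionOp::Or | ReductionOp::Xor | ReductionOp::Max => 0,
            ReductionOp::Product => 1,
            ReductionOp::Min | ReductionOp::And => u32::MAX,
        }
    }

    /// Associative, commutative combine step (arithmetic wraps on overflow).
    pub fn combine(self, a: u32, b: u32) -> u32 {
        match self {
            ReductionOp::Sum => a.wrapping_add(b),
            ReductionOp::Min => a.min(b),
            ReductionOp::Max => a.max(b),
            ReductionOp::And => a & b,
            ReductionOp::Or => a | b,
            ReductionOp::Xor => a ^ b,
            ReductionOp::Product => a.wrapping_mul(b),
        }
    }
}

/// Two-phase reduction: each chunk reduces to a partial, then the partials are combined.
pub fn reduce(op: ReductionOp, data: &[u32], block_size: usize) -> u32 {
    // Phase 1: per-block partial results.
    let partials: Vec<u32> = data
        .chunks(block_size.max(1))
        .map(|chunk| chunk.iter().fold(op.identity(), |acc, &x| op.combine(acc, x)))
        .collect();
    // Phase 2: combine partials once every block has finished.
    partials.iter().fold(op.identity(), |acc, &p| op.combine(acc, p))
}
```

Because every operator is associative and commutative with a well-defined identity, the result does not depend on `block_size`, which is what allows the same reduction to run under different synchronization strategies.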
CUDA NVRTC Compilation
- `compile_ptx()` function for runtime CUDA compilation
- Downstream crates can compile CUDA without direct cudarc dependency
Domain System
- 20 business domains with reserved type ID ranges
- `#[message(domain = "FraudDetection")]` attribute
- Domains: GraphAnalytics, FraudDetection, ProcessIntelligence, Banking, etc.
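One way to picture reserved type ID ranges is as fixed-width blocks, one per domain. The sketch below is purely illustrative: the range width of 100, the consecutive layout in declaration order, and the `base`, `type_id`, and `of` helpers are all assumptions made for this example, and only four of the 20 domains are shown. The actual ranges used by RingKernel are not stated in these notes.

```rust
/// Assumed width of each reserved range; the real value is not given in these notes.
const RANGE_WIDTH: u32 = 100;

/// A subset of the domains named above, in an assumed order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Domain {
    GraphAnalytics,
    FraudDetection,
    ProcessIntelligence,
    Banking,
}

impl Domain {
    const ALL: [Domain; 4] = [
        Domain::GraphAnalytics,
        Domain::FraudDetection,
        Domain::ProcessIntelligence,
        Domain::Banking,
    ];

    /// First type ID of this domain's block (assumed layout: consecutive blocks).
    pub fn base(self) -> u32 {
        (self as u32) * RANGE_WIDTH
    }

    /// Global type ID for a message at `offset` within the domain, if it fits the block.
    pub fn type_id(self, offset: u32) -> Option<u32> {
        (offset < RANGE_WIDTH).then(|| self.base() + offset)
    }

    /// Reverse lookup: which domain, if any, owns `type_id`.
    pub fn of(type_id: u32) -> Option<Domain> {
        Domain::ALL.get((type_id / RANGE_WIDTH) as usize).copied()
    }
}
```

Reserving a block per domain means a type ID alone is enough to identify the owning domain, so messages can be routed or filtered without inspecting the payload.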
New Crates
- `ringkernel-montecarlo` - Philox RNG, antithetic variates, control variates, importance sampling
- `ringkernel-graph` - CSR matrix, BFS, SCC (Tarjan/Kosaraju), Union-Find, SpMV
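For readers unfamiliar with the CSR format and SpMV, the following is a small CPU reference sketch. The `Csr` struct, `from_triplets`, and `spmv` are hypothetical and written for this example; they are not the types or functions exported by `ringkernel-graph`.

```rust
/// Compressed sparse row matrix (illustrative layout, not the crate's type).
pub struct Csr {
    /// row_ptr[r]..row_ptr[r + 1] indexes the entries of row r; length is rows + 1.
    pub row_ptr: Vec<usize>,
    /// Column index of each stored entry.
    pub col_idx: Vec<usize>,
    /// Value of each stored entry.
    pub values: Vec<f64>,
}

impl Csr {
    /// Build from (row, col, value) triplets given in any order.
    pub fn from_triplets(rows: usize, triplets: &[(usize, usize, f64)]) -> Self {
        let mut sorted = triplets.to_vec();
        sorted.sort_by_key(|&(r, c, _)| (r, c));

        // Count entries per row, then prefix-sum the counts into row offsets.
        let mut row_ptr = vec![0usize; rows + 1];
        for &(r, _, _) in &sorted {
            row_ptr[r + 1] += 1;
        }
        for i in 0..rows {
            row_ptr[i + 1] += row_ptr[i];
        }

        let col_idx = sorted.iter().map(|&(_, c, _)| c).collect();
        let values = sorted.iter().map(|&(_, _, v)| v).collect();
        Csr { row_ptr, col_idx, values }
    }

    /// Sparse matrix-vector product y = A * x, one dot product per row.
    pub fn spmv(&self, x: &[f64]) -> Vec<f64> {
        (0..self.row_ptr.len() - 1)
            .map(|row| {
                (self.row_ptr[row]..self.row_ptr[row + 1])
                    .map(|k| self.values[k] * x[self.col_idx[k]])
                    .sum::<f64>()
            })
            .collect()
    }
}
```

Each output row depends only on its own slice of `col_idx` and `values`, which is why SpMV over CSR maps naturally onto one GPU thread or warp per row.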
Breaking Changes
- cudarc API updated to 0.18.2 (module loading, kernel launch builder pattern)
- wgpu API updated to 27.0 (Arc-based resources)
Installation
[dependencies]
ringkernel = "0.3.0"
# Optional backends
ringkernel-cuda = "0.3.0"
ringkernel-wgpu = "0.3.0"
Documentation
Full Changelog: v0.2.0...v0.3.0