Benchmark: Shared Storage (Polling) vs Hooked Storage (Event-driven)
Context
I performed comprehensive benchmarks comparing the two primary methods of using apalis-sqlite:
- Shared Storage (Polling): uses `SharedSqliteStorage` with standard polling
- Hooked Storage (Event-driven): uses `SqliteStorage::new_with_callback()` with SQLite update hooks
This analysis was conducted to help choose the best approach for Ryot (a self-hosted tracker for various media types), which currently uses Shared Storage with in-memory SQLite. The benchmarking was done with LLM assistance (Claude) to systematically implement and execute the tests. All benchmarks use apalis-sqlite v1.0.0-rc.2 in release mode with LTO optimization.
Methodology
The benchmarks tested various scenarios:
- Small workloads (100 jobs)
- Medium workloads (1000 jobs with 1, 5, and 10 workers)
- Large bursts (5000 and 10000 jobs)
- Varying job durations (1ms to 100ms)
- Different worker counts (1-10 workers)
Both memory usage and latency/throughput were measured across all scenarios.
Results Summary
Performance (Latency & Throughput)
Winner: Hooked Storage (Event-driven) - 6% faster on average
| Metric | Shared (Polling) | Hooked (Event-driven) | Difference |
|---|---|---|---|
| Average Latency | 122.12 ms | 114.73 ms | -6.1% (faster) |
| Fast Jobs (100) | 17.19 ms | 10.94 ms | -36% (faster) |
| Medium Load (1000 jobs) | 86-100 ms | 79-89 ms | -8-13% (faster) |
| Large Burst (5000 jobs) | 532 ms | 493 ms | -7.3% (faster) |
| Throughput | ~3000-4500 jobs/s | ~3000-5100 jobs/s | Similar to +12% |
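The Difference column can be reproduced as the relative change of the hooked figure against the shared baseline. A minimal sketch of that arithmetic follows; the function name `relative_change_pct` is hypothetical and not part of the benchmark code.

```rust
/// Relative change of `hooked` against the `shared` baseline, in percent.
/// A negative result means the hooked run had lower latency (was faster).
fn relative_change_pct(shared: f64, hooked: f64) -> f64 {
    (hooked - shared) / shared * 100.0
}
```

For example, `relative_change_pct(122.12, 114.73)` rounds to the -6.1% shown for average latency.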
Memory Usage
Winner: Shared Storage (Polling) - 2.4% lower memory (negligible difference)
| Metric | Shared (Polling) | Hooked (Event-driven) | Difference |
|---|---|---|---|
| Average Peak Memory | 32.10 MB | 32.89 MB | 0.78 MB (2.4% lower) |
| Average Memory | 31.65 MB | 32.88 MB | 1.23 MB (3.9% lower) |
| 10,000 jobs | 45.64 MB | 46.42 MB | 0.78 MB (1.7% lower) |
Both methods scale linearly and stay under 50 MB even with 10,000 jobs.
Detailed Findings
Latency Comparison
Hooked storage shows lower latency across most workloads:
- Small workloads (100 jobs): 36% faster (10.94ms vs 17.19ms)
- Medium throughput (1000 jobs): 5-13% faster depending on worker count
- Large bursts (5000 jobs): 7% faster with 12% higher throughput
- Slow jobs (50-100ms duration): Nearly identical (<2% difference)
The improvement is most significant for fast, latency-sensitive jobs where immediate notification provides a clear advantage over polling.
Memory Comparison
Memory usage is nearly identical:
- Difference: Less than 1 MB in most scenarios
- Small workloads: Hooked uses ~16% more (2.9 MB for 100 jobs) due to hook infrastructure
- Medium+ workloads: Difference shrinks to 1-3% (0.2-1 MB)
- Scaling: Both methods scale linearly (~2-3 MB per 1000 jobs)
For modern systems, this memory difference is negligible.
Technical Differences
Shared Storage (Polling)
How it works:
- Workers poll SQLite at regular intervals
- Uses `SharedSqliteStorage` to share one pool across multiple job types
- Simple `:memory:` mode support
Pros:
- ✅ Simpler implementation
- ✅ Works with `:memory:` databases
- ✅ No additional hooks needed
- ✅ Slightly lower memory usage (~1-3%)
Cons:
- ❌ Higher latency (polling interval dependent)
- ❌ May waste cycles polling when idle
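To illustrate why polling latency depends on the interval, here is a small model of pickup delay. It assumes poll ticks at fixed multiples of the interval, which is a simplification and not a description of how apalis-sqlite schedules its polls; the function name is hypothetical.

```rust
use std::time::Duration;

/// Time a job waits between its arrival and the next poll tick,
/// assuming ticks at 0, interval, 2 * interval, and so on.
/// A zero interval is treated as continuous polling (no wait).
fn polling_pickup_delay(arrival: Duration, interval: Duration) -> Duration {
    if interval.is_zero() {
        return Duration::ZERO;
    }
    let interval_ns = interval.as_nanos();
    let since_last_tick = arrival.as_nanos() % interval_ns;
    if since_last_tick == 0 {
        // The job arrived exactly on a tick and is picked up immediately.
        Duration::ZERO
    } else {
        // The remainder is smaller than the interval; the cast assumes the
        // interval itself fits in u64 nanoseconds (roughly 584 years).
        Duration::from_nanos((interval_ns - since_last_tick) as u64)
    }
}
```

Under this model a job arriving 130 ms after start with a 100 ms interval waits 70 ms, and the worst case approaches one full interval.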
Hooked Storage (Event-driven)
How it works:
- Uses SQLite update hooks for immediate notification
- Push-based notification system
- Requires `file:...?mode=memory&cache=shared` for in-memory databases
Pros:
- ✅ Lower latency (6-36% faster)
- ✅ Immediate job pickup
- ✅ Better for time-sensitive operations
- ✅ Higher burst throughput
Cons:
- ❌ Cannot use simple `:memory:` mode
- ❌ Slightly more complex
- ❌ ~1-3% higher memory usage
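The push-based idea can be sketched with standard-library primitives. This is not apalis-sqlite's implementation: a `std::sync::mpsc` channel stands in for the SQLite update hook, and `run_push_worker` is a hypothetical name used only for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Runs one worker that blocks until it is notified of a job, rather than
/// polling. Returns the job ids in the order the worker handled them.
fn run_push_worker(job_ids: Vec<u32>) -> Vec<u32> {
    let (notify, wakeups) = mpsc::channel::<u32>();

    let worker = thread::spawn(move || {
        let mut handled = Vec::new();
        // `recv` parks the thread until a notification arrives, so the
        // worker uses no CPU while idle and wakes as soon as a job is sent.
        while let Ok(id) = wakeups.recv() {
            handled.push(id);
        }
        handled
    });

    for id in job_ids {
        // Stand-in for the update hook firing when a job row is inserted.
        notify.send(id).expect("worker should still be running");
    }
    // Dropping the sender closes the channel, which ends the worker loop.
    drop(notify);

    worker.join().expect("worker thread panicked")
}
```

The worker has no sleep or interval, which is the property the benchmark attributes the lower pickup latency to.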
Recommendations
Use Shared Storage (Polling) if:
- ✅ You prioritize simplicity and maintainability
- ✅ Job latency of ~100-500ms is acceptable
- ✅ You want simple `:memory:` mode support
- ✅ You're sharing one pool across multiple job types
- ✅ Most jobs are not time-critical
Use Hooked Storage (Event-driven) if:
- ✅ You need the lowest possible latency (6-36% improvement)
- ✅ You have many time-sensitive jobs
- ✅ You want better burst performance
- ✅ You're using file-based SQLite
- ✅ Sub-100ms job pickup is important
Hybrid Approach
You could use different strategies for different job types:
- High-priority jobs → Hooked storage for immediate execution
- Low/Medium-priority jobs → Shared storage for simplicity
- Cron jobs → Either works well
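The routing above could be expressed as a simple mapping. This is a sketch with hypothetical types (`Priority`, `Strategy`, `pick_strategy`), none of which come from apalis-sqlite; the choice for cron jobs is arbitrary since either strategy works.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Priority {
    High,
    Medium,
    Low,
    Cron,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    Hooked,
    Shared,
}

/// Maps a job's priority to the storage strategy suggested in the list above.
fn pick_strategy(priority: Priority) -> Strategy {
    match priority {
        Priority::High => Strategy::Hooked,
        Priority::Medium | Priority::Low => Strategy::Shared,
        // Either strategy works for cron jobs; Shared is an arbitrary pick.
        Priority::Cron => Strategy::Shared,
    }
}
```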
Benchmark Code
The complete benchmark implementation is available and can be shared if there's interest. It includes:
- Comprehensive latency testing with percentiles (P50, P95, P99)
- Memory usage tracking over time (peak and average)
- Multiple workload scenarios (100 to 10,000 jobs)
- Varying worker counts (1-10 workers)
- Different job durations (1ms to 100ms)
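For readers who want to reproduce the percentile figures, here is one common way to compute them. It uses the nearest-rank method, which is an assumption: the benchmark's actual percentile method is not stated here, and the function name is hypothetical.

```rust
/// Nearest-rank percentile of latency samples (in milliseconds).
/// Returns `None` for an empty slice or a `p` outside 0..=100.
fn percentile(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    // `total_cmp` gives a total order on f64, so sorting cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank: the smallest 1-based rank covering at least p percent.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

Calling it with `p` set to 50.0, 95.0, and 99.0 yields P50, P95, and P99 for one run's samples.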
Conclusion
Both implementations are excellent and production-ready:
- Hooked storage provides measurably better performance (6-36% lower latency)
- Shared storage is simpler and equally memory-efficient
- The choice depends on your specific requirements for latency vs simplicity
The 0.78 MB memory difference is trivial, so performance (latency) should be the primary deciding factor when choosing between these methods.
For Ryot's use case with in-memory SQLite and mixed job priorities, the current Shared Storage implementation performs well, and the 6% average latency improvement from Hooked storage doesn't justify the added complexity. Hooked storage would be a strong consideration if moving to file-based SQLite or requiring sub-100ms job pickup.
Note: All measurements are from actual runs on macOS using apalis-sqlite v1.0.0-rc.2 in release mode.