
Benchmark: Shared Storage (Polling) vs Hooked Storage (Event-driven) #39

@IgnisDa


Context

I performed comprehensive benchmarks comparing the two primary methods of using apalis-sqlite:

  1. Shared Storage (Polling) - Using SharedSqliteStorage with standard polling
  2. Hooked Storage (Event-driven) - Using SqliteStorage::new_with_callback() with SQLite update hooks

This analysis was conducted to help choose the best approach for Ryot (a self-hosted tracker for various media types), which currently uses Shared Storage with in-memory SQLite. The benchmarks were implemented and run with LLM assistance (Claude). All benchmarks use apalis-sqlite v1.0.0-rc.2, built in release mode with LTO enabled.

Methodology

The benchmarks tested various scenarios:

  • Small workloads (100 jobs)
  • Medium workloads (1000 jobs with 1, 5, and 10 workers)
  • Large bursts (5000 and 10000 jobs)
  • Varying job durations (1ms to 100ms)
  • Different worker counts (1-10 workers)

Both memory usage and latency/throughput were measured across all scenarios.
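
For readers who want to reproduce this kind of measurement, here is a minimal sketch of timing a run and deriving throughput from it. It uses only the Rust standard library; `time_run` and `throughput_jobs_per_sec` are hypothetical helper names and are not taken from the actual benchmark code, which is not shown in this issue.

```rust
use std::time::{Duration, Instant};

/// Jobs completed per second for a run that took `elapsed`.
fn throughput_jobs_per_sec(jobs: u64, elapsed: Duration) -> f64 {
    jobs as f64 / elapsed.as_secs_f64()
}

/// Runs `work` once and returns the wall-clock time it took.
fn time_run<F: FnOnce()>(work: F) -> Duration {
    let start = Instant::now();
    work();
    start.elapsed()
}
```

A caller would wrap "enqueue N jobs and wait for all of them to finish" in the closure passed to `time_run`, then pass N and the returned duration to `throughput_jobs_per_sec`.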

Results Summary

Performance (Latency & Throughput)

Winner: Hooked Storage (Event-driven) - 6% faster on average

| Metric                  | Shared (Polling)  | Hooked (Event-driven) | Difference           |
|-------------------------|-------------------|-----------------------|----------------------|
| Average Latency         | 122.12 ms         | 114.73 ms             | -6.1% (faster)       |
| Fast Jobs (100)         | 17.19 ms          | 10.94 ms              | -36% (faster)        |
| Medium Load (1000 jobs) | 86-100 ms         | 79-89 ms              | -8% to -13% (faster) |
| Large Burst (5000 jobs) | 532 ms            | 493 ms                | -7.3% (faster)       |
| Throughput              | ~3000-4500 jobs/s | ~3000-5100 jobs/s     | Similar, up to +12%  |
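
The "Difference" column appears to be the relative change of the Hooked figure against the Shared figure. A small sketch of that arithmetic follows; the function names are hypothetical and the formula is my reading of the table, not something the benchmark code confirms.

```rust
/// Relative change of `candidate` versus `baseline`, in percent.
/// Negative values mean the candidate is lower (faster, for latency).
fn percent_change(baseline: f64, candidate: f64) -> f64 {
    (candidate - baseline) / baseline * 100.0
}

/// Rounds to one decimal place for display.
fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}
```

With the single-value rows above, `round1(percent_change(122.12, 114.73))` gives -6.1, `round1(percent_change(17.19, 10.94))` gives -36.4, and `round1(percent_change(532.0, 493.0))` gives -7.3, which matches the reported -6.1%, -36%, and -7.3%.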

Memory Usage

Winner: Shared Storage (Polling) - 2.4% lower memory (negligible difference)

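| Metric              | Shared (Polling) | Hooked (Event-driven) | Difference (Shared lower by) |
|---------------------|------------------|-----------------------|------------------------------|
| Average Peak Memory | 32.10 MB         | 32.89 MB              | 0.78 MB (2.4% lower)         |
| Average Memory      | 31.65 MB         | 32.88 MB              | 1.23 MB (3.9% lower)         |
| 10,000 jobs         | 45.64 MB         | 46.42 MB              | 0.78 MB (1.7% lower)         |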

Both methods scale linearly and stay under 50 MB even with 10,000 jobs.

Detailed Findings

Latency Comparison

Hooked storage shows lower latency across most workloads:

  • Small workloads (100 jobs): 36% faster (10.94ms vs 17.19ms)
  • Medium throughput (1000 jobs): 5-13% faster depending on worker count
  • Large bursts (5000 jobs): 7% faster with 12% higher throughput
  • Slow jobs (50-100ms duration): Nearly identical (<2% difference)

The improvement is most significant for fast, latency-sensitive jobs where immediate notification provides a clear advantage over polling.

Memory Comparison

Memory usage is nearly identical:

  • Difference: Less than 1 MB in most scenarios
  • Small workloads: Hooked uses ~16% more (about 2.9 MB more at 100 jobs), likely due to hook infrastructure
  • Medium+ workloads: Difference shrinks to 1-3% (0.2-1 MB)
  • Scaling: Both methods scale linearly (~2-3 MB per 1000 jobs)

For modern systems, this memory difference is negligible.

Technical Differences

Shared Storage (Polling)

How it works:

  • Workers poll SQLite at regular intervals
  • Uses SharedSqliteStorage to share one pool across multiple job types
  • Simple :memory: mode support
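
To make the latency cost of polling concrete, here is a small model of how long a newly enqueued job waits for the next poll tick. This is a sketch under simplifying assumptions (a single worker polling at fixed ticks starting at t = 0, and instant pickup at each tick); it does not describe how apalis-sqlite schedules its polls, and the function names are hypothetical.

```rust
/// Delay (ms) between a job being enqueued at `enqueued_at_ms` and the next
/// poll tick, for a worker that polls at t = 0, interval, 2 * interval, ...
fn polling_pickup_delay_ms(enqueued_at_ms: u64, interval_ms: u64) -> u64 {
    assert!(interval_ms > 0, "poll interval must be positive");
    let since_last_tick = enqueued_at_ms % interval_ms;
    if since_last_tick == 0 {
        0
    } else {
        interval_ms - since_last_tick
    }
}

/// Mean pickup delay over every integer arrival time within one interval.
fn mean_pickup_delay_ms(interval_ms: u64) -> f64 {
    let total: u64 = (0..interval_ms)
        .map(|t| polling_pickup_delay_ms(t, interval_ms))
        .sum();
    total as f64 / interval_ms as f64
}
```

With a hypothetical 100 ms interval (the source does not state the interval actually used), a job enqueued at t = 130 ms waits 70 ms, and the mean wait over all arrival times is 49.5 ms, roughly half the interval.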

Pros:

  • ✅ Simpler implementation
  • ✅ Works with :memory: databases
  • ✅ No additional hooks needed
  • ✅ Slightly lower memory usage (~1-3%)

Cons:

  • ❌ Higher latency (polling interval dependent)
  • ❌ May waste cycles polling when idle

Hooked Storage (Event-driven)

How it works:

  • Uses SQLite update hooks for immediate notification
  • Push-based notification system
  • Requires file:...?mode=memory&cache=shared for in-memory databases
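
As an analogy for the push-based model, the sketch below uses a standard-library channel: the worker blocks until it is notified, so there is no poll interval to wait out. SQLite's update hook (`sqlite3_update_hook` in the C API) registers a callback that SQLite invokes when a row changes; here the `send` call stands in for that callback. This is an illustration only, not the apalis-sqlite implementation, and `run_event_driven` is a hypothetical name.

```rust
use std::sync::mpsc;
use std::thread;

/// Pushes each job id through a channel to a worker that blocks until
/// notified, then returns the ids in the order the worker handled them.
fn run_event_driven(job_ids: Vec<u32>) -> Vec<u32> {
    let (notify, inbox) = mpsc::channel::<u32>();

    // The worker sleeps inside `recv()`; there is no poll interval.
    let worker = thread::spawn(move || {
        let mut handled = Vec::new();
        while let Ok(id) = inbox.recv() {
            handled.push(id);
        }
        handled
    });

    // Stand-in for the update hook firing on each insert.
    for id in job_ids {
        notify.send(id).expect("worker hung up");
    }
    drop(notify); // closing the channel lets the worker loop end

    worker.join().expect("worker panicked")
}
```

Each job is handed to the worker as soon as it is sent, which is the property that gives the event-driven approach its edge on short jobs.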

Pros:

  • ✅ Lower latency (6-36% faster)
  • ✅ Immediate job pickup
  • ✅ Better for time-sensitive operations
  • ✅ Higher burst throughput

Cons:

  • ❌ Cannot use simple :memory: mode
  • ❌ Slightly more complex
  • ❌ ~1-3% higher memory usage
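
Some background on the :memory: limitation: in SQLite, every connection opened with plain :memory: gets its own private database, whereas a named in-memory database opened through a URI with a shared cache can be seen by several connections in the same process. The helper below is a sketch of building such a URI; `shared_memory_uri` and the database name passed to it are hypothetical, and the name restriction is a conservative choice of mine rather than a SQLite rule.

```rust
/// Builds a SQLite URI for a named in-memory database with a shared cache.
/// `name` is a caller-chosen label; only ASCII letters, digits, and `_` are
/// accepted here so that the name cannot disturb the query string.
fn shared_memory_uri(name: &str) -> Option<String> {
    let ok = !name.is_empty()
        && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    ok.then(|| format!("file:{name}?mode=memory&cache=shared"))
}
```

For example, `shared_memory_uri("jobs")` yields `file:jobs?mode=memory&cache=shared`, while a name such as `a?b` is rejected.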

Recommendations

Use Shared Storage (Polling) if:

  • ✅ You prioritize simplicity and maintainability
  • ✅ Job latency of ~100-500ms is acceptable
  • ✅ You want simple :memory: mode support
  • ✅ You're sharing one pool across multiple job types
  • ✅ Most jobs are not time-critical

Use Hooked Storage (Event-driven) if:

  • ✅ You need the lowest possible latency (6-36% improvement)
  • ✅ You have many time-sensitive jobs
  • ✅ You want better burst performance
  • ✅ You're using file-based SQLite
  • ✅ Sub-100ms job pickup is important

Hybrid Approach

You could use different strategies for different job types:

  • High-priority jobs → Hooked storage for immediate execution
  • Low/Medium-priority jobs → Shared storage for simplicity
  • Cron jobs → Either works well
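
A rough sketch of how that routing decision could be expressed in code. The `Priority` and `Backend` enums and the `backend_for` function are hypothetical names for illustration and are not part of apalis-sqlite; cron jobs are left out because either backend works for them.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Priority {
    High,
    Medium,
    Low,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    /// Event-driven storage built on SQLite update hooks.
    Hooked,
    /// Polling storage sharing one pool across job types.
    Shared,
}

/// Picks a storage backend for a job type based on its priority.
fn backend_for(priority: Priority) -> Backend {
    match priority {
        Priority::High => Backend::Hooked,
        Priority::Medium | Priority::Low => Backend::Shared,
    }
}
```

Keeping the mapping in one function makes it easy to move a job type to the other backend later if its latency needs change.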

Benchmark Code

The complete benchmark implementation is available and can be shared if there's interest. It includes:

  • Comprehensive latency testing with percentiles (P50, P95, P99)
  • Memory usage tracking over time (peak and average)
  • Multiple workload scenarios (100 to 10,000 jobs)
  • Varying worker counts (1-10 workers)
  • Different job durations (1ms to 100ms)
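
Until the code is shared, here is a minimal sketch of one common way to compute such percentiles, the nearest-rank method. Whether the benchmark uses this method or an interpolating one is an assumption on my part, and `percentile` is a hypothetical function name.

```rust
/// Nearest-rank percentile: the smallest sample such that at least `p`
/// percent of all samples are less than or equal to it.
/// Returns `None` for an empty slice or a `p` outside 0..=100.
fn percentile(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Rank is 1-based; p = 0 is clamped to the first sample.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

Given 100 latency samples with values 1.0 through 100.0 in any order, this returns 50.0 for P50, 95.0 for P95, and 99.0 for P99.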

Conclusion

Both implementations are excellent and production-ready:

  • Hooked storage provides measurably better performance (6-36% lower latency)
  • Shared storage is simpler, with essentially the same memory footprint (1-3% lower)
  • The choice depends on your specific requirements for latency vs simplicity

The 0.78 MB difference in average peak memory is trivial, so performance (latency) should be the primary deciding factor when choosing between these methods.

For Ryot's use case with in-memory SQLite and mixed job priorities, the current Shared Storage implementation is performing excellently, and the 6% average latency improvement from Hooked storage doesn't justify the added complexity. Hooked storage would be a strong consideration if moving to file-based SQLite or requiring sub-100ms job pickup.


Note: This analysis was conducted for the Ryot project with LLM assistance (Claude) to systematically implement and execute the benchmarks. All measurements are from actual runs on macOS using apalis-sqlite v1.0.0-rc.2 in release mode.
