# Benchmark: Shared Storage (Polling) vs Hooked Storage (Event-driven)

## Context

I performed comprehensive benchmarks comparing the two primary methods of using apalis-sqlite:
1. **Shared Storage (Polling)** - Using `SharedSqliteStorage` with standard polling
2. **Hooked Storage (Event-driven)** - Using `SqliteStorage::new_with_callback()` with SQLite update hooks

This analysis was conducted to help choose the best approach for [Ryot](https://github.com/IgnisDa/ryot) (a self-hosted tracker for various media types), which currently uses Shared Storage with in-memory SQLite. The benchmarking was done with LLM assistance (Claude) to systematically implement and execute the tests. All benchmarks use **apalis-sqlite v1.0.0-rc.2** built in release mode with LTO enabled.

## Methodology

The benchmarks tested various scenarios:
- Small workloads (100 jobs)
- Medium workloads (1000 jobs with 1, 5, and 10 workers)
- Large bursts (5000 and 10000 jobs)
- Varying job durations (1ms to 100ms)
- Different worker counts (1-10 workers)

Both memory usage and latency/throughput were measured across all scenarios.

## Results Summary

### Performance (Latency & Throughput)

**Winner: Hooked Storage (Event-driven)** - 6% faster on average

| Metric | Shared (Polling) | Hooked (Event-driven) | Difference |
|--------|------------------|----------------------|------------|
| **Average Latency** | 122.12 ms | 114.73 ms | **-6.1% (faster)** |
| **Fast Jobs (100)** | 17.19 ms | 10.94 ms | **-36% (faster)** |
| **Medium Load (1000 jobs)** | 86-100 ms | 79-89 ms | **-8% to -13% (faster)** |
| **Large Burst (5000 jobs)** | 532 ms | 493 ms | **-7.3% (faster)** |
| **Throughput** | ~3000-4500 jobs/s | ~3000-5100 jobs/s | **Similar to +12%** |
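
The percentages in the Difference column can be reproduced from the two latency columns. The sketch below assumes they are relative changes measured against the polling baseline; `pct_change`, `LATENCY_ROWS`, and `latency_report` are illustrative names, not part of apalis or the benchmark code.

```rust
/// Relative change of `candidate` against `baseline`, in percent.
/// A negative result means the candidate value is lower (here: faster).
fn pct_change(baseline: f64, candidate: f64) -> f64 {
    (candidate - baseline) / baseline * 100.0
}

/// Latency rows copied from the table above: (label, shared ms, hooked ms).
const LATENCY_ROWS: [(&str, f64, f64); 3] = [
    ("Average Latency", 122.12, 114.73),
    ("Fast Jobs (100)", 17.19, 10.94),
    ("Large Burst (5000 jobs)", 532.0, 493.0),
];

/// Formats each row with its relative change, rounded to one decimal place.
fn latency_report() -> Vec<String> {
    LATENCY_ROWS
        .iter()
        .map(|(label, shared, hooked)| {
            format!("{label}: {:+.1}%", pct_change(*shared, *hooked))
        })
        .collect()
}
```

Calling `latency_report()` yields -6.1%, -36.4%, and -7.3% for the three rows, which agrees with the rounded figures in the table.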

### Memory Usage

**Winner: Shared Storage (Polling)** - 2.4% lower memory (negligible difference)

| Metric | Shared (Polling) | Hooked (Event-driven) | Difference (Shared vs Hooked) |
|--------|------------------|----------------------|------------|
| **Average Peak Memory** | 32.10 MB | 32.89 MB | **0.78 MB (2.4% lower)** |
| **Average Memory** | 31.65 MB | 32.88 MB | **1.23 MB (3.9% lower)** |
| **10,000 jobs** | 45.64 MB | 46.42 MB | **0.78 MB (1.7% lower)** |

Both methods scale linearly and stay under 50 MB even with 10,000 jobs.

## Detailed Findings

### Latency Comparison

Hooked storage shows lower latency across most workloads:
- **Small workloads (100 jobs):** 36% faster (10.94ms vs 17.19ms)
- **Medium throughput (1000 jobs):** 5-13% faster depending on worker count
- **Large bursts (5000 jobs):** 7% faster with 12% higher throughput
- **Slow jobs (50-100ms duration):** Nearly identical (<2% difference)

The improvement is most significant for fast, latency-sensitive jobs where immediate notification provides a clear advantage over polling.

### Memory Comparison

Memory usage is nearly identical:
- **Difference:** Less than 1 MB in most scenarios
- **Small workloads:** Hooked uses ~16% more (2.9 MB for 100 jobs) due to hook infrastructure
- **Medium+ workloads:** Difference shrinks to 1-3% (0.2-1 MB)
- **Scaling:** Both methods scale linearly (~2-3 MB per 1000 jobs)

For modern systems, this memory difference is negligible.

## Technical Differences

### Shared Storage (Polling)
**How it works:**
- Workers poll SQLite at regular intervals
- Uses `SharedSqliteStorage` to share one pool across multiple job types
- Supports plain `:memory:` databases

**Pros:**
- ✅ Simpler implementation
- ✅ Works with `:memory:` databases
- ✅ No additional hooks needed
- ✅ Slightly lower memory usage (~1-3%)

**Cons:**
- ❌ Higher latency (depends on the polling interval)
- ❌ May waste cycles polling when idle
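
The polling model can be sketched with only the standard library. This is a toy model under stated assumptions: `Queue` is a hypothetical in-memory stand-in for the jobs table, and `poll_worker` is not apalis's worker loop or API.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the jobs table shared by producers and workers.
type Queue = Arc<Mutex<VecDeque<u32>>>;

/// Polls `queue` until `expected` jobs have been taken.
/// Returns the job ids in pickup order and the number of polls that found nothing.
fn poll_worker(queue: Queue, interval: Duration, expected: usize) -> (Vec<u32>, usize) {
    let mut done = Vec::new();
    let mut empty_polls = 0;
    while done.len() < expected {
        // The lock guard is dropped at the end of this statement,
        // so the worker never sleeps while holding the lock.
        let job = queue.lock().unwrap().pop_front();
        match job {
            Some(id) => done.push(id),
            None => {
                // Nothing to do: this wake-up was wasted, wait for the next tick.
                empty_polls += 1;
                thread::sleep(interval);
            }
        }
    }
    (done, empty_polls)
}
```

In this model a job that arrives just after an empty poll waits up to one `interval` before it is picked up, and every empty poll is a wake-up that did no work. Those are the two costs listed under Cons.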

### Hooked Storage (Event-driven)
**How it works:**
- Uses SQLite update hooks for immediate notification
- Push-based notification system
- Requires `file:...?mode=memory&cache=shared` for in-memory databases

**Pros:**
- ✅ Lower latency (6-36% faster)
- ✅ Immediate job pickup
- ✅ Better for time-sensitive operations
- ✅ Higher burst throughput

**Cons:**
- ❌ Cannot use simple `:memory:` mode
- ❌ Slightly more complex
- ❌ ~1-3% higher memory usage
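
Push-based pickup can be modeled with a channel: the worker blocks until a notification arrives instead of waking on a timer. This is a sketch of the idea only, not how apalis wires SQLite update hooks; `insert_job` and `event_worker` are hypothetical names, and the channel stands in for the hook callback.

```rust
use std::sync::mpsc::{Receiver, Sender};

/// Worker that blocks on the channel and wakes only when a job id is pushed.
/// Returns the job ids in the order they were received.
fn event_worker(rx: Receiver<u32>) -> Vec<u32> {
    let mut done = Vec::new();
    // `recv` returns Err once every Sender has been dropped, which ends the loop.
    while let Ok(id) = rx.recv() {
        done.push(id);
    }
    done
}

/// Hypothetical stand-in for an insert that triggers a notification.
/// Only the notification is modeled here; no row is actually written.
fn insert_job(notify: &Sender<u32>, id: u32) {
    notify.send(id).expect("worker has shut down");
}
```

There is no interval parameter in this model: pickup delay is bounded by thread wake-up time rather than by a timer, which is consistent with the larger gains measured for short jobs.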

## Recommendations

### Use Shared Storage (Polling) if:
- ✅ You prioritize simplicity and maintainability
- ✅ Job latency of ~100-500ms is acceptable
- ✅ You want simple `:memory:` mode support
- ✅ You're sharing one pool across multiple job types
- ✅ Most jobs are not time-critical

### Use Hooked Storage (Event-driven) if:
- ✅ You need the lowest possible latency (6-36% improvement)
- ✅ You have many time-sensitive jobs
- ✅ You want better burst performance
- ✅ You're using file-based SQLite
- ✅ Sub-100ms job pickup is important

### Hybrid Approach
You could use different strategies for different job types:
- **High-priority jobs** → Hooked storage for immediate execution
- **Low/Medium-priority jobs** → Shared storage for simplicity
- **Cron jobs** → Either works well
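
One way to express this split is a small routing function. The types below are hypothetical and exist only to illustrate the decision; they are not apalis types, and wiring each variant to an actual storage backend is left out.

```rust
/// Hypothetical job priority used to pick a storage strategy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Priority {
    High,
    Medium,
    Low,
}

/// Hypothetical label for the two storage strategies compared in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Hooked,
    Shared,
}

/// Routes high-priority jobs to hooked storage and everything else to shared storage.
fn backend_for(priority: Priority) -> Backend {
    match priority {
        Priority::High => Backend::Hooked,
        Priority::Medium | Priority::Low => Backend::Shared,
    }
}
```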

## Benchmark Code

The complete benchmark implementation is available and can be shared if there's interest. It includes:
- Comprehensive latency testing with percentiles (P50, P95, P99)
- Memory usage tracking over time (peak and average)
- Multiple workload scenarios (100 to 10,000 jobs)
- Varying worker counts (1-10 workers)
- Different job durations (1ms to 100ms)
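
The benchmark code is not reproduced here, so the following is only a sketch of how percentile figures such as P50, P95, and P99 could be derived from raw latency samples. It uses the nearest-rank method; both the function name and the choice of method are assumptions, not taken from the benchmark.

```rust
/// Nearest-rank percentile of latency samples (in milliseconds).
/// `p` must lie in the range 0..=100; returns None for empty input or invalid `p`.
fn percentile(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    // `total_cmp` gives a total order over f64, so sorting cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank: the smallest 1-based rank covering at least p% of the samples.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

With ten samples holding the values 1 through 10, `percentile(&samples, 50.0)` returns `Some(5.0)` and `percentile(&samples, 95.0)` returns `Some(10.0)`.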

## Conclusion

Both implementations performed well in these benchmarks:
- **Hooked storage** provides measurably better performance (6-36% lower latency)
- **Shared storage** is simpler and equally memory-efficient
- The choice depends on your specific requirements for latency vs simplicity

The 0.78 MB memory difference is trivial, so **latency, weighed against the added complexity, should be the primary deciding factor** when choosing between these methods.

For Ryot's use case (in-memory SQLite with mixed job priorities), the current Shared Storage implementation performs well, and the 6% average latency improvement from Hooked storage does not justify the added complexity. Hooked storage would be a strong candidate if Ryot moved to file-based SQLite or needed sub-100ms job pickup.

---

**Note:** This analysis was conducted for the [Ryot project](https://github.com/IgnisDa/ryot) with LLM assistance (Claude) to systematically implement and execute the benchmarks. All measurements are from actual runs on macOS using apalis-sqlite v1.0.0-rc.2 in release mode.

