Aho-Corasick automaton working On Redis — A Go library for efficient multi-pattern string matching backed by Redis.
ACOR implements the Aho-Corasick algorithm for efficient multi-pattern string matching, with all data structures persisted in Redis. This enables:
- Fast pattern matching — O(n + m) complexity where n is text length and m is number of matches
- Distributed state — Share pattern dictionaries across multiple application instances
- Persistence — Pattern dictionaries survive application restarts
- Scalability — Support for Redis Sentinel, Cluster, and Ring topologies
Use cases:
- Content filtering and profanity detection
- Log analysis and keyword extraction
- Intrusion detection systems
- Search term highlighting
- Real-time text classification
Requirements:
- Go >= 1.25
- Redis >= 3.0
go get -u github.com/skyoo2003/acor

package main
import (
"fmt"
"github.com/skyoo2003/acor/pkg/acor"
)
func main() {
args := &acor.AhoCorasickArgs{
Addr: "localhost:6379",
Password: "",
DB: 0,
Name: "sample",
}
ac, err := acor.Create(args)
if err != nil {
panic(err)
}
defer ac.Close()
keywords := []string{"he", "her", "him"}
for _, k := range keywords {
if _, err := ac.Add(k); err != nil {
panic(err)
}
}
matched, err := ac.Find("he is him")
if err != nil {
panic(err)
}
fmt.Println(matched)
if err := ac.Flush(); err != nil {
panic(err)
}
}

ACOR supports standalone Redis, Sentinel, Cluster, and Ring configurations:
// Sentinel
sentinelArgs := &acor.AhoCorasickArgs{
Addrs: []string{"localhost:26379", "localhost:26380"},
MasterName: "mymaster",
Password: "",
DB: 0,
Name: "sample",
}
// Cluster
clusterArgs := &acor.AhoCorasickArgs{
Addrs: []string{"localhost:7000", "localhost:7001", "localhost:7002"},
Password: "",
Name: "sample",
}
// Ring
ringArgs := &acor.AhoCorasickArgs{
RingAddrs: map[string]string{
"shard-1": "localhost:7000",
"shard-2": "localhost:7001",
},
Password: "",
DB: 0,
Name: "sample",
}

ACOR supports two Redis schema versions:
| Version | Description | Keys per 100K keywords |
|---|---|---|
| V1 (Legacy) | Multiple keys per collection | ~500K |
| V2 (Optimized) | Fixed 3 keys per collection | 3 |
V2 is recommended for new collections and provides 50-60x faster Find() operations.
| Operation | V1 (Legacy) | V2 (Optimized) |
|---|---|---|
| Find() | O(N×3-5) RTT | 3 RTT (fixed) |
| Add() | O(M×3-10) RTT | 2-3 RTT |
# Preview migration
acor -name mycollection migrate --dry-run
# Execute migration
acor -name mycollection migrate
# Rollback to V1
acor -name mycollection migrate-rollback
# Check schema version
acor -name mycollection schema-version

ACOR supports batch operations for better performance when working with multiple keywords:
// Add multiple keywords at once
result, err := ac.AddMany([]string{"he", "her", "him", "his"}, &acor.BatchOptions{
Mode: acor.BatchModeTransactional, // or acor.BatchModeBestEffort
})
// Remove multiple keywords
result, err = ac.RemoveMany([]string{"he", "her"}, nil)
// Find matches in multiple texts
matches, err := ac.FindMany([]string{"he is him", "this is hers"})

Batch Modes:
- BatchModeBestEffort: Continues on errors, returns partial results
- BatchModeTransactional: Rolls back all changes if any error occurs
For large texts, use parallel matching to leverage multiple goroutines:
matches, err := ac.FindParallel(largeText, &acor.ParallelOptions{
Workers: 4,
Boundary: acor.ChunkBoundaryWord, // ChunkBoundaryWord, ChunkBoundaryLine, or ChunkBoundarySentence
})

Chunk boundaries ensure matches aren't split across chunks:
- ChunkBoundaryWord: Split at word boundaries (default)
- ChunkBoundaryLine: Split at line breaks
- ChunkBoundarySentence: Split at sentence endings
For distributed deployments that need both Redis persistence and local speed, use the Preset field:
ac, err := acor.Create(&acor.AhoCorasickArgs{
Addr: "localhost:6379",
Name: "my-collection",
Preset: acor.PresetBalanced,
CaseSensitive: false,
})
defer ac.Close()
ac.Add(ctx, "hello")
matches, _ := ac.Find(ctx, "hello world") // 0 RTT on hot path

Redis is the source of truth; a local preset-optimized automaton handles reads with no Redis I/O on the hot path. Cross-instance invalidation uses Redis Pub/Sub.

| Preset | Engine | Best For | Trade-off |
|---|---|---|---|
| PresetSpeed | Full DFA + flat array | Real-time packet inspection, latency-critical paths | Higher memory (states × alphabet) |
| PresetBalanced | Double-Array Trie + Banded DFA | General-purpose keyword filtering | Balanced speed and memory |
| PresetMemoryEfficient | Map-based + Bloom filter | Large-scale domain blocking, millions of patterns | Slower search |
| PresetUltimate | SIMD pre-filter + Double-Array + Banded DFA | Production systems needing max throughput | Reasonable memory with highest speed |
For read-heavy workloads with the original AhoCorasick, enable local caching to eliminate Redis round-trips:
ac, _ := acor.Create(&acor.AhoCorasickArgs{
Addr: "localhost:6379",
Name: "my-collection",
EnableCache: true,
})
// First Find() loads from Redis (3 RTT)
ac.Find("hello world")
// Subsequent Find() uses local cache (0 RTT)
ac.Find("another text")

Cache Behavior:
- Cache is invalidated via Redis Pub/Sub when any instance modifies the collection
- First Find() after invalidation reloads from Redis
- Works with Standalone, Sentinel, Cluster, and Ring topologies
ACOR provides built-in observability support:
import (
"github.com/skyoo2003/acor/pkg/acor"
"github.com/skyoo2003/acor/pkg/metrics"
"github.com/skyoo2003/acor/pkg/logging"
"github.com/skyoo2003/acor/pkg/tracing"
"github.com/skyoo2003/acor/pkg/health"
)

- Metrics: Prometheus metrics for HTTP, gRPC, and Redis operations
- Logging: Structured JSON logging with zerolog
- Tracing: OpenTelemetry distributed tracing
- Health: Kubernetes-compatible /healthz and /readyz endpoints
ACOR includes a command-line interface for common operations:
# Install
go install github.com/skyoo2003/acor/cmd/acor@latest
# Add keywords
acor -name mycollection add "keyword1" "keyword2"
# Find matches
acor -name mycollection find "sample text"
# List keywords
acor -name mycollection list

Full documentation is available at GitHub Pages.
API reference: pkg.go.dev
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
See SUPPORT.md for help channels and response times.
Please see our Security Policy for vulnerability reporting.
This project follows the Contributor Covenant Code of Conduct.
See GOVERNANCE.md for project decision-making and contribution model.
See CHANGELOG.md for version history.
Apache License 2.0 - Copyright 2016-2026 Sungkyu Yoo