Here is a benchmark comparison between zig-regex and the Rust regex crate (generated via OpenCode / DeepSeek V4 Flash Free).
While Zig wins on I/O and process startup time, zig-regex lags significantly behind in search throughput—especially for findAll.
What architectural changes or optimizations could we introduce to improve zig-regex performance for both single-match and multi-match operations, particularly for findAll?
zig-regex vs Rust regex crate — Performance Comparison
Date: 2026-06-02
Platform: Apple M4, macOS
Zig: 0.16.0
Rust: 1.96.0
Rust regex crate: 1.12.3
zig-regex: zig-utils/zig-regex @ latest (2026-06-02)
Methodology
We compare the search throughput of both engines — how fast they can count all
matches of a pattern in a haystack — excluding process startup time.
Each program:
- Reads the full haystack from stdin into memory
- Compiles the regex pattern
- Runs 1 warmup iteration (excluded from timing)
- Records a timestamp
- Runs N iterations (scanning the full haystack each time)
- Records another timestamp
- Prints total_matches iterations elapsed_ns to stdout
The elapsed time divided by iterations gives the time per full-scan.
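The per-scan arithmetic can be sketched as follows. This is a minimal illustration, not part of either benchmark program: the helper names are hypothetical, and it assumes 1 MB = 1,000,000 bytes.

```rust
/// Average wall-clock time of one full scan, in nanoseconds.
fn ns_per_scan(elapsed_ns: u128, iterations: u64) -> f64 {
    elapsed_ns as f64 / iterations as f64
}

/// Scan throughput in MB/s, assuming 1 MB = 1_000_000 bytes.
fn throughput_mb_per_s(haystack_bytes: u64, ns_per_scan: f64) -> f64 {
    // bytes/ns * 1e9 ns/s / 1e6 bytes/MB = bytes/ns * 1e3
    haystack_bytes as f64 / ns_per_scan * 1_000.0
}

/// How many times slower the `slow_ns` scan is than the `fast_ns` scan.
fn slowdown(slow_ns: f64, fast_ns: f64) -> f64 {
    slow_ns / fast_ns
}
```

For example, a 60 ms scan against a 0.51 ms scan gives `slowdown(60e6, 0.51e6)`, roughly 118, and a 0.51 ms scan of a 1.17 MB haystack corresponds to roughly 2,300 MB/s.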
Haystack
A 1.17 MB synthetic text file generated from a pool of words:
hello, world, foo123, bar456, baz789, test, regex, zig, rust
Generated with:
import random
words = ['hello', 'world', 'foo123', 'bar456', 'baz789', 'test', 'regex', 'zig', 'rust']
text = ' '.join(random.choice(words) for _ in range(200000))
Programs
Rust (/tmp/rust-bench)
use std::io::Read;
use std::time::Instant;
use regex::Regex;

fn main() {
    // Usage: rust-bench <pattern> [iterations]; the haystack is read from stdin.
    let args: Vec<String> = std::env::args().collect();
    let pattern = &args[1];
    let iterations: usize = args.get(2).and_then(|s| s.parse().ok()).unwrap_or(1);

    let mut haystack = String::new();
    std::io::stdin().read_to_string(&mut haystack).unwrap();

    let re = Regex::new(pattern).unwrap();
    let _ = re.find_iter(&haystack).count(); // warmup, excluded from timing

    let start = Instant::now();
    let mut total = 0usize;
    for _ in 0..iterations {
        total += re.find_iter(&haystack).count();
    }
    let elapsed = start.elapsed();

    // Output format from the methodology: total_matches iterations elapsed_ns, on stdout.
    println!("{} {} {}", total, iterations, elapsed.as_nanos());
}
Uses regex::Regex::find_iter — a lazy, zero-allocation iterator over match
positions. No per-match heap allocations.
Zig (/tmp/zig-bench)
Uses the zig-utils/zig-regex library, built with -Doptimize=ReleaseFast.
const std = @import("std");
const Regex = @import("regex").Regex;

pub fn main(init: std.process.Init) !void {
    const allocator = init.gpa;
    const arena = init.arena.allocator();
    const io = init.io;
    const args = try init.minimal.args.toSlice(arena);
    const pattern = args[1];
    const iterations: usize = if (args.len > 2) try std.fmt.parseInt(usize, args[2], 10) else 1;

    var stdin_buf: [4096]u8 = undefined;
    var reader = std.Io.File.stdin().reader(io, &stdin_buf);
    var result: std.ArrayList(u8) = .empty;
    defer result.deinit(allocator);
    var chunk: [4096]u8 = undefined;
    while (true) {
        const n = try reader.interface.readSliceShort(&chunk);
        if (n == 0) break;
        try result.appendSlice(allocator, chunk[0..n]);
    }
    const haystack = try result.toOwnedSlice(allocator);
    defer allocator.free(haystack);

    var regex = try Regex.compile(allocator, pattern);
    defer regex.deinit();

    const warm_matches = try regex.findAll(allocator, haystack);
    defer {
        for (warm_matches) |*m| m.deinit(allocator);
        allocator.free(warm_matches);
    }

    const start = std.Io.Clock.awake.now(io);
    var total: usize = 0;
    var i: usize = 0;
    while (i < iterations) : (i += 1) {
        const matches = try regex.findAll(allocator, haystack);
        defer {
            for (matches) |*m| m.deinit(allocator);
            allocator.free(matches);
        }
        total += matches.len;
    }
    const end = std.Io.Clock.awake.now(io);
    const elapsed = start.durationTo(end);

    var buf: [64]u8 = undefined;
    var stdout_w = std.Io.File.stdout().writer(io, &buf);
    const stdout = &stdout_w.interface;
    try stdout.print("{} {} {}\n", .{ total, iterations, elapsed.nanoseconds });
    try stdout.flush();
}
Uses Regex.findAll — allocates a Match struct (with per-group capture slices)
for every match on every call. Each call also re-initializes the internal VM.
Engine-only results (1.17 MB haystack)
Timed programs that exclude stdin reading and process startup (see source above).
| Operation | Matches | Zig | Rust | Ratio (Zig / Rust) |
|---|---|---|---|---|
| Compile hello | n/a | 6.8 µs | 2.8 µs | 2.4x |
| Single find hello | n/a | 2.0 µs | ~0.5 µs | ~4x |
| findAll hello | 21,899 | 60 ms | 0.51 ms | 118x |
| findAll hello\|world\|test | 66,576 | 173 ms | 2.87 ms | 60x |
| findAll \d+ | 66,408 | 83 ms | 3.78 ms | 22x |
| findAll \w+ | 200,000 | 83 ms | 6.35 ms | 13x |
CLI comparison (hyperfine, full process)
Comparing the zig-regex CLI vs our Rust bench binary on a 1.17 MB file,
both built with optimizations enabled:
Single match (find / find_iter with .count())
$ hyperfine -w 3 -r 30 \
-n "Rust regex (ReleaseFast)" 'cat haystack.txt | ./rust-bench hello' \
-n "Zig regex (ReleaseFast CLI)" 'cat haystack.txt | ./regex hello'
Rust (find + count): 5.3 ms
Zig (find, first match): 4.2 ms (1.25x **faster**)
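For a single find (first match), zig-regex is slightly faster than Rust
in this test, but the comparison is not like-for-like: rust-bench still counts
every match with find_iter, while the zig-regex CLI stops at the first one.
The first match is near the beginning of the file, and zig-regex finds it in
~2 µs after compile. The bulk of both runtimes is stdin I/O and process startup.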
All matches (findAll / find_iter with .count())
$ hyperfine -w 3 -r 20 \
-n "Rust find_iter" 'cat haystack.txt | ./rust-bench hello' \
-n "Zig findAll" 'cat haystack.txt | ./regex -g hello | wc -l'
Rust (find_iter): 5.3 ms
Zig (findAll): 68 ms (13x slower)
So zig-regex is fast for single matches (microseconds) but slow for findAll
(milliseconds) due to per-match VM re-init and allocation overhead (see below).
Rust compile: 2.82 ms for 1,000 compiles = 2.8 µs per compile.
Zig compile: 6.8 µs per compile, 2.4x slower because VM initialization
allocates internal state as part of the compilation path.
Why the gap?
The findAll gap (13-118x across the patterns above) comes from a few
architectural causes. Crucially, for a single find/isMatch, zig-regex is
competitive. The gap only appears when iterating over many matches, because of:
1. VM re-initialization per match
zig-regex creates a new VM instance for every match in findAll (line 312 of
src/regex.zig):
var virtual_machine = vm.VM.init(self.allocator, nfa_mut, ...);
This allocates and sets up the VM's internal state. With 22k matches, that's 22k
VM init/destroy cycles.
Rust's find_iter typically runs on a lazy DFA (a deterministic finite automaton
built on demand) whose cached state is reused from one match to the next, with
no per-match teardown.
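The idea of keeping matcher state alive across matches can be sketched like this. It is a toy, not zig-regex's VM: `Scratch`, `find_at`, and `count_all` are hypothetical names, and the "NFA" only handles a literal pattern. The point is that the thread lists are allocated once and merely cleared before each search.

```rust
/// Scratch state allocated once and reused for every search.
/// Hypothetical stand-in for a VM's internal thread lists.
struct Scratch {
    current: Vec<usize>, // active positions in the pattern
    next: Vec<usize>,
}

impl Scratch {
    fn new(pattern_len: usize) -> Self {
        Scratch {
            current: Vec::with_capacity(pattern_len + 1),
            next: Vec::with_capacity(pattern_len + 1),
        }
    }

    /// Clears both lists but keeps their capacity, so nothing is reallocated.
    fn reset(&mut self) {
        self.current.clear();
        self.next.clear();
    }
}

/// Leftmost match of the literal `pattern` at or after `from`, as (start, end).
fn find_at(
    pattern: &[u8],
    haystack: &[u8],
    from: usize,
    scratch: &mut Scratch,
) -> Option<(usize, usize)> {
    if pattern.is_empty() {
        return None; // keep the toy simple: no empty matches
    }
    scratch.reset();
    for i in from..haystack.len() {
        scratch.current.push(0); // unanchored search: start a thread here
        for &pc in &scratch.current {
            if haystack[i] == pattern[pc] {
                if pc + 1 == pattern.len() {
                    return Some((i + 1 - pattern.len(), i + 1));
                }
                scratch.next.push(pc + 1);
            }
        }
        std::mem::swap(&mut scratch.current, &mut scratch.next);
        scratch.next.clear();
    }
    None
}

/// Counts non-overlapping matches with a single `Scratch` for the whole scan.
fn count_all(pattern: &[u8], haystack: &[u8]) -> usize {
    let mut scratch = Scratch::new(pattern.len()); // the only allocation site
    let (mut count, mut pos) = (0, 0);
    while let Some((_, end)) = find_at(pattern, haystack, pos, &mut scratch) {
        count += 1;
        pos = end;
    }
    count
}
```

In this sketch the per-match cost is two `clear()` calls rather than an init/destroy cycle, which is the property a reusable VM in findAll would need.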
2. Per-match heap allocations
zig-regex's findAll allocates a Match struct (with capture group slices) on
the heap for every match:
try matches.append(allocator, Match{ .slice = ..., .captures = ... });
This means 22k heap allocations for 22k matches.
Rust's find_iter returns match offsets through a Match value type on the
stack — no heap allocations during iteration.
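A lazy iterator that yields plain offset pairs might look like the sketch below. It assumes a literal needle and uses `str::find` from the standard library as a stand-in for the regex engine; `Span`, `FindIter`, and `find_iter` are hypothetical names, not the API of either library.

```rust
/// A match as a pair of byte offsets. `Copy`, so it is returned by value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

/// Lazy iterator over non-overlapping occurrences of `needle` in `haystack`.
struct FindIter<'a> {
    needle: &'a str,
    haystack: &'a str,
    pos: usize,
}

impl<'a> Iterator for FindIter<'a> {
    type Item = Span;

    fn next(&mut self) -> Option<Span> {
        if self.needle.is_empty() {
            return None; // keep the sketch simple: no empty matches
        }
        // Search only the unscanned tail; `find` returns an offset into it.
        let offset = self.haystack.get(self.pos..)?.find(self.needle)?;
        let start = self.pos + offset;
        let end = start + self.needle.len();
        self.pos = end;
        Some(Span { start, end })
    }
}

fn find_iter<'a>(needle: &'a str, haystack: &'a str) -> FindIter<'a> {
    FindIter { needle, haystack, pos: 0 }
}
```

Counting matches with `find_iter("hello", text).count()` never builds a list of results, whereas an eager findAll has to allocate the result slice (and, with captures, per-match storage) before the caller can even ask for its length.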
3. SIMD literal prefilters
Rust's regex crate uses SIMD-accelerated literal search to skip large
non-matching regions: the memchr crate's memmem for a single literal such as
hello, and the Teddy algorithm for small sets of literals such as
hello|world|test.
zig-regex implements basic prefix search (std.mem.indexOf), which is scalar
only.
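The prefilter idea itself is independent of SIMD and can be sketched in scalar code. In the example below, `next_candidate` and `count_prefiltered` are hypothetical names, the byte scan stands in for a vectorized search, and `starts_with` stands in for running the full regex engine at a candidate position.

```rust
/// Absolute index of the first `byte` in `haystack[from..]`.
/// Scalar stand-in for a SIMD-accelerated byte search.
fn next_candidate(haystack: &[u8], from: usize, byte: u8) -> Option<usize> {
    haystack
        .get(from..)?
        .iter()
        .position(|&b| b == byte)
        .map(|i| from + i)
}

/// Counts non-overlapping occurrences of `literal`.
/// Returns (matches, verifications): the second number is how many positions
/// the expensive check actually ran on.
fn count_prefiltered(haystack: &[u8], literal: &[u8]) -> (usize, usize) {
    let Some(&first) = literal.first() else {
        return (0, 0);
    };
    let (mut matches, mut verifications, mut pos) = (0, 0, 0);
    while let Some(candidate) = next_candidate(haystack, pos, first) {
        verifications += 1;
        // Stand-in for running the regex engine at the candidate position.
        if haystack[candidate..].starts_with(literal) {
            matches += 1;
            pos = candidate + literal.len();
        } else {
            pos = candidate + 1;
        }
    }
    (matches, verifications)
}
```

On the benchmark's word pool only hello starts with h, so a first-byte prefilter for hello would hand the engine almost exclusively true candidates. The remaining cost is the scan itself, which is where a vectorized search pays off.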
Key takeaways
- Full CLI, single match (find): zig-regex (ReleaseFast CLI) took
  4.2 ms vs Rust's 5.3 ms on a 1.17 MB file. Both are dominated by I/O,
  not regex. Zig was 1.25x faster in this end-to-end test.
- Engine only, compile + single find: zig-regex is 2-4x slower in raw
  engine time (6.8 µs vs 2.8 µs compile, 2.0 µs vs ~0.5 µs find), but these
  are single-digit microseconds, which is negligible. The CLI results invert
  because Zig's I/O and startup happen to be faster in this workload.
- All matches (findAll): Rust is 13-118x faster. The bottleneck is
  zig-regex's design: it re-initializes the VM and heap-allocates a Match
  (with captures) for every single result, while Rust's find_iter is a
  zero-allocation lazy iterator. A lightweight findIter that keeps the VM
  alive between iterations would shrink this gap significantly.
- The Rust regex crate is one of the most heavily optimized regex engines
  in existence (DFA compilation, lazy DFAs, SIMD/Teddy prefilters). zig-regex
  is a younger project using Thompson NFA + backtracking without an equivalent
  optimization pipeline.