Speed-up reduce view merging in fabric by nickva · Pull Request #6046 · apache/couchdb

nickva · 2026-06-21T05:53:29Z

Previously, find_next_key/5 reduce collector [1] effectively ran hd(lists:sort(...)) to get the next key. It was spending time sorting, then throwing everything away except the head. For the next row, it was sorting again, throwing everything away, and taking only the head, and so on... Needless to say, that's wasteful and we should simply find the minimum and return that.

As an another optimization replace the dict with a map. Maps are more efficient and generally faster.

Also, add extra unit tests asserting ICU collation behavior actually takes place. In a few cases explicitly assert that the new behavior is the same as the previous hd(lists:sort(...)) behavior.

To benchmark used a simple reduce view with a built-in _count reducer [2]. For 100k docs, Q=8 this optimization leads to an ~2x speedup in view querying. Theoretically it would be an O(n log n) -> O(n) speedup.

PR

./reduce_group_bench.py --ndocs 100000 --q 8 --reps 7
db=reduce_group_bench q=8 n=1 ndocs=100000 group=true
loading...
  loaded 100000 docs in 4.4s
building index...
  built in 4.5s
timing group=true query x7...
  rows=100000
  min=5644.0ms  median=5786.6ms  max=6108.0ms

MAIN

 ./reduce_group_bench.py --ndocs 100000 --q 8 --reps 7
db=reduce_group_bench q=8 n=1 ndocs=100000 group=true
loading...
  loaded 100000 docs in 4.5s
building index...
  built in 4.3s
timing group=true query x7...
  rows=100000
  min=10910.8ms  median=11212.5ms  max=11627.4ms

[1] Reduce collector buffers rows from workers and keys them by the view key. For each row it emits it 1) finds the next key 2) takes all the rows that collate equal to it. This optimization will only improve reduce views

[2] https://gist.github.com/nickva/cc0a3fb296be965e014eec8c484aa8a5

Previously, `find_next_key/5` reduce collector [1] effectively ran `hd(lists:sort(...))` to get the next key. It was spending time sorting, then throwing everything away except the head. For the next row, it was sorting again, throwing everything away, and taking only the head, and so on... Needless to say, that's wasteful and we should simply find the minimum and return that. As an another optimization replace the dict with a map. Maps are more efficient and generally faster. Also, add extra unit tests asserting ICU collation behavior actually takes place. In a few cases explicitly assert that the new behavior is the same as the previous `hd(lists:sort(...))` behavior. To benchmark used a simple reduce view with a built-in _count reducer [2]. For 100k docs, Q=8 this optimization leads to an ~2x speedup in view querying. Theoretically it would be an O(n log n) -> O(n) speedup. PR ``` ./reduce_group_bench.py --ndocs 100000 --q 8 --reps 7 db=reduce_group_bench q=8 n=1 ndocs=100000 group=true loading... loaded 100000 docs in 4.4s building index... built in 4.5s timing group=true query x7... rows=100000 min=5644.0ms median=5786.6ms max=6108.0ms ``` MAIN ``` ./reduce_group_bench.py --ndocs 100000 --q 8 --reps 7 db=reduce_group_bench q=8 n=1 ndocs=100000 group=true loading... loaded 100000 docs in 4.5s building index... built in 4.3s timing group=true query x7... rows=100000 min=10910.8ms median=11212.5ms max=11627.4ms ``` [1] Reduce collector buffers rows from workers and keys them by the view key. For each row it emits it 1) finds the next key 2) takes all the rows that collate equal to it. This optimization will only improve reduce views [2] https://gist.github.com/nickva/cc0a3fb296be965e014eec8c484aa8a5

nickva force-pushed the view-reduce-find-min-key branch from 96ef801 to c401c5d Compare June 21, 2026 17:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Speed-up reduce view merging in fabric#6046

Speed-up reduce view merging in fabric#6046
nickva wants to merge 1 commit into
mainfrom
view-reduce-find-min-key

nickva commented Jun 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nickva commented Jun 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant