Skip to content

perf: improve parallelism of SortBotMerge#831

Open
jodavies wants to merge 1 commit into
form-dev:masterfrom
jodavies:sortbot-blocks
Open

perf: improve parallelism of SortBotMerge#831
jodavies wants to merge 1 commit into
form-dev:masterfrom
jodavies:sortbot-blocks

Conversation

@jodavies
Copy link
Copy Markdown
Collaborator

@jodavies jodavies commented May 18, 2026

This change comes from discussion and benchmarking efforts in collaboration with @AnaIPereira. See the commit message for a full description.

On my 12-core 7900X, the results are as follows. @AnaIPereira is finishing up some testing on higher core-count sytems, to make sure we have a good value for MINWRITENUMBEROFTERMS and that the adjusted, lower value of MINIMUMNUMBEROFTERMS does not affect performance.

Speedup w.r.t. master:

Benchmark form -w2 -w4 -w6 -w12 -w24 (HT)
chromatic 0.99 ± 0.01 1.03 ± 0.02 1.08 ± 0.02 1.14 ± 0.01 1.12 ± 0.01 1.08 ± 0.00
color 0.99 ± 0.01 1.01 ± 0.01 1.02 ± 0.01 1.03 ± 0.02 1.03 ± 0.02 1.03 ± 0.01
fmft 1.01 ± 0.01 1.02 ± 0.00 1.08 ± 0.01 1.13 ± 0.01 1.14 ± 0.01 1.09 ± 0.01
forcer-exp 1.00 ± 0.00 1.00 ± 0.00 1.04 ± 0.01 1.09 ± 0.03 1.13 ± 0.00 1.14 ± 0.00
forcer 1.00 ± 0.01 1.01 ± 0.00 1.07 ± 0.01 1.14 ± 0.00 1.20 ± 0.01 1.21 ± 0.00
hyperform 1.00 ± 0.01 1.03 ± 0.01 1.06 ± 0.01 1.09 ± 0.01 1.08 ± 0.01 1.05 ± 0.01
mass-fact 1.00 ± 0.01 1.02 ± 0.02 1.01 ± 0.03 1.02 ± 0.02 1.06 ± 0.07 1.03 ± 0.05
mbox1l 1.00 ± 0.00 1.01 ± 0.00 1.07 ± 0.01 1.15 ± 0.02 1.30 ± 0.03 1.38 ± 0.04
minceex 1.00 ± 0.00 1.04 ± 0.00 1.06 ± 0.01 1.08 ± 0.01 1.08 ± 0.01 1.06 ± 0.01
mincer 1.01 ± 0.02 1.01 ± 0.01 1.03 ± 0.01 1.03 ± 0.01 1.02 ± 0.01 1.02 ± 0.01
mzv-dm 1.00 ± 0.01 0.99 ± 0.02 1.00 ± 0.01 1.00 ± 0.02 1.01 ± 0.04 1.00 ± 0.02
sort-disk 1.02 ± 0.01 1.01 ± 0.02 1.01 ± 0.01 1.01 ± 0.01 1.02 ± 0.02 1.00 ± 0.01
sort-large 1.02 ± 0.01 0.98 ± 0.01 0.99 ± 0.01 1.00 ± 0.02 1.01 ± 0.01 1.01 ± 0.01
sort-small 1.02 ± 0.02 0.98 ± 0.02 1.00 ± 0.02 1.02 ± 0.03 1.07 ± 0.04 1.07 ± 0.08
trace 1.00 ± 0.02 0.99 ± 0.01 0.99 ± 0.01 1.00 ± 0.01 1.00 ± 0.01 0.99 ± 0.03

@coveralls
Copy link
Copy Markdown

coveralls commented May 18, 2026

Coverage Status

coverage: 61.506% (+0.07%) from 61.438% — jodavies:sortbot-blocks into form-dev:master

Currently, the sortbot stage of tform sorting does not achieve good
parallelism. Primarily, this is because the "sortblock" size has
grown with default SmallSize and LargeSize adjustments, such that in
many cases the sortbot levels do not run in parallel at all, because
each thread's output fits in a single block.

This commit adjusts the logic for filling and unlocking the blocks
such that sortbot threads can start work as soon as possible:
 - Only put complete terms into the blocks, no splitting over the blocks.
 - Track the number of terms in each block, and use this when reading
   the data to determine when a block is complete, rather than waiting
   for a term to overlap the "stop" pointer.
 - When filling blocks (in PutToMaster), if a certain (small) minimum
   number of terms has been written, probe if a reading thread is waiting
   on the current block by attempting to lock the previous block. If a
   thread is waiting (so the lock fails), unlock the current block
   immediately and write the term into the next.

Also reduce the MINIMUMNUMBEROFTERMS parameter from 10 to 1, so that
the small+large buffer does not scale (so much) with large MaxTermSize
and many threads, and similarly reduce NUMBEROFBLOCKSINSORT to its minimal
value of 8.
@jodavies
Copy link
Copy Markdown
Collaborator Author

I have removed the second commit, which get rid of "block 0": there is another circumstance in which it is required, separate from copying a partial term from the last block, such that it is contiguous with the tail in block 1, which is when adding terms with polyratfun. When terms merge, if the result is larger, it is copied into the space before "term1" in the merge: this requires block 0 if term1 is the first term of block 1. I added commentary to the code to highlight this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants