Solver context reuse #193

Draft
tameware wants to merge 5 commits into dds-bridge:develop from tameware:solver-context-reuse

@tameware

Collaborator

Provides a significant speedup in most cases, though it is still not as fast as v2.9. I am mildly concerned about the slowdown in the single-solve, non-threaded case, but I figure most uses will be multi-threaded. Some benchmarks follow; n is the number of threads.

thin-lto-speedier branch

Summary (3.0 vs 2.9, user time)
===============================
solver file            n   2.9_user   3.0_user    speedup note
------ ------------ ---- ---------- ---------- ---------- ----
calc   list1.txt       1      122.0       59.0      2.07x 3.0 faster
calc   list1.txt      14       43.0       45.0      0.96x 2.9 faster
calc   list10.txt      1      712.0      235.0      3.03x 3.0 faster
calc   list10.txt     14      156.0      239.0      0.65x 2.9 faster
calc   list100.txt     1     8435.0     1517.0      5.56x 3.0 faster
calc   list100.txt    14     1075.0     1460.0      0.74x 2.9 faster
calc   list1000.txt    1    73024.0    18549.0      3.94x 3.0 faster
calc   list1000.txt   14     8746.0    19507.0      0.45x 2.9 faster
solve  list1.txt       1       13.0       15.0      0.87x 2.9 faster
solve  list1.txt      14       13.0       14.0      0.93x 2.9 faster
solve  list10.txt      1       98.0      103.0      0.95x 2.9 faster
solve  list10.txt     14       46.0       45.0      1.02x 3.0 faster
solve  list100.txt     1     1616.0     1808.0      0.89x 2.9 faster
solve  list100.txt    14      224.0      266.0      0.84x 2.9 faster
solve  list1000.txt    1    17328.0    26412.0      0.66x 2.9 faster
solve  list1000.txt   14     1982.0     3144.0      0.63x 2.9 faster

solver-context-reuse branch

Summary (3.0 vs 2.9, user time)
===============================
solver file            n   2.9_user   3.0_user    speedup note
------ ------------ ---- ---------- ---------- ---------- ----
calc   list1.txt       1      121.0      127.0      0.95x 2.9 faster
calc   list1.txt      14       42.0       47.0      0.89x 2.9 faster
calc   list10.txt      1      705.0      922.0      0.76x 2.9 faster
calc   list10.txt     14      163.0      240.0      0.68x 2.9 faster
calc   list100.txt     1     8294.0    11610.0      0.71x 2.9 faster
calc   list100.txt    14     1060.0     1408.0      0.75x 2.9 faster
calc   list1000.txt    1    70494.0    99600.0      0.71x 2.9 faster
calc   list1000.txt   14     8781.0    12591.0      0.70x 2.9 faster
solve  list1.txt       1       13.0       16.0      0.81x 2.9 faster
solve  list1.txt      14       15.0       17.0      0.88x 2.9 faster
solve  list10.txt      1      102.0      107.0      0.95x 2.9 faster
solve  list10.txt     14       49.0       52.0      0.94x 2.9 faster
solve  list100.txt     1     1639.0     1865.0      0.88x 2.9 faster
solve  list100.txt    14      239.0      245.0      0.98x 2.9 faster
solve  list1000.txt    1    16796.0    21040.0      0.80x 2.9 faster
solve  list1000.txt   14     1970.0     2778.0      0.71x 2.9 faster
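
For anyone reading the tables: the speedup column appears to be the 2.9 user time divided by the 3.0 user time, so values below 1.00x mean 2.9 used less user time. That reading is inferred from the numbers (for example 122.0 / 59.0 gives 2.07), not from the benchmark script itself. A minimal sketch of the arithmetic, with hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

// Speedup as it appears in the tables: 2.9 user time over 3.0 user time.
// (Inferred from the reported numbers; helper names are hypothetical.)
inline double speedup(double user_29, double user_30) {
    assert(user_30 > 0.0);
    return user_29 / user_30;
}

// True when 3.0 used less user time than 2.9 ("3.0 faster" in the note column).
inline bool v30_faster(double user_29, double user_30) {
    return speedup(user_29, user_30) > 1.0;
}

// Round to two decimals, matching the precision shown in the tables.
inline double round2(double x) {
    return std::round(x * 100.0) / 100.0;
}
```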

tameware and others added 5 commits June 16, 2026 12:00
Add a non-owning SolverContext::thread_ptr() accessor and use it at the
hot ThreadData access sites in ab_search, quick_tricks, and later_tricks
instead of copying the shared_ptr returned by thread(). This removes the
atomic reference-count traffic from the inner search loop.

Measured on dtest -f hands/list100.txt -s solve -n 14 (14 cores),
interleaved against develop to cancel machine drift: ~3-4% lower user
and system time with tighter run-to-run variance. Behavior is unchanged;
the pointer lifetime is tied to the owning context.
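
To illustrate the idea, here is a minimal sketch of an owning versus a non-owning accessor. Only the names SolverContext, ThreadData, thread(), and thread_ptr() come from this change; the members, the use_count() helper, and count_nodes() are hypothetical and the real classes will differ.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in; not the actual DDS ThreadData.
struct ThreadData {
    int nodes = 0;  // some per-thread search counter
};

class SolverContext {
public:
    SolverContext() : thread_(std::make_shared<ThreadData>()) {}

    // Owning accessor: every call copies the shared_ptr, which bumps the
    // reference count on copy and drops it again on destruction.
    std::shared_ptr<ThreadData> thread() const { return thread_; }

    // Non-owning accessor: returns the raw pointer, no ref-count traffic.
    // Valid only while this SolverContext is alive.
    ThreadData* thread_ptr() const { return thread_.get(); }

    // Hypothetical helper so the effect on the count can be observed.
    long use_count() const { return thread_.use_count(); }

private:
    std::shared_ptr<ThreadData> thread_;
};

// A hot loop that fetches the non-owning pointer once and then works on it.
inline int count_nodes(const SolverContext& ctx, int iterations) {
    ThreadData* td = ctx.thread_ptr();
    assert(td != nullptr);
    for (int i = 0; i < iterations; ++i) {
        ++td->nodes;
    }
    return td->nodes;
}
```

The trade-off is that the caller must not keep the raw pointer past the lifetime of the context that owns it.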

Co-authored-by: Cursor <cursoragent@cursor.com>

Honor SetResources thread limits again so batch calc uses strain-aware
scheduling and reuses per-slot SolverContexts instead of fresh std::thread
workers on every CalcAllTables call.

Co-authored-by: Cursor <cursoragent@cursor.com>

Share a worker context pool across calc and solve, route SolveAllBoards
through scheduler workers with per-thread thrId, and have dtest call the
batch API so parallel solve no longer races on slot 0.
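
A rough sketch of the per-slot idea, under loud assumptions: WorkerContext, ContextPool, and solve_batch are hypothetical names, the "solve" is a counter increment, and the sketch spawns threads per batch for brevity, which the actual scheduler is meant to avoid. It only shows that each worker uses the slot matching its own thrId, so no two workers touch slot 0, and that the contexts persist across batch calls.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for per-worker solver state.
struct WorkerContext {
    int boards_solved = 0;
};

// One context per slot, created once and reused by every batch call.
class ContextPool {
public:
    explicit ContextPool(std::size_t slots) : contexts_(slots) {}

    std::size_t size() const { return contexts_.size(); }

    WorkerContext& slot(std::size_t thr_id) {
        assert(thr_id < contexts_.size());
        return contexts_[thr_id];
    }

private:
    std::vector<WorkerContext> contexts_;
};

// Each worker handles boards thr_id, thr_id + n, ... using only its own slot.
inline void solve_batch(ContextPool& pool, std::size_t num_boards) {
    const std::size_t n = pool.size();
    std::vector<std::thread> workers;
    for (std::size_t thr_id = 0; thr_id < n; ++thr_id) {
        workers.emplace_back([&pool, thr_id, n, num_boards] {
            WorkerContext& ctx = pool.slot(thr_id);
            for (std::size_t b = thr_id; b < num_boards; b += n) {
                ++ctx.boards_solved;  // placeholder for the real solve
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
}
```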

Co-authored-by: Cursor <cursoragent@cursor.com>

Delegating to CalcPar/CalcDDtable hit the shared worker pool with TT state
left over from earlier tests, breaking CalcParContextVsNonContext on CI.

Co-authored-by: Cursor <cursoragent@cursor.com>