Perf: off-thread alert-check DuckDB queries (Lite UI freeze under load) #1121

Merged: erikdarlingdata merged 1 commit into dev from feature/perf-blocking-chart-render on Jun 15, 2026
Conversation

@erikdarlingdata

Owner

Problem

Active UI profiling under a HammerDB TPC-C load against SQL2025 surfaced intermittent ~1s UI freezes in Lite (an external dispatcher-latency probe caught blocks of 842–1174ms while the app was otherwise at p99 < 4ms).

Root cause

Found with dotnet-trace --profile dotnet-sampled-thread-time (wall-clock thread-time sampling — catches blocked/native frames a CPU profiler misses). The alert-check paths run synchronous DuckDB queries directly on the WPF dispatcher:

  • MainWindow.CheckPerformanceAlerts → GetLatestPoisonWaitAvgs, GetRecentBlockedProcessReports, GetRecentDeadlocks, GetLongRunningQueries, GetLatestTempDbSpace, GetAnomalousJobs
  • ServerTab.RefreshAlertCountsAsync → GetAlertCounts

DuckDB.NET is synchronous, so await _dataService.X() completes on the calling thread. Under load a single DuckDB connection open is ~766ms. These fire on the 60s overview auto-refresh and on every Blocking-tab refresh, so they freeze whatever tab the user is on (the timing coincidence sent me chasing the Queries and Blocking tabs first — see below). The earlier off-thread sweep (#1110–1120) missed the alert/overview paths.
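The effect described above can be reproduced in a small console program. This is a minimal sketch, not the project's code: `FakeDataService`, `GetAlertCountsAsync`, and the 200 ms sleep are hypothetical stand-ins for a data-service method with a Task-returning signature and a synchronous body.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a data-service method (not the project's real code):
// the signature returns a Task, but the body does all of its work synchronously.
static class FakeDataService
{
    public static Task<int> GetAlertCountsAsync()
    {
        Thread.Sleep(200); // stands in for a blocking DuckDB connection open + query
        return Task.FromResult(Environment.CurrentManagedThreadId);
    }
}

static class SyncAwaitDemo
{
    // Reports whether the work ran on the caller's thread and how long the call itself blocked.
    public static async Task<(bool RanOnCaller, long BlockedMs)> RunAsync()
    {
        int callerThread = Environment.CurrentManagedThreadId;
        var sw = Stopwatch.StartNew();
        Task<int> task = FakeDataService.GetAlertCountsAsync(); // blocks right here, before any await
        long blockedMs = sw.ElapsedMilliseconds;
        int workThread = await task; // task is already completed, so the await never yields
        return (workThread == callerThread, blockedMs);
    }

    static async Task Main()
    {
        var (ranOnCaller, blockedMs) = await RunAsync();
        Console.WriteLine($"ran on caller thread: {ranOnCaller}, call blocked ~{blockedMs} ms");
    }
}
```

In this sketch the caller is stalled for the full duration of the sleep. On a WPF dispatcher the same stall would be time during which the UI cannot process input or render.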

Fix

Wrap the 9 calls in Task.Run(() => _dataService.X(...)) — the same off-thread pattern used across the other refresh paths. 9 insertions / 9 deletions, 2 files.
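The wrapping pattern can be sketched in isolation as follows. The names (`FakeDataService`, `GetAlertCountsAsync`, `OffThreadDemo`) are hypothetical and the sleep stands in for the blocking query; only the `Task.Run(() => ...)` shape is taken from the PR.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in: Task-returning signature, synchronous (blocking) body.
static class FakeDataService
{
    public static Task<int> GetAlertCountsAsync()
    {
        Thread.Sleep(200); // stands in for the blocking DuckDB work
        return Task.FromResult(Environment.CurrentManagedThreadId);
    }
}

static class OffThreadDemo
{
    // Intended to be called from a non-pool thread (such as Main), which plays the role of the UI thread.
    public static async Task<(bool RanOnCaller, bool CompletedOnReturn)> RunAsync()
    {
        int callerThread = Environment.CurrentManagedThreadId;

        // Task.Run queues the lambda to the thread pool and returns immediately;
        // the Func<Task<int>> overload unwraps the inner task for us.
        Task<int> task = Task.Run(() => FakeDataService.GetAlertCountsAsync());
        bool completedOnReturn = task.IsCompleted; // the blocking work is still running elsewhere

        int workThread = await task; // the caller is free while the pool thread blocks
        return (workThread == callerThread, completedOnReturn);
    }

    static async Task Main()
    {
        var (ranOnCaller, completedOnReturn) = await RunAsync();
        Console.WriteLine($"ran on caller thread: {ranOnCaller}, completed when Task.Run returned: {completedOnReturn}");
    }
}
```

In a WPF handler the continuation after `await` resumes on the dispatcher by default, so UI updates that follow the awaited call stay on the UI thread while only the blocking query moves to the pool.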

How it was ruled in

Method-level Stopwatch instrumentation first disproved the obvious suspects — chart build + .Refresh() = 0–3ms (even 1,362 pts × 6 series), GC = 0 collections, grid bind < 30ms, DataGrid columns fixed-width. Only thread-time tracing revealed the synchronous DuckDB on the UI thread, which is invisible to method timers because await sync_call() doesn't show as a stall in the awaiting method.

Proof (Lite vs SQL2025 under HammerDB TPC-C, 75s spanning the overview auto-refresh)

| metric | before | after |
| --- | --- | --- |
| dispatcher latency MAX | 1174 ms | 9.2 ms |
| dispatcher latency p99 | 2.0 ms | 1.5 ms |
| stalls > 250 ms | 2 | 0 |
| stalls > 1 s | 1 | 0 |

1,584 samples, nothing over 9.2 ms.
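For readers who want to derive the same kind of summary from their own probe output, here is a minimal sketch. It assumes the probe yields one latency sample per measurement in milliseconds, and it uses a nearest-rank percentile; the PR does not say how its p99 was computed, so that choice is an assumption. The sample values in `Main` are made up for illustration and are not the PR's data.

```csharp
using System;
using System.Linq;

static class LatencySummary
{
    // Nearest-rank percentile (an assumed definition; other definitions interpolate).
    public static double Percentile(double[] samplesMs, double p)
    {
        if (samplesMs.Length == 0) throw new ArgumentException("no samples", nameof(samplesMs));
        double[] sorted = samplesMs.OrderBy(x => x).ToArray();
        int rank = (int)Math.Ceiling(p / 100.0 * sorted.Length);
        return sorted[Math.Clamp(rank, 1, sorted.Length) - 1];
    }

    // Counts samples strictly above the threshold, e.g. 250 ms or 1000 ms.
    public static int StallsOver(double[] samplesMs, double thresholdMs) =>
        samplesMs.Count(x => x > thresholdMs);

    static void Main()
    {
        // Illustrative, made-up samples (not measurements from the PR).
        double[] samples = { 1.0, 2.0, 1.5, 300.0, 1174.0 };

        Console.WriteLine($"max: {samples.Max()} ms");
        Console.WriteLine($"p99: {Percentile(samples, 99)} ms");
        Console.WriteLine($"stalls > 250 ms: {StallsOver(samples, 250)}");
        Console.WriteLine($"stalls > 1 s: {StallsOver(samples, 1000)}");
    }
}
```

Reporting MAX and stall counts alongside p99 matters for this kind of bug: a handful of second-long freezes in over a thousand samples barely moves the 99th percentile.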

Validation

  • Lite builds clean (net10, 0 errors).
  • Lite.Tests: 434/434 pass.

🤖 Generated with Claude Code

The alert-check paths ran synchronous DuckDB queries directly on the WPF
dispatcher: MainWindow.CheckPerformanceAlerts (poison waits, blocked-process
reports, deadlocks, long-running queries, tempdb space, anomalous jobs) and
ServerTab.RefreshAlertCountsAsync (alert counts). DuckDB.NET is synchronous,
so `await _dataService.X()` completes on the calling thread; under load a
single DuckDB connection open is ~766ms. These fire on the 60s overview
auto-refresh and on every Blocking-tab refresh, causing intermittent ~1s UI
freezes that landed on whatever tab the user happened to be on.

Wrap the 9 calls in Task.Run(() => _dataService.X(...)) -- the same off-thread
pattern used across the other refresh paths; the off-thread sweep (#1110-1120)
missed the alert/overview paths.

Root-caused with dotnet-trace (dotnet-sampled-thread-time); method-level
Stopwatch instrumentation had ruled out chart render (0-3ms), GC (0), and
grid binding (<30ms).

Measured (Lite vs SQL2025 under HammerDB TPC-C, 75s spanning the overview
auto-refresh): dispatcher latency MAX 1174ms -> 9.2ms, p99 2.0ms -> 1.5ms,
zero stalls >16ms (was 2 over 250ms including one over 1s). Lite.Tests 434/434.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@erikdarlingdata erikdarlingdata merged commit 95a0936 into dev Jun 15, 2026
2 checks passed