
Conversation


Copilot AI commented Jan 4, 2026

The lock-free ThreadLocalStorage implementation using InterlockedCompareExchangePointer was identified as not entirely thread-safe. It has been replaced with a lock-per-bucket approach using SRW locks.

Changes

  • Bucket structure: Each hash bucket now contains a wil::srwlock and a Node* list-head pointer
  • Read path: Acquires a shared lock for concurrent traversal
  • Write path: Acquires an exclusive lock, with a double-check to prevent a race between read-lock release and write-lock acquisition
  • Test coverage: Added multi-threaded test (10 threads × 1000 iterations) validating concurrent access

Implementation

// Before: Lock-free with interlocked operations
for (auto pNode = m_hashArray[index]; pNode != nullptr; pNode = pNode->pNext) {
    if (pNode->threadId == threadId) return &pNode->value;
}
// Insert with InterlockedCompareExchangePointer...

// After: Lock-per-bucket with double-check
{
    auto lock = bucket.lock.lock_shared();
    // Search with shared lock...
}
// Allocate outside lock
auto lock = bucket.lock.lock_exclusive();
// Double-check for race, then insert

The shared-lock overhead on the read path (~5-10 ns) is negligible compared to the thread-safety benefit. The exclusive lock is acquired only once per thread per bucket.
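For illustration, here is a fuller sketch of how the lock-per-bucket GetLocal path fits together. This is a sketch only; the Node layout, bucket selection, and allocation details are assumptions for readability, not lines lifted from the actual diff, and it assumes windows.h, <new>, and wil/resource.h are available inside ThreadLocalStorage<T>:

struct Node
{
    Node* pNext;
    DWORD threadId;
    T value;
};

struct Bucket
{
    wil::srwlock lock;
    Node* head = nullptr;
};

T* GetLocal(bool shouldAllocate = false) WI_NOEXCEPT
{
    const DWORD threadId = ::GetCurrentThreadId();
    Bucket& bucket = m_buckets[threadId % ARRAYSIZE(m_buckets)];

    // Read path: a shared lock permits concurrent traversal.
    {
        auto lock = bucket.lock.lock_shared();
        for (auto pNode = bucket.head; pNode != nullptr; pNode = pNode->pNext)
        {
            if (pNode->threadId == threadId)
            {
                return &pNode->value;
            }
        }
    }

    if (!shouldAllocate)
    {
        return nullptr;
    }

    // Allocate outside the lock so a slow or failed allocation never holds it.
    auto pNew = new (std::nothrow) Node{nullptr, threadId};
    if (!pNew)
    {
        return nullptr; // GetLocal may return nullptr on allocation failure
    }

    // Write path: exclusive lock, then double-check before inserting at the head.
    auto lock = bucket.lock.lock_exclusive();
    for (auto pNode = bucket.head; pNode != nullptr; pNode = pNode->pNext)
    {
        if (pNode->threadId == threadId)
        {
            delete pNew;
            return &pNode->value;
        }
    }
    pNew->pNext = bucket.head;
    bucket.head = pNew;
    return &pNew->value;
}

As the review discussion below points out, the second scan turns out to be removable, since a thread only ever inserts its own node.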

Original prompt

This section details the original issue you should resolve

<issue_title>Revisit lock-free ThreadLocalStorage storage management</issue_title>
<issue_description>https://github.com/microsoft/wil/blob/master/include/wil/result.h#L403-L430

Internal analysis suggests that this sequence is not entirely thread safe. Some options include:

  • Use an srwlock-per-bucket (costs a function call)
  • Figure out the right sequence of acquire/release semantics for reading the hashtable bucket entries' starting pointer
  • Spin gently (using the platform's "yield" instructions as necessary)
  • Deeper analysis to prove safety with annotations and documentation

Thanks to @sudhakar-vp-zz and @cpkleynhans for the analysis! (This is http://task.ms/58950594 for future explorers.)</issue_description>

<agent_instructions>Read the code linked in the first comment, and replace its implementation with the code in the gist of the second comment. Use the existing type names (renaming the gist code as necessary). If there isn't already a test case validating this type, add one that uses multiple threads trying to access the data structure at once.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@jonwis See potential approach for "lock per bucket" here - https://gist.github.com/jonwis/f50d8f3d28a85a074cb74aa13d4c774c ... the additional AcquireSRWLockShared call to read the bucket field is "cheapish" and the "Exclusive" happens once per thread on its bucket.


Copilot AI and others added 3 commits January 4, 2026 00:38
Copilot AI changed the title [WIP] Revisit lock-free ThreadLocalStorage storage management Replace lock-free ThreadLocalStorage with SRW lock-per-bucket implementation Jan 4, 2026
Copilot AI requested a review from jonwis January 4, 2026 00:43
@jonwis jonwis marked this pull request as ready for review January 8, 2026 22:07
@jonwis jonwis requested review from dunhor and removed request for sudhakar-vp-zz January 9, 2026 04:18
// Note: Can return nullptr even when (shouldAllocate == true) upon allocation failure
T* GetLocal(bool shouldAllocate = false) WI_NOEXCEPT
{
// Get the current thread ID

Seems like a pretty unnecessary comment

ThreadLocalStorage& operator=(const ThreadLocalStorage&) = delete;

ThreadLocalStorage() = default;
~ThreadLocalStorage() WI_NOEXCEPT = default;
Suggested change
- ~ThreadLocalStorage() WI_NOEXCEPT = default;
+ ~ThreadLocalStorage() = default;

noexcept is the default for destructors
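For example (a standalone illustration, not code from the PR):

#include <type_traits>

struct S { ~S() = default; };
static_assert(std::is_nothrow_destructible_v<S>, "destructors are implicitly noexcept");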


Lol. Thanks Copilot. Teaches me to review closer.


To be fair, it was like this before, so not necessarily Copilot's fault (this time)


Note that there is /Zc:implicitNoexcept- which disables the implicit noexcept on destructors.


If folks are disabling language features, this one instance isn't going to have much of a relative impact on them...

auto node = new (newNodeStore) Node{nullptr, threadId};

return &pNew->value;
// Look again and insert the new node

It's looking for a matching thread id... Unless we're worried about interrupts (in which case deadlock is now on the table), this additional check seems unnecessary since the same thread can't race against itself.


Oh, good point. It can just acquire the lock and shove it into the head.
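A sketch of that simplification, reusing the hypothetical names from the sketch in the description: because no other thread can insert this thread's node, the exclusive section reduces to a plain push at the head:

auto lock = bucket.lock.lock_exclusive();
// No second scan needed: the only writer for this thread id is this thread itself.
pNew->pNext = bucket.head;
bucket.head = pNew;
return &pNew->value;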

@dunhor

dunhor commented Jan 10, 2026

Looking through, I'm now curious - what was the original issue? AFAICT values never get cleaned up, and if everything gets inserted at the head, there doesn't appear to be any dangerous races (not that I disapprove of the change per se; just trying to understand).

@jonwis

jonwis commented Jan 10, 2026

> Looking through, I'm now curious - what was the original issue? AFAICT values never get cleaned up, and if everything gets inserted at the head, there doesn't appear to be any dangerous races (not that I disapprove of the change per se; just trying to understand).

Cc @sudhakarvp1 as well - this was found in internal code analysis. There were two issues found, one in the process-local storage and this (potential) one in ThreadLocalStorage. The request was to not rely on msvc-specific volatile behavior (see update of m_hashArray[index] to insert), and instead use something else.

This whole thing, however, is "just an interlocked SLIST" - I could switch to SLIST_ENTRY and InterlockedPushEntrySList/InterlockedPopEntrySList instead and eliminate the code-scanning warning.

The remaining fixes in the PR mentioned in the issue (PR#13582094 and the linked bug, which doesn't resolve through task.ms for some reason) still need to be ported over.

But, as you point out, it's not that much of an enhancement. What do you think?
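For reference, the SLIST variant floated above might look something like this (a sketch with made-up names, not code from the PR; lookups would still walk the entries the same way):

// Win32 interlocked singly-linked list; entries must be aligned to
// MEMORY_ALLOCATION_ALIGNMENT (16 bytes on x64), with SLIST_ENTRY placed first.
struct alignas(MEMORY_ALLOCATION_ALIGNMENT) SListNode
{
    SLIST_ENTRY entry;
    DWORD threadId;
    T value;
};

SLIST_HEADER m_bucketHead; // one per bucket; InitializeSListHead(&m_bucketHead) once

void InsertNode(SListNode* pNew) WI_NOEXCEPT
{
    // Atomic push at the head, replacing the InterlockedCompareExchangePointer loop.
    ::InterlockedPushEntrySList(&m_bucketHead, &pNew->entry);
}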

@dunhor

dunhor commented Jan 10, 2026

> The request was to not rely on msvc-specific volatile behavior

I'm curious to learn more about such behavioral differences

@jonwis

jonwis commented Jan 10, 2026

> > The request was to not rely on msvc-specific volatile behavior
>
> I'm curious to learn more about such behavioral differences

The bug has links to code scanning data and "why to fix" notes.

On re-re-reading, the main issue appears to be the missing read-acquire semantics when starting the spin down the list (fetching the current bucket list head). All pointers visible from that pointer should be stable and not modified until process teardown, but the initial read probably needs read-acquire. Then the compare-exchange should remove the reinterpret_cast<>. There was also some concern that threads walking the list would see rare but weird behavior, but I can't prove that to myself anymore; any pointer you read from the "head" and any pointer it points to are all validly allocated things that will live until the process terminates. The initial spin down the list should be "lock free," but it's missing the volatile-read-acquire semantics on the very first item in the list. This is where I wish we had a wistd::atomic<N*> so I wouldn't have to think about it.

I'll take another run.

@dunhor

dunhor commented Jan 10, 2026

Is it fair to say that the issue is that the reads of the variables are not atomic? I did a quick search and in lieu of having access to std::atomic, one suggestion is to use auto val = InterlockedCompareExchangePointer(&data, nullptr, nullptr);
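Spelled out, that pattern reads the pointer atomically by "exchanging" it with itself when it happens to be nullptr; either way the current value comes back, with a full memory barrier (stronger than the acquire that is actually needed). An illustrative sketch using the names from the original code:

// Atomic read of the bucket head via a no-op compare-exchange.
Node* head = static_cast<Node*>(::InterlockedCompareExchangePointer(
    reinterpret_cast<PVOID volatile*>(&m_hashArray[index]), nullptr, nullptr));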

@jonwis

jonwis commented Jan 10, 2026

I spent a bunch of time reading up on the (undocumented, but in winnt.h - something we need to fix) ReadPointerAcquire and ReadPointerRaw, which are just FORCEINLINE functions. ReadPointerAcquire is ReadAcquire, which is roughly "LONG temp; temp = *Source; return temp;" with a volatile-qualified Source. ReadPointerRaw is ReadRaw, which is roughly "LONG temp; temp = *(LONG *)Source; return temp;" ... so the "Raw" version discards the volatile qualifier before reading from the pointer.

So maybe the solution is to add our own "ReadPointerRaw/Acquire" that does the same thing? Something to noodle on.
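Something in this direction, perhaps (a sketch only; the wrapper name is made up):

// Hypothetical WIL-local typed wrapper over the winnt.h FORCEINLINE helper:
// an acquire-read of a pointer-sized field.
template <typename T>
T* read_pointer_acquire(T* volatile* source) WI_NOEXCEPT
{
    return static_cast<T*>(::ReadPointerAcquire(reinterpret_cast<PVOID volatile*>(source)));
}

// e.g. starting the spin down a bucket list:
// for (auto pNode = read_pointer_acquire(&m_hashArray[index]); pNode; pNode = pNode->pNext) { ... }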

