-
Notifications
You must be signed in to change notification settings - Fork 276
Replace lock-free ThreadLocalStorage with SRW lock-per-bucket implementation #592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…d multi-threaded test Co-authored-by: jonwis <18537118+jonwis@users.noreply.github.com>
…g exclusive lock Co-authored-by: jonwis <18537118+jonwis@users.noreply.github.com>
Co-authored-by: jonwis <18537118+jonwis@users.noreply.github.com>
| // Note: Can return nullptr even when (shouldAllocate == true) upon allocation failure | ||
| T* GetLocal(bool shouldAllocate = false) WI_NOEXCEPT | ||
| { | ||
| // Get the current thread ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a pretty unnecessary comment
| ThreadLocalStorage& operator=(const ThreadLocalStorage&) = delete; | ||
|
|
||
| ThreadLocalStorage() = default; | ||
| ~ThreadLocalStorage() WI_NOEXCEPT = default; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ~ThreadLocalStorage() WI_NOEXCEPT = default; | |
| ~ThreadLocalStorage() = default; |
noexcept is the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lol. Thanks Copilot. Teaches me to review closer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be fair, it was like this before, so not necessarily Copilot's fault (this time)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there is /Zc:implicitNoexcept- which disables the implicit noexcept on destructors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If folks are disabling language features, this one instance isn't going to have much of a relative impact on them...
| auto node = new (newNodeStore) Node{nullptr, threadId}; | ||
|
|
||
| return &pNew->value; | ||
| // Look again and insert the new node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's looking for a matching thread id... Unless we're worried about interrupts (in which case deadlock is now on the table), this additional check seems unnecessary since the same thread can't race against itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, good point. It can just acquire the lock and shove it into the head.
|
Looking though, I'm now curious - what was the original issue? AFACIT values never get cleaned up and if everything gets inserted at the head, there doesn't appear to be any dangerous races (not that I disapprove of the change per-se; just trying to understand). |
Cc @sudhakarvp1 as well - this was found in internal code analysis. There were two issues found, one in the process-local storage and this (potential) one in ThreadLocalStorage. The request was to not rely on msvc-specific volatile behavior (see update of This whole thing, however, is "just an interlocked SLIST" - I could switch to SLIST_ENTRY and InterlockedPush/PopSList instead and eliminate the code scanning warning. The remaining fixes in the PR mentioned in issue (PR#13582094 and linked bug, which doesn't resolve through task.ms for some reason) still need to be ported over. But, as you point out, it's not that much of an enhancement. What do you think? |
I'm curious to learn more about such behavioral differences |
The bug has links to code scanning data and "why to fix" notes. On re-re-reading, the main issue appears to be missing the read-acquire semantics when starting the spin down the list (to fetch the current bucket list head). All pointers visible from that pointer should be stable and not modified until process teardown, but the initial read should probably needs read-acquire. Then the compare-exchange should remove the reinterpret_cast<> There was also some concern that threads walking the list would have rare but weird behavior, but I can't prove that to myself anymore; any pointer you read from the "head" and any pointer it points to are all validly allocated things that will live until the process terminates. The initial spin down the list should be "lock free," but it's missing the volatile-read-acquire semantics of the very first item in the list. This is where I wish we had a I'll take another run. |
|
Is it fair to say that the issue is that the reads of the variables are not atomic? I did a quick search and in lieu of having access to |
|
I spent a bunch of time reading up on the (undocumented, but in winnt.h - something we need to fix) ReadPointerAcquire and ReadPointerRaw which are just So maybe the solution is to add our own "ReadPointerRaw/Acquire" that has the same thing? Something to noodle on. |
The lock-free
ThreadLocalStorageimplementation usingInterlockedCompareExchangePointerwas identified as not entirely thread-safe. Replaced with lock-per-bucket approach using SRW locks.Changes
wil::srwlockandNode*list head pointerImplementation
The shared lock overhead on the read path (~5-10ns) is negligible compared to thread safety benefits. Exclusive lock is acquired once per thread per bucket.
Original prompt
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.