The scenario:
Five threads concurrently call clfftBakePlan with identically configured FFT handles.
Immediate symptoms:
The assert(NULL == p) in repo.cpp, line 218 triggers (clFFT/src/library/repo.cpp at commit c59712e).
Occasionally, a null pointer crash follows later on.
The cause:
The function FFTAction::compileKernels compiles kernels, but only if they are not already cached. The problem is that the cache query is not protected by a mutex (clFFT/src/library/enqueue.cpp, line 713 at commit c59712e):

    if( fftRepo.getclProgram( this->getGenerator(), this->getSignatureData(), program, q_device, fftPlan->context ) == CLFFT_INVALID_PROGRAM )

- Five threads concurrently try to compileKernels for the first time.
- All threads query the fftRepo at the same time.
- All threads get a CLFFT_INVALID_PROGRAM return code.
- Consequently, all five threads assume that the kernel has not been cached, and each compiles it.
- All threads then call fftRepo.setclProgram with the same parameters.
The first call sets the program; every subsequent call triggers the assert.
The fix:
Any query of the cache followed by a set on the cache must be a single atomic operation. Holding a scopedLock across both calls would do the trick.
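
To illustrate the idea, here is a minimal sketch of an atomic "look up, else compile and store" operation. It deliberately does not use the clFFT API: ProgramCache, getOrCompile and countCompiles are hypothetical names, the program is represented by a plain int, and std::lock_guard stands in for clFFT's scopedLock.

```cpp
#include <atomic>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the program repository; NOT the clFFT FFTRepo API.
class ProgramCache {
public:
    // Looks up `key`; if it is absent, runs `compile` and stores the result.
    // The lookup and the insert happen under one lock, so no second thread
    // can observe "not cached" between the two steps.
    int getOrCompile(const std::string& key, const std::function<int()>& compile) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = programs_.find(key);
        if (it != programs_.end()) {
            return it->second;          // already cached: reuse it
        }
        int program = compile();        // first caller compiles
        programs_.emplace(key, program);
        return program;
    }

private:
    std::mutex mutex_;
    std::map<std::string, int> programs_;
};

// Starts `numThreads` threads that all request the same key and
// returns how many times the compile callback actually ran.
int countCompiles(int numThreads) {
    ProgramCache cache;
    std::atomic<int> compiles{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < numThreads; ++i) {
        threads.emplace_back([&] {
            cache.getOrCompile("same-signature", [&] { return ++compiles; });
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return compiles.load();
}
```

With five threads, the compile callback in this sketch runs exactly once and the other four threads reuse the cached entry. The trade-off is that the lock is held during compilation, which also serializes compilation of unrelated kernels; whether that is acceptable for clFFT would need to be decided in the actual fix.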
I could prepare a PR, but can only take the time to do so if the PR has a chance of being merged into the code. Is this repository still being maintained? Also, I'd like the fix to be integrated with vcpkg.