8598e84c5f
The testAndSet operation is expensive if the lock is contended: attempting to CAS that lock will cause the cacheline containing the lock to be brought to the current CPU's most local cache in exclusive mode, which in turn causes the CPU that has the lock to stall when it attempts to release it. That's not desirable if we were just trying an untimed tryLock*. In the case of timed, contended tryLocks or unconditional locks, we still need to perform an atomic operation to indicate we're about to wait. For that case, this patch reduces the minimum number of atomic operations from 2 to 1, which is a gain even in the case where no other thread has changed the lock status at all. If the status has changed, either because more threads attempted to lock or because the thread holding the lock unlocked it, this avoids the cacheline bouncing between the multiple CPUs between those two atomic operations. For QMutex, that second atomic is a fetchAndStore, not testAndSet. The above explanation is valid for architectures with Compare-And-Swap instructions, such as x86 and ARMv8.1. For architectures using Load Linked/Store Conditional instructions, the explanation doesn't apply, but the benefits should still hold because we avoid the expense of the LL. See a similar change to pthread_mutex_lock in https://sourceware.org/git/?p=glibc.git;a=commit;h=d672a98a1af106bd68deb15576710cd61363f7a6 Change-Id: I3d728c4197df49169066fffd1756dcc26b2cf5f3 Reviewed-by: Marc Mutz <marc.mutz@qt.io> |
||
---|---|---|
.github/workflows | ||
bin | ||
cmake | ||
coin | ||
config.tests | ||
dist | ||
doc | ||
examples | ||
lib | ||
libexec | ||
LICENSES | ||
mkspecs | ||
qmake | ||
src | ||
tests | ||
util | ||
.cmake.conf | ||
.gitattributes | ||
.gitignore | ||
.lgtm.yml | ||
.tag | ||
CMakeLists.txt | ||
conanfile.py | ||
config_help.txt | ||
configure | ||
configure.bat | ||
configure.cmake | ||
dependencies.yaml | ||
qt_cmdline.cmake | ||
sync.profile |
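
The untimed tryLock case described in the commit message above can be illustrated with a minimal sketch. This is not QMutex's actual code: `SketchLock`, its `state` member, and `demo()` are hypothetical names, and standard `std::atomic` is assumed in place of Qt's atomic classes. It shows only the "load before CAS" idea, not the timed or unconditional lock paths.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical minimal lock used only to illustrate the idea;
// it is not how QMutex is implemented.
class SketchLock {
public:
    // Untimed tryLock: peek with a relaxed load first. If the lock is
    // already held, return without issuing a compare-and-swap, so the
    // cache line is not requested in exclusive mode.
    bool tryLock() noexcept {
        if (state.load(std::memory_order_relaxed) != Unlocked)
            return false; // contended: no atomic read-modify-write issued
        int expected = Unlocked;
        // The lock looked free, so attempt the actual acquisition.
        return state.compare_exchange_strong(expected, Locked,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    void unlock() noexcept {
        state.store(Unlocked, std::memory_order_release);
    }

private:
    static constexpr int Unlocked = 0;
    static constexpr int Locked = 1;
    std::atomic<int> state{Unlocked};
};

// Single-threaded walk-through of the three outcomes.
bool demo() {
    SketchLock lock;
    bool first = lock.tryLock();  // lock is free: CAS succeeds
    bool second = lock.tryLock(); // lock is held: rejected by the relaxed load
    lock.unlock();
    bool third = lock.tryLock();  // free again: CAS succeeds
    assert(first && !second && third);
    return first && !second && third;
}
```

The relaxed load can observe a stale value, which is acceptable here: a failed tryLock makes no ordering promise, and a successful one still goes through the acquiring compare-and-swap.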