Dmitry Lenev authored
lock acquisition for DML" and fix for bug #18077895 "WL7305 PUSH (7249) CAUSES SYSBENCH OLTP_RO 1 THREAD TO DROP UP TO -8%". The idea of this task is to change the acquisition of unobtrusive locks on the fast path, which currently involves acquisition of MDL_lock::m_rwlock, some checks, increment of the packed counter and release of m_rwlock, to a single atomic compare-and-swap operation. Similarly the release of the lock on the fast path becomes single atomic compare-and-swap (in absence of obtrusive locks and assuming we are not releasing the last lock for this MDL_lock object) instead of acquisition of MDL_lock::m_rwlock, decrement of the packed counter, some checks and m_rwlock release. As result these operations become at least twice cheaper than their old versions which has a nice effect on performance/scalability. Acquisition/release of locks on the slow path (i.e. unobtrusive locks in presence of obtrusive locks and obtrusive locks) still has to use the old approach involving locking/unlocking MDL_lock::m_rwlocks and checks of MDL_lock::m_granted/m_waiting bitmaps/lists. This patch implements the above idea by performing the following three transformations: I) MDL_lock::m_fast_path_granted_count is replaced with an atomic MDL_lock::m_fast_path_state member, which in the ideal case of "fast path" acquisition/release is checked and changed using CAS without holding any mutexes. II) Since we would like to check in the same atomic CAS operation that MDL_lock object was not destroyed, its m_is_destroyed member is replaced by a IS_DESTROYED bit flag in the m_fast_path_state packed counter. III) Similarly, since we also would like to check in the same atomic CAS that there are no granted or pending obtrusive locks, we have to add a HAS_OBTRUSIVE bit flag in the m_fast_path_state, while keeping MDL_lock::m_obtrusive_locks_granted_waiting_count. This flag should be set when we are about to try acquiring an obtrusive lock and cleared once the last granted or pending obtrusive lock goes away. Most of the remaining changes in this patch are necessary in order to fix bug #18077895 "WL7305 PUSH (7249) CAUSES SYSBENCH OLTP_RO 1 THREAD TO DROP UP TO -8%". This bug manifested itself as a slowdown for workloads involving 1 connection in cases when there were many concurrent connections to the same server in the past or there were many dormant connections at the same time as 1 active connection. In such scenarios the release of a metadata lock meant that MDL_lock became unused, was removed from lock-free hash with all lock objects and we tried to return it back to allocator. The latter operation involved scanning pins for all current and past connections, which became fairly expensive in this scenario. This patch solves this problem by avoiding releasing MDL_lock objects and removing them from the hash once they are no longer used. Instead we keep unused objects in MDL_map and start their eviction only if their number passes certain threshold and the ratio of unused/total lock objects is big enough. We evict random unused objects so on average objects which are used more often will stay in the hash and rarely used objects will go away. The above idea is implemented by: a) Introducing a new HAS_SLOW_PATH flag in the MDL_lock::m_fast_path_state member, which indicates if there any tickets in MDL_lock::m_granted and m_waiting lists or we are about try to add one. 
Most of the remaining changes in this patch are necessary in order to fix bug #18077895 "WL7305 PUSH (7249) CAUSES SYSBENCH OLTP_RO 1 THREAD TO DROP UP TO -8%". This bug manifested itself as a slowdown for workloads involving a single connection in cases when there had been many concurrent connections to the same server in the past, or when there were many dormant connections alongside the single active connection. In such scenarios the release of a metadata lock meant that the MDL_lock became unused, was removed from the lock-free hash containing all lock objects, and we tried to return it to the allocator. The latter operation involved scanning pins for all current and past connections, which became fairly expensive in this scenario.

This patch solves the problem by no longer releasing MDL_lock objects and removing them from the hash as soon as they become unused. Instead we keep unused objects in MDL_map and start evicting them only once their number passes a certain threshold and the ratio of unused to total lock objects is big enough. We evict random unused objects, so on average objects which are used more often stay in the hash and rarely used objects go away.

The above idea is implemented by:

a) Introducing a new HAS_SLOW_PATH flag in the MDL_lock::m_fast_path_state member, which indicates whether there are any tickets in the MDL_lock::m_granted and m_waiting lists or we are about to try adding one. Thanks to this flag, it is possible to distinguish between used and unused MDL_lock objects in the atomic compare-and-swap operations used to implement fast path acquisition and release of locks.

b) Changing the code which releases locks so that it no longer removes unused MDL_lock objects from the hash and deletes them afterwards. Instead we atomically increment the newly introduced MDL_map::m_unused_lock_objects counter. Similarly, on the first acquisition of a lock for an MDL_lock which was previously unused, we atomically decrement this counter.

c) In cases when the increment of the MDL_map::m_unused_lock_objects counter exceeds the threshold value and the unused/total objects ratio is high enough, we try to reduce the number of unused objects. We look up a random unused object in MDL_map, mark it as destroyed, remove it from the hash and return it to the allocator. As a consequence the MDL_map::remove() method has become MDL_map::remove_random_unused(). (A rough sketch of this eviction decision is included at the end of this message.)

d) To support the change described in c), a new lf_hash_random_match() function was introduced which allows us to efficiently find a random object matching a certain condition in an LF_HASH.

e) Also to support the change described in c), a new PRNG was added to the MDL_context class. This PRNG is used as a source of randomness for look-ups of random unused objects.

Unit tests were added covering the handling of unused MDL_lock objects and the new lf_hash_random_match() function.

Finally, this patch fixes a violation of the pinning protocol, which was introduced by WL7305 and which occurred when the MDL subsystem failed to look up an MDL_lock object in the lock-free hash. The LF_HASH documentation was updated to reflect the need to call lf_hash_search_unpin in this case.
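As a rough illustration of the eviction heuristic from item c) above, here is a minimal sketch. It is not the code from the patch: the class name, the helpers lock_object_became_unused()/evict_random_unused(), and the concrete threshold and ratio values are assumptions made for the example; only the m_unused_lock_objects counter and the general "absolute threshold plus unused/total ratio" rule come from the commit message.

    #include <atomic>

    class MDL_map_sketch {
     public:
      // Called after a lock object becomes unused (no tickets, no
      // fast-path locks): cache it instead of freeing it, and only try
      // to evict once unused objects are both numerous and a large
      // enough fraction of all cached lock objects.
      void lock_object_became_unused() {
        int unused = ++m_unused_lock_objects;
        int total = m_total_lock_objects.load(std::memory_order_relaxed);
        if (unused > kUnusedThreshold &&
            unused > static_cast<int>(kMaxUnusedRatio * total))
          evict_random_unused();
      }

      // Called on the first acquisition of a lock on a previously unused
      // object.
      void lock_object_became_used() { --m_unused_lock_objects; }

     private:
      void evict_random_unused() {
        // Placeholder for: pick a random unused object in the hash (the
        // patch adds lf_hash_random_match() and a per-context PRNG for
        // this), mark it IS_DESTROYED with a CAS, remove it from the hash
        // and return it to the allocator.
      }

      static constexpr int kUnusedThreshold = 1000;    // assumed value
      static constexpr double kMaxUnusedRatio = 0.25;  // assumed value
      std::atomic<int> m_unused_lock_objects{0};
      std::atomic<int> m_total_lock_objects{0};
    };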