    58a8ee30
    Fix for bug#28714367 "5.7.21+ LF_NODE METADATA LOCK LEAK WHEN USING GET_LOCK". · 58a8ee30
    Dmitry Lenev authored
    Calls to the GET_LOCK() function with a zero timeout that failed because
    a concurrent connection held the same user-level lock left the underlying
    metadata lock structure in a state that prevented its memory from being
    reused for other metadata locks (or released before server shutdown).
    As a result, memory was hogged by workloads in which different
    connections attempted to take user-level locks with random or constantly
    changing names using a zero timeout.
    
    The problem was introduced by the fix for bug#26739438 "DEADLOCK ON
    GET_LOCK(..., 0)". That fix added a short-cut to
    MDL_context::acquire_lock() for the case when the lock could not be
    acquired instantly and a zero timeout was used. However, the MDL_lock
    fast path state and obtrusive lock count were not cleaned up properly in
    this case, which left the MDL_lock object permanently marked as used.
    
    This fix solves the problem by changing acquire_lock() to call
    try_acquire_lock() when the timeout is zero. The latter call
    performs the cleanup properly.
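
The effect described above can be illustrated with a toy model. This is a
minimal sketch, not the server code: ToyLock, ToyLockMap, the single
ref_count field (a simplified stand-in for the fast path state and
obtrusive lock count) and the helper entries_left_after_contention() are
all hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy stand-in for an MDL_lock object (hypothetical, heavily simplified).
struct ToyLock {
  bool held = false;  // lock is currently granted to some connection
  int ref_count = 0;  // stand-in for fast path state / obtrusive lock count
};

// Toy stand-in for the container of lock objects (hypothetical).
class ToyLockMap {
 public:
  // Models the buggy zero-timeout short-cut: on conflict it returns
  // without undoing the bookkeeping, so the entry stays "in use".
  bool acquire_zero_timeout_buggy(const std::string &name) {
    ToyLock &lock = locks_[name];
    ++lock.ref_count;
    if (lock.held) return false;  // early return, ref_count never dropped
    lock.held = true;
    return true;
  }

  // Models a try-acquire path that cleans up after a failed attempt.
  bool try_acquire(const std::string &name) {
    ToyLock &lock = locks_[name];
    ++lock.ref_count;
    if (lock.held) {
      --lock.ref_count;  // cleanup on failure
      return false;
    }
    lock.held = true;
    return true;
  }

  // Models the fixed behaviour: zero timeout delegates to try_acquire().
  bool acquire_zero_timeout_fixed(const std::string &name) {
    return try_acquire(name);
  }

  // Releases a held lock and frees the entry once nothing references it.
  void release(const std::string &name) {
    auto it = locks_.find(name);
    if (it == locks_.end() || !it->second.held) return;
    it->second.held = false;
    if (--it->second.ref_count == 0) locks_.erase(it);
  }

  std::size_t size() const { return locks_.size(); }

 private:
  std::map<std::string, ToyLock> locks_;
};

// One connection holds the lock, another fails with zero timeout, then
// the holder releases. Returns how many lock entries remain afterwards.
std::size_t entries_left_after_contention(bool use_fixed_path) {
  ToyLockMap map;
  const std::string name = "user_lock_1";
  bool holder_ok = map.try_acquire(name);
  bool contender_ok = use_fixed_path ? map.acquire_zero_timeout_fixed(name)
                                     : map.acquire_zero_timeout_buggy(name);
  assert(holder_ok && !contender_ok);
  map.release(name);
  return map.size();
}
```

In this model, entries_left_after_contention(false) leaves one entry
behind for every contended lock name, which is how a workload with
constantly changing names would accumulate memory, while
entries_left_after_contention(true) leaves none.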
    
    It is hard to write a robust test case for this bug for our test suite,
    so no test case is provided as part of the patch. However, this fix was
    tested manually.