    700ee866
    Marc Alff authored
    Bug#14116386 PERFSCHEMA.TABLE_LOCK_AGGREGATE_GLOBAL_4U_3T FAILS ON MYSQL-TRUNK SPORADICALLY
    
    This fix is a server code change that resolves an issue of threads
    spuriously disappearing from the performance schema.
    
    This issue randomly affects many test scripts, in particular
    scripts that rely heavily on per-thread statistics.
    
    Background information:
    
    Test scripts are executed with mysql-test-run, a client.
    The test client considers the server "done" executing a statement
    once the server has finished writing all the reply bytes to the
    client socket connection.
    In terms of code, this happens after the last socket write event, which is
    technically still part of the current statement execution.
    
    However, the server is not "done" yet, and is not idle.
    The server is still executing code, such as:
    - more wait events, such as mutex locks,
    - more stage events, such as "cleaning up",
    - finishing the current statement instrumentation,
    - starting an idle wait event.
    
    Only when the server is blocked in an IDLE event can it be
    considered to be in a stable state. Anything that happens between the last
    socket write and the idle wait can potentially interfere with the queries
    executed by the client script (in a different connection).
    This scenario is very common for the test scripts involved, where a root
    monitoring connection spies on a regular client connection.
    
    Of particular interest, during that window of execution in the server, are
    calls that maintain the performance_schema.threads table, for the
    PROCESSLIST_STATE and PROCESSLIST_INFO columns.
    These columns are still being updated while the test client queries the
    performance schema.
    
    The problem is with the implementation of:
    - set_thread_state_v1()
    - set_thread_info_v1()
    which for a short period of time flag the entire thread as dirty.
    
    This code causes the thread to literally disappear and reappear later in:
    - the threads table
    - every aggregate table that iterates on threads
    causing the test failures seen.
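
The disappearing-thread effect described above can be illustrated with a minimal, single-threaded sketch. It is not the server code: SketchThread, SketchLockState and the helper functions are hypothetical names, and the three-state lock is only a loose stand-in for the record lock that readers check before exposing a thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for a performance schema record lock.
enum class SketchLockState { FREE, DIRTY, ALLOCATED };

// Hypothetical record: one lock guards the whole thread entry.
struct SketchThread {
  std::atomic<SketchLockState> m_lock{SketchLockState::ALLOCATED};
  std::atomic<int> m_processlist_state{0};
};

// Pre-fix pattern: updating one attribute flags the entire record as dirty.
inline void begin_state_update(SketchThread &t) {
  t.m_lock.store(SketchLockState::DIRTY);
}

inline void end_state_update(SketchThread &t, int new_state) {
  assert(t.m_lock.load() == SketchLockState::DIRTY);
  t.m_processlist_state.store(new_state);
  t.m_lock.store(SketchLockState::ALLOCATED);
}

// Readers (the threads table, aggregates) skip any record that is not
// fully allocated, so a record in mid-update is not counted.
inline std::size_t count_visible_threads(const SketchThread *threads,
                                         std::size_t count) {
  std::size_t visible = 0;
  for (std::size_t i = 0; i < count; ++i) {
    if (threads[i].m_lock.load() == SketchLockState::ALLOCATED) ++visible;
  }
  return visible;
}
```

In this sketch, a scan that runs between begin_state_update() and end_state_update() sees one thread fewer than a scan that runs before or after, which is the kind of instability the monitoring connection observes.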
    
    This issue has in fact a significant impact, as these instrumentation points
    are called multiple times during a statement execution: at each instrumented
    stage.
    
    The fix is to:
    - not touch the PFS_thread::m_lock 
    - define a dedicated lock to maintain integrity of the processlist state and info
      attributes, PFS_thread::m_processlist_lock.
    
    With this fix, a PFS_thread is not spuriously hidden each time the thread state
    changes, making every aggregate based on threads more stable and accurate.
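
The idea behind the fix can be sketched in the same spirit. This is an illustration under assumptions, not the actual patch: SketchThreadFixed and its helpers are hypothetical, and a version counter (odd while a write is in progress) stands in for PFS_thread::m_processlist_lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical record with two independent guards.
struct SketchThreadFixed {
  // Stands in for PFS_thread::m_lock: controls record visibility only.
  std::atomic<bool> m_record_allocated{true};
  // Stands in for PFS_thread::m_processlist_lock: odd while state/info
  // attributes are being written, even otherwise.
  std::atomic<unsigned> m_processlist_version{0};
  std::atomic<int> m_processlist_state{0};
};

// Post-fix pattern: only the dedicated guard is touched.
inline void begin_processlist_update(SketchThreadFixed &t) {
  t.m_processlist_version.fetch_add(1);  // becomes odd
}

inline void end_processlist_update(SketchThreadFixed &t, int new_state) {
  assert(t.m_processlist_version.load() % 2 == 1);
  t.m_processlist_state.store(new_state);
  t.m_processlist_version.fetch_add(1);  // becomes even again
}

// Scans look at the record guard only, so attribute updates never hide a row.
inline std::size_t count_visible_fixed(const SketchThreadFixed *threads,
                                       std::size_t count) {
  std::size_t visible = 0;
  for (std::size_t i = 0; i < count; ++i) {
    if (threads[i].m_record_allocated.load()) ++visible;
  }
  return visible;
}

// Returns true and fills *out only if no update was in flight during the read.
inline bool try_read_processlist_state(const SketchThreadFixed &t, int *out) {
  const unsigned before = t.m_processlist_version.load();
  if (before % 2 == 1) return false;  // write in progress
  const int copy = t.m_processlist_state.load();
  if (t.m_processlist_version.load() != before) return false;  // changed
  *out = copy;
  return true;
}
```

With the guards separated, a scan that runs during an update still counts every thread. Only the read of the state attribute itself may have to be skipped or retried.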