    Commit 093b0dc4, authored by Joao Gramacho

    BUG#18652178 STOP SQL_THREAD, START SQL_THREAD CAUSES A TRX TO LOG
                 WITH A DIFFERENT GTID
    BUG#18306199 START SLAVE UNTIL MASTER_LOG_POS=MIDDLE-OF-GTID-EVENT
                 STOPS BEFORE TRANSACTION
    Problem:
    =======
    
    The slave loses track of group boundaries that begin with a
    Gtid_log_event when the group spans multiple relay log files.
    
    As a result, when the transaction is retried, or when the SQL thread
    is stopped in the middle of a transaction after one or more Rotates
    (that is, when the transaction/group spans multiple relay log
    files), the Gtid_log_event is silently skipped on the slave and the
    transaction is logged with a GTID generated by the slave.
    
    Also, when using "START SLAVE UNTIL MASTER_LOG_POS = x;", if "x" is in
    the middle of a transaction, the server is supposed to complete
    the transaction. This works fine when GTIDs are disabled. However,
    when GTIDs are enabled and "x" is in the middle of the Gtid_log_event,
    the slave stops before the transaction, because it does not consider
    the Gtid_log_event to be the beginning of the transaction.
    
    Analysis:
    ========
    
    When a Rotate event is applied on the slave, the SQL thread updates
    its position only if it is not inside a transaction. For STS
    (single-threaded slave), the SQL thread uses the
    Relay_log_info::is_in_group() function to determine whether it is
    inside a transaction.
    
    A problem was found in Relay_log_info::is_in_group(): it did not
    consider a Gtid_log_event to be the beginning of a group/transaction.
    
    Because of this problem, the SQL thread was updating its position
    when applying a Rotate_log_event that immediately follows a
    Gtid_log_event (i.e., in the middle of a transaction but outside
    BEGIN...COMMIT).
    
    This should not happen: the SQL thread position must not be updated
    in the middle of a transaction, so that the transaction can be
    retried (or re-applied) from the beginning after a failure caused by
    an InnoDB deadlock, after a lock wait exceeds InnoDB's
    innodb_lock_wait_timeout, or after a request to stop the SQL thread.
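
    The effect described above can be reproduced with a small toy
    model. This is only a sketch under stated assumptions, not the
    server's code: the Ev enum, the Applier struct, and restart_point()
    are hypothetical names, and the relay log is reduced to a vector of
    event kinds. The flag gtid_counts_as_group_start switches between
    the old and the intended group detection.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event kinds; the real server has many more.
enum class Ev { Gtid, Begin, Row, Rotate, Commit };

// Toy applier state. None of these names are the server's own.
struct Applier {
  bool gtid_counts_as_group_start;  // false: old check, true: intended check
  bool have_gtid = false;           // a Gtid event was applied for this group
  bool in_begin = false;            // between BEGIN and COMMIT
  std::size_t restart_index = 0;    // where a retry or restart resumes

  explicit Applier(bool fixed) : gtid_counts_as_group_start(fixed) {}

  // Simplified stand-in for the "am I inside a group?" question.
  bool in_group() const {
    return in_begin || (gtid_counts_as_group_start && have_gtid);
  }

  void apply(Ev ev, std::size_t index) {
    switch (ev) {
      case Ev::Gtid:
        have_gtid = true;
        break;
      case Ev::Begin:
        in_begin = true;
        break;
      case Ev::Row:
        break;
      case Ev::Rotate:
        // The position only moves when the applier is outside a group.
        if (!in_group()) restart_index = index + 1;
        break;
      case Ev::Commit:
        in_begin = false;
        have_gtid = false;
        restart_index = index + 1;
        break;
    }
  }
};

// Index the applier would resume from if it were stopped after applying
// the first `stop_after` events of `log`.
std::size_t restart_point(bool fixed, const std::vector<Ev>& log,
                          std::size_t stop_after) {
  assert(stop_after <= log.size());
  Applier applier(fixed);
  for (std::size_t i = 0; i < stop_after; ++i) applier.apply(log[i], i);
  return applier.restart_index;
}
```

    For the log {Gtid, Rotate, Begin, Row, Commit}, stopping after
    Begin gives a restart index of 2 with the old check (the Gtid event
    is skipped on retry) and 0 when the Gtid event counts as the start
    of the group.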
    
    For the problem with "START SLAVE UNTIL MASTER...", the SQL thread
    uses the Relay_log_info::is_until_satisfied() function to verify
    whether the until condition is satisfied. In this function, when the
    until condition is UNTIL_MASTER_POS, the SQL thread uses the current
    transaction position if it is in the middle of a transaction, or the
    current event position otherwise. This check did not consider a
    Gtid_log_event to be the beginning of a transaction.
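
    The position choice described above can be sketched as follows.
    This is an illustrative model, not the body of
    is_until_satisfied(): UntilState and until_master_pos_reached() are
    hypothetical names, and the comparison assumes that both positions
    refer to the same master log file.

```cpp
#include <cstdint>

// Hypothetical snapshot of the applier state; names are illustrative.
struct UntilState {
  std::uint64_t group_start_pos;  // master position where the current group began
  std::uint64_t event_pos;        // master position of the next event to apply
  bool in_group;                  // true while inside a transaction
};

// Returns true when the applier should stop for
// UNTIL MASTER_LOG_POS = until_pos.
bool until_master_pos_reached(const UntilState& state,
                              std::uint64_t until_pos) {
  // Inside a group the check uses the start of the group, so the group
  // is completed before stopping; outside a group it uses the event
  // position.
  const std::uint64_t pos =
      state.in_group ? state.group_start_pos : state.event_pos;
  return pos >= until_pos;
}
```

    As an assumed example, take a Gtid event that starts at position
    100, followed by BEGIN at 148, with until_pos = 120. Once the Gtid
    event has been applied, a check that reports in_group == false
    compares 148 against 120 and stops before BEGIN. A check that
    reports in_group == true compares 100 against 120 and lets the
    transaction complete.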
    
    Fix:
    ===
    
    Relay_log_info::is_in_group() now checks whether the thread has a
    GTID assigned to the current transaction. If so, it returns true,
    indicating that the thread is already in a group.
    
    Relay_log_info::is_until_satisfied() now uses the is_in_group()
    function to verify whether the SQL thread is in the middle of a
    transaction. With this change, is_until_satisfied() considers the
    Gtid_log_event to be part of the transaction.