Joao Gramacho authored
WITH A DIFFERENT GTID

BUG#18306199 START SLAVE UNTIL MASTER_LOG_POS=MIDDLE-OF-GTID-EVENT STOPS BEFORE TRANSACTION

Problem:
=======
The slave loses track of GTID-header group boundaries when the group spans multiple relay log files. As a result, when the transaction is retried, or when the SQL thread is stopped in the middle of a transaction after some Rotates (that is, when the transaction/group spans multiple relay log files), the Gtid_log_event is silently skipped on the slave and the transaction is logged with a slave's GTID.

Also, when using "START SLAVE UNTIL MASTER_LOG_POS = x;", if "x" is in the middle of a transaction, the server is supposed to complete the transaction. This works fine when GTIDs are disabled. However, when GTIDs are enabled and "x" is in the middle of the Gtid_log_event, the slave stops before the transaction, because it does not consider the Gtid_log_event as the beginning of the transaction.

Analysis:
========
When Rotate events are applied at the slave, the SQL thread verifies that it is not inside a transaction before updating the SQL thread position. For STS, the SQL thread uses the Relay_log_info::is_in_group() function to determine whether it is inside a transaction.

A problem was found in Relay_log_info::is_in_group(): it did not consider a Gtid_log_event as the beginning of a group/transaction. Because of this, the SQL thread was updating its position when applying a Rotate_log_event that immediately follows a Gtid_log_event (i.e., in the middle of a transaction but outside BEGIN...COMMIT).

This should not happen: the SQL thread position must not be updated in the middle of a transaction, so that the transaction can be retried (or re-applied) from the beginning after a failure caused by an InnoDB deadlock or by the transaction's execution time exceeding InnoDB's innodb_lock_wait_timeout (or after a request to stop the SQL thread).
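The effect of the group check on the Rotate handling can be illustrated with a small standalone model. This is only a sketch and not the server code: SqlThreadModel, its fields, the gtid_aware switch, and the position values 100 and 200 are all hypothetical. It compares a check that ignores the Gtid_log_event with one that counts it as the start of the group.

```cpp
// Simplified model for illustration; NOT the MySQL server source.
// All type, field, and function names below are hypothetical.
#include <cstdint>

enum class EventType { Gtid, Begin, Row, Rotate, Commit };

struct SqlThreadModel {
  bool gtid_aware;               // false: group check ignores Gtid_log_event
  bool in_begin_block = false;   // true between BEGIN and COMMIT
  bool gtid_seen = false;        // true between Gtid_log_event and COMMIT
  std::uint64_t group_pos = 0;   // position a retry would restart from

  explicit SqlThreadModel(bool aware) : gtid_aware(aware) {}

  // Models the question "is the SQL thread inside a group/transaction?".
  bool is_in_group() const {
    return in_begin_block || (gtid_aware && gtid_seen);
  }

  void apply(EventType type, std::uint64_t end_pos) {
    switch (type) {
      case EventType::Gtid:
        gtid_seen = true;
        break;
      case EventType::Begin:
        in_begin_block = true;
        break;
      case EventType::Row:
        break;
      case EventType::Rotate:
        // A Rotate may only advance the position outside a group.
        if (!is_in_group()) group_pos = end_pos;
        break;
      case EventType::Commit:
        in_begin_block = false;
        gtid_seen = false;
        group_pos = end_pos;  // the group is complete
        break;
    }
  }
};

// Applies a Gtid event followed by a Rotate and returns the position
// from which a retry of the transaction would restart.
std::uint64_t position_after_gtid_then_rotate(bool gtid_aware) {
  SqlThreadModel sql_thread(gtid_aware);
  sql_thread.apply(EventType::Gtid, 100);    // transaction header
  sql_thread.apply(EventType::Rotate, 200);  // relay log rotates mid-group
  return sql_thread.group_pos;
}
```

In this model, position_after_gtid_then_rotate(false) moves the restart position past the Gtid event, so a retry would begin after the header and skip it. With position_after_gtid_then_rotate(true) the position stays at the start of the group.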
For the problem with "START SLAVE UNTIL MASTER...", the SQL thread uses the Relay_log_info::is_until_satisfied() function to verify whether the until condition is satisfied. In this function, when the until condition is UNTIL_MASTER_POS, the SQL thread uses the current transaction position if it is in the middle of a transaction, or the current event position if it is not. This verification did not consider a Gtid_log_event as the beginning of a transaction.

Fix:
===
Made Relay_log_info::is_in_group() check whether the thread has a GTID set for the current transaction. If so, it returns true, indicating that it is already in a group.

Made Relay_log_info::is_until_satisfied() use the is_in_group() function to verify whether the SQL thread is in the middle of a transaction. With this change, is_until_satisfied() considers the Gtid_log_event as part of the transaction.
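The position selection described for UNTIL_MASTER_POS can be sketched as follows. This is a hedged model and not the server implementation: UntilCheckModel, its fields, and both functions are hypothetical names, and the ">=" comparison against the target is an assumption about how the chosen position is tested.

```cpp
// Simplified model for illustration; NOT the MySQL server source.
// All names are hypothetical; the ">=" comparison is an assumption.
#include <cstdint>

struct UntilCheckModel {
  bool in_group;                  // what the group check reports
  std::uint64_t group_start_pos;  // master position where the group began
  std::uint64_t event_pos;        // master position of the current event
};

// Inside a group, compare the group's start position; otherwise compare
// the position of the current event.
std::uint64_t position_to_compare(const UntilCheckModel& state) {
  return state.in_group ? state.group_start_pos : state.event_pos;
}

// Returns true when the SQL thread should stop before the current event.
bool until_satisfied(const UntilCheckModel& state, std::uint64_t until_pos) {
  return position_to_compare(state) >= until_pos;
}
```

As a usage example with made-up positions, take a Gtid_log_event that starts at 100, a BEGIN event that starts at 148, and an UNTIL target of 120 that falls inside the Gtid event. If the group check ignores the GTID, the state at the BEGIN event is {false, 100, 148}: 148 is compared with 120 and the thread stops before the transaction body. If the group check counts the GTID, the state is {true, 100, 148}: 100 is compared with 120 and the transaction is allowed to complete.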