Sujatha Sivakumar authored
SHOW SLAVE STATUS

Problem:
=======
If a client thread on a slave runs FLUSH TABLES WITH READ LOCK (FTWRL) and the master then performs some updates, SHOW SLAVE STATUS in the same client is blocked. If the blocked slave server is killed and restarted while GTIDs are enabled, one GTID is missed, leaving the slave out of sync. Using 'relay-log-recovery=1' should perform a crash-safe slave recovery, but this does not happen.

Analysis:
========
The RPL info tables "slave_master_info" and "slave_relay_log_info" store the slave threads' positions. When FTWRL is issued on the slave, it blocks all operations on tables. Hence, when updates are done on the master, the slave thread is able to open the info tables, but the internal commit performed while closing them is blocked. These blocked info tables cause "SHOW SLAVE STATUS" to hang.

When "relay-log-recovery=1" is set, all partially written events are discarded during crash recovery and the master info table is initialised with the information read from the relay log info table. The "Retrieved GTID" set should be cleared as well, so that partially read events are discarded and fetched once again. Since this is not happening, the "Retrieved GTID" is considered to be executed and the actual transaction is skipped.

Fix:
===
The info tables should be made independent of the global read lock. When the RPL info tables are opened, the "MYSQL_OPEN_IGNORE_GLOBAL_READ_LOCK" flag is set so that they do not block while FTWRL is in progress. A similar flag is introduced in the "ha_commit_trans" code, which allows a commit to complete even if a global read lock is active. This flag can be used to allow changes to internal tables (e.g. slave status tables).

To fix the missing GTID issue with the "relay-log-recovery" option, the retrieved GTID set is cleared during the recovery process.
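
The two parts of the fix can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the server code: the class `ToySlave`, its fields, and the integer GTIDs are hypothetical, and a commit that would block is modelled by returning `False` instead of waiting. The recovery step assumes that the slave skips any GTID already in its executed or retrieved set, which is how the commit message describes the failure.

```python
from dataclasses import dataclass, field


@dataclass
class ToySlave:
    """Hypothetical toy model of a replication slave (not MySQL's real structures)."""
    global_read_lock: bool = False                      # True while FTWRL is held
    info_table_commits: int = 0                         # commits to the RPL info tables
    executed_gtids: set = field(default_factory=set)    # transactions already applied
    retrieved_gtids: set = field(default_factory=set)   # transactions read into the relay log

    def commit(self, ignore_global_read_lock: bool = False) -> bool:
        """Return True if the commit completes, False if it would block."""
        if self.global_read_lock and not ignore_global_read_lock:
            # A real server would wait here, which is what hangs SHOW SLAVE STATUS.
            return False
        self.info_table_commits += 1
        return True

    def recover(self, master_gtids: set, clear_retrieved: bool) -> set:
        """Model relay-log recovery; return the GTIDs that get fetched again."""
        if clear_retrieved:
            # The fix: forget partially read events so they are requested again.
            self.retrieved_gtids.clear()
        known = self.executed_gtids | self.retrieved_gtids
        return {gtid for gtid in master_gtids if gtid not in known}


# Part 1: an info-table commit under FTWRL blocks unless the flag is passed.
locked = ToySlave(global_read_lock=True)
blocked_result = locked.commit()
flagged_result = locked.commit(ignore_global_read_lock=True)

# Part 2: GTID 3 was read but never applied before the crash.
before_fix = ToySlave(executed_gtids={1, 2}, retrieved_gtids={1, 2, 3})
after_fix = ToySlave(executed_gtids={1, 2}, retrieved_gtids={1, 2, 3})
refetched_before = before_fix.recover({1, 2, 3}, clear_retrieved=False)
refetched_after = after_fix.recover({1, 2, 3}, clear_retrieved=True)
```

In this model, the unflagged commit is refused while the flagged one goes through, and only the recovery that clears the retrieved set asks for GTID 3 again. Without the clearing step, GTID 3 is treated as already handled and is never applied.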