Sujatha Sivakumar authored
FILE/POS WITH SEMISYNC

Problem:
========
When DMLs are in progress on the master, stopping a slave and setting a
binlog file name/position that is ahead of the master's current one causes
an assert on the master.

Analysis:
========
Trx1 is waiting for an ack on a certain position. Meanwhile, a CHANGE MASTER
command is issued on the slave with a future log file name and position, and
the slave requests that position from the master. On receipt, the master's
dump thread assumes that it has received an ack up to the future log file and
position. It clears the active transaction list and sets reply_file_name_ and
reply_file_pos_ to the future file name and position. Since the reply file
name and position are now ahead of Trx1's current position, Trx1 completes
successfully.

A new DML, Trx2, is then executed. During commit, Trx2 waits for an ack on
some position, but the reply file name and position already point to the
future file name and position. Trx2 therefore concludes that it has already
been acked, does not wait, and proceeds. Because no ack was actually received
for it, its transaction node is not cleared from the active transaction list.
On exit from the commit operation there is an assert which expects the active
transaction node to have been cleared from the list. That assert fails.

Fix:
===
When a request from a slave is received, the requested file name and position
should be validated first, and the transmit_start hook should be fired only
after the validation succeeds. The transmit_start hook call has therefore
been moved so that it runs after the validation of the file name and
position. After the fix, when a future log file name and position is
specified, the ER_MASTER_FATAL_ERROR_READING_BINLOG error is returned to the
slave.

Note:
While fixing this bug, one more bug was identified.
Rpl_semi_sync_master_clients can become negative under certain conditions,
causing UNINSTALL PLUGIN rpl_semi_sync_master to fail.

Rpl_semi_sync_master_clients is incremented when the dump thread initiates
transmission and when a new slave connection is established. It is
decremented when the dump thread exits and the slave gets disconnected. If
the IO thread stops due to an error and the dump thread exits before the
"transmit_start" hook has run, the dump thread still tries to call
"transmit_stop", which results in Rpl_semi_sync_master_clients = -1, an
invalid state. Hence, as part of the fix, a new flag is introduced so that
"transmit_stop" is called only when "transmit_start" has been called.
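
The ordering described above can be illustrated with a small standalone sketch. This is not the server code: SemiSyncCounters, BinlogEnd, position_is_valid, DumpResult and run_dump_thread are hypothetical names invented for the illustration, and the position check is a deliberately simplified assumption (zero-padded binlog file names compared lexicographically). It only shows the two ideas from the message: validate the requested position before firing the start hook, and remember whether the start hook ran so that the stop hook is paired with it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the semisync master's client counter
// (reported as Rpl_semi_sync_master_clients in the commit message).
struct SemiSyncCounters {
  int clients = 0;
  void transmit_start() { ++clients; }  // models the "transmit_start" hook
  void transmit_stop() { --clients; }   // models the "transmit_stop" hook
};

// Hypothetical description of the last valid coordinate in the master's binlog.
struct BinlogEnd {
  std::string file;
  std::uint64_t pos;
};

// Simplified assumption: file names are zero-padded (e.g. "binlog.000003"),
// so a lexicographic comparison orders them. A request is valid if it does
// not point past the end of the master's binlog.
bool position_is_valid(const BinlogEnd& end, const std::string& file,
                       std::uint64_t pos) {
  if (file != end.file) return file < end.file;
  return pos <= end.pos;
}

enum class DumpResult { kOk, kFatalErrorReadingBinlog };

// Sketch of a dump thread's lifetime for one slave request.
DumpResult run_dump_thread(SemiSyncCounters& counters, const BinlogEnd& end,
                           const std::string& file, std::uint64_t pos) {
  bool transmit_started = false;  // the "new flag" idea from the message
  DumpResult result = DumpResult::kOk;

  if (!position_is_valid(end, file, pos)) {
    // Validation failed: the start hook is never fired for this request.
    result = DumpResult::kFatalErrorReadingBinlog;
  } else {
    counters.transmit_start();
    transmit_started = true;
    // ... events would be sent to the slave here ...
  }

  // Common exit path: only undo what was actually done.
  if (transmit_started) counters.transmit_stop();
  assert(counters.clients >= 0);  // the counter must never go negative
  return result;
}
```

Calling run_dump_thread with a file name beyond the master's last binlog returns kFatalErrorReadingBinlog and leaves the counter untouched, while a valid request increments and then decrements it. Without the transmit_started guard, the failing path would decrement a counter that was never incremented.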