Sujatha Sivakumar authored
SLAVE AND AUTOCOMMIT=OFF

Problem:
=======
Enable CRASH-SAFE MTS and run START SLAVE. Once the slave is up, stop it and set autocommit=0. Executing START SLAVE in this session then hangs for the duration of the lock wait timeout. After the timeout expires, START SLAVE proceeds and crashes during a call to 'Rpl_info::remove_info'.

Analysis:
========
The problem happened because the Rpl_info_table::do_check_info() methods, used while initializing the slave's internal structures, did not finish the transaction they started to access the info tables. Because the tables were still locked by the access from do_check_info(), the subsequent initialization procedures failed to acquire locks on the info tables.

Regarding the crash: during the MTS recovery process, workers are created to complete the recovery. When the initialization problem described above happens within the 'Create_worker' function call, the error "Failed to initialize the worker info structure" is reported and the "worker" object is deleted within 'Create_worker'. Upon returning to 'Relay_log_info::mts_finalize_recovery', the code tries to access the worker object that was already deleted, resulting in a crash.

Fix:
===
When info tables are used and autocommit=0, we force a new transaction to start and commit, to avoid deadlocks on START SLAVE with a CRASH-SAFE MTS slave. Added code to verify that the worker object exists before accessing it in the finalize recovery code.

sujatha:~/bug_repo/Bug21440793_mysql-5.6$ git pull
Already up-to-date.
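
The two parts of the fix can be illustrated with a minimal, self-contained C++ sketch. This is not the MySQL server code: `Session`, `check_info`, `create_worker`, and `finalize_recovery` are hypothetical stand-ins for the session, `do_check_info()`, `Create_worker`, and `mts_finalize_recovery`, and the lock handling is reduced to a single flag. It only models the idea that a read left uncommitted under autocommit=0 blocks later initialization, and that the caller must check for a missing worker before using it.

```cpp
#include <cassert>
#include <iostream>
#include <memory>

// Hypothetical stand-in for the session that runs START SLAVE.
struct Session {
    bool autocommit = true;
    bool holds_table_locks = false;  // models locks held on the info tables

    void begin() { holds_table_locks = true; }
    void commit() { holds_table_locks = false; }
};

// Hypothetical model of the info-table check. With autocommit off, the
// read leaves its transaction open unless force_commit is set, which is
// the behaviour the fix introduces.
bool check_info(Session &s, bool force_commit) {
    s.begin();
    // ... read the info table row here ...
    if (s.autocommit || force_commit) s.commit();
    return true;
}

struct Worker {
    int id;
};

// Hypothetical model of worker creation: returns nullptr when the info
// tables are still locked, mirroring the reported initialization error.
std::unique_ptr<Worker> create_worker(const Session &s, int id) {
    if (s.holds_table_locks) {
        std::cerr << "Failed to initialize the worker info structure\n";
        return nullptr;
    }
    return std::make_unique<Worker>(Worker{id});
}

// Hypothetical model of the finalize step: every worker is checked for
// existence before it is used. Returns the number of workers finalized.
int finalize_recovery(const Session &s, int n_workers) {
    assert(n_workers >= 0);
    int finalized = 0;
    for (int i = 0; i < n_workers; ++i) {
        std::unique_ptr<Worker> w = create_worker(s, i);
        if (!w) continue;  // guard: never touch a worker that was not created
        ++finalized;
    }
    return finalized;
}
```

In this sketch, calling `check_info(s, false)` on a session with `autocommit = false` leaves `holds_table_locks` set, so `finalize_recovery` finalizes no workers but also does not dereference a missing one. Calling `check_info(s, true)` releases the locks and all workers are created.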