Andrei Elkin authored
The assert is hit when an XA transaction has updated only a non-transactional table and reaches the prepare stage. At that point MYSQL_BIN_LOG::write_binlog_and_commit_engine() is invoked, where the transaction should not attempt any commit. But it did: under the conditions of the reported case the binlog "engine" managed to call commit_low(), which led to the assert a few instructions later:

  #7 0x00007fad151dbe42 in __GI___assert_fail (assertion=0x1e37d16 "is_started()", file=0x1e37c68
  #8 0x0000000000fafcf7 in Ha_trx_info::next (this=0x7fac8c002028) at
  #9 0x0000000000f9dfa9 in ha_prepare (thd=0x7fac8c000bb0)

Further analysis showed a related, "inverted" issue in the rollback branch. When an XA transaction that touched only a non-transactional engine is rolled back, rollback_low() is not executed. This leaves the session in an inconsistent state, which the following query trips over with an assert:

  bool trans_commit_stmt(THD*): Assertion `thd->in_active_multi_stmt_transaction() || thd->m_transaction_psi == __null' failed.

Finally, testing revealed a combination with no test coverage so far: XA ROLLBACK with no transactional tables involved and no XA PREPARE. In that case the logging was simply incorrect, mixing XA START and ROLLBACK.

The original issue is fixed by making MYSQL_BIN_LOG::write_binlog_and_commit_engine() compute a local boolean flag `skip-commit' correctly from the XA state. Commit is disallowed when the state is Prepared. To accommodate one-phase XA, the committing XA is also given XA_PREPARED status, as an intermediate state, right after the prepare phase is done in the general commit handler ha_commit_trans().

The second issue, in the rollback part, is fixed by relocating an existing explicit rollback_low() for XA ROLLBACK to a safer point. The third issue is fixed by augmenting the trans_cannot_safely_rollback(thd) branch of ending_trans() to compose an appropriate Query-log-event.
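The `skip-commit' rule for the original issue can be illustrated with a toy model. This is only a sketch under stated assumptions, not the server code: the XaState enum and the functions compute_skip_commit() and write_binlog_and_maybe_commit() are hypothetical names introduced here, and the model captures only the rule that no engine commit is attempted while the XA state is Prepared.

```cpp
#include <cassert>

// Hypothetical, simplified XA states; the names are illustrative only
// and are not the server's own definitions.
enum class XaState { NotStarted, Active, Idle, Prepared };

// The rule from the message: commit is disallowed when the state is Prepared.
inline bool compute_skip_commit(XaState state) {
  return state == XaState::Prepared;
}

// Toy stand-in for the binlog write-and-commit step.
// Returns true if an engine commit would be attempted, false if it is skipped.
inline bool write_binlog_and_maybe_commit(XaState state) {
  const bool skip_commit = compute_skip_commit(state);
  // ... the binlog write itself would happen here ...
  return !skip_commit;
}
```

In this model, calling write_binlog_and_maybe_commit(XaState::Prepared) reports that no commit is attempted, which is the behaviour the fix restores for the prepare stage.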
The logic that prevents a second invocation of do_binlog_xa_commit_rollback(), which is relevant to the "externally" committing XA, is simplified. The former idea was that the first invocation of do_binlog_xa_commit_rollback() in the "external" XA commit branch would necessarily turn the cache from empty (asserted) to non-empty. On the other hand, when do_binlog_xa_commit_rollback() runs for the local XA the cache must be empty (because it should have been flushed at prepare), which is asserted in the rollback case too. Hence checking the state of the cache was correct: the local XA goes through, and the external XA goes through once, on the first invocation. Now, instead of the above deduction, the first invocation is explicitly flagged. And because we would like to preserve the signatures of the MYSQL_BIN_LOG class methods, the flag is passed as a new member, `bool binlog_cache_mngr::has_logged_xid'.

Mixing transactional and non-transactional tables in rpl.rpl_xa_survive_disconnect_mixed_engines reveals an issue in MTS grouping. An XA transaction "prepare" group can be closed by an XA-ROLLBACK query, which was previously not captured. A use case for that is mixed transactional and non-transactional updates. This has been corrected. As a side effect, is_loggable_xa_prepare() had to be refined to satisfy @c simulate_commit_failure.
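The explicit first-invocation flag described above can be sketched as follows. This is a minimal illustration, not the actual patch: CacheManagerSketch and log_xa_commit_rollback_once() are hypothetical names, and only the member name has_logged_xid is taken from the message. The point is that the guard is stored in the per-session object rather than passed as a parameter, so the method signatures stay unchanged.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-session binlog cache manager.
struct CacheManagerSketch {
  bool has_logged_xid = false;  // set by the first invocation
  int events_logged = 0;        // how many times the XA event was written
};

// Returns true if this call wrote the event, false if it was a repeated
// call that was skipped because the flag was already set.
inline bool log_xa_commit_rollback_once(CacheManagerSketch &mngr) {
  if (mngr.has_logged_xid) return false;  // second invocation: do nothing
  ++mngr.events_logged;                   // first invocation: log the event
  mngr.has_logged_xid = true;
  return true;
}
```

Compared with deducing "first invocation" from whether the cache is empty, the flag states the intent directly and does not depend on the cache contents.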