    Bug #21618727 XA TRX DOES NOT WORK WITH SPECIFIED GTID WHEN BINLOG IS DISABLED · ca5d1791
    Andrei Elkin authored
    **Problem description**
    
    XA transactions are not handled correctly when the server runs with
    GTID_MODE=ON and --skip-log-bin.
    
    There are two errors with a common cause: gtid_state's save() is not
    called and GTID ownership is not cleaned up.
    The server can be in either the master or the slave role, which
    determines the execution path taken in XA commit and rollback.
    Both invocations are missing in either branch.
    
    **High-level description of fixes**
    
    The missing invocations are added for the cases where
    the XA transaction gets prepared, committed or rolled back.
    As a reminder about the prepared case: a prepared XA is a kind of
    transaction in its own right from the perspective of binary logging and
    the gtid_executed table (both being instances of the GTID persistency
    layer). Therefore, even when the XA is merely prepared, the statement on
    the gtid_executed table must be fully committed, that is, as a separate
    transaction.
    The patch handles this task, which was originally diagnosed in the
    bug21452916 report.
    To separate the GTID record from the prepared XA, the patch
    introduces Attachable_trx_rw, a read-write version of THD::Attachable_trx.
    
    **Low-level details** of the changes include:
    
    modified:   sql/handler.cc
      The computation that calls gtid_state::save(), common throughout the
      sources, is factored out as commit_owned_gtids().
      The new function is added to ha_prepare() to cover the update of
      mysql.gtid_executed at XA PREPARE time.
    
    modified:   sql/rpl_gtid_persist.cc
      Attachable_trx_rw is registered and deactivated (committed).
      The XA state is saved, reset for the subsequent execution of
      a statement on the 'gtid_executed' table, and then restored.
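
The save/reset/restore sequence described for sql/rpl_gtid_persist.cc can be pictured as a scope guard. The following is only a sketch under stated assumptions: `XaState`, `Session`, `XaStateGuard` and `state_seen_by_gtid_write()` are hypothetical stand-ins invented for illustration, not the server's types, and the commit message does not say whether the patch uses RAII for this.

```cpp
#include <cassert>

// Hypothetical, simplified XA states; the server's real enumeration differs.
enum class XaState { NotStarted, Active, Idle, Prepared };

// Hypothetical stand-in for the per-connection state the patch manipulates.
struct Session {
  XaState xa_state = XaState::NotStarted;
};

// Saves the current XA state, resets it for the duration of the scope,
// and restores the saved value when the scope ends.
class XaStateGuard {
 public:
  explicit XaStateGuard(Session &s) : session_(s), saved_(s.xa_state) {
    session_.xa_state = XaState::NotStarted;  // reset for the table write
  }
  ~XaStateGuard() { session_.xa_state = saved_; }  // restore
  XaStateGuard(const XaStateGuard &) = delete;
  XaStateGuard &operator=(const XaStateGuard &) = delete;

 private:
  Session &session_;
  XaState saved_;
};

// Returns the XA state that a statement on the GTID table would observe
// while the guard is active (the actual table write is omitted here).
XaState state_seen_by_gtid_write(Session &s) {
  XaStateGuard guard(s);
  return s.xa_state;
}

// Usage: a prepared XA looks "not started" to the table write and
// is prepared again once the write is done.
void demo_guard() {
  Session s;
  s.xa_state = XaState::Prepared;
  assert(state_seen_by_gtid_write(s) == XaState::NotStarted);
  assert(s.xa_state == XaState::Prepared);
}
```

The point of the guard shape is that the restore step cannot be skipped on an early return from the write path.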
    
    modified:   sql/sql_base.cc
      An assert guarding access to system tables in open_and_lock_tables() is relaxed
      to allow opening the specified tables for Attachable_trx_rw.
    
    modified:   sql/sql_class.cc
    modified:   sql/sql_class.h
      Definition of Attachable_trx_rw and associated interfaces.
      Attachable_trx_rw is based on THD::Attachable_trx.
    
    modified:   sql/xa.cc
      commit_owned_gtids() and gtid_state->update_on_*() are called in
      several functions to cover all cases of XA COMMIT and ROLLBACK,
      including the error and ONE PHASE cases.
    
      The common rule for choosing which of update_on_{commit,rollback}() to
      call is as follows:
    
      When commit_owned_gtids() succeeds, the insertion into the
      mysql.gtid_executed table was successful. Since
      @@global.gtid_executed must be consistent with the table,
      update_on_commit() is chosen in this case.
      Conversely, update_on_rollback() runs only to match a failing
      invocation of commit_owned_gtids().  When there is an RM error
      (e.g. an InnoDB unilateral failure) at commit or prepare time,
      commit_owned_gtids() is not invoked at all.
      In the XA-rollback case an RM error does not block the
      commit_owned_gtids() invocation, because the rollback intent does not
      change.
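
The selection rule above can be restated as a small decision function. This is a minimal sketch, not the code in sql/xa.cc: `XaOp`, `GtidUpdate`, `Decision` and `choose_gtid_update()` are hypothetical names introduced for illustration. The message does not state which update call, if any, follows when commit_owned_gtids() is skipped because of an RM error, so the sketch reports `GtidUpdate::None` for that case rather than guessing.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names come from the server sources.
enum class XaOp { Prepare, Commit, Rollback };
enum class GtidUpdate { None, OnCommit, OnRollback };

struct Decision {
  bool save_invoked;  // would commit_owned_gtids() be invoked at all?
  GtidUpdate update;  // which update_on_*() matches its outcome
};

// rm_error:      the resource manager reported a failure
// save_succeeds: outcome commit_owned_gtids() would have if invoked
Decision choose_gtid_update(XaOp op, bool rm_error, bool save_succeeds) {
  // At commit or prepare time an RM error skips the save entirely.
  if (rm_error && op != XaOp::Rollback) {
    return {false, GtidUpdate::None};
  }
  // Otherwise the save runs (for rollback even with an RM error) and the
  // in-memory state follows its result to stay consistent with the table.
  return {true, save_succeeds ? GtidUpdate::OnCommit : GtidUpdate::OnRollback};
}

// Usage: a few representative cases of the rule.
void demo_rule() {
  assert(choose_gtid_update(XaOp::Commit, false, true).update ==
         GtidUpdate::OnCommit);
  assert(choose_gtid_update(XaOp::Prepare, false, false).update ==
         GtidUpdate::OnRollback);
  assert(!choose_gtid_update(XaOp::Commit, true, true).save_invoked);
  assert(choose_gtid_update(XaOp::Rollback, true, true).save_invoked);
}
```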
    
    **Testing**
    
    The patch extends the existing tests for the combination of XA and
    manually specified GTIDs, specifically:
    
               suite/binlog/t/binlog_gtid_next_xa.test
    
    The above test is augmented with coverage of all affected
    commit-rollback execution branches in xa.cc and ha_prepare(), and is
    also made to run in a --skip-log-bin environment.
    
    new:
               suite/rpl/t/rpl_xa_survive_disconnect_lsu_off.test
    
    The slave logic is not directly affected, but the changes still fix
    an assert in Diagnostics_area::set_ok_status that fires when
    rpl_xa_survive_disconnect.test is run on a slave with
    --log-slave-updates=off and hits an expected error (expected because
    the XA state is not reset):
    
      ER_RMFAIL: The command cannot be executed when global transaction is
                 in the IDLE state.
    
    The new test is a version of the old one that requires the above slave
    server option.
    
    **NOTE**: This work does not cover crash safety; that matter is left
    to the earlier reported BUG#20672719.