    Bug#21507981: REPLICATION POSITION LOST AFTER CRASH ON MTS
    CONFIGURED SLAVE
    Commit 3bffd42b, authored by Sujatha Sivakumar
    
    Problem:
    ========
    Enable MTS along with crash-safe replication tables. Make
    sure that the server is busily inserting data with multiple
    threads in parallel. Shut down mysqld uncleanly (kill -9 or
    power off the server without notice).
    
    Users then restart the server with --relay-log-recovery=1
    to recover the crashed slave.
    
    This results in the following error:
    ====================================
    2015-06-24 13:49:03 3895 [ERROR] --relay-log-recovery cannot
    be executed when the slave was stopped with an error or
    killed in MTS mode; consider using RESET SLAVE or restart
    the server with --relay-log-recovery = 0 followed by
    START SLAVE UNTIL SQL_AFTER_MTS_GAPS.
    
    i.e., relay-log-recovery does not work after an unclean
    stop in MTS mode.
    
    Manual steps followed to fix this issue:
    ========================================
    1) Restart the server with --relay-log-recovery=0.
    2) Execute START SLAVE UNTIL SQL_AFTER_MTS_GAPS.
    This step ensures that the gaps are filled; the slave
    stops at this point.
    3) Restart the slave server with --relay-log-recovery=1.
    
    Analysis:
    =========
    The process described above requires manual intervention.
    It needs to be automated.
    
    Fix:
    ====
    During crash recovery, if gaps are present in MTS execution,
    START SLAVE UNTIL SQL_AFTER_MTS_GAPS is invoked implicitly
    and the gaps are filled. Once the slave reaches this
    gapless, consistent state it stops. The Receiver thread's
    positions are then initialized to the latest Applier
    thread's positions and the old relay logs are discarded, as
    is done in single-threaded slave mode.
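
    The recovery sequence described above can be pictured with
    a small model. This is only an illustrative sketch, not the
    server's implementation: transactions are assumed to be
    numbered 1..N in relay-log order, and the names find_gaps,
    recover and RecoveryResult are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RecoveryResult:
    applied: list[int]        # gap transactions replayed during recovery
    applier_position: int     # last transaction of the gapless prefix
    receiver_position: int    # where the receiver resumes fetching


def find_gaps(committed: set[int]) -> list[int]:
    """Return transaction numbers missing below the highest committed one."""
    if not committed:
        return []
    return [t for t in range(1, max(committed) + 1) if t not in committed]


def recover(committed: set[int]) -> RecoveryResult:
    """Fill the gaps, stop, then align the receiver with the applier."""
    gaps = find_gaps(committed)
    state = set(committed)
    for txn in gaps:          # models START SLAVE UNTIL SQL_AFTER_MTS_GAPS
        state.add(txn)
    applier_position = max(state, default=0)
    # Old relay logs are discarded; the receiver restarts from the applier.
    return RecoveryResult(gaps, applier_position, applier_position)
```

    For example, if the workers had committed transactions
    1, 2, 4 and 6 before the crash, recover({1, 2, 4, 6})
    replays 3 and 5 and leaves both positions at 6.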
    
    This recovery process may not work if MTS stopped due to an
    error during an earlier session. In that case an appropriate
    error message is printed, and users have to eliminate the
    root cause of the error and restart the recovery process.