    Commit 7d6110ec
    Sujatha Sivakumar authored

    Bug#27411175: RESET SLAVE ALL DOES NOT CLEAR
    MYSQL.SLAVE_RELAY_LOG_INFO
    
    Problem:
    =======
    RESET SLAVE / RESET SLAVE ALL does not remove errant relay
    log entries from the mysql.slave_relay_log_info table for
    group replication channels.
    
    Analysis:
    =========
    The existing code for the RESET SLAVE / RESET SLAVE ALL
    command does not include the group replication specific
    'group_replication_applier' and 'group_replication_recovery'
    channels in the RESET SLAVE [ALL] operation. As a result,
    these channels are not affected by the RESET SLAVE [ALL]
    command.
    
    Fix:
    ===
    Changed the code so that the group replication specific
    channels are handled by the RESET SLAVE [ALL] command.
    
    Please note that the RESET SLAVE [ALL] command does the
    RESET operation only when the group member is OFFLINE.
    Executing RESET SLAVE [ALL] on an ONLINE group member will
    result in an error.
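
The behaviour described in this fix can be sketched as a toy model. This is not the server code: `reset_slave`, its parameters, and the dictionary standing in for mysql.slave_relay_log_info are hypothetical names chosen for illustration, and only the two channel names come from the commit message.

```python
# Toy model only: a dict stands in for mysql.slave_relay_log_info,
# keyed by channel name. All function and parameter names are hypothetical.
GR_CHANNELS = ("group_replication_applier", "group_replication_recovery")


def reset_slave(relay_log_info, member_state="OFFLINE", include_gr_channels=True):
    """Remove relay log info rows and return the names of the cleared channels.

    include_gr_channels=False imitates the behaviour before the fix, where
    the group replication channels were skipped.
    """
    # Per the commit message, the reset is only done on an OFFLINE member.
    if member_state != "OFFLINE":
        raise RuntimeError(
            "RESET SLAVE is not allowed while the group member is " + member_state
        )
    targets = [
        name
        for name in relay_log_info
        if include_gr_channels or name not in GR_CHANNELS
    ]
    for name in targets:
        del relay_log_info[name]
    return sorted(targets)
```

With `include_gr_channels=False` the two group replication rows survive the reset, which is the symptom reported in the bug. With the default they are removed along with the other channels.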
    
    Bug#20280946: RESTARTING SLAVE SERVER POST 'RESET SLAVE'
    COMMAND, CLEANS UP SLAVE SETTING
    
    Problem:
    ========
    If the slave server is restarted after a 'RESET SLAVE'
    command, it forgets its replication configuration. The
    CHANGE MASTER (CHM) command has to be run again to set up
    replication.
    
    Analysis:
    =========
    RESET SLAVE followed by a restart of the slave has the same
    effect as the RESET SLAVE ALL command: it clears the
    recovery channel's in-memory credentials. In highly
    available systems such as GR, a RESET SLAVE done for cleanup
    followed by a server crash prevents the member from
    rejoining the group, because the recovery credentials are
    gone.
    
    Fix:
    ===
    During the RESET SLAVE command, identify each channel that
    is in the initialized state and do the cleanup for that
    channel. Preserve the channel specific connection
    credentials in the crash-safe master info repository table.
    This ensures that the credentials remain available across
    restarts and crashes.
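
A minimal sketch of the idea, assuming a much simplified model: `MasterInfoRepo`, `Channel`, `restart`, and the `position` field are hypothetical stand-ins, not the server's classes. The point is only that RESET SLAVE rewrites the persisted row with the credentials intact, while RESET SLAVE ALL drops the row.

```python
# Toy model only; every class, method and field name here is hypothetical.
class MasterInfoRepo:
    """Stands in for the crash-safe master info repository table."""

    def __init__(self):
        self.row = None


class Channel:
    def __init__(self, repo):
        self.repo = repo
        self.credentials = None
        self.position = None

    def change_master(self, user, password):
        self.credentials = {"user": user, "password": password}
        self.position = 0
        self._persist()

    def reset_slave(self):
        # Clean up replication state but keep the credentials persisted.
        self.position = None
        self._persist()

    def reset_slave_all(self):
        # Forget everything, including the persisted credentials.
        self.credentials = None
        self.position = None
        self.repo.row = None

    def _persist(self):
        self.repo.row = {
            "credentials": dict(self.credentials),
            "position": self.position,
        }


def restart(repo):
    """Simulate a server restart: rebuild the channel from the repository."""
    channel = Channel(repo)
    if repo.row is not None:
        channel.credentials = dict(repo.row["credentials"])
        channel.position = repo.row["position"]
    return channel
```

In this model a restart after `reset_slave()` still finds the credentials in the repository, whereas a restart after `reset_slave_all()` finds nothing and the channel has to be configured again.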
    
    BUG#27636289: RPL BREAKS WITH RESTART AFTER RESET SLAVE IF
    --RELAY-LOG-PURGE=0
    
    Problem:
    ========
    If the slave server is restarted after RESET MASTER has
    been run on both master and slave and RESET SLAVE has been
    run on the slave, the slave goes into an ERROR state as
    shown below.
    
    Last_IO_Error: Got fatal error 1236 from master when reading
    data from binary log: 'Slave has more GTIDs than the master
    has, using the master's SERVER_UUID. This may indicate that
    the end of the binary log was truncated or that the last
    binary log file was lost, e.g., after a power or disk
    failure when sync_binlog != 1. The master may or may not
    have rolled back transactions that were already replica'
    
    The slave fails with the above ERROR even though master and
    slave both have an empty GLOBAL.GTID_EXECUTED.
    
    Analysis:
    ========
    The relay log purge procedure called by RESET SLAVE purges
    all existing relay log files and generates the first new
    one before the received_gtid_set of the channel is cleared.
    
    So, the received_gtid_set being used to generate the
    PREVIOUS_GTIDS of the first relay log file after a RESET
    SLAVE contains old (garbage) information.
    
    When this first relay log file with wrong PREVIOUS_GTIDS is
    still present after a slave server restart, the slave will
    use its PREVIOUS_GTIDS, leading to
    ER_MASTER_FATAL_ERROR_READING_BINLOG with the following
    error message: "Slave has more GTIDs than the master has,
    using the master's SERVER_UUID."
    
    Fix:
    ===
    During RESET SLAVE, first clear the received_gtid_set and
    then purge the relay log files.
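
The ordering problem can be illustrated with a small model. It is a sketch under stated assumptions, not the server implementation: `Channel`, `purge_relay_logs`, `restart`, and the two `reset_slave_*` functions are hypothetical, and the restart step is simplified to re-seeding the received set from the PREVIOUS_GTIDS of the first relay log file.

```python
# Toy model only; names are hypothetical and the restart logic is simplified.
class Channel:
    def __init__(self, received):
        self.received_gtid_set = set(received)
        self.relay_logs = []

    def purge_relay_logs(self):
        # Drop all relay log files, then create a fresh first file whose
        # PREVIOUS_GTIDS is a snapshot of the current received set.
        self.relay_logs = [{"previous_gtids": frozenset(self.received_gtid_set)}]

    def restart(self):
        # Simplification: after a restart the received set is taken from
        # the PREVIOUS_GTIDS of the first relay log file still on disk.
        self.received_gtid_set = set(self.relay_logs[0]["previous_gtids"])


def reset_slave_old(channel):
    # Order before the fix: purge first, clear afterwards.
    channel.purge_relay_logs()
    channel.received_gtid_set.clear()


def reset_slave_fixed(channel):
    # Order after the fix: clear first, then purge.
    channel.received_gtid_set.clear()
    channel.purge_relay_logs()


def gtids_after_reset_and_restart(reset_fn):
    channel = Channel({"uuid:1", "uuid:2"})
    reset_fn(channel)
    channel.restart()
    return channel.received_gtid_set
```

In this model `gtids_after_reset_and_restart(reset_slave_old)` returns the stale GTIDs, while the fixed order returns an empty set, matching the analysis above.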