    d4ba1018 · Sujatha Sivakumar authored
    Bug#17453826: ASSERTION ERROR WHEN SETTING FUTURE BINLOG
    FILE/POS WITH SEMISYNC
    
    Problem:
    ========
    While DMLs are in progress on the master, stopping a slave
    and pointing it at a future binlog file name/position (one
    ahead of what the master has written) causes an assert on
    the master.
    
    Analysis:
    ========
    Trx1 is waiting for an ack at a certain position. Meanwhile,
    the slave issues a change master command pointing to a
    future log file name and position. When the master's dump
    thread receives the request, it assumes that it has received
    acks up to the future log file and position, clears the
    active transaction list, and sets reply_file_name_ and
    reply_file_pos_ to the future file name and position. Since
    the reply file name and position are now ahead of Trx1's
    current position, Trx1 completes successfully.

    A new DML, Trx2, is then executed. During commit, Trx2
    should wait for an ack at some position, but the reply file
    name and position already point to the future file name and
    position, so Trx2 concludes that it has already been acked
    and proceeds without waiting. Because no ack was actually
    received, Trx2's node is never cleared from the active
    transaction list. On exit from the commit operation, an
    assert that expects the active transaction node to have been
    cleared from the list fails.
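
The sequence in the analysis can be replayed on a small model. This is only a sketch under stated assumptions: `SemiSyncMasterModel`, `BinlogPos` and `replay_leaves_stale_node` are hypothetical names, not the plugin's classes, and binlog file names are compared as plain strings, which is enough for equal-length names such as `master-bin.000001`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>

// Hypothetical, simplified stand-in for a binlog coordinate.
struct BinlogPos {
    std::string file;
    unsigned long long pos;
};

inline bool operator<(const BinlogPos& a, const BinlogPos& b) {
    return std::tie(a.file, a.pos) < std::tie(b.file, b.pos);
}

// Hypothetical model of the state described in the analysis:
// a reply position plus a list of active transaction nodes.
class SemiSyncMasterModel {
public:
    // A committing transaction registers the position it needs acked.
    void begin_wait(const BinlogPos& p) { active_.insert(p); }

    // The master believes the slave has everything up to p.
    // Before the fix, a requested start position was treated this way too.
    void report_reply(const BinlogPos& p) {
        if (reply_ < p) reply_ = p;
        // Clear every active node at or before the reported position.
        active_.erase(active_.begin(), active_.upper_bound(p));
    }

    // True when a commit at p would skip the wait.
    bool already_acked(const BinlogPos& p) const { return !(reply_ < p); }

    bool has_active_node(const BinlogPos& p) const {
        return active_.count(p) != 0;
    }

private:
    BinlogPos reply_{"", 0};
    std::set<BinlogPos> active_;
};

// Replays Trx1, the future-position request, then Trx2.
// Returns true when Trx2 skips its wait yet leaves its node behind,
// which is the condition the commit-exit assert trips on.
bool replay_leaves_stale_node() {
    SemiSyncMasterModel m;
    const BinlogPos trx1{"master-bin.000001", 400};
    const BinlogPos future{"master-bin.000009", 4};
    const BinlogPos trx2{"master-bin.000001", 800};

    m.begin_wait(trx1);
    m.report_reply(future);  // future position mistaken for an ack
    m.begin_wait(trx2);

    const bool skipped_wait = m.already_acked(trx2);
    return skipped_wait && m.has_active_node(trx2);
}
```

In this model the stale node exists only because the reply position was advanced by something that was not a real ack.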
    
    Fix:
    ===
    Ideally, when a request for change master is received, the
    requested file name and position should be validated, and
    the transmit_start hook should fire only after that
    validation succeeds. Hence the transmit_start hook call is
    moved to after the validation of the file name and position.
    With the fix, when a future log file name and position is
    specified, the ER_MASTER_FATAL_ERROR_READING_BINLOG error is
    returned to the slave.
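
The reordering can be illustrated with a minimal sketch: validate first, fire the hook second. Everything here is an assumption made for illustration: `start_dump`, `is_valid_start`, `BinlogCoord` and `DumpResult` are hypothetical names, and the validity check is deliberately simplified (it compares against the end of what the master has written and does not check that the file exists).

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical binlog coordinate.
struct BinlogCoord {
    std::string file;
    unsigned long long pos;
};

// FatalErrorReadingBinlog stands in for
// ER_MASTER_FATAL_ERROR_READING_BINLOG being returned to the slave.
enum class DumpResult { Started, FatalErrorReadingBinlog };

// Simplified validity check: the requested coordinate must not lie
// beyond what the master has written so far. File names are compared
// as strings, which assumes equal-length names.
bool is_valid_start(const BinlogCoord& requested,
                    const BinlogCoord& master_end) {
    if (requested.file != master_end.file)
        return requested.file < master_end.file;
    return requested.pos <= master_end.pos;
}

// Validation happens before the hook, so a rejected request never
// reaches the semisync code.
DumpResult start_dump(const BinlogCoord& requested,
                      const BinlogCoord& master_end,
                      const std::function<void()>& transmit_start_hook) {
    if (!is_valid_start(requested, master_end))
        return DumpResult::FatalErrorReadingBinlog;
    transmit_start_hook();
    return DumpResult::Started;
}
```

With this ordering a future coordinate is rejected before any semisync state, such as the reply position, can be touched.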
    
    Note: While fixing this bug, one more bug was identified.

    Rpl_semi_sync_master_clients can become negative under
    certain conditions, causing UNINSTALL PLUGIN
    rpl_semi_sync_master to fail.
    
    'Rpl_semi_sync_master_clients' is incremented when the dump
    thread initiates transmission and a new slave connection is
    established. It is decremented when the dump thread exits
    and the slave is disconnected. If the IO thread stops due to
    an error and the dump thread exits before the
    "transmit_start" hook has run, the dump thread still calls
    "transmit_stop", which drives Rpl_semi_sync_master_clients
    to -1, an invalid state. Hence, as part of the fix, a new
    flag is introduced so that "transmit_stop" is called only
    when "transmit_start" has been called.
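
The guard flag can be sketched as below. This is an illustrative model, not the server code: `DumpSession`, `SemiSyncCounters` and `start_observed_` are hypothetical names, and `clients` merely stands in for Rpl_semi_sync_master_clients.

```cpp
#include <cassert>

// Stand-in for the Rpl_semi_sync_master_clients status counter.
struct SemiSyncCounters {
    int clients = 0;
};

// Hypothetical model of one dump thread's lifetime.
class DumpSession {
public:
    explicit DumpSession(SemiSyncCounters& counters) : counters_(counters) {}

    // Models the "transmit_start" hook: count the client and remember
    // that the hook ran.
    void transmit_start() {
        ++counters_.clients;
        start_observed_ = true;
    }

    // Models dump thread exit: "transmit_stop" runs only if
    // "transmit_start" ran earlier, so the counter cannot go negative.
    void finish() {
        if (start_observed_) {
            --counters_.clients;
            start_observed_ = false;
        }
        assert(counters_.clients >= 0);
    }

private:
    SemiSyncCounters& counters_;
    bool start_observed_ = false;
};
```

A session that exits early, before `transmit_start()`, leaves the counter untouched, while a normal session increments it once and decrements it once.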