    0cd3070c
    BUG#16666407 BINLOG WRITE ERRORS SILENTLY IGNORED · 0cd3070c
    Venkatesh Duggirala authored
    BUG#20938915 2PC SUCCEEDS EVEN THOUGH BINLOG FLUSH/SYNC FAILS
    
    Problem & Analysis:
    ===================
    Server has 3 stages in Binlog Group Commit.
    
    1) FLUSH_STAGE:
       a) Leader flushes the caches of all threads (leader + followers) to the binlog cache
       b) Leader flushes binlog cache to binary log file
    
    2) SYNC STAGE:
       Leader calls fsync on binary log file
    
    3) COMMIT STAGE:
        Leader calls commit handlers for each thread involved in the group.
    
    Failures can happen at any stage (1a, 1b, 2, 3), and the server was not
    handling failures in stages 1a, 1b and 2 properly.
    
    If the error occurs in stage 3, the server returns an error (ER_ERROR_DURING_COMMIT)
    to the client. The ongoing transactions are binlogged successfully but
    might not have been committed in the storage engine, depending on the level
    at which the error occurred.
    
    Fix: The server uses the binlog_error_action variable to decide what to do
    if an error occurs during the flush/sync stages (1a, 1b or 2).
    
    If binlog_error_action == ABORT_SERVER, the server aborts after
    informing the client with an 'ER_BINLOGGING_IMPOSSIBLE' error. All ongoing
    transactions remain in the prepare phase. In case of stage
    1b or stage 2, a transaction *might* have already reached the binary log before
    the server is aborted; in that case, upon restart, the server commits
    the prepared transactions found in the binary log. Prepared transactions
    that are not in the binary log are rolled back.
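
The restart rule above can be illustrated with a small sketch. It is not the server's recovery code: the function name `recover_prepared` and the use of plain integer transaction ids are assumptions made only for this example.

```python
def recover_prepared(prepared_xids, binlog_xids):
    """Toy model of the restart decision (hypothetical helper, not server code).

    A prepared transaction is committed if its id reached the binary log
    before the abort, and rolled back otherwise.
    """
    in_binlog = set(binlog_xids)
    return {
        xid: ("commit" if xid in in_binlog else "rollback")
        for xid in prepared_xids
    }


# Three transactions were left in the prepare phase; only 101 and 103
# reached the binary log before the server aborted.
decisions = recover_prepared([101, 102, 103], [101, 103])
```

In this model, 102 is rolled back because the binary log has no record of it, which keeps the storage engine and the binary log consistent with each other.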
    
    If binlog_error_action == IGNORE_ERROR, the server ignores the error
    and disables binary logging until it is restarted.
    This is noted in the error log, and the ongoing transaction
    is committed without any error (but is not binlogged, because of
    the error).
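
As a rough illustration of the two settings, the sketch below models how a flush/sync failure could be dispatched on binlog_error_action. Only the variable's values and the ER_BINLOGGING_IMPOSSIBLE error name come from this message; `ToyServer`, `ServerAborted`, `on_flush_or_sync_error` and the log strings are hypothetical.

```python
from enum import Enum


class BinlogErrorAction(Enum):
    ABORT_SERVER = "ABORT_SERVER"
    IGNORE_ERROR = "IGNORE_ERROR"


class ServerAborted(Exception):
    """Stands in for the server process aborting (assumption for this sketch)."""


class ToyServer:
    """Hypothetical model of the server state relevant to this fix."""

    def __init__(self, action):
        self.action = action
        self.binlog_open = True
        self.error_log = []

    def on_flush_or_sync_error(self, stage):
        """Apply binlog_error_action to a failure in stage 1a, 1b or 2."""
        if self.action is BinlogErrorAction.ABORT_SERVER:
            # The client is told binlogging is impossible, then the server aborts.
            self.error_log.append(f"binlog error in stage {stage}: aborting")
            raise ServerAborted("ER_BINLOGGING_IMPOSSIBLE")
        # IGNORE_ERROR: binlogging stays disabled until the next restart.
        self.binlog_open = False
        self.error_log.append(f"binlog error in stage {stage}: binlogging disabled")
        return "committed, not binlogged"
```

With IGNORE_ERROR the caller sees a normal commit and the only trace is the error log entry; with ABORT_SERVER the exception models the server going down while the transaction is still in the prepare phase.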
    
    This patch also handles the case where, after the leader enters the flush stage,
    the binlog is already closed because of a previous flush/sync error.
    In that situation, the leader skips the flush and sync
    stages, moves directly to the commit stage and commits the transaction;
    no error is thrown (just as when the binary log is
    disabled).
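
The closed-binlog path can be pictured as the leader choosing which stages to run. This is a simplified model with a hypothetical function name, not the actual group commit code.

```python
def stages_for_leader(binlog_open):
    """Return the stages a group commit leader runs, in order (toy model)."""
    stages = []
    if binlog_open:
        # Normal path: write the group to the binary log, then fsync it.
        stages += ["flush", "sync"]
    # The commit stage runs either way, so a closed binlog raises no error.
    stages.append("commit")
    return stages


normal_path = stages_for_leader(binlog_open=True)
closed_path = stages_for_leader(binlog_open=False)
```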