    BUG#25585436 ASSERT `!RLI_DESCRIPTION_EVENT || IS_PARALLEL_EXEC()' AT RPL_RLI.CC:2393
    cf2aa612 · Joao Gramacho authored
    
    Problem and Analysis
    --------------------
    
    The issue happens when the MTS coordinator is trying to determine which
    worker should handle a new transaction being scheduled that depends,
    because of the MTS logical clock, on a transaction still in progress
    in a worker.
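    The dependency check described above can be sketched as follows. This
    is a minimal illustration, not the server code: the struct, the
    function name and the "low water mark" parameter are hypothetical, and
    it assumes each transaction carries a last_committed and a
    sequence_number timestamp, and that a transaction may be scheduled
    once every transaction up to its last_committed has been applied.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of a transaction's logical-clock timestamps.
struct TxnTimestamps {
  std::int64_t last_committed;   // newest transaction this one depends on
  std::int64_t sequence_number;  // this transaction's own position
};

// low_water_mark (assumed name): highest sequence number such that every
// transaction up to and including it has already been applied by a worker.
// Returns true when the new transaction no longer has to wait.
bool can_schedule_now(const TxnTimestamps &txn, std::int64_t low_water_mark) {
  // A transaction can only depend on transactions that precede it.
  assert(txn.last_committed < txn.sequence_number);
  return txn.last_committed <= low_water_mark;
}
```

    When this check is false, the coordinator has to wait for the
    transaction in progress, which is the situation this bug is about.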
    
    If the worker fails to apply the transaction, instead of being
    notified that there was an error and that the logical clock value the
    coordinator was waiting for will never be reached, the coordinator
    assigns the transaction to itself before seeing that an error happened.
    
    By the time the MTS coordinator becomes aware of the error, it has
    already scheduled (and applied) some events of the new transaction,
    which interferes with some of the cleanup logic.
    
    Requesting the MTS coordinator to stop (STOP SLAVE) while it is
    applying a transaction that should be handled by workers makes debug
    binaries hit the assert.
    
    Fix
    ---
    
    Make schedule_next_event return an error when a dependent transaction
    is aware of the failure of a transaction it was waiting on.
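
    The idea of the fix can be sketched as below. This is only an
    illustration under stated assumptions, not the actual patch: the
    SharedState struct, the ScheduleResult enum and the function
    schedule_next_event_sketch are hypothetical names, and the real
    schedule_next_event has a different signature and waiting mechanism.
    The sketch shows the wait on the logical clock ending with an error as
    soon as a worker failure is visible, instead of falling through to
    scheduling.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical state shared between the coordinator and its workers.
struct SharedState {
  std::atomic<std::int64_t> low_water_mark{0};  // all txns up to here applied
  std::atomic<bool> worker_failed{false};       // set by a worker on error
};

enum class ScheduleResult { kAssignToWorker, kError };

// Waits until the dependency (last_committed) has been applied. If a worker
// reports a failure while the coordinator is waiting, the dependency will
// never be satisfied, so an error is returned rather than scheduling.
ScheduleResult schedule_next_event_sketch(const SharedState &state,
                                          std::int64_t last_committed) {
  assert(last_committed >= 0);
  while (state.low_water_mark.load() < last_committed) {
    if (state.worker_failed.load()) {
      return ScheduleResult::kError;  // stop waiting, propagate the failure
    }
    std::this_thread::yield();  // simple polling; real code would block
  }
  return ScheduleResult::kAssignToWorker;
}
```

    With this shape, the caller sees the error before any event of the
    dependent transaction is scheduled or applied by the coordinator.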