mysql-test/extra/rpl_tests/rpl_gtids_restart_slave_io_lost_trx.test · mysql-cluster-7.6.13 · Rasoul Jahanshahi / Mysql Server

Aug 11, 2014

BUG#17326020 ASSERTION ON SLAVE AFTER STOP/START SLAVE USING MTS+GTID REPLICATION · 3f6ed37f

Joao Gramacho authored Aug 11, 2014

Problem:
=======

When the IO thread reconnects to a master using GTID auto positioning
while in the middle of a transaction, it leaves the partial
transaction in the relay log and retrieves the same transaction again
in full.

If the slave is configured to use MTS, it will hit the assertion "!worker"
once it reaches the ROTATE_LOG_EVENT sent by the master upon IO thread
reconnection.

Analysis:
========

Once a slave with MTS reaches the beginning of a group (marked by a
GTID_LOG_EVENT or QUERY(BEGIN)), it expects to feed the same
worker with events until it reaches the end of the transaction. No
event that must be applied synchronously (by the MTS coordinator,
with all workers waiting for jobs) can be applied while MTS is feeding
a worker with events. The assertion exists to stop server execution
in such situations.
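
The invariant described above can be illustrated with a small toy model.
This is only a sketch under stated assumptions: ToyCoordinator, EventType
and dispatch() are hypothetical names invented for illustration and are
not the server's actual classes. The real server asserts where this
sketch returns false.

```cpp
#include <cassert>

// Simplified stand-ins for the event kinds named in the commit message.
enum class EventType {
  GtidLogEvent, QueryBegin, RowChange, QueryCommit, QueryRollback,
  RotateLogEvent
};

// Hypothetical toy coordinator (not the server's real class). It only
// tracks which worker, if any, is currently being fed with a group.
struct ToyCoordinator {
  int worker_count;
  int next_worker = 0;
  int current_worker = -1;  // -1 means "no worker is being fed"

  explicit ToyCoordinator(int n) : worker_count(n) {}

  bool in_group() const { return current_worker >= 0; }

  // Returns true if the event can be handled in the current state.
  // Returns false where the real server would hit the "!worker" style
  // assertion or otherwise see an unexpected event.
  bool dispatch(EventType ev) {
    switch (ev) {
      case EventType::GtidLogEvent:
      case EventType::QueryBegin:
        // Beginning of a group: pick a worker once and keep feeding it.
        if (!in_group()) {
          current_worker = next_worker;
          next_worker = (next_worker + 1) % worker_count;
        }
        return true;
      case EventType::RowChange:
        // Group content must go to the worker already chosen.
        return in_group();
      case EventType::QueryCommit:
      case EventType::QueryRollback:
        // End of a group: the worker is released.
        if (!in_group()) return false;
        current_worker = -1;
        return true;
      case EventType::RotateLogEvent:
        // Synchronous event: only legal when no worker is being fed.
        return !in_group();
    }
    return false;
  }
};
```

Feeding this model a GTID_LOG_EVENT, a QUERY(BEGIN), a row change and
then a ROTATE_LOG_EVENT makes the last dispatch() call return false,
which corresponds to the situation reported in this bug.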

Once the slave knows that a transaction was left incomplete and will
not finish (because it will be applied again from the beginning later),
the correct behavior is to abort this transaction.

The STS SQL thread knows how to roll back the incomplete transaction, but
MTS does not know how to do it yet.

Fix:
===

The SQL slave will now check whether it is about to apply a synchronous
ROTATE_LOG_EVENT sent by the master during GTID auto positioning after
IO thread reconnection.

Before applying this ROTATE_LOG_EVENT, the SQL slave will also check whether
it is in the middle of a group. If it is, it will queue a QUERY(ROLLBACK)
event to the current worker so that the worker finishes its work
gracefully and, only then, will let the MTS coordinator apply the
ROTATE_LOG_EVENT in synchronous mode.
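
The injection step can be sketched as a pure function over a stream of
events. This is a minimal illustration, not the code added to
sql/rpl_slave.cc: the Ev enum and schedule_with_rollback_injection()
are hypothetical names, and the sketch assumes every Rotate in the
input is the one sent by the master after reconnection.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-ins for the event kinds named in the commit message.
enum class Ev { Gtid, Begin, Row, Commit, Rollback, Rotate };

// Hypothetical helper: returns the events in the order the coordinator
// would hand them out, with a Rollback injected before any Rotate that
// arrives while a group is still open.
// Assumption: every Rotate here comes from the master after an IO
// thread reconnection; the real check is more specific than this.
std::vector<Ev> schedule_with_rollback_injection(
    const std::vector<Ev>& relay_log) {
  std::vector<Ev> out;
  bool in_group = false;  // true while a worker is being fed a group

  for (Ev ev : relay_log) {
    if (ev == Ev::Rotate && in_group) {
      // Let the current worker finish gracefully before the
      // synchronous event is applied.
      out.push_back(Ev::Rollback);
      in_group = false;
    }
    switch (ev) {
      case Ev::Gtid:
      case Ev::Begin:
        in_group = true;   // beginning of a group
        break;
      case Ev::Commit:
      case Ev::Rollback:
        in_group = false;  // end of a group
        break;
      case Ev::Row:
      case Ev::Rotate:
        break;             // no change to the group state
    }
    out.push_back(ev);
  }
  return out;
}
```

For the input Gtid, Begin, Row, Rotate followed by a complete copy of
the same transaction, the function places a Rollback immediately before
the Rotate. A stream whose groups are all complete is returned
unchanged.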

@ sql/rpl_slave.cc

Added code to exec_relay_log_event() to inject a QUERY(ROLLBACK)
event when necessary, so that the current worker gracefully finishes its
job before an event that needs synchronous MTS execution is applied.
