mysql-test/suite/ndb_rpl/t/ndb_rpl_batch_handling.test · 8406d36d6a0557ff69addbb973083b22b69cb225 · Rasoul Jahanshahi / Mysql Server

Oct 27, 2014

Bug #19875710 NDB : COMMIT ACK MARKERS LEAK @ LQH IN 7.4 · a8598789

Frazer Clement authored Oct 27, 2014

      
Fix TC ref counting of Commit Ack markers in LQH so as not
to leak markers at LQH.
      
TC has one TC-Commit-Ack-Marker record per transaction which
is used to track which nodes and LDM instances hold 
LQH-Commit-Ack-Marker records.
      
This is used when receiving TC_COMMIT_ACK to know which nodes
and LDM instances should be sent a REMOVE_MARKER_ORD signal.
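
As a rough illustration of the bookkeeping described above, the sketch
below models a per-transaction record that remembers which
(node, LDM instance) pairs hold an LQH marker, and produces one removal
signal per holder when the commit ack arrives. The class, field and
function names are hypothetical and are not taken from the NDB source.

```python
from dataclasses import dataclass, field


@dataclass
class TcCommitAckMarker:
    """Hypothetical stand-in for the per-transaction TC-Commit-Ack-Marker."""
    trans_id: int
    # (node_id, ldm_instance) pairs holding an LQH-Commit-Ack-Marker record
    holders: set = field(default_factory=set)

    def note_holder(self, node_id, ldm_instance):
        # A set is used so that repeated reports of one holder are harmless
        self.holders.add((node_id, ldm_instance))


def on_tc_commit_ack(marker):
    """Build one REMOVE_MARKER_ORD per holder, then forget the holders."""
    signals = [
        ("REMOVE_MARKER_ORD", node_id, ldm_instance, marker.trans_id)
        for node_id, ldm_instance in sorted(marker.holders)
    ]
    marker.holders.clear()
    return signals


marker = TcCommitAckMarker(trans_id=42)
marker.note_holder(1, 0)
marker.note_holder(2, 0)
marker.note_holder(1, 0)  # duplicate report, ignored by the set
signals = on_tc_commit_ack(marker)
```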
      
TC only needs one operation per transaction to have 
LQH-Commit-Ack-marker records (in each live node in one
nodegroup), so the approach taken is to request them 
for all write operations until one of the write 
operations succeeds (and keeps its marker at LQH).  
After this, subsequent write operations needn't allocate 
markers at LQH.
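
The request policy above can be sketched as follows. This is a
simplified model which assumes each operation's outcome is known before
the next operation is sent; with batching, several requests can be
outstanding at once. The function name is hypothetical.

```python
def marker_requests(outcomes):
    """For each write operation, say whether it requests an LQH marker.

    outcomes: list of booleans, True where the write succeeds at LQH.
    """
    have_marker = False  # becomes True once one write keeps its marker
    requested = []
    for succeeded in outcomes:
        # Keep requesting markers until some write has succeeded
        requested.append(not have_marker)
        if succeeded:
            have_marker = True
    return requested


# First write fails, second succeeds, so later writes request no marker
plan = marker_requests([False, True, True, False])
```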
      
Write operations that fail without immediately causing a
transaction abort (e.g. those defined with IgnoreError which
find no row, or find that the row already exists) are aborted
(and discarded at LQH), and so they leave no LQH-Commit-Ack
marker.
      
Where a transaction prepares write operations that all fail at
LQH, there will be no LQH-Commit-Ack markers, and so no need
for a TC-Commit-Ack marker.  This is handled using a reference
count of how many LQH-Commit-Ack markers have been requested
*or acknowledged*.  If this becomes == 0 then there's no need
for a TC-Commit-Ack marker.
      
TC uses a per-transaction state and a per-transaction reference
counter to manage this.
      
The bug is that the reference count covered only the
outstanding requests, and not the LQH-Commit-Ack markers that
had been acknowledged.  In other words, the reference count was
decremented in execLQHKEYCONF, the signal which signifies that an
LQH-Commit-Ack marker has been allocated on that LQH instance.
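
The difference between the two counting policies can be shown with a
small model. The event names and the function below are hypothetical;
"ref" stands for a failed write that leaves no marker, and treating it
as a decrement is an assumption based on the description above.

```python
def marker_ref_count(events, count_acknowledged):
    """Replay marker events and return the resulting reference count.

    count_acknowledged=False models the buggy policy (outstanding
    requests only); True models the intended policy (requested or
    acknowledged).
    """
    count = 0
    for event in events:
        if event == "request":
            count += 1
        elif event == "ref":
            count -= 1  # assumed: a failed write gives up its reference
        elif event == "conf" and not count_acknowledged:
            count -= 1  # buggy: drops the reference although a marker exists
    return count


events = ["request", "conf"]  # one write, which succeeds at LQH
buggy = marker_ref_count(events, count_acknowledged=False)
intended = marker_ref_count(events, count_acknowledged=True)
```

Under the buggy policy the count reaches zero while a marker is still
allocated at LQH, which is the inconsistency the commit describes.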
      
In certain situations this resulted in the allocated LQH-Commit-Ack
markers being leaked, and eventually this caused the cluster to become
read only, as new write operations could not allocate LQH-Commit-Ack
markers.
      
Bug seems to have been added as part of
  Bug #19451060 BUG#73339 IN MYSQL BUG SYSTEM, NDBREQUIRE INCORRECT
      
Fix is to *not* decrement the reference count in execLQHKEYCONF.

However, the current implementation 'forgets' that an operation resulted in
marker allocation (and reference count increment) after LQHKEYCONF is
processed.

To solve this, TC is modified to record which operations caused 
LQH-Commit-Ack markers to be allocated, so that during the 
per-operation phase of transaction ABORT or COMMIT, the 
reference count can be decremented and so re-checked for 
consistency.
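
A minimal sketch of this approach, assuming a per-operation flag that
remembers whether the operation took a marker reference. All class,
method and field names here are hypothetical and do not correspond to
the actual TC block code.

```python
class Operation:
    def __init__(self, name):
        self.name = name
        self.holds_marker_ref = False  # hypothetical per-operation flag


class Transaction:
    def __init__(self):
        self.marker_refs = 0  # markers requested or acknowledged
        self.ops = []

    def send_write(self, name):
        op = Operation(name)
        op.holds_marker_ref = True  # remember that this op took a reference
        self.marker_refs += 1
        self.ops.append(op)
        return op

    def on_lqhkeyconf(self, op):
        # Fixed behaviour: no decrement here, the marker now exists at LQH
        pass

    def on_lqhkeyref(self, op):
        # Assumed: a failed write leaves no marker, so its reference is dropped
        self._drop_ref(op)

    def _drop_ref(self, op):
        if op.holds_marker_ref:
            op.holds_marker_ref = False
            self.marker_refs -= 1

    def complete(self):
        """Per-operation phase of COMMIT or ABORT."""
        for op in self.ops:
            self._drop_ref(op)
        assert self.marker_refs == 0  # consistency re-check
        return self.marker_refs


txn = Transaction()
a = txn.send_write("a")
b = txn.send_write("b")
txn.on_lqhkeyref(a)   # write a fails
txn.on_lqhkeyconf(b)  # write b succeeds and keeps its marker
refs_after_conf = txn.marker_refs
refs_after_commit = txn.complete()
```

Because each operation records its own reference, a reference is
dropped at most once, whether the operation failed earlier or is
released during commit or abort.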

      
Some additional jam()s and comments are added.
      
A new ndbinfo.ndb$pools pool is added - LQH Commit Ack Markers.  
This is used in the testcase to ensure that all LQH Commit Ack 
markers are released, and may be useful for problem diagnosis
in future.
      
Replication is used in the test to get batching of write operations
and setting of the NdbApi AO_IgnoreError flag.
      
Some basic transaction abort testcases are added which showed problems
with a partial fix.
