    Bug#24806259 PERFORMANCE DEGRADATION WHEN THE SERVER HAS MANY GTIDS · 85afc80f
    Venkatesh Duggirala authored
    Problem: If a server has replicated data from many different servers
    (this can happen over time if the slave's master changes from time to
    time, or if the replication chain is long enough that transactions
    travel through many different masters before reaching this node), then
    the FLUSH BINARY LOGS command takes a long time. Since FLUSH BINARY
    LOGS takes a lock on the binary log, it blocks all other operations on
    the server, which degrades server performance for the duration of the
    command.
    Also > restarting such a server takes many minutes.
         > setting GTID_PURGED with a big GTID list takes a long time.
    
    Analysis:
    FLUSH BINARY LOGS calls Gtid_state::save_gtids_of_last_binlog_into_table,
    which prepares logged_gtids_last_binlog to compute the GTIDs that need
    to be stored in the gtid_executed table. While preparing
    logged_gtids_last_binlog, the server calls
    Gtid_set::add_gtid_set()
       => Sid_map::add_sid()
             => Sid_map::add_node()
    whenever it finds a new UUID to be added to the set.
    When the server calls add_node, it has to update many internal
    structures to maintain the consistency of the Gtid_state. Some of these
    structures are declared as dynamic arrays (Preallocated_array); a
    simplified sketch of how they grow follows the list below.
    Eg:
         > Gtid_set::m_intervals     # Array where the element at index N
                                     # holds the head of the interval list
                                     # for SIDNO N+1.

         > Sid_map::sidno_to_hash    # Array that maps SIDNO to SID;
                                     # the element at index N points to
                                     # a Node with SIDNO N+1.

         > Mutex_cond_array::m_array # Array of mutex/condition pairs,
                                     # one per SIDNO.
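
    For illustration only (Toy_sid_map, Interval_list and Mutex_cond below
    are invented stand-ins, not the server's classes), a minimal C++ sketch
    of the growth pattern described above: every previously unseen UUID is
    assigned the next SIDNO, and each sidno-indexed container has to grow
    by one slot in lockstep.

        #include <map>
        #include <string>
        #include <vector>

        // Invented stand-ins for the structures listed above; the real
        // server keeps these as Preallocated_array members of Gtid_set,
        // Sid_map and Mutex_cond_array.
        struct Interval_list {};  // head of the interval chain for one SIDNO
        struct Mutex_cond {};     // mutex/condition pair for one SIDNO

        class Toy_sid_map {
         public:
          // Return the SIDNO for 'uuid', assigning the next free SIDNO if
          // the UUID has not been seen before.
          int add_sid(const std::string &uuid) {
            auto it = sid_to_sidno.find(uuid);
            if (it != sid_to_sidno.end()) return it->second;
            return add_node(uuid);
          }

         private:
          // Every new UUID forces all sidno-indexed containers to grow by
          // one slot; this is the growth pattern that became expensive.
          int add_node(const std::string &uuid) {
            const int sidno = static_cast<int>(sidno_to_sid.size()) + 1;
            sid_to_sidno[uuid] = sidno;
            sidno_to_sid.push_back(uuid);          // SIDNO -> SID
            intervals.push_back(Interval_list{});  // SIDNO -> interval chain
            locks.push_back(Mutex_cond{});         // SIDNO -> lock
            return sidno;
          }

          std::map<std::string, int> sid_to_sidno;
          std::vector<std::string> sidno_to_sid;
          std::vector<Interval_list> intervals;
          std::vector<Mutex_cond> locks;
        };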
    
    As the name suggests, the length of these arrays needs to be adjusted
    dynamically whenever there is no space left in the array. In the
    current code, add_node calls Preallocated_array::reserve(sidno), which
    resizes the array to exactly 'sidno' elements. The sidno mapped to a
    new UUID is equal to the number of elements in the array + 1, so each
    new UUID grows these arrays by just one element.
    These dynamic arrays are resized in the following way:
                  > allocate new memory large enough for the new array size,
                  > copy the elements one by one from the old array to the
                    new location.
    The complexity of this algorithm is O(n^2), where n is the number of
    new UUIDs that need to be added to these structures.
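
    To make the cost concrete, here is a small stand-alone sketch (not
    server code) that counts element copies under the two growth
    strategies: growing by exactly one slot per new element, as the
    per-sidno reserve does, versus doubling the capacity when full.
    Growing by one copies 0 + 1 + ... + (n-1) = n(n-1)/2 elements in
    total, while doubling copies fewer than 2n.

        #include <cstdio>

        // Count how many existing elements are copied while appending n
        // new elements, assuming every reallocation copies the current
        // contents.

        // Strategy used before the fix: reserve(size + 1) per new element.
        static long long copies_grow_by_one(int n) {
          long long copies = 0;
          for (int size = 0; size < n; ++size)
            copies += size;  // reallocate and copy the 'size' stored elements
          return copies;
        }

        // Strategy push_back uses internally: double the capacity when full.
        static long long copies_grow_by_doubling(int n) {
          long long copies = 0;
          int capacity = 1;
          for (int size = 0; size < n; ++size) {
            if (size == capacity) {  // out of space: double and copy
              copies += size;
              capacity *= 2;
            }
          }
          return copies;
        }

        int main() {
          const int n = 10000;  // e.g. 10,000 distinct UUIDs
          std::printf("grow by one: %lld copies\n",
                      copies_grow_by_one(n));       // 49,995,000
          std::printf("doubling:    %lld copies\n",
                      copies_grow_by_doubling(n));  // 16,383
          return 0;
        }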
    
    A similar problem occurs when the server is restarted or when
    GTID_PURGED is set with a big GTID list. During these two operations
    the server prepares the gtid_executed set by reading the GTIDs from the
    gtid_executed table (or from the GTID text, in the case of the SET
    GTID_PURGED command), and it calls add_node for every new UUID.
    If the number of unique UUIDs is in the thousands, the restart
    operation can take a long time.
    
    Fix: Preallocated_array::push_back already contains logic that grows
    these dynamic arrays by doubling the current capacity whenever it runs
    out of space. Hence the server should not call "reserve" explicitly and
    should let push_back handle the case of no space in the array.
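
    As a rough sketch of the shape of the fix (Toy_prealloced_array is an
    invented stand-in, not the real class): once push_back doubles the
    capacity only when the array is actually full, each append is amortized
    O(1), and the explicit per-sidno reserve call can simply be dropped.

        #include <cstddef>
        #include <cstdlib>
        #include <cstring>
        #include <type_traits>

        // Invented stand-in for the server's preallocated dynamic array,
        // restricted to trivially copyable element types.
        template <typename T>
        class Toy_prealloced_array {
          static_assert(std::is_trivially_copyable<T>::value,
                        "toy array only handles trivially copyable types");

         public:
          Toy_prealloced_array() = default;
          Toy_prealloced_array(const Toy_prealloced_array &) = delete;
          Toy_prealloced_array &operator=(const Toy_prealloced_array &) = delete;
          ~Toy_prealloced_array() { std::free(m_buf); }

          // Grow to at least 'n' slots.  Calling this with size() + 1 for
          // every new element is what made the old code quadratic.
          void reserve(std::size_t n) {
            if (n <= m_capacity) return;
            T *new_buf = static_cast<T *>(std::malloc(n * sizeof(T)));
            if (m_size != 0) std::memcpy(new_buf, m_buf, m_size * sizeof(T));
            std::free(m_buf);
            m_buf = new_buf;
            m_capacity = n;
          }

          // push_back doubles the capacity only when the array is full, so
          // the caller never needs an explicit reserve -- the behaviour the
          // fix relies on.
          void push_back(const T &value) {
            if (m_size == m_capacity)
              reserve(m_capacity == 0 ? 8 : m_capacity * 2);
            m_buf[m_size++] = value;
          }

          std::size_t size() const { return m_size; }

         private:
          T *m_buf = nullptr;
          std::size_t m_size = 0;
          std::size_t m_capacity = 0;
        };

        // Before the fix (one reallocation per new UUID):
        //     arr.reserve(sidno);   // sidno == arr.size() + 1, grows by one
        //     arr.push_back(node);
        // After the fix (amortized O(1) per new UUID):
        //     arr.push_back(node);  // doubling happens inside push_back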