    5c6171e2
    WL#6671 "Improve scalability by not using thr_lock.c locks for InnoDB tables". · 5c6171e2
    Dmitry Lenev authored
    Avoid the scalability bottleneck associated with THR_LOCK::mutex locks for
    InnoDB tables by not using thr_lock.c locks for them. The patch tries to make
    minimal changes to the SE API and to locking code in the SQL layer. Further
    improvements in these areas enabled by this change will be done as separate
    WLs.
    
    Before this patch, InnoDB downgraded strong TL_READ_NO_INSERT/TL_WRITE
    thr_lock.c locks to weaker ones compatible with each other in most cases.
    So it really relied on thr_lock.c locking only in a few scenarios:
    
    1) To isolate HANDLER READ statements from LOCK TABLES WRITE statements.
    2) To isolate LOCK TABLES statements that lock tables for write implicitly
       or for read, either explicitly or implicitly, from concurrent DML
       statements.
    3) Due to a coding mistake, a thr_lock.c lock was necessary to isolate ALTER
       TABLE IMPORT/DISCARD TABLESPACE under LOCK TABLES from concurrent I_S
       queries/open HANDLERs.
    4) To indicate that InnoDB tables don't support LOCK TABLES READ LOCAL, by
       upgrading the TL_READ lock requested by the statement to TL_READ_NO_INSERT.
    
    After addressing these scenarios it became possible to completely abandon
    thr_lock.c locking for InnoDB tables. To do this, the patch:
    
    1)   Changes code for HANDLER READ statements to upgrade the S metadata lock
         to an SR metadata lock for the duration of the read. This allows us to
         properly isolate HANDLER READ from LOCK TABLES WRITE and makes metadata
         locking for these statements consistent with locking for other DML.
    2.a) Introduces a new type of metadata lock - MDL_SHARED_READ_ONLY (SRO).
         This lock is similar to the SR lock, except that it is not compatible
         with SW locks. It is used as a replacement for TL_READ_NO_INSERT
         thr_lock.c locks for tables locked by LOCK TABLES for read (either
         explicitly or implicitly).
         To preserve backward compatibility, the SRO lock was assigned lower
         priority than SW locks (acquired by DML that modifies data). This means
         that a stream of DML can starve a LOCK TABLES READ statement. To provide
         a way out of such a situation, the MDL subsystem was changed to respect
         the max_write_lock_count limit for SW locks as well. Also, a new
         MDL_SHARED_WRITE_LOW_PRIO type of lock was introduced. It has lower
         priority than SRO locks and is used by DML with the LOW_PRIORITY clause
         (i.e. for the same DML which had lower priority than LOCK TABLES READ
         before the patch).
    2.b) Changes code for LOCK TABLES to acquire an SNRW lock on tables
         implicitly locked for write, to compensate for the removal of the
         TL_WRITE lock.
    2.c) After 2.a) and 2.b) were implemented it became impossible to predict
         in which order SNRW and SRO locks will be acquired, so we can no longer
         rely on "strong" locks always being acquired in the same order to avoid
         deadlocks for DDL.
         To solve this issue and keep behavior compatible, we had to change the
         function which chooses the deadlock resolution victim to prefer
         waits for "strong" locks from LOCK TABLES over locks from other DDL
         (the MDL subsystem was extended for this) and to ensure that the attempt
         to acquire locks for LOCK TABLES is restarted when we get an
         ER_LOCK_DEADLOCK error.
    3)   Changes ALTER TABLE IMPORT/DISCARD TABLESPACE code to acquire an X
         lock on the table being imported/discarded even under LOCK TABLES.
    4)   Introduces a new storage engine flag, HA_NO_READ_LOCAL_LOCK, to mark
         storage engines which don't support LOCK TABLES READ LOCAL but don't
         want to use thr_lock.c locks to indicate this. LOCK TABLES READ LOCAL
         automatically acquires SRO locks for them.
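
    The compatibility rules described in 2.a) and 2.b) can be illustrated
    with a small model. This is only a sketch, not code from the patch: the
    MdlType enum and the compatible() function are hypothetical names, and
    the model covers just SR, SW, SRO and SNRW, assuming that SNRW conflicts
    with all of them.

```cpp
// Simplified, hypothetical model of four metadata lock types named above.
// It is not the server's real compatibility matrix.
enum class MdlType { SR, SW, SRO, SNRW };

// True if a lock of type `requested` can be granted while another
// connection holds a lock of type `held` on the same table.
constexpr bool compatible(MdlType requested, MdlType held) {
  return
      // Assumption: SNRW conflicts with every type in this model.
      !(requested == MdlType::SNRW || held == MdlType::SNRW) &&
      // SRO behaves like SR except that it conflicts with SW.
      !((requested == MdlType::SRO && held == MdlType::SW) ||
        (requested == MdlType::SW && held == MdlType::SRO));
}

// Ordinary reads and writes do not block each other at the MDL level.
static_assert(compatible(MdlType::SR, MdlType::SW), "SR and SW coexist");
// LOCK TABLES READ (SRO) allows readers but not writers.
static_assert(compatible(MdlType::SRO, MdlType::SR), "SRO allows SR");
static_assert(!compatible(MdlType::SRO, MdlType::SW), "SRO blocks SW");
// Tables locked for write by LOCK TABLES (SNRW) block reads and writes.
static_assert(!compatible(MdlType::SNRW, MdlType::SR), "SNRW blocks SR");
```

    In this model a writer holding SW makes a LOCK TABLES READ request wait,
    and a held SRO lock makes writers wait.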
    
    After the above steps, InnoDB code was changed to indicate that InnoDB
    no longer needs thr_lock.c locks to be acquired. This was done by changing
    ha_innobase::lock_count() to return 0 and by ensuring that
    ha_innobase::store_lock() doesn't try to store the type of thr_lock.c lock
    in the MYSQL_LOCK::locks[] array it gets as a parameter.
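
    Why both changes are needed together can be sketched as follows. The
    classes below are simplified stand-ins, not the server's handler API:
    Handler, NoThrLockHandler, ThrLockData, ThrLockType and collect_locks()
    are hypothetical names, and the signatures are reduced for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for thr_lock.c lock types and lock data.
enum class ThrLockType { Read, ReadNoInsert, Write };
struct ThrLockData { ThrLockType type = ThrLockType::Read; };

// Engine that still uses thr_lock.c locks: one slot per table.
class Handler {
 public:
  virtual ~Handler() = default;
  // Number of slots this table needs in the locks array.
  virtual unsigned lock_count() const { return 1; }
  // Stores the lock into the array and returns the next free position.
  virtual ThrLockData **store_lock(ThrLockData **to, ThrLockType type) {
    lock_.type = type;
    *to++ = &lock_;
    return to;
  }

 private:
  ThrLockData lock_;
};

// Engine that opts out: it reserves no slots and stores nothing.
class NoThrLockHandler : public Handler {
 public:
  unsigned lock_count() const override { return 0; }
  ThrLockData **store_lock(ThrLockData **to, ThrLockType) override {
    return to;  // leave the array untouched
  }
};

// Sizes the array from lock_count(), then lets each table fill its slots.
std::vector<ThrLockData *> collect_locks(const std::vector<Handler *> &tables,
                                         ThrLockType type) {
  std::size_t slots = 0;
  for (const Handler *h : tables) slots += h->lock_count();
  std::vector<ThrLockData *> locks(slots, nullptr);
  ThrLockData **pos = locks.data();
  for (Handler *h : tables) pos = h->store_lock(pos, type);
  return locks;
}
```

    Because the array is sized from lock_count(), an engine that returned 0
    but still wrote in store_lock() would write past the end of the array in
    this model.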
    
    
    It is worth emphasizing the following non-obvious behavior changes
    caused by this patch:
    
    *) LOCK TABLES READ blocks and is blocked by concurrent transactions
       changing the table, for all storage engines, similarly to how
       LOCK TABLES WRITE works now.
    *) Tables which are implicitly used by LOCK TABLES (e.g. through a view
       or trigger) are locked using metadata locks in addition to (all SEs
       except InnoDB) or instead of (InnoDB) THR_LOCK locks. This means that
       the previous item also applies to such tables.
    *) Multi-update now blocks and is blocked by concurrent LOCK TABLES
       READ on any table from its join, even if that table is only read
       and not updated.
    
    Many test cases which relied on the old behavior had to be adjusted.
    In particular:
    
    *) HANDLER-related tests had to be adjusted to take into account that
       HANDLER READ will wait for and acquire an SR lock.
    *) Some tests using thr_lock.c locks had to be adjusted to use statements
       other than LOCK TABLES READ.
    *) Some tests which waited for LOCK TABLES READ to block DML on thr_lock.c
       locks now wait for blocking on MDL.
    *) P_S test coverage for aggregates now either uses MyISAM tables or was
       updated to take into account that InnoDB doesn't acquire thr_lock.c locks.
    *) Coverage for the new behavior and the new types of MDL locks was added,
       as well as unit tests for the latter.