Dmitry Lenev authored
Avoid scalability bottleneck associated with THR_LOCK::mutex locks for
InnoDB tables by not using thr_lock.c locks for them.

The patch tries to make minimal changes to the SE API and to locking code
in the SQL layer. Further improvements in these areas enabled by this
change will be done as separate WLs.

Before this patch InnoDB downgraded strong TL_READ_NO_INSERT/TL_WRITE
thr_lock.c locks to weaker ones compatible with each other in most cases.
So it really relied on thr_lock.c locking only in a few scenarios:

1) To isolate HANDLER READ statements from LOCK TABLES WRITE statements.
2) To isolate LOCK TABLES statements that lock tables for write
   implicitly, or for read either explicitly or implicitly, from
   concurrent DML statements.
3) Due to a coding mistake, a thr_lock.c lock was necessary to isolate
   ALTER TABLE IMPORT/DISCARD TABLESPACE under LOCK TABLES from
   concurrent I_S queries/open HANDLERs.
4) To indicate that InnoDB tables don't support LOCK TABLES READ LOCAL,
   by upgrading the TL_READ lock requested by the statement to
   TL_READ_NO_INSERT.

After addressing these scenarios it became possible to completely abandon
thr_lock.c locking for InnoDB tables. To do this, this patch:

1) Changes code for HANDLER READ statements to upgrade the S metadata
   lock to an SR metadata lock for the duration of the read. This allows
   us to properly isolate HANDLER READ from LOCK TABLES WRITE and makes
   metadata locking for these statements consistent with locking for
   other DML.

2.a) Introduces a new type of metadata lock, MDL_SHARED_READ_ONLY (SRO).
     This lock is similar to the SR lock except that it is not
     compatible with SW locks. It is used as a replacement for
     TL_READ_NO_INSERT thr_lock.c locks on tables locked by LOCK TABLES
     for read (either explicitly or implicitly). To preserve backward
     compatibility the SRO lock was assigned lower priority than SW
     locks (acquired by DML that modifies data). This means that a
     stream of DML can lead to starvation of a LOCK TABLES READ
     statement.
     To provide a way out of such a situation, the MDL subsystem was
     changed to respect the max_write_lock_count limit for SW locks as
     well. Also a new lock type, MDL_SHARED_WRITE_LOW_PRIO, was
     introduced. It has lower priority than SRO locks and is used by DML
     with the LOW_PRIORITY clause (i.e. by the same DML which had lower
     priority than LOCK TABLES READ before the patch).

2.b) Changes code for LOCK TABLES to acquire an SNRW lock on tables
     implicitly locked for write, to compensate for the removal of the
     TL_WRITE lock.

2.c) After 2.a) and 2.b) were implemented it became impossible to predict
     the order in which SNRW and SRO locks will be acquired, so we can no
     longer rely on "strong" locks always being acquired in the same
     order to avoid deadlocks for DDL. To solve this issue and keep
     behavior compatible we had to change the function which chooses the
     deadlock resolution victim to prefer waits for "strong" locks from
     LOCK TABLES over locks from other DDL (the MDL subsystem was
     extended for this), and to ensure that the attempt to acquire locks
     for LOCK TABLES is restarted when we get an ER_LOCK_DEADLOCK error.

3) Changes ALTER TABLE IMPORT/DISCARD TABLESPACE code to acquire an X
   lock on the table being imported/discarded even under LOCK TABLES.

4) Introduces a new storage engine flag, HA_NO_READ_LOCAL_LOCK, to mark
   storage engines which don't support LOCK TABLES READ LOCAL but don't
   want to use thr_lock.c locks to indicate this. LOCK TABLES READ LOCAL
   automatically acquires SRO locks for them.

After the above steps InnoDB code was changed to indicate that InnoDB no
longer needs thr_lock.c locks to be acquired. This was done by changing
ha_innobase::lock_count() to return 0 and by ensuring that
ha_innobase::store_lock() doesn't try to store the type of thr_lock.c lock
in the MYSQL_LOCK::locks[] array it gets as a parameter.
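
The priority rule from 2.a) (SW normally wins over SRO, but
max_write_lock_count bounds how long SRO can be starved) can be
illustrated with a toy model. This is only a sketch under stated
assumptions: WaitQueues, Pick and next_grant() are hypothetical names
invented for illustration, waiters are served one at a time, and none of
this is the MDL subsystem's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only: which kind of waiter on a single table is served next.
enum class Pick { None, SharedWrite, SharedReadOnly };

struct WaitQueues {
    std::size_t pending_sw = 0;      // DML statements waiting for SW
    std::size_t pending_sro = 0;     // LOCK TABLES READ waiting for SRO
    std::size_t consecutive_sw = 0;  // SW grants since the last SRO grant
};

// SW has priority over SRO. Once max_write_lock_count SW requests have
// been served in a row while an SRO request is waiting, SRO goes next.
inline Pick next_grant(WaitQueues &q, std::size_t max_write_lock_count) {
    const bool sro_starving =
        q.pending_sro > 0 && q.consecutive_sw >= max_write_lock_count;
    if (q.pending_sw > 0 && !sro_starving) {
        --q.pending_sw;
        ++q.consecutive_sw;
        return Pick::SharedWrite;
    }
    if (q.pending_sro > 0) {
        --q.pending_sro;
        q.consecutive_sw = 0;  // the run of SW grants has been interrupted
        return Pick::SharedReadOnly;
    }
    return Pick::None;
}
```

With three pending SW requests, one pending SRO request and a limit of 2,
this model serves SW, SW, SRO, SW in that order.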

It is worth emphasizing the following non-obvious behavior changes caused
by this patch:

*) LOCK TABLES READ blocks and is blocked by concurrent transactions
   changing the table, for all storage engines, similarly to how LOCK
   TABLES WRITE works now.
*) Tables which are implicitly used by LOCK TABLES (e.g. through a view
   or trigger) are locked using metadata locks in addition to (all SEs
   except InnoDB) or instead of (InnoDB) THR_LOCK locks. This means that
   the previous item also applies to such tables.
*) Multi-update now blocks and is blocked by concurrent LOCK TABLES READ
   on any table from its join, even if that table is only read and is
   not updated.

Many test cases which relied on the old behavior had to be adjusted. In
particular:

*) HANDLER-related tests had to be adjusted to take into account that
   HANDLER READ will wait for and acquire an SR lock.
*) Some tests using thr_lock.c locks had to be adjusted to use statements
   other than LOCK TABLES READ.
*) Some tests which waited until LOCK TABLES READ blocked DML on
   thr_lock.c locks now wait for blocking on MDL.
*) P_S test coverage for aggregates now either uses MyISAM tables or was
   updated to take into account that InnoDB doesn't acquire thr_lock.c
   locks.
*) Coverage for the new behavior and the new types of MDL locks was
   added, as well as unit tests for the latter.
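
The behavior changes listed above follow from lock compatibility. The
sketch below models only the lock types named in this message (S, SR,
SW, SRO, SNRW, X). It is a simplified illustration: the Mdl enum and the
blockers()/compatible() helpers are hypothetical names, and the table is
a reduced subset written for this example, not the matrix in the server
source.

```cpp
#include <cassert>

// Hypothetical subset of metadata lock types mentioned in this message.
enum class Mdl : unsigned { S, SR, SW, SRO, SNRW, X };

constexpr unsigned bit(Mdl t) { return 1u << static_cast<unsigned>(t); }

// For a requested lock type, the set of already granted types that block it.
constexpr unsigned blockers(Mdl requested) {
    switch (requested) {
        case Mdl::S:    return bit(Mdl::X);
        case Mdl::SR:   return bit(Mdl::SNRW) | bit(Mdl::X);
        case Mdl::SW:   return bit(Mdl::SRO) | bit(Mdl::SNRW) | bit(Mdl::X);
        // SRO is like SR, except that it also conflicts with SW.
        case Mdl::SRO:  return bit(Mdl::SW) | bit(Mdl::SNRW) | bit(Mdl::X);
        case Mdl::SNRW: return bit(Mdl::SR) | bit(Mdl::SW) | bit(Mdl::SRO) |
                               bit(Mdl::SNRW) | bit(Mdl::X);
        case Mdl::X:    return ~0u;
    }
    return ~0u;  // not reached for valid enumerators
}

constexpr bool compatible(Mdl requested, Mdl granted) {
    return (blockers(requested) & bit(granted)) == 0;
}

inline void check_examples() {
    // Ordinary reads (SR) coexist with DML (SW) ...
    assert(compatible(Mdl::SR, Mdl::SW));
    // ... but LOCK TABLES READ (SRO) and DML (SW) block each other.
    assert(!compatible(Mdl::SRO, Mdl::SW));
    assert(!compatible(Mdl::SW, Mdl::SRO));
    // S does not conflict with SNRW, SR does: hence the S -> SR upgrade
    // for HANDLER READ to isolate it from LOCK TABLES WRITE.
    assert(compatible(Mdl::S, Mdl::SNRW));
    assert(!compatible(Mdl::SR, Mdl::SNRW));
}
```

In this model a multi-update that requests SW on every table in its join
is blocked as soon as any one of those tables holds an SRO lock, which
matches the third behavior change above.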