Marc Alff authored
This changeset implements the various performance improvements made to date.

Major change:

The instrumentation of table io and table lock has been refactored, to avoid performing excessive CPU computation when a table is reused in the table open cache. The statistics (1) by calling thread and (2) by object are now collected in a more disjoint way, which avoids having to aggregate stats per table share in (2) every time the thread using the table changes in (1). See the updated doxygen documentation for table_locker.

Minor changes:

Performance schema memory structures and critical global variables are now aligned on 128-byte boundaries, to avoid false sharing between CPU cache lines, and to avoid using more cache lines than necessary.

Every LF_HASH table is sized to the maximum when the hash is created, to avoid having to resize the LF_HASH dynamically when new elements are inserted.

The size and load count of every LF_HASH table are now displayed in the output of SHOW ENGINE PERFORMANCE_SCHEMA STATUS, to help diagnostics.

The macro PSI_CALL has been replaced by PSI_XXX_CALL for each type of instrumentation. This change is a preliminary cleanup, to make implementing further optimizations possible.

Event name indexes for hard coded events (table io, table lock, idle) are now compile time constants, instead of being dynamically allocated. Using constants makes the code simpler and helps the compiler perform better optimizations.
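
For illustration only, the 128-byte alignment mentioned above can be sketched as follows. This is a minimal sketch and not code from the changeset: the names kAlignment, AlignedCounter and elements_are_separated are hypothetical, and the use of C++11 alignas is an assumption about how such alignment could be expressed.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical constant: the 128-byte boundary named in the commit message.
constexpr std::size_t kAlignment = 128;

// Hypothetical counter type, not an actual performance schema structure.
// alignas pads the type so that each instance starts on its own
// 128-byte boundary and no two instances share that span of memory.
struct alignas(kAlignment) AlignedCounter {
  std::atomic<std::uint64_t> value{0};
};

// Compile-time checks: the type has the requested alignment, and its size
// is a multiple of it, so array elements stay on 128-byte boundaries.
static_assert(alignof(AlignedCounter) == kAlignment,
              "AlignedCounter must be aligned on 128 bytes");
static_assert(sizeof(AlignedCounter) % kAlignment == 0,
              "AlignedCounter size must be a multiple of 128 bytes");

// Returns true when two adjacent array elements start on 128-byte
// boundaries and are at least 128 bytes apart.
bool elements_are_separated() {
  static AlignedCounter counters[2];
  const auto a = reinterpret_cast<std::uintptr_t>(&counters[0]);
  const auto b = reinterpret_cast<std::uintptr_t>(&counters[1]);
  return (a % kAlignment == 0) && (b - a >= kAlignment);
}
```

With this layout, two threads that each update a different array element write to different cache lines, which is the false sharing the alignment is meant to avoid.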