mysql-test/suite/sys_vars/r/max_length_for_sort_data_basic.result · mysql-8.0.3 · Rasoul Jahanshahi / Mysql Server

Oct 27, 2016

Bug #24823885: PERFORMANCE REGRESSION WHEN CHANGING CHARACTER SET TO UTF8MB4 · e0315768

Steinar H. Gunderson authored Oct 27, 2016

Increase max_length_for_sort_data from 1024 to 4096.

This parameter controls the threshold for when we stop sorting full rows,
and instead just sort the sort key plus a row ID, and then go back to the
table to pick up those rows afterwards.
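
As a rough illustration of the threshold described above, the decision can be
modelled as a single comparison. This is a simplified sketch, not the server's
actual code: the function name, the strategy labels, and the idea of passing in
one pre-computed row size estimate are all assumptions made for the example.

```python
def choose_sort_strategy(estimated_row_bytes: int,
                         max_length_for_sort_data: int = 4096) -> str:
    """Pick a sort strategy from an estimated row size (illustrative model).

    The names here are hypothetical; only the threshold idea comes from the
    commit message.
    """
    if estimated_row_bytes > max_length_for_sort_data:
        # Row is considered too wide: sort only (sort key, row ID) pairs and
        # fetch the full rows from the table afterwards.
        return "sort_key_plus_row_id"
    # Row fits under the threshold: carry the full row through the sort.
    return "full_rows"


# A row estimated at 2000 bytes under the old and the new default.
print(choose_sort_strategy(2000, max_length_for_sort_data=1024))
print(choose_sort_strategy(2000, max_length_for_sort_data=4096))
```

Under this model, the same estimated row switches from the row-ID path to the
full-row path once the threshold moves from 1024 to 4096.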

Since this parameter was introduced and got its current value in 2003,
several important things have happened:

 - Computers have gotten more RAM, so that sort buffers can be larger.
 - We have switched from MyISAM to InnoDB, where picking out a row by ID
   is much more expensive.
 - Unicode collations, for which we grossly overestimate row size in the
   typical case (we assume the string is completely filled with maximum-length
   UTF-8 characters, whereas a typical string fills only about 10–20% of the
   declared length, and with ASCII), have become commonplace.
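
To make the overestimate in the last point concrete, here is a small worked
calculation. The 200-character column is a hypothetical example, and the helper
names are invented for illustration; the per-character maximums (3 bytes for
utf8mb3, 4 bytes for utf8mb4) are the character sets' documented limits.

```python
# Maximum bytes per character for a few MySQL character sets.
MAX_BYTES_PER_CHAR = {"latin1": 1, "utf8mb3": 3, "utf8mb4": 4}


def worst_case_bytes(char_length: int, charset: str) -> int:
    """Size assumed when every character takes the maximum width."""
    return char_length * MAX_BYTES_PER_CHAR[charset]


def typical_ascii_bytes(char_length: int, fill_percent: int) -> int:
    """Size of an ASCII string (1 byte per character) at a given fill level."""
    return char_length * fill_percent // 100


# Hypothetical 200-character string column.
print(worst_case_bytes(200, "utf8mb3"))   # 600
print(worst_case_bytes(200, "utf8mb4"))   # 800
print(typical_ascii_bytes(200, 10))       # 20
print(typical_ascii_bytes(200, 20))       # 40
```

In this example the worst-case estimate for utf8mb4 is 20 to 40 times the
typical payload, so a fixed byte threshold trips far earlier than the real data
would justify.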

Thus, increase this value; the actual value chosen is a bit arbitrary and would
benefit from actual benchmarks across a wide variety of real loads, but it's
obviously a step in the right direction.

sysbench result goes from 7080 -> 9078 tps (+28.2%).

Change-Id: I031ded33e5a18ca903b4549b5692563137672408
