-
Dmitry Lenev authored
All 8 tables which were used in a 8-table Sysbench run (named 'test.sbtest1' .. 'test.sbtest8') fell into the same MDL hash partition (with metadata_locks_hash_instances set to default value - 8). This made the lock protecting this partition a bottleneck in some of the Sysbench tests. The MDL hash partition for the table is selected by taking modulo number-of-partitions from the hashed value of the key, which includes database and table name (e.g. '\002test\000sbtest1\000'). The same hash value is later also used with the hash container representing the individual partition. It turned out that the my_hash_sort_bin hash function, which was used for this purpose, is fairly likely to produce hash values for keys looking like "key1" .. "keyN" which don't differ in the lower bits. This means that objects with such keys are fairly likely to fall into the same MDL hash partition. To solve this problem this patch changes the hash function used for both MDL hash partition selection and in the hash container representing the individual partition to MurmurHash3. MurmurHash3 is a modern non-cryptographic hash function with much better statistical properties than my_hash_sort_bin. Particularly, due to good avalanche effect, keys looking like "key1" .. "key8" are unlikely to have hashes with the same lower bits, so it is unlikely for tables which are named in a similar fashion to fall into the same MDL hash partition (e.g. in the specific case mentioned in the original bug report only 3 keys out of 8 fall in the same partition). Also MurmurHash3 is generally faster than my_hash_sort_bin. For example, on my machine, the 32-bit version of MurmurHash3 used in this patch is around 3 times faster than my_hash_sort_bin for short, 16-byte keys and up to 5 times faster for really long keys. To be able to use MurmurHash3 with a HASH container the implementation of the container was extended to support usage of custom hash function instead of one prescribed by the character set.
Dmitry Lenev authoredAll 8 tables which were used in a 8-table Sysbench run (named 'test.sbtest1' .. 'test.sbtest8') fell into the same MDL hash partition (with metadata_locks_hash_instances set to default value - 8). This made the lock protecting this partition a bottleneck in some of the Sysbench tests. The MDL hash partition for the table is selected by taking modulo number-of-partitions from the hashed value of the key, which includes database and table name (e.g. '\002test\000sbtest1\000'). The same hash value is later also used with the hash container representing the individual partition. It turned out that the my_hash_sort_bin hash function, which was used for this purpose, is fairly likely to produce hash values for keys looking like "key1" .. "keyN" which don't differ in the lower bits. This means that objects with such keys are fairly likely to fall into the same MDL hash partition. To solve this problem this patch changes the hash function used for both MDL hash partition selection and in the hash container representing the individual partition to MurmurHash3. MurmurHash3 is a modern non-cryptographic hash function with much better statistical properties than my_hash_sort_bin. Particularly, due to good avalanche effect, keys looking like "key1" .. "key8" are unlikely to have hashes with the same lower bits, so it is unlikely for tables which are named in a similar fashion to fall into the same MDL hash partition (e.g. in the specific case mentioned in the original bug report only 3 keys out of 8 fall in the same partition). Also MurmurHash3 is generally faster than my_hash_sort_bin. For example, on my machine, the 32-bit version of MurmurHash3 used in this patch is around 3 times faster than my_hash_sort_bin for short, 16-byte keys and up to 5 times faster for really long keys. To be able to use MurmurHash3 with a HASH container the implementation of the container was extended to support usage of custom hash function instead of one prescribed by the character set.
Loading