Skip to content
  • Dmitry Lenev's avatar
    dfbe3ed6
    Bug #16396598 "MDL HASH CAN STILL BE CONCURRENCY BOTTLENECK". · dfbe3ed6
    Dmitry Lenev authored
    All 8 tables which were used in a 8-table Sysbench run (named
    'test.sbtest1' .. 'test.sbtest8') fell into the same MDL hash
    partition (with metadata_locks_hash_instances set to default
    value - 8). This made the lock protecting this partition a
    bottleneck in some of the Sysbench tests.
    
    The MDL hash partition for the table is selected by taking modulo
    number-of-partitions from the hashed value of the key, which
    includes database and table name (e.g. '\002test\000sbtest1\000').
    The same hash value is later also used with the hash container
    representing the individual partition.
    It turned out that the my_hash_sort_bin hash function, which
    was used for this purpose, is fairly likely to produce hash
    values for keys looking like "key1" .. "keyN" which don't
    differ in the lower bits. This means that objects with such keys
    are fairly likely to fall into the same MDL hash partition.
    
    To solve this problem this patch changes the hash function used
    for both MDL hash partition selection and in the hash container
    representing the individual partition to MurmurHash3.
    
    MurmurHash3 is a modern non-cryptographic hash function with
    much better statistical properties than my_hash_sort_bin.
    Particularly, due to good avalanche effect, keys looking like
    "key1" .. "key8" are unlikely to have hashes with the same
    lower bits, so it is unlikely for tables which are named in
    a similar fashion to fall into the same MDL hash partition
    (e.g. in the specific case mentioned in the original bug report
    only 3 keys out of 8 fall in the same partition).
    
    Also MurmurHash3 is generally faster than my_hash_sort_bin.
    For example, on my machine, the 32-bit version of MurmurHash3
    used in this patch is around 3 times faster than my_hash_sort_bin
    for short, 16-byte keys and up to 5 times faster for really
    long keys.
    
    To be able to use MurmurHash3 with a HASH container the
    implementation of the container was extended to support usage
    of custom hash function instead of one prescribed by the character
    set.
    dfbe3ed6
    Bug #16396598 "MDL HASH CAN STILL BE CONCURRENCY BOTTLENECK".
    Dmitry Lenev authored
    All 8 tables which were used in a 8-table Sysbench run (named
    'test.sbtest1' .. 'test.sbtest8') fell into the same MDL hash
    partition (with metadata_locks_hash_instances set to default
    value - 8). This made the lock protecting this partition a
    bottleneck in some of the Sysbench tests.
    
    The MDL hash partition for the table is selected by taking modulo
    number-of-partitions from the hashed value of the key, which
    includes database and table name (e.g. '\002test\000sbtest1\000').
    The same hash value is later also used with the hash container
    representing the individual partition.
    It turned out that the my_hash_sort_bin hash function, which
    was used for this purpose, is fairly likely to produce hash
    values for keys looking like "key1" .. "keyN" which don't
    differ in the lower bits. This means that objects with such keys
    are fairly likely to fall into the same MDL hash partition.
    
    To solve this problem this patch changes the hash function used
    for both MDL hash partition selection and in the hash container
    representing the individual partition to MurmurHash3.
    
    MurmurHash3 is a modern non-cryptographic hash function with
    much better statistical properties than my_hash_sort_bin.
    Particularly, due to good avalanche effect, keys looking like
    "key1" .. "key8" are unlikely to have hashes with the same
    lower bits, so it is unlikely for tables which are named in
    a similar fashion to fall into the same MDL hash partition
    (e.g. in the specific case mentioned in the original bug report
    only 3 keys out of 8 fall in the same partition).
    
    Also MurmurHash3 is generally faster than my_hash_sort_bin.
    For example, on my machine, the 32-bit version of MurmurHash3
    used in this patch is around 3 times faster than my_hash_sort_bin
    for short, 16-byte keys and up to 5 times faster for really
    long keys.
    
    To be able to use MurmurHash3 with a HASH container the
    implementation of the container was extended to support usage
    of custom hash function instead of one prescribed by the character
    set.
Loading