    Fix Bug#20783098 INNODB_CHECKSUM_ALGORITHM=CRC32 IS NOT BYTE ORDER AGNOSTIC · a5a22244
    Vasil Dimov authored
    
    
    The CRC32 checksum generation code interprets portions of the byte
    string to checksum as an 8-byte integer so that it can process 8 bytes
    at a time (rather than 1 byte at a time). For this, the code uses the
    native byte order of the machine:
    
      crc ^= *(ib_uint64_t*) buf;
    
    and then does numerical calculations with the result (e.g. crc >> N).
    Thus the resulting checksum depends on the byte order of the machine
    and is different on big and little endian machines. This means that
    files written to with --innodb-checksum-algorithm=crc32/strict_crc32 on
    big (little) endian machines are not readable on little (big) endian
    machines because the checksum, though valid, is not recognized.
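
To make the byte order dependence concrete, here is a minimal sketch (not the InnoDB code) that reads the same 8 bytes three ways. The helper names `load_native64`, `load_le64` and `load_be64` are hypothetical and exist only for this illustration; `memcpy` stands in for the pointer cast so the example avoids unaligned access.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read 8 bytes in the host's native byte order. This is what the
   cast-and-dereference in the original code effectively does. */
uint64_t load_native64(const unsigned char *buf)
{
    uint64_t v;
    assert(buf != NULL);
    memcpy(&v, buf, sizeof v);
    return v;
}

/* Read 8 bytes as a little endian integer on any host:
   buf[0] becomes the least significant byte. */
uint64_t load_le64(const unsigned char *buf)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | buf[i];
    return v;
}

/* Read 8 bytes as a big endian integer on any host:
   buf[0] becomes the most significant byte. */
uint64_t load_be64(const unsigned char *buf)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | buf[i];
    return v;
}
```

For the bytes 01 02 03 04 05 06 07 08, `load_le64` yields 0x0807060504030201 and `load_be64` yields 0x0102030405060708. `load_native64` matches one or the other depending on the machine, so any arithmetic done on its result (such as `crc >> N`) diverges between the two kinds of host.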
    
    The simplest solution would be to start writing only e.g. big endian
    checksums and recognize only such ones, but this would introduce an
    unacceptable backwards incompatibility.
    
    The solution implemented is to recognize both big and little endian
    CRC32 checksums during verification, while first calculating and
    checking the little endian one.
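
The verification order described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the function names, the `word_order` enum, the bit-at-a-time loop and the choice of the CRC-32C (Castagnoli) polynomial are all assumptions made for the example, since the commit message does not spell them out. Trailing bytes that do not fill an 8-byte word are processed one at a time in both modes, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: reflected CRC-32C polynomial, used here for illustration. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

enum word_order { WORD_LE, WORD_BE };

/* Assemble 8 bytes into an integer in the requested byte order,
   independently of the host's native order. */
uint64_t load64(const unsigned char *p, enum word_order order)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        int idx = (order == WORD_BE) ? i : 7 - i;
        v = (v << 8) | p[idx];
    }
    return v;
}

/* Advance the CRC register by nbits single-bit steps. */
uint64_t crc_shift(uint64_t crc, int nbits)
{
    while (nbits-- > 0)
        crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & (0 - (crc & 1)));
    return crc;
}

/* Checksum that consumes 8 bytes per step, interpreting each
   8-byte word in the given byte order. */
uint32_t crc32_sketch(const unsigned char *buf, size_t len,
                      enum word_order order)
{
    uint64_t crc = 0xFFFFFFFFu;
    assert(buf != NULL || len == 0);
    while (len >= 8) {
        crc = crc_shift(crc ^ load64(buf, order), 64);
        buf += 8;
        len -= 8;
    }
    while (len-- > 0)                 /* tail, one byte at a time */
        crc = crc_shift(crc ^ *buf++, 8);
    return (uint32_t)(crc ^ 0xFFFFFFFFu);
}

/* Accept a stored checksum if it matches either variant,
   trying the little endian one first. */
int checksum_is_valid(const unsigned char *buf, size_t len, uint32_t stored)
{
    if (stored == crc32_sketch(buf, len, WORD_LE))
        return 1;
    return stored == crc32_sketch(buf, len, WORD_BE);
}
```

With `WORD_LE` the sketch agrees with a plain byte-at-a-time CRC-32C, so the string "123456789" gives the standard check value 0xE3069283. Checking the little endian variant first means the second calculation only runs for pages whose stored checksum does not match it.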
    
    Calculating "the other" CRC32 checksum (e.g. recognizing a
    big-endian CRC32 on a little endian machine, or a little-endian
    CRC32 on a big endian machine) requires swapping the byte order,
    which slows down that checksum calculation by about 1-2%.
    
    When generating the checksum (when writing to disk) we now always use
    little endian byteorder (no change on little endian machines, and an
    extra step of swapping the byteorder on big-endian machines).
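
The extra step on the write path can be pictured with the following sketch. The helpers `swap64`, `host_is_little_endian` and `word_for_write` are hypothetical names invented for this example; real code would more likely rely on a compiler intrinsic or a compile-time endianness macro than on a run-time probe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the order of the 8 bytes in v. */
uint64_t swap64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xFF);
        v >>= 8;
    }
    return r;
}

/* Run-time probe: returns 1 when the least significant byte of an
   integer is stored first in memory. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Word fed into the checksum when writing: a native read, followed
   by a byte swap only on big endian hosts, so that every host ends
   up with the little endian interpretation of the bytes. */
uint64_t word_for_write(const unsigned char *buf)
{
    uint64_t v;
    assert(buf != NULL);
    memcpy(&v, buf, sizeof v);
    return host_is_little_endian() ? v : swap64(v);
}
```

On a little endian host the swap is skipped, which corresponds to the "no change" case; on a big endian host the swap is the added cost mentioned above.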
    
    Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
    RB: 8781