Vasil Dimov authored
The CRC32 checksum generation code interprets portions of the byte string being checksummed as an 8-byte integer so that it can process 8 bytes at a time (rather than 1 byte at a time). For this, the code uses the native byte order of the machine:

    crc ^= *(ib_uint64_t*) buf;

and then does numerical calculations with the result (e.g. crc >> N). Thus the resulting checksum depends on the byte order of the machine and differs between big and little endian machines.

This means that files written with --innodb-checksum-algorithm=crc32 or strict_crc32 on big (little) endian machines are not readable on little (big) endian machines because the checksum, though valid, is not recognized.

The simplest solution would be to start writing only, e.g., big endian checksums and to recognize only those, but this would introduce an unacceptable backwards incompatibility.

The solution implemented is to recognize both big and little endian CRC32 checksums during verification, calculating and checking the little endian one first. Calculating "the other" CRC32 checksum (i.e. recognizing big-endian-CRC32 on little endian machines or little-endian-CRC32 on big endian machines) requires swapping the byte order, which slows down the checksum calculation by about 1-2%.

When generating the checksum (when writing to disk) we now always use little endian byte order (no change on little endian machines, and an extra step of swapping the byte order on big endian machines).

Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
RB: 8781
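
For illustration only, the verification strategy described above can be sketched roughly as follows. This is not the InnoDB code: the names crc32c_sketch, checksum_is_valid, load_le64 and swap_byteorder64 are hypothetical, the CRC-32C polynomial is an assumption of this sketch, and a slow bit-at-a-time loop stands in for the real table-driven or hardware-assisted calculation. Bytes are loaded explicitly as little endian so that the sketch behaves the same on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32C polynomial; an assumption made for this sketch. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Hypothetical helper: read 8 bytes as a little endian integer,
   regardless of the byte order of the host. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Hypothetical helper: reverse the order of the 8 bytes in v. */
static uint64_t swap_byteorder64(uint64_t v)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xFFu);
        v >>= 8;
    }
    return r;
}

/* Checksum buf, consuming 8 bytes at a time. With swap == 0 each word
   is interpreted as little endian; with swap != 0 the word is byte
   swapped first, which mimics what a big endian machine would compute
   from a native-order load. Trailing bytes are processed one by one. */
uint32_t crc32c_sketch(const uint8_t *buf, size_t len, int swap)
{
    assert(buf != NULL || len == 0);
    uint64_t crc = 0xFFFFFFFFu;

    while (len >= 8) {
        uint64_t word = load_le64(buf);
        if (swap) {
            word = swap_byteorder64(word);
        }
        crc ^= word;
        for (int bit = 0; bit < 64; bit++) {
            crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_REFLECTED : 0u);
        }
        buf += 8;
        len -= 8;
    }
    while (len > 0) {
        crc ^= *buf++;
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_REFLECTED : 0u);
        }
        len--;
    }
    return (uint32_t)(crc ^ 0xFFFFFFFFu);
}

/* Accept a stored checksum if it matches either variant. The little
   endian variant is tried first, so the byte-swapped calculation is
   only paid for when the first comparison fails. */
int checksum_is_valid(const uint8_t *buf, size_t len, uint32_t stored)
{
    if (crc32c_sketch(buf, len, 0) == stored) {
        return 1;
    }
    return crc32c_sketch(buf, len, 1) == stored;
}
```

In this sketch a writer would always store crc32c_sketch(buf, len, 0), while a reader calls checksum_is_valid() and therefore still accepts data whose checksum was produced with the other byte order.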