Frazer Clement authored
The Dbdih::getInstanceKey() function is used to determine which LDM instance is responsible for storing a table fragment, based on the fragment's log_part_id, which is set at table creation time and must be stable between initial restarts of a node. Dbdih::getInstanceKey() should return a value between 1 and NDBMT_MAX_BLOCK_INSTANCES - 1 inclusive, identifying which LDM worker array entry should be used for a given fragment. Value 0 refers to the LDM Proxy instance in ndbmtd, and so is not used.

However, getInstanceKey() has a bug where it can return numbers between 1 and NDBMT_MAX_BLOCK_INSTANCES inclusive, exceeding its valid range. When it returns NDBMT_MAX_BLOCK_INSTANCES, this causes an out-of-bounds array access when mapping an instance key to a receiving thread, which can have various bad effects.

This bug is not generally observed, as getInstanceKey() uses the log_part_id to determine the instance number, and recent code changes keep log_part_ids far below the values which can return NDBMT_MAX_BLOCK_INSTANCES. Older code stored unbounded log part ids in the fragment definition, and used modulo division by the number of LDM instances at 'runtime' to get the correct log part. More recent code stores log part ids modulo the current number of running LDM instances at table creation time, and continues to determine fragment instance keys (LDM instance numbers) based on modulo division. It is not entirely clear what problem that change solved, but it happens to hide the getInstanceKey() bug: the number of running LDM instances is always < NDBMT_MAX_BLOCK_INSTANCES - 1, so normal code does not hit the bug in getInstanceKey().

During an upgrade we can load table definitions from disk written by an older version with non-modulo log part numbers. This can expose the getInstanceKey() bug, which has various symptoms (hangs, crashes).

To solve this, this patch performs modulo division of log_part_ids by the running node's number of LDM instances when loading a potentially 'old' (unbounded) fragment definition from disk. This extends the guarantee that fragment log_part_ids will be < NDBMT_MAX_BLOCK_INSTANCES - 1 to cover the upgrade scenario, so that the getInstanceKey() bug is not exposed. This fix is aligned with the previous modification which stores these 'modulo log part ids'.

To further clarify the new situation, getInstanceKey() is modified to no longer perform modulo division on the log_part_id, as it is no longer required, and to require that the resulting instance key is in range. Further, existing code which sets a fragment's log_part_id is modified to require that the log part id is within range (0 to NDBMT_MAX_BLOCK_INSTANCES - 2 inclusive). A new define, NDBMT_MAX_WORKER_INSTANCES, is added as NDBMT_MAX_BLOCK_INSTANCES - 1, in which terms the valid range of the log_part_id is 0 to NDBMT_MAX_WORKER_INSTANCES - 1 inclusive.

Performing %= num_ldm_instances on the log_part_id should be safe for non-initial restarts, as those are only allowed when the number of log parts is unchanged. One downside of storing log parts in their 'modulo' form is that an increase in the number of log parts cannot remap existing fragments to the new log parts over an initial restart. That problem was introduced by the original change to store modulo log parts and is considered out of scope here.
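
As a rough illustration only (not the actual patch: the real getInstanceKey() lives in Dbdih and takes table/fragment identifiers, the helper name boundLogPartId is invented here, and the NDBMT_MAX_BLOCK_INSTANCES value shown is illustrative), the intended invariants look roughly like this:

    #include <cassert>
    #include <cstdint>
    #include <cstdio>

    /* Illustrative value; the real define is in the NDB headers. */
    #define NDBMT_MAX_BLOCK_INSTANCES 33
    #define NDBMT_MAX_WORKER_INSTANCES (NDBMT_MAX_BLOCK_INSTANCES - 1)

    typedef uint32_t Uint32;

    /* Applied when loading a potentially 'old' fragment definition from
       disk: bound an unbounded log part id the same way table creation
       now does.  Assumes num_ldm_instances <= NDBMT_MAX_WORKER_INSTANCES. */
    static Uint32 boundLogPartId(Uint32 stored_log_part_id,
                                 Uint32 num_ldm_instances)
    {
      const Uint32 bounded = stored_log_part_id % num_ldm_instances;
      assert(bounded < NDBMT_MAX_WORKER_INSTANCES);
      return bounded;
    }

    /* getInstanceKey() no longer performs modulo division itself; it
       requires the stored log_part_id to already be in range, and checks
       that the resulting instance key is in range too. */
    static Uint32 getInstanceKey(Uint32 log_part_id)
    {
      assert(log_part_id <= NDBMT_MAX_WORKER_INSTANCES - 1);
      const Uint32 instanceKey = 1 + log_part_id;  /* key 0 = LDM Proxy */
      assert(instanceKey >= 1 && instanceKey <= NDBMT_MAX_WORKER_INSTANCES);
      return instanceKey;
    }

    int main()
    {
      /* e.g. an old on-disk fragment with unbounded log_part_id 1000,
         loaded on a node running 4 LDM instances */
      const Uint32 log_part = boundLogPartId(1000, 4);        /* -> 0 */
      printf("instance key: %u\n", getInstanceKey(log_part)); /* -> 1 */
      return 0;
    }

With the old behaviour, the unbounded value 1000 would have reached getInstanceKey() directly, and its internal modulo could yield an instance key of NDBMT_MAX_BLOCK_INSTANCES, one past the valid range.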
testUpgrade is extended to cover the scenario required here:
  - Many tables
  - Many fragments/table
  - System restart upgrade rather than node restart upgrade

An execution of this upgrade test is added to the upgrade-tests suite for running in Autotest.