    Bug #16693068 7.1.18 TO 7.1.22-27 UPGRADE NOT POSSIBLE IF LARGE NUMBER OF TABLES PRESENT
    Frazer Clement authored
      The Dbdih::getInstanceKey() function is used to determine which LDM instance
      is responsible for storing a table fragment, based on the table fragment's
      log_part_id, which is set at table creation time, and must be stable between
      initial restarts of a node.
          
      The Dbdih::getInstanceKey() function should return a value between 1 and
      NDBMT_MAX_BLOCK_INSTANCES - 1 inclusive, identifying which LDM worker array
      entry should be used for a given fragment.  Value 0 refers to the LDM Proxy
      instance in ndbmtd, and so is not used.
      However, getInstanceKey() has a bug where it can return values between 1 and
      NDBMT_MAX_BLOCK_INSTANCES inclusive, which exceeds its valid range.
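
      As a minimal sketch of how the off-by-one can arise, assuming the mapping
      has roughly the shape below (illustrative only, not the exact Dbdih code):

        /* Sketch: deriving an instance key from an unbounded log_part_id */
        Uint32 instanceKey = 1 + (log_part_id % NDBMT_MAX_BLOCK_INSTANCES);
        /* Yields 1..NDBMT_MAX_BLOCK_INSTANCES, one more than the valid range */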
      
      When it returns NDBMT_MAX_BLOCK_INSTANCES, this causes an out-of-bounds
      array access when mapping an instance key to a receiving thread, which
      can have various bad effects.
    
      This bug is not generally observed, as getInstanceKey() uses the log_part_id
      to determine the instance number, and recent code changes keep log_part_ids
      far below the values which would cause NDBMT_MAX_BLOCK_INSTANCES to be returned.
    
      Older code stored unbounded log part ids in the fragment definition, and
      used modulo division by the number of LDM instances at 'runtime' to get
      the correct log part.
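
      For illustration, the older scheme behaved roughly as follows (variable
      names are illustrative):

        /* Older scheme (sketch): store an unbounded id, bound it at use */
        fragment.log_part_id = next_log_part++;                       /* create time */
        Uint32 log_part = fragment.log_part_id % num_ldm_instances;   /* runtime */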
    
      More recent code stores log part ids modulo the current number of running
      LDM instances at table creation time, and continues to determine fragment
      instance keys (LDM instance numbers) using modulo division.  It is not
      entirely clear what problem that change solved, but it happens to hide the
      getInstanceKey() bug.
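
      The more recent scheme, sketched under the same illustrative names:

        /* Newer scheme (sketch): the stored id is already bounded */
        fragment.log_part_id = (next_log_part++) % num_ldm_instances; /* create time */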
    
      The number of running LDM instances is always < NDBMT_MAX_BLOCK_INSTANCES - 1,
      so normal code does not hit the bug in getInstanceKey().
    
      During an upgrade we can load table definitions from disk written by an older
      version with non-modulo log part ids.  This can expose the getInstanceKey()
      bug, which has various symptoms (hangs, crashes).
      
      To solve this, the patch performs modulo division of log_part_ids by the
      running node's number of LDM instances when loading a potentially 'old'
      (unbounded) fragment definition from disk.  This extends the guarantee that
      fragment log_part_ids will be < NDBMT_MAX_BLOCK_INSTANCES - 1 to cover the
      upgrade scenario, so that the getInstanceKey() bug is not exposed.  This fix
      is aligned with the earlier modification which stores these 'modulo log part
      ids'.
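
      Sketched with the same illustrative names, the fix applies the bound at the
      point where a fragment definition is read back from disk:

        /* Upgrade fix (sketch): bound a possibly old, unbounded id on load */
        fragment.log_part_id %= num_ldm_instances;
        /* log_part_id is now < num_ldm_instances < NDBMT_MAX_WORKER_INSTANCES */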
    
      To further clarify the new situation, getInstanceKey() is modified to no
      longer perform modulo division on the log_part_id, as it is no longer
      required.
      Additionally, getInstanceKey() is modified to require that the resulting
      instance key is in range.
      Further, existing code which sets a fragment's log_part_id is modified to
      require that the log part id is within range
      (0 to NDBMT_MAX_BLOCK_INSTANCES - 2 inclusive).
      A new define, NDBMT_MAX_WORKER_INSTANCES, is added as
      NDBMT_MAX_BLOCK_INSTANCES - 1, in which terms the range of the log_part_id
      is 0 to NDBMT_MAX_WORKER_INSTANCES - 1 inclusive.
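
      A sketch of the clarified mapping in these terms (ndbrequire is the NDB
      kernel's runtime check; the signature here is simplified for illustration):

        #define NDBMT_MAX_WORKER_INSTANCES (NDBMT_MAX_BLOCK_INSTANCES - 1)

        Uint32 getInstanceKey(Uint32 log_part_id)
        {
          /* The stored log_part_id is already bounded, so no modulo is needed */
          ndbrequire(log_part_id < NDBMT_MAX_WORKER_INSTANCES);
          const Uint32 instanceKey = 1 + log_part_id;
          ndbrequire(instanceKey <= NDBMT_MAX_WORKER_INSTANCES);
          return instanceKey;
        }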
    
      Performing %= num_ldm_instances on the log_part_id should be safe for
      non-initial restarts, as those are only allowed when the number of log
      parts is unchanged; an id that is already bounded satisfies
      x % num_ldm_instances == x, so re-applying the modulo is a no-op there.
      
      One downside of storing log parts in their 'modulo' form is that an
      increase in the number of log parts cannot remap existing fragments
      to the new log parts over an initial restart.
      That problem was introduced by the original change to store modulo
      log parts and is considered out of scope here.
    
      testUpgrade is extended to cover the scenario required here:
       - Many tables
       - Many fragments per table
       - System restart upgrade rather than node restart upgrade
          
      An execution of this upgrade test is added to the upgrade-tests
      suite for running in Autotest.