Skip to content
  • Olav Sandstaa's avatar
    5c11ddbd
    Fix for Bug#11757108 CHANGE IN EXECUTION PLAN FOR COUNT_DISTINCT_GROUP_ON_KEY · 5c11ddbd
    Olav Sandstaa authored
                         CAUSES PEFORMANCE REGRESSION
    
    The cause for the performance regression is that the access strategy for the
    GROUP BY query is changed form using "index scan" in mysql-5.1 to use "loose
    index scan" in mysql-5.5. The index used for group by is unique and thus each
    "loose scan" group will only contain one record. Since loose scan needs to
    re-position on each "loose scan" group this query will do a re-position for
    each index entry. Compared to just reading the next index entry as a normal
    index scan does, the use of loose scan for this query becomes more expensive.
    
    The cause for selecting to use loose scan for this query is that in the current
    code when the size of the "loose scan" group is one, the formula for
    calculting the cost estimates becomes almost identical to the cost of using
    normal index scan. Differences in use of integer versus floating point aritmetic
    can cause one or the other access strategy to be selected.
    
    The main issue with the formula for estimating the cost of using loose scan is
    that it does not take into account that it is more costly to do a re-position
    for each "loose scan" group compared to just reading the next index entry.
    Both index scan and loose scan estimates the cpu cost as:
    
      "number of entries needed too read/scan" * ROW_EVALUATE_COST
    
    The results from testing with the query in this bug indicates that the real
    cost for doing re-position four to eight times higher than just reading the
    next index entry. Thus, the cpu cost estimate for loose scan should be increased.
    To account for the extra work to re-position in the index we increase the
    cost for loose index scan to include the cost of navigating the index. 
    This is modelled as a function of the height of the b-tree:
    
      navigation cost= ceil(log(records in table)/log(indexes per block))
                     * ROWID_COMPARE_COST;
    
    This will avoid loose index scan being used for indexes where the "loose scan" 
    group contains very few index entries.
    5c11ddbd
    Fix for Bug#11757108 CHANGE IN EXECUTION PLAN FOR COUNT_DISTINCT_GROUP_ON_KEY
    Olav Sandstaa authored
                         CAUSES PEFORMANCE REGRESSION
    
    The cause for the performance regression is that the access strategy for the
    GROUP BY query is changed form using "index scan" in mysql-5.1 to use "loose
    index scan" in mysql-5.5. The index used for group by is unique and thus each
    "loose scan" group will only contain one record. Since loose scan needs to
    re-position on each "loose scan" group this query will do a re-position for
    each index entry. Compared to just reading the next index entry as a normal
    index scan does, the use of loose scan for this query becomes more expensive.
    
    The cause for selecting to use loose scan for this query is that in the current
    code when the size of the "loose scan" group is one, the formula for
    calculting the cost estimates becomes almost identical to the cost of using
    normal index scan. Differences in use of integer versus floating point aritmetic
    can cause one or the other access strategy to be selected.
    
    The main issue with the formula for estimating the cost of using loose scan is
    that it does not take into account that it is more costly to do a re-position
    for each "loose scan" group compared to just reading the next index entry.
    Both index scan and loose scan estimates the cpu cost as:
    
      "number of entries needed too read/scan" * ROW_EVALUATE_COST
    
    The results from testing with the query in this bug indicates that the real
    cost for doing re-position four to eight times higher than just reading the
    next index entry. Thus, the cpu cost estimate for loose scan should be increased.
    To account for the extra work to re-position in the index we increase the
    cost for loose index scan to include the cost of navigating the index. 
    This is modelled as a function of the height of the b-tree:
    
      navigation cost= ceil(log(records in table)/log(indexes per block))
                     * ROWID_COMPARE_COST;
    
    This will avoid loose index scan being used for indexes where the "loose scan" 
    group contains very few index entries.
Loading