    Fix Bug#20427694 RECORDS_IN_RANGE IS +/-1 AGNOSTIC · a4891165
    Vasil Dimov authored
    
    
    The code in InnoDB that estimates the records in a given range
    (btr_estimate_n_rows_in_range()) is agnostic to off-by-one errors because
    it returns an estimate. Indeed +/- 1 is irrelevant if 4587 or 4586 is
    returned when the actual value is 4600 for example. This is fine - it is
    expected that the returned value is an approximation.
    
    The problem arises from the fact that different codepaths handle the
    range boundaries differently (include the boundary or not). One of those
    codepaths handles the case when the range fits entirely in one page and
    another when the range spans across multiple pages. So, for the same
    dataset we could get different results (+/-1) depending on the page size -
    if for a big page size the range fits in one page and for a smaller page
    size it spans across multiple pages.
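
A contrived Python sketch of this effect, for illustration only: it is not InnoDB code. The helper names (`paginate`, `count_inconsistent`) and the specific choice of which boundary the multi-page path drops are assumptions made up for the example. It only shows how two codepaths with different boundary handling can disagree by one on the same data when the page size changes.

```python
def paginate(keys, page_size):
    """Split sorted keys into fixed-size 'pages' (hypothetical stand-in for B-tree leaf pages)."""
    return [keys[i:i + page_size] for i in range(0, len(keys), page_size)]


def count_inconsistent(keys, page_size, low, high):
    """Count keys in [low, high] using two deliberately inconsistent codepaths."""
    # Keep only the pages that overlap the requested range.
    pages = [p for p in paginate(keys, page_size)
             if p[-1] >= low and p[0] <= high]
    if len(pages) <= 1:
        # Single-page path: includes both boundaries.
        return sum(low <= k <= high for p in pages for k in p)
    # Multi-page path: ASSUMPTION for this toy, it excludes the upper boundary.
    return sum(low <= k < high for p in pages for k in p)


keys = list(range(1, 9))                      # 1..8
one_page = count_inconsistent(keys, 8, 3, 6)  # range fits in one page
two_pages = count_inconsistent(keys, 4, 3, 6) # range spans two pages
print(one_page, two_pages)
```

The dataset and the range are identical in both calls; only the page size differs, yet the two results differ by one.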
    
    The solution is to tune ha_innobase::records_in_range() and
    btr_estimate_n_rows_in_range() to honor the type of each range boundary
    (open, closed, or unbounded, on the left and on the right) and to always
    return the exact number of rows in the range if all the pages in the range
    were sampled (see N_PAGES_READ_LIMIT).
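
As a rough illustration of what honoring the boundary type means for an exact count, here is a minimal Python sketch over a sorted list of keys. It is not the InnoDB implementation: the `Bound` enum and `rows_in_range` function are hypothetical names, and the sorted list stands in for the case where every page in the range has been read.

```python
from bisect import bisect_left, bisect_right
from enum import Enum


class Bound(Enum):
    OPEN = "open"            # boundary key itself is excluded
    CLOSED = "closed"        # boundary key itself is included
    UNBOUNDED = "unbounded"  # no limit on this side


def rows_in_range(sorted_keys, low, low_type, high, high_type):
    """Exact number of keys in the range, respecting each boundary type."""
    if low_type is Bound.UNBOUNDED:
        start = 0
    elif low_type is Bound.CLOSED:
        start = bisect_left(sorted_keys, low)    # first key >= low
    else:
        start = bisect_right(sorted_keys, low)   # first key > low

    if high_type is Bound.UNBOUNDED:
        end = len(sorted_keys)
    elif high_type is Bound.CLOSED:
        end = bisect_right(sorted_keys, high)    # one past last key <= high
    else:
        end = bisect_left(sorted_keys, high)     # one past last key < high

    return max(0, end - start)


keys = [10, 20, 30, 40, 50]
print(rows_in_range(keys, 20, Bound.CLOSED, 40, Bound.CLOSED))  # 20 <= k <= 40
print(rows_in_range(keys, 20, Bound.OPEN, 40, Bound.OPEN))      # 20 <  k <  40
```

Because the boundary type is an explicit input, the result does not depend on how the keys happen to be laid out across pages.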
    
    Reviewed-by: Annamalai Gurusami <annamalai.gurusami@oracle.com>
    Reviewed-by: Guilhem Bichot <guilhem.bichot@oracle.com> (only */subquery.*)
    RB: 8034
    (cherry picked from
    commit 4452052ea9c8ef14239ab64ceb8b2b5da2c4ea5a and
    commit f60a0b0b5c6d9e2608ba4caa8fc2076cfad09438,
    plus an adjustment to opt_hints.result)