    Fix Bug#20427694 RECORDS_IN_RANGE IS +/-1 AGNOSTIC · 2c247a60
    Vasil Dimov authored
    
    
    The code in InnoDB that estimates the number of records in a given range
    (btr_estimate_n_rows_in_range()) is agnostic to off-by-one errors because
    it returns an estimate. Indeed, +/-1 is irrelevant if 4587 or 4586 is
    returned when the actual value is, for example, 4600. This is fine - the
    returned value is expected to be an approximation.
    
    The problem arises from the fact that different codepaths handle the
    range boundaries differently (some include the boundary, some do not).
    One of those codepaths handles the case when the range fits entirely in
    one page and another handles the case when the range spans multiple
    pages. So, for the same dataset we could get different results (+/-1)
    depending on the page size: with a big page size the range fits in one
    page, while with a smaller page size it spans multiple pages.
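
To make the discrepancy concrete, here is a toy model, not InnoDB code. It assumes sorted integer keys split into fixed-capacity "pages" and two hypothetical counting paths (count_same_page, count_multi_page) that disagree on whether the right boundary belongs to the range. All names are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-page path: counts keys with lo < key < hi.
std::size_t count_same_page(const std::vector<int>& keys, int lo, int hi) {
    auto first = std::upper_bound(keys.begin(), keys.end(), lo);
    auto last = std::lower_bound(keys.begin(), keys.end(), hi);
    return first < last ? static_cast<std::size_t>(last - first) : 0;
}

// Hypothetical multi-page path: counts keys with lo < key <= hi,
// i.e. it silently includes the right boundary.
std::size_t count_multi_page(const std::vector<int>& keys, int lo, int hi) {
    auto first = std::upper_bound(keys.begin(), keys.end(), lo);
    auto last = std::upper_bound(keys.begin(), keys.end(), hi);
    return first < last ? static_cast<std::size_t>(last - first) : 0;
}

// Picks a path depending on whether both boundaries land in the same page.
std::size_t toy_estimate(const std::vector<int>& keys,
                         std::size_t page_capacity, int lo, int hi) {
    auto pos_lo = static_cast<std::size_t>(
        std::lower_bound(keys.begin(), keys.end(), lo) - keys.begin());
    auto pos_hi = static_cast<std::size_t>(
        std::lower_bound(keys.begin(), keys.end(), hi) - keys.begin());
    bool same_page = (pos_lo / page_capacity) == (pos_hi / page_capacity);
    return same_page ? count_same_page(keys, lo, hi)
                     : count_multi_page(keys, lo, hi);
}
```

With keys 1..20 and the range (5, 10), a page capacity of 16 takes the single-page path and a capacity of 4 takes the multi-page path, so the same data yields two counts that differ by one.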
    
    The solution is to tune ha_innobase::records_in_range() and
    btr_estimate_n_rows_in_range() to honor the type of each range boundary
    (open, closed or unbounded, on the left and on the right) and to always
    return the exact number of rows in the range if all the pages in the
    range were sampled (see N_PAGES_READ_LIMIT).
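
As a minimal sketch of what honoring the boundary type means, the following counts rows exactly over a sorted in-memory key list. The Bound enum, RangeEnd struct and exact_rows_in_range() are hypothetical names introduced for illustration only. They are not the InnoDB interfaces and do not model page sampling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one end of a range.
enum class Bound { Open, Closed, Unbounded };

struct RangeEnd {
    Bound type;
    int key;  // ignored when type == Bound::Unbounded
};

// Exact number of keys inside the range, respecting each boundary type.
// Assumes `keys` is sorted in ascending order.
std::size_t exact_rows_in_range(const std::vector<int>& keys,
                                RangeEnd left, RangeEnd right) {
    auto first = keys.begin();  // Unbounded: start at the first key
    if (left.type == Bound::Closed) {
        first = std::lower_bound(keys.begin(), keys.end(), left.key);
    } else if (left.type == Bound::Open) {
        first = std::upper_bound(keys.begin(), keys.end(), left.key);
    }

    auto last = keys.end();  // Unbounded: stop after the last key
    if (right.type == Bound::Closed) {
        last = std::upper_bound(keys.begin(), keys.end(), right.key);
    } else if (right.type == Bound::Open) {
        last = std::lower_bound(keys.begin(), keys.end(), right.key);
    }

    // An inverted range contains no rows.
    return first < last ? static_cast<std::size_t>(last - first) : 0;
}
```

Because the boundary type is an explicit input, the result no longer depends on which internal path is taken: for keys 1..20, (5, 10) gives 4, [5, 10] gives 6 and (5, 10] gives 5, whatever the page size.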
    
    Reviewed-by: Annamalai Gurusami <annamalai.gurusami@oracle.com>
    Reviewed-by: Guilhem Bichot <guilhem.bichot@oracle.com> (only */subquery.*)
    RB: 8034