    b1dccffc
    WL#8777 InnoDB: Support for sampling table data for generating histograms · b1dccffc
    Darshan M N authored
    
    
    Description
    -----------
    
    Histogram support was added by WL#8705, but without any support from
    InnoDB. Sampling read the entire table regardless of the sampling
    percentage, so sampling a large table was costly even when only a small
    part of it needed to be sampled.
    
    This worklog adds the InnoDB support needed to read only the requested
    sampling percentage of the data. For example, with SYSTEM sampling
    (currently the only supported sampling type) and a sampling percentage of
    10, a table with about 100 leaf pages has only about 10 of them chosen
    pseudo-randomly, and every record in each chosen page is read. This keeps
    the sampled data evenly distributed and the sampling deterministic.
    
    The worklog also introduces two counters, sampled_pages_read and
    sampled_pages_skipped, in the "sampling" module to track the number of
    pages read and skipped during sampling.
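
    The page-level sampling and the two counters described above can be
    illustrated with a minimal sketch. It is not the InnoDB implementation:
    the function name, the SampleResult type, the seed parameter and the use
    of a seeded generator to obtain determinism are all assumptions made for
    illustration. Only the counter names are taken from this worklog.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SampleResult:
    # Hypothetical container; only the two counter names come from the worklog.
    pages: list = field(default_factory=list)
    sampled_pages_read: int = 0
    sampled_pages_skipped: int = 0


def sample_leaf_pages(leaf_page_ids, sampling_percentage, seed=0):
    """Choose leaf pages pseudo-randomly; a fixed seed makes the choice repeatable.

    Each page is kept with probability sampling_percentage / 100, so roughly
    that share of pages is read and the chosen pages are spread over the
    whole table. Every record of a chosen page would then be read.
    """
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")

    rng = random.Random(seed)  # assumption: determinism via a fixed seed
    result = SampleResult()
    for page_id in leaf_page_ids:
        if rng.random() * 100 < sampling_percentage:
            result.pages.append(page_id)
            result.sampled_pages_read += 1
        else:
            result.sampled_pages_skipped += 1
    return result


if __name__ == "__main__":
    outcome = sample_leaf_pages(range(100), 10)
    print(outcome.sampled_pages_read, outcome.sampled_pages_skipped)
```

    With this per-page approach the number of pages read is only
    approximately the requested percentage, but the two counters always add
    up to the number of leaf pages visited.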
    
    RB#: 22014
    Reviewed-by: Annamalai Gurusami <annamalai.gurusami@oracle.com>
    Reviewed-by: Mayank Prasad <mayank.prasad@oracle.com>