Skip to content
  • Sreeharsha Ramanavarapu's avatar
    4736e3b6
    Bug #28086754: OPTIMIZER SKIP THE RANGE SCAN ON SECOND · 4736e3b6
    Sreeharsha Ramanavarapu authored
                   COLUMN IN A COMPOSITE INDEX
    
    Issue:
    ------
    
    CREATE TABLE test_ref (
     a INT NOT NULL,
     b VARCHAR(20),
     c VARCHAR(20),
     d VARCHAR(3),
     id INT,
     PRIMARY KEY (a),
     KEY idx1 (id, c),
     KEY idx2 (id, d));
    
    SELECT *
    FROM test_ref
    WHERE id = 3 and c LIKE 'gh%'
    ORDER BY c LIMIT 1;
    
    
    On 5.7:
    -------
    idx1 with a range access is best option for this. idx1
    covers both the conditions in the WHERE clause and will
    support ordering since column "c" is part of the index.
    But the optimizer choose ref access idx1.
    
    Reasons for this (reproducible only with data from user) :
    1) Table scan is calculated as cheaper than range access on
       idx1. So using range access is ruled out.
    2) The only other option is ref-access on idx1.
    3) Later stage switch back to range access on the same
       index (can_switch_from_ref_to_range) isn't possible
       because range access was rejected altogether in Step 1.
    
    The problem with ref-access is that it uses only one
    keypart of idx1 -> the column "id". This makes the query
    execution slower.
    
    On trunk:
    ---------
    1) Table scan is cheaper than range access on idx1. Range
       access on idx1 is rejected.
    2) But since range scan on idx1 uses more key parts than
       ref, the ref-access on idx1 is also rejected.
    3) Initially optimizer chooses a ref-access on idx2.
    4) While evaluating options for "ORDER BY" it notices that
       idx1 provides ordering and also doesn't change the
       current access method (ref). At this stage the optimizer
       is unaware of the reasons for rejecting ref-access on
       idx1 earlier (Step 2) and chooses it.
    
    Solution On both 5.7 and trunk:
    -------------------------------
    In test_quick_select() add a new parameter to ignore the
    cost of table scan. Range optimizer on the relevant key
    will be re-run by ignoring the cost of table scan (since
    it was calculated to be cheaper).
    
    
    On 5.7:
    -------
    If the "dodgy_ref_cost" flag is set, that is
    ref-access might be using less key-parts than range but was
    calculated to be cheaper, then re-run the range optimizer
    and check the same.
    
    On trunk:
    ---------
    When a ref-access is chosen, check if it is possible to
    switch to range-access if more keyparts will be used. To
    achieve this call can_switch_from_ref_to_range() from
    test_if_skip_sort_order() when index for ref-access has
    changed.
    
    In can_switch_from_ref_to_range(), add a new parameter to
    force the switch to range even if it was rejected
    previously because a table scan is cheaper.
    4736e3b6
    Bug #28086754: OPTIMIZER SKIP THE RANGE SCAN ON SECOND
    Sreeharsha Ramanavarapu authored
                   COLUMN IN A COMPOSITE INDEX
    
    Issue:
    ------
    
    CREATE TABLE test_ref (
     a INT NOT NULL,
     b VARCHAR(20),
     c VARCHAR(20),
     d VARCHAR(3),
     id INT,
     PRIMARY KEY (a),
     KEY idx1 (id, c),
     KEY idx2 (id, d));
    
    SELECT *
    FROM test_ref
    WHERE id = 3 and c LIKE 'gh%'
    ORDER BY c LIMIT 1;
    
    
    On 5.7:
    -------
    idx1 with a range access is best option for this. idx1
    covers both the conditions in the WHERE clause and will
    support ordering since column "c" is part of the index.
    But the optimizer choose ref access idx1.
    
    Reasons for this (reproducible only with data from user) :
    1) Table scan is calculated as cheaper than range access on
       idx1. So using range access is ruled out.
    2) The only other option is ref-access on idx1.
    3) Later stage switch back to range access on the same
       index (can_switch_from_ref_to_range) isn't possible
       because range access was rejected altogether in Step 1.
    
    The problem with ref-access is that it uses only one
    keypart of idx1 -> the column "id". This makes the query
    execution slower.
    
    On trunk:
    ---------
    1) Table scan is cheaper than range access on idx1. Range
       access on idx1 is rejected.
    2) But since range scan on idx1 uses more key parts than
       ref, the ref-access on idx1 is also rejected.
    3) Initially optimizer chooses a ref-access on idx2.
    4) While evaluating options for "ORDER BY" it notices that
       idx1 provides ordering and also doesn't change the
       current access method (ref). At this stage the optimizer
       is unaware of the reasons for rejecting ref-access on
       idx1 earlier (Step 2) and chooses it.
    
    Solution On both 5.7 and trunk:
    -------------------------------
    In test_quick_select() add a new parameter to ignore the
    cost of table scan. Range optimizer on the relevant key
    will be re-run by ignoring the cost of table scan (since
    it was calculated to be cheaper).
    
    
    On 5.7:
    -------
    If the "dodgy_ref_cost" flag is set, that is
    ref-access might be using less key-parts than range but was
    calculated to be cheaper, then re-run the range optimizer
    and check the same.
    
    On trunk:
    ---------
    When a ref-access is chosen, check if it is possible to
    switch to range-access if more keyparts will be used. To
    achieve this call can_switch_from_ref_to_range() from
    test_if_skip_sort_order() when index for ref-access has
    changed.
    
    In can_switch_from_ref_to_range(), add a new parameter to
    force the switch to range even if it was rejected
    previously because a table scan is cheaper.
Loading