sql/sql_executor.cc · mysql-cluster-7.6.9 · Rasoul Jahanshahi / Mysql Server

Oct 12, 2018

Bug #28086754: OPTIMIZER SKIP THE RANGE SCAN ON SECOND · 4736e3b6

Sreeharsha Ramanavarapu authored Oct 12, 2018

               COLUMN IN A COMPOSITE INDEX

Issue:
------

CREATE TABLE test_ref (
 a INT NOT NULL,
 b VARCHAR(20),
 c VARCHAR(20),
 d VARCHAR(3),
 id INT,
 PRIMARY KEY (a),
 KEY idx1 (id, c),
 KEY idx2 (id, d));

SELECT *
FROM test_ref
WHERE id = 3 and c LIKE 'gh%'
ORDER BY c LIMIT 1;


On 5.7:
-------
idx1 with a range access is best option for this. idx1
covers both the conditions in the WHERE clause and will
support ordering since column "c" is part of the index.
But the optimizer choose ref access idx1.

Reasons for this (reproducible only with data from user) :
1) Table scan is calculated as cheaper than range access on
   idx1. So using range access is ruled out.
2) The only other option is ref-access on idx1.
3) Later stage switch back to range access on the same
   index (can_switch_from_ref_to_range) isn't possible
   because range access was rejected altogether in Step 1.

The problem with ref-access is that it uses only one
keypart of idx1 -> the column "id". This makes the query
execution slower.

On trunk:
---------
1) Table scan is cheaper than range access on idx1. Range
   access on idx1 is rejected.
2) But since range scan on idx1 uses more key parts than
   ref, the ref-access on idx1 is also rejected.
3) Initially optimizer chooses a ref-access on idx2.
4) While evaluating options for "ORDER BY" it notices that
   idx1 provides ordering and also doesn't change the
   current access method (ref). At this stage the optimizer
   is unaware of the reasons for rejecting ref-access on
   idx1 earlier (Step 2) and chooses it.

Solution On both 5.7 and trunk:
-------------------------------
In test_quick_select() add a new parameter to ignore the
cost of table scan. Range optimizer on the relevant key
will be re-run by ignoring the cost of table scan (since
it was calculated to be cheaper).


On 5.7:
-------
If the "dodgy_ref_cost" flag is set, that is
ref-access might be using less key-parts than range but was
calculated to be cheaper, then re-run the range optimizer
and check the same.

On trunk:
---------
When a ref-access is chosen, check if it is possible to
switch to range-access if more keyparts will be used. To
achieve this call can_switch_from_ref_to_range() from
test_if_skip_sort_order() when index for ref-access has
changed.

In can_switch_from_ref_to_range(), add a new parameter to
force the switch to range even if it was rejected
previously because a table scan is cheaper.

4736e3b6

Bug #28086754: OPTIMIZER SKIP THE RANGE SCAN ON SECOND

Sreeharsha Ramanavarapu authored Oct 12, 2018

               COLUMN IN A COMPOSITE INDEX

Issue:
------

CREATE TABLE test_ref (
 a INT NOT NULL,
 b VARCHAR(20),
 c VARCHAR(20),
 d VARCHAR(3),
 id INT,
 PRIMARY KEY (a),
 KEY idx1 (id, c),
 KEY idx2 (id, d));

SELECT *
FROM test_ref
WHERE id = 3 and c LIKE 'gh%'
ORDER BY c LIMIT 1;


On 5.7:
-------
idx1 with a range access is best option for this. idx1
covers both the conditions in the WHERE clause and will
support ordering since column "c" is part of the index.
But the optimizer choose ref access idx1.

Reasons for this (reproducible only with data from user) :
1) Table scan is calculated as cheaper than range access on
   idx1. So using range access is ruled out.
2) The only other option is ref-access on idx1.
3) Later stage switch back to range access on the same
   index (can_switch_from_ref_to_range) isn't possible
   because range access was rejected altogether in Step 1.

The problem with ref-access is that it uses only one
keypart of idx1 -> the column "id". This makes the query
execution slower.

On trunk:
---------
1) Table scan is cheaper than range access on idx1. Range
   access on idx1 is rejected.
2) But since range scan on idx1 uses more key parts than
   ref, the ref-access on idx1 is also rejected.
3) Initially optimizer chooses a ref-access on idx2.
4) While evaluating options for "ORDER BY" it notices that
   idx1 provides ordering and also doesn't change the
   current access method (ref). At this stage the optimizer
   is unaware of the reasons for rejecting ref-access on
   idx1 earlier (Step 2) and chooses it.

Solution On both 5.7 and trunk:
-------------------------------
In test_quick_select() add a new parameter to ignore the
cost of table scan. Range optimizer on the relevant key
will be re-run by ignoring the cost of table scan (since
it was calculated to be cheaper).


On 5.7:
-------
If the "dodgy_ref_cost" flag is set, that is
ref-access might be using less key-parts than range but was
calculated to be cheaper, then re-run the range optimizer
and check the same.

On trunk:
---------
When a ref-access is chosen, check if it is possible to
switch to range-access if more keyparts will be used. To
achieve this call can_switch_from_ref_to_range() from
test_if_skip_sort_order() when index for ref-access has
changed.

In can_switch_from_ref_to_range(), add a new parameter to
force the switch to range even if it was rejected
previously because a table scan is cheaper.