Skip to content
  • Ole John Aske's avatar
    3852e151
    Fix for BUG#14058167 SELECT DISTINCT .... ORDER BY FAILS TO SORT ROWS AS SPECIFIED · 3852e151
    Ole John Aske authored
    Updated fix after review comments from Guilhem B.
    
    The optimizer today use two different fields to determine if an orderd-index
    may be used to provide rows ordered as required by GROUP BY / ORDER BY.
    
    1) enum
       {
         ordered_index_void,       // No ordered index avail.
         ordered_index_group_by,   // Use index for GROUP BY
         ordered_index_order_by    // Use index for ORDER BY
       } JOIN::ordered_index_usage;
    
      'ordered_index_usage' is set up by the optimizer as a result of test_if_skip_sort_order()
      immediately after we have determined if an ordered index should be used for
      GROUP BY / ORDER BY. It is also this field which is used to EXPLAIN the query plan.
      Once 'ordered_index_usage' is set, it could never later be cleared, and it should be
      obeyed by all later optimizer and execute components.
    
    2) bool JOIN_TAB::sorted
       Is also set by the optimizer, but in a lot more 'fuzzy' order than 'ordered_index_usage'.
       - It is initially set to 'true' for the first non-const JOIN_TAB if the query is either
         grouped or sorted - Without checking if there are any suitable indexes at all.
       - Prior to execution it is cleared by disable_sorted_access() if we will not use
         an ordered index access.
       - Furthermore it is used as argument to handler::ha_index_init() in order to specify
         whether sorted access is required or not. So 'sorted' has to be correct at this point.
    
    Having such fuzzy information as represented in JOIN_TAB::sorted is really bad. It has
    previously worked - sort of - by requesting a lot more 'sorted' access than really needed.
    See: BUG 57601: OPTIMIZER IS OVERLY EAGER TO REQUEST ORDERED ACCESS. and WL 5558.
    
    This bug was caused by a problem where the 'sorted fuzziness' kicked back the other way:
    We cleared the 'JOIN_TAB::sorted' even if 'JOIN::ordered_index_usage' specified
    that an ordered index access should be used.
    
    Furthermore this fix cleans up and entirely removes 'JOIN_TAB::sorted' 
    and instead base all ordered index usage on 'JOIN::ordered_index_usage'.
    
    The utility method JOIN_TAB::use_order() is introduced as a replacement 
    for JOIN_TAB::sorted. (Use ordered_index_usage)
    
    The optimizer is not any longer supposed to re-decide and 'turn off' 
    a previously sorted index. This has made it possible to entirely remove
    disable_sorted_access(), and together with that the bool on/off argument
    to need_sorted_output(bool sort).
    
    An assert has been introduced in the 2 places where we used to call 
    disable_sorted_acess() to check that JOIN_TAB dont specify 'use_order()'
    in those places where we used to disable_sorted_access()
    3852e151
    Fix for BUG#14058167 SELECT DISTINCT .... ORDER BY FAILS TO SORT ROWS AS SPECIFIED
    Ole John Aske authored
    Updated fix after review comments from Guilhem B.
    
    The optimizer today use two different fields to determine if an orderd-index
    may be used to provide rows ordered as required by GROUP BY / ORDER BY.
    
    1) enum
       {
         ordered_index_void,       // No ordered index avail.
         ordered_index_group_by,   // Use index for GROUP BY
         ordered_index_order_by    // Use index for ORDER BY
       } JOIN::ordered_index_usage;
    
      'ordered_index_usage' is set up by the optimizer as a result of test_if_skip_sort_order()
      immediately after we have determined if an ordered index should be used for
      GROUP BY / ORDER BY. It is also this field which is used to EXPLAIN the query plan.
      Once 'ordered_index_usage' is set, it could never later be cleared, and it should be
      obeyed by all later optimizer and execute components.
    
    2) bool JOIN_TAB::sorted
       Is also set by the optimizer, but in a lot more 'fuzzy' order than 'ordered_index_usage'.
       - It is initially set to 'true' for the first non-const JOIN_TAB if the query is either
         grouped or sorted - Without checking if there are any suitable indexes at all.
       - Prior to execution it is cleared by disable_sorted_access() if we will not use
         an ordered index access.
       - Furthermore it is used as argument to handler::ha_index_init() in order to specify
         whether sorted access is required or not. So 'sorted' has to be correct at this point.
    
    Having such fuzzy information as represented in JOIN_TAB::sorted is really bad. It has
    previously worked - sort of - by requesting a lot more 'sorted' access than really needed.
    See: BUG 57601: OPTIMIZER IS OVERLY EAGER TO REQUEST ORDERED ACCESS. and WL 5558.
    
    This bug was caused by a problem where the 'sorted fuzziness' kicked back the other way:
    We cleared the 'JOIN_TAB::sorted' even if 'JOIN::ordered_index_usage' specified
    that an ordered index access should be used.
    
    Furthermore this fix cleans up and entirely removes 'JOIN_TAB::sorted' 
    and instead base all ordered index usage on 'JOIN::ordered_index_usage'.
    
    The utility method JOIN_TAB::use_order() is introduced as a replacement 
    for JOIN_TAB::sorted. (Use ordered_index_usage)
    
    The optimizer is not any longer supposed to re-decide and 'turn off' 
    a previously sorted index. This has made it possible to entirely remove
    disable_sorted_access(), and together with that the bool on/off argument
    to need_sorted_output(bool sort).
    
    An assert has been introduced in the 2 places where we used to call 
    disable_sorted_acess() to check that JOIN_TAB dont specify 'use_order()'
    in those places where we used to disable_sorted_access()
Loading