-
Guilhem Bichot authored
Bug#29027883 INCORRECT RESULT OF LEFT JOIN Bug#29540230 LEFT JOIN + IS NOT TRUE CONDITION YIELDS WRONG RESULT The goal of this WL is to rewrite SELECT ... FROM ot WHERE oe NOT IN (SELECT ie FROM it1, it2 ...) to SELECT ... FROM ot ANTIJOIN (it1, it2 ...) ON oe=ie; ("merging" the subquery). It is like a semijoin except that rows of "ot" which have a match are rejected. The antijoin is internally handled as a special LEFT JOIN: SELECT ... FROM ot LEFT JOIN (it1, it2 ...) ON oe=ie WHERE right-side row IS NULL COMPLEMENTED; So a post-filter condition, "IS NULL COMPLEMENTED", is added to it1. As an optimization, a "Not Exists" tag is added to "it1": when a row of it1 has matched there is no need to test other rows of it. We also have the possibility to use Materialization-lookup: (it1,it2) is stored into a tmp table, which gets the IS NULL COMPLEMENTED condition and the "Not Exists" tag. Duplicate weedout is also possible. Loosescan and Materialization-scan are not because a row of 'ot' must always be emitted. Antijoin is not possible, if there are NULLs on either side of the NOT IN. For NOT EXISTS it's always possible, as it's equivalent to WHERE 1 NOT IN (SELECT 1 FROM ...). antijoin is visible in the rewritten query shown by EXPLAIN, in the optimizer trace, and through the "Not Exists" tag on the table in EXPLAIN. Two bugs found during development and posing problems were fixed (references above). Good extras: - Materialization-lookup strategy was previously forbidden for semijoins inside LEFT JOIN, now it is allowed. - WHERE oe IN (SELECT ie) IS TRUE didn't allow semijoin (as IN wasn't seen as top-level), now it does - WHERE oe IN (SELECT ie) IS NOT TRUE, allows antijoin (NULLs cause no problems then). - WHERE oe IN (SELECT ie) IS FALSE allow antijoin if there are no NULLables. A limitation: semijoin/antijoin are not done if containing an antijoin (they thus remain subqueries). top_level_item() is renamed to apply_is_true(). is_top_level_item() is renamed to ignore_unknown(). Item_func_true/false are deleted, their logic goes into parent Item_func_truth. The Bool_test enum (BOOL_IS_TRUE/etc) is used in Item_func_truth instead of boolean members. IS NULL and IS NOT NULL are not included into this change. The code converting NOT_NULLABLE_TEMPORAL_COLUMN IS NULL to NOT_NULLABLE_TEMPORAL_COLUMN='0000-00-00' is moved from optimization phase to Item_func_isnull::fix_fields. neg_transformer()s are changed to truth_transformer()s: on top of being able to push down NOT, we can also push down IS [NOT] TRUE/FALSE. And such transformers are added to Item_func_truth. So NOT (x IS TRUE) becomes x IS NOT TRUE (no NOT item) And (expr IS TRUE) IS FALSE becomes expr IS NOT TRUE (no IS FALSE item). This reduces the depth of the Item tree. If an item has no truth_transformer(), as a fallback it gets wrapped into a real Item_func_not/Item_func_truth. Thanks to the truth_transformer() Item_in_subselect gets the signal that it's, for example, inside IS NOT TRUE (and can thus do antijoin). This is done by a new function change_truth_value_of_expression(); which also does the job of negate_condition() which is suppressed from code. It calls truth_transformer(), which, likewise, also does the job of neg_transformer() which is suppressed. "::print()" functions of subquery items now take care of printing the "NOT" and "IS etc" that they absorbed. COALESCE(x_1,x_2,...) is marked as non-nullable as long as one x_i is not nullable. Approved by: Roy Lyseng <roy.lyseng@oracle.com>
Guilhem Bichot authoredBug#29027883 INCORRECT RESULT OF LEFT JOIN Bug#29540230 LEFT JOIN + IS NOT TRUE CONDITION YIELDS WRONG RESULT The goal of this WL is to rewrite SELECT ... FROM ot WHERE oe NOT IN (SELECT ie FROM it1, it2 ...) to SELECT ... FROM ot ANTIJOIN (it1, it2 ...) ON oe=ie; ("merging" the subquery). It is like a semijoin except that rows of "ot" which have a match are rejected. The antijoin is internally handled as a special LEFT JOIN: SELECT ... FROM ot LEFT JOIN (it1, it2 ...) ON oe=ie WHERE right-side row IS NULL COMPLEMENTED; So a post-filter condition, "IS NULL COMPLEMENTED", is added to it1. As an optimization, a "Not Exists" tag is added to "it1": when a row of it1 has matched there is no need to test other rows of it. We also have the possibility to use Materialization-lookup: (it1,it2) is stored into a tmp table, which gets the IS NULL COMPLEMENTED condition and the "Not Exists" tag. Duplicate weedout is also possible. Loosescan and Materialization-scan are not because a row of 'ot' must always be emitted. Antijoin is not possible, if there are NULLs on either side of the NOT IN. For NOT EXISTS it's always possible, as it's equivalent to WHERE 1 NOT IN (SELECT 1 FROM ...). antijoin is visible in the rewritten query shown by EXPLAIN, in the optimizer trace, and through the "Not Exists" tag on the table in EXPLAIN. Two bugs found during development and posing problems were fixed (references above). Good extras: - Materialization-lookup strategy was previously forbidden for semijoins inside LEFT JOIN, now it is allowed. - WHERE oe IN (SELECT ie) IS TRUE didn't allow semijoin (as IN wasn't seen as top-level), now it does - WHERE oe IN (SELECT ie) IS NOT TRUE, allows antijoin (NULLs cause no problems then). - WHERE oe IN (SELECT ie) IS FALSE allow antijoin if there are no NULLables. A limitation: semijoin/antijoin are not done if containing an antijoin (they thus remain subqueries). top_level_item() is renamed to apply_is_true(). is_top_level_item() is renamed to ignore_unknown(). Item_func_true/false are deleted, their logic goes into parent Item_func_truth. The Bool_test enum (BOOL_IS_TRUE/etc) is used in Item_func_truth instead of boolean members. IS NULL and IS NOT NULL are not included into this change. The code converting NOT_NULLABLE_TEMPORAL_COLUMN IS NULL to NOT_NULLABLE_TEMPORAL_COLUMN='0000-00-00' is moved from optimization phase to Item_func_isnull::fix_fields. neg_transformer()s are changed to truth_transformer()s: on top of being able to push down NOT, we can also push down IS [NOT] TRUE/FALSE. And such transformers are added to Item_func_truth. So NOT (x IS TRUE) becomes x IS NOT TRUE (no NOT item) And (expr IS TRUE) IS FALSE becomes expr IS NOT TRUE (no IS FALSE item). This reduces the depth of the Item tree. If an item has no truth_transformer(), as a fallback it gets wrapped into a real Item_func_not/Item_func_truth. Thanks to the truth_transformer() Item_in_subselect gets the signal that it's, for example, inside IS NOT TRUE (and can thus do antijoin). This is done by a new function change_truth_value_of_expression(); which also does the job of negate_condition() which is suppressed from code. It calls truth_transformer(), which, likewise, also does the job of neg_transformer() which is suppressed. "::print()" functions of subquery items now take care of printing the "NOT" and "IS etc" that they absorbed. COALESCE(x_1,x_2,...) is marked as non-nullable as long as one x_i is not nullable. Approved by: Roy Lyseng <roy.lyseng@oracle.com>
Loading