Skip to content
  • Erik Froseth's avatar
    3e1f79bc
    WL#13377 Add support for hash outer, anti and semi join [8/8, outer join] · 3e1f79bc
    Erik Froseth authored
    This patch implements support for hash outer join.
    
    The right table is assigned as the build input, using the join attribute as the
    hash table key. The left table is then assigned as the probe input. For each row
    from the probe input, we do a lookup in the hash table using the join attribute.
    For each matching row found in the hash table, output the joined row. If no
    matching row is found, a null-complemented row is returned.
    
    If the hash join spills to disk, we cannot output NULL-complemented rows before
    the probe row is read back from disk; a probe row can match a row from the build
    input we haven't seen yet (it's been written out to disk because the hash table
    was full). So when writing a probe row out to disk, we add one byte in front of
    the row that works as a match flag. When we read the probe row back from chunk
    file later, we can output a NULL-complemented row _only_ if the probe row did
    not have a matching row in the hash table before it was written out to disk.
    
    Mapping existing execution plans to hash outer join is tricky due to the way
    MySQL sets up outer join plans. Consider the following query:
    
      SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.i = t3.i) ON t2.i = t1.i;
    
    Traditionally, MySQL tries to collect conditions in multi-equality conditions.
    For the query above, it means that t1.i, t2.i and t3.i will be collected in
    one single multi-equality condition during optimization, namely
    (t1.i = t2.i = t3.i). When creating the query plan, the optimizer will then
    create equality conditions (x = y) by traversing the multiple equality condition
    (t1.i, t2.i, t3.i) and combining each of the fields with a field from the
    table the optimizer think it will evaluate first (in the old executor with
    nested loop, this is the same as the first table in QEP_TAB order).
    As a result, the above query ends up looking like this after optimization:
    
      SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t1.i = t3.i) ON t2.i = t1.i;
    
    Note that t2.i = t3.i has been changed to t1.i = t3.i. In a nested loop, this
    works just fine (the executor will read the tables from left to right in the
    order t1, t2, t3). But with hash join, this breaks down; the join condition for
    't2 LEFT JOIN t3' will refer to the unavailable table 't1':
    
           left hash join <-- t2.i = t1.i
           |            |
          t1       left hash join <-- t1.i = t3.i !!
                   |            |
                  t2            t3
    
    Note that if a user tries to execute the rewritten query above, the server will
    return "Unknown column 't1.i'". In order to reverse this, we must go through
    each join condition and see if we have any fields that refers to tables we
    cannot reach. Each unreachable field must be replaced with a field from the
    multi-equality that the hash join actually can reach. Since the multi-equality
    is lost at this point, this patch now saves these so that we can perform this
    operation when hash join is set up. Note that this can also happen even if the
    column is wrapped inside a function, such as '... ON ABS(t2.i) = t3.i'.
    
    Traditional EXPLAIN and EXPLAIN FORMAT=json sends a normalized version of the
    query as a warning to the client. The query is printed after the iterator tree
    is constructed, meaning that the normalized query will contain the reversal of
    column substitution that happens when creating a hash join iterator. As a result
    of this, some tests in MTR will print different join columns if the query is to
    be executed by hash join.
    
    Change-Id: I138bf992ef5dfc083ce9f91a0ab126b35855bb0b
    3e1f79bc
    WL#13377 Add support for hash outer, anti and semi join [8/8, outer join]
    Erik Froseth authored
    This patch implements support for hash outer join.
    
    The right table is assigned as the build input, using the join attribute as the
    hash table key. The left table is then assigned as the probe input. For each row
    from the probe input, we do a lookup in the hash table using the join attribute.
    For each matching row found in the hash table, output the joined row. If no
    matching row is found, a null-complemented row is returned.
    
    If the hash join spills to disk, we cannot output NULL-complemented rows before
    the probe row is read back from disk; a probe row can match a row from the build
    input we haven't seen yet (it's been written out to disk because the hash table
    was full). So when writing a probe row out to disk, we add one byte in front of
    the row that works as a match flag. When we read the probe row back from chunk
    file later, we can output a NULL-complemented row _only_ if the probe row did
    not have a matching row in the hash table before it was written out to disk.
    
    Mapping existing execution plans to hash outer join is tricky due to the way
    MySQL sets up outer join plans. Consider the following query:
    
      SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t2.i = t3.i) ON t2.i = t1.i;
    
    Traditionally, MySQL tries to collect conditions in multi-equality conditions.
    For the query above, it means that t1.i, t2.i and t3.i will be collected in
    one single multi-equality condition during optimization, namely
    (t1.i = t2.i = t3.i). When creating the query plan, the optimizer will then
    create equality conditions (x = y) by traversing the multiple equality condition
    (t1.i, t2.i, t3.i) and combining each of the fields with a field from the
    table the optimizer think it will evaluate first (in the old executor with
    nested loop, this is the same as the first table in QEP_TAB order).
    As a result, the above query ends up looking like this after optimization:
    
      SELECT * FROM t1 LEFT JOIN (t2 LEFT JOIN t3 ON t1.i = t3.i) ON t2.i = t1.i;
    
    Note that t2.i = t3.i has been changed to t1.i = t3.i. In a nested loop, this
    works just fine (the executor will read the tables from left to right in the
    order t1, t2, t3). But with hash join, this breaks down; the join condition for
    't2 LEFT JOIN t3' will refer to the unavailable table 't1':
    
           left hash join <-- t2.i = t1.i
           |            |
          t1       left hash join <-- t1.i = t3.i !!
                   |            |
                  t2            t3
    
    Note that if a user tries to execute the rewritten query above, the server will
    return "Unknown column 't1.i'". In order to reverse this, we must go through
    each join condition and see if we have any fields that refers to tables we
    cannot reach. Each unreachable field must be replaced with a field from the
    multi-equality that the hash join actually can reach. Since the multi-equality
    is lost at this point, this patch now saves these so that we can perform this
    operation when hash join is set up. Note that this can also happen even if the
    column is wrapped inside a function, such as '... ON ABS(t2.i) = t3.i'.
    
    Traditional EXPLAIN and EXPLAIN FORMAT=json sends a normalized version of the
    query as a warning to the client. The query is printed after the iterator tree
    is constructed, meaning that the normalized query will contain the reversal of
    column substitution that happens when creating a hash join iterator. As a result
    of this, some tests in MTR will print different join columns if the query is to
    be executed by hash join.
    
    Change-Id: I138bf992ef5dfc083ce9f91a0ab126b35855bb0b
Loading