Erik Froseth authored
This patch adds hash joins to MySQL. We have chosen to implement hash join with spill to disk if the input(s) become too big to fit in memory. We start out by trying to do everything in memory. If we run out of memory, we partition each input into smaller partitions, also called hash join chunks. Each pair of partitions is then processed as a normal in-memory hash join. The amount of memory available is controlled by the variable "join_buffer_size". A simplified sketch of this partitioning scheme follows below.

In this round, we replace block-nested loop (BNL), both with and without join conditions, with hash join whenever possible. That is, if the join condition is an equi-join where one side of the condition refers to one input of the join and the other side refers to the other input, and the plan can be executed using the iterator execution engine, we replace BNL with hash join (a sketch of this eligibility check also follows below). This version of hash join only supports inner joins; outer joins will be added at a later stage. There have been no changes to the join optimizer, so the optimizer still produces execution plans assuming it is going to execute a BNL. This should clearly be changed, but that is left to a later patch/worklog. Note that we rely on the new iterator execution engine to execute hash joins, which means that any query executed with the old execution engine cannot use hash join.

We have benchmarked hash join against both block-nested loop and index lookups. Hash join is faster than BNL in almost all cases, especially with larger inputs; in many queries, hash join completes in seconds where BNL is still running after several hours. Compared with index lookups, we have seen cases in DBT-3 where an execution plan with hash join is on par with or faster than a nested loop with index lookups. However, the differences there are not that big, and there are many cases where an index lookup is faster than hash join.

Change-Id: Ib3b3daaa8abc93567f99bf36bd1ef2c7633368a7
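
Below is a minimal, self-contained C++ sketch of the spill strategy described above: partition both inputs by hashing the join key, then join each pair of chunks with a regular in-memory hash join. All names here (Row, Chunk, PartitionInput, kNumChunks) are illustrative assumptions, not MySQL's actual types or APIs, and the chunks are kept in memory where the real implementation writes them to chunk files on disk.

#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

struct Row {
  std::string join_key;  // value of the equi-join column
  std::string payload;   // remaining columns, opaque here
};

// One "hash join chunk": the rows from one input that hash to a given
// partition index.
using Chunk = std::vector<Row>;

constexpr size_t kNumChunks = 8;  // illustrative; a real system sizes this

// Split an input into kNumChunks partitions by hashing the join key.
// Rows with equal keys always land in the same partition index, so each
// build/probe chunk pair can later be joined independently.
std::vector<Chunk> PartitionInput(const std::vector<Row> &input) {
  std::vector<Chunk> chunks(kNumChunks);
  for (const Row &row : input) {
    const size_t idx = std::hash<std::string>{}(row.join_key) % kNumChunks;
    chunks[idx].push_back(row);  // the real code spills these to disk
  }
  return chunks;
}

// Classic in-memory hash join of one build chunk against one probe chunk.
void InMemoryHashJoin(
    const Chunk &build, const Chunk &probe,
    const std::function<void(const Row &, const Row &)> &emit) {
  std::unordered_multimap<std::string, const Row *> table;
  for (const Row &row : build) table.emplace(row.join_key, &row);
  for (const Row &row : probe) {
    auto [begin, end] = table.equal_range(row.join_key);
    for (auto it = begin; it != end; ++it) emit(*it->second, row);
  }
}

// Spill path: partition both inputs the same way, then join chunk pairs.
void HashJoinWithSpill(
    const std::vector<Row> &build_input, const std::vector<Row> &probe_input,
    const std::function<void(const Row &, const Row &)> &emit) {
  const auto build_chunks = PartitionInput(build_input);
  const auto probe_chunks = PartitionInput(probe_input);
  for (size_t i = 0; i < kNumChunks; ++i) {
    InMemoryHashJoin(build_chunks[i], probe_chunks[i], emit);
  }
}

Because rows with equal join keys always hash to the same partition index, joining build chunk i against probe chunk i is sufficient; no cross-chunk pairs ever need to be considered.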
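The eligibility test described above (an equi-join where each side of the condition refers to a different input of the join) can be sketched with table bitmaps. Again, this is a hedged illustration under assumed names (Condition, IsHashJoinCondition); the real check lives in the server's condition-analysis code, and table_map is used here simply as a bitmap with one bit per table.

#include <cstdint>

using table_map = uint64_t;  // bitmap of tables, one bit per table

struct Condition {
  bool is_equality;        // true for "<expr> = <expr>"
  table_map left_tables;   // tables referenced by the left operand
  table_map right_tables;  // tables referenced by the right operand
};

// A condition can drive a hash join if it is an equality where one operand
// refers only to tables in the left join input and the other refers only to
// tables in the right input (in either order).
bool IsHashJoinCondition(const Condition &cond, table_map left_input,
                         table_map right_input) {
  if (!cond.is_equality) return false;
  const bool straight = (cond.left_tables & ~left_input) == 0 &&
                        (cond.right_tables & ~right_input) == 0;
  const bool swapped = (cond.left_tables & ~right_input) == 0 &&
                       (cond.right_tables & ~left_input) == 0;
  // Each operand must reference at least one table; otherwise it is a
  // constant and not a useful join condition.
  return cond.left_tables != 0 && cond.right_tables != 0 &&
         (straight || swapped);
}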