-
Guilhem Bichot authored
We have a query with WHERE ... NOT IN (SELECT ...). The inner field is 1503 bytes (500 chars * 3, because it has utf8 charset). subquery_allows_materialization() considers subq-mat: per the fix for Bug 17566396: MATERIALIZATION IS NOT CHOSEN FOR LONG UTF8 VARCHAR FIELD, this field is small enough to _not_ be changed to a BLOB when stored into the tmp table (Item::is_blob_field() sees that 500 < CONVERT_IF_BIGGER_TO_BLOB), so subq-mat is chosen. Before that fix, the left operand of the inequality comparison would be the length in _bytes_ and subq-mat would not be chosen. In hash_sj_engine::setup(): create_tmp_table() uses MEMORY table with a regular unique index, without "unique constraint"; right after creation, setup() verifies that the table has no "unique constraint", otherwise subq-mat would fail - see comment in setup() and see how indexsubquery_engine::setup() does a regular index lookup of a key made with copy_ref_key(): to look up into "unique constraint" it would need to compute a checksum in copy_ref_key(). So far so good. At some point during materialization, MEMORY tmp table becomes full, we go into create_myisam_from_heap() then create_myisam_tmp_table(): if (keyinfo->key_length >= table->file->max_key_length() || keyinfo->user_defined_key_parts > table->file->max_key_parts() || share->uniques) { /* Can't create a key; Make a unique constraint instead of a key */ share->keys= 0; share->uniques= 1; using_unique_constraint=1; alas the first condition is 1503 > 1000 (engine is MyISAM) so we create "unique constraint" on the MyISAM table. Subq-mat is not ready for this, as we know, so we get a bad result: in fact, all look ups then return "not found", so "NOT IN" always returns TRUE. In 5.7, subq-mat has been made able to use "unique constraint" (see Bug 18227171 which was part of wl 6711); however this same testcase still crashes but for different reasons, a separate 5.7 patch will be made. For fixing 5.6: - we cannot simply undo the fix for Bug 17566396 as there was an associated customer - on the other hand we cannot modify the logic in create_myisam_tmp_table() to _not_ create a "unique constraint" in this case (it's 5.6, we must be conservative) - proposal: take a middle path: make subquery_allows_materialization() test the length (by extending is_blob_field()): if length is >=1000 bytes, then don't use subq-mat (because there is risk that MyISAM table is used and creates "unique constraint"). 1000 bytes is 333 utf8 characters. So we would have, for the limit of # of chars: - before the fix for Bug 17566396: 512/3 = 170 chars, - after it: 512 chars - after the proposed fix: 332 chars. The testcase in Bug 17566396 used 250 chars so still uses subq-mat, which is good.
Guilhem Bichot authoredWe have a query with WHERE ... NOT IN (SELECT ...). The inner field is 1503 bytes (500 chars * 3, because it has utf8 charset). subquery_allows_materialization() considers subq-mat: per the fix for Bug 17566396: MATERIALIZATION IS NOT CHOSEN FOR LONG UTF8 VARCHAR FIELD, this field is small enough to _not_ be changed to a BLOB when stored into the tmp table (Item::is_blob_field() sees that 500 < CONVERT_IF_BIGGER_TO_BLOB), so subq-mat is chosen. Before that fix, the left operand of the inequality comparison would be the length in _bytes_ and subq-mat would not be chosen. In hash_sj_engine::setup(): create_tmp_table() uses MEMORY table with a regular unique index, without "unique constraint"; right after creation, setup() verifies that the table has no "unique constraint", otherwise subq-mat would fail - see comment in setup() and see how indexsubquery_engine::setup() does a regular index lookup of a key made with copy_ref_key(): to look up into "unique constraint" it would need to compute a checksum in copy_ref_key(). So far so good. At some point during materialization, MEMORY tmp table becomes full, we go into create_myisam_from_heap() then create_myisam_tmp_table(): if (keyinfo->key_length >= table->file->max_key_length() || keyinfo->user_defined_key_parts > table->file->max_key_parts() || share->uniques) { /* Can't create a key; Make a unique constraint instead of a key */ share->keys= 0; share->uniques= 1; using_unique_constraint=1; alas the first condition is 1503 > 1000 (engine is MyISAM) so we create "unique constraint" on the MyISAM table. Subq-mat is not ready for this, as we know, so we get a bad result: in fact, all look ups then return "not found", so "NOT IN" always returns TRUE. In 5.7, subq-mat has been made able to use "unique constraint" (see Bug 18227171 which was part of wl 6711); however this same testcase still crashes but for different reasons, a separate 5.7 patch will be made. For fixing 5.6: - we cannot simply undo the fix for Bug 17566396 as there was an associated customer - on the other hand we cannot modify the logic in create_myisam_tmp_table() to _not_ create a "unique constraint" in this case (it's 5.6, we must be conservative) - proposal: take a middle path: make subquery_allows_materialization() test the length (by extending is_blob_field()): if length is >=1000 bytes, then don't use subq-mat (because there is risk that MyISAM table is used and creates "unique constraint"). 1000 bytes is 333 utf8 characters. So we would have, for the limit of # of chars: - before the fix for Bug 17566396: 512/3 = 170 chars, - after it: 512 chars - after the proposed fix: 332 chars. The testcase in Bug 17566396 used 250 chars so still uses subq-mat, which is good.
Loading