Monday 3 September 2018

Why MySQL does not use an index in the WHERE IF clause?

Because of this setting:

mysql> show global variables like '%indexes';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| log_queries_not_using_indexes | ON    |
+-------------------------------+-------+

The slow queries log keep receiving:
# Time: 120607 16:58:30
# User@Host: xbtit[xbtit] @  [123.30.53.244]
# Query_time: 0  Lock_time: 0  Rows_sent: 1  Rows_examined: 16006
SELECT * FROM xbtit_files WHERE IF(soha_id is null OR soha_id = '', info_hash, soha_id)='6d63dd4ab199190b531752067414d4d6e6568f90';

Trying to explain this query:
mysql> EXPLAIN SELECT * FROM xbtit_files WHERE IF(soha_id is null OR soha_id = '', info_hash, soha_id)='6d63dd4ab199190b531752067414d4d6e6568f90';
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table       | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | xbtit_files | ALL  | NULL          | NULL | NULL    | NULL | 16006 | Using where |
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------------+

What surprised me is why MySQL not using indexes:
mysql> show index from xbtit_files;
+-------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table       | Non_unique | Key_name  | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| xbtit_files |          0 | PRIMARY   |            1 | info_hash   | A         |       16006 |     NULL | NULL   |      | BTREE      |         |
| xbtit_files |          1 | filename  |            1 | filename    | A         |       16006 |     NULL | NULL   | YES  | BTREE      |         |
| xbtit_files |          1 | category  |            1 | category    | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| xbtit_files |          1 | uploader  |            1 | uploader    | A         |          16 |     NULL | NULL   |      | BTREE      |         |
| xbtit_files |          1 | bin_hash  |            1 | bin_hash    | A         |       16006 |       20 | NULL   |      | BTREE      |         |
| xbtit_files |          1 | ix_sohaid |            1 | soha_id     | A         |       16006 |     NULL | NULL   | YES  | BTREE      |         |
+-------------+------------+-----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

FORCE INDEX also doesn't work:
mysql> EXPLAIN SELECT * FROM xbtit_files force index (PRIMARY) WHERE IF(soha_id is null OR soha_id = '', info_hash, soha_id)='6d63dd4ab199190b531752067414d4d6e6568f90';
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table       | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | xbtit_files | ALL  | NULL          | NULL | NULL    | NULL | 16006 | Using where |
+----+-------------+-------------+------+---------------+------+---------+------+-------+-------------+

Must I split this query into 2 operations?

In MySQL, you cannot create indexes on expressions, and the optimizer is not smart enough to split your query against two indexes.
Use this:
SELECT  *
FROM    xbtit_files
WHERE   soha_id = '6d63dd4ab199190b531752067414d4d6e6568f90'
UNION ALL
SELECT  *
FROM    xbtit_files
WHERE   soha_id = ''
        AND info_hash = '6d63dd4ab199190b531752067414d4d6e6568f90'
UNION ALL
SELECT  *
FROM    xbtit_files
WHERE   soha_id IS NULL
        AND info_hash = '6d63dd4ab199190b531752067414d4d6e6568f90'

Each query uses its own index.
You can just combine it into a single query:
SELECT  *
FROM    xbtit_files
WHERE   (
        soha_id = '6d63dd4ab199190b531752067414d4d6e6568f90'
        OR
        (soha_id = '' AND info_hash = '6d63dd4ab199190b531752067414d4d6e6568f90')
        OR
        (soha_id IS NULL AND info_hash = '6d63dd4ab199190b531752067414d4d6e6568f90')
        )

and create a composit index on (soha_id, info_hash) for this to work fast.
MySQL is also able to merge results from two indexes together, using index_merge, so there is a chance you would see this in the plan for the second query even if you don't create a composite index.

0 comments:

Post a Comment