Since MySQL 5.7 one can put indexes on expressions, aka functional indexes, using
generated columns. Basically you first need to use the generated column to define the functional expression, then indexed this column.
Quite useful when dealing with JSON functions, you can find an example
here and the documentation
there.
Let’s see how with a quick practical example.
Below salaries table structure:
1
2
3
4
5
6
7
8
9
10
11
12
|
mysql> SHOW CREATE TABLE salaries\G
*************************** 1. row ***************************
Table: salaries
Create Table: CREATE TABLE `salaries` (
`emp_no` int(11) NOT NULL,
`salary` int(11) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`emp_no`,`from_date`),
CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0,00 sec)
|
It contains some data
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
mysql> SELECT count(*) FROM salaries;
+----------+
| count(*) |
+----------+
| 2844047 |
+----------+
mysql> SELECT * FROM salaries LIMIT 3;
+--------+--------+------------+------------+
| emp_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 10001 | 60117 | 1986-06-26 | 1987-06-26 |
| 10001 | 62102 | 1987-06-26 | 1988-06-25 |
| 10001 | 66074 | 1988-06-25 | 1989-06-25 |
+--------+--------+------------+------------+
|
Let’s focus on the following query:
SELECT * FROM salaries WHERE YEAR(to_date)=1985
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
|
mysql> SELECT * FROM salaries WHERE YEAR(to_date)=1985;
+--------+--------+------------+------------+
| emp_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 14688 | 42041 | 1985-07-06 | 1985-08-08 |
...snip...
| 498699 | 40000 | 1985-09-25 | 1985-09-28 |
+--------+--------+------------+------------+
89 rows in set (0,80 sec)
mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: salaries
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 100.00
Extra: Using where
|
We have a full table scan (
type: ALL
), meaning no index is used. Perhaps because there is no index on column
to_date…
So let’s add an index on
to_date !
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
mysql> ALTER TABLE salaries ADD INDEX idx_to_date (to_date);
Query OK, 0 rows affected (17,13 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> SHOW CREATE TABLE salaries\G
*************************** 1. row ***************************
Table: salaries
Create Table: CREATE TABLE `salaries` (
`emp_no` int(11) NOT NULL,
`salary` int(11) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`emp_no`,`from_date`),
KEY `idx_to_date` (`to_date`),
CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
|
And run again the query with the hope of a better execution plan
1
2
3
4
5
6
7
8
9
10
11
12
13
14
|
mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: salaries
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 2838426
filtered: 100.00
Extra: Using where
|
Ouch! Still have a full table scan !
The index can’t be used because of the use of a function (
YEAR()) on the indexed column (
to_date).
BTW if you’re really surprise, maybe you should read
this.
This is the case when you need a functional index!
|
mysql> ALTER TABLE salaries ADD INDEX idx_year_to_date((YEAR(to_date)));
Query OK, 0 rows affected (20,04 sec)
Records: 0 Duplicates: 0 Warnings: 0
|
The syntax is very similar of the creation of a “regular” index. Although you must be aware of the double parentheses: (( <expression> ))
We can now see our new index named idx_year_to_date and the indexed expression year(to_date) :
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
|
mysql> SHOW CREATE TABLE salaries\G
*************************** 1. row ***************************
Table: salaries
Create Table: CREATE TABLE `salaries` (
`emp_no` int(11) NOT NULL,
`salary` int(11) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`emp_no`,`from_date`),
KEY `idx_to_date` (`to_date`),
KEY `idx_year_to_date` ((year(`to_date`))),
CONSTRAINT `salaries_ibfk_1` FOREIGN KEY (`emp_no`) REFERENCES `employees` (`emp_no`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
mysql> SELECT INDEX_NAME, EXPRESSION
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA='employees'
AND TABLE_NAME = "salaries"
AND INDEX_NAME='idx_year_to_date';
+------------------+-----------------+
| INDEX_NAME | EXPRESSION |
+------------------+-----------------+
| idx_year_to_date | year(`to_date`) |
+------------------+-----------------+
|
Let’s test our query again
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
|
mysql> explain SELECT * FROM salaries WHERE YEAR(to_date)=1985\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: salaries
partitions: NULL
type: ref
possible_keys: idx_year_to_date
key: idx_year_to_date
key_len: 5
ref: const
rows: 89
filtered: 100.00
Extra: NULL
mysql> SELECT * FROM salaries WHERE YEAR(to_date)=1985;
+--------+--------+------------+------------+
| emp_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 14688 | 42041 | 1985-07-06 | 1985-08-08 |
...snip...
| 498699 | 40000 | 1985-09-25 | 1985-09-28 |
+--------+--------+------------+------------+
89 rows in set (0,00 sec)
|
Here we go!
Now the query is able to use the index. And in this case we have a positive impact on the execution time.
It is also interesting to note that it is possible to use idx_to_date, the first index created (the non functional one) if we can rewrite the original query:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
|
mysql> EXPLAIN SELECT *
FROM salaries
WHERE to_date BETWEEN '1985-01-01' AND '1985-12-31'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: salaries
partitions: NULL
type: range
possible_keys: idx_to_date
key: idx_to_date
key_len: 3
ref: NULL
rows: 89
filtered: 100.00
Extra: Using index condition
mysql> SELECT *
FROM salaries
WHERE to_date BETWEEN '1985-01-01' AND '1985-12-31'
+--------+--------+------------+------------+
| emp_no | salary | from_date | to_date |
+--------+--------+------------+------------+
| 20869 | 40000 | 1985-02-17 | 1985-03-01 |
...snip...
| 45012 | 66889 | 1985-08-16 | 1985-12-31 |
+--------+--------+------------+------------+
89 rows in set (0,00 sec)
|
This saves an index, I mean less indexes to maintain for the engine. Also speaking of maintenance cost, the cost to maintain a functional index is higher than the cost of a regular one.
In the other side the execution plan is less good (query cost higher) and obviously you must rewrite the query.
Requirements and restrictions.
A primary key cannot be a functional index:
|
mysql> CREATE TABLE t1 (i INT, PRIMARY KEY ((ABS(i))));
ERROR 3756 (HY000): The primary key cannot be a functional index
|
You can not index non-deterministic functions (RAND(), UNIX_TIMESTAMP(), NOW()…)
|
mysql> CREATE TABLE t1 (i int, KEY ((RAND(i))));
ERROR 3758 (HY000): Expression of functional index 'functional_index' contains a disallowed function.
|
SPATIAL and FULLTEXT indexes cannot have functional key parts.
Conclusion
Functional index is an interesting and a relevant feature, it could be very useful to optimize your queries without rewrite them and especially when dealing with
JSON documents and other complex types.
Obviously all the details you must know are in the MySQL documentation:
Functional Key PartsIf you interested in the
high level architecture and the
low level design please read the
workload.
0 comments:
Post a Comment