Wednesday, 27 November 2019

Optimization to skip index dives with FORCE INDEX.

Index dives are performed during the optimization phase to help decide which index to use. Currently when user specifies FORCE INDEX, optimizer still always calculates cost using index dives. Under some circumstances it is possible to avoid the index dives and this could speed up execution.

With WL#6526 (present in mysql-8.0.3) optimizer skips index dives when the query has a FORCE INDEX.
What is an index dive?
For as long as there have been a range access method in MySQL, the number of rows in a range has been estimated by diving  down the index to find the start and end of the range and use these to count the number of rows between them. This technique is accurate, and is therefore a good basis to make the best possible execution plan.
For the below queries, two index dives will be performed for each range (one each for min and max values in the range).
SELECT * FROM t1 WHERE (c1 > 1 AND c1 < 10) OR (c1 > 10 AND c1 < 20) ;
SELECT * FROM t1 WHERE c1 IN (11, 22) ;
Index dives can be skipped when there is a FORCE INDEX because the choice of index is pre-decided and access method depends on the WHERE clause (ref-access is used only when a single equality condition is present, in all other cases range access is chosen).
Optimizer skips index dives when:
  • Only a single table is accessed in the query.
  • FORCE INDEX applies to a single index.
  • No subquery is present.
  • Fulltext Index is not involved.
  • No GROUP-BY or DISTINCT clause.
  • No ORDER-BY clause.
This optimization currently does not apply to multi-table queries.
Identify whether a query is using this optimization:
  • EXPLAIN:
    • EXPLAIN FORMAT=TRADITIONAL FOR CONNECTION will see the following changes:
      • Number of “rows” will change to NULL for the table where index dive is skipped.
      • The value for “filtered” will be NULL.
    • EXPLAIN FORMAT=JSON FOR CONNECTION will see the following changes:
      • “rows_examined_per_scan” and “rows_produced_per_join” will not be
        mentioned when index dive is skipped.
      • Number of “rows” will change to NULL for the table where index dive is
        skipped.
      • The value for “filtered” will be NULL.
    • No change in output of EXPLAIN FORMAT=TRADITIONAL and EXPLAIN FORMAT=JSON
  • Optimizer trace:
    • Optimizer trace contains the tag “skipped_due_to_force_index”.

       
So what kind of queries could benefit from this optimization?
  1. Queries that contain large number of values in IN clause OR
  2. Queries with a large number of range conditions. (see queries below).
In the example below the first query has no FORCE INDEX and hence this optimization doesn’t apply. The second query has FORCE INDEX and takes significantly less time for the “statistics” phase of the query execution.

This is a feature request from Facebook.
The optimization applies by default starting from MySQL 8.0.3. You do not need to enable any settings or run any commands to turn it on, Please try it out, and let us know your results.

0 comments:

Post a Comment