Tuesday 4 September 2018

Optimize a MySQL query with multiple joins

One of the queries used by a web app we're running is as follows:

SELECT
       p.id, r.id AS report_id, tr.result_id,
       r.report_date, r.department, r.reportStatus, rs.specimen,
       tr.name, tr.value, tr.flag, tr.unit, tr.reference_range
FROM patients AS p
INNER JOIN
    patients_reports AS pr ON pr.patient_id = p.id
INNER JOIN
    reports AS r ON pr.report_id = r.id
INNER JOIN
    results AS rs ON r.id = rs.report_id
INNER JOIN
    test_results AS tr ON rs.id = tr.result_id
WHERE pr.patient_id = 17548
ORDER BY rs.specimen, tr.name, r.report_date;

The explain plan looks like this:

+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+----------------------------------------------+
| id | select_type | table | type   | possible_keys | key       | key_len | ref               | rows   | Extra                                        |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+----------------------------------------------+
|  1 | SIMPLE      | p     | const  | PRIMARY       | PRIMARY   | 4       | const             |      1 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | rs    | ALL    | PRIMARY       | NULL      | NULL    | NULL              | 152817 |                                              |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY       | PRIMARY   | 4       | demo.rs.report_id |      1 |                                              |
|  1 | SIMPLE      | pr    | eq_ref | PRIMARY       | PRIMARY   | 8       | const,demo.r.id   |      1 | Using where; Using index                     |
|  1 | SIMPLE      | tr    | ref    | result_id     | result_id | 5       | demo.rs.id        |      1 | Using where                                  |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+----------------------------------------------+

The query returns 27371 rows. There are 152730 rows in test_results at the moment. This is just a small amount of demo data.

I've tried to make the query more efficient, but I can't get it to execute any faster. I've looked at various articles, the documentation, and questions on Stack Overflow, but have not been able to fix this.

I tried removing one of the joins as follows:

SELECT
       pr.patient_id, r.id AS report_id, tr.result_id,
       r.report_date, r.department, r.reportStatus, rs.specimen,
       tr.name, tr.value, tr.flag, tr.unit, tr.reference_range
FROM patients_reports AS pr
INNER JOIN
    reports AS r ON pr.report_id = r.id
INNER JOIN
    results AS rs ON r.id = rs.report_id
INNER JOIN
    test_results AS tr ON rs.id = tr.result_id
WHERE pr.patient_id = 17548
ORDER BY rs.specimen, tr.name, r.report_date;

The query plan is then as follows:

+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+---------------------------------+
| id | select_type | table | type   | possible_keys | key       | key_len | ref               | rows   | Extra                           |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+---------------------------------+
|  1 | SIMPLE      | rs    | ALL    | PRIMARY       | NULL      | NULL    | NULL              | 152817 | Using temporary; Using filesort |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY       | PRIMARY   | 4       | demo.rs.report_id |      1 |                                 |
|  1 | SIMPLE      | pr    | eq_ref | PRIMARY       | PRIMARY   | 8       | const,demo.r.id   |      1 | Using where; Using index        |
|  1 | SIMPLE      | tr    | ref    | result_id     | result_id | 5       | demo.rs.id        |      1 | Using where                     |
+----+-------------+-------+--------+---------------+-----------+---------+-------------------+--------+---------------------------------+

So not much different.

I've tried rearranging the query and using STRAIGHT_JOIN amongst other things, but I'm not getting anywhere.

I'd appreciate some suggestions on how to optimize the query. Thanks.

EDIT: Argh! I did not have an index on results.report_id. I have added one, but it does not seem to have helped:

+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+--------+---------------------------------+
| id | select_type | table | type   | possible_keys     | key       | key_len | ref               | rows   | Extra                           |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+--------+---------------------------------+
|  1 | SIMPLE      | rs    | ALL    | PRIMARY,report_id | NULL      | NULL    | NULL              | 152817 | Using temporary; Using filesort |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY           | PRIMARY   | 4       | demo.rs.report_id |      1 |                                 |
|  1 | SIMPLE      | pr    | eq_ref | PRIMARY           | PRIMARY   | 8       | const,demo.r.id   |      1 | Using where; Using index        |
|  1 | SIMPLE      | tr    | ref    | result_id         | result_id | 5       | demo.rs.id        |      1 | Using where                     |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+--------+---------------------------------+
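For reference, a minimal sketch of how an index like this could be created. The index name report_id is taken from the possible_keys column above; the exact statement that was run is an assumption.

```sql
-- Assumed form of the statement; the index name matches the key
-- that now appears in possible_keys for the rs (results) table.
ALTER TABLE results ADD INDEX report_id (report_id);

-- Re-run EXPLAIN on the query afterwards to see whether the optimizer
-- actually picks the new index (above it is listed but not chosen).
```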

EDIT2:

patients_reports looks like this:

+------------+---------+------+-----+---------+-------+
| Field      | Type    | Null | Key | Default | Extra |
+------------+---------+------+-----+---------+-------+
| patient_id | int(11) | NO   | PRI | 0       |       |
| report_id  | int(11) | NO   | PRI | 0       |       |
+------------+---------+------+-----+---------+-------+
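A table definition consistent with this output might look like the sketch below. The column order of the primary key (patient_id first) is inferred from the plans, where pr is looked up on PRIMARY with ref "const,demo.r.id"; the storage engine is not shown anywhere in the question and is left out.

```sql
-- Sketch only: reconstructed from the DESCRIBE output above.
CREATE TABLE patients_reports (
    patient_id INT(11) NOT NULL DEFAULT 0,
    report_id  INT(11) NOT NULL DEFAULT 0,
    -- Composite key; a lookup by patient_id alone can use its leading column.
    PRIMARY KEY (patient_id, report_id)
);
```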

EDIT3:

After adding the results.report_id index and trying the STRAIGHT_JOIN again as suggested by @DRapp:

SELECT STRAIGHT_JOIN
       r.id AS report_id, tr.result_id,
       r.report_date, r.department, r.reportStatus, rs.specimen,
       tr.name, tr.value, tr.flag, tr.unit, tr.reference_range
FROM patients_reports AS pr
INNER JOIN
    reports AS r ON pr.report_id = r.id
INNER JOIN
    results AS rs ON r.id = rs.report_id
INNER JOIN
    test_results AS tr ON rs.id = tr.result_id
WHERE pr.patient_id = 17548
ORDER BY rs.specimen, tr.name, r.report_date;

the plan looks like this:

+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys     | key       | key_len | ref               | rows | Extra                                        |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+------+----------------------------------------------+
|  1 | SIMPLE      | pr    | ref    | PRIMARY           | PRIMARY   | 4       | const             | 3646 | Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | r     | eq_ref | PRIMARY           | PRIMARY   | 4       | demo.pr.report_id |    1 |                                              |
|  1 | SIMPLE      | rs    | ref    | PRIMARY,report_id | report_id | 5       | demo.r.id         |  764 | Using where                                  |
|  1 | SIMPLE      | tr    | ref    | result_id         | result_id | 5       | demo.rs.id        |    1 | Using where                                  |
+----+-------------+-------+--------+-------------------+-----------+---------+-------------------+------+----------------------------------------------+

So I think that looks much better, but I'm not sure exactly how to tell. Also, the query still seems to take about the same amount of time as before.
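One rough way to compare plans, offered as a sketch rather than a definitive answer: the MySQL manual suggests multiplying the values in the rows column of EXPLAIN to estimate how many row combinations have to be examined. The figures below are copied from the plans above. They are optimizer estimates, not measurements, and the 764 estimate for rs may simply reflect index statistics that have not been refreshed since the index was added.

```sql
-- Product of the rows column for each plan (estimates only).
SELECT 152817 * 1 * 1 * 1 AS scan_plan_estimate,      -- rs, r, pr, tr
       3646 * 1 * 764 * 1 AS straight_join_estimate;  -- pr, r, rs, tr
-- scan_plan_estimate = 152817, straight_join_estimate = 2785544

-- Refreshing key distribution statistics may change the rs estimate.
ANALYZE TABLE results;

-- To see the work actually done, reset the session counters,
-- run the SELECT, then inspect the handler read counters.
FLUSH STATUS;
-- (run the query here)
SHOW SESSION STATUS LIKE 'Handler_read%';
```

By these estimates the STRAIGHT_JOIN plan does not look cheaper on paper even though it avoids the full scan, which would be consistent with the timing not changing. The handler counters are the more reliable comparison.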

Is results.report_id indexed? It looks like MySQL is failing to find a usable key and is doing a full table scan. I'm assuming results.id is actually the primary key.

Also, if report_id were the primary key and the table is InnoDB, the table would be clustered on that index, so I have absolutely no clue why that isn't screaming fast if it is configured that way.
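Both points can be checked directly. The statements below are standard MySQL; what they return for this schema is not known from the question.

```sql
-- Shows the primary key, any secondary indexes, and the ENGINE clause
-- (InnoDB tables are clustered on their primary key).
SHOW CREATE TABLE results;

-- Lists each index with its columns and estimated Cardinality.
SHOW INDEX FROM results;
```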

Tech Blog

About Me

I V RAMANA
