Friday, 2 November 2018

Mysql: Does the join order matter in SQL?

Disregarding performance, will I get the same result from query A and B below? How about C and D?
-- A
select *
from   a left join b
           on <blahblah>
       left join c
           on <blahblan>


-- B
select *
from   a left join c
           on <blahblah>
       left join b
           on <blahblan>  

-- C
select *
from   a join b
           on <blahblah>
       join c
           on <blahblan>


-- D
select *
from   a join c
           on <blahblah>
       join b
           on <blahblan>  

 Answers


For INNER joins, no, the order doesn't matter. The queries will return same results, as long as you change your selects from SELECT * to SELECT a.*, b.*, c.*.

For (LEFTRIGHT or FULLOUTER joins, yes, the order matters - and (updated) things are much more complicated.
First, outer joins are not commutative, so a LEFT JOIN b is not the same as b LEFT JOIN a
Outer joins are not associative either, so in your examples which involve both (commutativity and associativity) properties:
a LEFT JOIN b 
    ON b.ab_id = a.ab_id
  LEFT JOIN c
    ON c.ac_id = a.ac_id
is equivalent to:
a LEFT JOIN c 
    ON c.ac_id = a.ac_id
  LEFT JOIN b
    ON b.ab_id = a.ab_id
but:
a LEFT JOIN b 
    ON  b.ab_id = a.ab_id
  LEFT JOIN c
    ON  c.ac_id = a.ac_id
    AND c.bc_id = b.bc_id
is not equivalent to:
a LEFT JOIN c 
    ON  c.ac_id = a.ac_id
  LEFT JOIN b
    ON  b.ab_id = a.ab_id
    AND b.bc_id = c.bc_id

Another (hopefully simpler) associativity example. Think of this as (a LEFT JOIN b) LEFT JOIN c:
a LEFT JOIN b 
    ON b.ab_id = a.ab_id          -- AB condition
 LEFT JOIN c
    ON c.bc_id = b.bc_id          -- BC condition
This is equivalent to a LEFT JOIN (b LEFT JOIN c):
a LEFT JOIN  
    b LEFT JOIN c
        ON c.bc_id = b.bc_id          -- BC condition
    ON b.ab_id = a.ab_id          -- AB condition
only because we have "nice" ON conditions. Both ON b.ab_id = a.ab_id and c.bc_id = b.bc_id are equality checks and do not involve NULL comparisons.
You can even have conditions with other operators or more complex ones like: ON a.x <= b.x or ON a.x = 7 or ON a.x LIKE b.x or ON (a.x, a.y) = (b.x, b.y) and the two queries would still be equivalent.
If however, any of these involved IS NULL or a function that is related to nulls like COALESCE(), for example if the condition was b.ab_id IS NULL, then the two queries would not be equivalent.



If you try joining c on a field from b before joining b, i.e.:
SELECT a.x, a.y, a.z FROM a 
   INNER JOIN c
       on b.x = c.x
   INNER JOIN b
       on a.x = b.x
your query will fail, so in this case the order matters.

0 comments:

Post a Comment