Wednesday 5 September 2018

MySQL Left Outer Join, Exclude items from the second table owned by the user

I have two tables in my MySQL database, one is a library of all of the books in the database, and the other is containing individual rows corresponding to which books are in a user's library.

For example:
Library Table
 `id`        `title`...
=====        ===========
  1          Moby Dick
  2          Harry Potter

Collection Table
 `id`        `user`      `book`
=====        ======      =======
  1            1            2
  2            2            2
  3            1            1

What I want to do is run a query that will show all the books that are not in a user's collection. I can run this query to show all the books not in any user's collection:
SELECT *
FROM  `library`
LEFT OUTER JOIN  `collection` ON  `library`.`id` =  `collection`.`book`
WHERE  `collection`.`book` IS NULL

This works just fine as far as I can tell. Running this in PHPMyAdmin will result in all of the books that aren't in the collection table.
However, how do I restrict that to a certain user? For example, with the above dummy data, I want book 1 to result if user 2 runs the query, and no books if user 1 runs the query.
Just adding a AND user=[id] doesn't work, and with my extremely limited knowledge of JOIN statements I'm not getting anywhere really.
Also, the ID of the results being returned (of query shown, which doesn't do what I want but does function) is 0-- how do I make sure the ID returned is that of library.id?

You'll have to narrow down your LEFT JOIN selection to only the books that a particular user has, then whatever is NULL in the joined table will be rows(books) for which the user does not have in his/her collection:
SELECT
    a.id,
    a.title
FROM
    library a
LEFT JOIN
    (
        SELECT book
        FROM collection
        WHERE user = <userid>
    ) b ON a.id = b.book
WHERE
    b.book IS NULL

An alternative is:
SELECT
    a.id,
    a.title
FROM
    library a
WHERE
    a.id NOT IN
    (
        SELECT book
        FROM collection
        WHERE user = <userid>
    )

However, the first solution is more optimal as MySQL will execute the NOT IN subquery once for each row rather than just once for the whole query. Intuitively, you would expect MySQL to execute the subquery once and use it as a list, but MySQL is not smart enough to distinguish between correlated and non-correlated subqueries.
As stated here:
"The problem is that, for a statement that uses an IN subquery, the optimizer rewrites it as a correlated subquery."

0 comments:

Post a Comment