Tuesday 4 September 2018

The SQL query does not return the expected results after LEFT JOIN (schema and images included)

I created a query earlier which concatenates staff members associated with a customer. There is only ever 2 customers assigned to each customer. My problem is after the query is executed, for each customer, it associates ALL staff members with each customer, not 2. Here is what I have so far:

SELECT C.customer_ID, C.l_Name AS Surname,
    C.f_Name AS 'First Name', C.travel_Date,
    T.tour_Name,
    GROUP_CONCAT(S.f_Name, S.l_Name ) AS Staff_Concat
    FROM Customers AS C 

    LEFT JOIN Tour AS T ON C.tour_ID = T.tour_ID
    LEFT JOIN Staff_Day AS SD ON C.tour_ID = SD.tour_ID
    LEFT JOIN Staff_Day AS SD_2 ON SD_2.sd_Date = C.travel_Date
    LEFT JOIN Staff AS S ON SD.staff_ID = S.staff_ID

    WHERE
    C.travel_Date >= '2014-07-08 00:00:00'
    AND C.travel_Date <= '2014-07-08 23:59:59'

    AND C.customer_ID NOT IN (SELECT O.customer_ID
    FROM Customers AS C, Orders AS O
    WHERE C.travel_Date >= '2014-07-08 00:00:00'
    AND C.travel_Date <= '2014-07-08 23:59:59' AND C.customer_ID = O.customer_ID )
    GROUP BY C.customer_ID, Surname, 'First Name', C.travel_Date, T.tour_Name

This query successfully returns the correct amount of passengers. However, it is returning incorrect data for staff members. There should only be 2 staff members assigned to each passenger. This query seems to return all staff members for each customer. The result I'm getting is:
6166    customer_Name   2014-07-08 Wildthing    Staff1, Staff2, staff3...staff10

The expected result should be:
6166    customer_Name   2014-07-08 Wildthing    Staff1, staff2

Attached images for the Staff, Staff_Day and Tour table:
Staff_Day
Staff
Tour
Attached is structure for the Customers, Staff, Staff_Day and Tour table:
Table structure for table Customers
Column  Type    Null    Default
customer_ID int(11) No
f_Name  varchar(30) Yes NULL
l_Name  varchar(30) No
address varchar(100)    No
suburb  varchar(30) No
state   varchar(30) No
country varchar(30) No
postcode    varchar(30) No
email   varchar(50) No
phone   varchar(20) No
child_1 varchar(30) Yes NULL
child_2 varchar(30) Yes NULL
child_3 varchar(30) Yes NULL
child_4 varchar(30) Yes NULL
travel_Date datetime    No
signature   text    No
terms   tinyint(1)  No
interested_video    tinyint(1)  Yes 0
specials    tinyint(1)  Yes NULL
tour_ID int(11) Yes NULL
updated timestamp   No  CURRENT_TIMESTAMP

Table structure for table Staff
Column  Type    Null    Default
f_Name  varchar(30) No
l_Name  varchar(30) No
comm_Value  int(3)  No
staff_Type  varchar(50) No
staff_ID    int(11) No
bus_ID  int(11) No  

Table structure for table Staff_Day
Column  Type    Null    Default
sd_ID   int(11) No
staff_ID    int(4)  No
tour_ID int(4)  No
sd_Date datetime    No  

Table structure for table Tour
Column  Type    Null    Default
tour_Name   varchar(30) No
tour_ID int(11) No
bus_ID  int(11) No

I have been told there may be an issue in the JOIN part of the SQL query. Any help/guidance would be greatly appreciated. Apologies if this is a repost. I have done my research but I'm just running into brick walls now.

First, a very slight restructure of your query. You don't need two queries to the staff-day table on both ID and another by date, that was creating a Cartesian result on you. Instead, use the one table join on BOTH PARTS, and instead of hard-coding the from/to date compared to the staff-day table. You can just do a DATE_ADD() and be LESS than 1 full day... unless you are storing time portions on the travel date, then you would be better to do based on the date matching regardless of the time.
So, the first left-join on tour for the description, no problem. Second on the staff-day and then staff based on matching tour AND date.
Finally, you have a NOT IN (subselect). I changed this to a join format too, but just to clarify. It APPEARS you are trying to get a list of all customers who may have signed up for a given tour, but never actually placed and committed an order. But.. what if someone signs up for other orders on different days for the same customer, they will have an order on file... but you are then forcing the qualifier on the customer travel date. Without knowing more details behind the dates and orders table for exceptions... I am not sure what other impacts may be exposed.
EDIT
Since you are still getting issues with Cartesian results (but doesn't make sense), change your joins to INNER JOINS, thus the tour IS and SHOULD BE known and valid
SELECT
      C.customer_ID,
      C.l_Name AS Surname,
      C.f_Name AS 'First Name',
      C.travel_Date,
      T.tour_Name,
      GROUP_CONCAT(S.f_Name, S.l_Name ) AS Staff_Concat
   FROM
      Customers AS C
         JOIN Tour AS T
            ON C.tour_ID = T.tour_ID
         JOIN Staff_Day AS SD
            ON C.tour_ID = SD.tour_ID
           AND C.Travel_Date >= SD.sd_Date
           AND SD.sd_date < Date_add( C.Travel_Date, INTERVAL 1 DAY )
            JOIN Staff AS S
               ON SD.staff_ID = S.staff_ID
   WHERE
          C.travel_Date >= '2014-07-08 00:00:00'
      AND C.travel_Date <= '2014-07-08 23:59:59'
      AND C.customer_ID NOT IN
         ( SELECT
                 C.customer_ID
              FROM
                 Customers AS C
                    JOIN Orders AS O
                       ON C.customer_ID = O.customer_ID
              WHERE
                     C.travel_Date >= '2014-07-08 00:00:00'
                 AND C.travel_Date <= '2014-07-08 23:59:59' )
   GROUP BY
      C.customer_ID,
      Surname,
      'First Name',
      C.travel_Date,
      T.tour_Name

Suggestion on restructure of tables. I would have it more like...
Tours - generically TourSpecific - specific unique ID that also has ID for the tour it is based on but also has the specific time slot being offered. This way, when a customer signs up for the 10am vs 2pm tour, they would be assigned SpecificTourID, then you don't have to deal with timestamp comparisons. Similarly, the staff for the particular day and time slot would have the SpecificTourID, then you will have less confusion and simpler queries.
Ex:
Tours
TourID   TourName
10       Jammin
11       Thriller
12       Thunderstruck
13       Wildthing

SpecificTours
SpecificTourID   TourID   TourDateTime
1                10       8/1/2014 8:00am
2                10       8/1/2014 1:15pm
3                10       8/4/2014 8:00am
4                10       8/4/2014 1:15pm

Customers
CustomerID   FirstName  LastName
1            Bill       Board
2            Eileen     Dover
3            Seymore    Hair

Orders
OrderID   CustomerID   SpecificTourID
1         1            1
2         2            1
3         1            4

StaffDay (similar approach, but using the specific tourID they are assigned
This way, the order for Order #2 is for specific tour 1 which is tour 10 which is Jammin.

0 comments:

Post a Comment