Monday, 19 November 2018

Mysql Pivot table


I tried to search posts, but I only found solutions for SQL Server/Access. I need a solution in MySQL (5.X).
I have a table (called history) with 3 columns: hostid, itemname, itemvalue.
If I do a select (select * from history), it will return
   +--------+----------+-----------+
   | hostid | itemname | itemvalue |
   +--------+----------+-----------+
   |   1    |    A     |    10     |
   +--------+----------+-----------+
   |   1    |    B     |     3     |
   +--------+----------+-----------+
   |   2    |    A     |     9     |
   +--------+----------+-----------+
   |   2    |    c     |    40     |
   +--------+----------+-----------+
How do I query the database to return something like
   +--------+------+-----+-----+
   | hostid |   A  |  B  |  C  |
   +--------+------+-----+-----+
   |   1    |  10  |  3  |  0  |
   +--------+------+-----+-----+
   |   2    |   9  |  0  |  40 |
   +--------+------+-----+-----+

 Answers



I'm going to add a somewhat longer and more detailed explanation of the steps to 
take to solve this problem. 

I'll start out with the base you've given and use it to define a couple of terms that I'll use 
for the rest of this post. This will be the base table:
select * from history;

+--------+----------+-----------+
| hostid | itemname | itemvalue |
+--------+----------+-----------+
|      1 | A        |        10 |
|      1 | B        |         3 |
|      2 | A        |         9 |
|      2 | C        |        40 |
+--------+----------+-----------+
This will be our goal, the pretty pivot table:
select * from history_itemvalue_pivot;

+--------+------+------+------+
| hostid | A    | B    | C    |
+--------+------+------+------+
|      1 |   10 |    3 |    0 |
|      2 |    9 |    0 |   40 |
+--------+------+------+------+
Values in the history.hostid column will become y-values in the pivot table. 
Values in the history.itemname column will become x-values (for obvious reasons).

When I have to solve the problem of creating a pivot table, I tackle it using a three-step 
process (with an optional fourth step):
  1. select the columns of interest, i.e. y-values and x-values
  2. extend the base table with extra columns -- one for each x-value
  3. group and aggregate the extended table -- one group for each y-value
  4. (optional) prettify the aggregated table
Let's apply these steps to your problem and see what we get:
Step 1: select columns of interest. In the desired result, hostid provides the y-values 
and itemname provides the x-values.
Step 2: extend the base table with extra columns. We typically need one column per 
x-value. Recall that our x-value column is itemname:
create view history_extended as (
  select
    history.*,
    case when itemname = "A" then itemvalue end as A,
    case when itemname = "B" then itemvalue end as B,
    case when itemname = "C" then itemvalue end as C
  from history
);

select * from history_extended;

+--------+----------+-----------+------+------+------+
| hostid | itemname | itemvalue | A    | B    | C    |
+--------+----------+-----------+------+------+------+
|      1 | A        |        10 |   10 | NULL | NULL |
|      1 | B        |         3 | NULL |    3 | NULL |
|      2 | A        |         9 |    9 | NULL | NULL |
|      2 | C        |        40 | NULL | NULL |   40 |
+--------+----------+-----------+------+------+------+
Note that we didn't change the number of rows -- we just added extra columns. 
Also note the pattern of NULLs -- a row with itemname = "A" has a non-null value 
for new column A, and null values for the other new columns.
Step 3: group and aggregate the extended table. We need to group by hostid
since it provides the y-values:
create view history_itemvalue_pivot as (
  select
    hostid,
    sum(A) as A,
    sum(B) as B,
    sum(C) as C
  from history_extended
  group by hostid
);

select * from history_itemvalue_pivot;

+--------+------+------+------+
| hostid | A    | B    | C    |
+--------+------+------+------+
|      1 |   10 |    3 | NULL |
|      2 |    9 | NULL |   40 |
+--------+------+------+------+
(Note that we now have one row per y-value.) Okay, we're almost there! We just need to 
get rid of those ugly NULLs.
Step 4: prettify. We're just going to replace any null values with zeroes so the result set is nicer to look at:
create view history_itemvalue_pivot_pretty as (
  select 
    hostid, 
    coalesce(A, 0) as A, 
    coalesce(B, 0) as B, 
    coalesce(C, 0) as C 
  from history_itemvalue_pivot 
);

select * from history_itemvalue_pivot_pretty;

+--------+------+------+------+
| hostid | A    | B    | C    |
+--------+------+------+------+
|      1 |   10 |    3 |    0 |
|      2 |    9 |    0 |   40 |
+--------+------+------+------+
And we're done -- we've built a nice, pretty pivot table using MySQL.

Considerations when applying this procedure:
  • what value to use in the extra columns. I used itemvalue in this example
  • what "neutral" value to use in the extra columns. I used NULL, but it could also 
  • be 0 or "", depending on your exact situation
    what aggregate function to use when grouping. I used sum, but count and max are 
    also often used (max is often used when building one-row "objects" that had been spread 
    across many rows)
  • using multiple columns for y-values. This solution isn't limited to using a single 
  • column for the y-values -- just plug the extra columns into the group by
    clause (and don't forget to select them)
Known limitations:
  • this solution doesn't allow n columns in the pivot table -- each pivot column needs to 
  • be manually added when extending the base table. So for 5 or 10 x-values, this solution
    is nice. For 100, not so nice. There are some solutions with stored procedures generating a
    query, but they're ugly and difficult to get right. I currently don't know of a good way to
    solve this problem when the pivot table needs to have lots of columns.




let's reduce it to only one query:
select
    history.*,
    coalesce(sum(case when itemname = "A" then itemvalue end), 0) as A,
    coalesce(sum(case when itemname = "B" then itemvalue end), 0) as B,
    coalesce(sum(case when itemname = "C" then itemvalue end), 0) as C
from history
group by hostid




I edit Agung Sagita's answer from subquery to join. I'm not sure about how much difference between this 2 way, but just for another reference.
SELECT  hostid, T2.VALUE AS A, T3.VALUE AS B, T4.VALUE AS C
FROM TableTest AS T1
LEFT JOIN TableTest T2 ON T2.hostid=T1.hostid AND T2.ITEMNAME='A'
LEFT JOIN TableTest T3 ON T3.hostid=T1.hostid AND T3.ITEMNAME='B'
LEFT JOIN TableTest T4 ON T4.hostid=T1.hostid AND T4.ITEMNAME='C'




I make that into Group By hostId then it will show only first row with values,
like:
A   B  C
1  10
2      3




My solution :
select h.hostid, sum(ifnull(h.A,0)) as A, sum(ifnull(h.B,0)) as B, sum(ifnull(h.C,0)) as  C from (
select
hostid,
case when itemName = 'A' then itemvalue end as A,
case when itemName = 'B' then itemvalue end as B,
case when itemName = 'C' then itemvalue end as C
  from history 
) h group by hostid
It produces the expected results in the submitted case.

0 comments:

Post a Comment