The
GROUP BY
clause permits a WITH ROLLUP
modifier that causes summary output to include extra rows that represent higher-level (that is, super-aggregate) summary operations. ROLLUP
thus enables you to answer questions at multiple levels of analysis with a single query. For example, ROLLUP
can be used to provide support for OLAP (Online Analytical Processing) operations.
Suppose that a
sales
table has year
, country
, product
, and profit
columns for recording sales profitability:CREATE TABLE sales
(
year INT,
country VARCHAR(20),
product VARCHAR(32),
profit INT
);
To summarize table contents per year, use a simple
GROUP BY
like this:mysql> SELECT year, SUM(profit) AS profit
FROM sales
GROUP BY year;
+------+--------+
| year | profit |
+------+--------+
| 2000 | 4525 |
| 2001 | 3010 |
+------+--------+
The output shows the total (aggregate) profit for each year. To also determine the total profit summed over all years, you must add up the individual values yourself or run an additional query. Or you can use
ROLLUP
, which provides both levels of analysis with a single query. Adding a WITH ROLLUP
modifier to the GROUP BY
clause causes the query to produce another (super-aggregate) row that shows the grand total over all year values:mysql> SELECT year, SUM(profit) AS profit
FROM sales
GROUP BY year WITH ROLLUP;
+------+--------+
| year | profit |
+------+--------+
| 2000 | 4525 |
| 2001 | 3010 |
| NULL | 7535 |
+------+--------+
The
NULL
value in the year
column identifies the grand total super-aggregate line.ROLLUP
has a more complex effect when there are multiple GROUP BY
columns. In this case, each time there is a change in value in any but the last grouping column, the query produces an extra super-aggregate summary row.
For example, without
ROLLUP
, a summary of the sales
table based on year
, country
, and product
might look like this, where the output indicates summary values only at the year/country/product level of analysis:mysql> SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product;
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2001 | Finland | Phone | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
+------+---------+------------+--------+
With
ROLLUP
added, the query produces several extra rows:mysql> SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | India | NULL | 1350 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2000 | USA | NULL | 1575 |
| 2000 | NULL | NULL | 4525 |
| 2001 | Finland | Phone | 10 |
| 2001 | Finland | NULL | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
| 2001 | USA | NULL | 3000 |
| 2001 | NULL | NULL | 3010 |
| NULL | NULL | NULL | 7535 |
+------+---------+------------+--------+
Now the output includes summary information at four levels of analysis, not just one:
- Following each set of product rows for a given year and country, an extra super-aggregate summary row appears showing the total for all products. These rows have the
product
column set toNULL
. - Following each set of rows for a given year, an extra super-aggregate summary row appears showing the total for all countries and products. These rows have the
country
andproducts
columns set toNULL
. - Finally, following all other rows, an extra super-aggregate summary row appears showing the grand total for all years, countries, and products. This row has the
year
,country
, andproducts
columns set toNULL
.
The
NULL
indicators in each super-aggregate row are produced when the row is sent to the client. The server looks at the columns named in the GROUP BY
clause following the leftmost one that has changed value. For any column in the result set with a name that matches any of those names, its value is set to NULL
. (If you specify grouping columns by column position, the server identifies which columns to set to NULL
by position.)
Because the
NULL
values in the super-aggregate rows are placed into the result set at such a late stage in query processing, you can test them as NULL
values only in the select list or HAVING
clause. You cannot test them as NULL
values in join conditions or the WHERE
clause to determine which rows to select. For example, you cannot add WHERE product IS NULL
to the query to eliminate from the output all but the super-aggregate rows.
The
NULL
values do appear as NULL
on the client side and can be tested as such using any MySQL client programming interface. However, at this point, you cannot distinguish whether a NULL
represents a regular grouped value or a super-aggregate value. In MySQL 8.0, you can use the GROUPING()
function to test the distinction.
The following discussion lists some behaviors specific to the MySQL implementation of
ROLLUP
.
When you use
ROLLUP
, you cannot also use an ORDER BY
clause to sort the results. In other words, ROLLUP
and ORDER BY
are mutually exclusive in MySQL. However, you still have some control over sort order. To work around the restriction that prevents using ROLLUP
with ORDER BY
and achieve a specific sort order of grouped results, generate the grouped result set as a derived table and apply ORDER BY
to it. For example:mysql> SELECT * FROM
(SELECT year, SUM(profit) AS profit
FROM sales GROUP BY year WITH ROLLUP) AS dt
ORDER BY year DESC;
+------+--------+
| year | profit |
+------+--------+
| 2001 | 3010 |
| 2000 | 4525 |
| NULL | 7535 |
+------+--------+
In this case, the super-aggregate summary rows sort with the rows from which they are calculated, and their placement depends on sort order (at the beginning for ascending sort, at the end for descending sort).
LIMIT
can be used to restrict the number of rows returned to the client. LIMIT
is applied after ROLLUP
, so the limit applies against the extra rows added by ROLLUP
. For example:mysql> SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP
LIMIT 5;
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
+------+---------+------------+--------+
Using
LIMIT
with ROLLUP
may produce results that are more difficult to interpret, because there is less context for understanding the super-aggregate rows.
A MySQL extension permits a column that does not appear in the
GROUP BY
list to be named in the select list. (For information about nonaggregated columns and GROUP BY
, see Section 12.20.3, “MySQL Handling of GROUP BY”.) In this case, the server is free to choose any value from this nonaggregated column in summary rows, and this includes the extra rows added by WITH ROLLUP
. For example, in the following query, country
is a nonaggregated column that does not appear in the GROUP BY
list and values chosen for this column are nondeterministic:mysql> SELECT year, country, SUM(profit) AS profit
FROM sales
GROUP BY year WITH ROLLUP;
+------+---------+--------+
| year | country | profit |
+------+---------+--------+
| 2000 | India | 4525 |
| 2001 | USA | 3010 |
| NULL | USA | 7535 |
+------+---------+--------+
This behavior is permitted when the
ONLY_FULL_GROUP_BY
SQL mode is not enabled. If that mode is enabled, the server rejects the query as illegal because country
is not listed in the GROUP BY
clause. With ONLY_FULL_GROUP_BY
enabled, you can still execute the query by using the ANY_VALUE()
function for nondeterministic-value columns:mysql> SELECT year, ANY_VALUE(country) AS country, SUM(profit) AS profit FROM sales GROUP BY year WITH ROLLUP; +------+---------+--------+ | year | country | profit | +------+---------+--------+ | 2000 | India | 4525 | | 2001 | USA | 3010 | | NULL | USA | 7535 | +------+---------+--------+
0 comments:
Post a Comment