In today’s follow-up, we’ll use the COUNT() function in more sophisticated ways to tally unique values as well as those which satisfy a condition.
Distinct Counts
The COUNT(DISTINCT) function returns the number of rows with unique non-NULL values. Hence, the inclusion of the DISTINCT keyword eliminates duplicate rows from the count. Its syntax is:
COUNT(DISTINCT expr,[expr...])
As with the regular COUNT() function, the expr parameters above can be any given expression, including specific columns, all columns (*), function return values, or expression such as IF/CASE statements.
A Simple Example
Say that we had the following table of clients:
+------------+-------------+| last_name | first_name |
+------------+-------------+
| Tannen | Biff |
+------------+-------------+
| McFly | Marty |
+------------+-------------+
| Brown | Dr. Emmett |
+------------+-------------+
| McFly | George |
+------------+-------------+
Invoking COUNT(*) will return the number of all rows (4) while a COUNT DISTINCT on the last_name will count each row with a duplicated last name as one, so that we get a total of 3:
SELECT COUNT(*), COUNT(DISTINCT last_name) FROM clients; +----------+---------------------------+
| COUNT(*) | COUNT(DISTINCT last_name) |
+----------+---------------------------+
| 4 | 3 |
+----------+---------------------------+
Conditional Counts using Expressions
As mentioned above, COUNT() function parameters are not limited to column names; function return values and expressions such as IF/CASE statements are also fair game.
Here’s a table that contains several users’ telephone numbers and sex (limited to two for simplicity):
+------------+---------+| tel | sex |
+------------+---------+
| 7136609221 | male |
+------------+---------+
| 7136609222 | male |
+------------+---------+
| 7136609223 | female |
+------------+---------+
| 7136609228 | male |
+------------+---------+
| 7136609222 | male |
+------------+---------+
| 7136609223 | female |
+------------+---------+
Say that we wanted to build a query that told us how many distinct women and men there are in the table. The person is identified by their telephone ('tel') number. It is possible for the same 'tel' to appear multiple times, but that tel’s gender should only be counted one time.
Here's one option using a separate COUNT DISTINCT for each column:
SELECT COUNT(DISTINCT tel) gender_count,COUNT(DISTINCT CASE WHEN gender = 'male' THEN tel END) male_count,
COUNT(DISTINCT CASE WHEN gender = 'female' THEN tel END) female_count
FROM people
This SELECT statement would yield the following:
+--------------+------------+---------------+| gender_count | male_count | female_count |
+--------------+------------+---------------+
| 4 | 3 | 1 |
+--------------+------------+---------------+
BONUS! Grouping and Including a Grand Total
You can also stack counts vertically using GROUP BY:
+---------+-------+| GroupId | Count |
+---------+-------+
| 1 | 5 |
+---------+-------+
| 2 | 4 |
+---------+-------+
| 3 | 7 |
+---------+-------+
| Total: | 11 |
+---------+-------+
The “Total:” was produced using the SQL GROUPING() function, which was added in MySQL 8.0.1. It distinguishes between a NULL representing the set of all values in a super-aggregate row (produced by a ROLLUP) from a NULL in a regular row.
Here’s the full SQL:
Select Case When Grouping(GroupId) = 1Then 'Total:'
Else GroupId
End As GroupId,
Count(*) Count
From user_groups
Group By GroupId With Rollup
Order By Grouping(GroupId), GroupId
0 comments:
Post a Comment