GROUPING_ID
Is a function that computes the level of grouping. GROUPING_ID can be used only in the SELECT <select> list, HAVING, or ORDER BY clauses when GROUP BY is specified.
GROUPING_ID ( <column_expression>[ ,...n ] )
<column_expression>
Is a column_expression in a GROUP BY clause.
BIGINT
The GROUPING_ID's <column_expression> must exactly match the expression in the GROUP BY list. For example, if you are grouping by user_id, use GROUPING_ID (user_id); or if you are grouping by name, use GROUPING_ID (name).
GROUPING_ID(<column_expression> [ ,...n ]) inputs the equivalent of the GROUPING(<column_expression>) return for each column in its column list in each output row as a string of ones and zeros. GROUPING_ID interprets that string as a base-2 number and returns the equivalent integer. For example consider the following statement: SELECT a, b, c, SUM(d), GROUPING_ID(a,b,c) FROM T GROUP BY <group by list>. The following table shows the GROUPING_ID() input and output values.
| Columns aggregated | GROUPING_ID (a, b, c) input = GROUPING(a) + GROUPING(b) + GROUPING(c) | GROUPING_ID () output |
|---|---|---|
a | 100 | 4 |
b | 010 | 2 |
c | 001 | 1 |
ab | 110 | 6 |
ac | 101 | 5 |
bc | 011 | 3 |
abc | 111 | 7 |
Each GROUPING_ID argument must be an element of the GROUP BY list. GROUPING_ID() returns an integer bitmap whose lowest N bits may be lit. A lit bit indicates the corresponding argument is not a grouping column for the given output row. The lowest-order bit corresponds to argument N, and the (N-1)ᵗʰ lowest-order bit corresponds to argument 1.
For a single grouping query, GROUPING (<column_expression>) is equivalent to GROUPING_ID(<column_expression>), and both return 0. For example, the following statements are equivalent:
Statement A:
SELECT GROUPING_ID(A,B) FROM T GROUP BY CUBE(A,B)
Statement B:
SELECT 3 FROM T GROUP BY () UNION ALL SELECT 1 FROM T GROUP BY A UNION ALL SELECT 2 FROM T GROUP BY B UNION ALL SELECT 0 FROM T GROUP BY A,B
Before starting our example, We first prepare the following data.
CREATE TABLE employee ( uid INT, name VARCHAR(32), level VARCHAR(32), title VARCHAR(32), department VARCHAR(32), hiredate DATE ) UNIQUE KEY(uid) DISTRIBUTED BY HASH(uid) BUCKETS 1 PROPERTIES ( "replication_num" = "1" ); INSERT INTO employee VALUES (1, 'Abby', 'Senior', 'President', 'Board of Directors', '1999-11-13'), (2, 'Bob', 'Senior', 'Vice-President', 'Board of Directors', '1999-11-13'), (3, 'Candy', 'Senior', 'System Engineer', 'Technology', '2005-3-7'), (4, 'Devere', 'Senior', 'Hardware Engineer', 'Technology', '2006-7-9'), (5, 'Emilie', 'Senior', 'System Analyst', 'Technology', '2003-8-28'), (6, 'Fredrick', 'Senior', 'Sales Manager', 'Sales', '2004-9-7'), (7, 'Gitel', 'Assistant', 'Business Executive', 'Sales', '2003-3-19'), (8, 'Haden', 'Trainee', 'Sales Assistant', 'Sales', '2007-6-30'), (9, 'Irene', 'Assistant', 'Business Executive', 'Sales', '2005-10-20'), (10, 'Jankin', 'Senior', 'Marketing Supervisor', 'Marketing', '2001-4-13'), (11, 'Louis', 'Trainee', 'Marketing Assistant', 'Marketing', '2007-8-2'), (12, 'Martin', 'Trainee', 'Marketing Assistant', 'Marketing', '2007-7-1'), (13, 'Nasir', 'Assistant', 'Marketing Executive', 'Marketing', '2004-9-3');
Here is the result.
+------+----------+-----------+----------------------+--------------------+------------+ | uid | name | level | title | department | hiredate | +------+----------+-----------+----------------------+--------------------+------------+ | 1 | Abby | Senior | President | Board of Directors | 1999-11-13 | | 2 | Bob | Senior | Vice-President | Board of Directors | 1999-11-13 | | 3 | Candy | Senior | System Engineer | Technology | 2005-03-07 | | 4 | Devere | Senior | Hardware Engineer | Technology | 2006-07-09 | | 5 | Emilie | Senior | System Analyst | Technology | 2003-08-28 | | 6 | Fredrick | Senior | Sales Manager | Sales | 2004-09-07 | | 7 | Gitel | Assistant | Business Executive | Sales | 2003-03-19 | | 8 | Haden | Trainee | Sales Assistant | Sales | 2007-06-30 | | 9 | Irene | Assistant | Business Executive | Sales | 2005-10-20 | | 10 | Jankin | Senior | Marketing Supervisor | Marketing | 2001-04-13 | | 11 | Louis | Trainee | Marketing Assistant | Marketing | 2007-08-02 | | 12 | Martin | Trainee | Marketing Assistant | Marketing | 2007-07-01 | | 13 | Nasir | Assistant | Marketing Executive | Marketing | 2004-09-03 | +------+----------+-----------+----------------------+--------------------+------------+ 13 rows in set (0.01 sec)
The following example returns the count of employees by department and level. GROUPING_ID() is used to create a value for each row in the Job Title column that identifies its level of aggregation.
SELECT department, CASE WHEN GROUPING_ID(department, level) = 0 THEN level WHEN GROUPING_ID(department, level) = 1 THEN CONCAT('Total: ', department) WHEN GROUPING_ID(department, level) = 3 THEN 'Total: Company' ELSE 'Unknown' END AS 'Job Title', COUNT(uid) AS 'Employee Count' FROM employee GROUP BY ROLLUP(department, level) ORDER BY GROUPING_ID(department, level) ASC;
Here is the result.
+--------------------+---------------------------+----------------+ | department | Job Title | Employee Count | +--------------------+---------------------------+----------------+ | Board of Directors | Senior | 2 | | Technology | Senior | 3 | | Sales | Senior | 1 | | Sales | Assistant | 2 | | Sales | Trainee | 1 | | Marketing | Senior | 1 | | Marketing | Trainee | 2 | | Marketing | Assistant | 1 | | Board of Directors | Total: Board of Directors | 2 | | Technology | Total: Technology | 3 | | Sales | Total: Sales | 4 | | Marketing | Total: Marketing | 4 | | NULL | Total: Company | 13 | +--------------------+---------------------------+----------------+ 13 rows in set (0.01 sec)
In the following code, to return only the rows that have the count of senior in department.
SELECT department, CASE WHEN GROUPING_ID(department, level) = 0 THEN level WHEN GROUPING_ID(department, level) = 1 THEN CONCAT('Total: ', department) WHEN GROUPING_ID(department, level) = 3 THEN 'Total: Company' ELSE 'Unknown' END AS 'Job Title', COUNT(uid) FROM employee GROUP BY ROLLUP(department, level) HAVING `Job Title` IN ('Senior');
Here is the result.
+--------------------+-----------+--------------+ | department | Job Title | count(`uid`) | +--------------------+-----------+--------------+ | Board of Directors | Senior | 2 | | Technology | Senior | 3 | | Sales | Senior | 1 | | Marketing | Senior | 1 | +--------------------+-----------+--------------+ 5 rows in set (0.01 sec)
GROUPING_ID
For more information, see also: