SQL Server 2008 introduces a number of enhancements
and changes to the grouping aggregate relational result set. These
changes include the following:
ROLLUP and CUBE Operator Syntax Changes
The ROLLUP and CUBE operators produce additional aggregate groupings and are appended to the GROUP BY clause. Prior to SQL Server 2008, to include ROLLUP or CUBE groupings, you had to specify the WITH ROLLUP or WITH CUBE options in the GROUP BY clause after the list of grouping columns. In SQL Server 2008, the syntax now follows the ANSI standard for ROLLUP and CUBE; you first designate the ROLLUP or CUBE
option and then provide the grouping columns to these operators as a
comma-separated list enclosed in parentheses. The new syntax is
GROUP BY [ROLLUP | CUBE ( non-aggregate_column_list ) ]
Following are examples using the pre-2008 syntax:
SELECT type, pub_id, AVG(price) AS average
FROM titles
GROUP BY type, pub_id
WITH CUBE
SELECT pub_id, type, SUM(ytd_sales) as ytd_sales
FROM dbo.titles
where type like '%cook%' or type = 'business'
GROUP BY type, pub_id
WITH ROLLUP
An example of the new ANSI standard syntax supported in SQL Server 2008 is as follows:
SELECT type, pub_id, AVG(price) AS average
FROM titles
GROUP BY CUBE ( type, pub_id)
SELECT pub_id, type, SUM(ytd_sales) as ytd_sales
FROM dbo.titles
where type like '%cook%' or type = 'business'
GROUP BY ROLLUP (type, pub_id)
Note
The old-style CUBE and ROLLUP
syntax is still supported for backward-compatibility purposes but is
being deprecated. You should convert any existing queries using the
pre-2008 WITH CUBE or WITH ROLLUP syntax to the new syntax to ensure future compatibility.
GROUPING SETS
The CUBE and ROLLUP
operators allow you to run a single query and generate multiple sets of
groupings. However, the sets of groupings are fixed. For example, if
you use GROUP BY ROLLUP (A, B, C), you get aggregates generated for the following groupings of nonaggregate columns:
If you use GROUP BY CUBE (A, B, C), you get aggregates generated for the following groupings of nonaggregate columns:
SQL Server 2008 introduces the GROUPING SETS operator in addition to the CUBE and ROLLUP operators for performing several groupings in a single query. With GROUPING SETS, only the specified groups are aggregated instead of the full set of aggregations generated by CUBE or ROLLUP. GROUPING SETS enables you to generate results with multiple groupings in a single query, without having to resort to writing multiple GROUP BY queries and combining the results using a UNION ALL statement.
The GROUPING SETS
operator supports concatenating column groupings and an optional super
aggregate row. The syntax for defining grouping sets is as follows:
GROUP BY [ GROUPING SETS ( ( ) | grouping_set_item | grouping_set_item_list
[, ...n ] ) ]
The GROUPING SETS items can be single columns or a list of columns. The null field list "( )"
can also be used to generate a super-aggregate (that is, a grand total
for the entire result set). A non-nested list of columns works as
separate simple GROUP BY statements, which are then combined in an implied UNION ALL. A nested list of columns in parentheses within the GROUPING SETS item list works as a GROUP BY on that set of columns. Table 1 demonstrates examples of GROUPING SETS clauses and the corresponding groupings that the query generates.
Table 1. Grouping Sets Examples
GROUPING SETS Clause | Equivalent Statement |
---|
GROUP BY GROUPING SETS (A,B,C) | GROUP BY A UNION ALL GROUP BY B UNION ALL GROUP BY C
|
GROUP BY GROUPING SETS ((A,B,C)) | GROUP BY A,B,C |
GROUP BY GROUPING SETS (A,(B,C)) | GROUP BY A UNION ALL GROUP BY B,C
|
GROUP BY GROUPING SETS ((A,C),(B,C)) | GROUP BY A,C UNION ALL GROUP BY B,C
|
Listing 1 demonstrates how to use the GROUPING SETS operator to perform three groupings on three individual columns in a single query.
Listing 1. GROUPING SETS Example
/*** ** Perform a grouping by type, grouping by pub_id, and grouping by price ***/ SELECT type, pub_id, price, sum(isnull(ytd_sales, 0)) AS ytd_sales FROM titles where pub_id < '9' GROUP BY GROUPING SETS ( type, pub_id, price) go
type pub_id price ytd_sales ------------ ------ --------------------- ----------- NULL NULL NULL 0 NULL NULL 0.0006 111 NULL NULL 0.0017 750 NULL NULL 14.3279 4095 NULL NULL 14.595 18972 NULL NULL 14.9532 14294 NULL NULL 14.9611 4095 NULL NULL 15.894 40968 NULL NULL 15.9329 3336 NULL NULL 17.0884 2045 NULL NULL 17.1675 8780 NULL 0736 NULL 28286 NULL 0877 NULL 44219 NULL 1389 NULL 24941 business NULL NULL 30788 mod_cook NULL NULL 24278 popular_comp NULL NULL 12875 psychology NULL NULL 9939 trad_cook NULL NULL 19566
|
In the output in Listing 1, the first 11 rows are the results grouped by price, the next 3 rows are grouped by pub_id, and the bottom 5 rows are grouped by type. Now, you can modify this query to include a super-aggregate for all rows by adding a null field list, as shown in Listing 2.
Listing 2. GROUPING SETS Example with Null Field List to Generate Super-Aggregate
SELECT type, pub_id, price, sum(isnull(ytd_sales, 0)) AS ytd_sales FROM titles where pub_id < '9' GROUP BY GROUPING SETS ( type, pub_id, price, () ) go
type pub_id price ytd_sales ------------ ------ --------------------- ----------- NULL NULL NULL 0 NULL NULL 0.0006 111 NULL NULL 0.0017 750 NULL NULL 14.3279 4095 NULL NULL 14.595 18972 NULL NULL 14.9532 14294 NULL NULL 14.9611 4095 NULL NULL 15.894 40968 NULL NULL 15.9329 3336 NULL NULL 17.0884 2045 NULL NULL 17.1675 8780 NULL NULL NULL 97446 NULL 0736 NULL 28286 NULL 0877 NULL 44219 NULL 1389 NULL 24941 business NULL NULL 30788 mod_cook NULL NULL 24278 popular_comp NULL NULL 12875 psychology NULL NULL 9939 trad_cook NULL NULL 19566
|
If you look closely at the results in Listing 2, you see there are two rows with NULL values for all three columns for type, pub_id, and price.
How can you determine definitively which row is the super-aggregate of
all three rows, and which is a row grouped by price where the value of price is NULL? This is where the new grouping_id() function comes in.
The grouping_id() Function
The grouping_id() function, new in SQL Server 2008, can be used to determine the level of grouping in a query using GROUPING SETS or the CUBE and ROLLUP operators. Unlike the GROUPING() function, which takes only a single column expression as an argument and returns a 1 or 0 to indicate whether that individual column is being aggregated, the grouping_id() function accepts multiple column expressions and returns a bitmap to indicate which columns are being aggregated for that row.
For example, you can add the grouping_id() and grouping() functions to the query in Listing 2 and examine the results (see Listing 3).
Listing 3. Using the grouping_id() Function
SELECT type, pub_id, price, sum(isnull(ytd_sales, 0)) AS ytd_sales, grouping_id(type, pub_id, price) as grping_id, grouping(type) type_rlp, grouping(pub_id) pub_id_rlp, grouping(price) price_rlp FROM titles where pub_id < '9' GROUP BY GROUPING SETS ( type, pub_id, price, () ) go
type pub_id price ytd_sales grping_id type_rlp pub_id_rlp price_rlp ------------ ------ ------- --------- --------- -------- ---------- --------- NULL NULL NULL 0 6 1 1 0 NULL NULL 0.0006 111 6 1 1 0 NULL NULL 0.0017 750 6 1 1 0 NULL NULL 14.3279 4095 6 1 1 0 NULL NULL 14.595 18972 6 1 1 0 NULL NULL 14.9532 14294 6 1 1 0 NULL NULL 14.9611 4095 6 1 1 0 NULL NULL 15.894 40968 6 1 1 0 NULL NULL 15.9329 3336 6 1 1 0 NULL NULL 17.0884 2045 6 1 1 0 NULL NULL 17.1675 8780 6 1 1 0 NULL NULL NULL 97446 7 1 1 1 NULL 0736 NULL 28286 5 1 0 1 NULL 0877 NULL 44219 5 1 0 1 NULL 1389 NULL 24941 5 1 0 1 business NULL NULL 30788 3 0 1 1 mod_cook NULL NULL 24278 3 0 1 1 popular_comp NULL NULL 12875 3 0 1 1 psychology NULL NULL 9939 3 0 1 1 trad_cook NULL NULL 19566 3 0 1 1
|
Unlike the grouping() function, which takes only a single column name as an argument, the grouping_id() function accepts all columns that participate in any grouping set. The grouping_id()
function produces an integer result that is a bitmap, where each bit
represents a different column, producing a unique integer for each
grouping set. The bits in the bitmap indicate whether the columns are
being aggregated in the grouping set (bit value is 1) or if the column is used to determine the grouping set (bit value is 0) used to calculate the aggregate value.
The bit values are assigned to columns from right to left in the order the columns are listed in the grouping_id() function. For example, in the query in Listing 42.5, price is the rightmost bit value, bit 1; pub_id is assigned the next bit value, bit 2, and type is assigned the leftmost bit value, bit 3. When the grouping_id() value equals 6, that means the bits 2 and 3 are turned on (4 + 2 + 0 = 6). This indicates that the type and pub_id columns are being aggregated in the grouping set, and the price column defines the grouping set.
The grouping_id() column can thus be used to determine which of the two rows where type, pub_id, and price are all NULL is the row with the super-aggregate of all three columns (grouping_id = 7), and which row is an aggregate rolled up where the value of price is NULL (grouping_id = 6).
The values returned by the grouping_id()
function can also be used for further filtering your grouping set
results or for sorting your grouping set results, as shown in Listing 4.
Listing 4. Using the grouping_id() Function to Sort Results
SELECT type, pub_id, price, sum(isnull(ytd_sales, 0)) AS ytd_sales,
grouping_id(type, pub_id, price) as grping_id
FROM titles
where pub_id < '9'
GROUP BY GROUPING SETS ( type, pub_id, price, () )
order by grping_id
go
type pub_id price ytd_sales grping_id
------------ ------ -------- ----------- -----------
business NULL NULL 30788 3
mod_cook NULL NULL 24278 3
popular_comp NULL NULL 12875 3
psychology NULL NULL 9939 3
trad_cook NULL NULL 19566 3
NULL 0736 NULL 28286 5
NULL 0877 NULL 44219 5
NULL 1389 NULL 24941 5
NULL NULL NULL 0 6
NULL NULL 0.0006 111 6
NULL NULL 0.0017 750 6
NULL NULL 14.3279 4095 6
NULL NULL 14.595 18972 6
NULL NULL 14.9532 14294 6
NULL NULL 14.9611 4095 6
NULL NULL 15.894 40968 6
NULL NULL 15.9329 3336 6
NULL NULL 17.0884 2045 6
NULL NULL 17.1675 8780 6
NULL NULL NULL 97446 7