Let’s face it: Users usually want to see data in
tabular format, which is a bit of a challenge given that data in SQL
Server is most often stored in a highly relational form. PIVOT is a new T-SQL operator that you can specify in your FROM clause to rotate rows into columns and create a traditional crosstab query.
Using PIVOT is easy. In your SELECT
statement, you specify the values you want to pivot on. The following
example using the AdventureWorks database uses the order years
(calculated using the DatePart function) as the columns. The FROM clause looks normal except for the PIVOT
statement. This statement creates the value you want to show in the
rows of the newly created columns. This example uses the aggregate SUM of TotalDue (a calculated field in the FROM clause). Then we use the FOR operator to list the values we want to pivot on in the OrderYear column. The example is shown here:
Listing 1. Creating tabular results with the PIVOT operator
SELECT CustomerID, [2001] AS Y2001, [2002] AS Y2002, [2003] AS Y2003, [2004] AS Y2004 FROM ( SELECT CustomerID, DATEPART(yyyy, OrderDate) as OrderYear, TotalDue FROM Sales.SalesOrderHeader ) piv PIVOT ( SUM (TotalDue) FOR OrderYear IN ([2001], [2002], [2003], [2004]) ) AS chld ORDER BY CustomerID
|
Here are the results:
CustomerID Y2001 Y2002 Y2003 Y2004
-----------------------------------------------------------------
1 40732.6067 72366.1284 NULL NULL
2 NULL1 5653.6715 12118.0275 4962.2705
3 39752.8421 168393.7021 219434.4265 51925.3549
4 NULL 263025.3113 373484.299 143525.6018
5 NULL 33370.6901 60206.9999 20641.1106
6 NULL NULL 668.4861 2979.3473
7 NULL 6651.036 3718.7804 NULL
8 NULL NULL 19439.2466 10900.0347
9 NULL 320.6283 11401.5975 5282.8652
10 NULL 96701.7401 291472.2172 204525.9634
[more rows]
That’s
all there is to it. Of course, this example is simplified to show you
the new concept; other, more sophisticated aggregates are possible, and
you can even use CTEs in the FROM clause. In any case, using PIVOT is simple.
Listing 2 shows PIVOT in an SQL statement and the corresponding Visual C# code to fill the DataGrid.
Listing 2. Using PIVOT in Visual C#
private void Page_Load(object sender, System.EventArgs e) { //Set up the connection string to SQL Server 2005. //If using integrated security, you have to grant permission to the //ASP.NET worker process.
SqlConnection conn =new SqlConnection("server=KILIMANJARO;integrated security=SSPI; database=AdventureWorks"); //Open the connection to SQL Server (SQL Server 2005) conn.Open();
//Create a new SQL string using the new T-SQL keyword PIVOT. //You have to know something about your data. string strSQL= @" SELECT CustomerID, [2001] AS Y2001, [2002] AS Y2002, [2003] AS Y2003, [2004] AS Y2004 FROM ( SELECT CustomerID, datepart(yyyy, OrderDate) AS OrderYear, TotalDue FROM Sales.SalesOrderHeader ) piv PIVOT ( SUM (TotalDue) FOR OrderYear IN ([2001], [2002], [2003], [2004]) ) AS chld ORDER BY CustomerID";
//Can set a command to a SQL text and connection SqlCommand cmd = new SqlCommand(strSQL, conn); SqlDataReader dr;
//Open a data reader. dr = cmd.ExecuteReader(CommandBehavior.CloseConnection); //Do the data binding to a simple data grid. //Using a data grid for code simplicity to avoid complex data binding. DataGrid1.DataSource = dr; DataGrid1.DataBind(); }
|
Using UNPIVOT
You can use the new UNPIVOT
operator to normalize data that is already pivoted. For example,
suppose you have a table in your database that looks like this:
VendorID | Emp1 | Emp2 | Emp3 | Emp4 | Emp5 |
---|
1 | 4 | 3 | 5 | 4 | 4 |
2 | 4 | 1 | 5 | 5 | 5 |
3 | 4 | 3 | 5 | 4 | 4 |
4 | 4 | 2 | 5 | 4 | 4 |
5 | 5 | 1 | 5 | 5 | 5 |
You might want to unpivot the data to display columns for VendorID, Employee, and Number of Orders. Listing 3 shows how to use the UNPIVOT operator to achieve this goal.
Listing 3. Using the UNPIVOT operator
SELECT VendorID, Employee, Orders AS NumberOfOrders FROM (SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5 FROM pvt) p UNPIVOT (Orders FOR Employee IN (Emp1, Emp2, Emp3, Emp4, Emp5) )AS unpvt GO
|
Here are the results:
VendorID Employee NumberOfOrders
------------------------------------------
1 Emp 14
1 Emp 23
1 Emp 35
1 Emp 44
2 Emp 14
[more rows]
Dynamically Pivoting Columns
The problem with PIVOT is the same problem with CASE and other methods: You have to specify the columns. Consider this code.
Listing 4. Statically driven PIVOT
SELECT * FROM (SELECT CustomerID, YEAR(OrderDate) AS orderyear, TotalDue FROM Sales.SalesOrderHeader) as Header PIVOT(SUM(TotalDue) FOR orderyear IN([2002],[2003],[2004])) as Piv
|
The results show us a nice crosstab query with the years displayed as columns:
CustomerID 2002 2003 2004
----------------------------------------------------------
14324 NULL 2264.2536 3394.9247
22814 NULL 5.514 NULL
11407 NULL 59.659 NULL
28387 NULL NULL 645.2869
19897 NULL NULL 659.6408
15675 2699.9018 2682.9953 2580.1529
24165 NULL 2699.9018 666.8565
[more rows]
Because
this data only goes up to 2004, what happens when you add 2005 to the
data? Do you want to go into all your queries and add the current year
to the ININ
clause and then programmatically writing the entire SQL statement. Once
you have dynamically written the SQL statement, you can execute it
using sp_executesql, as shown in Listing 5. The results are exactly the same as those shown following Listing 2-14, except that as new yearly data is added to the table, the query dynamically adds the column for it. clause? We can accommodate new years in the data by dynamically building the
Listing 5. Dynamically driven PIVOT
--Dynamic Pivot DECLARE @tblOrderDate AS TABLE(y int NOT NULL PRIMARY KEY) INSERT INTO @tblOrderDate SELECT DISTINCT YEAR(OrderDate) FROM Sales.SalesOrderHeader -- Construct the column list for the IN clause -- e.g., [2002],[2003],[2004] DECLARE @cols AS nvarchar(MAX), @years AS int SET @years = (SELECT MIN(y) FROM @tblOrderDate) SET @cols = N'' WHILE @years IS NOT NULL BEGIN SET @cols = @cols + N',['+CAST(@years AS nvarchar(10))+N']' SET @years = (SELECT MIN(y) FROM @tblOrderDate WHERE y > @years) END SET @cols = SUBSTRING(@cols, 2, LEN(@cols)) -- Construct the full T-SQL statement and execute it dynamically. DECLARE @sql AS nvarchar(MAX) SET @sql = N'SELECT * FROM (SELECT CustomerID, YEAR(OrderDate) AS orderyear, TotalDue FROM Sales.SalesOrderHeader) as Header PIVOT(SUM(TotalDue) FOR orderyear IN(' + @cols + N')) AS Piv' --PRINT @sql -- for debugging EXEC sp_executesql @sql
|
You can accomplish the same results using the new CTE syntax instead of using the TABLE variable:
Listing 6. Dynamic CTE PIVOT
--Dynamic Pivot with CTE and "SELECT @cols = ..." syntax -- Construct the column list for the IN clause -- e.g., [2002],[2003],[2004] DECLARE @cols AS nvarchar(MAX) ; WITH YearsCTE AS (SELECT DISTINCT YEAR(OrderDate) as [Year] FROM Sales.SalesOrderHeader) SELECT @cols = ISNULL(@cols + ',[', '[') + CAST([YEAR] AS nvarchar(10)) + ']' FROM YearsCTE ORDER BY [YEAR]
-- Construct the full T-SQL statement and execute it dynamically. DECLARE @sql AS nvarchar(MAX) SET @sql = N'SELECT * FROM (SELECT CustomerID, YEAR(OrderDate) AS orderyear, TotalDue FROM Sales.SalesOrderHeader) as Header PIVOT(SUM(TotalDue) FOR orderyear IN(' + @cols + N')) AS Piv' PRINT @sql -- for debugging EXEC sp_executesql @sql
|
The APPLY Operator
APPLY is a new operator that you specify in the FROM clause of a query. It enables you to invoke a table-valued function for each row of an outer table. The flexibility of APPLY is evident when you use the outer table’s columns as your function’s arguments. The APPLY operator has two forms: CROSS APPLY and OUTER APPLY. CROSS APPLY doesn’t return the outer table’s row if the table-valued function returns an empty set for it; the OUTER APPLY returns a row with NULL values instead of the function’s columns.
To see how APPLY works, we’ll first create a FUNCTION that returns a TABLE. Here is a simple function that returns as a TABLE the top n rows from the SalesOrderHeader table:
Listing 7. Returning a TABLE
USE AdventureWorks GO CREATE FUNCTION fn_getcustomerorders(@CustomerID int, @TopRecords bigint) RETURNS TABLE AS RETURN SELECT TOP (@TopRecords) * FROM Sales.SalesOrderHeader WHERE CustomerID = @CustomerID ORDER BY OrderDate DESC
|
After creating the TABLE returning function fn_getcustomerorders, we call it from the query:
Listing 8. Executing query with APPLY
SELECT * FROM Sales.Customer cust CROSS APPLY fn_getcustomerorders(cust.CustomerID, 647)
|
This
query returns all the records from the Customers table and then, as
additional fields, the records from the Orders table (by way of the fn_getcustomerorders function) that match for the customer ID because that’s what’s being passed in dynamically as an argument.