programming4us
programming4us
DATABASE

Exploring the T-SQL Enhancements in SQL Server 2005 : Common Table Expressions

- How To Install Windows Server 2012 On VirtualBox
- How To Bypass Torrent Connection Blocking By Your ISP
- How To Install Actual Facebook App On Kindle Fire
10/9/2010 5:44:17 PM
A common table expression (CTE) closely resembles a nonpersistent view. It is a temporary named result set that you define in your query that will be used by the FROM clause of the query. Each CTE is defined only once (but can be referred to as many times as it is in scope) and lives for as long as the query lives. You can use CTEs to perform recursive operations. Here is the syntax to create a CTE:
WITH <name of your CTE>(<column names>)
AS
(
<actual query>
)

SELECT * FROM <name of your CTE>

Note

SQL Server 2005 also ships with a new database called AdventureWorks, which is like a Northwind database on steroids. Most of our examples will use AdventureWorks, which has much more functionality than Northwind. We will call out anything new and exciting about it as we bump into the features, but for most of this book all you have to know is that it is a basic order-entry system.


An example of a simple CTE using AdventureWorks is shown here:

Listing 1. A simple CTE
WITH AllMRContacts
AS
(SELECT * FROM Person.Contact WHERE Title='Mr.')

SELECT LastName + ', ' + FirstName AS Contact
FROM AllMRContacts
ORDER BY LastName

The results are shown here:

Contact
-------
Abbas, Syed
Achong, Gustavo
Adams, Jay
Adams, Ben
Adina, Ronald
Agcaoili, Samuel
[more rows]

The following example gets a count of all the sales a salesperson made in the AdventureWorks orders system as a CTE and then executes a simple inner join with the SalesPerson table to return more information about the salesperson, such as his quota. This demonstrates how a CTE is joined to your calling query. You can do this without a CTE, but think about all the times you have created a temp table or a throwaway view and joined back to it—now you can use a CTE instead and keep the complexity of aggregating in the CTE only, thereby simplifying your code. The code is shown here:

Listing 2. CTE-to-query join
USE AdventureWorks
GO

WITH OrderCountCTE(SalesPersonID, OrderCount)
AS
(
SELECT SalesPersonID, COUNT(*)
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY SalesPersonID
)
SELECT sp.SalesPersonID, oc.OrderCount, sp.SalesYTD
FROM Sales.SalesPerson sp
INNER JOIN OrderCountCTE oc ON oc.SalesPersonID=sp.SalesPersonID
ORDER BY oc.OrderCount

The results look like this:

SalesPersonID     OrderCount      SalesYTD
------------------------------------------
288 16 219088.8836
284 39 636440.251
268 48 677558.4653
280 60 0.00
290 109 1758385.926
289 130 2241204.0424
287 140 1931620.1835
[more rows]

CTEs can also eliminate self-joins in some of your queries. Take a look at this example. We will create a table called Products and insert duplicates into the Product Name field.

Listing 3. Inserting duplicates into AdventureWorks’ Product_Name
CREATE TABLE Products (
Product_ID int NOT NULL,
Product_Name varchar (25),
Price money NULL,
CONSTRAINT PK_Products PRIMARY KEY NONCLUSTERED (Product_ID)
)
GO
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (1, 'Widgets', 25)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (2, 'Gadgets', 50)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (3, 'Thingies', 75)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (4, 'Whoozits', 90)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (5, 'Whatzits', 5)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (6, 'Gizmos', 15)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (7, 'Widgets', 24)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (8, 'Gizmos', 36)
INSERT INTO Products (Product_ID, Product_Name, Price) VALUES (9, 'Gizmos', 36)
GO


One common problem found in databases is having duplicate product names with different product IDs. If you run a duplicate-finding query, that query will return all the records (the duplicates and the good values). This increases the difficulty of automatically deleting duplicates. If you want to find the Product Name duplicates without also including the first instance of the name in the table, you can use a self-join like this:

Listing 4. Self-join without CTE
SELECT * FROM Products WHERE Product_ID NOT IN
(SELECT MIN(Product_ID) FROM Products AS P
WHERE Products.Product_Name = P.Product_Name)

The self-join returns data like this:

Product_ID       Product_Name        Price
------------------------------------------
8 Gizmos 36.00
9 Gizmos 36.00
7 Widgets 24.00

You can also rewrite your query using a CTE to eliminate the confusing-looking self-join and get the same results. This technique does not offer a performance gain over self-joins; it is just a convenience for code maintainability. The previous self-join example is rewritten here as a CTE and yields the same results:

Listing 5. Self-join as a CTE
WITH CTEMinProductRecords AS (
SELECT MIN(Product_ID) AS Product_ID, Product_Name
FROM Products
GROUP BY Product_Name
HAVING COUNT(*) > 1)
SELECT *
FROM Products JOIN CTEMinProductRecords ON
Products.Product_Name = CTEMinProductRecords.Product_Name
AND Products.Product_ID > CTEMinProductRecords.Product_ID

After you investigate your duplicates using the preceding CTE, you might want to delete the duplicate data. You might also want to update any foreign keys in related tables to use the original Product_ID value. If your duplicate data does not have any related child rows in another table, or if you have updated them to the correct Product_ID, you can delete the duplicate data by just rewriting the CTE like this, replacing the SELECT * with a DELETE:

Listing 6. Deleting duplicates in a CTE
WITH CTEMinProductRecords AS (
SELECT MIN(Product_ID) AS Product_ID, Product_Name
FROM Products
GROUP BY Product_Name
HAVING COUNT(*) > 1)
DELETE Products
FROM Products JOIN CTEMinProductRecords ON
Products.Product_Name = CTEMinProductRecords.Product_Name
AND Products.Product_ID > CTEMinProductRecords.Product_ID

Recursive Queries with CTEs

The true power of CTEs emerges when you use them recursively to perform hierarchical queries on tree-structured data. In fact, this was a major reason that Microsoft built CTEs, in addition to SQL-92 compliance. A recursive CTE is constructed from a minimum of two queries. The first, the anchor member (AM), is a nonrecursive query; the second, the recursive member (RM), is the recursive query. Within your CTE’s parentheses (after the AS clause), you define queries that are independent or refer back to the same CTE. The AM and RM are separated by a UNION ALL statement. AMs are invoked only once; RMs are invoked repeatedly until the query returns no rows. You can append multiple AMs to each other using a UNION or UNION ALL operator, depending on whether you want to eliminate duplicates. (You must append recursive members using a UNION ALL operator.) Here is the syntax:

WITH SimpleRecursive(field names)
AS
(
<Select Statement for the Anchor Member>

UNION ALL

<Select Statement for the Recursive Member>
)

SELECT * FROM SimpleRecursive

The following example demonstrates this feature. We’ll create a table of employees and a self-referencing field back to Employee_ID called ReportsTo. We’ll then write a query that returns all the employees who report to Stephen (Employee_ID=2) and all the employees who report to Stephen’s subordinates.

Listing 7. Example table for recursive CTE queries
--create a table with tree data
CREATE TABLE Employee_Tree (Employee_NM nvarchar(50), Employee_ID int PRIMARY KEY, Repor
tsTo int)
--insert some data, build a reporting tree
INSERT INTO Employee_Tree VALUES('Richard', 1, NULL)
INSERT INTO Employee_Tree VALUES('Stephen', 2, 1)
INSERT INTO Employee_Tree VALUES('Clemens', 3, 2)
INSERT INTO Employee_Tree VALUES('Malek', 4, 2)
INSERT INTO Employee_Tree VALUES('Goksin', 5, 4)
INSERT INTO Employee_Tree VALUES('Kimberly', 6, 1)
INSERT INTO Employee_Tree VALUES('Ramesh', 7, 5)


Our table looks like this:

Employee_NMEmployee_IDReportsTo
Richard1Null
Stephen21
Clemens32
Malek42
Goksin54
Kimberly61
Ramesh75

Here’s the recursive query to determine which employees report to Stephen:

Listing 8. Recursive CTE query
--Recursive Query
WITH SimpleRecursive(Employee_NM, Employee_ID, ReportsTo)
AS
(SELECT Employee_NM, Employee_ID, ReportsTo
FROM Employee_Tree WHERE Employee_ID = 2
UNION ALL
SELECT p.Employee_NM, p.Employee_ID, p.ReportsTo
FROM Employee_Tree p INNER JOIN
SimpleRecursive A ON A.Employee_ID = P.ReportsTo
)
SELECT sr.Employee_NM AS Employee, et.Employee_NM AS Boss
FROM SimpleRecursive sr INNER JOIN Employee_Tree et
ON sr.ReportsTo = et.Employee_ID

Here are the results:

Employee       Boss
-------------------
Stephen Richard
Clemens Stephen
Malek Stephen
Goksin Malek
Ramesh Goskin

This recursion starts where Employee_ID = 2 (the Anchor Member or the first SELECT). It picks up that record and then, using the Recursive Member (the SELECT after the UNION ALL), picks up all the records that report to Stephen and that record’s children. (Goksin reports to Malek, and Malek reports to Stephen.) Each subsequent recursion tries to find more children that have as parents the employees found by the previous recursion. Eventually the recursion returns no results, and that is what causes the recursion to stop (the reason why Kimberly is not returned). If the anchor member is changed to Employee_ID = 1, then Kimberly will also be returned in the results.

By design, the recursive member keeps looking for children and can cycle on indefinitely. If you suspect many cycles will occur and want to limit the number of recursive invocations, you can specify the MAXRECURSION option right after the outer query using the OPTION clause:

OPTION(MAXRECURSION 25)

This option causes SQL Server 2005 to raise an error when the CTE exceeds the specified limit. By default, the limit is 100 (if you’ve omitted the option). To specify no option, you must set MAXRECURSION to 0. You can also run the same query to find direct reports and subordinates only one level deep (including direct reports Clemens and Malek and Malek’s subordinate Goksin but skipping Ramesh, which is three levels deep).

Listing 9. Recursive query with MAXRECURSION
--Recursive Query with MAXRECURSION
WITH SimpleRecursive(Employee_NM, Employee_ID, ReportsTo)
AS
(SELECT Employee_NM, Employee_ID, ReportsTo
FROM Employee_Tree WHERE Employee_ID = 2
UNION ALL
SELECT p.Employee_NM, p.Employee_ID, p.ReportsTo
FROM Employee_Tree p INNER JOIN
SimpleRecursive A ON A.Employee_ID = p.ReportsTo
)
SELECT sr.Employee_NM AS Employee, et.Employee_NM AS Boss
FROM SimpleRecursive sr INNER JOIN Employee_Tree et
ON sr.ReportsTo = et.Employee_ID
OPTION(MAXRECURSION 2)

Here are the results:

Employee       Boss
-------------------
Stephen Richard
Clemens Stephen
Malek Stephen
Goksin Malek

You will also see that the query raises the following error message:

Msg 530, Level 16, State 1, Line 2
The statement terminated. The maximum recursion 2 has been exhausted before statement completion.


One way to avoid the exception is to use a generated column to keep track of the level you are on and include that in the WHERE clause instead of MAXRECURSION. The following example returns the same data as the previous example but without the error.

Listing 10. Controlling recursion without MAXRECURSION
WITH SimpleRecursive(Employee_NM, Employee_ID, ReportsTo, SubLevel)
AS
(SELECT Employee_NM, Employee_ID, ReportsTo, 0
FROM Employee_Tree WHERE Employee_ID = 2
UNION ALL
SELECT p.Employee_NM, p.Employee_ID, p.ReportsTo, SubLevel + 1
FROM Employee_Tree p INNER JOIN
SimpleRecursive A ON A.Employee_ID = p.ReportsTo
)
SELECT sr.Employee_NM AS Employee, et.Employee_NM AS Boss
FROM SimpleRecursive sr INNER JOIN Employee_Tree et
ON sr.ReportsTo = et.Employee_ID
WHERE SubLevel <=2
Other  
 
Top 10
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Finding containers and lists in Visio (part 2) - Wireframes,Legends
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Finding containers and lists in Visio (part 1) - Swimlanes
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Formatting and sizing lists
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Adding shapes to lists
- Microsoft Visio 2013 : Adding Structure to Your Diagrams - Sizing containers
- Microsoft Access 2010 : Control Properties and Why to Use Them (part 3) - The Other Properties of a Control
- Microsoft Access 2010 : Control Properties and Why to Use Them (part 2) - The Data Properties of a Control
- Microsoft Access 2010 : Control Properties and Why to Use Them (part 1) - The Format Properties of a Control
- Microsoft Access 2010 : Form Properties and Why Should You Use Them - Working with the Properties Window
- Microsoft Visio 2013 : Using the Organization Chart Wizard with new data
REVIEW
- First look: Apple Watch

- 3 Tips for Maintaining Your Cell Phone Battery (part 1)

- 3 Tips for Maintaining Your Cell Phone Battery (part 2)
programming4us programming4us
programming4us
 
 
programming4us