programming4us
programming4us
DATABASE

SQL Server 2008 : Transact-SQL Programming - Common Table Expressions

- How To Install Windows Server 2012 On VirtualBox
- How To Bypass Torrent Connection Blocking By Your ISP
- How To Install Actual Facebook App On Kindle Fire
5/30/2011 5:22:11 PM
A common table expression (CTE) is an ANSI SQL-99 expression that produces a table that is referred to by name within the context of a single query. The general syntax for a CTE is as follows:
WITH expression_name [ ( column_name [ ,...n ] ) ]
AS ( CTE_query_definition )

The WITH clause, in effect, defines a table and its columns. Note that the syntax of the WITH clause is similar to that of a view. You can think of a CTE as a temporary view that lasts only for the life of the query that defines the CTE. Listing 1 shows an example of a simple CTE. This CTE is used to return the average and maximum sales quantities for each store. The CTE is then joined to the sales table to return the average and maximum sales quantity for the store, along with sales records for a specific title_id.

Listing 1. An Example of a Simple CTE
with sales_avg (stor_id, avg_qty, max_qty)
as (select stor_id, avg(qty), max(qty)
from sales
group by stor_id)
select top 5 s.stor_id, s.ord_num,
convert(varchar(10), ord_date, 101) as ord_date,
qty, title_id, avg_qty, max_qty
from sales s
join sales_avg a on s.stor_id = a.stor_id
where s.title_id = 'DR8514'
go

stor_id ord_num ord_date qty title_id avg_qty max_qty
------- -------------------- ---------- ------ -------- ------- -------
A004 ONGGGGGGGGGGGGGGG 09/13/2006 1224 DR8514 1008 1716
A068 ONEEEEEEEEEEE 09/02/2007 1572 DR8514 961 1572
A071 ONWWWWWWWWWWWWWWWWWW 08/20/2006 1728 DR8514 948 1728
A161 ONDDDDDDDDDDDD 05/25/2006 624 DR8514 829 1668
A203 ONGGGGGGGGGGGGGGGGGG 11/16/2006 1572 DR8514 1056 1692

Note

If the WITH clause for a CTE is not the first statement in the batch, you should delimit it from the preceding statement by placing a semicolon (;) in front of it. The semicolon is used to avoid ambiguity with other uses of the WITH clause (for example, for table hints). Including a semicolon is not necessary in all cases, but it is recommended that you use it consistently to avoid problems.


It is also possible to define multiple CTEs in a single query, with each CTE delimited by a comma. Each CTE is able to refer to previously defined CTEs. Listing 2 shows an example of a nested CTE that calculates the minimum, maximum, and difference of counts of store orders.

Listing 2. An Example of Multiple CTEs in a Single Query
WITH store_orders(stor_id, cnt)
AS ( SELECT stor_id, COUNT(*) FROM sales
GROUP BY stor_id ),
MinMaxCTE(MN, MX, Diff)
AS ( SELECT MIN(Cnt), MAX(Cnt), MAX(Cnt)-MIN(Cnt)
FROM store_orders )
SELECT * FROM MinMaxCTE
go

MN MX Diff
----------- ----------- -----------
1 22 21

A CTE must be followed by a single SELECT, INSERT, UPDATE, or DELETE statement that references some or all of the CTE columns. A CTE can also be specified in a CREATE VIEW statement as part of the defining SELECT statement of the view.

Listing 3 shows an example of a CTE used in a DELETE statement.

Listing 3. An Example of a CTE in a DELETE
with oldest_sales (stor_id, ord_num, ord_date)
as (select top 100 stor_id, ord_num, ord_date from sales_big order by ord_date)
delete sales_big from sales_big s, oldest_sales o
where s.stor_id = o.stor_id
and s.ord_num = o.ord_num
and s.ord_date = o.ord_date
go

Most valid SELECT statement constructs are allowed in a CTE, except the following:

  • COMPUTE or COMPUTE BY

  • ORDER BY (except when a TOP clause is specified)

  • INTO

  • OPTION clause with query hints

  • FOR XML

  • FOR BROWSE

Recursive Queries with CTEs

Nonrecursive CTEs are ANSI SQL-99 compliant expressions that provide T-SQL coding flexibility. However, for each nonrecursive CTE, there is usually another T-SQL construct that can be used to achieve the same results (for example, derived tables). The real power and capability of CTEs is revealed when you use them to create recursive queries.

A recursive CTE can help simplify the code required to run a recursive query within a SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. Recursive queries are often useful for expanding a hierarchy stored in a relational table (for example, displaying employees in an organizational chart). In previous versions of SQL Server, a recursive query usually required using temporary tables, cursors, and logic to control the flow of the recursive steps.

A CTE is considered recursive when it refers to itself within the CTE definition. Recursive CTEs are constructed from at least two queries. One is a nonrecursive query, also referred to as the anchor member (AM). The other is the recursive query, also referred to as the recursive member (RM). The queries are combined using the UNION ALL operator.

The following pseudocode defines the basic structure of a recursive CTE:

WITH cte_name ( column_name [,...n] )
AS
( CTE_query_definition1 -- Anchor member (AM) is defined.
UNION ALL
CTE_query_definition2 -- Recursive member (RM) is referencing cte_name.
)
-- Statement using the CTE
SELECT col_list FROM cte_name
...

Logically, you can think of the algorithm implementing the recursive CTE as follows:

1.
The anchor member is activated, and the initial result set (R) is generated.

2.
The recursive member is activated, using the initial result set (Rn) as input and generating result set Rn+1.

3.
The logic of step 2 is run repeatedly, incrementing the step number (n) until an empty set is returned.

4.
The outer query is executed, getting the cumulative (UNION ALL) result of all the previous steps when referring to the recursive CTE.

You can have more than two members in a recursive CTE, but only the UNION ALL operator is allowed between a recursive member and another recursive or nonrecursive member. Other operators, such as UNION, are allowed only between nonrecursive members. Recursive CTEs also require an exact match of the columns in all members, including the same data type, length, and precision.

Listing 4 shows a simple recursive CTE that generates a list of sequential numbers. Note that the AM generates the base result, and the RM following the UNION ALL controls the recursion. It is important in this example that a valid endpoint be defined to avoid infinite recursion.

Listing 4. An Example of a Simple Recursive CTE
with numlist (val)
as (select 1
union all
select val + 1
from numlist
where val < 10)
select * from numlist
go

val
-----------
1
2
3
4
5
6
7
8
9
10

The following sections present some examples and uses of recursive CTEs.

Using Recursive CTEs for Expanding a Hierarchy

For this hierarchy example, we use the parts table in the bigpubs2008 database. This table contains a simplified hierarchy of car parts, as shown in Figure 1.

Figure 1. The parts table hierarchy.

In the parts table, any part that is a subpart of another part has the parent part ID stored in the parentpartid column. The parentpartid column is a foreign key that references the partid column. Therefore, the parentpartid must either correspond to a valid partid within the table or be NULL. For example, the car itself has NULL in the parentpartid column.

Following are some common requests that might be run on the parts table:

  • Return all the parts for the engine.

  • Show me all parts that are two levels below the drivetrain.

  • Show me all the parts in such a way that it is easy to see their hierarchical dependencies.

The first request is probably the most common one: returning a part (for example, the engine, which has partid = 2) and all subparts. The recursive CTE shown in Listing 5 provides a solution to this request.

Listing 5. A Recursive CTE to Return a Part and All Subparts
WITH PartsCTE(partid, partname, parentpartid, lvl)
AS
(
SELECT partid, partname, parentpartid, 0
FROM PARTS
WHERE partid = 2 -- Engine
UNION ALL

SELECT P.partid, P.partname, P.parentpartid, PP.lvl+1
FROM Parts as P
JOIN PartsCTE as PP
ON P.parentpartid = PP.Partid
)
SELECT PartID, Partname, ParentPartid, lvl
FROM PartsCTE
go

PartID Partname ParentPartid lvl
----------- ------------------------------ ------------ -----------
2 Engine 1 0
5 Radiator 2 1
6 Intake Manifold 2 1
7 Exhaust Manifold 2 1
8 Carburetor 2 1
13 Piston 2 1
14 Crankshaft 2 1
21 Piston Rings 13 2
11 Float Valve 8 2


Notice that the lvl value is repeatedly incremented with each recursive invocation of the CTE. You can use this level counter to limit the number of iterations in the recursion. For example, Listing 6 shows an example of a CTE that returns all parts two levels below the drivetrain.

Listing 6. A Recursive CTE to Return All Subparts Two Levels Below a Part
WITH PartsCTE(partid, partname, parentpartid, lvl)
AS
(
SELECT partid, partname, parentpartid, 0
FROM PARTS
WHERE partid = 1 -- Drivetrain
UNION ALL

SELECT P.partid, P.partname, P.parentpartid, PP.lvl+1
FROM Parts as P
JOIN PartsCTE as PP
ON P.parentpartid = PP.Partid
where lvl < 2
)
SELECT PartID, Partname, ParentPartid, lvl
FROM PartsCTE
where lvl = 2
go

PartID Partname ParentPartid lvl
----------- ------------------------------ ------------ -----------
9 Flywheel 3 2
10 Clutch 3 2
16 Gear Box 3 2
5 Radiator 2 2
6 Intake Manifold 2 2
7 Exhaust Manifold 2 2
8 Carburetor 2 2
13 Piston 2 2
14 Crankshaft 2 2



In Listing 6, the filter WHERE lvl < 2 in the recursive member is used as a recursion termination check; recursion stops when lvl = 2. The filter on the outer query (WHERE lvl = 2) is used to remove all parts up to the second level. Logically, the filter in the outer query (lvl = 2) is sufficient by itself to return only the desired rows, but for performance reasons, you should include the filter in the recursive member to stop the recursion as soon as two levels below the drivetrain are returned.

SQL Server allows the use of local variables in a CTE to help make the query more generic. For example, you can use variables instead of constants for the part ID and level, as shown in Listing 7.

Listing 7. Using Local Variables in a Recursive CTE
DECLARE @partid AS INT, @lvl AS INT;
SET @partid = 22; -- Car
SET @lvl = 2; -- two levels
WITH PartsCTE(partid, partname, parentpartid, lvl)
AS
(
SELECT partid, partname, parentpartid, 0
FROM PARTS
WHERE partid = @partid
UNION ALL

SELECT P.partid, P.partname, P.parentpartid, PP.lvl+1
FROM Parts as P
JOIN PartsCTE as PP
ON P.parentpartid = PP.Partid
WHERE lvl < @lvl
)
SELECT PartID, Partname, ParentPartid, lvl
FROM PartsCTE
Go

PartID Partname ParentPartid lvl
----------- ------------------------------ ------------ -----------
22 Car NULL 0
1 DriveTrain 22 1
23 Body 22 1
24 Frame 22 1
2 Engine 1 2
3 Transmission 1 2
4 Axle 1 2
12 Drive Shaft 1 2



You can also use recursive CTEs to perform aggregations, such as counting the total number of subparts that make up each parent part, as shown in Listing 8.

Listing 8. Performing Aggregation with a Recursive CTE
WITH PartsCTE(parentpartid, lvl)
AS
(
SELECT parentpartid, 0
FROM PARTS
WHERE parentpartid is not null
UNION ALL
SELECT P.parentpartid, lvl+1
FROM Parts as P
JOIN PartsCTE as PP
ON PP.parentpartid = P.Partid
WHERE P.parentpartid is not null
)
SELECT C.parentpartid, P.PartName, COUNT(*) AS cnt
FROM PartsCTE C
JOIN PArts P on C.ParentPartID = P.PartID
GROUP BY C.parentpartid, P.PArtName
go

parentpartid PartName cnt
------------ ------------------------------ -----------
1 DriveTrain 20
2 Engine 8
3 Transmission 8
8 Carburetor 1
13 Piston 1
16 Gear Box 5
22 Car 23



In the example in Listing 8, the anchor member returns a row with the parentpartid for each part, being sure to filter out the NULL value in the parentpartid column because it is essentially the top of the hierarchy and represents no parent part. The recursive member returns the parentpartid of each parent of the previously returned parts, again excluding any NULL values. Eventually, the CTE contains, for each part, as many occurrences as their direct or indirect number of subparts. The outer query is then left with the tasks of grouping the results by parentpartid and returning the count of occurrences. A join to Parts is included to get the corresponding partname for each parent part to provide more meaningful results.

Suppose you want to generate a report that is a bit more readable, with the subparts sorted and indented according to hierarchical dependencies. Listing 9 provides a way you could accomplish this.

Listing 9. Generating a Formatted Report with a Recursive CTE
WITH PartsCTE(partid, partname, parentpartid, lvl, sortcol)
AS
(
SELECT partid, partname, parentpartid, 0, cast(partid as varbinary(max))
FROM Parts
WHERE partid = 22
UNION ALL
SELECT P.partid, P.partname, P.parentpartid, PP.lvl+1,
CAST(sortcol + CAST(P.partid AS BINARY(4)) AS VARBINARY(max))
FROM Parts AS P
JOIN PartsCTE AS PP
ON P.parentpartID = PP.PartID
)
SELECT
REPLICATE('--', lvl)
+ right('>',lvl)
+ partname AS partname
FROM PArtsCTE
order by sortcol
go

partname
-----------------------------------
Car
-->DriveTrain
---->Engine
------>Radiator
------>Intake Manifold
------>Exhaust Manifold
------>Carburetor
-------->Float Valve
------>Piston
-------->Piston Rings
------>Crankshaft
---->Transmission
------>Flywheel
------>Clutch
------>Gear Box
-------->Reverse Gear
-------->First Gear
-------->Second Gear
-------->Third Gear
-------->Fourth Gear
---->Axle
---->Drive Shaft
-->Body
-->Frame



In this example, you use a varbinary string as the sortcol to sort subparts according to the partid value. The anchor member is the starting point, generating a binary value for the partid of the root part. In each iteration, the recursive member appends the current part ID, converted to a binary value, to the parent part ID’s sortcol. The outer query then sorts the result by sortcol, which groups the subparts under each immediate parent part.

Setting the MAXRECURSION Option

To help avoid infinite recursion in CTEs, SQL Server, by default, sets a MAXRECURSION value of 100. If a recursive CTE attempts to perform more than 100 recursions, it is aborted, with the following error message:

Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before
statement completion.


You can override the default MAXRECURSION setting by using the OPTION(MAXRECURSION value) query hint to force termination of the query after a specific number of recursive iterations have been invoked. Listing 10 shows an example.
Listing 10. Controlling the Number of Recursions with MAXRECURSION
WITH PartsCTE(partid, partname, parentpartid, lvl)
AS
(
SELECT partid, partname, parentpartid, 0
FROM PARTS
WHERE partid = 22 -- Car
UNION ALL

SELECT P.partid, P.partname, P.parentpartid, PP.lvl+1
FROM Parts as P
JOIN PartsCTE as PP
ON P.partid = PP.Partid
)
SELECT PartID, Partname, ParentPartid, lvl
FROM PartsCTE
OPTION (MAXRECURSION 10)
go

Msg 530, Level 16, State 1, Line 2
The statement terminated. The maximum recursion 10 has been exhausted before
statement completion.



Keep in mind that if you use MAXRECURSION to control the number of levels of recursion in a CTE, your application receives the error message. It is not considered good programming practice to use code that returns errors in valid situations. Certain applications may discard query results if an error message is received. Instead, it is recommended that you use the level counter to limit recursion, in Listing 7. You should use the MAXRECURSION hint as a safeguard against infinite loops due to bad data or as a coding safeguard.

Other  
  •  SQL Server 2008 : Transact-SQL Programming - The OUTPUT Clause
  •  SQL Server 2008 : Transact-SQL Programming - TOP Enhancements
  •  SQL Server 2008 : Transact-SQL Programming - The max Specifier
  •  SQL Server 2008 : T-SQL Tips and Tricks (part 3) - Generating T-SQL Statements with T-SQL & De-Duping Data with Ranking Functions
  •  SQL Server 2008 : T-SQL Tips and Tricks (part 2) - Using CONTEXT_INFO & Working with Outer Joins
  •  SQL Server 2008 : T-SQL Tips and Tricks (part 1) - Date Calculations & Sorting Results with the GROUPING Function
  •  SQL Server 2008 : General T-SQL Performance Recommendations
  •  SQL Server 2008 : General T-SQL Coding Recommendations (part 2) - Avoid SQL Injection Attacks When Using Dynamic SQL & Comment Your T-SQL Code
  •  SQL Server 2008 : General T-SQL Coding Recommendations (part 1) - Provide Explicit Column Lists & Qualify Object Names with a Schema Name
  •  SQL Server 2008 : Advanced Stored Procedure Programming and Optimization - Using Extended Stored Procedures
  •  
    programming4us
     
     
    programming4us