More About Common Table Expressions in SQL Server 2005
This is the second part of a tutorial focusing on common table expressions in SQL Server 2005. In the previous article, I explained non-recursive CTEs. Therefore, in this article, I shall mainly focus on recursive CTEs in SQL Server 2005.
If you are new to CTEs, I strongly suggest you go through my first article on the topic here.
All the examples in this article were tested using SQL Server 2005 Enterprise Edition. Please note that I didn’t really test the examples on any of other versions/editions or similar suites of Microsoft products (even though I strongly believe that the examples will work on all SQL Server 2005 editions).
You also need to consider that the examples are not at all optimized for performance. They may not be used for a real production environment. I just wanted to introduce the concept of recursive CTEs through this article without any discussion of performance tuning (which is beyond the scope of this article).
Coming to the topic, what exactly is a “recursive CTE”? A CTE which includes references to itself within its own body is called a recursive CTE. Recursion is generally specified using any SET operator (like UNION ALL, UNION, INTERSECT, or EXCEPT) within the CTE. The second part of the query (which is after the SET operator) must refer to the CTE itself, using it as a key to the next level of recursion.
Some clauses like DISTINCT, GROUP BY, HAVING, TOP, LEFT, RIGHT, OUTER JOIN, (INNER JOIN is allowed) along with scalar aggregations, sub queries and so forth are not allowed within a CTE recursive member definition.
A recursive common table expression: an example
Before talking too much, let us start with a simple example.
SELECT
EmployeeID
FROM HumanResources.Employee
WHERE ManagerID = 140
The above example (even though it is not a CTE) simply displays all EmployeeIDs, who are under manager 140 (the manager id or employee id of the manager). But the employees under manager 140 may have their own subordinates. So within the above list we are covering only the first level of subordinates under manager 140, and not really listing all the levels of subordinates under him or her.
Now let us work with our new CTE example:
WITH EmpCTE
AS
(
SELECT
EmployeeID, ManagerID
FROM HumanResources.Employee
WHERE ManagerID = 140
UNION ALL
SELECT
b.EmployeeID, b.ManagerID
FROM HumanResources.Employee b
JOIN EmpCTE AS a ON a.EmployeeID = b.ManagerID
)
SELECT EmployeeID,ManagerID
FROM EmpCTE
The sample output for the above CTE would be something like the following:
EmployeeID ManagerID
---------- ---------
30 140
71 140
103 140
139 140
59 139
94 139
130 139
.
.
.
82 30
154 30
191 30
I shall explain the above script in the next section.
This section explains the script provided in the previous section. Let me explain it part by part. Let us first consider the following code fragment:
SELECT
EmployeeID, ManagerID
FROM HumanResources.Employee
WHERE ManagerID = 140
It is not surprising that it retrieves the list of all employees working under manager 140. Let us add a very funny structure to the above SELECT statement which ends up as follows:
WITH EmpCTE
AS
(
SELECT
EmployeeID, ManagerID
FROM HumanResources.Employee
WHERE ManagerID = 140
Even though we don’t come across any remarkable modifications, it has its own depth. Every row retrieved by the SELECT statement will be treated as part of the temporary table named “EmpCTE.” At this point, it is not of much interest. Let us proceed with the following statement now:
SELECT
b.EmployeeID, b.ManagerID
FROM HumanResources.Employee b
JOIN EmpCTE AS a ON a.EmployeeID = b.ManagerID
This is the most confusing statement you would ever see (or understand). The above statement works with every row present in “EmpCTE.” It primarily retrieves all the subordinates of each employee existing in “EmpCTE” and adds them again to the same “EmpCTE” with the help of the UNION ALL operator.
This process gets executed recursively for every new row added (or being added) to the same “EmpCTE.” Finally we retrieve all the information from “EmpCTE” using the following statement:
Let us start with another example, which gives the level of every employee.
SELECT
EmployeeID, ManagerID,0 AS EmployeeLevel
FROM HumanResources.Employee
WHERE ManagerID is Null
The above example simply displays EmployeeID, ManagerID and EmployeeLevel as "0," for those who are not working under any manager (in this case, it gives us only one employee, who is the CEO).
Now let us work with a recursive CTE:
WITH EmpCTE
AS
(
SELECT
EmployeeID, ManagerID,0 AS EmployeeLevel
FROM HumanResources.Employee
WHERE ManagerID is Null
UNION ALL
SELECT
b.EmployeeID, b.ManagerID,EmployeeLevel+1
FROM HumanResources.Employee b
INNER JOIN EmpCTE AS a ON a.EmployeeID = b.ManagerID
)
SELECT EmployeeID,ManagerID,EmployeeLevel
FROM EmpCTE
Order By EmployeeLevel
The sample output for the above CTE would be something like the following:
EmployeeID ManagerID EmployeeLevel
---------- --------- -------------
109 NULL 0
6 109 1
12 109 1
.
.
268 273 2
284 273 2
.
.
.
47 30 3
70 30 3
82 30 3
.
.
.
31 210 4
45 210 4
56 210 4
68 210 4
81 210 4
I shall explain the above script in the next section.
This section explains the script provided in previous section. Let me explain it part by part. Let us first consider the following code fragment:
SELECT
EmployeeID, ManagerID,0 AS EmployeeLevel
FROM HumanResources.Employee
WHERE ManagerID is Null
It is not surprising that it retrieves the list of all employees who are not working under any manager. Let us add a very funny structure to the SELECT statement, just like the previous example, which ends up as follows:
WITH EmpCTE
AS
(
SELECT
EmployeeID, ManagerID,0 AS EmployeeLevel
FROM HumanResources.Employee
WHERE ManagerID is Null
Every row retrieved by the SELECT statement will be treated as part of the temporary table named “EmpCTE” (very similar to the first example). Let us proceed with the following statement now:
SELECT
b.EmployeeID, b.ManagerID,EmployeeLevel+1
FROM HumanResources.Employee b
INNER JOIN EmpCTE AS a ON a.EmployeeID = b.ManagerID
The above SELECT statement is considered to be the recursive part (which really enforces the recursion). This statement is similar to the previous example; the only difference we find is with the “EmployeeLevel.” The above statement works with every row present in “EmpCTE.”
It gets executed for every row present in “EmpCTE,” and if it happens to give any output of rows, those rows will again be added to “EmpCTE” and again call itself for every newly added row.
This process gets executed recursively for every new row added (or being added) to the same “EmpCTE.” Finally we retrieve all the information from “EmpCTE” using the following statement:
Till now we have simple examples. Let us start extending the above example to include both employee names and manager names by creating a recursive CTE using views.
create view vEmpManagerDetails
as
WITH EmployeeManagersDetails(EmployeeID, EmployeeName, ManagerID, Manager)
AS (SELECT a.EmployeeID, a.Name, a.ManagerID,b.Name AS Manager
SELECT ManagerID, Manager,EmployeeID,EmployeeName, EmployeeLevel +1 as EmployeeLevel
FROM EmployeemanagerDetails order by EmployeeLevel;
GO
I hope everybody can understand the above script. It is similar to the previous example. The only difference we can find is that it displays the Names of Employees and Managers along with their IDs and levels. I also added “EmployeeLevel + 1” just to eliminate the “0” in the output.
The sample output for the above CTE would be something like the following: