Introducing Common Table Expressions in SQL Server 2005
This article mainly focuses on the new concept of "Common Table Expression" introduced in Microsoft SQL Server 2005. I shall emphasize simple code. I will use different approaches not seen in any of the previous versions of SQL Server, together with several examples covering a variety of scenarios. If you're a beginner, this article will help get you off to a good start.
I am assuming that the readers of this article will have some knowledge of RDBMS along with some exposure to either SQL Server 2000 or SQL Server 2005. I suggest readers take a look at my previous articles on Microsoft SQL Server 2005 concepts.
Introduction to common table expression
A common table expression is an expression that returns a temporary named result set from a simple or complex query, defined within the execution scope of a SELECT, INSERT, UPDATE, or DELETE statement. A CTE can work with both DDL and DML statements.
A common table expression looks very similar to a temporary table or view. The only difference between a CTE and a view is that views are created as database objects, whereas CTEs are not created as objects in the database. A CTE is available only for a single statement, and in general, every CTE can also be a part of a view.
There are mainly two types of CTEs: non-recursive CTEs, and recursive CTEs. In this article, I shall concentrate on non-recursive common table expressions, and in another upcoming article, we will continue with recursive common table expressions.
A CTE which doesn't include references to itself within its own body is called a non-recursive CTE. Some clauses such as COMPUTE, COMPUTE BY, ORDER BY (ORDER BY can be used if we specify the TOP clause), INTO, OPTION clause with query hints, FOR XML, FOR BROWSE and so on cannot be used within a CTE definition.
Before I start to give you examples of non-recursive common table expressions, please take a look at Microsoft's sample databases. Microsoft changed its strategy a bit towards sample databases. Instead of simply providing OLTP based databases (like Northwind), it provides the samples for both OLTP and OLAP. To work with the latest sample databases, we need to select "sample databases" during SQL Server 2005 installation. You can also download the sample databases separately from the Microsoft web site and install them as necessary.
Microsoft introduced a new sample database called AdventureWorks. In fact AdventureWorks has been extended to give samples covering Business Intelligence (BI) as well. This difference makes the AdventureWorks database the more powerful sample when compared with the Northwind database. In this article, all examples are focused on the AdventureWorks database.
If we recollect the creation of temporary tables in previous versions of SQL Server, we can observe that there is some inconvenience (such as maintaining too many temporary tables). We can remove some of the inconvenience by using CTE. Let me show you how with an example.
I am trying to design a CTE which displays a (temporary) result set that includes all employee names and their respective manager names.
HumanResources.Employee ON Person.Contact.ContactID = HumanResources.Employee.ContactID) AS b ON a.ManagerID = b.EmployeeID
)
select * from EmployeeManagers order by EmployeeName
Execute the above script in SQL Server Management Studio (using the "new query" option) and you should be able to view all employee names with the names of their respective bosses. I shall explain the above script in the next section.
Explaining the example
In this section, I shall explain to you the script I presented in the previous section. I shall explain it part by part. Let us first start with the following:
HumanResources.Employee ON Person.Contact.ContactID = HumanResources.Employee.ContactID) AS b
The above statement retrieves several things. First, it simply retrieves the EmployeeID. It also retrieves the Name of the employee, by combining FirstName, MiddleName and LastName fields using the concatenation operator "+" and replacing the null value as an empty string using the "isnull" function. It retrieves the ManagerID by combining two different tables available in two different schemas (HumanResources.Employee, Person.Contact).
To combine the information from both tables we are using "INNER JOIN," concentrating on the column "ContactID" (from the tables Contact and Employee). The entire result set retrieved by the above SELECT is identified with the alias "b." To work with the same result set once again, I used the same SELECT statement as above but with the alias "a."
HumanResources.Employee ON Person.Contact.ContactID = HumanResources.Employee.ContactID) AS b ON a.ManagerID = b.EmployeeID
The above statement retrieves EmployeeName, ManagerName from two tables (or result sets) with aliases "a" and "b." These two tables are combined using the "LEFT OUTER JOIN" command, which specifies that all rows from the left table (or result set) with matching rows from the right table (or result set) based on the condition "a.ManagerID = b.EmployeeID" need to be retrieved. I used "LEFT OUTER JOIN" simply to include even the employees who don't have bosses!
With EmployeeManagers(EmployeeName,Manager) as
(
)
The common table expression is created using the WITH statement followed by the CTE name. The name of the CTE will be followed by the column names representing the same columns retrieved by the SELECT statement defined within the WITH statement.
That means, entire result set will be treated as a temporary table with the name "EmployeeManagers" and columns "EmployeeName" and "Manager." Not only that, we need to select the columns from that CTE to show our output (which is also a part of the syntax defined within the WITH statement). In fact, every WITH must be provided with the respective SELECT. In this case, it looks like the following snippet:
select * from EmployeeManagers order by EmployeeName
You can do everything that a SELECT is meant to do. But please be aware that this is applicable only to the SELECT statement which is outside the WITH (and not the internal one). You can use almost all clauses in the outer SELECT statement.
As part of an example, let us rewrite the above sample to display the number of employees grouped by their managers. Consider the following script:
Up until now we have used only bare SELECT statements. We didn't use any views (or any other database objects). Let us now create another CTE which is based on an existing view. Even though this example demonstrates only views, you are free to use other database objects also (such as user-defined functions, aggregates, and so on).
Before creating the common table expression, let us look at the view "vSalesPerson" which is already created in the AdventureWorks database.
a.AddressLine1, a.AddressLine2, a.City, sp.Name AS StateProvinceName, a.PostalCode, cr.Name AS CountryRegionName, st.Name AS TerritoryName,
st.[Group] AS TerritoryGroup, s.SalesQuota, s.SalesYTD, s.SalesLastYear
FROM Sales.SalesPerson AS s INNER JOIN
HumanResources.Employee AS e ON e.EmployeeID = s.SalesPersonID LEFT OUTER JOIN
Sales.SalesTerritory AS st ON st.TerritoryID = s.TerritoryID INNER JOIN
Person.Contact AS c ON c.ContactID = e.ContactID INNER JOIN
HumanResources.EmployeeAddress AS ea ON e.EmployeeID = ea.EmployeeID INNER JOIN
Person.Address AS a ON ea.AddressID = a.AddressID INNER JOIN
Person.StateProvince AS sp ON sp.StateProvinceID = a.StateProvinceID INNER JOIN
Person.CountryRegion AS cr ON cr.CountryRegionCode = sp.CountryRegionCode
The above view simply provides you with the entire details for every sales person. Let us start creating a simple non-recursive common table expression using the above view.
With SalesPersonDetails(ID,[Name],State,Country,Territory)
as
(
select SalesPersonID,(FirstName+MiddleName+LastName) as [Name],StateProvinceName,CountryRegionName,TerritoryName
from Sales.vSalesPerson
)
Select * from SalesPersonDetails
Order by ID
The above CTE retrieves the SalesPersonID, Name (which is concatenated with FirstName+MiddleName+LastName fields), state, country and territory of every sales person.
In the previous section, we created a CTE based on a view. In this section, we shall create a view based on a CTE. Let us go through the following script:
create view Sales.vSales
as
With SalesPersonDetails(ID,[Name],Position,Quota,YearToDate,LastYear)
as
(
select SalesPersonID,(FirstName+MiddleName+LastName) as [Name],JobTitle,SalesQuota,SalesYTD,SalesLastYear
from Sales.vSalesPerson
)
select * from SalesPersonDetails
You can observe (from the above script) that creating a view involves just adding the "CREATE VIEW" syntax to an already existing/tested CTE.
All the examples in this article were tested using SQL Server 2005 Enterprise Edition. Please note that I didn't really test the examples on any other versions/editions or similar suites of Microsoft products (even though I believe that the examples will work on all SQL Server 2005 editions).
You also need to consider that the examples are not at all optimized for performance. Indeed, they should not be used in a real production environment. I just wanted to introduce the concept of CTEs through this article without any performance tuning (which is beyond the scope of this article).
My upcoming articles will go further in-depth on CTEs. So stay tuned!
Any comments, suggestions, ideas, improvements, bugs, errors, feedback etc. are highly appreciated at jag_chat@yahoo.com.