Introduction to Common Table Expressions
You can use common table expressions (CTEs) starting with SQL Server 2005 to simplify your T-SQL queries. CTEs allow you to separate part of your T-SQL logic from your main query instead of using a view, correlated subquery, or temp table. You define the CTE at the beginning of the statement, and then you can use the CTE in your main query just as if it were a table or view. You can also use a CTE to write a recursive query, which will not be covered in this article.
All the examples in this article will work with the 2005 version of the AdventureWorks database.
CTE Syntax
Once you get acquainted with CTEs, you will probably find them easy to use. Start with the word WITH, followed by a name for your CTE. Follow the name of the CTE with the word AS and a set of parentheses. Inside the parentheses, type in a valid SELECT query. You can then use the CTE in your main query just as if the CTE was a table or view. You can use the CTE in a SELECT, UPDATE, INSERT or DELETE statement, but the CTE itself will be a SELECT statement. Listing 1 shows two sample queries.
Listing 1: Simple CTE examples
WITH emp AS (
SELECT EmployeeID, FirstName, LastName, E.Title, ManagerID
FROM HumanResources.Employee AS E
INNER JOIN Person.Contact AS C ON E.ContactID = C.ContactID
)
SELECT A.EmployeeID, A.FirstName, A.LastName, A.Title,
A.ManagerID, B.FirstName AS MgrFirstName,
B.LastName AS MgrLastName, B.Title AS MgrTitle
FROM emp AS A
INNER JOIN emp AS B ON A.ManagerID = B.EmployeeID; WITH emp AS (
SELECT EmployeeID, FirstName, LastName, E.Title, ManagerID
FROM HumanResources.Employee AS E
INNER JOIN Person.Contact AS C ON E.ContactID = C.ContactID
),
mgr (EmployeeID, MgrFirstName, MgrLastName, MgrTitle) AS (
SELECT EmployeeID, FirstName, LastName, E.Title
FROM HumanResources.Employee AS E
INNER JOIN Person.Contact AS C ON E.ContactID = C.ContactID
)
SELECT emp.EmployeeID, FirstName, LastName, Title, ManagerID,
MgrFirstName, MgrLastName, MgrTitle
FROM emp
INNER JOIN mgr ON emp.ManagerID = mgr.EmployeeID;
The first query demonstrates how the CTE can be used multiple times within the main query. The main query joins emp to itself to list all the employees who have a manager along with the manager’s name and title. The second query demonstrates that you can define more than one CTE in the same statement. Separate the CTE definitions with a comma. The second CTE, defining the managers, also shows how you can specify aliases for the CTE right after the CTE name instead of relying on the names in the CTE definition. One more thing to keep in mind is that, if the CTE is part of a batch, the previous statement must end in a semi-colon. I have often seen CTE queries with a semi-colon right before the word WITH as a precautionary measure.
Uses for CTEs
Now that you know how to create a CTE, you may wonder when you should use one. One of my favorite uses is to fix a poorly performing correlated subquery. Developers often use correlated subqueries to add one or more aggregate expressions to an otherwise non-aggregate query. Listing 2 shows a correlated subquery and how to get the same results using a better performing CTE.
Listing 2: Use a CTE to fix a correlated subquery
SELECT SalesOrderID, CustomerID,
(SELECT COUNT(*) FROM Sales.SalesOrderHeader WHERE CustomerID = S.CustomerID) AS CountOfSales,
(SELECT AVG(TotalDue) FROM Sales.SalesOrderHeader WHERE CustomerID = S.CustomerID) AS AvgSale,
(SELECT MIN(TotalDue) FROM Sales.SalesOrderHeader WHERE CustomerID = S.CustomerID) AS LowestSale,
(SELECT MAX(TotalDue) FROM Sales.SalesOrderHeader WHERE CustomerID = S.CustomerID) AS HighestSale
FROM Sales.SalesOrderHeader AS S; WITH csales AS (
SELECT COUNT(*) AS CountOfSales, AVG(TotalDue) AS AvgSale,
MIN(TotalDue) AS LowestSale, MAX(TotalDue) AS HighestSale,
CustomerID
FROM Sales.SalesOrderHeader
GROUP BY CustomerID)
SELECT SalesOrderID, S.CustomerID, CountOfSales, AvgSale,
LowestSale, HighestSale
FROM Sales.SalesOrderHeader AS S
INNER JOIN csales ON S.CustomerID = csales.CustomerID;
The first query lists all of the sales orders and the customer ID for each. It uses four subqueries to calculate the count of sales, average sale, minimum and maximum sales for the customers. The second query produces the same results by including an aggregate query in the CTE. On my computer, the first computer takes three times longer to run than the second query!
Another fantastic T-SQL feature you will find in SQL Server 2005 and 2008 is the ROW_NUMBER and other ranking functions. You can add row numbers to your queries using ROW_NUMBER, but you can not filter the results by including the ROW_NUMBER function in the WHERE clause. Listing 3 shows how you can combine the ROW_NUMBER function with a CTE to solve this problem.
Listing 3: Use the ROW_NUMBER function with a CTE
WITH emp AS ( SELECT ROW_NUMBER() OVER(ORDER BY LastName, FirstName) AS RowNumber,
EmployeeID, FirstName, LastName
FROM HumanResources.Employee AS e
INNER JOIN Person.Contact AS c ON e.ContactID = c.ContactID)
SELECT RowNumber, EmployeeID, FirstName, LastName
FROM emp
WHERE RowNumber BETWEEN 11 AND 20
ORDER BY RowNumber;
Another interesting way you can use a CTE is to join a CTE to a previous CTE. Take a look at Listing 4 to see how this works.
Listing 4: One CTE based on another CTE
WITH maxDate AS (
SELECT MAX(OrderDate) AS MaxOrderDate, CustomerID
FROM Sales.SalesOrderHeader
GROUP BY CustomerID),
orders AS (
SELECT SalesOrderID, soh.CustomerID, OrderDate
FROM Sales.SalesOrderHeader AS soh
INNER JOIN maxDate ON soh.CustomerID = maxDate.CustomerID
AND soh.OrderDate = maxDate.MaxOrderDate)
SELECT CustomerID,ProductID, sod.SalesOrderID,OrderDate
FROM orders
INNER JOIN Sales.SalesOrderDetail AS sod ON sod.SalesOrderID = orders.salesOrderID;
The query in Listing 4 returns a list of the customers along with the products ordered on the most recent order. The first CTE, maxDate, returns the maximum order date for each customer. The second CTE joins the first CTE, maxDate, to the Sales.SalesOrderHeader table, returning a list of customers and the most recent order ID. Finally, the main query joins the Sales.SalesOrderDetail table to the orders CTE.
All of the queries within the CTE definitions in the previous listings can run on their own. The second CTE, orders in Listing 4 can not run by itself. It depends on the first CTE.
Conclusion
You can use CTEs to separate part of the logic of your query instead of using correlated subqueries and temp tables. Once you start using CTEs in your T-SQL code, you will probably find lots of ways to use them.
Kathi is the author of Beginning T-SQL 2008. Even if you have little or no knowledge of T-SQL, Beginning T-SQL 2008 will bring you up to intermediate level and teach you best practices along the way. You'll learn how to write code that will help you to achieve the best-performing applications possible.
You'll find an introduction to databases, normalization, and SQL Server Management Studio. You'll understand how data is stored in a database and learn how to use at least one of the available tools to get to that data.
Each subsequent chapter teaches an aspect of T-SQL, building on the skills learned in previous chapters. Exercises are included in each chapter because the only way to learn T-SQL is to write some code.
This book will do more than just give the syntax and examples. It will teach you techniques to help you avoid common errors and create robust and well-performing code.