By now, you have heard of subqueries. You have also heard of Common Table Expressions. I am sure you know what a derived table is and that you get a derived table through either a subquery or CTE. How familiar are you with the subquery flavor of a derived table though?
I encountered something about derived tables recently that I had never seen, let alone heard of up to that point. Let’s start with the Microsoft documentation on the topic. If you browse to this page, you will find a description for column_alias immediately following the description of derived table. What you don’t get is an example of how it is applicable. Or do you?
If you look in the example of the derived table on that same page, you will see the following code (formatting added for readability).
[codesyntax lang=”php”]
SELECT * FROM ( VALUES (1, 2), (3, 4), (5, 6), (7, 8), (9, 10) ) AS MyTable(a, b);
[/codesyntax]
Here, we can see that column_alias is optionally supplied after the table_alias for the derived table. In this example, we have supplied two new column aliases called a and b.
If we want, we can take this a step further and see the same sort of example supplied by Sybase.
[codesyntax lang=”sql”]
select dt_b.book_title, dt_b.tot_sales from (select title, total_sales from titles) dt_b (book_title, tot_sales) where dt_b.book_title like "%Computer%"
[/codesyntax]
You can read the documentation about derived table syntax in Sybase, here, if you so desire. The point of this is to show similar code and documentation between SQL Server and its resuscitated predecessor.
And for grins, you actually have the same sort of optional syntax available for the derived table known as a CTE. You can see the documentation, from Microsoft, on that here.
So, how do we put this to use? Well, I am glad you asked that. I have an example ready to go.
[codesyntax lang=”tsql”]
DECLARE @StudentId INT = 1 ,@RequestDate DATE = '7/1/2011' ,@RequestNbr INT = 2 ,@SchoolYear INT = 2011 ,@RequestTypeID INT = 1 ; WITH Request AS ( SELECT RequestNbr = 2 , RequestDate = @RequestDate , StudentId = 1 , SchoolYear = 2011 , RequestTypeID = 1 , CompletionDate = null) SELECT Request.StudentId ,Request.RequestNbr ,Request.RequestDate FROM Request JOIN (SELECT StudentId AS StudentNumber, max(RequestNbr) AS RequestNb FROM Request WHERE RequestTypeId = @RequestTypeId AND StudentId = @StudentId AND CompletionDate IS NULL AND SchoolYear= @SchoolYear GROUP BY StudentId) AS Latest (StudentId, RequestNbr) ON Request.StudentId = Latest.StudentId AND Request.RequestNbr = Latest.RequestNbr
[/codesyntax]
In this example, I have a derived table implemented through a subquery. The alias of this derived table is “Latest.” Note that there is an additional set of parenthesis after that table alias. Inside this set of parenthesis, you will see a couple of column names. Those columns are called StudentId and RequestNbr.
Now, I want you to take a look inside that derived table and note the names of the columns I provided in the aliases there. See how those column_aliases are different than the column_aliases provided after the table_alias? By looking at the query, can you tell which takes precedence? Aliases supplied for columns in the optional column_alias outside of the derived table override the column_aliases of those provided inside the derived table. You can verify that by looking at the join conditions provided after those aliases were defined.
Running this script, you will see it execute without error. Using this kind of syntax could be useful in certain cases. I think that it could make finding those column names considerably easier. It could also help with readability.
Let’s take a quick look at the same kind of setup, but using a CTE instead.
[codesyntax lang=”tsql”]
DECLARE @StudentId INT = 1 ,@RequestDate DATE = '7/1/2011' ,@RequestNbr INT = 2 ,@SchoolYear INT = 2011 ,@RequestTypeID INT = 1 ; WITH Request AS ( SELECT RequestNbr = 2 , RequestDate = @RequestDate , StudentId = 1 , SchoolYear = 2011 , RequestTypeID = 1 , CompletionDate = NULL ), Latest (StudentId, RequestNbr) AS ( SELECT StudentId AS StudentNumber, max(RequestNbr) AS RequestNb FROM Request WHERE RequestTypeId = @RequestTypeId AND StudentId = @StudentId AND CompletionDate IS NULL AND SchoolYear= @SchoolYear GROUP BY StudentId ) SELECT Request.StudentId ,Request.RequestNbr ,Request.RequestDate FROM Request JOIN Latest AS Latest ON Request.StudentId = Latest.StudentId AND Request.RequestNbr = Latest.RequestNbr
[/codesyntax]
Note that I moved that entire derived table from subquery to be a new CTE defined immediately after Request. Now take note of the difference in declaration between Request and Latest. In Latest, I define the column names up front and have the columns aliased differently inside the CTE. I do not define the column_alias list for the Request derived table. You can also note that the colum_alias defined prior to the guts of the Latest derived table take precedence over any column_alias defined inside that particular derived table.
I hope this was new information to somebody else. If you learned something new, let me know.