In this series I’m going to detail the 5 T-SQL Commands that review in my 5 T-SQL Commands I’ve been Missing, but were there the whole time presentation. The gist of this presentation details my past as an uninformed developer, as it pertains to T-SQL and its vast features. This presentation, and accompanying blog series, will highlight some of the commands I’ve found useful.
I will be using the same DBA.StackOverflow Database that I used in the first post of the series.
Today we’ll be discussing the APPLY operator. This operator has been available to us since SQL 2005. The always helpful MSDN documentation tells us about the APPLY operator.
The APPLY operator allows you to invoke a table-valued function for each row returned by an outer table expression of a query.
For a basic example we simply use cross apply to return a table-value function
select U.Id , U.DisplayName , Location , Reputation , R.Position from DBA.Users U cross apply DBA.getUserRanking(U.Id) R where AboutMe like '%MVP%'
Not very exciting. Most DBA’s would’ve seen this methodology against the myriad of dynamic management functions that are available to us. What I find useful in the CROSS APPLY is the ability to clean up your code and make it more readable. For example, take this query here. We have query that is pulling anyone that has the words ‘MVP’ in their About Me section of their profile. We want to see who they are, where they’re from, how active they have been, and how many post they mark as FAVORITE (VoteTypeId=5)
select Id , DisplayName , Location , Reputation , (select count(2) from DBA.Votes where VoteTypeId=5 and UserId=U.Id) from DBA.Users U where AboutMe like '%MVP%'
This isn’t too crazy, but what if we want to also return the most recent post they marked as favorite to see what has piqued their interest. Let’s add a second correlated subquery.
select Id , DisplayName , Location , Reputation , FavoritePost = (select count(2) from DBA.Votes where VoteTypeId=5 and UserId=U.Id) , LastPost = (select max(PostId) from DBA.Votes where VoteTypeId=5 and UserId=U.Id) from DBA.Users U where AboutMe like '%MVP%' order by Reputation desc
Now we’re getting a little crazy in the readability department, so lets see what a CROSS APPLY can do for us. We can take both of those aggregates and put them in a single CROSS APPLY.
select Id , DisplayName , Location , Reputation , Favs.FavoritePost , Favs.LastPost from DBA.Users U CROSS APPLY (select count(2) as FavoritePost, max(PostId) as LastPost from DBA.Votes where VoteTypeId=5 and UserId=U.Id) Favs where AboutMe like '%MVP%' order by Reputation desc
Now its fairly clear what we are doing here, but…. there are caveats. When we do this method our performance. According to our IO Stats we are reading a royal boatload of data for this simple query.
We can fix that, let’s try to keep our readability, while getting our performance back, let’s split those queries into individual CROSS APPLYs. As you can see, you can have multiple CROSS APPLYs in one query.
select Id , DisplayName , Location , Reputation , Favs.FavoritePost , Late.LastPost from DBA.Users U CROSS APPLY (select count(2) as FavoritePost from DBA.Votes where VoteTypeId=5 and UserId=U.Id) Favs CROSS APPLY (select max(Postid) as LastPost from DBA.Votes where VoteTypeId=5 and UserId=U.Id) Late where AboutMe like '%MVP%' order by Reputation desc
There we go, our performance is back where it should be.
One final trick I like to use CROSS APPLY is to reduce errors in complex formulas that might be used in multiple places in a query. In this query we can see we calculate a ‘Agreeability’ by putting DownVotes over UpVotes and getting a percentage, and then we order by that. I have seen at times where a formula might be the basis for a second formula, and a change in the first, might be missed in a change in the second.
select Id , DisplayName , Location , UpVotes , DownVotes , [Agreeabilitiy] = convert(decimal(5,2), DownVotes*1.0/case when UpVotes=0 then 1 else UpVotes end*100) from DBA.Users U where AboutMe like '%MVP%' order by convert(decimal(5,2), DownVotes*1.0/case when UpVotes=0 then 1 else UpVotes end*100) desc
You can see here where we simply pull our formula into a dedicated CROSS APPLY and put our formula there, now you only have to make a change in one place, safety AND readability.
select Id , DisplayName , Location , UpVotes , DownVotes , CA.Agreeability from DBA.Users U CROSS APPLY( select convert(decimal(5,2), DownVotes*1.0/case when UpVotes=0 then 1 else UpVotes end*100) as Agreeability) CA where AboutMe like '%MVP%' order by CA.Agreeability desc
I can think back to several places where the CROSS APPLY operator would have made my life so much easier, not only in reading and understanding what was going on in a query, but also in reducing errors when modifying a query. As with any new feature you implement, ensure you test before and after to ensure that you are maintaining or improving the performance of your queries, as new and shiny doesn’t help if it only slows you down.