This post will cover the IN clause and another way to rewrite the same logic. I don’t intend to say that IN is better or worse than other T-SQL, I just want to showcase what it does.
We’ll jump right into the demos. I’m using StackOverflow2010 for this, and I’m going to create an index to be used by this query.
CREATE NONCLUSTERED INDEX ix_DisplayName_Age on Users (DisplayName) INCLUDE (Age)
SELECT DisplayName FROM Users as u WHERE Age IN (96,97,98,99,100)
Okay, cool. Let’s take a look at the Index Scan.
I want to focus on the bottom part. Even though we didn’t write this query with any OR statements, the optimizer interpreted the IN as a series of ORs.
This has interesting implications for how our query was optimized. Let’s take a look at another way of writing the same query.
SELECT DisplayName FROM Users as u JOIN (SELECT 96 as Age UNION ALL SELECT 97 as Age UNION ALL SELECT 98 as Age UNION ALL SELECT 99 as Age UNION ALL SELECT 100 as Age ) as A1 on A1.Age = u.Age
This query will get the same result set. This time, let’s run both queries in the same batch. Ignore the Query Cost, just look at the different execution plans.
This is what query tuning is all about. Rewriting a query in two different ways to see if the optimizer will pick a different execution plan.
Comparing the IN clause vs the UNION ALL
There’s two significant differences in these plans. Let’s take a look at the execution plans, using the execution plan comparison feature in Management Studio.
Click on the whole comparison if you’re curious. I’ll zoom in on the differences below. The first plan using the IN clause is on the left. The second plan using the UNION ALL is on the right.
Let’s compare the memory grants
Okay maybe I like talking about memory grants. Let’s start there.
The plan on the left, using the IN clause, did not need any memory grants.
The plan on the right was granted 1 MB of memory and used about 248 KB. That’s not very bad, but it is an important difference. If you have memory pressure, you can decide to avoid a memory grant by rewriting your query.
What about the CPU differences?
Keep in mind, this is just one example set.
So the UNION ALL query was slightly faster. When we’re talking about a difference of 15 milliseconds, I’d want to test a lot before deciding that it was better. I’d also note that the estimated cost for that plan was higher.
Moral of the post
The point of this post is that query tuning can be as simple as re-writing the IN clause. I don’t want to say that either way is better, since it depends on your environment.
Don’t rewrite code unless there’s already a performance issue, and if you do, test it thoroughly.