August 12, 2016 at 9:22 am
Hey everyone,
When you include columns in an index, what is the engine actually doing to "include" those columns? I'm having trouble understanding that if the data is all in a heap, how it's referencing those columns that are included to perform faster searches.
August 12, 2016 at 9:33 am
When you use the INCLUDE to add non-indexes columns in an index, that data is actually copied from the heap or clustered index into the nonclustered index. The included columns are only stored in the leaf nodes of the nonclustered index.
This does add some overhead as when data is updated it requires that the indexes where data is included also be updated even if the index itself doesn't need to be updated due to changes values of the indexed columns.
August 12, 2016 at 9:39 am
So it really should only be used with columns that are not updated very often, have large amounts of data, and those columns are often used together?
August 12, 2016 at 9:50 am
nsadams87xx (8/12/2016)
So it really should only be used with columns that are not updated very often, have large amounts of data, and those columns are often used together?
It depends. I use covering indexes where appropriate to improve system performance. The problem comes when you start seeing people creating covering indexes that include nearly all the columns of the table. They basically are creating pseudo clustered indexes and doubling (or more) the size of the tables.
Use them where they are needed and improve performance, just understand that there is a price that has to paid for that performance.
August 12, 2016 at 9:55 am
Since included columns don't form part of the actual index key, it's unlikely they'll be useful in WHERE clauses, JOINs and so on. What they are useful for is when you need that extra column in your SELECT list to be in the index so that a lookup back to the clustered index (or heap) is not needed.
John
August 12, 2016 at 10:21 am
Depending on the query, the column(s) in the INCLUDE clause could benefit the WHERE clause or the ORDER BY clause. It really comes back to "It depends."
August 12, 2016 at 10:26 am
If you're quick you can see a video from Brent Ozar which explains indexes with include quite nicely -
Expires on 16th of August (2016, for anybody who reads this thread years later).
August 12, 2016 at 10:43 am
I think those videos will be perfect. Thank you everyone!
August 12, 2016 at 10:46 am
nsadams87xx (8/12/2016)
So it really should only be used with columns that are not updated very often, have large amounts of data, and those columns are often used together?
Adding to what Lynn said...
It's important to understand the added overhead created by including columns to your index. It's equally important to understand the performance implications of a lookup (e.g. bookmark or RID) in the execution plan. When the optimizer decides to use a nonclustered index and has to perform a lookup against a clustered index or heap to columns that are not included there's a rough performance hit.
My strategy (which I think has served me well) has been not to leverage NC indexes that require a lookup.
-- Itzik Ben-Gan 2001
August 12, 2016 at 11:09 am
If you love heaps as much as I do, you'll be adding the primary key column as an included column for most of your indexes. Many of my queries are just joining a bunch of indexes and only getting data from the first and last tables in the chain.
If you create a unique index, you'll be forced to add additional columns as includes...to be able to enforce uniqueness on only the subset of columns.
And if you have small master tables, create lots of different covering indexes; they take very little space.
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply