October 1, 2008 at 11:55 am
I figured an indexing question would best be answered by folks who focus on performance tuning, if there was a better forum for this please let me know.
Anyway I was looking at: http://msdn.microsoft.com/en-us/library/ms177484(SQL.90).aspx and an article talking about indexing.
With included columns you can add them to a NONclustered index so that when a query uses that index additional columns the query SELECTs can be "included columns" and returned as part of the INDEX SEEK instead of having to hop over to the table, lookup the table row, and then scan that row of the table to get whatever columns are needed.
Classic example:
nonclustered index on: FirstName, LastName.
included columns: EmailAddress, PhoneNumber
The table would also of course have a PK which is part of the clustered index.
The clustered index might have other columns, maybe LastName or "whatever".
It sorts the table and is used to quickly locate records.
Here is what I don't get:
I do my query: SELECT FirstName, LastName, EmailAddress, PhoneNumber
FROM table
WHERE FirstName = 'John' AND LastName = 'Doe'
It then uses my NONclustered index to perform my query AND it would then locate the record(s) that match my criteria.
What happens then?
It looks like it would then have a pointer to the clustered index which it would use to locate the record in the table, and retrieve the data.
Except in this case it is a COVERING INDEX... and it has Email and Phone stored in the NONclustered index's leaf level.
So... what does it do?
Does it grab Email and Phone and THEN still jump to the clustered index to locate the row in the table so it can pull out First and Last Name?
What I'm getting at is this:
CLUSTERED indexes store the actual table at their leaf level, is that correct?
NONclustered stores pointers to the clustered index (the table) or to artificial keys in a heap (tables with no clustered index), right?
So... how does the included column on a NONclustered index allow the query to avoid having to lookup the data in the table itself?
That's the part I don't get.
It seems like it grabs Phone and Email from its leaf level, but still has to jump to the clustered index to get the Fname and Lname since it doesn't actually store those data values in the index, just pointers to them...
So what am I not following/understanding?
Thanks!
October 1, 2008 at 12:18 pm
The columns defined in the INCLUDE statement are stored at the leaf level of the non-clustered index. There's no need to go to the cluster or a heap to retrieve the data. Storing the data increases the size of the index, but it doesn't change the size and levels of the B-Tree since the index keys are still the same. That means, for the most part, the index seek operations should be as fast as ever.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
October 1, 2008 at 12:59 pm
Maxer (10/1/2008)
It seems like it grabs Phone and Email from its leaf level, but still has to jump to the clustered index to get the Fname and Lname since it doesn't actually store those data values in the index, just pointers to them...
FName and LName are the index keys, so they are stored at all levels of the index. They have to be, how else would SQL be able seek on those columns.
In a nonclustered index, the nonclustered index keys and the clustered index keys are stored at all levels of the index. The include columns are only stored at the leaf level. The clustering keys are there as a 'pointer' to the row, in case a bookmark lookup is required.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
October 1, 2008 at 1:34 pm
GilaMonster (10/1/2008)
Maxer (10/1/2008)
It seems like it grabs Phone and Email from its leaf level, but still has to jump to the clustered index to get the Fname and Lname since it doesn't actually store those data values in the index, just pointers to them...FName and LName are the index keys, so they are stored at all levels of the index. They have to be, how else would SQL be able seek on those columns.
In a nonclustered index, the nonclustered index keys and the clustered index keys are stored at all levels of the index. The include columns are only stored at the leaf level. The clustering keys are there as a 'pointer' to the row, in case a bookmark lookup is required.
Ah, of course. Indeed otherwise how would it be able to seek on those columns 🙂
Well that explains that.
Thanks for pointing out my obvious logical error there. I knew I was failing to take something into account, and that would certainly be it!
November 23, 2010 at 1:01 pm
Actually, the included columns make the leaf index entries larger. Therefore, fewer entries per page, which translates to more B-tree above. So there is a size cost for the entire tree structure based on the percent increase in size in the index entry. Increase in tree depth and seek time will be logarithmic increase.
Still, well defined include columns can yield a big improvement.
David Lathrop
DBA
WA Dept of Health
November 16, 2012 at 8:13 am
I know it's an old topic but how is included columns affecting memory footprint? Some say higher use to put index in.memory cache. Some say lower since less pages are read and cached since it is in the index.
K
November 17, 2012 at 10:28 am
Kristian Ask (11/16/2012)
I know it's an old topic but how is included columns affecting memory footprint? Some say higher use to put index in.memory cache. Some say lower since less pages are read and cached since it is in the index.K
There is less memory footprint because the idea behind a non-clustered index with INCLUDE columns is to help avoid bookmark lookups in the query plan. Bookmark lookups are operations that go and fetch data from the clustered index (ie. the actual table) to help return columns that are in the SELECT portion of the query. Imagine a very wide table with tens or hundreds of columns being loaded into memory for the bookmark lookup. The memory footprint will be higher than if only a select few columns are instead loaded into memory as include columns in a noncl. Index.
Does my somewhat muddled explanation make sense? 🙂
__________________________________________________________________________________
SQL Server 2016 Columnstore Index Enhancements - System Views for Disk-Based Tables[/url]
Persisting SQL Server Index-Usage Statistics with MERGE[/url]
Turbocharge Your Database Maintenance With Service Broker: Part 2[/url]
November 19, 2012 at 3:01 am
I got it... sound logical. One more thing though.
If the included columns is not creating a covering index, will the included columns still be read or is it then only using the table?
November 19, 2012 at 4:03 am
The included columns will still be read.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
November 19, 2012 at 4:35 am
Kristian Ask (11/19/2012)
If the included columns is not creating a covering index, will the included columns still be read or is it then only using the table?
yes they will be retrieved with the help of keylook up or RID lookup.
-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done;-)
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply