January 8, 2016 at 2:28 am
I need an explanation may be pointing to some whitepaper for the explanation of below:
In a unique non clustered index, the clustered key comes to leaf level only
in a non unique non clustered index, the clustered key comes to leaf and non leaf level
I was watching Gail's video
https://www.youtube.com/watch?v=y_bl9dArtmA
So may be Gail can help me in this 🙂
January 8, 2016 at 2:38 am
er.mayankshukla (1/8/2016)
I need an explanation may be pointing to some whitepaper for the explanation of below:In a unique non clustered index, the clustered key comes to leaf level only
in a non unique non clustered index, the clustered key comes to leaf and non leaf level
I was watching Gail's video
https://www.youtube.com/watch?v=y_bl9dArtmA
So may be Gail can help me in this 🙂
I don't think this is correct. Any nonclustered index (whether unique or not) adds the clustered index column(s) to the leaf pages only.
You probably misunderstood something Gail said. I started to watch the video but it's a one-hour talk and I don't have that much time to spare.
January 8, 2016 at 3:03 am
Yeah I am interested in this as I thought for a non unique nonclustered index that the CI key needs to go up the tree in order to efficiently locate rows in the leaf level.
January 8, 2016 at 4:23 am
Yes, May be Gail can answer this
January 8, 2016 at 6:40 am
Yeah, I too thought the keys were stored at the leaf level. I'm positive Gail knows that. It must be a misstatement or misinterpretation of what she said.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
January 8, 2016 at 8:26 am
Hugo Kornelis (1/8/2016)
er.mayankshukla (1/8/2016)
I need an explanation may be pointing to some whitepaper for the explanation of below:In a unique non clustered index, the clustered key comes to leaf level only
in a non unique non clustered index, the clustered key comes to leaf and non leaf level
I was watching Gail's video
https://www.youtube.com/watch?v=y_bl9dArtmA
So may be Gail can help me in this 🙂
I don't think this is correct. Any nonclustered index (whether unique or not) adds the clustered index column(s) to the leaf pages only.
You probably misunderstood something Gail said. I started to watch the video but it's a one-hour talk and I don't have that much time to spare.
The subject matter for this post starts at time marker 11:50.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 8, 2016 at 9:01 am
Jeff Moden (1/8/2016)
The subject matter for this post starts at time marker 11:50.
Thanks, Jeff! Found it, watched it, and Gail is pretty explicit about it.
Coming from a lot of people, I would have put this aside as a misunderstanding from the presenter. Coming from Gail who normally knows her stuff very well, I will postpone judgement until after I have double checked this in a test database.
BRB!
January 8, 2016 at 9:13 am
This seems relevant as well: http://www.sqlskills.com/blogs/kimberly/nonclustered-indexes-lookup-key-btree/
Cheers!
January 8, 2016 at 9:22 am
Okay, did a small test and confirmed that my faith in Gail was justified (and my faith in my own understanding of index internals was not). After creating a test table and two nonclustered indexes, one unique and one nonunique, and adding 10,000 rows; I used DBCC IND to find the root pages and DBCC PAGE to examine their contents. The nonunique index did indeed have extra data added to each entry, and that data corresponds to the primary key values.
I then decided to check the documentation where I got my understanding from - and though phrased in a hard-to-understand way, I must admit that it's there and that I just had overlooked it so far. (See https://msdn.microsoft.com/en-us/library/ms190620.aspx - step 2, item 2 in the list, first and second paragraph).
Obviously, the next question is why this is implemented like this. Why would SQL Server need the clustered index key on the non-leaf levels? (Which is obviously also what the original question in this thread was after).
NOTE: This is where I enter into speculation mode!
Think about when a row is deleted (or updated affecting data in the nonclustered index). If the row is located through another index, then SQL Server needs to find it in this nonclustered index as well. The indexed value will be known (because updates and deletes always start from the clustered index), but now the right row has to be found. Now suppose that the row affected has a clustered index value of 12345, and the value in the nonclustered index is "green". The table has 1,000,000 entries, and approximately 25,000 of them have the value "green" in the indexed column. With the clustered index value in the leaf pages only, SQL Server has to seek to the first "green" row, read the clustered index value from the leaf page, check if it's the right one and repeat - it will probably have to visit a lot of leaf pages in that process.
So what I guess is that the clustered index value is added to the nonclustered index, not as just some additional data in the index but as an additional indexed column, and hence the data is added to the nonleaf pages as well. Now SQL Server can directly seek for row "green / 12345", and save a lot of IO.
January 8, 2016 at 9:24 am
Thanks for the link, Jacob! That post confirms what I put out as speeculation.
January 8, 2016 at 12:41 pm
Hugo Kornelis (1/8/2016)
Okay, did a small test and confirmed that my faith in Gail was justified (and my faith in my own understanding of index internals was not). After creating a test table and two nonclustered indexes, one unique and one nonunique, and adding 10,000 rows; I used DBCC IND to find the root pages and DBCC PAGE to examine their contents. The nonunique index did indeed have extra data added to each entry, and that data corresponds to the primary key values.I then decided to check the documentation where I got my understanding from - and though phrased in a hard-to-understand way, I must admit that it's there and that I just had overlooked it so far. (See https://msdn.microsoft.com/en-us/library/ms190620.aspx - step 2, item 2 in the list, first and second paragraph).
Obviously, the next question is why this is implemented like this. Why would SQL Server need the clustered index key on the non-leaf levels? (Which is obviously also what the original question in this thread was after).
NOTE: This is where I enter into speculation mode!
Think about when a row is deleted (or updated affecting data in the nonclustered index). If the row is located through another index, then SQL Server needs to find it in this nonclustered index as well. The indexed value will be known (because updates and deletes always start from the clustered index), but now the right row has to be found. Now suppose that the row affected has a clustered index value of 12345, and the value in the nonclustered index is "green". The table has 1,000,000 entries, and approximately 25,000 of them have the value "green" in the indexed column. With the clustered index value in the leaf pages only, SQL Server has to seek to the first "green" row, read the clustered index value from the leaf page, check if it's the right one and repeat - it will probably have to visit a lot of leaf pages in that process.
So what I guess is that the clustered index value is added to the nonclustered index, not as just some additional data in the index but as an additional indexed column, and hence the data is added to the nonleaf pages as well. Now SQL Server can directly seek for row "green / 12345", and save a lot of IO.
Very well done, Hugo. Your research and the link that Jacob posted saved me a whole lot of time.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 8, 2016 at 12:42 pm
Jacob Wilkins (1/8/2016)
This seems relevant as well: http://www.sqlskills.com/blogs/kimberly/nonclustered-indexes-lookup-key-btree/Cheers!
Excellent reference. You can't do much better that getting it straight from the horse's mouth. Thanks, Jacob.
--Jeff Moden
Change is inevitable... Change for the better is not.
January 8, 2016 at 1:32 pm
er.mayankshukla (1/8/2016)
In a unique non clustered index, the clustered key comes to leaf level onlyin a non unique non clustered index, the clustered key comes to leaf and non leaf level
Yup, that's correct. What clarification do you need?
Basically, if the nonclustered index is defined as UNIQUE, the clustered index key is treated as include columns (unless explicitly specified in the nonclustered index key), if the nonclustered index is not defined as unique, the clustered index key is treated like key columns and added at the end of the explicitly specified key.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 8, 2016 at 1:39 pm
Edit: nm
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 13, 2016 at 10:25 pm
Hi All,
I took some time to write a detail article on this, which might be useful for all for our reference
Viewing 15 posts - 1 through 15 (of 24 total)
You must be logged in to reply to this topic. Login to reply