Tables and Indexes

Question

Post reply

Tables and Indexes

Karim Hemani

SSCrazy

Points: 2063
More actions
November 1, 2005 at 7:05 am

#112273

Hi,
I am a budding DBA and need your help in answering some questions regarding tables and indexes. I would highly appreciate your help.
Does SQL Server scan each extent "fully" after scanning the IAM page for a heap table. If yes, then what about data pages in the extent that haven't been used yet. Does it read those too, or is there some way of bypassing such pages?
Is a data file in SQL Server necessarily physically contiguous?
I wouldn't think that all the extents in an IAM could be physically contiguous necessarily. Is this true?
In Oracle, storage structures form a hierarcy: tablespace, segment, extent, data block. In SQL, pages form extents but is there further grouping? Is there such a thing as a segment in SQL. Do extents or "segments" (if they exist) form filegroups.
Do data pages of a table form the leaf level of its clustered index? How then does SQL have separate data and index pages, when the leaf level of the index is both formed of data pages which are also index pages. IS there a distinction?
I would highly appreciate if you could reply to these questions.
Thanks a lot.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply

peterhe SSChampion Points: 11469 More actions · Answer 1

Starting from the easy ones:

"Is a data file in SQL Server necessarily physically contiguous? "

No. But if physically continous e.g. allocating enough space when db is created, the db files will have less fragments and the performance will be better.

"Do data pages of a table form the leaf level of its clustered index? "

Yes

"How then does SQL have separate data and index pages, when the leaf level of the index is both formed of data pages which are also index pages. IS there a distinction?"

For nonclustered index, leaf level is not data page. it's the clustered index key (if the table has) or the RID sql server assigned. SQL server distinguish clustered and nonclustered index only.

Steve Jones - SSC Editor SSC Guru Points: 736254 More actions · Answer 2

Peter's answers look good. As far as the heap. If memory serves, the pages are linked in extents and each extent that contained data would need to be scanned. These would be filled in contiguously, but with deletes, upates, etc. there may be empty spaces. I believe the page is read and then information scanned based on the query, but not necessarily every byte of the page is passed through the CPU.

Karim Hemani SSCrazy Points: 2063 More actions · Answer 3

Thank you Steve and Peter. I really appreciate it.

Karim

Karim Hemani SSCrazy Points: 2063 More actions · Answer 4

Just need one clarification if you will. The leaf level of a clustered index: is it said to be made of data pages or index pages? In Oracle, for example, there are data blocks and index blocks. IS it so with SQL, i.e. data pages and index pages? If so, what do we call the leaf level, a linked list of data pages or index pages?

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 5

The leaf level pages of a clustered index are data pages (mtype=1), stored in logical index order.

Regarding scanning unused pages in an extent: An extent is the smallest object SQL Server reads from disk. So when page 7010 is asked for, extent 7008 is read.

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/

Karim Hemani SSCrazy Points: 2063 More actions · Answer 6

Thank you, Chris. Could you also clarify one more thing? When I was reading the database book by Elmasri, I learnt that B+ Tree indexes at their leaf level have a set of key values along with data pointers pointing to the records themselves. IN other words, the actual data was separate from the B+ tree index, as they said. Now, if you will, having all the data at the leaf level of the index would mean that at the level above the leaf level, there would be unimaginably more keys and pointers and therefore nodes, since having the entire data on the leaf level would mean that fewer rows will fit into one data page. I mean, it's obvious that if we store entire rows instead of (key, pointer to record) combinations at the leaf level, we will have to have less in each leaf node and therefore there would be far more pointers at the level above leaf level because of the far too many nodes at the leaf level. For these many more pointers and therefore nodes, we would need more nodes at the level above and so on. My question is, what is mircosoft's rationale in having such a huge b tree index as the clustered index. why not a more lean index with just the key, pointer combinations at the leaf level.

I would highly appreciate if you could shed some light on this.

Chris Hedgate One Orange Chip Points: 25041 More actions · Answer 7

Since you created a separate thread for this question (which was the correct thing to do of course), I have answered there, so please continue that discussion there.

http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=5&messageid=234349

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/