difference between clusted & nonclusted index

Question

difference between clusted & nonclusted index

khaleel_1

Grasshopper

Points: 24
More actions
September 16, 2001 at 4:58 am

#77412

hi friends
i want to know what is difference between
clusted & non-clusted
thanks & regards
m.a.khaleel

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

Andy Warren SSC Guru Points: 119902 More actions · Answer 1

A clustered index means that the table is actually sorted based on the index criteria. Since the data IS the index, it's only possible to have one clustered index per table. Every table should have a clustered index defined, choosing the right column for the clustered index requires a good knowledge of the data access patterns.

Non-clustered indexes are "normal" indexes, you can have as many as you want.

Andy

Andy
SQLAndy - My Blog!
Connect with me on LinkedIn
Follow me on Twitter

mdburr Default port Points: 1483 More actions · Answer 2

A clustered index is one where a leaf-level of the index is an actual data page, i.e. it contains the actual row data on it. This differs from a non-clustered index, where the leaf-level pages of the index contain "pointers" to the data.

As a result of the structure of clustered indexes, there are a few side effects. One, perhaps the most frequently stated, is that the data is sorted in the order of the indexed column(s). For example, if your clustered index is on a date column, then your data will be sorted according to date. Another side effect is that data retrievals for queries that use the clustered index are often faster than queries that use no index or that use a non-clustered index. This is simply because less IO is required: once SQL Server has located a value in the clustered index it has the entire row associated with that value. A third side effect is that SQL Server can utilize some performance improvements when working with clustered indexes that allow it to reduce IO even more. In particular, since SQL Server knows that all of the rows with a certain value in the clustered index are stored together, it doesn't need to perform reads for each row that its found; it know that once it has retrieved the data page, all of the other rows that match the searched for value are on that page, or adjacent pages. These qualities make clustered indexes particularily useful if queries against a table will be searching for ranges of values.

There are additional side effects. A well chosen clustered index can help you avoid hot spots of activity by spreading inserts, updates and deletes across all of the data pages in a table; while a poorly chosen clustered index can have the opposite effect. In particular clustered indexes on columns that tend to have sequential values can create "hot-spots" of activity, with most insert activity occurring on the last page of the table. Concurrency might be reduced because of this, as users block one another. Identity columns and columns that include record update or record insert dates are good examples of these.

You can have only one clustered index per table, since the data pages are sorted accordingly, and the data pages can only be sorted one way at a time. There are many additional side-effects that must be considered. As some general rules: do not adopt a rule of immediately assigning a clustered index to every table in your database: some tables simply don't benefit from clustered indexes; if you do place a clustered index on a table, carefully consider your options, look for columns that you will frequently perform sorts on, or that you will commonly do "range" searches against, also, keep in mind that you may want to use the clustered index to help spread your data modifications across the entire table instead of concentrating them at the end.

Non-clustered indexes, as I said contain pointers to data pages where rows may be found. They often require more IO during data retrievals than a clustered index would. Sometimes, they require more IO than a table scan would; SQL Server is aware of this, and in such cases, it will usually opt for a table scan. Despite their decreased performance when compared to clustered indexes, there are some areas where they excel.

Nonclustered indexes are especially useful for queries that are searching for a particular value and which will return a relatively small subset of the data. They are also useful on occasion for "covering queries": if the non-clustered index includes all of the columns that you are retrieving in a query, SQL Server never has to go down to the data page to retrieve data; this can provide amazing performance gains. In general, the goal with a non-clustered index is to choose an index that will return the fewest rows in your query.

There's a lot more that could be said, but I have a meeting to go to.

Good luck.

Matthew Burr

don1941 SSCarpal Tunnel Points: 4183 More actions · Answer 3

quote:
Non-clustered indexes are "normal" indexes, you can have as many as you want.

Actually you can have as many as 254 indexes on a table. SS sometimes inserts indexes for its own purposes so the actual limit you have may be somewhat different.

If you have a table with lots of indexes, at least one of the should be clustered. I don't have all the details at my fingertips but the clustered index key is stored in the non-clustered indexes. Then actual data retrieval when the entire row is needed is much faster.

don1941 SSCarpal Tunnel Points: 4183 More actions · Answer 4

Replying to me own post ;; tsks tsk

On indexes- keep in mind that they're data too. When you insert a row in a table, every index on that table has to be updated as well. If you have 25 indexes on a table, ever INSERT, DELETE and UPDATE also has to update those 25 indexes as well.

If you happen to update the primary key upon which your table is physically sorted and that causes the row to change position, the data will be resorted on disk as well.

And, if you have a clustered index on a table with a high number of inserts, this may cause poor performance or single threading. Poor performance if the key value is random and causes the above-mentioned resorting. Single threading if your keys are sequential (ie. an IDENTITY) column as each row will need to be inserted on the same physical page as the prior row and will have to wait (LOCKING) for the prior insert to be committed before it can go.