While conducting interviews , I have noticed that, many people does not have clear picture about index.Many of them does not have clear picture about the difference between clustered and non clustered index.When people ask about index, it is tough to explain in one go. I feel it is worth to write details post about SQL server indexes in a simple understandable way even if we have tones of article available in the internet.
In short words, indexes helps the database engine to find the requested data efficiently using the minimal resource.Indexes also helps in data integrity through uniqueness of the column but it is not mandatory to define index on unique column. In a busy system it helps to improve the performance by increasing the concurrency. Multiple indexes on a same table can be used to cater the request issued by different users, but many indexes on a table will create a overhead also.Indexes are stored in different pages, it is like data stored in multiple places, and should be in sync with underlying table. Any insert,update or delete on a table should do same operation on all indexes defined on that table. Index help us improve the performance of the data retrieval but has an overhead on DML operation.In the case of Delete and Update ,index will helps the database engine to find the record that need to be modified. There is no thumb rule about the number of indexes on a table. If you need better performance for your read operation, go with more number of indexes and if you need better performance for the DML operation, keep minimal number of indexes.
SQL server support two types of indexes :
- Clustered Index (CI)
- Non Clustered Index (NCI)
Let us try to understand these two indexes using real life example. Assume that,your neighbour came to your house and asked for the telephone number of 'Robert Mike'. In this scenario, telephone directory will act as Clustered Index. You will open the directory by skipping almost 3/4th of pages assuming that his name appear in the last part of the directory.After turning couple of pages forward or backward , you reach the page where the name 'Robert Mike' is listed. Now you read out the number to your neighbour.So what happened here? When you reached the page where the name 'Robert Mike' is listed, you have all the information requested by your neighbour (client).
Let us assume that, your neighbour came to your house and asked the email id of 'Robert Mike' and you do not remember his email id. In this situation, telephone directory will act as Non Clustered Index. You will open the directory by skipping almost 3/4th of pages assuming that his name appear in the last part of the directory.After turning couple of pages forward or backward , you reach the page where the name 'Robert Mike' is listed. Now you dial 'Robert Mike' and ask him his email id.After disconnecting the line, you will hand over the email id to your neighbour.So what happened here ? When you reached page where the name ' 'Robert Mike' is listed, you do not have information requested by your neighbour (client). You have to do one more operation (dial the number ) to get the information requested by your neighbour (client). In SQL server, this additional operation is called as Bookmark or RID Lookup.
Hope this will give you pictorial idea about the cluster and non clustered index. In the coming post, we will discuss in details about the clustered index and non clustered index.
If you liked this post, do like my page on FaceBook