How to know on which column should we create index.

Question

How to know on which column should we create index.

mcfarlandparkway

SSCertifiable

Points: 7699
More actions
February 2, 2017 at 7:32 am

#321236

I have a table getcode where it has countrycode,statecode, citycode and zipcode columns
there are other tables called- city in that city code is PK so automatically it will create cluster index..
same for other tables
But comming for getcode table -- do we need to create indexes for all these columns? i mean seperately?
or just create index on country code?
This is confusion for me how we know which index will be created?

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 1

No, you (almost) certainly don't want an index on every column. I'd advise you to read through this series of articles, and anything else you can find on indexing. It's a big area, and not really something that can be answered in a forum post.

John

Y.B. SSChampion Points: 11649 More actions · Answer 2

You should do some research as suggested but here are a few things to keep in mind.
1. A clustered index IS the your table and thus can only have one per table.
2. You can have many non-clustered indexes per table which sorts and stores the data by specified key in a different order than the clustered index.

What you want to examine is how the table in question is being accessed. The easiest way to get started is to look at your queries and note the columns in your joins and where conditions. This will give you a good indication on what column or columns should be indexed. So let's try a simple example...

Let's pretend you have a table with ID, FIRST_NAME, LAST_NAME, ADDRESS and the clustered index is keyed on the ID column. Since your clustered index is the table it will hold all 4 columns. Now maybe most of the time you'd query the table with;
SELECT * FROM Customers WHERE ID = 123
In this case SQL is happy and it can easily navigate the index to find the requested record. But hang on maybe you want to query by LAST_NAME such as;
SELECT * FROM Customers WHERE LAST_NAME = 'Smith'
Here your clustered index is not helpful since it would have to scan the whole table to find all last names that match your criteria.
So this is where you might want to create a non-clustered index on LAST_NAME. Then if you were the run the previous query you'll (likely) find that your scan changed to a seek. Note: In some cases such as with small tables it may still do a scan anyway.

So at this point you might be thinking well this is great I should put an index on every column for every possible scenario and everything will be blazing fast. The answer is NO, there is a tradeoff for adding indexes. The first issue is disk space. You are duplicating data every time you create an index and thus consuming more disk space. On large tables this will be very noticeable. In my simple example above you are storing LAST_NAME twice. (You are also storing ID twice as well because it's automatically included for key lookups). Secondly, there is a performance overhead on indexes. Every insert, update and delete operation have to be done on all indexes that are part of your table. I'm over simplifying everything but I hope this helps. Look up the stairways series on this site for more info on indexes.

EDIT: Also don't use "SELECT *" in your statements. I used it for demonstration purposes only. Always include your column names in your code.

Cheers,

SELECT quote FROM brain WHERE original = 1
0 rows returned

Steve Jones - SSC Editor SSC Guru Points: 736478 More actions · Answer 3

Good advice above. One note, don't be afraid of putting multiple columns in an index. If you tend to query by city and state and zip, put all those in an index. The first column should be the one that is most often used in queries.

Indexing is a bit of an art, balancing the issues that Yb mentions above.

ScottPletcher SSC Guru Points: 101120 More actions · Answer 4

SQL Server provides system views that greatly assist you here. You need to use those views, rather than just look at SQL code, to determine index(es) usage and missing index(es).

The views are:
sys.dm_db_missing_index* views
sys.dm_db_index_operational_stats
sys.dm_db_index_physical_stats
sys.dm_db_index_usage_stats

Those will give you specific numbers based on actual requests to SQL Server, rather than you being forced to estimate how many times each query you are looking at might be executed.

I can give you my (fairly involved) queries against those views as a solid starting point if you'd like to pursue this approach.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".

Jeff Moden SSC Guru Points: 1004406 More actions · Answer 5

mcfarlandparkway - Thursday, February 2, 2017 7:32 AM
I have a table getcode where it has countrycode,statecode, citycode and zipcode columns
there are other tables called- city in that city code is PK so automatically it will create cluster index..
same for other tables
But comming for getcode table -- do we need to create indexes for all these columns? i mean seperately?
or just create index on country code?
This is confusion for me how we know which index will be created?

My best advice would be to never add an index unless 1) it needs to enforce uniqueness, 2) is part of an FK, or 3) you have a slow or resource intensive query that would actually benefit from it and then you'd better test the heck out of it. It's not difficult to find queries (Extended Events or SQL Profiler) that are taking too long that use the table. Also, be VERY leery of creating any index that SQL Server may recommend especially those whose leading column is a low cardinality column. For example, it's usually not a good idea to make the leading column of an index a "status" column unless it's the table that simply contains the descriptions of the various statuses.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)