Index Creation

  • Hi all,

    I am in the process of creating few tables and indexes. I am creating a vendor table and a service table. The Vendor and service table has one-to-many relationship. Both tables are linked with table, VendorServiceLink. The VendorServiceLink will have the following fields, VendorServiceID(PRIMARY KEY), VendorID, ServiceID, Description, EffectiveDate, TerminationDate.

    Now, my qusetion is on which fields should i create index on?

    We will likely to query the link table with VendorID or ServiceID or EffectiveDate and TerminationDate.

    Should I create non-clustered index on all the four fields? I know that slows down my insertions and updation. But without indexing, retrieveing data will be slower. Please Advice.

    Also the clustered index will be created on VendorServiceID, which i will never use in select statements. So can i create a clustered index on other fields. If so, which field should i use coz all the four fields has the same preference and will be used in the select statements quite frequently.

    Thanks,

    Kayal

  • mailsar (6/2/2008)


    Hi all,

    I am in the process of creating few tables and indexes. I am creating a vendor table and a service table. The Vendor and service table has one-to-many relationship. Both tables are linked with table, VendorServiceLink. The VendorServiceLink will have the following fields, VendorServiceID(PRIMARY KEY), VendorID, ServiceID, Description, EffectiveDate, TerminationDate.

    Now, my qusetion is on which fields should i create index on?

    We will likely to query the link table with VendorID or ServiceID or EffectiveDate and TerminationDate.

    Should I create non-clustered index on all the four fields? I know that slows down my insertions and updation. But without indexing, retrieveing data will be slower. Please Advice.

    Also the clustered index will be created on VendorServiceID, which i will never use in select statements. So can i create a clustered index on other fields. If so, which field should i use coz all the four fields has the same preference and will be used in the select statements quite frequently.

    It really depends on the most common access path, but I'll assume for the moment that it's through VendorId in order to get the Service Listings... With that in mind, I think you're right, I wouldn't put the cluster on VendorServiceId. I'd create a compound clustered index on VendorId, ServiceId. But remember my assumption. If the most common path is through ServiceId, I'd swap those out.

    Whether or not you need additional indexes on the data fields again depends on the majority of the queries against them. For example, you're going to be getting lists of all vendors that have accessed by day, then you might need and index on EffectiveDate.

    In short, it just depends on what/how you're planning to access the table. There's not a single right answer for the structure, but there's probably a best answer for the query patterns.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Which fields you will create indexes on will depend on what your most common and/or most demanding queries will be.

    For example, if you usually have a vendor ID and want to look up all the services they deliver, based on a date input, your index on VendorServices might be on VendorID, EffectiveDate and TerminationDate, and include ServiceID. If, on the other hand, you are usually trying to find out which vendors delivered a service on a specific date, it might be better to index ServiceID, EffectiveDate and TerminationDate, and include VendorID.

    It really depends on how you're querying it.

    A very usual practice is to index foreign keys. That's a good place to start.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • To all intents & purposes... Jinx!

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • why cluster on VendorServiceID? You can have a primary key that's non-clustered.

    Use the clustered index for one of the dates since they're often queries in ranges.

    I don't think that < 5 indexes really impacts inserts/updates that much. I get worried when I see 10 indexes or more on tables that are changed often.

    You can easily make 4 indexes on these fields, using the (double, jinxed) advice above. Or you can make some of them compound, such as VendorID and one of the dates if you often query by those.

  • First of all, you should recognize that you have, not a one-to-many, but a many-to-many relationship between Vendors and Services, resolved through the intersection table VendorServiceLink. You also have created my pet peeve, a completely useless Identity key field on the intersection table. You stated yourself that you will never use this field in a query. Right. Actually, there is no way that field can be used. So get rid of it.

    Your primary key for the VendorServiceLink table is the composite key of the foreign keys to Vendors and Services. By default, that will include those two fields in your clustered index. This should satisfy most of your queries. What you should do is identify if most of your queries will be Vendor-centric or Services-centric. That will determine the optimum order of the key placement: (VendorID, ServiceID) or (ServiceID, VendorID). If it is more of an even split, you may want to place each column in its own index.

    Whether or not you need to index the date fields depends on your other queries. If you will be looking for vendor V who started or terminated any service on a certain date or in a certain date range, or a service S that was started or terminated on a certain date or or in a date range, then you probably don't need to index them because the PK index will be used. If, however, you will be looking for all vendors who started or terminated any service by date or date range, or all services that were started or terminated on a date or date range, then you can create up to four other indexes: (EffectiveDate, VendorID), (EffectiveDate, ServiceID), (TerminatedDate, VendorID) and (TerminatedDate, ServiceID).

    As you say, this will impact your inserts and updates somewhat but will speed up your queries. As you will normally perform many, many more queries than inserts and updates, this is an acceptable trade off. However, if your analysis shows that most queries are performed by end-of-(week/month/quarter) report generation and the day-to-day operations consists of a high percentage of inserts and updates, then you might want to hold off on the extra indexes. The reports can take a little longer so that the day-to-day operations run better.

    Tomm Carr
    --
    Version Normal Form -- http://groups.google.com/group/vrdbms

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply