Basic Index Question

  • I recently read this article: http://www.sqlservercentral.com/articles/Performance+Tuning/indexcreationguidelines/1389/

    The author mentions that indexes should not be created on free-form text columns, such as varchar's. Why is that and would this still hold true for varchar columns that are frequently used together in where clauses?

    Finally, if one were to be doing a large data load, where the end goal is to end with a table full of data, and a covering composite index over frequently queried-together columns, is it generally best to add the index after the data is inserted (versus putting the index in place and then loading the data)?

  • I believe the author is talking about free-form text such as whole sentences of text for example. Also free form text can be error prone - different spellings of the same entity can cause issues (e.g., some data could contain abbreviations and others, not. But the values should be the same. )

    The greatest enemy of knowledge is not ignorance, it is the illusion of knowledge. - Stephen Hawking

  • I see where you're coming from and I agree - however, I don't believe that's what the author was talking about. He specifically mentions names and addresses as freeform text...and that's exactly what I feel that I need to index. I don't know...maybe I'm just reading the article incorrectly..

  • Ok, let me try...

    There's little point in an index with a text column in the index key if all queries that filter on the column filter using LIKE with leading wildcards, CHARINDEX, LEFT, RIGHT or SUBSTRING. Using any of those (which is more common for a freeform column) make the predicate non-SARGable and mean that the index can't be seeked on.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • First off, that article is 7 years old, so almost anything it says might be slightly off these days. However, I just read through it and frankly, the author is wrong. It really depends on the size of the column, but indexing varchar & nvarchar columns is a huge advantage. Also, his statements on datetime were somewhat applicable on a 2000 database but not on 2005 or above. You can get good use out of an index on datetime.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • coronaride (6/10/2011)


    I see where you're coming from and I agree - however, I don't believe that's what the author was talking about. He specifically mentions names and addresses as freeform text...and that's exactly what I feel that I need to index. I don't know...maybe I'm just reading the article incorrectly..

    Not reading it incorrectly.

    It's just really hard to end up with useful indexes on "Richard", "Rick", "Dick", "Ricky", "Richy", et al, which are all the same name, but won't query that way.

    On addresses, how often are you going to want all street addresses that start with "1", or something like that? You're more likely to need to query by city and state, or postal code, and then by street name, and only then would you worry about the number predicate on the address.

    That's not to say such indexes are useless ALL the time, just that they are useless so often as to be generally thought of that way.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • p.s. take that article with a pinch of salt (maybe a saltpan sized one) There's a lot of bad info in there unfortunately.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail - that helps.

    G-Squared - I see where you're coming from and, if I wasn't normalizing addresses out into CASS certified columns or making first/last name entities (our front-end SOLR indexer handles homonym lookups) 'owned' by a person entity, you'd be absolutely right. Thanks, though.

  • coronaride (6/10/2011)


    Thanks Gail - that helps.

    G-Squared - I see where you're coming from and, if I wasn't normalizing addresses out into CASS certified columns or making first/last name entities (our front-end SOLR indexer handles homonym lookups) 'owned' by a person entity, you'd be absolutely right. Thanks, though.

    Makes sense. Sounds like you're doing it the right way.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • love your avatar, by the way. 🙂

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply