Index on Table with 2 Columns Millions of Rows

  • Hi,

    I have a table with round about 30 millions records in it.

    CREATE TABLE [dbo].[Table1](

    [uniq] [nvarchar](10) NULL,

    [names] [nvarchar](200) NULL

    ) ON [PRIMARY]

    I also have the following index on the table as well.

    CREATE NONCLUSTERED INDEX [IX_NAMES_UNIQ] ON Table1

    (

    names asc,

    uniq asc

    )

    I populate this table once a week and it takes quite some time to populate it because of the index.

    Is it advisable to drop the index before populating the table, and then once the table is populated, to then re-create the index.

    Any insight would be appreciated

    Regards

  • How do you populate the table?

  • I run a cursor through another table and then i

    Insert Into Table1

    Select distinct Uniq,ltrim(rtrim([Name])) from kri.dbo.SeperatedListTable(@info,@uid,' ')

    SeperatedListTable is a table-valued function that spilts up a line of text into individual words.

    So for instance

    'The man with the red hat' becomes

    The

    man

    with

    the

    red

    hat

  • I do not know if deleting or disabling of the index results in a higher performance because of the way you are filling the table. The most time consuming is the cursor.

    I think a better way is to use

    INSERT INTO ......

    SELECT SUBSTRING(field, x, y), SUBSTRING(field, x + z, y)

    FROM .....

    But if you need the cursor than check performance by deleting and recreating index.

  • Do you have a clustered index on this table/heap ?

    As stated above the problem with speed would most likely be due to the fact that you are using a cursor to split out the string and insert this into a tables. To make this run quicker you should use set based code for the insert, there is an excellent article on here on how to split up a string with a tally table.

    http://www.sqlservercentral.com/articles/T-SQL/62867/"> http://www.sqlservercentral.com/articles/T-SQL/62867/

  • However, once you have a set based process (and yes, please do this first), yes, usually when doing large scale data loads I've found that dropping the index and recreating it is cheaper than letting the index get updated during the inserts. However, there's a caveat to that of course, if you're partitioning your data (no mention in this case), you don't want to use that method, at all.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • steveb. (11/25/2009)


    Do you have a clustered index on this table/heap ?

    As stated above the problem with speed would most likely be due to the fact that you are using a cursor to split out the string and insert this into a tables. To make this run quicker you should use set based code for the insert, there is an excellent article on here on how to split up a string with a tally table.

    http://www.sqlservercentral.com/articles/T-SQL/62867/"> http://www.sqlservercentral.com/articles/T-SQL/62867/

    Thanks steveb

    The reason I opted for RBAR operations instead of Set-based was because I tried using CTE to do the splitting for me me, and the query ran for hours. Thus I thought a cursor would be the only way.

    This article has put everything into perspective. Thanks a lot. Your help is greatly appreciated.

  • Grant Fritchey (11/25/2009)


    However, once you have a set based process (and yes, please do this first), yes, usually when doing large scale data loads I've found that dropping the index and recreating it is cheaper than letting the index get updated during the inserts. However, there's a caveat to that of course, if you're partitioning your data (no mention in this case), you don't want to use that method, at all.

    I have found similar results in some operations, while in others the better method was to absorb the cost of the inserts into the index. I would recommend testing both methods for best performance - after converting to set based.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Rowan-283474 (11/25/2009)


    steveb. (11/25/2009)


    Do you have a clustered index on this table/heap ?

    As stated above the problem with speed would most likely be due to the fact that you are using a cursor to split out the string and insert this into a tables. To make this run quicker you should use set based code for the insert, there is an excellent article on here on how to split up a string with a tally table.

    http://www.sqlservercentral.com/articles/T-SQL/62867/"> http://www.sqlservercentral.com/articles/T-SQL/62867/

    Thanks steveb

    The reason I opted for RBAR operations instead of Set-based was because I tried using CTE to do the splitting for me me, and the query ran for hours. Thus I thought a cursor would be the only way.

    This article has put everything into perspective. Thanks a lot. Your help is greatly appreciated.

    Post your code... I believe there are a couple of folks that know how to do inline splits. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Nice dead-pan delivery there Jeff! hehehe

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply