Optimize big table

  • Hi Guys

    I need to optimize a table that is growing very quickly. I have SQL Server 2008 on Windows 7 at the moment, but I will move it to a server in a few days.

    At the moment the table is around 3 GB and has 21 million rows.

    A SELECT COUNT(*) takes more than 2 minutes and I had to stop it. Very slow.

    What I did yesterday that gave me some improvement was:

    a) I deleted all the table's indexes, mainly the primary key

    b) Compressed the table (right-click the table -> Storage -> Compression)

    c) Shrank the database.

    But the table will keep growing and growing, so I would appreciate a hand working out what to do. I have some ideas:

    1) Partition the table by rows.

    2) Add non-clustered indexes on some important fields.

    3) Each time rows are added (normally by a program that inserts thousands of rows), run maintenance:

    - rebuild/reorganize indexes

    - shrink the database.

    Is this correct? Are there better practices for this kind of table?

    I will really appreciate some ideas.

    Thanks

    Marcos

  • Marcos

    Why would you shrink the database if it is constantly growing? This will only lead to physical fragmentation and wasted resources when it (inevitably) grows again. I would recommend that you never do this.

    Also, why did you drop your primary key? Unless you have another unique index on the same column(s), you now have the situation where duplicate data can be added to your table.

    Reindexing is definitely a good idea on a highly transactional database - carry it out at the time of minimum load on the database.
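
    For example, a minimal sketch of both options (the table and index names here are placeholders, not from the original post):

        -- Rebuild: heavier, takes locks (unless ONLINE on Enterprise Edition),
        -- removes fragmentation completely and refreshes the index's statistics.
        ALTER INDEX IX_BigTable_ImportantField ON dbo.BigTable REBUILD;

        -- Reorganize: lighter and always online, but it only compacts pages,
        -- so follow it with a statistics update.
        ALTER INDEX IX_BigTable_ImportantField ON dbo.BigTable REORGANIZE;
        UPDATE STATISTICS dbo.BigTable IX_BigTable_ImportantField;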

    Not sure about partitioning, but it's probably worth trying.

    The main things to check are that the code that runs against your database is as efficient as it can be, and that you have the right indexes in place to support it.

    John

  • Well... You shouldn't really shrink the database because it will cause massive index fragmentation. You will now most likely have to rebuild/reorganize them (if you reorganize, you will have to update the statistics). Of course, you would have to determine the pros/cons, your maintenance windows, and the amount of fragmentation of your indexes. Also since you said that the table is growing really fast, you should pre-grow the data file(s) it resides on so the database won't have to consume more resources to enlarge the data file(s) incrementally.
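
    For the pre-grow suggestion, something like this (a sketch; the database name, logical file name, and sizes are assumptions to replace with your own):

        -- Grow the data file once, in advance, instead of suffering many small
        -- autogrow events during loads; set a fixed growth increment as a safety net.
        ALTER DATABASE TelecomDB
            MODIFY FILE (NAME = TelecomDB_Data, SIZE = 10GB, FILEGROWTH = 1GB);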

    I would suggest partitioning the table (if you go that route) by a helpful column such as date. I am currently in that situation on a thread I created.
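
    As a rough sketch of what date partitioning looks like (note that table partitioning requires Enterprise Edition on SQL Server 2008; all names, boundary dates, and the single-filegroup choice here are placeholder assumptions):

        -- Monthly ranges; RANGE RIGHT means each boundary date starts a new partition.
        CREATE PARTITION FUNCTION pf_MonthlyDate (datetime)
            AS RANGE RIGHT FOR VALUES ('2011-01-01', '2011-02-01', '2011-03-01');

        CREATE PARTITION SCHEME ps_MonthlyDate
            AS PARTITION pf_MonthlyDate ALL TO ([PRIMARY]);

        -- Place the table on the scheme by (re)creating its clustered index there.
        CREATE CLUSTERED INDEX CIX_BigTable_EventDate
            ON dbo.BigTable (EventDate)
            ON ps_MonthlyDate (EventDate);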

  • Hi John

    Thanks a lot for the quick response.

    Here are some considerations in response to your questions:

    I deleted the primary key and I don't have a unique key. I do a check before each insert to catch duplicate data. The primary key has an associated clustered index, and maybe that is what causes locks when I try to insert a new row, because the index is not organized correctly. What I can try is to keep the primary key index and a maintenance plan to reorganize it after inserting thousands of rows.

    About DB shrink: in my ignorance, I believed that reducing the database's space could help. Isn't it a good idea?

    I read on the internet that compressing a table is not a perfect solution, because it transforms the table, so for SELECT or INSERT the DB has to make conversions to get the data...

    I really have a lot of doubts about this!

    Thanks again!

  • Hi dajonx, thanks.

    So, according to your suggestion, I am thinking of:

    - Partitioning the table (I have to investigate that a bit)

    - Adding non-clustered indexes (I think clustered indexes are hard to reorganize on big tables)

    - Setting up a maintenance plan to rebuild/reorganize indexes...

    What do you think?

    Marcos

  • marcosjroig (4/27/2011)


    I deleted the primary key and I don't have a unique key. I do a check before each insert to catch duplicate data. The primary key has an associated clustered index, and maybe that is what causes locks when I try to insert a new row, because the index is not organized correctly. What I can try is to keep the primary key index and a maintenance plan to reorganize it after inserting thousands of rows.

    Well, a lack of good indexes (and that includes a good cluster) is really going to hinder your performance.

    About DB shrink: in my ignorance, I believed that reducing the database's space could help. Isn't it a good idea?

    Shrink is a terrible idea. Yes, it removes free space, but that free space is not causing your performance problems, and the shrink is fragmenting your indexes badly. Besides, the next thing SQL will do is grow the file, and that will cause more performance problems.
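
    To see the damage for yourself, a quick sketch of a fragmentation check (run in your own database; 'LIMITED' keeps the scan cheap):

        -- Fragmentation per index, worst first - try this before and after a shrink.
        SELECT OBJECT_NAME(ips.object_id) AS table_name,
               i.name AS index_name,
               ips.avg_fragmentation_in_percent
        FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
        JOIN sys.indexes AS i
          ON i.object_id = ips.object_id AND i.index_id = ips.index_id
        ORDER BY ips.avg_fragmentation_in_percent DESC;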

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Removing indexes is generally not going to help SELECT performance. You need to understand what your queries are doing in order to correctly configure your indexes, not simply dump them. Have you looked at the execution plans for the common queries run against this table? If not, I'd start there. Understand what they are doing and where their bottlenecks are and then address them.

    A simple SELECT COUNT(*) query from a table will use the smallest index it can to find the data. Without an index, it must scan the table. That's not to say that having an index will make it faster, but it won't be any slower. You're not working from a position of knowledge, and that's where you should start: getting some knowledge of how the queries are behaving.
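
    As a side note, if that two-minute COUNT(*) is only being run to see how big the table is, an approximate row count can be read from partition metadata almost instantly (a sketch; 'dbo.BigTable' is a placeholder for your table):

        -- Near-instant approximate row count, no table scan needed.
        SELECT SUM(ps.row_count) AS approximate_rows
        FROM sys.dm_db_partition_stats AS ps
        WHERE ps.object_id = OBJECT_ID('dbo.BigTable')
          AND ps.index_id IN (0, 1);  -- 0 = heap, 1 = clustered index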

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • marcosjroig (4/27/2011)


    Hi John

    I deleted the primary key and I don't have a unique key. I do a check before each insert to catch duplicate data. The primary key has an associated clustered index, and maybe that is what causes locks when I try to insert a new row, because the index is not organized correctly. What I can try is to keep the primary key index and a maintenance plan to reorganize it after inserting thousands of rows.

    My recommendation is to have a primary key and a clustered index on each table, although not necessarily on the same columns. If you rely on your application code to avoid insertion of duplicates, you will eventually run into problems. What happens when someone writes version 2 of your application and doesn't include such a check? And how are you going to ensure that a similar check is performed for ad hoc inserts? Yes, definitely reorganise or rebuild your indexes. You can either use a maintenance plan, which is easy to set up but gives you less control, or you can build your own procedure - I think there are a few examples on this site.
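
    A sketch of that arrangement (table and column names are placeholders; the idea is that uniqueness and physical ordering don't have to live on the same columns):

        -- Enforce uniqueness with a non-clustered primary key...
        ALTER TABLE dbo.BigTable
            ADD CONSTRAINT PK_BigTable PRIMARY KEY NONCLUSTERED (RecordId);

        -- ...and cluster on an ever-increasing column such as the insert date,
        -- so new rows append at the end instead of splitting pages.
        CREATE CLUSTERED INDEX CIX_BigTable_EventDate
            ON dbo.BigTable (EventDate);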

    John

  • Thanks again.

    It is a database for telecom optimization, so lots of rows are inserted.

    What happens is that, with the primary key plus unique keys, the program that inserts a tranche of 50 thousand rows takes a long time (more than 5 hours), or sometimes fails with a timeout.

    The queries I use, normal SELECTs with these indexes, work perfectly. After deleting the indexes I can insert rows, but of course the SELECTs are slower.

    To resolve the problem, I need the maximum performance I can get in both cases: when I insert and when I SELECT.

    There have been tests, and you can search for them online, that show that a well-defined clustered index speeds up inserts over and above having a heap table. I would strongly suggest getting a well-configured clustered index on the table, at minimum.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Marcos

    You may wish to try inserting those rows in smaller batches. If your clustered index is not on an ascending key, then inserts will take longer because page splits will be needed. It may be wise to put your clustered index on a different column and have your primary key enforced instead by a non-clustered index. You can use the dynamic management views to find out which of your indexes, if any, are not being used over a period of time. Does your database have auto shrink enabled? This should be turned off.
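
    Sketches for those last two points (the database name is an assumption; indexes with no seeks, scans, or lookups since the last service restart are candidates to investigate):

        -- Indexes that are written to but never read sort to the top here.
        SELECT OBJECT_NAME(i.object_id) AS table_name,
               i.name AS index_name,
               ius.user_seeks, ius.user_scans, ius.user_lookups, ius.user_updates
        FROM sys.indexes AS i
        LEFT JOIN sys.dm_db_index_usage_stats AS ius
          ON ius.object_id = i.object_id
         AND ius.index_id = i.index_id
         AND ius.database_id = DB_ID()
        WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
        ORDER BY ISNULL(ius.user_seeks + ius.user_scans + ius.user_lookups, 0);

        -- Make sure auto shrink is off.
        ALTER DATABASE TelecomDB SET AUTO_SHRINK OFF;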

    You should try anything I (and everybody else) have suggested in a test environment before putting it in production.

    John

  • Hi Guys

    A little late, but here is what I did.

    I finally resolved my problem, and the answer is in this thread, so thank you very much for all your help.

    So, what did I do?

    1) I load the table with a little program written in .NET, using threads to insert quickly. The number of threads I use is between 3 and 5, no more. Each thread basically inserts rows.

    2) I kept my primary key on the table that gave me performance problems.

    3) I redesigned the database: improving the DB design, deleting useless indexes and adding indexes where needed.

    4) At the end of each load (remember, each load adds millions of rows) I reorganize/rebuild the indexes. There are many scripts on the internet that help with this, deciding from each index's state whether it must be rebuilt or reorganized (a rough sketch of the idea follows below).
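
    Something along these lines (a minimal sketch; the 5% and 30% thresholds are the commonly cited rules of thumb, and 'dbo.BigTable' is a placeholder):

        -- Suggest REORGANIZE for light fragmentation, REBUILD for heavy.
        SELECT OBJECT_NAME(ips.object_id) AS table_name,
               i.name AS index_name,
               ips.avg_fragmentation_in_percent,
               CASE WHEN ips.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
                    WHEN ips.avg_fragmentation_in_percent > 5  THEN 'REORGANIZE'
                    ELSE 'OK' END AS suggested_action
        FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.BigTable'),
                                            NULL, NULL, 'LIMITED') AS ips
        JOIN sys.indexes AS i
          ON i.object_id = ips.object_id AND i.index_id = ips.index_id
        WHERE i.index_id > 0;  -- skip the heap entry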

    What I now avoid is exactly what I did at the beginning of the thread:

    a) Deleting all the table's indexes, mainly the primary key

    b) Compressing the table (right-click the table -> Storage -> Compression)

    c) Shrinking the database.

    Just a comment: my client is very happy, so I don't know what to say, just many thanks!

    Marcos
