Performance Puzzle

I'm running up against a puzzling performance degradation problem. I'm running a batch load of some 700,000 rows per batch. When the target tables have fewer than about 10 million rows each (about 1GB), I'm writing up to 15,000 rows per second. But after that, performance drops precipitously to 2,000 rows per second and lower. I have indexes on primary key and foreign key fields (some of which are numeric(18,0)), and on a date field, which has a clustered index.

Paging remains low, the cache hit ratio is between 95 and 100 percent, average latch wait time is about 900ms, processor time is consistently below 55%, and disk transfers are quite low.

    Here are the particulars:

    WIN 2K Server

    SQL 2K Standard SP2

    Dual Processor Pentium III 1400s

    4GB memory

    Promise ATA Raid 10 with 4 120GB disks

    Any ideas?

  • Some ideas...

Do you have automatic growth of your DBs enabled? Maybe at the 1GB point, SQL Server starts to grow your database (or logs).

A second possible cause is the indexes. You should check the fill factor, decreasing it if there are a lot of rows with similar dates.
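As a sketch of how to check the growth settings (the database and logical file names below are placeholders):

```sql
-- Inspect current file sizes and growth increments (run in the target DB)
EXEC sp_helpfile

-- Switch a data file from a small MB-based growth to a larger fixed increment
-- ('MyDB' and 'MyDB_Data' are placeholder names)
ALTER DATABASE MyDB
MODIFY FILE (NAME = MyDB_Data, FILEGROWTH = 500MB)
```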

It may even be the transaction log needing to grow, and while it does you get waits. Make sure the file growth increment is large enough to keep delays to a minimum.

    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Thanks for your response! I'm not logging this DB so the log is not a problem, but your thought about dynamic DB growth has me thinking. I do have it set up that way and the increment is 10 MB. I assume that means there's a lot of overhead consumed in adding these extents. Is that what you're alluding to?
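One common mitigation for that overhead (a sketch; the names and size are placeholders) is to pre-grow the data file once to its expected final size rather than paying the 10 MB-increment cost thousands of times:

```sql
-- Grow the data file once, up front, instead of in 10 MB increments
-- ('MyDB' and 'MyDB_Data' are placeholder names)
ALTER DATABASE MyDB
MODIFY FILE (NAME = MyDB_Data, SIZE = 20GB)
```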

I'd first check the file growth, as npeeters and antares686 mentioned. It's also possible your performance problems stem from index page splitting. Is it possible for you to drop the clustered index before the load and recreate it afterwards? (Or reindex with a very low fill factor, though with the amount of data you're inserting that might not be feasible.)
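A sketch of that approach (the table, index, and column names are made up for illustration):

```sql
-- Drop the clustered index before the load...
DROP INDEX Transactions.IX_Transactions_TranDate

-- ...run the batch load, then recreate it with a lower fill factor
CREATE CLUSTERED INDEX IX_Transactions_TranDate
ON Transactions (TranDate)
WITH FILLFACTOR = 90
```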

Ken

Thanks for your reply! It looks like I need to increase the file growth increment substantially to reduce the overhead of extent addition.

Thanks also for your idea about the index fill factor. I've got that set to zero now, but every record in a batch load has the same date (it's a load of daily transactions), so I'm adding about 750,000 transactions with identical date column values. Any recommendations on what fill factor value to start with for experimentation?
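One way to experiment (a sketch; 'Transactions' is a placeholder table name) is to rebuild with an explicit trial fill factor and then measure fragmentation after each daily load:

```sql
-- Rebuild all indexes on the table with a trial fill factor of 90
-- (empty string means "all indexes on this table")
DBCC DBREINDEX ('Transactions', '', 90)

-- After the next batch load, check fragmentation / page-split damage
DBCC SHOWCONTIG ('Transactions')
```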

Thanks, Ken. I'm working on the file growth problem now. I have toyed with the idea of dropping the indexes at load time, then recreating them after the load, but I've been somewhat leery of taking this approach since some rudimentary comparisons of elapsed times for each method seemed to favor keeping the indexes intact. I do realize that these dynamics could change with increasing volumes, however, so I'm keeping that option open.

    Carl

Just FYI regarding the fill factor of 0: this does not really mean 0, but is closer to 100, with some space left. I am copying from BOL below. I have had similar situations in the past and found that fill factors below 70 turned out to be detrimental.

    "A fill factor value of 0 does not mean that pages are 0 percent full. It is treated similarly to a fill factor value of 100 in that SQL Server creates clustered indexes with full data pages and nonclustered indexes with full leaf pages. It is different from 100 in that SQL Server leaves some space within the upper level of the index tree. " ~BOL

I am not sure how you are doing the load, so I am assuming a bulk insert or bcp. You have a number of options with these to speed up the load (see BOL), such as KILOBYTES_PER_BATCH, CHECK_CONSTRAINTS, etc. Also, since you have two processors, you can load from two sources simultaneously, i.e., half of the data in one file and the other half in another.
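For a bulk insert, those options look something like this (the file path, database, and table names are placeholders):

```sql
-- Minimally-logged bulk load with batch-size and locking hints
BULK INSERT StagingDB.dbo.RawTransactions
FROM 'C:\loads\daily.txt'
WITH (
    TABLOCK,                      -- table lock for minimal logging
    KILOBYTES_PER_BATCH = 20000,  -- commit in chunks rather than one huge batch
    CHECK_CONSTRAINTS             -- enforce constraints during the load
)
```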

Thanks - yes, I was aware of this oddity. My perplexity stems from the lack of information on what to expect from a fill factor of, for example, 98 in contrast to a fill factor of 90. I have to assume that there are no benchmark metrics or fill factor estimation tools available and that experimentation is the only route!

    Carl

Thanks! Sorry I wasn't explicit about the load method - I use BULK INSERT to get the raw data into a non-indexed table in another database, but then use INSERTs to populate the target tables, since I have to do quite a bit of data cleansing and manipulation. The bulk inserts are consistently very fast, especially since they're minimally logged and I use TABLOCK.

    Carl
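With that INSERT-based second stage, one common tweak (a sketch; the table and column names are made up) is to move the rows in moderate-sized batches so each transaction, and the lock and log pressure it generates, stays small:

```sql
-- Move rows from staging in 50,000-row chunks instead of one 700,000-row insert
SET ROWCOUNT 50000
WHILE 1 = 1
BEGIN
    INSERT INTO Transactions (TranID, TranDate, Amount)
    SELECT s.TranID, s.TranDate, s.Amount
    FROM Staging.dbo.CleanRows s
    WHERE NOT EXISTS
        (SELECT 1 FROM Transactions t WHERE t.TranID = s.TranID)
    -- stop once the staging table has been drained
    IF @@ROWCOUNT = 0 BREAK
END
SET ROWCOUNT 0
```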
