June 23, 2004 at 11:02 pm
I have the following:
INSERT INTO target
SELECT top 500 *
FROM source
executes in < 1 second!!
INSERT INTO target
SELECT top 501 *
FROM source
executes in 140 seconds
What is the deal?
June 24, 2004 at 1:02 am
Sure you dont have an Order By on the second query ?
June 24, 2004 at 8:07 am
There is no ORDER BY. Both queries are the same query. In the application we are moving data from a staging area into the live area after appropriately scrubbing the staging area. We then simply INSERT into the live table, selecting from the stage tables. Most of the time the insert runs very fast, but sometimes it runs like molasses. This is a test system in a lab environment with well-defined test data. No one is accessing any of the tables, not even with a query. I am profiling the execution to determine which SQL statements need to be tuned before this goes into production.

The only thing different about the cases that ran like molasses was the volume of data being inserted in a single SQL statement. So I looked at the execution plan created by SQL Server. In the fast case, the execution plan was predictably simple. In the slow case, SQL Server created four spools and then read the data back from the spools, merging them together.

So I simply put a TOP in the subquery of the slow insert, limiting it to the same number of rows that were inserted in the fast query. Voila! It ran fast (but of course I didn't get all my data). I then played around with the TOP number and discovered that 500 was a "magic" number. Up to 500 rows in the subquery, SQL Server produced a reasonable plan. If SQL Server estimated that the subquery would return more than 500 rows, it used the spooling plan. The second plan would be appropriate if the subquery returned 500,000 rows, but not for 501 rows. It is overkill, as shown by the performance going from 1 second to 140 seconds.
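One workaround consistent with the observation above (not something the poster states he did) would be to keep each INSERT under the 500-row threshold by batching it in a loop. A rough sketch, where the table and key names (stage_tbl, live_tbl, id) are hypothetical stand-ins:

```sql
-- Sketch: move rows in batches small enough to get the simple plan.
-- stage_tbl, live_tbl and the key column id are assumed names.
DECLARE @rows INT
SET @rows = 1

WHILE @rows > 0
BEGIN
    INSERT INTO live_tbl
    SELECT TOP 500 s.*
    FROM stage_tbl s
    WHERE NOT EXISTS (SELECT 1 FROM live_tbl l WHERE l.id = s.id)

    -- @@ROWCOUNT is 0 once every staged row has been copied
    SET @rows = @@ROWCOUNT
END
```

The NOT EXISTS filter is only there so each pass picks up rows not yet copied; with a real staging workflow one might instead delete each batch from the stage table as it is moved.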
June 24, 2004 at 9:01 pm
Use ROWCOUNT and see what that gives you.
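For reference, SET ROWCOUNT is a session-level setting that caps the rows affected by subsequent statements, so the suggestion would look roughly like this (table names hypothetical):

```sql
SET ROWCOUNT 500        -- cap following statements at 500 rows

INSERT INTO live_tbl    -- assumed table names, for illustration only
SELECT * FROM stage_tbl

SET ROWCOUNT 0          -- reset; 0 means "no limit"
```

Forgetting the final SET ROWCOUNT 0 silently limits every later statement in the session, which is one reason TOP is often preferred.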
Derrick Leggett
Mean Old DBA
When life gives you a lemon, fire the DBA.
June 25, 2004 at 3:23 am
I've also seen such "magic numbers". On different systems and different tables the number was different each time.
>>Use ROWCOUNT
ROWCOUNT is not recommended in BOL, is it?
TOP looks preferable.
June 25, 2004 at 4:43 pm
Also, if you have a clustered index on the table being inserted into, the data is stored in clustered index order. Try using an ORDER BY clause to order the selected data into the clustered index order; it should reduce the amount of reorganisation going on. Or better still: drop the clustered index, insert the data with an ORDER BY clause, then recreate the clustered index.
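Peter's drop/insert/recreate sequence might be sketched as follows; the index, table and column names are all hypothetical:

```sql
-- Assumed names: live_tbl, key column id, clustered index IX_live_id.
DROP INDEX live_tbl.IX_live_id

INSERT INTO live_tbl
SELECT *
FROM stage_tbl
ORDER BY id             -- feed rows in the future clustered key order

CREATE CLUSTERED INDEX IX_live_id ON live_tbl (id)
```

The trade-off: rebuilding the clustered index afterwards is a single bulk sort rather than many page splits during the insert, but any nonclustered indexes on the table are also rebuilt when the clustered index is recreated.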
Peter
June 25, 2004 at 5:29 pm
Along those lines, can you post the DDL of the staging and live tables? It may be as simple as the data being shuffled in memory for the inserts, but I would like to see what you are working with. Also, if you can, post the SHOWPLAN_TEXT version of the execution plans and a bit of sample data as an example of what is being moved. These will help us see what is potentially happening. Note, though: please alter any sensitive data before posting.
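To capture the plan text without actually running the insert, the query can be wrapped in SET SHOWPLAN_TEXT, for example:

```sql
SET SHOWPLAN_TEXT ON
GO
-- The statement is compiled but not executed; its plan is returned as text.
INSERT INTO target
SELECT TOP 501 *
FROM source
GO
SET SHOWPLAN_TEXT OFF
GO
```

The resulting rows of plan text are what would be useful to paste into the thread.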