August 31, 2004 at 10:24 am
I have a stored procedure, currently written using dynamic SQL, that selects the TOP X rows, where X depends on user input.
I would like to rewrite the SP without the dynamic SQL.
Replacing the X with an input parameter does not work.
I have tried SETting ROWCOUNT to X. This works, but is MUCH slower.
Any hints would be appreciated.
Joe
August 31, 2004 at 12:02 pm
I'm not sure if a cursor would run any faster than the SET ROWCOUNT, but it is another option. Create a temp table to hold the results (@Table, not #Table to avoid logging), and FETCH results until end of cursor or until X results obtained.
Same performance disclaimer as above especially if actual rows might be much greater than X: Create a temp table with an identity column and just select all results into it. Then select * from the temp table where identity column <= X. If you know X will never be greater than Y, but that there may be a lot more than Y results available you can use a TOP Y in the SELECT INTO when populating the temp table to improve performance a bit. If variable based tables (@Table) don't support Identity columns (I don't know without looking) you can either use a #Temp table or a second pass with an UPDATE statement to apply sequential numbering. Since that gets ugly (needs a subselect on itself) I'd just go with the #Temp table.
I can't seem to come up with an alternative to Dynamic SQL or ROWCOUNT that doesn't involve a temp table at all.
August 31, 2004 at 1:33 pm
Aaron,
I'll try the cursor, but I'm guessing you're right about the speed.
Definitely can't select all of the results. MANY orders of magnitude more than is likely to be requested. I WILL try the idea of selecting some number much greater than what the user is likely to choose, like 200, and process that using the temp table/identity.
This may turn out to be not worth it. I just don't like dynamic SQL.
August 31, 2004 at 2:54 pm
Without knowing why the ROWCOUNT apparently performs as poorly as it apparently does, you might try a TOP 200 in conjuction with the ROWCOUNT. Just a stab in the dark.
Also, the reason might simply be because you aren't running in Dynamic SQL. If the Dynamic SQL version was prepared using constants in the where clause instead of variables then statistics could cause a different access path to be chosen (since Dynamic SQL reassesses access paths every time). To confirm this you could run it both ways and get an execution plan for each and see what differs. If the plans are the same then there is apparently some sort of overhead with ROWCOUNT limiting (I've never used it yet and don't know) a result set. If the plans differ then the performance difference is likely due to the constants embedded into the dynamic SQL which means that my other two suggestions will perform as poorly as the ROWCOUNT method. In that case you would need to look into restructuring the SQL, changing indexes, or possibly providing index hints (non-ideal).
Good luck.
August 31, 2004 at 4:30 pm
If the ROWCOUNT xxx is outside the dynamic sql...
Put it inside the dynamic Sql.
You would gain performance by avoiding Dynamic Sql if possible.
/rockmoose
You must unlearn what You have learnt
August 31, 2004 at 4:36 pm
Yeah, the whole point of the question was how to do it without the dynamic SQL.
I converted it to non-dynamic, selecting everything, expecting ROWCOUNT to limit the result. It worked but was very slow.
August 31, 2004 at 5:52 pm
Ok, No Idea.
Don't see why SET ROWCOUNT would make it much slower compared to TOP.
You could try to hint:
OPTION( FAST 200 ) in the select , see if it makes any difference.
/rockmoose
You must unlearn what You have learnt
August 31, 2004 at 5:54 pm
Worth a try. Thanks.
September 1, 2004 at 1:09 pm
Another approach which will ONLY be practical if the maximum value of [X] is small and your query is relatively short:
Use a case statement with MAX(X) or MAX(X + 1) discrete cases (assuming you want the "TOP 0" case in there also).
It would be a long proc, but should be quick to implement with cut & paste, and should run fast enough... <grin>
- john
September 1, 2004 at 1:25 pm
John,
Good thought. Impractical in this case, though. I currently have NO idea what the users are going to select, so I can't code myself into that kind of corner (already done it often enough in this project!).
I'll keep the technique in mind, though. Thanks.
September 1, 2004 at 4:48 pm
I've run into exactly this problem. Setting rowcount performed to slowly....look at the estimated execution plan for a hint as to why this is (BOL: "The setting of SET ROWCOUNT is set at execute or run time and not at parse time."):
use pubs
set rowcount 1
select * from authors
set rowcount 0
select * from authors
So, to get around this I created a view that encompassed my query. Then, using dynamic SQL, I call the view as "Select top X * from View". This means the joins, etc in the Query are "compiled", while only the calling of the view is dynamic and doesn't re-use the execution plan.
Hope this helps.
Signature is NULL
September 1, 2004 at 5:02 pm
Now THAT'S nice.
There are other variables besides the TOP X, such as start/end date/time, organizational units, etc. The underlying table has about 7 million rows, and is updated daily, from about 25 sources, at irregular intervals.
Does the view still sound like a viable option? I have little experience with them.
Thanks.
September 1, 2004 at 5:21 pm
Sure...a view works well for these things. It's a good way to encapsulate some complicated join syntax; with the added benefit of "compiling" the execution plan.
Now, the where clause CAN easily change the execution plan. With 7 million records every column in your where clause should be indexed; that will help more than anything. Still, this certainly doesn't make the view less viable.
cl
Signature is NULL
September 1, 2004 at 5:49 pm
That's good news, Calvin.
I'll try it in the next build of the product.
September 5, 2004 at 7:27 am
Hi,
I am doing it like this:
select top 10 <columns list> from <table> where <condition> and <unique key> not in ( select top x <unique key> from <table> where <condition> )
When x is 0 you will get top 10 rows
When x is 10 you will get next top 10 rows
and so on.......
Note: the <condition> in both querys must be same.
Regards,
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply