June 15, 2011 at 9:52 am
Hi everyone. I'm new to this forum so I hope I'm posting this in the correct place.
I have a stored proc that is called many times in a loop and selects chunks of data as it goes until it has selected all records in the table.
Here is the query that gets run in my stored proc via dynamic SQL.
First time called
With RowTable As
(Select row_number()
over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum,
* FROM PHDAPP.UF_EVENT_DATA )
SELECT * FROM RowTable Where RowNum between 1 and 200000
Second time called.
With RowTable As
(Select row_number()
over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum,
* FROM PHDAPP.UF_EVENT_DATA )
SELECT * FROM RowTable Where RowNum between 200001 and 400000
Etc.
Once this stored proc has been called enough times that it's processing RowNums up around 6,800,000, it slows down like crazy.
For the first runs, you can see the time it takes here:
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 1 and 200000
200000
Jun 15 2011 9:30AM
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 200001 and 400000
400000
Jun 15 2011 9:30AM
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 400001 and 600000
600000
Jun 15 2011 9:30AM
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 600001 and 800000
800000
Jun 15 2011 9:30AM
By the time it hits around 6,800,000 and above, the time it takes looks like this:
Jun 15 2011 9:42AM
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 7400001 and 7600000
7600000
Jun 15 2011 9:44AM
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 7600001 and 7800000
7800000
Jun 15 2011 9:45AM
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 7800001 and 8000000
8000000
Jun 15 2011 9:46AM
With RowTable As (Select row_number() over ( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum, * FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 8000001 and 8200000
8200000
It just gets slower and slower and slower.
I've tried to find out what's causing this, and I came across something called parameter sniffing. Sooo, I tried this in my stored proc. It didn't help:
ALTER PROCEDURE [mydb].[MYPROC]
(
@SelectList nvarchar(1000), -- this is a required parameter
@WhereCondition nvarchar(max) = NULL,
@WhereParams nvarchar(500) = NULL,
@JoinTables nvarchar(max) = NULL,
@OrderByExpression nvarchar(250) = NULL,
@ReturnInPages bit = 0,
@StartRowCount nvarchar(10) = 0,
@EndRowCount nvarchar(10) = 50,
@debug bit=0,
@return_code int=0 output
)
with execute as 'myuser'
As
BEGIN
SET NOCOUNT ON
--dbcc freeproccache
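-- Parameter-sniffing workaround: copy every incoming parameter into a local variable and reference only the locals below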
DECLARE @sql nvarchar(max),
@DynSql nvarchar(max),
@OrderBy nvarchar(200),
@TblName nvarchar(40),
@myStartRowCount nvarchar(10)=0,
@myEndRowCount nvarchar(10) = 50,
@myReturnInPages bit = 0,
@mydebug bit = 0,
@MySelectList nvarchar(1000), -- this is a required parameter
@MyWhereCondition nvarchar(max) = NULL,
@MyWhereParams nvarchar(500) = NULL,
@MyJoinTables nvarchar(max) = NULL,
@MyOrderByExpression nvarchar(250) = NULL,
@Optimize nvarchar(500)=NULL,
@Myreturn_code int=0
set @myStartRowCount=@StartRowCount
set @myEndRowCount=@EndRowCount
set @myReturnInPages=@ReturnInPages
set @mydebug=@debug
set @MySelectList=@SelectList -- this is a required parameter
set @MyWhereCondition = @WhereCondition
set @MyWhereParams =@WhereParams
set @MyJoinTables =@JoinTables
set @MyOrderByExpression =@OrderByExpression
set @Myreturn_code =@return_code
etc.
Does anyone know what I can do to get this to perform? I'm pretty much out of ideas. I don't know how to force the exec plan for this to expire and be re-created, so I don't know if that would help, but from what I read, the code above should have done this.
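From what I've read, these are the usual ways to force a fresh plan (just noting them here as a sketch; I haven't confirmed that either one helps in my case):

-- Appended to the final SELECT of the CTE inside the dynamic SQL, so that statement gets a fresh plan every run
SELECT * FROM RowTable WHERE RowNum BETWEEN 1 AND 200000 OPTION (RECOMPILE)

-- Or mark the whole procedure so its plan is thrown away and rebuilt on the next execution
EXEC sp_recompile N'mydb.MYPROC'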
Thanks,
Amber.
June 15, 2011 at 11:05 am
This is SQL Server 2008 64-bit, by the way. Is this in the correct group?
June 15, 2011 at 11:28 am
aracy (6/15/2011)
This is sql server 2008 64 bit by the way. Is this in the correct group?
Nope, but we should be fine anyway...
I've never done paging anywhere near that size, so I'm not sure I'll be able to help.
Any reason why you page at 200,000 rows? It sounds like queue processing. There's no way an end user can read and manually parse through all that!
June 15, 2011 at 11:33 am
Actually, the user can pick the number of records to select. At 100,000 it makes no difference; it still slows down at the same place.
This was originally written in a C# app that calls this stored proc and writes the data out to a compressed, encrypted file.
There are reasons for this. Customer requirement.
The C# app calls this stored proc, but I also wrote the loop that calls the stored proc in SQL Server, to confirm that the problem is with SQL Server and not the C# code.
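Roughly, the SQL-side test loop looks like this (a simplified sketch; the upper limit and parameter values here are just placeholders):

DECLARE @Start int = 1, @End int, @ChunkSize int = 200000, @rc int
WHILE @Start <= 8200000   -- placeholder upper bound
BEGIN
    SET @End = @Start + @ChunkSize - 1
    EXEC mydb.MYPROC
        @SelectList    = N'*',
        @ReturnInPages = 1,
        @StartRowCount = @Start,
        @EndRowCount   = @End,
        @return_code   = @rc OUTPUT
    SET @Start = @End + 1
END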
June 15, 2011 at 11:35 am
The user is not reading these records. They are being archived off and then can be brought into a temp table for the user to view and query on later if they wish.
June 15, 2011 at 11:36 am
Aaaand, I have seen a lot of similar issues posted on the internet regarding this type of problem and 'parameter sniffing'. Wondering if anyone has solved this problem.
June 15, 2011 at 11:47 am
I think it's just as simple as the server has to do more work to find the valid records.
I personally would do the select only once into a permanent reporting table.
I'd use the row_id as the clustered index and then select 200K records that way. That should give you consistent read speed and an overall speed boost.
I can't tell for sure because I don't know those internals, but my guess is that the server needs to read through 8+ million rows before finding the "correct" rows. Compare that to searching through only 1 million, and it's obvious there's more work there.
Hopefully an internals expert can tell you how it really works. In the meantime, I'd try the reporting table trick. I'm sure it'll work.
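Just to sketch what I mean (the table and index names here are only examples):

-- One-time load into a permanent reporting table, numbering the rows as they go in
SELECT ROW_NUMBER() OVER (ORDER BY UF_EVENT_DATA_ID) AS RowNum, *
INTO PHDAPP.UF_EVENT_DATA_EXPORT
FROM PHDAPP.UF_EVENT_DATA

-- Cluster on the row number so each chunk becomes a range seek instead of renumbering the whole table
CREATE UNIQUE CLUSTERED INDEX IX_UF_EVENT_DATA_EXPORT_RowNum
    ON PHDAPP.UF_EVENT_DATA_EXPORT (RowNum)

-- Then each chunk is just
SELECT * FROM PHDAPP.UF_EVENT_DATA_EXPORT WHERE RowNum BETWEEN 6800001 AND 7000000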
June 15, 2011 at 12:50 pm
I have to do it this way. There is no option.
I can, though, change the query, and yeah, I think that may be where the problem lies.
I need to select chunks of data. Maybe there's a better way to write this query. The problem is, RowNum is created on the fly and I can't index it... that I know of.
With RowTable As
(Select row_number() over
( ORDER BY UF_EVENT_DATA.UF_EVENT_DATA_ID ) As RowNum,
* FROM PHDAPP.UF_EVENT_DATA ) SELECT * FROM RowTable Where RowNum between 5000001 and 5200000
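One alternative I'm wondering about: seek on the ID itself instead of the calculated RowNum (just a sketch; it assumes UF_EVENT_DATA_ID is indexed and that I carry the highest ID from the previous chunk forward between calls):

DECLARE @LastId bigint = 0   -- highest UF_EVENT_DATA_ID returned by the previous chunk; type is a guess

SELECT TOP (200000) *
FROM PHDAPP.UF_EVENT_DATA
WHERE UF_EVENT_DATA_ID > @LastId
ORDER BY UF_EVENT_DATA_ID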
June 15, 2011 at 12:53 pm
Why does it have to be a query re-run on the fly every time?
There are ways around most constraints like multi users and parameters.
June 15, 2011 at 12:58 pm
Oh, and I had to do it in chunks because this statement pukes in C# if there are too many records (known problem):
bf.Serialize(mycmp, events1);
If that worked, I would have just dumped everything out that was selected to my file.
June 15, 2011 at 1:01 pm
What are you doing with that file once in C#?
You could use SSIS or BCP out to dump that data into a csv or whatever and be done in 10 sec flat.
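For example, something along these lines dumps the whole table in one shot (the server, database and output path are placeholders, and xp_cmdshell has to be enabled if you want to run it from T-SQL rather than a command prompt):

EXEC master..xp_cmdshell
    'bcp MyDb.PHDAPP.UF_EVENT_DATA out C:\export\uf_event_data.csv -S MyServer -T -c -t,'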
June 15, 2011 at 1:05 pm
Hmmm....
Maybe this will help for anyone who's following this...
June 15, 2011 at 1:07 pm
Ninja's_RGR'us (6/15/2011)
What are you doing with that file once in C#? You could use SSIS or BCP out to dump that data into a csv or whatever and be done in 10 sec flat.
It has to be encrypted and as small in size as possible. Trust me, I've been programming for many years. I know what I'm doing.
June 15, 2011 at 1:08 pm
Ya, but you just said you can't use temp tables...
It takes a heck of a lot of resources to insert X million rows into anything.
SELECT INTO is likely the fastest option there. But then you have to index the WHOLE table and THEN select. I can't see that being much faster than just putting the filter in there.
June 15, 2011 at 1:10 pm
aracy (6/15/2011)
Ninja's_RGR'us (6/15/2011)
What are you doing with that file once in C#? You could use SSIS or BCP out to dump that data into a csv or whatever and be done in 10 sec flat.
It has to be encrypted and as small in size as possible. Trust me, I've been programming for many years. I know what I'm doing.
Obviously you have some room for improvement or you wouldn't be here ;-).
I know that the server can also encrypt and zip the data, but that's obviously beside the point.
I'd still check out bcp out into a single file and then encrypt / zip. That should be the shortest path to completion.