May 28, 2012 at 7:57 am
Hi Friends,
My question may look silly, but I am just curious to know if there is a way...
Right now, my tables deal with 10 million records. Whenever I execute any T-SQL statement, it takes too much time to return the results.
One of my friends suggested that processing 10,000 records at a time might improve performance, instead of processing all the rows at once.
But I am not sure whether that is really possible or not.
Is it possible to process records like that?
Any suggestions would be really appreciated.
Thanks,
Charmer
May 28, 2012 at 8:01 am
It all depends on what you are doing with the processing, and on whether the whole table needs to be processed at once or the rows can be processed in batches.
If it's only a SELECT, you could add SELECT TOP 10000 or SELECT TOP 10 PERCENT.
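For example, a minimal sketch (the table and column names here are made up for illustration):

-- first 10,000 rows only; the ORDER BY makes TOP deterministic
SELECT TOP (10000) ID, Col1
FROM dbo.MyTable
ORDER BY ID;

-- or roughly a tenth of the table
SELECT TOP (10) PERCENT ID, Col1
FROM dbo.MyTable
ORDER BY ID;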
May 28, 2012 at 8:03 am
You could add ROW_NUMBER() to your query and process WHERE rownumber < 10000, then WHERE rownumber BETWEEN 10001 AND 20000, etc. (The ROW_NUMBER() value has to be aliased in a CTE or derived table before it can be used in a WHERE clause.)
Andy
==========================================================================================================================
A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe
May 28, 2012 at 8:06 am
If I need to process all the records in batches, what do I have to do?
I mean, if I have 50,000 rows, I need to process the first 10,000, then the next 10,000, and so on...
Is that possible, Anthony?
Thanks,
Charmer
May 28, 2012 at 8:09 am
Andy Hyslop (5/28/2012)
You could add ROW_NUMBER() to your query and process WHERE rownumber < 10000, WHERE rownumber BETWEEN 10001 AND 20000, etc.
Andy
Andy, I use cursors... so is it possible to process like that automatically? I mean, can we automate the batching?
Thanks,
Charmer
May 28, 2012 at 8:10 am
Yes, like Andy said, you need to add a ROW_NUMBER() to your select and wrap it all in a CTE. Then have two variables, one as the lower limit and the other as the upper limit, and add them to the WHERE clause:
declare @low int = 1, @high int = 10000;

-- placeholder table/column names: substitute your own
with numbered as
(
    select row_number() over (order by ID) as rownumber,
           col1,
           col2
    from dbo.MyTable
)
select col1, col2
from numbered
where rownumber >= @low and rownumber <= @high;

-- move the window along for the next batch
set @low = @high + 1;
set @high = @high + 10000;
May 28, 2012 at 8:25 am
anthony.green (5/28/2012)
Yes, like Andy said... (quoted code snipped; see the previous post)
I hope this may help me.... :-)
And I am just curious: this will never work if we don't have a primary key, right?
Thanks,
Charmer
May 28, 2012 at 8:28 am
Cursors have their place when you need to do RBAR, but go for set-based execution if you can.
Whether your cursor can be changed to a set-based execution depends on what it is doing.
If you post the cursor definition, we can take a look at it.
May 28, 2012 at 8:45 am
anthony.green (5/28/2012)
Cursors have their place when you need to do RBAR, but go for set-based execution if you can. Whether your cursor can be changed to a set-based execution depends on what it is doing.
If you post the cursor definition, we can take a look at it.
This is my cursor definition:

declare @ID int;
declare @sql varchar(max);

declare id_cursor cursor for
    select distinct ID from table1;

open id_cursor;
fetch next from id_cursor into @ID;

while @@FETCH_STATUS = 0
begin
    set @sql = 'update t1
                set column1 = t2.column1
                from table1 t1
                left join table2 t2 on t1.column1 = t2.column1
                left join table3 t3 on t1.column1 = t3.column1
                join table4 t4 on t1.column1 = t4.column1
                join table5 t5 on t5.column1 = t3.column1
                where t1.ID = ' + cast(@ID as varchar(10));
    exec(@sql);

    fetch next from id_cursor into @ID;
end

close id_cursor;
deallocate id_cursor;
The update statement for each ID touches more or less a million records, so it takes at least an hour to execute...
I need to process it in batches if that reduces the time it takes...
Thanks,
Charmer
May 28, 2012 at 7:26 pm
Charmer (5/28/2012)
(cursor definition and earlier replies quoted in full; snipped, see the posts above)
You certainly don't need a CURSOR to do this.
While I am loath to admit it, I have a couple of cases where processing a large number of records was sped up by introducing a loop and processing them in batches.
However, what I think you should do is SELECT your IDs out into a TABLE VARIABLE, adding a SeqNo using ROW_NUMBER() during that SELECT, and then loop through the SeqNos, avoiding the CURSOR. You can then process multiple IDs at once using IN, and tune the batch size to see how many IDs processed at one time gives the best performance.
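A rough sketch of that approach (table and column names are placeholders; adjust to your schema):

declare @Ids table (SeqNo int primary key, ID int);
declare @BatchSize int = 10;
declare @Seq int = 1, @MaxSeq int;

-- number the distinct IDs once, up front
insert into @Ids (SeqNo, ID)
select row_number() over (order by ID), ID
from (select distinct ID from table1) d;

select @MaxSeq = max(SeqNo) from @Ids;

while @Seq <= @MaxSeq
begin
    -- process @BatchSize IDs per pass using IN
    update t1
    set t1.column1 = t2.column1
    from table1 t1
    join table2 t2 on t1.column1 = t2.column1
    where t1.ID in (select ID from @Ids
                    where SeqNo between @Seq and @Seq + @BatchSize - 1);

    set @Seq = @Seq + @BatchSize;
end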
I also suggest:
- Review your use of Dynamic SQL. I don't see any need for it in this case.
- Are all of those JOINs really necessary? Remove any that are not relevant.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
May 29, 2012 at 2:27 am
dwain.c (5/28/2012)
You certainly don't need a CURSOR to do this... (rest of the quoted post snipped; see the previous post)
I also suggest:
- Review your use of Dynamic SQL. I don't see any need for it in this case.
I did it dynamically because I need to process only certain IDs from the table; otherwise it would process all the IDs...
- Are all of those JOINs really necessary? Remove any that are not relevant
Yes, to my knowledge they are all necessary... but I will take your suggestion and recheck the joins.
Let me try the loop you suggested, and I will let you know the result...
Thank you for the help, Dwain.
Thanks,
Charmer
May 29, 2012 at 4:48 am
Hi Dwain,
I tried the WHILE loop... but it takes three times as long as the cursor....
So what I think is that the cursor is good, at least for this process...
Thanks,
Charmer
May 29, 2012 at 4:58 am
No, a cursor is not likely to be good, and neither is a WHILE loop. What exactly are you trying to do? Iterative processing is typically slower than set-based processing in SQL Server.
If all you're trying to do is that update, then what you probably want is an UPDATE TOP (X) ..., where X is a sensible number, and a way to tell whether a row has been updated or not. Then you just run it in a loop, updating largish numbers of rows at a time.
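A rough sketch of that pattern (assuming the WHERE clause can identify rows that still need updating; names are placeholders):

declare @BatchSize int = 10000;

while 1 = 1
begin
    -- TOP limits each batch; the WHERE clause skips rows already updated
    update top (@BatchSize) t1
    set column1 = t2.column1
    from table1 t1
    join table2 t2 on t1.ID = t2.ID
    where t1.column1 <> t2.column1
       or (t1.column1 is null and t2.column1 is not null);

    if @@ROWCOUNT = 0
        break;   -- nothing left to update
end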
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2012 at 5:29 am
GilaMonster (5/29/2012)
No, a cursor is not likely to be good, and neither is a WHILE loop. What exactly are you trying to do? Iterative processing is typically slower than set-based processing in SQL Server.
If all you're trying to do is that update, then what you probably want is an UPDATE TOP (X) ..., where X is a sensible number, and a way to tell whether a row has been updated or not. Then you just run it in a loop, updating largish numbers of rows at a time.
Yes, I have heard from you SQL gurus that cursors are not all good...
What I am doing is inserting/updating a large number of records based on an ID...
I don't need to process all the records at the same time, since that kills performance (I already tried it)...
So I thought of going batch by batch, taking one ID per execution so that it never kills performance...
So I fetch each ID, do the insert/update using the cursor, then move to the next ID, and so on...
Gail, could you please refer to my previous post? I have put a sample cursor definition there of exactly what I am doing.
Thanks,
Charmer
May 29, 2012 at 5:46 am
I did see your previous post, and I would still recommend you look at something like UPDATE TOP (...) if you can. First though, check your indexes and make sure that you have some useful ones.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability