Limiting the records?

  • Hi Friends,

My question may look silly, but I am just curious to know if there is a way...

Right now, my tables deal with 10 million records... whenever I execute any T-SQL statement, it takes too much time to get the results...

One of my friends suggested that processing 10,000 records at a time might improve performance, instead of processing all the rows at once...

But I am not sure whether that is really possible or not.

Is it possible to process records like that?

Any suggestions would be really appreciated..

    Thanks,
    Charmer

It all depends on what you're doing with the processing, and on whether the whole table needs to be processed at the same time or the rows can be processed in batches.

If it's only a SELECT, you could add SELECT TOP 10000 or SELECT TOP 10 PERCENT.

You could add ROW_NUMBER() to your query and process WHERE rownumber < 10000, then WHERE rownumber BETWEEN 10001 AND 20000, etc.

    Andy

    ==========================================================================================================================
    A computer lets you make more mistakes faster than any invention in human history - with the possible exceptions of handguns and tequila. Mitch Ratcliffe
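Since ROW_NUMBER() cannot be referenced directly in a WHERE clause, the usual pattern is to compute it in a CTE (or derived table) and filter on the alias. A minimal sketch of Andy's suggestion, with hypothetical table and column names (MyTable, SomeKey):

```sql
;WITH Numbered AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY SomeKey) AS rownumber,
           *
    FROM   MyTable
)
SELECT *
FROM   Numbered
WHERE  rownumber BETWEEN 10001 AND 20000;  -- second batch of 10,000 rows
```

The ORDER BY inside the OVER clause must be a stable ordering (ideally a unique key) so that each batch covers a distinct, non-overlapping slice of the table.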

  • If i need to process all the records in batches , what i have to do?

    i mean if i have 50000 rows, i need to process first 10000 and then next 10000 and then so on....

    is it possible, Anthony?

    Thanks,
    Charmer

  • Andy Hyslop (5/28/2012)


You could add ROW_NUMBER() to your query and process WHERE rownumber < 10000, then WHERE rownumber BETWEEN 10001 AND 20000, etc.

    Andy

Andy, I use cursors... so is it possible to process like that with a cursor? I mean, can we automate the process?

    Thanks,
    Charmer

Yes, like Andy said, you need to add a row_number() clause to your select, wrap it all in a CTE, then have two variables, one for the lower limit and one for the upper limit, and add them to the where clause:

declare @low int = 1, @high int = 10000

;with cte as
(
select

row_number() over (order by <your key column>) as rownumber,

col1,

col2

...............

from

.................
)
select *
from cte
where rownumber >= @low and rownumber <= @high

-- then move the window along for the next batch
select @low = @low + 10000, @high = @high + 10000

  • anthony.green (5/28/2012)


Yes, like Andy said, you need to add a row_number() clause to your select, wrap it all in a CTE, then have two variables, one for the lower limit and one for the upper limit, and add them to the where clause:

declare @low int = 1, @high int = 10000

;with cte as
(
select

row_number() over (order by <your key column>) as rownumber,

col1,

col2

...............

from

.................
)
select *
from cte
where rownumber >= @low and rownumber <= @high

-- then move the window along for the next batch
select @low = @low + 10000, @high = @high + 10000

I hope this will help me....:-)

And I am just curious: it will never work if we don't have a primary key, right?

    Thanks,
    Charmer

Cursors have their place when you need to do RBAR, but if you can, go for set-based execution.

Whether your cursor can be changed to a set-based execution depends on what it is doing.

If you post the cursor definition, we could take a look at it.

  • anthony.green (5/28/2012)


Cursors have their place when you need to do RBAR, but if you can, go for set-based execution.

Whether your cursor can be changed to a set-based execution depends on what it is doing.

If you post the cursor definition, we could take a look at it.

This is my cursor definition:

declare @ID int

declare cur cursor for

select distinct ID from table

open cur

fetch next from cur into @ID

while @@fetch_status = 0

begin

exec('update t1 set column = column

from

table1 t1

left join table2 t2 on t1.column = t2.column

left join table3 t3 on t1.column = t3.column

join table4 t4 on t1.column = t4.column

join table5 t5 on t5.column = t3.column

where t1.ID = ' + cast(@ID as varchar(10)))

fetch next from cur into @ID

end

close cur

deallocate cur

The update statement for each ID touches more or less a million records... so it takes at least an hour to execute...

I need to process it in batches if that reduces the time it takes...

    Thanks,
    Charmer

  • Charmer (5/28/2012)


    anthony.green (5/28/2012)


Cursors have their place when you need to do RBAR, but if you can, go for set-based execution.

Whether your cursor can be changed to a set-based execution depends on what it is doing.

If you post the cursor definition, we could take a look at it.

This is my cursor definition:

declare @ID int

declare cur cursor for

select distinct ID from table

open cur

fetch next from cur into @ID

while @@fetch_status = 0

begin

exec('update t1 set column = column

from

table1 t1

left join table2 t2 on t1.column = t2.column

left join table3 t3 on t1.column = t3.column

join table4 t4 on t1.column = t4.column

join table5 t5 on t5.column = t3.column

where t1.ID = ' + cast(@ID as varchar(10)))

fetch next from cur into @ID

end

close cur

deallocate cur

The update statement for each ID touches more or less a million records... so it takes at least an hour to execute...

I need to process it in batches if that reduces the time it takes...

    You certainly don't need a CURSOR to do this.

While I am loath to admit it, I have a couple of cases where processing a large number of records was sped up by introducing a loop and processing them in batches. However, what I think you should do is SELECT your IDs out into a TABLE VARIABLE, add a SeqNo using ROW_NUMBER() during that SELECT, and then loop through the SeqNos, avoiding the CURSOR. You can then process multiple IDs at once using IN, and use this to tune the query and see how many IDs processed at one time achieves the best performance.

    I also suggest:

    - Review your use of Dynamic SQL. I don't see any need for it in this case.

    - Are all of those JOINs really necessary? Remove any that are not relevant.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
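Dwain's suggestion above can be sketched as follows. This is a minimal, hypothetical example (the table names dbo.table1/dbo.table2 and column SomeColumn are placeholders, not the poster's real schema): the table variable holds the distinct IDs numbered with a SeqNo, and the loop processes @BatchSize IDs per pass using IN.

```sql
DECLARE @IDs TABLE (SeqNo int, ID int);

-- Number the distinct IDs once, up front
INSERT INTO @IDs (SeqNo, ID)
SELECT ROW_NUMBER() OVER (ORDER BY ID), ID
FROM   (SELECT DISTINCT ID FROM dbo.table1) d;

DECLARE @BatchSize int = 10,   -- how many IDs to process per pass; tune this
        @SeqNo     int = 1,
        @MaxSeqNo  int = (SELECT MAX(SeqNo) FROM @IDs);

WHILE @SeqNo <= @MaxSeqNo
BEGIN
    UPDATE t1
    SET    t1.SomeColumn = t2.SomeColumn        -- hypothetical column names
    FROM   dbo.table1 t1
    JOIN   dbo.table2 t2 ON t1.SomeColumn = t2.SomeColumn
    WHERE  t1.ID IN (SELECT ID FROM @IDs
                     WHERE SeqNo BETWEEN @SeqNo AND @SeqNo + @BatchSize - 1);

    SET @SeqNo = @SeqNo + @BatchSize;
END
```

@BatchSize is the tuning knob Dwain describes: try different values and measure which number of IDs per pass gives the best overall runtime.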

  • dwain.c (5/28/2012)


    Charmer (5/28/2012)


    anthony.green (5/28/2012)


    You certainly don't need a CURSOR to do this.

    While I am loathe to admit it, I have a couple of cases where processing a large number of records was sped up by introducing a loop and processing them in batches. However, what I think you should do is SELECT your @IDs out into a TABLE VARIABLE, add a SeqNo using ROW_NUMBER() [during that SELECT] and then loop through the SeqNos, avoiding the CURSOR. You can then process multiple IDs at once using IN and use this to tune the query to see how many IDs processed at one time can achieve the best performance results.

    I also suggest:

    - Review your use of Dynamic SQL. I don't see any need for it in this case.

I used dynamic SQL because I need to process only certain IDs from a table... otherwise it processes all the IDs..

    - Are all of those JOINs really necessary? Remove any that are not relevant

Yes, they are necessary to my knowledge... but I will take your suggestion and recheck the joins.

Let me try the loop that you suggested and will let you know the result...

    Thank you Dwain, for the help

    Thanks,
    Charmer

  • Hi Dwain,

I tried a WHILE loop... but it is taking three times as long as the cursor....

So what I think is, the cursor is good, at least for this process...

    Thanks,
    Charmer

No, a cursor is not likely to be good, and neither is a WHILE loop. What exactly are you trying to do? Iterative processing is typically slower than set-based processing in SQL Server.

If all you're trying to do is that update, then what you probably want to do is an UPDATE TOP (x)..., where x is a sensible number, and have a way to tell whether a row has been updated or not. Then you just run in a loop, updating largish numbers of rows at a time.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
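Gail's UPDATE TOP (x) pattern can be sketched like this. It is a hypothetical example: it assumes a flag column (Processed) exists on the table to tell whether a row has already been updated, and the table and column names (dbo.table1, dbo.table2, SomeColumn) are placeholders, not the poster's real schema.

```sql
DECLARE @BatchSize int = 10000;   -- rows per batch; a "sensible number" to tune

WHILE 1 = 1
BEGIN
    UPDATE TOP (@BatchSize) t1
    SET    t1.SomeColumn = t2.SomeColumn,   -- hypothetical column names
           t1.Processed  = 1                -- mark the row so it is not picked again
    FROM   dbo.table1 t1
    JOIN   dbo.table2 t2 ON t1.SomeColumn = t2.SomeColumn
    WHERE  t1.Processed = 0;

    IF @@ROWCOUNT < @BatchSize
        BREAK;   -- fewer rows than a full batch means we just did the last one
END
```

Keeping each batch modest keeps the transaction log per statement small and lets other sessions get at the table between batches, which is usually why this beats one giant UPDATE.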
  • GilaMonster (5/29/2012)


No, a cursor is not likely to be good, and neither is a WHILE loop. What exactly are you trying to do? Iterative processing is typically slower than set-based processing in SQL Server.

If all you're trying to do is that update, then what you probably want to do is an UPDATE TOP (x)..., where x is a sensible number, and have a way to tell whether a row has been updated or not. Then you just run in a loop, updating largish numbers of rows at a time.

Yes, I have heard that cursors are not all good from you SQL gurus...

What I am doing is inserting/updating a large number of records based on an ID....

I can't process all the records at the same time since it kills the performance (I already tried it)....

So I thought of going batch by batch: I take one ID per execution so that it never kills the performance...

So I fetch each ID, do the insert/update using the cursor, then move to the next ID, and so on...

Gail, could you please refer to my previous post? I have put a sample of my cursor definition there showing exactly what I am doing.

    Thanks,
    Charmer

I did see your previous post, and I would still recommend you look at something like UPDATE TOP (x) if you can. First, though, check your indexes and make sure that you have some useful ones.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 28 total)
