Insert...or UpdateCursor..Which is faster?

  • since I have so much time on my hands while this monster crunches data....

    I am still working to see if I can find a set based solution...

    enter the correlated sub query....

    Is it possible to have a correlated nested subquery???

    Make sense? I'm confused just writing the question...

    I need to reference a.DateFilled...inside the nested query for table y...I flagged the place I need to do it...

    I'm not sure it's even possible...But I thought I'd at least pose the question.

    example query

    =======================================

    SELECT a.DateFilled, *
    FROM ClaimsData a
    INNER JOIN
    (
        -- joining the transaction table to itself...
        -- to get the latest transaction record...for the current claim record (table a)
        SELECT x.*
        FROM TransactionTable x
        INNER JOIN (
            SELECT ClientID, GroupID, MAX(timeStamp) AS timeStamp
            FROM TransactionTable
            WHERE timeStamp < a.DateFilled  -- <<== this is the outer reference I need
            GROUP BY ClientID, GroupID
        ) y
            ON  x.ClientID = y.ClientID
            AND x.GroupID = y.GroupID
            AND x.timeStamp = y.timeStamp
    ) b
        ON b.timeStamp = a.DateFilled
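
    One thing I might try instead (untested sketch): OUTER APPLY (SQL Server 2005+) is allowed to reference the outer row, unlike a derived table in a JOIN. The linking columns on ClaimsData (ClientID, GroupNumber) are assumptions on my part:

    ```sql
    -- Untested sketch: APPLY can see the outer row (a), so the "latest
    -- transaction before the fill date" can be picked with TOP 1
    SELECT a.DateFilled, t.*
    FROM ClaimsData a
    OUTER APPLY (
        SELECT TOP 1 x.*
        FROM TransactionTable x
        WHERE x.ClientID = a.ClientID      -- assumed linking columns
          AND x.GroupID  = a.GroupNumber
          AND x.timeStamp < a.DateFilled
        ORDER BY x.timeStamp DESC          -- latest qualifying transaction
    ) t
    ```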

  • Well...you CAN, meaning it's a query that is "legal" and will get past the parser...I'm just not sure you should.

    Inequality joins (otherwise known by some on this board as "triangular joins") like that tend to create a REALLY large temporary set of data to look at. Considering I happen to know that you're working with 18 million rows... it looks to me like that would KILL your server, since it would need to process something to the tune of (18M)^2, or 324 trillion rows (give or take a few million). Let's just say I'd think your TempDB will be about the size of Texas before you're done (if it ever finishes)... Even if you manage to put in some better linking to cut that number down, it's STILL going to turn into something huge....

    You may care to read through a few techniques outlined over here. Should be a lot less hazardous, IMO.

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/
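
    Just to make the point concrete, here's a sketch (hypothetical table and column names) of the shape that bites you:

    ```sql
    -- Every row pairs with every earlier row, so an N-row table produces
    -- roughly N*N/2 intermediate rows before any GROUP BY can shrink them
    SELECT a.SomeID, COUNT(*) AS rows_touched
    FROM dbo.SomeTable a
    INNER JOIN dbo.SomeTable b
        ON b.ts < a.ts                     -- the inequality ("triangular") join
    GROUP BY a.SomeID
    ```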

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • thanks again.

    I'm trying to see if I can get the correct transaction record without a nested query...then a standard correlated subquery will work...but I know there has to be a better way.

    Thanks for the link to the article... I'll do some more homework!

  • Keep in mind what I said in the conclusion of the article on triangular joins...

    Not all Triangular Joins are bad. With some restraint and the right criteria, Triangular Joins can be used for some pretty remarkable things... make finite schedules... do high speed dupe checks... etc. But you've really got to be careful. Improperly written Triangular Joins are worse than even Cursors or While Loops and can bring a CPU and Disk System right to its knees.

    It looks to me like this particular triangular join may be correctly constrained.
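
    For example (illustrative names only - TransactionID is an assumed key column), a dupe check with a properly constrained triangular join might look something like this:

    ```sql
    -- The equality predicates keep each row paired with only a handful of
    -- others; only the final inequality is actually triangular
    SELECT a.ClientID, a.GroupID, a.timeStamp
    FROM TransactionTable a
    INNER JOIN TransactionTable b
        ON  b.ClientID  = a.ClientID
        AND b.GroupID   = a.GroupID
        AND b.timeStamp = a.timeStamp
        AND b.TransactionID < a.TransactionID  -- assumed key column
    ```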

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (4/17/2008)


    Keep in mind what I said in the conclusion of the article on triangular joins...

    Not all Triangular Joins are bad. With some restraint and the right criteria, Triangular Joins can be used for some pretty remarkable things... make finite schedules... do high speed dupe checks... etc. But you've really got to be careful. Improperly written Triangular Joins are worse than even Cursors or While Loops and can bring a CPU and Disk System right to its knees.

    It looks to me like this particular triangular join may be correctly constrained.

    Perhaps - but we're still self-joining 18M to 18M, aren't we? So - you're right - my earlier cardinality estimate was high, but we're still likely looking at several billion rows in the work table, even with the grouping and the relations in place...

    Just bringing up yet another option for using the running totals logic....


  • I was able to rewrite my joins...so they didn't have to be nested....so the correlated sub query is working much faster.....

    Thank you again to everyone!!!!

  • Highwayman (4/15/2008)


    isn't that the sad truth....

    damn them once for creating the whole mess...

    damn me for not jumping ship as soon as I realized what a steaming pile I jumped head first into....

    My days are numbered...

    Can you show your UPDATE statement? I have one method that will replace the cursor concept.

  • Highwayman (4/17/2008)


    I was able to rewrite my joins...so they didn't have to be nested....so the correlated sub query is working much faster.....

    Thank you again to everyone!!!!

    Would you mind posting your solution? Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Please post your solution. It would be a nice reference for a lot of us. :)

  • Here is the SQL for my solution...I'm sure there are ways to further optimize it...but it's getting things done...10 times faster than my earlier attempts...

    The key lessons learned were that breaking my 18m rows into smaller chunks (i.e. by cycledate) and processing them sped things up..

    The second thing was finally figuring out how to make this thing set based..instead of having to do my joins 18m times. now I only have to do it 200 times...

    Hope this rat's nest can help somebody else. If anyone wants or needs further explanation...I'm always glad to discuss. 🙂

    /*
    Below is the select statement that I am using inside a loop...
    that loop is pulling 2 weeks' worth of claims for each iteration...
    */
    SELECT
        -- existing claims data
        a.*
        -- CoverageStrategyID...it's a PK...will be added to the ClaimsData table as an FK
        , CSID = bh.Id
        -- this is just the actual coverage code that the CSID represents
        , CoverageStrategyCode = COALESCE(NULLIF(m.coveragestrategyid, ''), g.coveragestrategyid)
    FROM ClaimsData a
    /*
    The first 2 left joins below were the statements that replaced the joining
    of the transaction tables on themselves...
    They allow me to return the appropriate coverage code for the claim record
    at the time of service.
    */
    LEFT JOIN (
        SELECT maxtxts = MAX(c.txts), c.clientid, c.groupid, c.coveragestrategyid
        FROM tpagrouptx2 c
        GROUP BY c.clientid, c.groupid, c.coveragestrategyid
    ) g
        ON  a.clientid = g.clientid
        AND a.groupnumber = g.groupid
    LEFT JOIN (
        SELECT mtx.clientid, mtx.groupid, mtx.ssnid, mtx.personcode, mtx.coveragestrategyid, mtx.txts
        FROM tpamembertx2 mtx
        GROUP BY mtx.clientid, mtx.groupid, mtx.ssnid, mtx.personcode, mtx.coveragestrategyid, mtx.txts
    ) m
        ON  m.clientid = a.clientid
        AND m.groupid = a.groupnumber
        AND m.ssnid = a.ssnid
        AND m.personcode = a.personcode
        AND m.txts < a.datefilledfull
    /*
    This is a join to a table that contains my historical coverage codes,
    so I can return the correct CoverageCodeID...this is the field that I am
    adding to the 18M row table.
    The current table doesn't contain a term date, so I had to derive it based
    on the next record's effective date...I just did this once and dumped it
    into the temp table.
    */
    LEFT JOIN (
        SELECT a.id, a.CoverageCode, a.DateLastUpdated, b.TermDate
        FROM ##BenefitHistory a
        JOIN ##BenefitHistoryTerm b ON a.id = b.id
    ) bh
        ON  bh.CoverageCode = COALESCE(NULLIF(m.coveragestrategyid, ''), g.coveragestrategyid)
        AND a.DateFilledFull BETWEEN bh.DateLastUpdated AND bh.TermDate
    WHERE g.maxtxts <= a.datefilledfull
      AND m.txts <= a.datefilledfull
      AND a.financialcutoffdate = @CutOff -- @CutOff is the current cycle date for the loop
                                          -- that this whole thing is being executed in.
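
    For completeness, the driver loop is shaped roughly like this (sketch only - the real code inserts the results into a work table):

    ```sql
    -- Rough sketch of the cycle-date loop; one pass per financial cutoff date
    DECLARE @CutOff datetime

    SELECT @CutOff = MIN(financialcutoffdate) FROM ClaimsData

    WHILE @CutOff IS NOT NULL
    BEGIN
        -- run the SELECT above for just this cycle's claims
        -- (a.financialcutoffdate = @CutOff), inserting into the work table

        SELECT @CutOff = MIN(financialcutoffdate)
        FROM ClaimsData
        WHERE financialcutoffdate > @CutOff
    END
    ```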

  • Thanks ........:)
