Changing from INSERT INTO...SELECT (with a UNION) to SELECT INTO

  • My objective is to turn this long running query that is embedded into a stored proc into a slightly faster query by using a minimal logging strategy, i.e. SELECT INTO. It's fairly straight-forward with one little twist. It is actually a UNION of two queries. To get around this, I added a:

    SELECT <various fields>

    INTO #tmp

    FROM

    and then put the original query inside parentheses with a trailing context variable as you can see below.

    This works perfectly fine in a query window (in about 20 seconds) and I'm seeing a significant performance gain as I expected (roughly two minutes before the change). The problem is that when I put this EXACT same query back inside the stored procedure I cut it out of, it never returns anything and appears to be in some sort of infinite loop.

    There is more stuff before and after this particular query and it will generate the query plans for all of the queries prior to this but the query below is the next query and when I cancel execution after 10 minutes or so, it does not generate a query plan. I also embedded a "Print" statement just before the query and I can see in SQL Profiler that the PRINT statement has executed but that is the last statement that shows up in the trace.

    As you can see below, I commented out the original INSERT INTO. The stuff that is in bold is what I added in. Let me emphasize again. This query runs successfully as it is right now. I just can't figure out why it won't work inside a stored procedure.

    Does anyone have any thoughts?

    --INSERT INTO #tmpGlDtl

    SELECT DISTINCT

    ChgObjId, -- = smsdss.GLEntFlat.ChgObjId,

    TransCommonIdObjId, -- = GLEnt.TransCommonIdObjId,

    -- BlPerObjId = ChgRptV.BlPerObjId,--ETL_Change

    SrceRefNo, -- = GLEnt.SrceRefNo,

    RcvOwnerObjId, -- = GLEnt.RcvOwnerObjId,

    GLIntfcName, -- = GLCr.GLIntfcName,

    GLBatchId, -- = GLCr.GLBatchId,

    GLEntObjId, -- = GLEnt.ObjId,

    GLIntfcJrnlEntObjId, -- = GLCr.GLIntfcJrnlEntObjId,

    GLIntfcDate, -- = GLCr.GLIntfcDTime,

    CommonFiscYrText, -- = GLCr.CommonFiscYrText,

    CommonFiscPerMne, -- = GLCr.CommonFiscPerMne,

    FiscDayDate, -- = GLCr.FiscDayDate,

    FiscPerStpDate, -- = GLCr.FiscPerStpDate,

    GLAcctId, -- = GLCr.GLAcctId,

    TransTypeCd, -- = GLEnt.TransTypeCd ,

    Qty, -- = smsdss.GLEntFlat.Qty,

    StrDate, -- = GLEnt.StrDate,

    CrAmt, -- = GLCr.CrAmt,

    DbAmt, -- = 0,

    DebitAmt, -- = 0,

    CreditAmt, -- = ISNULL(GLEnt.Amt,0),

    DebitQty, -- = 0,

    CreditQty, -- = ISNULL(GLEnt.Amt,0),

    EncProvShortName, -- = smsdss.GlEntFlat.EncProvShortName,

    SvcProvShortName, -- = smsdss.GlEntFlat.SvcProvHpoShortName,

    RcvOwner, -- = GLCr.RcvOwnerShortName,

    PyrShortName, -- = smsdss.GLEntFLat.PyrShortName,

    PyrRptGrpMne, -- = smsdss.GLEntFLat.PyrRptGrpMne,

    HlthPlanName, -- = smsdss.GLEntFLat.HlthPlanName,

    HlthPlanRptGrpMne, -- = smsdss.GLEntFLat.HlthPlanRptGrpMne,

    HlthPlanProdLineMne, -- = smsdss.GLEntFLat.HlthPlanProdLineMne,

    SrceRefTypeCd, -- = GLEnt.SrceRefTypeCd,

    SrcTransTypeCd, -- = GLEnt.SrcTransTypeCd,

    BlPerStartDate, -- = BlPerRpt.BlPerStrDate,

    BlPerStopDate, -- = BlPerRpt.BlPerStpDate,

    PtName -- = PersRpt.LegalRptName

    INTO #tmpGlDtl

    FROM

    (SELECT DISTINCT

    ChgObjId = smsdss.GLEntFlat.ChgObjId,

    TransCommonIdObjId = GLEnt.TransCommonIdObjId,

    -- BlPerObjId = ChgRptV.BlPerObjId,

    SrceRefNo = GLEnt.SrceRefNo,

    RcvOwnerObjId = GLEnt.RcvOwnerObjId,

    GLIntfcName = GLCr.GLIntfcName,

    GLBatchId = GLCr.GLBatchId,

    GLEntObjId = GLEnt.ObjId,

    GLIntfcJrnlEntObjId = GLCr.GLIntfcJrnlEntObjId,

    GLIntfcDate = GLCr.GLIntfcDTime,

    CommonFiscYrText = GLCr.CommonFiscYrText,

    CommonFiscPerMne = GLCr.CommonFiscPerMne,

    FiscDayDate = GLCr.FiscDayDate,

    FiscPerStpDate = GLCr.FiscPerStpDate,

    GLAcctId = GLCr.GLAcctId,

    TransTypeCd = GLEnt.TransTypeCd ,

    Qty = smsdss.GLEntFlat.Qty,

    StrDate = GLEnt.StrDate,

    CrAmt = GLCr.CrAmt,

    DbAmt = 0,

    DebitAmt = 0,

    CreditAmt = ISNULL(GLEnt.Amt,0),

    DebitQty = 0,

    CreditQty = ISNULL(GLEnt.Amt,0),

    EncProvShortName = smsdss.GlEntFlat.EncProvShortName,

    SvcProvShortName = smsdss.GlEntFlat.SvcProvHpoShortName,

    RcvOwner = GLCr.RcvOwnerShortName,

    PyrShortName = smsdss.GLEntFLat.PyrShortName,

    PyrRptGrpMne = smsdss.GLEntFLat.PyrRptGrpMne,

    HlthPlanName = smsdss.GLEntFLat.HlthPlanName,

    HlthPlanRptGrpMne = smsdss.GLEntFLat.HlthPlanRptGrpMne,

    HlthPlanProdLineMne = smsdss.GLEntFLat.HlthPlanProdLineMne,

    SrceRefTypeCd = GLEnt.SrceRefTypeCd,

    SrcTransTypeCd = GLEnt.SrcTransTypeCd,

    BlPerStartDate = BlPerRpt.BlPerStrDate,

    BlPerStopDate = BlPerRpt.BlPerStpDate,

    PtName = PersRpt.LegalRptName

    FROM GLEnt GLEnt

    JOIN smsdss.GLJrnlEntRptV GLCr

    ON GLEnt.GLIntJrnEntCrObjId = GLCr.GLIntfcJrnlEntObjId

    LEFT JOIN smsdss.GLEntFlat

    ON GLEnt.ObjId = smsdss.GLEntFlat.GLEntObjId

    JOIN RcvOwner

    ON GLEnt.RcvOwnerObjId = RcvOwner.ObjId

    JOIN smsdss.HPOMstrV HPOMstrV

    ON HPOMstrV.RcvOwnerHPOObjId = RcvOwner.HPOObjId

    INNER JOIN CustomReports.dbo.user_facility sfuser

    ON HPOMstrV.HPOId = sfuser.facility_id

    LEFT OUTER JOIN CustomReports.dbo.bers_bexd_dl_extract bers

    ON sfuser.facility_id = bers.be_id

    left join smsdss.FintranLineItemRpt F on F.glentobjid = GLEnt.objid

    left join smsdss.ChgRpt C on C.glentobjid = GLEnt.objid

    LEFT JOIN smsdss.BlPerRpt

    ON smsdss.BlPerRpt.BlPerObjId = isnull(C.BlPerObjId, F.blperobjid)

    LEFT JOIN smsdss.EncRpt EncRpt

    ON EncRpt.EncObjId = isnull(C.EncObjId, F.PrimEncObjId)

    LEFT JOIN smsdss.PersRptV PersRpt

    ON EncRpt.PtObjId = PersRpt.Ptobjid

    WHERE

    ((@IntfcType=2 AND GLEnt.SrcTransTypeCd = 2)

    OR

    (@IntfcType <> 2 AND GLEnt.SrcTransTypeCd <> 2)

    )

    AND (@GLIntfcName IS NULL

    OR

    GLCr.GLIntfcName IN (@GLIntfcName))

    AND (@GLAcctId IS NULL

    OR

    GLCr.GLAcctId IN (@GLAcctId))

    AND (

    --(@DateType_GLIntfc = 1 AND GLCr.GLIntfcDTime >= @DateFrom AND GLCr.GLIntfcDTime <= @DateTo)

    --OR

    -- (@DateType_GLIntfc <> 1 AND GLCr.FiscDayDate >= @DateFrom AND GLCr.FiscDayDate <= @DateTo)

    -- )

    (@DateType_GLIntfc = 1 AND GLCr.GLIntfcDTime BETWEEN @DateFrom AND @DateTo)

    OR

    (@DateType_GLIntfc <> 1 AND GLCr.FiscDayDate BETWEEN @DateFrom AND @DateTo)

    )

    AND ( @FiscPerMne IS NULL

    OR

    (GLCr.CommonFiscPerMne=@FiscPerMne AND @FiscPerMne <> '99')

    OR

    (GLCr.CommonFiscPerMne=(SELECT TOP 1 FiscYrPerDim.CommonFiscPerDesc

    FROM smsdss.FiscYrPerDim FiscYrPerDim

    WHERE FiscYrPerDim.CurrFiscPerInd = 1) AND @FiscPerMne='99')

    )

    AND ( @FiscYr IS NULL

    OR

    (GLCr.CommonFiscYrText IN (@FiscYr) AND @FiscYr <> '99')

    OR

    (GLCr.CommonFiscYrText=(SELECT TOP 1 FiscYrPerDim.CommonFiscYr

    FROM smsdss.FiscYrPerDim FiscYrPerDim

    WHERE FiscYrPerDim.CurrFiscYrInd = 1) and @FiscYr = '99')

    )

    AND (@RcvOwner IS NULL

    OR

    GLCr.RcvOwnerShortName IN (@RcvOwner))

    and sfuser.user_id = @sfusername

    UNION

    SELECT DISTINCT

    ChgObjId = smsdss.GLEntFlat.ChgObjId,

    TransCommonIdObjId = GLEnt.TransCommonIdObjId,

    -- BlPerObjId = ChgRptV.BlPerObjId,--ETL_Change

    SrceRefNo = GLEnt.SrceRefNo,

    RcvOwnerObjId = GLEnt.RcvOwnerObjId,

    GLIntfcName = GLDb.GLIntfcName,

    GLBatchId = GLDb.GLBatchId,

    GLEntObjId = GLEnt.ObjId,

    GLIntfcJrnlEntObjId = GLDb.GLIntfcJrnlEntObjId,

    GLIntfcDate = GLDb.GLIntfcDTime,

    CommonFiscYrText = GLDb.CommonFiscYrText,

    CommonFiscPerMne = GLDb.CommonFiscPerMne,

    FiscDayDate = GLDb.FiscDayDate,

    FiscPerStpDate = GLDb.FiscPerStpDate,

    GLAcctId = GLDb.GLAcctId,

    TransTypeCd = GLEnt.TransTypeCd ,

    Qty = smsdss.GLEntFlat.Qty,

    StrDate = GLEnt.StrDate,

    CrAmt = 0,

    DbAmt = GLDb.DebitAmt,

    DebitAmt = ISNULL(GLEnt.Amt,0),

    CreditAmt = 0,

    DebitQty = ISNULL(GLEnt.Amt,0),

    CreditQty = 0,

    EncProvShortName = smsdss.GlEntFlat.EncProvShortName,

    SvcProvShortName = smsdss.GlEntFlat.SvcProvHPOShortName,

    RcvOwner = GLDb.RcvOwnerShortName,

    PyrShortName = smsdss.GLEntFLat.PyrShortName,

    PyrRptGrpMne = smsdss.GLEntFLat.PyrRptGrpMne,

    HlthPlanName = smsdss.GLEntFLat.HlthPlanName,

    HlthPlanRptGrpMne = smsdss.GLEntFLat.HlthPlanRptGrpMne,

    HlthPlanProdLineMne = smsdss.GLEntFLat.HlthPlanProdLineMne,

    SrceRefTypeCd = GLEnt.SrceRefTypeCd,

    SrcTransTypeCd = GLEnt.SrcTransTypeCd,

    BlPerStartDate = BlPerRpt.BlPerStrDate,

    BlPerStopDate = BlPerRpt.BlPerStpDate,

    PtName = PersRpt.LegalRptName

    FROM GLEnt GLEnt

    JOIN smsdss.GLJrnlEntRptV GLDb

    ON GLEnt.GLIntJrnEntDbObjId = GLDb.GLIntfcJrnlEntObjId

    LEFT JOIN smsdss.GLEntFlat

    ON GLEnt.ObjId = smsdss.GLEntFlat.GLEntObjId

    JOIN RcvOwner

    ON GLEnt.RcvOwnerObjId = RcvOwner.ObjId

    JOIN smsdss.HPOMstrV HPOMstrV

    ON HPOMstrV.RcvOwnerHPOObjId = RcvOwner.HPOObjId

    INNER JOIN CustomReports.dbo.user_facility sfuser

    ON HPOMstrV.HPOId = sfuser.facility_id

    LEFT OUTER JOIN CustomReports.dbo.bers_bexd_dl_extract bers

    ON sfuser.facility_id = bers.be_id

    left join smsdss.FintranLineItemRpt F on F.glentobjid = GLEnt.objid

    left join smsdss.ChgRpt C on C.glentobjid = GLEnt.objid

    LEFT JOIN smsdss.BlPerRpt

    ON smsdss.BlPerRpt.BlPerObjId = isnull(C.BlPerObjId, F.blperobjid)

    LEFT JOIN smsdss.EncRpt EncRpt

    ON EncRpt.EncObjId = isnull(C.EncObjId, F.PrimEncObjId)

    LEFT JOIN smsdss.PersRptV PersRpt

    ON EncRpt.PtObjId = PersRpt.Ptobjid

    /*End: */

    WHERE

    ((@IntfcType=2 AND GLEnt.SrcTransTypeCd = 2)

    OR

    (@IntfcType <> 2 AND GLEnt.SrcTransTypeCd <> 2)

    )

    AND (@GLIntfcName IS NULL

    OR

    GLDb.GLIntfcName IN (@GLIntfcName))

    AND (@GLAcctId IS NULL

    OR

    GLDb.GLAcctId IN (@GLAcctId))

    AND (

    --(@DateType_GLIntfc = 1 AND GLDb.GLIntfcDTime >= @DateFrom AND GLDb.GLIntfcDTime <= @DateTo)

    --OR

    -- (@DateType_GLIntfc <> 1 AND GLDb.FiscDayDate >= @DateFrom AND GLDb.FiscDayDate <= @DateTo)

    -- )

    (@DateType_GLIntfc = 1 AND GLDb.GLIntfcDTime BETWEEN @DateFrom AND @DateTo)

    OR

    (@DateType_GLIntfc <> 1 AND GLDb.FiscDayDate BETWEEN @DateFrom AND @DateTo)

    )

    AND ( @FiscPerMne IS NULL

    OR

    (GLDb.CommonFiscPerMne=@FiscPerMne AND @FiscPerMne <> '99')

    OR

    (GLDb.CommonFiscPerMne=(SELECT TOP 1 FiscYrPerDim.CommonFiscPerDesc

    FROM smsdss.FiscYrPerDim FiscYrPerDim

    WHERE FiscYrPerDim.CurrFiscPerInd = 1) AND @FiscPerMne='99')

    )

    AND ( @FiscYr IS NULL

    OR

    (GLDb.CommonFiscYrText IN (@FiscYr) AND @FiscYr <> '99')

    OR

    (GLDb.CommonFiscYrText=(SELECT TOP 1 FiscYrPerDim.CommonFiscYr

    FROM smsdss.FiscYrPerDim FiscYrPerDim

    WHERE FiscYrPerDim.CurrFiscYrInd = 1) and @FiscYr = '99')

    )

    AND (@RcvOwner IS NULL

    OR

    GLDb.RcvOwnerShortName IN (@RcvOwner))

    AND sfuser.user_id = @sfusername ) UN

    OPTION (MAXDOP 1)

    "Beliefs" get in the way of learning.

  • ====edit====

    where is my jump to conclusions mat? i read like the first few words and made an assumption, sorry...

    looking at your post in detail now...now i see it's why the procs not working, not how to do insert into.

    doh!!!

    the performance problem probably has to do with the OR AND ANDS;

    the fix is to change all those into a CTE which unions all the possibilities;

    I'll try reformating it and psoting my syntactically correct results.

    ---previous irrelevant post for reference:

    it's ridiculously easy, just unfamiliar territory.

    the INTO TABLENAME goes before the FIRST "FROM" statement in the query, so it only appears once in your big union statement.

    SELECT top 3 *

    INTO #TMP

    FROM sys.objects

    where type='V'

    UNION ALL

    select top 52 * FROM sys.objects

    where type='U'

    UNION ALL

    select top 7 * FROM sys.objects

    where type='P'

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • I appreciate your help but my goal is to find out why it works in one venue, i.e. a query window and not in a stored proc. Also, I'm not sure that a CTE would be all that helpful since CTE's are not minimally logged. When I run the query within the stored proc the spid is burning cpu cycles like mad but doing zero I/O. It's almost like the optimizer has gone into an infinite loop.

    "Beliefs" get in the way of learning.

  • I tried to understand where the difference between the two "UNION'ed" statements is, but I couldn't figure it out.

    First I thought GLCr and GLDb were alias for different tables, but the table is the same (smsdss.GLJrnlEntRptV), and so are the join conditions, as far as I can tell.

    The person that came up with the table names and the alias definition did a "great job" to secure the data by obfuscating table names...

    Did you try to run both queris separately and see if there is any difference? And, if so, what is the difference based on?

    Another issue is the usage of DISTINCT: both subqueries have the DISTINCT clause together with a DISTINCT at the outer SELECT and the UNION operator in between.

    When using the UNION operator instead of UNION ALL the data will be grouped already so all three DISTINCT's can be removed. Instead of two sorts and a merge join we'd have a concatenation and a sort (if the two queries are needed at all...).



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • I agree, the query doesn't make much sense to me either but I didn't write it. I'm just trying to find some ways to optimize the performance without changing the business rules. You brought up some good points about the redundant DISTINCT's. I'll try tweaking it from that angle and I'll let you know how it turns out.

    "Beliefs" get in the way of learning.

  • Agreed. SELECT DISTINCT usually indicates that more than 1 of the exact same row was created. That usually means the join criteria was NOT sufficient or the design of the data simply won't allow for this query without a many-to-many (think partial cartesian product here) join forming which a SELECT DISTINCT is used inappropriately to try to fix.

    Take a look at the execution plan... are there any arrows that are bigger in actual rowcount than the two original tables from a join??? If so, that's your problem and you may have to split the query to save an interim answer in a temp table and then join to the temp table instead. I've made many an 8 to 40 minute query run in about 3 seconds that way. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff,

    Thanks for taking a look. I just want to clarify that the query isn't really the issue. I didn't write it and I don't know the business rules behind it. My goal is to try and improve performance without re-wrtiting it.

    In this case, I'm simply trying to switch from a fully logged "INSERT INTO..SELECT" to a minimally logged "SELECT INTO...". For some reason, the change to SELECT INTO works fine when run in a query window but runs indefinitely, with no I/O, when embedded in a stored proc.

    I did remove the DISTINCT's because they were redundant due to the UNION clause and because I'd hoped it might make my problem go away but it didn't.

    By the way, the first query returns CREDIT GL entries, the second part after the UNION statement returns DEBIT GL entries.

    "Beliefs" get in the way of learning.

  • Jeff Moden (9/14/2010)


    Agreed. SELECT DISTINCT usually indicates that more than 1 of the exact same row was created. That usually means the join criteria was NOT sufficient or the design of the data simply won't allow for this query without a many-to-many (think partial cartesian product here) join forming which a SELECT DISTINCT is used inappropriately to try to fix.

    Take a look at the execution plan... are there any arrows that are bigger in actual rowcount than the two original tables from a join??? If so, that's your problem and you may have to split the query to save an interim answer in a temp table and then join to the temp table instead. I've made many an 8 to 40 minute query run in about 3 seconds that way. 🙂

    I don't think it's possible to get any redundant rows when using UNION (instead of UNION ALL). At least I couldn't create a sample to prove it....



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • LutzM (9/14/2010)


    Jeff Moden (9/14/2010)


    Agreed. SELECT DISTINCT usually indicates that more than 1 of the exact same row was created. That usually means the join criteria was NOT sufficient or the design of the data simply won't allow for this query without a many-to-many (think partial cartesian product here) join forming which a SELECT DISTINCT is used inappropriately to try to fix.

    Take a look at the execution plan... are there any arrows that are bigger in actual rowcount than the two original tables from a join??? If so, that's your problem and you may have to split the query to save an interim answer in a temp table and then join to the temp table instead. I've made many an 8 to 40 minute query run in about 3 seconds that way. 🙂

    I don't think it's possible to get any redundant rows when using UNION (instead of UNION ALL). At least I couldn't create a sample to prove it....

    True enough... it's the same as DISTINCT and using UNION instead of UNION ALL is likely for the same reason as using DISTINCT. It overcomes the problem of having a many-to-many join return duplicated rows but does nothing to prevent their original generation which... takes time... lots of time. The arrows behind the scene will indicate if it's the same problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I doubt this has anything to do with the way you put data into the temp table. It is all about TWO MASSIVE JOIN queries UNIONed together using the horrible IS NULL OR filtering logic.

    Switch to using dynamic sql (guard against sql injection here) and you will get a proper plan for all given inputs. I have gotten 4-5 orders of magnitude improvement from such changes to queries such as yours. There are other ways to handle this. Gail Shaw has some good posts on this.

    BTW, when you did this in your query window, did you use variables or did you hard code values?

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (9/15/2010)


    I doubt this has anything to do with the way you put data into the temp table. It is all about TWO MASSIVE JOIN queries UNIONed together using the horrible IS NULL OR filtering logic.

    Switch to using dynamic sql (guard against sql injection here) and you will get a proper plan for all given inputs. I have gotten 4-5 orders of magnitude improvement from such changes to queries such as yours. There are other ways to handle this. Gail Shaw has some good posts on this.

    BTW, when you did this in your query window, did you use variables or did you hard code values?

    I used variables. The hard-coded values you see were already present in the code. The query I posted is exactly the code I ran in the query window. (I didn't bother to post the code that set the variables.) The variables are declared and assigned in the proc, they are not passed in as parameters so I think that rules out the parameter sniffing scenario although I'm certainly willing to be corrected if my thinking is incorrect or out-of-date.

    Ironically, I pitched changing it to dynamic SQL yesterday morning but they weren't willing to risk it. This code is a third-party product and any changes must be approved by the vendor. Also, any changes to the code must go through an extensive QA process particularly in light of the financial nature of the product. I had a hard enough time convincing them that changing to minimally logged didn't actually affect the query business logic. It does provide a pretty significant performance gain as running the query went from two minutes to roughly 25 seconds just with that change so it's pretty frustrating that it won't run in the proc.

    This is just the tip of the performance iceberg as the client is rapidly rolling this financial system out across the country and the volume of data coming into the data warehouse is increasing exponentially. Queries that used to process a hundred thousand rows in an "adequate" amount of time are now processing 10 million rows and, not surprisingly, the performance of queries like this is degrading rapidly. It's only going to get worse as they've deployed to less than 25% of their sites.

    Right now, I'm fighting a holding action to keep things moving while recommending architectural changes to support scalability and identifying additional bottlenecks.

    "Beliefs" get in the way of learning.

  • I think the best I can recommend at this point is to get someone in to help those 3rd party folks improving their software asap. If there is such a massive QA procedure (which is probably appropriate), you should get some folks in better sooner than later. Think of it as an activity against "A prophet has no honor in his own country."

    Once the site faces a traffic that cannot be managed in an acceptable amount of time it'll be too late to get someone in, make him familiar with the process, develop some improvements and push it through the QA process. By the end of that cycle you're probably faced with a number of unhappy customers...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • You may see an improvement by adding the OPTION (RECOMPILE) query hint to the problem query - this will cause the optimizer to recompile the statement with knowledge of the values of the parameters.

    No guarantees that it will help in this case, but definitely worth a try if you are still suffering performance problems. I'd recommend that you try it out in a test environment first.

    Chris

  • LutzM (9/15/2010)


    I think the best I can recommend at this point is to get someone in to help those 3rd party folks improving their software asap. If there is such a massive QA procedure (which is probably appropriate), you should get some folks in better sooner than later. Think of it as an activity against "A prophet has no honor in his own country."

    Once the site faces a traffic that cannot be managed in an acceptable amount of time it'll be too late to get someone in, make him familiar with the process, develop some improvements and push it through the QA process. By the end of that cycle you're probably faced with a number of unhappy customers...

    I agree with Lutz... first, make sure the software has been loaded correctly and that the hardware meets the requirements specified by the 3rd party and that your company hasn't exceeded any documented limitations. Then call the 3rd party sofware house and tell them this is the perfect opportunity for them to not only make their software more scalable, but to also build up an incredible reputation for helping their customers. 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I certainly understand the sentiment behind those thoughts and I'm certainly onboard but it really isn't a feasible solution to my dilemma. The turn around time for something like that would probably be close to a year assuming we can even get it on their priority list. We are one of the only customers they have who see these kinds of data volumes. While you might think that would provide leverage, it doesn't really. Having worked for vendors before, their mindset is that they'd rather fix bugs that will make 90% of their customers happy. They also know that switching to a competitors product isn't financially feasible so that also reduces motivation. (It would literally cost millions and millions of dollars to change to a competitive product.) Essentially, we are a captive audience.

    So, I guess what I'm hearing is that no one has any insight into why a query works fine as a stand-alone query and doesn't work at all when embedded in a proc when the only change is that it now uses SELECT INTO instead of INSERT INTO ... SELECT.

    Bummer. I'll seek help elsewhere. If I get some help or if I figure it out, I'll post back here to let everyone know. My next step is to figure out what it is actually doing when it is off cranking cpu cycles but doing no I/O. I'll see if I can capture a trace and some perfmon counters and correlate the two to gain some insight.

    "Beliefs" get in the way of learning.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply