Help with a Long running stored procedure

  • Guys, there is a stored procedure at the place I've just started working that used to take an hour, but for the last week it has been taking 3 hours or more!

    It's obviously legacy stuff and could be written much better - don't be shy with your opinions on how - but the point is that it was running fine until recently...

    The locking profile looks the same as in dev, as does the query plan, where nearly 60% of the time is taken by the index spool / eager spool. I've recompiled the sproc, freed the procedure cache, re-indexed and updated statistics. The server was rebooted last night and the issue still remains.

    Perhaps it's worth forcing MAXDOP 1, or perhaps it's parameter sniffing... all of which could be feasible explanations, but how does it just flip from 1 hour to 3 hours overnight?
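
    For reference, the maintenance I've tried so far amounts to roughly the following (a sketch only; the procedure name is redacted here just as it is below, and the index/statistics commands were repeated for the tables involved):

    -- Roughly the maintenance already tried (object names redacted/illustrative)
    EXEC sp_recompile N'dbo.[Reporting.Report.--------]';    -- force a fresh plan for the proc
    DBCC FREEPROCCACHE;                                      -- flush the plan cache
    ALTER INDEX ALL ON dbo.PowerPnlAgg REBUILD;              -- re-index (repeated per table)
    UPDATE STATISTICS dbo.PowerPnlAgg WITH FULLSCAN;         -- update statistics (repeated per table)
    -- and the MAXDOP idea would just mean appending OPTION (MAXDOP 1) to the big INSERT...SELECT

    The full procedure is below: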

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    create procedure [dbo].[Reporting.Report.--------] (

    @tradeId int = null

    , @valuationDate datetime = null

    , @TradeVolumeStartBoundary datetime = '1900-01-01'

    , @TradeVolumeEndBoundary datetime = '2100-01-01'

    )

    as

    begin

    if @tradeID is not null

    delete from [PowerPnlAgg] where TradeID = @tradeID and EFACalendar = 0

    if (@valuationDate is null)

    select @valuationDate = ValuationDate from SystemControlAggregate where TableName = 'PowerPnlAgg'

    declare @systemControlId int

    exec [Reporting.System.ReportingProcedureStart] 'CreatePowerAgg', @valuationDate, @systemControlId output

    print '1: ' + convert(nvarchar, getdate(), 121)

    insert into [PowerPnlAgg]

    select @valuationDate

    ,Trade.PKID

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,null

    ,Cast(convert(varchar(50), dbo.[GMT.Calendar.AggConvertDateTime]([TradeVolume].[Date], Location.Efa), 110) as datetime) as DeliveryDate

    ,case when Location.Efa <> 1 then

    null

    else

    Cast(convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) as datetime)

    end as DeliveryDate_EFA

    ,null as DelieryDate_Year

    ,case when Location.Efa <> 1 then

    null

    else

    (select EfaYear from EfaCalendarDates where convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) >= EfaCalendarDates.StartDate and convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) < EfaCalendarDates.Enddate)

    end as DeliveryDate_EFAYear

    , null as DeliveryDate_Month

    ,case when Location.Efa <> 1 then

    null

    else

    (select EfaMonth from EfaCalendarDates where convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) >= EfaCalendarDates.StartDate and convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) < EfaCalendarDates.Enddate)

    end as DeliveryDate_EFAMonth

    ,null as DeliveryDate_Day

    ,GMA.dbo.[GMT.Calendar.IsPeak](dbo.[GMT.Calendar.AggConvertDateTime]([TradeVolume].[Date], Location.Efa), Location.CountryId) as [PEAK]

    ,null as [PEAKOFFPEAK]

    ,sum(case when TradeVolume.ContractVolume > 0 then abs(TradeVolume.ContractVolume) else 0 end) as BuyVolume

    ,sum(case when TradeVolume.ContractVolume < 0 then abs(TradeVolume.ContractVolume) else 0 end) as SellVolume

    ,case when sum(case when TradeVolume.ContractVolume > 0 then abs(TradeVolume.ContractVolume) else 0 end) <> 0

    then

    sum(case when [Profile].UserDefined = 1 then TradeVolume.Price else Trade.Price end *

    case when TradeVolume.ContractVolume > 0 then abs(TradeVolume.ContractVolume) else 0 end)

    / sum(case when TradeVolume.ContractVolume > 0 then abs(TradeVolume.ContractVolume) else 0 end)

    else 0

    end

    as BuyPrice

    ,case when sum(case when TradeVolume.ContractVolume < 0 then abs(TradeVolume.ContractVolume) else 0 end) <> 0

    then

    sum(case when [Profile].UserDefined = 1 then TradeVolume.Price else Trade.Price end *

    case when TradeVolume.ContractVolume < 0 then abs(TradeVolume.ContractVolume) else 0 end)

    / sum(case when TradeVolume.ContractVolume < 0 then abs(TradeVolume.ContractVolume) else 0 end)

    else 0

    end

    as SellPrice

    ,case when sum(case when TradeVolume.ContractVolume > 0 then abs(TradeVolume.ContractVolume) else 0 end) <> 0

    then

    sum(FWDCURVE.[Value] *

    case when TradeVolume.ContractVolume > 0 then abs(TradeVolume.ContractVolume) else 0 end)

    / sum(case when TradeVolume.ContractVolume > 0 then abs(TradeVolume.ContractVolume) else 0 end)

    else 0

    end

    as BuyMarketPrice

    ,case when sum(case when TradeVolume.ContractVolume < 0 then abs(TradeVolume.ContractVolume) else 0 end) <> 0

    then

    sum(FWDCURVE.[Value] *

    case when TradeVolume.ContractVolume < 0 then abs(TradeVolume.ContractVolume) else 0 end)

    / sum(case when TradeVolume.ContractVolume < 0 then abs(TradeVolume.ContractVolume) else 0 end)

    else 0

    end

    as SellMarketPrice

    ,null as PeakMWh

    ,null as OffPeakMWh

    ,0 as Hours

    ,0 --- Timezone indicator -- 0 for UK

    ,0 --- EFA indicator --- 0 means NON EFA

    from GMA.dbo.Trade Trade

    inner join GMA.dbo.TradeVolume TradeVolume with(nolock)

    on Trade.PKID = TradeVolume.Trade

    inner join GMA.dbo.ForwardCurve FC with(nolock)

    on Trade.ForwardCurve = FC.[PKID]

    inner join GMA.dbo.[Profile] with(nolock) on

    Trade.[Profile]= [Profile].PKID

    inner join [GMA].[dbo].Location with(nolock) on

    Trade.Location = Location.PKID

    left outer join

    (select ForwardCurve.PKID

    ,cdv.Date Date

    ,cdv.DaylightSaving

    ,cdv.Value [Value]

    from GMA.dbo.ForwardCurve

    inner join GMA.dbo.CurveIssue on

    ForwardCurve.Curve = CurveIssue.Curve

    inner join GMA.dbo.CurveData cdv on

    CurveIssue.PKID = cdv.CurveIssue

    where CurveIssue.Date = @valuationDate

    ) FWDCURVE on

    Trade.ForwardCurve = FWDCURVE.PKID and

    TradeVolume.[Date] = FWDCURVE.[Date] and

    TradeVolume.[DaylightSaving] = FWDCURVE.[DaylightSaving]

    where Trade.Deleted = 0

    and (Trade.Buyer = 1 or Trade.Seller = 1)

    and Trade.commodity = 2

    and (@tradeId is null or Trade.pkid = @tradeId)

    and TradeVolume.[Date] > @TradeVolumeStartBoundary

    and TradeVolume.[Date] < @TradeVolumeEndBoundary

    group by Trade.pkid

    ,location.efa

    ,convert(varchar(50), dbo.[GMT.Calendar.AggConvertDateTime]([TradeVolume].[Date], Location.Efa), 110)

    ,GMA.dbo.[GMT.Calendar.IsPeak](dbo.[GMT.Calendar.AggConvertDateTime]([TradeVolume].[Date], Location.Efa), Location.CountryId)

    ,convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110)

    print '2: ' + convert(nvarchar, getdate(), 121)

    update [PowerPnlAgg]

    set [TradeDate] = t.Date

    ,CommodityID = commodity.pkid

    ,Commodity = commodity.Name

    ,LocationID = location.pkid

    ,Location = Location.name

    ,CountryId = location.CountryId

    ,BookID = book.pkid

    ,Book = book.name

    ,BrokerID = broker.pkid

    ,Broker = broker.name

    ,TraderID = trader.pkid

    ,Trader = trader.name

    ,CounterpartyID = Counterparty.[PKID]

    ,Counterparty = Counterparty.[Name]

    ,CurrencyID = currency.pkid

    ,Currency = Currency.[Name]

    ,UnitID = unit.pkid

    ,Unit = Unit.[Name]

    ,CommodityTypeID = t.CommodityTypeId

    ,CommodityType = CommodityType.[Name]

    from gma.dbo.Trade t

    inner join [GMA].[dbo].Commodity with(nolock) on

    t.Commodity = Commodity.PKID

    inner join [GMA].[dbo].ProductType with(nolock) on

    t.ProductType = ProductType.PKID

    inner join [GMA].[dbo].Location with(nolock) on

    t.Location = Location.PKID

    inner join [GMA].[dbo].Book with(nolock) on

    t.Book = Book.PKID

    inner join [GMA].[dbo].Counterparty with(nolock) on

    case when t.Buyer = 1 then Seller else Buyer end = Counterparty.PKID

    inner join [GMA].[dbo].Trader with(nolock) on

    t.Trader = Trader.PKID

    inner join [GMA].[dbo].Broker with(nolock) on

    t.Broker = Broker.PKID

    inner join [GMA].[dbo].Currency with(nolock) on

    t.Currency = Currency.PKID

    inner join [GMA].[dbo].Unit with(nolock) on

    t.Unit = Unit.PKID

    inner join [GMA].MasterData.CommodityType with(nolock) on

    t.CommodityTypeID = CommodityType.CommodityTypeID

    where tradeid = t.pkid

    and (@tradeId is null or tradeId = @tradeId)

    and EFACalendar = 0

    print '3: ' + convert(nvarchar, getdate(), 121)

    update [PowerPnlAgg]

    set DeliveryDate_Year = year(DeliveryDate)

    , DeliveryDate_Month = month(DeliveryDate)

    , DeliveryDate_Day = day(DeliveryDate)

    , [PEAKOFFPEAK] = case when Peak = 1 then 'PEAK' else 'OFFPEAK' end

    where EFACalendar = 0

    and (@tradeId is null or TradeId = @tradeId)

    print '4: ' + convert(nvarchar, getdate(), 121)

    exec [Reporting.System.ReportingProcedureStop] @systemControlId

    end

  • Please see this article on how to post performance problems.

    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    It will take the guesswork out of it...



    Clear Sky SQL
    My Blog

  • See if you can narrow down which query is the problem. It may just be one of them, or it may be that proc that's called at the bottom. The more you can narrow down exactly where the problem is, the more likely it is that someone can help you.
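
    One way to narrow it down, for example, is to look at the per-statement figures in the plan cache after a slow run (a rough sketch; it assumes the SQL 2005+ DMVs are available and that the plan is still cached):

    -- Per-statement elapsed time and reads for the procedure, from the plan cache
    SELECT  SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
                (CASE WHEN qs.statement_end_offset = -1
                      THEN DATALENGTH(st.text)
                      ELSE qs.statement_end_offset
                 END - qs.statement_start_offset) / 2 + 1)   AS statement_text,
            qs.execution_count,
            qs.total_elapsed_time / 1000                     AS total_elapsed_ms,
            qs.total_logical_reads
    FROM    sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
    WHERE   st.objectid = OBJECT_ID('dbo.[Reporting.Report.--------]')   -- proc name as redacted in the post
    ORDER BY qs.total_elapsed_time DESC;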

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks all, it's the scalar operators in the insert:

    ,Cast(convert(varchar(50), dbo.[GMT.Calendar.AggConvertDateTime]([TradeVolume].[Date], Location.Efa), 110) as datetime) as DeliveryDate

    ,case when Location.Efa <> 1 then

    null

    else

    Cast(convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) as datetime)

    end as DeliveryDate_EFA

    ,null as DelieryDate_Year

    ,case when Location.Efa <> 1 then

    null

    else

    (select EfaYear from EfaCalendarDates where convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) >= EfaCalendarDates.StartDate and convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) < EfaCalendarDates.Enddate)

    end as DeliveryDate_EFAYear

    , null as DeliveryDate_Month

    ,case when Location.Efa <> 1 then

    null

    else

    (select EfaMonth from EfaCalendarDates where convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) >= EfaCalendarDates.StartDate and convert(varchar(50), dateadd(hh, 1, [TradeVolume].[Date]), 110) < EfaCalendarDates.Enddate)

    end as DeliveryDate_EFAMonth

    ,null as DeliveryDate_Day

    ,GMA.dbo.[GMT.Calendar.IsPeak](dbo.[GMT.Calendar.AggConvertDateTime]([TradeVolume].[Date], Location.Efa), Location.CountryId) as [PEAK]

    Please see the SQL plan attached.

  • At a glance (Gail has the right of it, please post what she asked for), these 2 may be causing significant problems.

    SELECT EfaYear

    FROM EfaCalendarDates

    WHERE CONVERT(varchar(50), DATEADD(hh, 1, [TradeVolume].[Date]), 110) >= EfaCalendarDates.StartDate

    AND CONVERT(varchar(50), DATEADD(hh, 1, [TradeVolume].[Date]), 110) < EfaCalendarDates.Enddate

    These are correlated subqueries. In very simple queries, or if the subquery is simple enough, the optimizer will often convert these to inner joins and they won't cause performance problems. However, my guess is that with a query this "complex", it is failing to do so. I'd look at whether you could replace these with derived tables or a temp table instead. Also, can these subqueries ever return more than one row?
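
    As a very rough sketch of the derived table idea (assuming EfaCalendarDates holds one row per period with StartDate/Enddate, EfaYear and EfaMonth, and that those dates carry no time portion so the varchar/style-110 conversion can be dropped), the two lookups could become a single extra join in the main query:

    -- Sketch only: one left join instead of two correlated subqueries
    left outer join EfaCalendarDates EfaCal
        on  dateadd(hh, 1, TradeVolume.[Date]) >= EfaCal.StartDate
        and dateadd(hh, 1, TradeVolume.[Date]) <  EfaCal.Enddate

    -- ...and then in the select list (and the GROUP BY, since the query aggregates):
    --     case when Location.Efa <> 1 then null else EfaCal.EfaYear  end as DeliveryDate_EFAYear
    --     case when Location.Efa <> 1 then null else EfaCal.EfaMonth end as DeliveryDate_EFAMonth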

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • Oh yeah, I didn't even notice those were functions. That's likely major issue #2. Please post the code for the functions as well. If they're simple enough, pulling the logic out of them and into the query can yield performance gains of hundreds of percent.

    I had one just last week where removing a scalar function from a WHERE clause (not my doing) and putting the logic inline took a query from 3 minutes down to 1 second.
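
    To illustrate the kind of change I mean (purely hypothetical table, column and function names - not your code):

    -- Before: a hypothetical scalar UDF wrapped around a column, evaluated row by row
    select OrderId
    from   dbo.SomeTable
    where  dbo.IsBusinessHour(SomeDate) = 1;

    -- After: the same logic written inline as ordinary predicates the optimizer can see
    select OrderId
    from   dbo.SomeTable
    where  datepart(hh, SomeDate) >= 7
      and  datepart(hh, SomeDate) <  19
      and  datepart(dw, SomeDate) not in (1, 7);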

    Seth Phelabaum


    Consistency is only a virtue if you're not a screwup. 😉

    Links: How to Post Sample Data :: Running Totals :: Tally Table :: Cross Tabs/Pivots :: String Concatenation

  • Can you narrow it down further?

    I'm guessing that it's the code within the UDFs that is taking the time.

    Try rewriting them using proper set-based logic.



    Clear Sky SQL
    My Blog

  • newbie2009 (10/6/2009)


    Please see the SQL plan attached.

    Did you read the article?

    Table definitions and index definitions please.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Apologies Gail, I've only just read the article... busy day.

    You'll notice a distinct lack of indexes on the inserted table (amongst others). I have experimented with adding them on dev and it had no impact on the insert time, although that's not to say, of course, that live wouldn't benefit from it.

    Functions as follows:

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    create function [dbo].[GMT.Calendar.AggConvertDateTime]

    (

    @date datetime

    ,@locationEFA int

    )

    returns datetime

    as

    begin

    declare @Result as datetime

    set @Result = @date

    if (@locationEFA <> 1)

    begin

    set @Result = dateadd(hh, 1, @date)

    end

    return @Result

    end

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE function [dbo].[GMT.Calendar.AggIsPeak]

    (

    @date datetime

    ,@CountryId int

    )

    returns int

    as

    begin

    declare @Result as int

    if (@CountryId <> 230)

    begin

    set @Result = GMA.dbo.[GMT.Calendar.IsPeak](@date, @CountryId)

    end

    else

    begin

    set @date = Dateadd(hh, -1, @date)

    set @Result = GMA.dbo.[GMT.Calendar.IsPeak](@date, @CountryId)

    end

    return @Result

    end
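
    Looking at the first function, I guess its logic could just be folded straight into the insert as an inline expression, something like this (not tested):

    -- Inline equivalent of dbo.[GMT.Calendar.AggConvertDateTime](TradeVolume.[Date], Location.Efa)
    case when Location.Efa <> 1
         then dateadd(hh, 1, TradeVolume.[Date])
         else TradeVolume.[Date]
    end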

  • SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    Create function [dbo].[GMT.Calendar.IsPeak]

    (

    @date datetime -- This date should always be in country's local datetime

    ,@countryId int

    )

    returns bit

    as

    begin

    declare @Result as bit

    set @Result = 0

    if (@countryId in (0, 230))

    begin

    if (datepart(hh, @date) >= 7 and datepart(hh, @date) < 19 and datepart(dw, @date) not in (1, 7))

    begin

    set @Result = 1

    end

    end

    else if (@countryId in (81, 74, 211, 59, 58, 108, 204, 198, 15, 99))

    begin

    if (datepart(hh, @date) >= 8 and datepart(hh, @date) < 20 and datepart(dw, @date) not in (1, 7))

    begin

    set @Result = 1

    end

    end

    else if (@countryId in (155, 22, 176, 34, 181))

    begin

    if (datepart(hh, @date) >= 7 and datepart(hh, @date) < 23 and datepart(dw, @date) not in (1, 7))

    begin

    set @Result = 1

    end

    end

    return @Result

    end
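
    Following the set-based suggestion, I'm wondering whether something like an inline table-valued function could replace the scalar one, so the main query can CROSS APPLY it instead of calling it once per row. Just a sketch with an illustrative name, not tested; the logic mirrors the function above:

    create function dbo.[GMT.Calendar.IsPeak_itvf]   -- illustrative name only
    (
        @date datetime
        ,@countryId int
    )
    returns table
    as
    return
        select cast(case
                    when datepart(dw, @date) in (1, 7) then 0   -- weekends are never peak
                    when @countryId in (0, 230)
                         and datepart(hh, @date) >= 7 and datepart(hh, @date) < 19 then 1
                    when @countryId in (81, 74, 211, 59, 58, 108, 204, 198, 15, 99)
                         and datepart(hh, @date) >= 8 and datepart(hh, @date) < 20 then 1
                    when @countryId in (155, 22, 176, 34, 181)
                         and datepart(hh, @date) >= 7 and datepart(hh, @date) < 23 then 1
                    else 0
                end as bit) as IsPeak;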

  • It is interesting that you point out that the Procedure seemed to work fine before the performance suddenly degraded.

    One of the things you want to do in such a situation is to take a close look at what may have changed in the Environment or on the Box.

    Is the Procedure having to process more data than it usually does, i.e. are row counts roughly the same as they have always been?

    An increased number of rows could have brought a latent performance problem in the code to light. Maybe there was a recent bulk load into one of the tables referenced?

    Is it the only Procedure affected?

    I believe spooling is done to Tempdb. Has the usage pattern of this database changed in any significant way, for instance, Row Versioning / Snapshots / new applications that make heavy use of it?

    Do you happen to have the query plan for when it ran within acceptable parameters?
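
    As a quick check on the data volumes, current row counts for the tables involved can be pulled with something like this (run it in each database; sys.partitions counts are approximate but fine for comparison):

    SELECT  o.name, SUM(p.rows) AS approx_rows
    FROM    sys.partitions p
    JOIN    sys.objects o ON o.object_id = p.object_id
    WHERE   o.name IN ('Trade', 'TradeVolume', 'CurveData', 'EfaCalendarDates', 'PowerPnlAgg')
      AND   p.index_id IN (0, 1)   -- heap or clustered index only, to avoid double counting
    GROUP BY o.name;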

  • Very relevant questions; unfortunately I've only just started, so I can't comment on row counts, changes in workload etc.

    It was exactly the question I asked, though, and I've been told the row counts are pretty consistent with before.

    It's the only sproc affected!

  • How do things like this go from 1 hour to over 3 hours overnight? Since it's legacy code and I'm not likely to offend you personally, it's what happens when crap code (sorry folks) reaches the "tipping point". As with the proverbial "straw that broke the camel's back", it only takes one or two extra rows in a key table or two and BOOM! The execution plan changes to the nightmare of mega row processing we see all over the internal rowcounts of the execution plan you were good enough to provide.

    From the execution plan, I see things like a clustered index seek on TradeVolume that returns over 60 million rows yet the estimated number of rows is about 1300. Not a good estimate to say the least.

    I also see several index spools on the EfaCalendarDates table in excess of 54 million internal rows with an estimated 180 rows.

    Last but not least, I see a final INSERT with an input of only 1.5 million actual rows, but it is being fed by a whopping big Compute Scalar with over 46 million rows. That's not even reasonable for most large aggregations.

    All of that leads to only one thing... the query is simply trying to do too much at one time. It needs to be rewritten using "Divide'n'Conquer" methods.

    To help me understand all of the sources of these rather large rowcounts and possibly come up with a solution to fix this, would you run the following code and post the output, please?

    SELECT COUNT(*) AS TrvCount FROM TradeVolume

    SELECT COUNT(*) AS EfaCount FROM EfaCalendarDates

    SELECT TOP 10 * FROM TradeVolume

    SELECT TOP 10 * FROM EfaCalendarDates

    We may not get it all, but I think we can knock the heck out of some low-hanging fruit. With only a couple of variations on the theme, the execution plan looks just like some of the crud ball execution plans for legacy code at work. When they asked me to create a new instance of a server because (as they've always had to) they wanted to run a year-end report, I got suspicious. 😛 Their run used to take in excess of 3 hours. I'm not making any promises to you on your code, but I got that run down to about 8 seconds just by doing a little slice and dice.
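
    Just to give a feel for the kind of "Divide'n'Conquer" I mean (a sketch only, using the names from your proc - not tested), the forward-curve derived table is a prime candidate for being materialized up front:

    -- Stage the forward curve for the valuation date once, instead of letting the
    -- optimizer spool the derived table inside the big join
    select  ForwardCurve.PKID
            ,cdv.[Date]
            ,cdv.DaylightSaving
            ,cdv.[Value]
    into    #FwdCurve
    from    GMA.dbo.ForwardCurve
    inner join GMA.dbo.CurveIssue on ForwardCurve.Curve = CurveIssue.Curve
    inner join GMA.dbo.CurveData cdv on CurveIssue.PKID = cdv.CurveIssue
    where   CurveIssue.[Date] = @valuationDate;

    create clustered index IX_FwdCurve on #FwdCurve (PKID, [Date], DaylightSaving);

    -- ...and then have the big INSERT...SELECT left join to #FwdCurve instead of the inline subquery.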

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Also, you might want to provide that output as an attachment to the post rather than listing it up front. Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, that's exactly it. However, the execution plan is pretty much the same in the dev environment, where it runs in 50 minutes, which has been difficult to explain, to be honest...

    Obviously the I/O demands on live are extreme by comparison, but by a factor of 4...?

    Please see the row counts attached.
