Ideas on re writing the group by to reduce the cost of Sorting?

  • All,

    I have the following query that according to the execution plan sorting takes %79 percent of over all time. I think this relates to the group by clause I use within the query.

    Here is the query

    declare @fdwPeriod varchar(6)

    Set @fdwPeriod = '201104'

    select

    isnull(DRH.BusinessUnit,'[?]') as Branch

    ,isnull(DRH.UnitId,'[?]') as UnitID

    ,isnull(DF.PureYOA,'[?]') as UwYear

    ,'[?]' as AccidentYear

    ,Source

    ,isnull(DC.Currency,'[?]') as TranCcyCode

    ,isnull(FEP.GAAPAccountingPeriod,'[?]') AS ProcAccPrd

    ,isnull(DRH.SectionRef,'[?]') as PolicyRef

    ,'[?]' as CedingPolicyRef

    ,isnull(nullif(DRH.SectionRef,'[!]'),'[?]') as InwardPolicyRef

    ,'INWARD' as ContractType

    ,'INWARD' as OriginalSourcePolicyType

    ,'[?]' as ContractInceptionYear

    ,case when DF.PureYOA = '[?]' then '[?]' when DF.PureYOA < left(@FDWPeriod,4) then DF.PureYOA else left(@FDWPeriod,4) end as AccountingYear

    ,isnull(DRH.InceptionDate,'99981231') as ValuationDate

    ,'99981231' as PaymentDate

    , '[?]' as ClaimRef

    , isnull(DRH.BrokerPseudonym + DRH.BrokerNo,'[?]') as Broker

    , isnull(DF.RiskCategory,'[?]') as RiskCode

    , isnull(ARG.AllocatedReserveGroup,'[?]') as COB1

    , isnull(DRG.ReserveGroup, '[?]') AS COB2

    , isnull(DF.SubClass3Code,'[?]') as Sub3Code

    , '[?]' as SubClass3ReserveGroup

    , isnull(convert(varchar(3),nullif(substring(DRH.SectionRef,5,1),'')),'[?]') as MOA

    , isnull(DF.FILCode1, '[?]') as Fil4

    , isnull(DF.TrustFund, '[?]') as TrustFundCode

    , isnull(DRH.Territory,'[?]') as GeoCode

    , '[?]' as QualifyingCategory

    , FEP.RiskHistoryKey

    , null as FXBindingKey

    , DC.CurrencyKey as TranCcyKey

    , DC.CurrencyKey as MetricCcyKey

    , 2 as FXMethodologyKey

    , 1 as FXRate

    , '' as Narrative

    , DRH.Origin as Origin

    , sum(FEP.GrossGrossPremium) AS [15]

    , sum(FEP.GrossNetPremium) AS [19]

    , (sum(FEP.GrossNetPremium) - sum(FEP.GrossGrossPremium)) AS [61]

    from

    FactEarned FEP

    inner join DimRiskHistory DRH on FEP.RiskHistoryKey = DRH.RiskHistoryKey

    inner join DimReserveGroup DRG on FEP.ReserveGroupKey = DRG.ReserveGroupKey

    inner join DimAllocatedReserveGroup ARG on FEP.AllocatedReserveGroupKey = ARG.AllocatedReserveGroupKey

    inner join DimFinancials DF on FEP.FinancialsKey = DF.FInancialsKey

    left join [cubesupport].[rvw_AsAtReserveGroup] sc3 on sc3.Sub3Code = DF.SubClass3Code

    and sc3.AccountingPeriod = (select max(Snapshotperiod) from dwfact.FactUltimatesSnapshot)

    left join DimSubClass3ReserveGroup rg3 on rg3.SubClass3ReserveGroup = sc3.AsAtReserveGroup

    inner join DimCurrency DC on FEP.SettlementCurrencyKey = DC.CurrencyKey

    inner join DimDate DD on FEP.GAAPInceptedDateKey = DD.DateKey

    where FEP.FXMethodologyKey = 2 and Source = 7

    group by

    DRH.BusinessUnit

    ,DRH.UnitId

    ,DRH.SectionRef

    ,isnull(nullif(DRH.SectionRef,'[!]'),'[?]')

    ,FEP.RiskHistoryKey

    ,DF.PureYOA

    ,DRG.ReserveGroup

    ,FEP.ReserveGroupKey

    ,ARG.AllocatedReserveGroup

    ,FEP.AllocatedReserveGroupKey

    ,isnull(rg3.SubClass3ReserveGroup,'[?]')

    ,isnull(rg3.SubClass3ReserveGroupKey,0)

    ,DC.Currency

    ,DC.CurrencyKey

    ,DD.MonthKey

    ,FEP.GAAPInceptedDateKey

    ,FEP.GAAPEarnedDateKey

    ,DRH.InceptionDate

    ,FEP.PureInceptedDateKey

    ,FEP.PureEarnedDateKey

    ,FEP.GAAPAccountingPeriod

    ,DRH.BrokerPseudonym + DRH.BrokerNo

    ,DF.RiskCategory

    ,DF.FinancialsKey

    ,DF.FILCode1

    ,DF.TrustFund

    ,DRH.Territory

    ,DF.SubClass3Code

    ,FEP.AccountingPeriod

    ,Source

    ,DRH.Origin

    having FEP.GAAPAccountingPeriod <= @fdwPeriod

    Notes: Fact table contains around 400 millions of rows, it is MSSQL 2005

    I have also attached the execution plan.

  • Try this. It probably won't perform better, but it's easier to read and should be easier to figure out good indexes for.

    declare @fdwPeriod varchar(6)

    Set @fdwPeriod = '201104'

    select

    isnull(DRH.BusinessUnit,'[?]') as Branch

    ,isnull(DRH.UnitId,'[?]') as UnitID

    ,isnull(DF.PureYOA,'[?]') as UwYear

    ,'[?]' as AccidentYear

    ,Source

    ,isnull(DC.Currency,'[?]') as TranCcyCode

    ,isnull(FEP.GAAPAccountingPeriod,'[?]') AS ProcAccPrd

    ,isnull(DRH.SectionRef,'[?]') as PolicyRef

    ,'[?]' as CedingPolicyRef

    ,isnull(nullif(DRH.SectionRef,'[!]'),'[?]') as InwardPolicyRef

    ,'INWARD' as ContractType

    ,'INWARD' as OriginalSourcePolicyType

    ,'[?]' as ContractInceptionYear

    ,case when DF.PureYOA = '[?]' then '[?]' when DF.PureYOA < left(@FDWPeriod,4) then DF.PureYOA else left(@FDWPeriod,4) end as AccountingYear

    ,isnull(DRH.InceptionDate,'99981231') as ValuationDate

    ,'99981231' as PaymentDate

    , '[?]' as ClaimRef

    , isnull(DRH.BrokerPseudonym + DRH.BrokerNo,'[?]') as Broker

    , isnull(DF.RiskCategory,'[?]') as RiskCode

    , isnull(ARG.AllocatedReserveGroup,'[?]') as COB1

    , isnull(DRG.ReserveGroup, '[?]') AS COB2

    , isnull(DF.SubClass3Code,'[?]') as Sub3Code

    , '[?]' as SubClass3ReserveGroup

    , isnull(convert(varchar(3),nullif(substring(DRH.SectionRef,5,1),'')),'[?]') as MOA

    , isnull(DF.FILCode1, '[?]') as Fil4

    , isnull(DF.TrustFund, '[?]') as TrustFundCode

    , isnull(DRH.Territory,'[?]') as GeoCode

    , '[?]' as QualifyingCategory

    , FEP.RiskHistoryKey

    , null as FXBindingKey

    , DC.CurrencyKey as TranCcyKey

    , DC.CurrencyKey as MetricCcyKey

    , 2 as FXMethodologyKey

    , 1 as FXRate

    , '' as Narrative

    , DRH.Origin as Origin

    , FEP.[15]

    , FEP.[19]

    , FEP.[61]

    FROM (

    SELECT FE.GAAPAccountingPeriod, FE.RiskHistoryKey, FE.ReserveGroupKey, FE.FinancialsKey, FE.SettlementCurrencyKey, FE.GAAPInceptedDateKey,

    SUM(FE.GrossGrossPremium) AS [15], SUM(FE.GrossNetPremium) AS [19], (SUM(FE.GrossNetPremium) - SUM(FE.GrossGrossPremium)) AS [61]

    FROM FactEarned FE

    WHERE FE.FXMethodologyKey = 2 AND FE.GAAPAccountingPeriod <= @fdwPeriod

    GROUP BY FE.GAAPAccountingPeriod, FE.RiskHistoryKey, FE.ReserveGroupKey, FE.FinancialsKey, FE.SettlementCurrencyKey, FE.GAAPInceptedDateKey) AS FEP

    inner join DimRiskHistory DRH on FEP.RiskHistoryKey = DRH.RiskHistoryKey

    inner join DimReserveGroup DRG on FEP.ReserveGroupKey = DRG.ReserveGroupKey

    inner join DimAllocatedReserveGroup ARG on FEP.AllocatedReserveGroupKey = ARG.AllocatedReserveGroupKey

    inner join DimFinancials DF on FEP.FinancialsKey = DF.FInancialsKey

    left join [cubesupport].[rvw_AsAtReserveGroup] sc3 on sc3.Sub3Code = DF.SubClass3Code

    and sc3.AccountingPeriod = (select max(Snapshotperiod) from dwfact.FactUltimatesSnapshot)

    left join DimSubClass3ReserveGroup rg3 on rg3.SubClass3ReserveGroup = sc3.AsAtReserveGroup

    inner join DimCurrency DC on FEP.SettlementCurrencyKey = DC.CurrencyKey

    inner join DimDate DD on FEP.GAAPInceptedDateKey = DD.DateKey

    where Source = 7

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    That's a very good idea actually. I think it will speed up the query considerably. I am not sure whether the granularity will be lost as you removed a quite bit from group by. In theory it should be the same as all the fields from the fact table are present.

    Let me give this a try and I will let you know.

    Much appreciated.

    Enis

  • ertemen (6/10/2011)


    I am not sure whether the granularity will be lost as you removed a quite bit from group by.

    I haven't changed the logic of the query. Just grouping before joining the fact table into the rest. The other group by columns were only needed because of the 'must be in aggregate or group by' requirement.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    You are absolutely right. I am testing it right now and it looks good so far. It is early stages but it should be fine.

    Once the query runs I will include the quey plan.

    Thanks

    Enis

  • Hi Gail,

    Attached is the new SQL query plan and results seem to be identical that of original query so far.

    Performance improved massively by the way. Thank you

    Regards

    Enis

  • SELECT FE.GAAPAccountingPeriod, FE.RiskHistoryKey, FE.ReserveGroupKey, FE.FinancialsKey, FE.SettlementCurrencyKey, FE.GAAPInceptedDateKey,

    SUM(FE.GrossGrossPremium) AS [15], SUM(FE.GrossNetPremium) AS [19], (SUM(FE.GrossNetPremium) - SUM(FE.GrossGrossPremium)) AS [61]

    FROM FactEarned FE

    WHERE FE.FXMethodologyKey = 2 AND FE.GAAPAccountingPeriod <= @fdwPeriod

    GROUP BY FE.GAAPAccountingPeriod, FE.RiskHistoryKey, FE.ReserveGroupKey, FE.FinancialsKey, FE.SettlementCurrencyKey, FE.GAAPInceptedDateKey

    How large is the data set from the above sub query from the join? If it is a large data set then you may see a nice improvemnt by placing the results into a temp table and then using that in the join.

    Dan

    If only I could snap my figures and have all the correct indexes apear and the buffer clean and.... Start day dream here.

  • Hi Dan,

    It is about 1 million. Do you think it would be better I use a temp table.

    Thanks

    Enis

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply