Conditional Aggregation

  • Thanks if you can help.

    I would like to SUM by Color but only if SampleType = 1.

    I could do this with UNION ALL but I am wondering if I can eliminate the UNION ALL. If I can eliminate the UNION ALL I would then do some performance testing between the two methods.

    DROP TABLE IF EXISTS #SampleData
    CREATE TABLE #SampleData (Color VARCHAR(10), SampleType INT, SampleValue INT)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Blue',1,100)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Blue',1,101)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Blue',1,102)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Green',1,200)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Green',1,201)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Red',2,300)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Red',2,301)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Yellow',3,400)
    INSERT INTO #SampleData (Color,SampleType,SampleValue)VALUES ('Yellow',3,401)

    SELECT * FROM #SampleData

    SELECT * FROM
    (
    SELECT DISTINCT Color,SampleType,
    SUM(SampleValue)OVER(PARTITION BY Color) AS SampleValue
    FROM #SampleData WHERE SampleType = 1

    UNION ALL

    SELECT * FROM #SampleData WHERE SampleType <> 1
    )SD

     

     

     

  • Try

    SUM(IIF(SampleType = 1, SampleValue,0))

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Phil Parkin wrote:

    Try

    SUM(IIF(SampleType = 1, SampleValue,0))

    Like this?

    SELECT Color,SampleType,

    SUM(IIF(SampleType = 1, SampleValue,0))

    FROM #SampleData

    GROUP BY

    Color,SampleType

     

  • This seems to work but I somehow doubt it will perform any better.

    SELECT
    DISTINCT
    Color,SampleType,
    SUM(SampleValue)OVER(PARTITION BY CASE WHEN SampleType = 1 THEN Color ELSE CONVERT(VARCHAR(MAX),RowNum) END) AS SampleValue
    FROM
    (
    SELECT
    ROW_NUMBER()OVER (ORDER BY Color,SampleType,SampleValue ) AS RowNum,
    Color,SampleType,SampleValue
    FROM #SampleData
    )SD
  • It's a bit hacky, but this avoids the UNION ALL:

    WITH Uniqued
    AS (SELECT Grouper = IIF(sd.SampleType = 1, 1, CHECKSUM (*))
    ,sd.Color
    ,sd.SampleType
    ,sd.SampleValue
    FROM #SampleData sd)
    SELECT Uniqued.Color
    ,Uniqued.SampleType
    ,SUM (Uniqued.SampleValue)
    FROM Uniqued
    GROUP BY Uniqued.Grouper
    ,Uniqued.Color
    ,Uniqued.SampleType
    ORDER BY Uniqued.SampleType
    ,Uniqued.Color;

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • SELECT Color,
    SampleType,
    SUM(SampleValue) SampleValue
    FROM #SampleData
    GROUP BY Color,
    SampleType,
    CASE WHEN SampleType = 1 THEN '1' ELSE CONVERT(char(36), NewId()) END
    ;
  • This might be more efficient:

    ;WITH CTE AS
    (
    SELECT *,
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) rn
    FROM #SampleData
    )
    SELECT Color,SampleType,SUM(SampleValue) SampleValue
    FROM CTE
    GROUP BY Color,
    SampleType,
    CASE WHEN SampleType = 1 THEN 0 ELSE rn END
    ;
  • Thanks for all the solutions. I'll adapt to my actual DDL and report back If I uncover anything notable.

  • If your actual tables include a PK not mentioned here, you may be able to use that to make your row unique, which would obviate the need for a derived 'row uniquer'.

    CASE WHEN SampleType = 1 THEN 0 ELSE PK END

    (From Jonathan's latest code)

    This would remove the need for a CTE.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

  • Good point. I have a composite primary key comprised of 3 INT columns.

    Maybe.

    CASE WHEN SampleType = 1 THEN 0 ELSE PK1 + PK2 + PK3 END

    Although I guess the 3 numbers summed could produce duplicates. Casting to CHAR and concatenating would create a unique value but the cast could be a performance killer.

  • You are right to consider the potential for duplicates. I can't think of an elegant (and fool-proof) way of handling the composite key – so I'd stick with Jonathan's suggestion.

    The absence of evidence is not evidence of absence
    - Martin Rees
    The absence of consumable DDL, sample data and desired results is, however, evidence of the absence of my response
    - Phil Parkin

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply