Select Where_Min()

  • I know how to get the smallest value for a group using MIN and GROUP BY, but what if I want to get an associated value that is in that row (for example, give me the "Name" for the value that has the smallest value for the group)?

    I believe the traditional way is to use a derived table to get the minimum values, then join back to the table again to get the rows that contain the minimum value. Is there another way?

    One that I just thought of today was to look for the minimum based on a formula - one that encapsulates both the value I'm looking for and the one that is the "minimum". Something like this:CREATE TABLE #tmp (ID int, IDName varchar(10), Cost numeric (9,2))

    INSERT INTO #tmp VALUES (1, 'first', 51)

    INSERT INTO #tmp VALUES (1, 'second', 5.51)

    INSERT INTO #tmp VALUES (1, 'third', 1234567.89)

    INSERT INTO #tmp VALUES (1, 'fourth', 14.3)

    INSERT INTO #tmp VALUES (1, 'fifth', 100)

    SELECT ID

    , MIN(RIGHT('000000000' + CONVERT(varchar(10), Cost), 10) + '|' + IDName)

    , RIGHT(MIN(RIGHT('000000000' + CONVERT(varchar(10), Cost), 10) + '|' + IDName),

    LEN(MIN(RIGHT('000000000' + CONVERT(varchar(10), Cost), 10) + '|' + IDName))

    - CHARINDEX('|', MIN(RIGHT('000000000' + CONVERT(varchar(10), Cost), 10) + '|' + IDName)))

    FROM #tmp

    GROUP BY ID

    This only requires one pass through the table but seems obtuse and is going to have to be customized (padding/no padding, what to use as a delimiter, etc.). I also haven't completely worked out the logic to be confident that it will always work.

    Anyone have suggestions on how to grab additional information for a row that "won" the Min/Max aggregate? You know, something like "SELECT IDName Where_Min(Cost)"? 🙂

    Thanks,

    Chad

  • This is not horribly clever and it certainly requires a second pass through but it works.

    select t.IDName

    from (select ID, MIN(cost) mincost

    from #tmp

    group by id) min

    join #tmp t on min.mincost = t.Cost

  • Since you are using SQL 2005, you have the windowed functions and the OVER clause....

    Here is that query

    SELECT *

    FROM (

    SELECT MIN( Cost ) OVER( PARTITION BY [ID] ) AS MinCost, *

    FROM #tmp

    ) T

    WHERE T.MinCost = T.Cost

    --Ramesh


  • Ramesh (1/15/2009)


    Since you are using SQL 2005, you have the windowed functions and the OVER clause....

    Here is that query

    SELECT *

    FROM (

    SELECT MIN( Cost ) OVER( PARTITION BY [ID] ) AS MinCost, *

    FROM #tmp

    ) T

    WHERE T.MinCost = T.Cost

    Heh... you don't HAVE to use windowed functions... but it is certainly a good way to do it. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Very nice Ramesh. I think that might work for me. Thank you for your help!

    Chad

  • Just be careful of the possible perf pitfall with the windowed Aggregate function. Keep in mind that that will get evaled for EACH ROW in the rowset (which is NOT what the older sub-query will do).

    A smallish set won't show much of a difference, but the perf difference will get more marked as the recordset grows......

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Ramesh (1/15/2009)


    Since you are using SQL 2005, you have the windowed functions and the OVER clause....

    Here is that query

    SELECT *

    FROM (

    SELECT MIN( Cost ) OVER( PARTITION BY [ID] ) AS MinCost, *

    FROM #tmp

    ) T

    WHERE T.MinCost = T.Cost

    This works also and should be faster:

    SELECT *

    FROM (

    SELECT ROW_Number() OVER(Partition by ID Order by Cost) as RowID, *

    FROM #tmp

    ) T

    WHERE T.RowID = 1

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (1/15/2009)


    Ramesh (1/15/2009)


    Since you are using SQL 2005, you have the windowed functions and the OVER clause....

    Here is that query

    SELECT *

    FROM (

    SELECT MIN( Cost ) OVER( PARTITION BY [ID] ) AS MinCost, *

    FROM #tmp

    ) T

    WHERE T.MinCost = T.Cost

    This works also and should be faster:

    SELECT *

    FROM (

    SELECT ROW_Number() OVER(Partition by ID Order by Cost) as RowID, *

    FROM #tmp

    ) T

    WHERE T.RowID = 1

    I had considered that one - they don't return equivalent results though. The ROW_NUMBER will return 1 row per ID, the MIN will return any ties.

    So - depends on what you are looking for.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • True. I guess you could use RANK for ties.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Heh... reminds me of Father's Day and Christmas... "rank ties". 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh. Good one. 🙂

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (1/15/2009)


    Ramesh (1/15/2009)


    Since you are using SQL 2005, you have the windowed functions and the OVER clause....

    Here is that query

    SELECT *

    FROM (

    SELECT MIN( Cost ) OVER( PARTITION BY [ID] ) AS MinCost, *

    FROM #tmp

    ) T

    WHERE T.MinCost = T.Cost

    This works also and should be faster:

    SELECT *

    FROM (

    SELECT ROW_Number() OVER(Partition by ID Order by Cost) as RowID, *

    FROM #tmp

    ) T

    WHERE T.RowID = 1

    Thanks barry for pointing out....:), actually I does knew that but its like someday you don't think too much, drink too much coffee and post too much scraps....:D:w00t::hehe:

    --Ramesh


  • It's weird actually. There's no obvious reason why RANK() and ROW_NUMBER() should be faster than MIN(), but they clearly are when you look at the differences in the execution plans.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • The only obvious reason I am thinking of is that the optimizer has to do a "Scalar Operation" on RANK() & ROW_NUMBER() whereas "Aggreagate Operation" on MIN() function....

    --Ramesh


  • RBarryYoung (1/16/2009)


    It's weird actually. There's no obvious reason why RANK() and ROW_NUMBER() should be faster than MIN(), but they clearly are when you look at the differences in the execution plans.

    Heh... You, of all people, should know that the execution plan is a troubleshooting tool and can have nothing to do with reality insofar as which code will be more performant especially if you're looking at that damnable "% of Batch%" indicator. Unless you actually test it and measure, you cannot base decisions on which code is the most performant just by looking at the execution plan.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 27 total)

You must be logged in to reply to this topic. Login to reply