Selecting minimum value per ID from Table.

  • Urgent Help required in implementing below using SQL SERVER 2008.

    I have a table with the following data.

    Table A:

    ID AMT_DIFF DAYS_DIFF

    ------------------------------------

    100 10 20

    100 15 25

    200 10 10

    200 10 15

    I need to select data from above table based on below scenerios.

    Select the record with min AMT_DIFF per ID,if it gives duplicate(i.e. AMT_DIFF is equal for 2 records like ID:200) then select record with minimum DAYS_DIFF value per ID.

    If DAYS_DIFF is also same for 2 records per ID,then select none.

    Any Help would be greatly Appreciated.Thanks in Advance.

  • How about this?

    IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

    DROP TABLE #Temp;

    CREATE TABLE #Temp

    (

    iD INT --IDENTITY(1,1)

    ,AMT_DIFF INT

    ,DAYS_DIFF INT

    );

    INSERT INTO #Temp (ID, AMT_DIFF, DAYS_DIFF)

    SELECT 100 ,10 ,20

    UNION ALL SELECT 100 ,15 ,25

    UNION ALL SELECT 200 ,10 ,10

    UNION ALL SELECT 200 ,10 ,15

    UNION ALL SELECT 300 ,10 ,10

    UNION ALL SELECT 300 ,10 ,10

    UNION ALL SELECT 400 ,10 ,10

    ;

    ; WITH CTE AS

    (

    SELECT ID, AMT_DIFF, DAYS_DIFF

    , GrpCt = COUNT(*) OVER (PARTITION BY ID, AMT_DIFF, DAYS_DIFF)

    , RN1 = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID, AMT_DIFF, DAYS_DIFF)

    FROM #Temp

    )

    SELECT C.iD , C.AMT_DIFF , C.DAYS_DIFF

    FROM CTE C

    WHERE C.GrpCt = 1

    AND C.RN1 = 1

  • sunnyepgc (3/15/2012)


    Urgent Help required in implementing below using SQL SERVER 2008.

    I have a table with the following data.

    Table A:

    ID AMT_DIFF DAYS_DIFF

    ------------------------------------

    100 10 20

    100 15 25

    200 10 10

    200 10 15

    I need to select data from above table based on below scenerios.

    Select the record with min AMT_DIFF per ID,if it gives duplicate(i.e. AMT_DIFF is equal for 2 records like ID:200) then select record with minimum DAYS_DIFF value per ID.

    If DAYS_DIFF is also same for 2 records per ID,then select none.

    Any Help would be greatly Appreciated.Thanks in Advance.

    SELECT AMT.ID, MIN(A.DAYS_DIFF) AS MIN_DAYS_DIFF, AMT.MIN_AMT_DIFF

    FROM (SELECT ID, MIN(AMT_DIFF) AS MIN_AMT_DIFF

    FROM TableA

    GROUP BY ID) AMT

    INNER JOIN TableA A

    ON AMT.ID = A.ID

    AND AMT.MIN_AMT_DIFF = A.AMT_DIFF

    GROUP BY AMT.ID, AMT.MIN_AMT_DIFF

    EDIT: This does noes display none when DAYS_DIFF and AMT_DIFF match. Curious though, why would you eliminate this row? I can't think of a business case for this information...

    Jared
    CE - Microsoft

  • sunnyepgc (3/15/2012)


    Urgent Help required in implementing below using SQL SERVER 2008.

    I have a table with the following data.

    ...

    "I have a table with the following data", especially when "Urgent Help required" , should not be provided as free-text but something consumable:

    -- I have a table like:

    Create Table #A (ID INT, AMT_DIFF INT, DAYS_DIFF INT)

    -- with data like:

    INSERT #A VALUES (100,10,20),

    (100,15,25),

    (200,10,10),

    (200,10,15),

    (300,10,15),

    (300,10,15)

    Link at the bottom of my signature will lead you to the forum etiquette article on a subject...

    Here, how you can do it:

    ;WITH cte_Rnk --1. rank by R_AMT and R_Days

    AS

    (

    select *

    ,RANK() OVER (PARTITION BY ID ORDER BY AMT_DIFF) R_AMT

    ,RANK() OVER (PARTITION BY ID ORDER BY DAYS_DIFF) R_DAYS

    from #A

    )

    , cte_Dif --2. get the minimums and filter out ones without difference

    AS

    (

    SELECT ID, MIN(R_AMT) minR_AMT, MIN(R_DAYS) minR_DAYS

    FROM cte_Rnk

    GROUP BY ID

    HAVING MAX(R_AMT) > 1 OR MAX(R_DAYS) > 1

    )

    -- 3. Get the final results:

    SELECT R.ID, R.AMT_DIFF, R.DAYS_DIFF

    FROM cte_Rnk R

    JOIN cte_Dif D

    ON D.ID = R.ID

    WHERE R.R_AMT = D.minR_AMT

    AND R.R_DAYS = D.minR_DAYS

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • EDIT: Misread requirements...

    Jared
    CE - Microsoft

  • ColdCoffee (3/15/2012)


    How about this?

    IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

    DROP TABLE #Temp;

    CREATE TABLE #Temp

    (

    iD INT --IDENTITY(1,1)

    ,AMT_DIFF INT

    ,DAYS_DIFF INT

    );

    INSERT INTO #Temp (ID, AMT_DIFF, DAYS_DIFF)

    SELECT 100 ,10 ,20

    UNION ALL SELECT 100 ,15 ,25

    UNION ALL SELECT 200 ,10 ,10

    UNION ALL SELECT 200 ,10 ,15

    UNION ALL SELECT 300 ,10 ,10

    UNION ALL SELECT 300 ,10 ,10

    UNION ALL SELECT 400 ,10 ,10

    ;

    ; WITH CTE AS

    (

    SELECT ID, AMT_DIFF, DAYS_DIFF

    , GrpCt = COUNT(*) OVER (PARTITION BY ID, AMT_DIFF, DAYS_DIFF)

    , RN1 = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID, AMT_DIFF, DAYS_DIFF)

    FROM #Temp

    )

    SELECT C.iD , C.AMT_DIFF , C.DAYS_DIFF

    FROM CTE C

    WHERE C.GrpCt = 1

    AND C.RN1 = 1

    I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...

    So, the question is should single "400 ,10 ,10" appear in results?

    If yes, your method is the best!

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...

    So, the question is should single "400 ,10 ,10" appear in results?

    If yes, your method is the best!

    Yeah, I misread that last part and edited my post. However, I have to question the business case of these requirements. Seems "strange" and not really interpretable to anything I can think of.

    Jared
    CE - Microsoft

  • Eugene Elutin (3/15/2012)


    I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...

    So, the question is should single "400 ,10 ,10" appear in results?

    If yes, your method is the best!

    I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.

    But even if you remove the row with "400", my query works ( i guess :w00t: )

    Lets wait for what the OP wanted to do with scenarios like "400"

  • ColdCoffee (3/15/2012)


    Eugene Elutin (3/15/2012)


    I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...

    So, the question is should single "400 ,10 ,10" appear in results?

    If yes, your method is the best!

    I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.

    But even if you remove the row with "400", my query works ( i guess :w00t: )

    Lets wait for what the OP wanted to do with scenarios like "400"

    That exactly what I've said. If OP wants to see the single-occurred case like "400" returned, your query does the job.

    However, if OP doesn't want to see rows where there are no duplicates by ID, your query doesn't work as it will return it (Cnt =1)...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (3/15/2012)


    ColdCoffee (3/15/2012)


    Eugene Elutin (3/15/2012)


    I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...

    So, the question is should single "400 ,10 ,10" appear in results?

    If yes, your method is the best!

    I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.

    But even if you remove the row with "400", my query works ( i guess :w00t: )

    Lets wait for what the OP wanted to do with scenarios like "400"

    That exactly what I've said. If OP wants to see the single-occurred case like "400" returned, your query does the job.

    However, if OP doesn't want to see rows where there are no duplicates by ID, your query doesn't work as it will return it (Cnt =1)...

    Nope, it wont! You can check the rows with id = 300... they are present multiple times with same AMT_DIFF and DAYS_DIFF and my query did not bring them. The COUNT(*) check was included for that purpose only!

  • ColdCoffee (3/15/2012)


    Eugene Elutin (3/15/2012)


    ColdCoffee (3/15/2012)


    Eugene Elutin (3/15/2012)


    I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...

    So, the question is should single "400 ,10 ,10" appear in results?

    If yes, your method is the best!

    I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.

    But even if you remove the row with "400", my query works ( i guess :w00t: )

    Lets wait for what the OP wanted to do with scenarios like "400"

    That exactly what I've said. If OP wants to see the single-occurred case like "400" returned, your query does the job.

    However, if OP doesn't want to see rows where there are no duplicates by ID, your query doesn't work as it will return it (Cnt =1)...

    Nope, it wont! You can check the rows with id = 300... they are present multiple times with same AMT_DIFF and DAYS_DIFF and my query did not bring them. The COUNT(*) check was included for that purpose only!

    I see your point, i misread what you said, apologies.

    To remove single occuring records like "400", this will do

    ; WITH CTE AS

    (

    SELECT ID, AMT_DIFF, DAYS_DIFF

    , GrpCt = COUNT(*) OVER (PARTITION BY ID, AMT_DIFF, DAYS_DIFF)

    , iDCt = COUNT(*) OVER (PARTITION BY ID)

    , RN1 = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID, AMT_DIFF, DAYS_DIFF)

    FROM #Temp

    )

    SELECT *

    FROM CTE C

    WHERE C.GrpCt = 1

    AND C.RN1 = 1

    AND C.iDCt <> 1

  • Yep! It works!

    Now is only performance left.

    Yours give:

    Table 'Worktable'. Scan count 6, logical reads 70, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#A_000000000015'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    My one:

    Table '#A_000000000015'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    I don't know what is better: 6-scans of worktable with 70 logical reads and single scan of a table itself, or just two scans of a table?

    Your query looks more elegant in any case...:hehe:

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (3/15/2012)


    Yep! It works!

    Now is only performance left.

    Yours give:

    Table 'Worktable'. Scan count 6, logical reads 70, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table '#A_000000000015'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    My one:

    Table '#A_000000000015'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    I don't know what is better: 6-scans of worktable with 70 logical reads and single scan of a table itself, or just two scans of a table?

    Your query looks more elegant in any case...:hehe:

    Agreed , windowing functions are more resource-suckers! i once had my hands burnt when i employed ROW_NUMBER to find out the max number for every set when GROUP BY and TOP gave me results 5 times faster :w00t:

  • Thanks for the response guys..

    It worked for me...If there is only one ID present (Like 400) we need to select that record,here no need to look at Min amt and Min days column data.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply