Selecting minimum value per ID from Table.

Question

Post reply

Selecting minimum value per ID from Table.

sunnyepgc

Mr or Mrs. 500

Points: 546
More actions
March 15, 2012 at 11:19 am

#254769

Urgent Help required in implementing below using SQL SERVER 2008.
I have a table with the following data.
Table A:
ID AMT_DIFF DAYS_DIFF
------------------------------------
100 10 20
100 15 25
200 10 10
200 10 15
I need to select data from above table based on below scenerios.
Select the record with min AMT_DIFF per ID,if it gives duplicate(i.e. AMT_DIFF is equal for 2 records like ID:200) then select record with minimum DAYS_DIFF value per ID.
If DAYS_DIFF is also same for 2 records per ID,then select none.
Any Help would be greatly Appreciated.Thanks in Advance.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 1

How about this?

IF OBJECT_ID('TempDB..#Temp') IS NOT NULL

DROP TABLE #Temp;

CREATE TABLE #Temp

(

iD INT --IDENTITY(1,1)

,AMT_DIFF INT

,DAYS_DIFF INT

);

INSERT INTO #Temp (ID, AMT_DIFF, DAYS_DIFF)

SELECT 100 ,10 ,20

UNION ALL SELECT 100 ,15 ,25

UNION ALL SELECT 200 ,10 ,10

UNION ALL SELECT 200 ,10 ,15

UNION ALL SELECT 300 ,10 ,10

UNION ALL SELECT 400 ,10 ,10

;

; WITH CTE AS

(

SELECT ID, AMT_DIFF, DAYS_DIFF

, GrpCt = COUNT(*) OVER (PARTITION BY ID, AMT_DIFF, DAYS_DIFF)

, RN1 = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID, AMT_DIFF, DAYS_DIFF)

FROM #Temp

)

SELECT C.iD , C.AMT_DIFF , C.DAYS_DIFF

FROM CTE C

WHERE C.GrpCt = 1

AND C.RN1 = 1

Jared SSC Guru Points: 61793 More actions · Answer 2

sunnyepgc (3/15/2012)
Urgent Help required in implementing below using SQL SERVER 2008.
I have a table with the following data.
Table A:
ID AMT_DIFF DAYS_DIFF
------------------------------------
100 10 20
100 15 25
200 10 10
200 10 15
I need to select data from above table based on below scenerios.
Select the record with min AMT_DIFF per ID,if it gives duplicate(i.e. AMT_DIFF is equal for 2 records like ID:200) then select record with minimum DAYS_DIFF value per ID.
If DAYS_DIFF is also same for 2 records per ID,then select none.
Any Help would be greatly Appreciated.Thanks in Advance.

SELECT AMT.ID, MIN(A.DAYS_DIFF) AS MIN_DAYS_DIFF, AMT.MIN_AMT_DIFF

FROM (SELECT ID, MIN(AMT_DIFF) AS MIN_AMT_DIFF

FROM TableA

GROUP BY ID) AMT

INNER JOIN TableA A

ON AMT.ID = A.ID

AND AMT.MIN_AMT_DIFF = A.AMT_DIFF

GROUP BY AMT.ID, AMT.MIN_AMT_DIFF

EDIT: This does noes display none when DAYS_DIFF and AMT_DIFF match. Curious though, why would you eliminate this row? I can't think of a business case for this information...

Jared
CE - Microsoft

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 3

sunnyepgc (3/15/2012)
Urgent Help required in implementing below using SQL SERVER 2008.
I have a table with the following data.
...

"I have a table with the following data", especially when "Urgent Help required" , should not be provided as free-text but something consumable:

-- I have a table like:

Create Table #A (ID INT, AMT_DIFF INT, DAYS_DIFF INT)

-- with data like:

INSERT #A VALUES (100,10,20),

(100,15,25),

(200,10,10),

(200,10,15),

(300,10,15),

(300,10,15)

Link at the bottom of my signature will lead you to the forum etiquette article on a subject...

Here, how you can do it:

;WITH cte_Rnk --1. rank by R_AMT and R_Days

AS

(

select *

,RANK() OVER (PARTITION BY ID ORDER BY AMT_DIFF) R_AMT

,RANK() OVER (PARTITION BY ID ORDER BY DAYS_DIFF) R_DAYS

from #A

)

, cte_Dif --2. get the minimums and filter out ones without difference

AS

(

SELECT ID, MIN(R_AMT) minR_AMT, MIN(R_DAYS) minR_DAYS

FROM cte_Rnk

GROUP BY ID

HAVING MAX(R_AMT) > 1 OR MAX(R_DAYS) > 1

)

-- 3. Get the final results:

SELECT R.ID, R.AMT_DIFF, R.DAYS_DIFF

FROM cte_Rnk R

JOIN cte_Dif D

ON D.ID = R.ID

WHERE R.R_AMT = D.minR_AMT

AND R.R_DAYS = D.minR_DAYS

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Jared SSC Guru Points: 61793 More actions · Answer 4

EDIT: Misread requirements...

Jared
CE - Microsoft

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 5

ColdCoffee (3/15/2012)
How about this?
IF OBJECT_ID('TempDB..#Temp') IS NOT NULL
DROP TABLE #Temp;
CREATE TABLE #Temp
(
iD INT --IDENTITY(1,1)
,AMT_DIFF INT
,DAYS_DIFF INT
);
INSERT INTO #Temp (ID, AMT_DIFF, DAYS_DIFF)
SELECT 100 ,10 ,20
UNION ALL SELECT 100 ,15 ,25
UNION ALL SELECT 200 ,10 ,10
UNION ALL SELECT 200 ,10 ,15
UNION ALL SELECT 300 ,10 ,10
UNION ALL SELECT 300 ,10 ,10
UNION ALL SELECT 400 ,10 ,10
;
; WITH CTE AS
(
SELECT ID, AMT_DIFF, DAYS_DIFF
, GrpCt = COUNT(*) OVER (PARTITION BY ID, AMT_DIFF, DAYS_DIFF)
, RN1 = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID, AMT_DIFF, DAYS_DIFF)
FROM #Temp
)
SELECT C.iD , C.AMT_DIFF , C.DAYS_DIFF
FROM CTE C
WHERE C.GrpCt = 1
AND C.RN1 = 1

I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...

So, the question is should single "400 ,10 ,10" appear in results?

If yes, your method is the best!

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

Jared SSC Guru Points: 61793 More actions · Answer 6

I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...
So, the question is should single "400 ,10 ,10" appear in results?
If yes, your method is the best!

Yeah, I misread that last part and edited my post. However, I have to question the business case of these requirements. Seems "strange" and not really interpretable to anything I can think of.

Jared
CE - Microsoft

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 7

Eugene Elutin (3/15/2012)
I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...
So, the question is should single "400 ,10 ,10" appear in results?
If yes, your method is the best!

I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.

But even if you remove the row with "400", my query works ( i guess :w00t: )

Lets wait for what the OP wanted to do with scenarios like "400"

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 8

ColdCoffee (3/15/2012)
Eugene Elutin (3/15/2012)
I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...
So, the question is should single "400 ,10 ,10" appear in results?
If yes, your method is the best!
I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.
But even if you remove the row with "400", my query works ( i guess :w00t: )
Lets wait for what the OP wanted to do with scenarios like "400"

That exactly what I've said. If OP wants to see the single-occurred case like "400" returned, your query does the job.

However, if OP doesn't want to see rows where there are no duplicates by ID, your query doesn't work as it will return it (Cnt =1)...

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 9

Eugene Elutin (3/15/2012)
ColdCoffee (3/15/2012)
Eugene Elutin (3/15/2012)
I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...
So, the question is should single "400 ,10 ,10" appear in results?
If yes, your method is the best!
I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.
But even if you remove the row with "400", my query works ( i guess :w00t: )
Lets wait for what the OP wanted to do with scenarios like "400"
That exactly what I've said. If OP wants to see the single-occurred case like "400" returned, your query does the job.
However, if OP doesn't want to see rows where there are no duplicates by ID, your query doesn't work as it will return it (Cnt =1)...

Nope, it wont! You can check the rows with id = 300... they are present multiple times with same AMT_DIFF and DAYS_DIFF and my query did not bring them. The COUNT(*) check was included for that purpose only!

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 10

ColdCoffee (3/15/2012)
Eugene Elutin (3/15/2012)
ColdCoffee (3/15/2012)
Eugene Elutin (3/15/2012)
I'm not sure what OP wants for "UNION ALL SELECT 400 ,10 ,10 " data sample, I've read his requirements as: he only want to see the record when there is a duplicate by id with different AMT_DIFF or DAYS_OFF...
So, the question is should single "400 ,10 ,10" appear in results?
If yes, your method is the best!
I just wanted to throw those extra scenarios that are not in the OP's requirement sheet.
But even if you remove the row with "400", my query works ( i guess :w00t: )
Lets wait for what the OP wanted to do with scenarios like "400"
That exactly what I've said. If OP wants to see the single-occurred case like "400" returned, your query does the job.
However, if OP doesn't want to see rows where there are no duplicates by ID, your query doesn't work as it will return it (Cnt =1)...
Nope, it wont! You can check the rows with id = 300... they are present multiple times with same AMT_DIFF and DAYS_DIFF and my query did not bring them. The COUNT(*) check was included for that purpose only!

I see your point, i misread what you said, apologies.

To remove single occuring records like "400", this will do

; WITH CTE AS

(

SELECT ID, AMT_DIFF, DAYS_DIFF

, GrpCt = COUNT(*) OVER (PARTITION BY ID, AMT_DIFF, DAYS_DIFF)

, iDCt = COUNT(*) OVER (PARTITION BY ID)

, RN1 = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID, AMT_DIFF, DAYS_DIFF)

FROM #Temp

)

SELECT *

FROM CTE C

WHERE C.GrpCt = 1

AND C.RN1 = 1

AND C.iDCt <> 1

Eugene Elutin SSC Guru Points: 59322 More actions · Answer 11

Yep! It works!

Now is only performance left.

Yours give:

Table 'Worktable'. Scan count 6, logical reads 70, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table '#A_000000000015'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

My one:

Table '#A_000000000015'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

I don't know what is better: 6-scans of worktable with 70 logical reads and single scan of a table itself, or just two scans of a table?

Your query looks more elegant in any case...:hehe:

_____________________________________________
"The only true wisdom is in knowing you know nothing"
"O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
(So many miracle inventions provided by MS to us...)

How to post your question to get the best and quick help[/url]

ColdCoffee SSC-Dedicated Points: 39972 More actions · Answer 12

Eugene Elutin (3/15/2012)
Yep! It works!
Now is only performance left.
Yours give:
Table 'Worktable'. Scan count 6, logical reads 70, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#A_000000000015'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
My one:
Table '#A_000000000015'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
I don't know what is better: 6-scans of worktable with 70 logical reads and single scan of a table itself, or just two scans of a table?
Your query looks more elegant in any case...:hehe:

Agreed , windowing functions are more resource-suckers! i once had my hands burnt when i employed ROW_NUMBER to find out the max number for every set when GROUP BY and TOP gave me results 5 times faster :w00t:

sunnyepgc Mr or Mrs. 500 Points: 546 More actions · Answer 13

Thanks for the response guys..

It worked for me...If there is only one ID present (Like 400) we need to select that record,here no need to look at Min amt and Min days column data.