November 9, 2011 at 9:59 am
Hello Everyone
This is yet another issue of duplicate data. This is very old data that is coming over from an old Unix and DB2 database. Just my luck.
In a table, I have duplicate rows, all except for a DateCreated columns, each of the rows has a distinct date created value. I need to select only the row, of the duplicate values, with the latest DateCreated. Some of the data does not have a duplicate value, but some does.
I can select the correct code if I select a couple times from the table, and use a couple table variables, but I am hoping to get a more efficient way
Here is a small example, if more is needed, I will create an insert query for you
Mary Poppins, 123456, 12-07-05 14:23:45
Mary Poppins, 123456, 12-12-05 08:15:28 -- I want this one
Daffy Duck , 234567, 01-15-05 09:27:23 -- I want this one
Wile E. Coyote, 345678, 03-01-05 18:12:23
Wile E. Coyote, 345678, 03-01-05 20:15:20
Wile E. Coyote, 345678, 03-01-05 20:18:56 -- I want this one
Thank You in advance for any and all help and suggestions
Andrew SQLDBA
November 9, 2011 at 10:12 am
AndrewSQLDBA (11/9/2011)
Hello EveryoneThis is yet another issue of duplicate data. This is very old data that is coming over from an old Unix and DB2 database. Just my luck.
In a table, I have duplicate rows, all except for a DateCreated columns, each of the rows has a distinct date created value. I need to select only the row, of the duplicate values, with the latest DateCreated. Some of the data does not have a duplicate value, but some does.
I can select the correct code if I select a couple times from the table, and use a couple table variables, but I am hoping to get a more efficient way
Here is a small example, if more is needed, I will create an insert query for you
Mary Poppins, 123456, 12-07-05 14:23:45
Mary Poppins, 123456, 12-12-05 08:15:28 -- I want this one
Daffy Duck , 234567, 01-15-05 09:27:23 -- I want this one
Wile E. Coyote, 345678, 03-01-05 18:12:23
Wile E. Coyote, 345678, 03-01-05 20:15:20
Wile E. Coyote, 345678, 03-01-05 20:18:56 -- I want this one
Thank You in advance for any and all help and suggestions
Andrew SQLDBA
An insert statement would make it easier to provide tested code, but I think you need something like this:
WITH data AS
(
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY NAME, Id ORDER BY date_column DESC) AS row_no
FROM
table
)
SELECT
*
FROM
DATA
WHERE
row_no = 1
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
November 9, 2011 at 10:16 am
you could do this with a cte and row_number() or an inner join
eg.
to select the rows with duplicates and get the maxdate created, if you want to include non-duplicates as well remove the having clause
Select * from YourTable a
inner join
(Select Character, Max(DateCreated) as MaxDateCreated
from YourTable
group by Character
Having count(*) > 1) b
on a.character = b.character and a.DateCreated = b.MaxDateCreated
November 9, 2011 at 10:24 am
With 582 points and 1500 visits you must have known that the sample data in form of INSERT statement will make the life easier for the answerers. I also see that you have mentioned that you will provide insert statments if needed; it IS needed, be the dataset is large or small.
That said, here is one of doing it
WITH SampleData (UserName, UserID, DateCreated) AS
(
SELECT 'Mary Poppins', 123456, '12-07-05 14:23:45'
UNION ALL SELECT 'Mary Poppins', 123456, '12-12-05 08:15:28' -- I want this one
UNION ALL SELECT 'Daffy Duck' , 234567, '01-15-05 09:27:23' -- I want this one
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 18:12:23'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:15:20'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:18:56'
),
GroupNumbered AS
(
SELECT RN = ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY DateCreated DESC)
,UserName, UserID, DateCreated
FROM SampleData
)
SELECT UserName, UserID, DateCreated
FROM GroupNumbered
WHERE RN = 1
November 9, 2011 at 10:30 am
Another way of doing it:
WITH SampleData (UserName, UserID, DateCreated) AS
(
SELECT 'Mary Poppins', 123456, '12-07-05 14:23:45'
UNION ALL SELECT 'Mary Poppins', 123456, '12-12-05 08:15:28' -- I want this one
UNION ALL SELECT 'Daffy Duck' , 234567, '01-15-05 09:27:23' -- I want this one
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 18:12:23'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:15:20'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:18:56'
),
DistinctUserID AS
(
SELECT UserID
FROM SampleData
GROUP BY UserID
)
SELECT CrsAppOutPut.UserName , OutTab.UserID , CrsAppOutPut.DateCreated
FROM DistinctUserID OutTab
CROSS APPLY (
SELECT TOP 1 UserName , DateCreated
FROM SampleData InTab
WHERE OutTab.UserID = InTab.UserID
ORDER BY InTab.DateCreated DESC
) CrsAppOutPut
November 9, 2011 at 11:20 am
So sorry to everyone for not posting properly. My thought was to post it that way, and I knew the answer would be very simple. But if not, I would post a proper insert statement in a later post if needed. I was merely in a hurry
Thanks for all, it is working great. I tried every suggestion.
Thanks again, and hope that you all have a good day
Andrew SQLDBA
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply