November 9, 2011 at 9:59 am
Hello Everyone
This is yet another issue of duplicate data. This is very old data that is coming over from an old Unix and DB2 database. Just my luck.
In a table, I have duplicate rows, all except for a DateCreated columns, each of the rows has a distinct date created value. I need to select only the row, of the duplicate values, with the latest DateCreated. Some of the data does not have a duplicate value, but some does.
I can select the correct code if I select a couple times from the table, and use a couple table variables, but I am hoping to get a more efficient way
Here is a small example, if more is needed, I will create an insert query for you
Mary Poppins, 123456, 12-07-05 14:23:45
Mary Poppins, 123456, 12-12-05 08:15:28 -- I want this one
Daffy Duck , 234567, 01-15-05 09:27:23 -- I want this one
Wile E. Coyote, 345678, 03-01-05 18:12:23
Wile E. Coyote, 345678, 03-01-05 20:15:20
Wile E. Coyote, 345678, 03-01-05 20:18:56 -- I want this one
Thank You in advance for any and all help and suggestions
Andrew SQLDBA
November 9, 2011 at 10:12 am
AndrewSQLDBA (11/9/2011)
Hello EveryoneThis is yet another issue of duplicate data. This is very old data that is coming over from an old Unix and DB2 database. Just my luck.
In a table, I have duplicate rows, all except for a DateCreated columns, each of the rows has a distinct date created value. I need to select only the row, of the duplicate values, with the latest DateCreated. Some of the data does not have a duplicate value, but some does.
I can select the correct code if I select a couple times from the table, and use a couple table variables, but I am hoping to get a more efficient way
Here is a small example, if more is needed, I will create an insert query for you
Mary Poppins, 123456, 12-07-05 14:23:45
Mary Poppins, 123456, 12-12-05 08:15:28 -- I want this one
Daffy Duck , 234567, 01-15-05 09:27:23 -- I want this one
Wile E. Coyote, 345678, 03-01-05 18:12:23
Wile E. Coyote, 345678, 03-01-05 20:15:20
Wile E. Coyote, 345678, 03-01-05 20:18:56 -- I want this one
Thank You in advance for any and all help and suggestions
Andrew SQLDBA
An insert statement would make it easier to provide tested code, but I think you need something like this:
WITH data AS
(
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY NAME, Id ORDER BY date_column DESC) AS row_no
FROM
table
)
SELECT
*
FROM
DATA
WHERE
row_no = 1
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
November 9, 2011 at 10:16 am
you could do this with a cte and row_number() or an inner join
eg.
to select the rows with duplicates and get the maxdate created, if you want to include non-duplicates as well remove the having clause
Select * from YourTable a
inner join
(Select Character, Max(DateCreated) as MaxDateCreated
from YourTable
group by Character
Having count(*) > 1) b
on a.character = b.character and a.DateCreated = b.MaxDateCreated
November 9, 2011 at 10:24 am
With 582 points and 1500 visits you must have known that the sample data in form of INSERT statement will make the life easier for the answerers. I also see that you have mentioned that you will provide insert statments if needed; it IS needed, be the dataset is large or small.
That said, here is one of doing it
WITH SampleData (UserName, UserID, DateCreated) AS
(
SELECT 'Mary Poppins', 123456, '12-07-05 14:23:45'
UNION ALL SELECT 'Mary Poppins', 123456, '12-12-05 08:15:28' -- I want this one
UNION ALL SELECT 'Daffy Duck' , 234567, '01-15-05 09:27:23' -- I want this one
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 18:12:23'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:15:20'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:18:56'
),
GroupNumbered AS
(
SELECT RN = ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY DateCreated DESC)
,UserName, UserID, DateCreated
FROM SampleData
)
SELECT UserName, UserID, DateCreated
FROM GroupNumbered
WHERE RN = 1
November 9, 2011 at 10:30 am
Another way of doing it:
WITH SampleData (UserName, UserID, DateCreated) AS
(
SELECT 'Mary Poppins', 123456, '12-07-05 14:23:45'
UNION ALL SELECT 'Mary Poppins', 123456, '12-12-05 08:15:28' -- I want this one
UNION ALL SELECT 'Daffy Duck' , 234567, '01-15-05 09:27:23' -- I want this one
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 18:12:23'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:15:20'
UNION ALL SELECT 'Wile E. Coyote', 345678, '03-01-05 20:18:56'
),
DistinctUserID AS
(
SELECT UserID
FROM SampleData
GROUP BY UserID
)
SELECT CrsAppOutPut.UserName , OutTab.UserID , CrsAppOutPut.DateCreated
FROM DistinctUserID OutTab
CROSS APPLY (
SELECT TOP 1 UserName , DateCreated
FROM SampleData InTab
WHERE OutTab.UserID = InTab.UserID
ORDER BY InTab.DateCreated DESC
) CrsAppOutPut
November 9, 2011 at 11:20 am
So sorry to everyone for not posting properly. My thought was to post it that way, and I knew the answer would be very simple. But if not, I would post a proper insert statement in a later post if needed. I was merely in a hurry
Thanks for all, it is working great. I tried every suggestion.
Thanks again, and hope that you all have a good day
Andrew SQLDBA
Viewing 6 posts - 1 through 5 (of 5 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy