April 16, 2010 at 8:02 am
I am working on a database where we are storing data on multiple servers (SQL 2005), I am importing the data from both servers into one table on one server. I have been reading up on how to delete truly duplicate records (where all fields match), but this doesn't seem to take into account the date.
sample data set:
machine name date record_id (primary key)
machine 1 4/12/2010 1
machine 1 4/14/2010 2
machine 2 4/12/2010 3
machine 3 4/12/2010 4
I tried using MAX [date] to get only the most recent date, but I think because the records are "unique", a record for machine 1 shows up twice. long story short, I want to delete or ignore "older" records for each machine, regardless of how any other identifier.
Help?
April 16, 2010 at 8:14 am
sort in descending order by unique-key combination of columns and date
keep top 1 or each group defined by you unique-key combination of columns.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.April 17, 2010 at 5:46 am
Will the below code help you buddy??
WITH MAX_DATE_FOR_A_MACHINE_CTE(MACHINE_NAME, MAX_DATE)
AS
(
SELECT
NAME
, MAX(DATE) MAX_DATE
FROM
YOUR_TABLE
GROUP BY
NAME
, DATE
)
DELETE Y_T
FROM
YOUR_TABLE Y_T
INNER JOIN
MAX_DATE_FOR_A_MACHINE_CTE CTE
WHERE
(
Y_T.NAME = CTE.NAME AND
Y_T.DATE = CTE.MAX_DATE
)
Tell us if this worked for you mate..
Cheers!!
C'est Pras!
~edit :- Aliased the max date column
P.S: I am writing this code from home,so i have not tested it,please test it and post the results buddy!
April 20, 2010 at 4:55 pm
COldCoffee (4/17/2010)
Will the below code help you buddy??
WITH MAX_DATE_FOR_A_MACHINE_CTE(MACHINE_NAME, MAX_DATE)
AS
(
SELECT
NAME
, MAX(DATE) MAX_DATE
FROM
YOUR_TABLE
GROUP BY
NAME
, DATE
)
DELETE Y_T
FROM
YOUR_TABLE Y_T
INNER JOIN
MAX_DATE_FOR_A_MACHINE_CTE CTE
WHERE
(
Y_T.NAME = CTE.NAME AND
Y_T.DATE = CTE.MAX_DATE
)
Tell us if this worked for you mate..
Cheers!!
C'est Pras!
~edit :- Aliased the max date column
P.S: I am writing this code from home,so i have not tested it,please test it and post the results buddy!
This is the method you would want to employ - test it out and make sure it works before running it in prod. Please let us know as well if it worked.
ColdCoffee - in your sig line, Je ne suis pas au currant avec le mot 'Pras,' qu'est-ce que cela veut dire?
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
April 20, 2010 at 7:43 pm
CirquedeSQLeil (4/20/2010)
COldCoffee (4/17/2010)
Cheers!!C'est Pras!
ColdCoffee - in your sig line, Je ne suis pas au currant avec le mot 'Pras,' qu'est-ce que cela veut dire?
Jason, Il est la forme abrégée de mon nom. Mon nom est Prasanna et je suis originaire de l'Inde!!! 🙂 😉 J'ai appris le français pendant six ans!!
April 20, 2010 at 8:45 pm
COldCoffee (4/20/2010)
CirquedeSQLeil (4/20/2010)
COldCoffee (4/17/2010)
Cheers!!C'est Pras!
ColdCoffee - in your sig line, Je ne suis pas au currant avec le mot 'Pras,' qu'est-ce que cela veut dire?
Jason, Il est la forme abrégée de mon nom. Mon nom est Prasanna et je suis originaire de l'Inde!!! 🙂 😉 J'ai appris le français pendant six ans!!
Oh voila, ca fais du sens.
Merci beaucoup.
Jason...AKA CirqueDeSQLeil
_______________________________________________
I have given a name to my pain...MCM SQL Server, MVP
SQL RNNR
Posting Performance Based Questions - Gail Shaw[/url]
Learn Extended Events
April 20, 2010 at 10:13 pm
CirquedeSQLeil (4/20/2010)
COldCoffee (4/20/2010)
CirquedeSQLeil (4/20/2010)
COldCoffee (4/17/2010)
Cheers!!C'est Pras!
ColdCoffee - in your sig line, Je ne suis pas au currant avec le mot 'Pras,' qu'est-ce que cela veut dire?
Jason, Il est la forme abrégée de mon nom. Mon nom est Prasanna et je suis originaire de l'Inde!!! 🙂 😉 J'ai appris le français pendant six ans!!
Oh voila, ca fais du sens.
Merci beaucoup.
De rien, mon ami! 😉
April 20, 2010 at 10:50 pm
This should help.
Declare @vTab Table (machinename varchar(10),dateRec datetime,recID int)
Insert into @vTab
Select 'machine 1', '4/14/2010', 1 Union ALL
Select 'machine 1' ,'4/12/2010', 2 Union ALL
Select 'machine 2' ,'4/12/2010', 3 Union ALL
Select 'machine 2' ,'4/15/2010', 3 Union ALL
Select 'machine 3' ,'4/12/2010', 4
;with wcte as (
Select machinename,daterec,row_number() over (partition by machinename order by dateRec desc) as rno
from @vTab
) --Select * from wcte
Delete from wcte where rno > 1
Plz try on Test data.
Atif Sheikh
April 22, 2010 at 2:43 pm
Atif Sheikh (4/20/2010)
Declare @vTab Table (machinename varchar(10),dateRec datetime,recID int)
Insert into @vTab
Select 'machine 1', '4/14/2010', 1 Union ALL
Select 'machine 1' ,'4/12/2010', 2 Union ALL
Select 'machine 2' ,'4/12/2010', 3 Union ALL
Select 'machine 2' ,'4/15/2010', 3 Union ALL
Select 'machine 3' ,'4/12/2010', 4
;with wcte as (
Select machinename,daterec,row_number() over (partition by machinename order by dateRec desc) as rno
from @vTab
) --Select * from wcte
Delete from wcte where rno > 1
I have just done something similar to get rid of duplicates. My first response would have been similar to Atif's above. Like this (which I no longer think is the way to go) :
Declare @vTab2 Table (machinename varchar(10),dateRec datetime,recID int)
Insert into @vTab2
Select 'machine 1', '4/14/2010', 1 Union ALL
Select 'machine 1' ,'4/12/2010', 2 Union ALL
Select 'machine 2' ,'4/12/2010', 3 Union ALL
Select 'machine 2' ,'4/15/2010', 3 Union ALL
Select 'machine 3' ,'4/12/2010', 4
;with wcte as (
Select machinename, daterec
,rank() over (partition by machinename order by dateRec desc) as rno
from @vTab2
)
, cteDups as (
SELECT wcte.machinename, wcte.daterec from wcte where rno > 1
)
DELETE vT2
FROM cteDups
INNER JOIN
@vTab2 vT2
ON cteDups.machinename = vT2.machinename
AND cteDups.daterec = vT2.daterec
I think Atif's code is better because it avoids the join back to the original table I would have used.
Also I think you could just get away with doing the following in Atif's version and not include the machinename,daterec in the select:
;with wcte as (
Select row_number() over (partition by machinename order by dateRec desc) as rno
from @vTab
) --Select * from wcte
Delete from wcte where rno > 1
But if you need to report on the rows you are deleting which was part of a requirement I had you might do this:
Declare @vTab3 Table (machinename varchar(10),dateRec datetime,recID int)
Declare @vDelTab Table (machinename varchar(10),dateRec datetime,recID int)
Insert into @vTab3
Select 'machine 1', '4/14/2010', 1 Union ALL
Select 'machine 1' ,'4/12/2010', 2 Union ALL
Select 'machine 2' ,'4/12/2010', 3 Union ALL
Select 'machine 2' ,'4/15/2010', 3 Union ALL
Select 'machine 3' ,'4/12/2010', 4
;WITH wcte AS (
SELECT machinename, dateRec, recID
,rank() OVER (PARTITION BY machinename ORDER BY dateRec DESC) AS rno
FROM @vTab3
)
DELETE wcte
OUTPUT DELETED.machinename, DELETED.dateRec, DELETED.recID
INTO @vDelTab (machinename, dateRec, recID)
WHERE rno > 1
SELECT * FROM @vTab3
SELECT * FROM @vDelTab
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply