March 29, 2012 at 3:11 pm
How can I delete everything from a table except for the results of a select statement? Like this:
delete from gwinn_hw where not in (select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, Max(d_vital)
from gwinn_hw
group by ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id)
I'm just not seeing how to do this. Thanks for any ideas.
March 29, 2012 at 3:56 pm
This should work, but you might want to run this against a copy of your table first to ensure that the results are what you are looking for.
select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, Max(d_vital) as d_vital
into #gh2
from gwinn_hw gh2
group by ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id
Delete
From gwinn_hw
Where Not Exists
(
select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id,d_vital
from #gh2 gh2
Where gwinn_hw.ptid = gh2.ptid and
gwinn_hw.pat_ext_id = gh2.pat_ext_id and
gwinn_hw.pat_last_name = gh2.pat_last_name and
gwinn_hw.pat_first_name = gh2.pat_first_name and
gwinn_hw.gwn_practice_id = gh2.gwn_practice_id and
gwinn_hw.kis_practice_id = gh2.kis_practice_id and
gwinn_hw.d_vital = gh2.d_vital
)
If Object_ID(N'tempdb.dbo.#gh2'N'U') is not null
Drop table #gh2
March 29, 2012 at 5:15 pm
Like so much of SQL the answer is it depends. In this case is the number of rows which will remain, a in significant number or percentage of the total number of rows in the existing table. If it is you might want to consider.
1. Create a table, lets call it gwinn_hw_2 to be an exact match of the existing table gwinn_hw. That is column sequence, data type, size etc., etc.
2. User your
(select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, Max(d_vital) from gwinn_hw
to insert the data you wish to preserve into table gwinn_hw_2.
(A most important advantage, is you can review what is in the table gwinn_hw_2 to be sure it is all that you require)
3. Drop the original table gwinn_hw.
4. Rename gwinn_hw_2 back to gwinn_hw
Or
1a. In the above where I have used gwinn_hw_2, make that a temp table named #gwinn_hw to be an exact match of the existing table gwinn_hw. That is column sequence, data type, size etc., etc.
2a. User your SELECT statement to insert the data you wish to preserve into table #gwinn_hw
(A side advantage, is you can review what is in the table #gwinn_hw to be sure it is all that you require.)
3a. Truncate the table gwinn_hw
4a. Insert the data from #gwinn_hw into gwinn_hw.
In either case test, test, and test again before executing step 3 in either of the suggestions above.
Realize when complete you should check your indexes, statistics .. etc.
March 30, 2012 at 6:28 am
Thank you for those ideas.
I was hoping to do this without a second table though. I'm not using all the fields in the table to get the max date that I need.
How could I add in the other fields, which are height and weight?
My real goal here is to get the information for the most recent date. But when I add height and weight in I get everything because that can be different on different dates.
Again thanks for your help.
March 30, 2012 at 7:28 am
Denise McMillan (3/30/2012)
Thank you for those ideas.I was hoping to do this without a second table though. I'm not using all the fields in the table to get the max date that I need.
How could I add in the other fields, which are height and weight?
My real goal here is to get the information for the most recent date. But when I add height and weight in I get everything because that can be different on different dates.
Again thanks for your help.
The above will probably change any suggestion previously posted to be irrelevant to your actual problem.
Could / would you post the table definition, some sample data, and required results.
Please click on the first link in my signature block. The article referenced includes T-SQL code to allow you to do this quickly and easily.
April 2, 2012 at 1:04 am
Is something like this what you mean to do (try it in a Test database :-))?
delete g
FROM gwinn_hw g
LEFT OUTER JOIN (
select ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id
, Max(d_vital) as d_vital
from gwinn_hw
group by ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id) x
ON x.ptid = g.ptid and x.pat_ext_id = g.pat_ext_id and x.pat_last_name = g.pat_last_name and
x.pat_first_name = g.pat_first_name and x.gwn_practice_id = g.gwn_practice_id and
x.kis_practice_id = g.kis_practice_id and x.d_vital <> g.d_vital
WHERE ptid IS NULL
Edit: I used = on d_vital but meant to use <>
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
April 2, 2012 at 4:52 am
My preference for this would be the use of CTEs (Common Table Expressions); I find this the easiest way to visualise a problem because you can break it into discreet testable steps. You don't need CTES, you could use nested SELECTs for co-Related sub queries.
1) Get the records you want to keep
2) get their unique keys
3) delete records whose unique key is not in this list
E.g.
WITH CTE01 AS -- Get the last date for each combo
(
SELECT Item_ID,height,width,weight,max(insert_date) as 'insert_date'
FROM myTable
Group by Item_ID,height,width,weight,max(insert_date)
)
,
CTE02 AS -- get the primary key
(
Select PK_ID from myTable T
JOIN CTE01 C where T.item_ID = C.Item_ID
and T.height = C.height
and T.width = c.width
and T.weight = C.weight
and T.insert_date = C.insert_date
)
Delete from myTable T where T.PK_ID not in (Select PK_ID from CTE02)
April 2, 2012 at 7:41 am
aaron.reese (4/2/2012)
My preference for this would be the use of CTEs (Common Table Expressions); I find this the easiest way to visualise a problem because you can break it into discreet testable steps. You don't need CTES, you could use nested SELECTs for co-Related sub queries.1) Get the records you want to keep
2) get their unique keys
3) delete records whose unique key is not in this list
E.g.
WITH CTE01 AS -- Get the last date for each combo
(
SELECT Item_ID,height,width,weight,max(insert_date) as 'insert_date'
FROM myTable
Group by Item_ID,height,width,weight,max(insert_date)
)
,
CTE02 AS -- get the primary key
(
Select PK_ID from myTable T
JOIN CTE01 C where T.item_ID = C.Item_ID
and T.height = C.height
and T.width = c.width
and T.weight = C.weight
and T.insert_date = C.insert_date
)
Delete from myTable T where T.PK_ID not in (Select PK_ID from CTE02)
slight rewrite of the delete statement to get rid of the sub select if its the way you want to go.
WITH CTE01 AS -- Get the last date for each combo
(
SELECT Item_ID,height,width,weight,max(insert_date) as 'insert_date'
FROM myTable
Group by Item_ID,height,width,weight,max(insert_date)
)
,
CTE02 AS -- get the primary key
(
Select PK_ID from myTable T
JOIN CTE01 C where T.item_ID = C.Item_ID
and T.height = C.height
and T.width = c.width
and T.weight = C.weight
and T.insert_date = C.insert_date
)
--Nothing changed above, included to make things easier.
Delete t from myTable T
LEFT JOIN CTE02 c
ON t.PK_ID = c.PK_ID
WHERE t.PK_ID IS NULL
EDIT: Missed /code tag
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 2, 2012 at 1:13 pm
yes, that is probably a better syntax (personally I think it is a little harder to read) but it will work for multi-part primary keys whereas the IN (SELECT...) won't
April 2, 2012 at 1:25 pm
aaron.reese (4/2/2012)
yes, that is probably a better syntax (personally I think it is a little harder to read) but it will work for multi-part primary keys whereas the IN (SELECT...) won't
i found removing sub selects when ever possible opens up more possibilities for selectivity. also makes code a little easier to read IMHO
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
April 5, 2012 at 8:17 am
Here is another possibility. Since you haven't provided test data, it's difficult to compare the various methods. This method only scans the table once, but has more logical reads.
WITH CTE AS (
SELECT ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id, d_vital
, ROW_NUMBER() OVER(
PARTITION BY ptid, pat_ext_id, pat_last_name, pat_first_name, gwn_practice_id, kis_practice_id
ORDER BY d_vital DESC) AS rn
FROM gwinn_hw
)
DELETE FROM CTE
WHERE rn > 1
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 11 posts - 1 through 10 (of 10 total)
You must be logged in to reply to this topic. Login to reply