May 12, 2008 at 9:51 pm
Comments posted to this topic are about the item Delete Duplicates
Jamie
May 13, 2008 at 12:23 am
Grrrreat question, Excellent answer, 'nuff said:P:P:P:P:P:P
:-PManie Verster
Developer
Johannesburg
South Africa
I can do all things through Christ who strengthens me. - Holy Bible
I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)
May 13, 2008 at 12:36 am
I'm wrong here but this is not correct question;
1 & 2 are the same ... with same condition! So the correct answer is 2 but the 1 is the same everything ... too stupid!
3 is different!
also with three options I do not have valid parse for T-SQL code!
how can you explain it!?
:hehe::w00t: :P:P:P:P:P
May 13, 2008 at 12:43 am
OHHH nooooooooooooooooo
I work just from select statement not "with numbered"
pthuuuu
nononononoonononononononono I anaylized very well what happend but without this piece of code!
:hehe:
May 13, 2008 at 2:19 am
whats happened to sql - it used to be so simple 😉
May 13, 2008 at 2:20 am
dougi - are you using sql 2000 by any chance?
May 13, 2008 at 6:21 am
I agree, this was an excellent question!
May 13, 2008 at 7:15 am
Thanks for an Excellent question really rather simply once one understands how partitioning works........... learned a lot from it ...
May 13, 2008 at 7:28 am
Gives me more stuff to figure out, thanks for the question.
May 13, 2008 at 7:36 am
A semicolon at the beginning of the statement is valid
I'm seeing a lot of this lately. I.e., statements such as:
;WITH ...
Note that the semicolon is NOT part of the WITH statement syntax. This issue only arises if the prior statement did not end with a semicolon (which it should). So the leading semicolon is terminating the prior statement.
SQL Server is starting to enforce proper statement termination. IMO, long overdue. Get into the habit of properly terminating ALL of your T-SQL statements. Even Microsoft recommends using it. From BOL:
Many code examples use a semicolon ( ; ) as a Transact-SQL statement terminator. Although the semicolon is not required, using it is considered a good practice.
May 13, 2008 at 7:37 am
I agree with the others, great question!
Q
Please take a number. Now serving emergency 1,203,894
May 13, 2008 at 10:07 am
This anwer works for the given data but wouldn't work if we inserted a row like:
insert into #new(id,keyvalue) values (1,'bb')
The stated answer assumes a duplicate is defined by the ID column and would delete the above row even thought it's unique in the table. Since no PK is defined on the table I think we must assume a duplicate is defined by multiple rows with identical values for every column. So if we include both ID and keyvalue columns in the over() clause we will delete only non-unique rows:
with numbered as
(
SELECT rowno=row_number() over (partition by ID, keyvalue order by ID, keyvalue),ID,keyvalue
from #new
)
delete from numbered where rowno>1
May 13, 2008 at 11:03 am
J is correct.
As for SQL, the old one will work too for this particular case:
declare @I int
select @I = (select count(convert(varchar(2),ID)+Keyvalue) from #new
group by (convert(varchar(2),ID)+Keyvalue)
having count(convert(varchar(2),ID)+Keyvalue) >1)
set @I = @I-1
set rowcount @I
delete #New where convert(varchar(2),ID)+Keyvalue
in
(select convert(varchar(2),ID)+Keyvalue from #new
group by (convert(varchar(2),ID)+Keyvalue)
having count(convert(varchar(2),ID)+Keyvalue) >1)
Regards,Yelena Varsha
May 13, 2008 at 11:05 am
So if we include both ID and keyvalue columns in the over() clause we will delete only non-unique rows:
The angle you derive with the "Partition by" clause is literally a "partition" of a particular cluster of fields defined uniquely. If you prefer, then also, use only the "order by" portion of the clause and create a numeric cluster("PK" ID ) that is dynamic [rather than static].
Select rowno=Row_Number() over(Order by mychoiceofID,2ndchoice,etcchoice), * from mytable.
Jamie
May 13, 2008 at 12:06 pm
Excellent question. I learned something useful.
One application of OVER that seems to have potential (note hedging 😉 ) is that it allows applying aggregate function to more than one different grouping in the same query. For instance,
select customerID, productID, orderDate, orderAmount,
CustomerTotal = sum(orderAmount) over (partition by customerID),
ProductTotal = sum(orderAmount) over (partition by productID)
from Orders
Any thoughts on performance?
Edited for clarity.
Viewing 15 posts - 1 through 15 (of 22 total)
You must be logged in to reply to this topic. Login to reply