need to filter rows from a table by distinct

Question

Post reply

need to filter rows from a table by distinct

winston Smith

SSCoach

Points: 19500
More actions
May 26, 2006 at 9:28 am

#112420

i need to get all the distinct rows from a table. it works fine when i try

select distinct ( col1) from tableA

but when i try :

select distinct(col1),col2, col3 from tableA
it doesnt work.

it also doesnt wok if i try
select distinct( col1,col2,col3)from tableA.

i really need the rows from the table with a distinct col1 value. any ideas?

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply

PW-201837 SSC-Insane Points: 20805 More actions · Answer 1

See this recent, similar thread:

http://www.sqlservercentral.com/forums/shwmessage.aspx?forumid=8&messageid=282795

Distinct works across all the columns in the result set. If you want unique col1 values, but also want col2 & col3 in your resultset, you need to provide the business rules about which records to include if col1 values repeat with different col2/3

winston Smith SSCoach Points: 19500 More actions · Answer 2

i have read the post which you provided a link to, but im still a bit hazy.

heres my exact problem. i some rows in my table( not very many) are duplicated exactly.

i need to be able to 1 of each of the duplicated rows into a temp table, i can then delete all the duplicates from the main table and insert the rows back in from the temp table. i need all the data from all columns. i now realise that distinct only works on one col but how do i get the effect of a distinct for each row in this case?

PW-201837 SSC-Insane Points: 20805 More actions · Answer 3

DISTINCT syntax does not use parentheses. This should work:

SELECT DISTINCT col1,col2,col3

INTO DeDupedTable

FROM tableA

winston Smith SSCoach Points: 19500 More actions · Answer 4

the problem with that is it will select all rows in the table that are distinct, which is most rows. i only wnt the ones that are duplicates

David McFarland SSChampion Points: 11815 More actions · Answer 5

I think the confusion might be based on your definition of the terms distinct and duplicate. Both imply that the combination of all of the columns in a select list are distinct, not just some of the values, but I'm thinking that's not the case with your situation.

Are you trying to identify rows where col1 is distinct, and then return that as well as col2 and col3, or are you wanting the combination of col1, col2, and col3 to be distinct?

If the latter, PW's code should do the trick for you.

David McFarland SSChampion Points: 11815 More actions · Answer 6

Upon reading your latest message, it sounds like you want all rows where the values in col1, col2, and col3 taken together appear in more than 1 row.

Try: SELECT col1, col2, col3 FROM tableA GROUP BY col1, col2, col3 HAVING Count(1) > 1

winston Smith SSCoach Points: 19500 More actions · Answer 7

i want to return all the data from all the rows rows where col1 is not distinct. col1 isnt a primary key, but its still supposed to be distinct.

so i want col1, col2, col3,..... where col1 is not distinct

winston Smith SSCoach Points: 19500 More actions · Answer 8

here is the actual query i am running to get the data i require ( sorry its a bit long but it might demonstraight exactly what im looking to get:

select

InvoiceGuid

,CreditNoteID

,InvoiceHeaderID

,InvoiceBatchID

,BillToAccountName

,BilltoAccountCountry

,BillToAccountCountryCode

,BillToContactName

from mcsInvoice

goup

by invoicebatchId

having

count(invoiceBatchId)>1

i get this error:

Msg 156, Level 15, State 1, Line 59

Incorrect syntax near the keyword 'by'.

Can anyone tell me what the problem is here?

David McFarland SSChampion Points: 11815 More actions · Answer 9

There is no "r" in your GROUP BY statement.

It won't do what it seems like you're wanting anyway, as you aren't aggregating. Based on what I think is your intent, try this:

SELECT

InvoiceGuid

,CreditNoteID

,InvoiceHeaderID

,InvoiceBatchID

,BillToAccountName

,BilltoAccountCountry

,BillToAccountCountryCode

,BillToContactName

FROM

mcsInvoice

WHERE

InvoiceBatchID IN (

SELECT

InvoiceBatchID

FROM

mcsInvoice

GROUP BY

InvoiceBatchID

HAVING

Count(1) > 1)

winston Smith SSCoach Points: 19500 More actions · Answer 10

SELECT

InvoiceGuid

,CreditNoteID

,InvoiceHeaderID

,InvoiceBatchID

,BillToAccountName

,BilltoAccountCountry

,BillToAccountCountryCode

,BillToContactName

FROM

mcsInvoice

WHERE

InvoiceBatchID IN (

SELECT

InvoiceBatchID

FROM

mcsInvoice

GROUP BY

InvoiceBatchID

HAVING

Count(1) > 1)

i have tried this and it returns everything in my table. there are over 3000 rows in the table. i am pretty sure there are about 600 duplicates

David McFarland SSChampion Points: 11815 More actions · Answer 11

When you say distinct and duplicate, are you referring to the InvoiceBatchID? I made that assumption based on the code you pasted.

If that's the case, try "SELECT Count(Distinct InvoiceBatchID) FROM mcsInvoice" and tell us what number you get back. My code above, while untested, should do that, if InvoiceBatchID is indeed the column you want to find the dupes on.