Select distinct on multiple columns

Question

Post reply

Select distinct on multiple columns

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply

sqlcentral-1032101 Old Hand Points: 314 More actions · Answer 1

Yeah, there was no forum for Sql server 2000.

I'm going to go cry now 😉

sqlcentral-1032101 Old Hand Points: 314 More actions · Answer 2

steveb (3/3/2009)
Thanks, this code should work orderd by Level
SELECT * FROM YourTable WHERE LEVEL IN(
SELECT MAX(LEVEL) AS LEVEL
FROM YouTable
GROUP BY [entry_id], [group_id])

That will only ever allow me to return 1 row, correct? Assuming that I add in the where entry_id=? and group_id=? because otherwise it will only ever return results with the same level, where not all results that I want have the same level.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 3

sqlcentral (3/3/2009)
Yeah, there was no forum for Sql server 2000.

Yes there is. The 2000 forums are below the 2005 forums on the forum home page.

T-SQL (2000)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

steveb. SSC-Forever Points: 46733 More actions · Answer 4

sqlcentral (3/3/2009)
steveb (3/3/2009)
Thanks, this code should work orderd by Level
SELECT * FROM YourTable WHERE LEVEL IN(
SELECT MAX(LEVEL) AS LEVEL
FROM YouTable
GROUP BY [entry_id], [group_id])
That will only ever allow me to return 1 row, correct? Assuming that I add in the where entry_id=? and group_id=? because otherwise it will only ever return results with the same level, where not all results that I want have the same level.

No, this query will return all the distinct rows in your table where the level is the highest for each entry_Id and group_id

sqlcentral-1032101 Old Hand Points: 314 More actions · Answer 5

GilaMonster (3/3/2009)
Yes there is. The 2000 forums are below the 2005 forums on the forum home page.
T-SQL (2000)

Sorry! When I was scanning the list I didn't notice the master groupings (i.e. I was only scanning under the master header for 2005)... Whoops!

Anyway.

Is the with/as syntax 2000 compliant?

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

sqlcentral (3/3/2009)
That will only ever allow me to return 1 row, correct?

No. There's a group by in the subquery, so that query will return 1 row for each [entry_id], [group_id], the one with the highest level.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

sqlcentral-1032101 Old Hand Points: 314 More actions · Answer 7

steveb (3/3/2009)
No, this query will return all the distinct rows in your table where the level is the highest for each entry_Id and group_id

Ok, perhaps my sample data wasn't broad enough. The level field isn't unique across all rows, only across rows with the entry and group ids identical. For example several rows have a level of 1 but different entry and group ids, so this query doesn't appear to work (tried it) because the inner query returns levels that exist for the duplicated rows. Here is some more accurate sample data:

entry_id,group_id,cost,level

1001 2001 5 3

1001 2001 4 1

1001 2000 5 1

1002 2001 4 2

1002 2001 3 1

1002 2000 2 1

So in this case your inner query will return (1,2,3) which will cause the outer query to return all results.

steveb. SSC-Forever Points: 46733 More actions · Answer 8

sqlcentral (3/3/2009)
steveb (3/3/2009)
No, this query will return all the distinct rows in your table where the level is the highest for each entry_Id and group_id
Ok, perhaps my sample data wasn't broad enough. The level field isn't unique across all rows, only across rows with the entry and group ids identical. For example several rows have a level of 1 but different entry and group ids, so this query doesn't appear to work (tried it) because the inner query returns levels that exist for the duplicated rows. Here is some more accurate sample data:
entry_id,group_id,cost,level
1001 2001 5 3
1001 2001 4 1
1001 2000 5 1
1002 2001 4 2
1002 2001 3 1
1002 2000 2 1
So in this case your inner query will return (1,2,3) which will cause the outer query to return all results.

That is why i asked for a unique column that defines the order, without a primary key on your table then this is going to be difficult.

is there a reason you are avoiding the use of primary keys?

sqlcentral-1032101 Old Hand Points: 314 More actions · Answer 9

steveb (3/3/2009)
That is why i asked for a unique column that defines the order, without a primary key on your table then this is going to be difficult.
is there a reason you are avoiding the use of primary keys?

Sorry, maybe I don't understand.

This is an aggregate query that generates that list based off of several databases and conditions, so where would a primary key fit in?

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 10

sqlcentral (3/3/2009)
The level field isn't unique across all rows, only across rows with the entry and group ids identical.

Then try this

SELECT * FROM YourTable OuterTable WHERE EXISTS (

SELECT 1

FROM YouTable sub

WHERE sub.[entry_id] = OuterTable.entry_id and sub.[group_id] = OuterTable.group_id

GROUP BY [entry_id], [group_id]

HAVING MAX(sub.LEVEL) = OuterTable.level

)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

steveb. SSC-Forever Points: 46733 More actions · Answer 11

sqlcentral (3/3/2009)
steveb (3/3/2009)
That is why i asked for a unique column that defines the order, without a primary key on your table then this is going to be difficult.
is there a reason you are avoiding the use of primary keys?
Sorry, maybe I don't understand.
This is an aggregate query that generates that list based off of several databases and conditions, so where would a primary key fit in?

Okay Sorry i thought this was all in the same table

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 12

It ain't pretty, but will this get it?

-----------------------------------------------------------------

create table #temp (entry_id int ,group_id int,cost int,level int)

insert into #temp

select 1001, 2001, 5, 3 union all

select 1001, 2001, 4, 1 union all

select 1001, 2000, 5, 1 union all

select 1002, 2001, 4, 2 union all

select 1002, 2001, 3, 1 union all

select 1002, 2000, 2, 1

select t1.entry_id,t1.group_id,t1.cost,t1.level

from #temp t1

where cast(entry_ID as varchar(10))+cast(group_id as varchar(10))+cast(level as varchar(10))

IN (select top 1 cast(entry_ID as varchar(10))+cast(group_id as varchar(10))+cast(level as varchar(10))

from #temp t2

where t2.entry_id = t1.entry_id

and t2.group_id = t1.group_id

order by t2.level desc)

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 13

I guess SOMETHING did.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

sqlcentral-1032101 Old Hand Points: 314 More actions · Answer 14

I took a different approach and restructured my data so that I didn't need to do this. The with/as row_number()/partition worked beautifully and did exactly what I wanted and was fairly efficient for large sets of data, however I'm stuck with using SQL Server 2000 and every other solution is horribly inefficient and requires a massive amount of DB work to do something very simple. I also played with stored procedures (cursor with line-by-line insert to a temp table with a primary key defined over the columns that I wanted was the fastest approach, but far from elegant). I figured going back to the drawing board and changing the requirement so I didn't need to do that was a better approach. It requires a bit of duplication of data in the DB (yes, ugly) but works much better overall. Sometimes you need to trade a small increase in DB size for efficiency. Disk space is cheap and we're far more concerned with the amount of time it takes to perform the transaction, so this works out better in the long run anyway.

Thanks for all the suggestions!

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 15

Thanks for the reply. It's always good to know what the resolution was. I sent you that last example because I believe it will work on SQL 2000, but it wasn't tested against large datasets. If a redesign solved your problem, that's great. It's a good sign when thinking progresses from "How do I code a solution to this problem." to "Is there a better way to set up the database."

Have a good one 🙂

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills