March 3, 2009 at 9:31 am
Ok, after beating my head for a day there has GOT to be an easy way of doing this. MySQL can do it in about a dozen different ways, but MSSQL seems to not be able to.
I have an aggregate query which piles through a bunch of databases and returns a result set like so (simplified):
entry_id,group_id,cost
1001 2001 5
1001 2001 4
1001 2000 5
1002 2001 3
1002 2000 2
now all I need to do is skip the second row so that I can use the cost to calculate something based on transactions. Very simple! Basically generate distinct results on entry_id and group_id but not on cost. Can't figure it out.
I've tried distinct, but it works on all rows. This is a simplified result set, so using min/max as aggregates to get around the distinct limitation won't work because it won't choose an entire row. I've tried creating a temporary table with unique constraint, but can't find a way to get it to just insert the rows that don't fail. There doesn't appear to be a first aggreate function. I've tried doing a where not exists select but because of how complicated the initial query is it becomes a huge mess.
All I need to do is return the result set with distinct entry_id,group_id columns by choosing only the first result that matches in the result set.
There has got to be a way to do this, right?
Of course, to make matters worse, I have to use SQL Server 2000.
Thanks in advance for your assistance!
March 3, 2009 at 9:34 am
Do you have a primary key to determine the order?
March 3, 2009 at 9:37 am
Yes, the results are returned in such an order that the first result is the one that I want to keep, any subsequent results with the same values across those two columns can be ignored or discarded.
March 3, 2009 at 9:39 am
If you can let me know what the name of the order cokumn is then i can help you ..
If not then this can be using ROW_ORDER()
March 3, 2009 at 9:43 am
I did a blog once on something like this - would this help any?
March 3, 2009 at 9:54 am
Ok, let's assume there is another column:
entry_id,group_id,cost,level
1001 2001 5 7
1001 2001 4 5
1001 2000 5 8
1002 2001 3 2
1002 2000 2 6
for any given set of rows where the entry and group ids are equal I need to keep the row that has the highest level. The results will always be returned so that for any given set of matching entry and group ids the first record is the one I want to keep.
March 3, 2009 at 9:55 am
sqlcentral (3/3/2009)
Yes, the results are returned in such an order that the first result is the one that I want to keep, any subsequent results with the same values across those two columns can be ignored or discarded.
Don't count on the order being consistent unless you have an ORDER BY which is based on a unique combination (which will guarantee the rows will always be returned in a consistent order).
As of right now - like has been mentioned before- we're missing one column (to specify the order) in order to generate an example.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
March 3, 2009 at 9:56 am
You need a way to fully define your ordering to give you a "first".
See if either of these help
with cte as (
select entry_id,group_id,cost,
row_number() over(partition by entry_id,group_id order by level desc) as rn
from mytable)
select entry_id,group_id,cost
from cte
where rn=1
select a.entry_id,a.group_id,a.cost
from mytable a
where not exists (select * from mytable b
where b.entry_id=a.entry_id
and b.group_id=a.group_id
and b.level>a.level)
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537March 3, 2009 at 9:56 am
Lee: Thanks, that looks neat but I shudder while thinking how it would apply for non-trivial result sets. That looks like a lot of work for a relatively simple problem.
I can't believe that there is no way to do an insert into a table and have it skip failed rows instead of trash the entire insert operation.
March 3, 2009 at 10:00 am
sqlcentral (3/3/2009)
Lee: Thanks, that looks neat but I shudder while thinking how it would apply for non-trivial result sets. That looks like a lot of work for a relatively simple problem.I can't believe that there is no way to do an insert into a table and have it skip failed rows instead of trash the entire insert operation.
There is a very simple way, you just need to tell us what the unique column is that defines the order of your data.
March 3, 2009 at 10:13 am
Steveb: I already added another post with a column called level.
March 3, 2009 at 10:14 am
Mark: the row_number() over partition looks like it's working!!
Please tell me that is Sql Server 2000 compliant 🙂 (my test machine is Sql 2005 Express)
March 3, 2009 at 10:19 am
sqlcentral (3/3/2009)
Mark: the row_number() over partition looks like it's working!!Please tell me that is Sql Server 2000 compliant 🙂 (my test machine is Sql 2005 Express)
Unfortunately, Row_number() won't work on sql 2000.
March 3, 2009 at 10:22 am
sqlcentral (3/3/2009)
Please tell me that is Sql Server 2000 compliant 🙂 (my test machine is Sql 2005 Express)
Nope. 2005 and higher only. You did post in a 2005 forum.
Doing it in 2000 is a lot more tricky and usually involves temp tables and identity columns.
See if Mark's othe query helps.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
March 3, 2009 at 10:26 am
sqlcentral (3/3/2009)
Steveb: I already added another post with a column called level.
Thanks, this code should work orderd by Level
SELECT * FROM YourTable WHERE LEVEL IN(
SELECT MAX(LEVEL) AS LEVEL
FROM YouTable
GROUP BY [entry_id], [group_id])
Viewing 15 posts - 1 through 15 (of 29 total)
You must be logged in to reply to this topic. Login to reply