May 4, 2012 at 3:41 am
Hello everyone, I'm a beginner in SQL and i have difficulty to find solution for the folowing problem:
I'd like to do a SELECT which eliminate successive rows having the same "stat" and keep only the last occurrence.
To make myself better understood, I give you this example:
My table Results
idLigne | id | statut idLigne | id | statut
L1 1 A L2 1 A
L2 1 A L6 1 B
L3 1 B L7 1 A
L4 1 B ====> L8 1 B
L5 1 B
L6 1 B
L7 1 A
L8 1 B
Thank you
May 4, 2012 at 5:55 am
Hi,
Try:
with CTE as
(
select
idLigne,
id,
statut,
ROW_NUMBER() OVER(PARTITION BY id, statut ORDER BY idLigne) AS 'RowNo'
from MyTable
)
select * from CTE where RowNo = 1
Hope this helps.
May 4, 2012 at 7:24 am
I'm guessing something like this:
/*
My table Results
idLigne | id | statut idLigne | id | statut
L1 1 A L2 1 A
L2 1 A L6 1 B
L3 1 B L7 1 A
L4 1 B ====> L8 1 B
L5 1 B
L6 1 B
L7 1 A
L8 1 B
*/
create table dbo.MyTest (
idLigne char(2),
id int,
statut char(1)
);
go
insert into dbo.MyTest
values ('L1',1,'A'),('L2',1,'A'),('L3',1,'B'),('L4',1,'B'),('L5',1,'B'),('L6',1,'B'),('L7',1,'A'),('L8',1,'B');
go
select * from dbo.MyTest;
go
-- Assumption based on data from OP: ifLigne is an ever increasing value for ordering the data
with BaseData as (
select
ROW_NUMBER() over (order by idLigne) rownum,
idLigne,
id,
statut
from
dbo.MyTest
)
select
bd1.idLigne,
bd1.id,
bd1.statut
from
BaseData bd1
left outer join BaseData bd2
on (bd1.rownum = bd2.rownum - 1
and bd1.id = bd2.id
and bd1.statut = bd2.statut)
where
bd2.rownum is null
;
go
drop table dbo.MyTest;
go
May 4, 2012 at 7:57 am
Sorry, my suggestion is completely wrong.
May 4, 2012 at 8:02 am
i was attacking it the same way lynn was and my code is very similar. i may suggest making idLigne an INT instead of a VARCHAR as that would eliminate the need for the CTE to row_number() to get nice INT's for the join math.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
May 4, 2012 at 8:07 am
Ummmm... I've not run any of the code but, just by looking, shouldn't any ordering be done in descending order so that you get the "latest" row like what the OP posted?
--Jeff Moden
Change is inevitable... Change for the better is not.
May 4, 2012 at 8:21 am
capn.hector (5/4/2012)
i was attacking it the same way lynn was and my code is very similar. i may suggest making idLigne an INT instead of a VARCHAR as that would eliminate the need for the CTE to row_number() to get nice INT's for the join math.
My guess would be keep the cte to sequentially number the rows in case idLigne values have gaps due to deletions or such. Sort of like identity values that could get gaps.
May 4, 2012 at 8:24 am
Jeff Moden (5/4/2012)
Ummmm... I've not run any of the code but, just by looking, shouldn't any ordering be done in descending order so that you get the "latest" row like what the OP posted?
Actually Jeff, mine does return the latest values. Probably could number the rows in descending order and with a slight change in logic get the same results.
May 4, 2012 at 7:15 pm
This is one type of problem where it is perhaps easier to see the logic using SQL Server 2012 syntax:
DECLARE @MyTable AS TABLE
(
idLigne character(2) PRIMARY KEY,
id integer NOT NULL,
statut character(1) NOT NULL
);
INSERT @MyTable
(idLigne, id, statut)
VALUES
('L1',1,'A'),
('L2',1,'A'),
('L3',1,'B'),
('L4',1,'B'),
('L5',1,'B'),
('L6',1,'B'),
('L7',1,'A'),
('L8',1,'B');
SELECT
WithNextRow.idLigne,
WithNextRow.id,
WithNextRow.statut
FROM
(
-- Add the next statut in sequence to the set
SELECT
mt.idLigne,
mt.id,
mt.statut,
next_statut = LEAD(mt.statut, 1, CHAR(0)) OVER (ORDER BY mt.idLigne)
FROM @MyTable AS mt
) AS WithNextRow
WHERE
-- Show only rows where the next statut is different
-- from the current one
WithNextRow.statut <> WithNextRow.next_statut;
Without LEAD (which is new for 2012), we have to number the rows in the desired sequence, and use that to perform a self-join (linking the 'current' row to the next or previous one) as shown in Lynn's example. The new feature may be more or less efficient than the self-join in different circumstances.
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 4, 2012 at 7:42 pm
Lynn Pettis (5/4/2012)
Jeff Moden (5/4/2012)
Ummmm... I've not run any of the code but, just by looking, shouldn't any ordering be done in descending order so that you get the "latest" row like what the OP posted?Actually Jeff, mine does return the latest values. Probably could number the rows in descending order and with a slight change in logic get the same results.
I stand corrected. You might want to take a look at the actual execution plan, though. You have an accidental cross-join (returns 8*8 or 64 internal rows). I don't know if the optimizer will chose to make that go away when a larger number of rows is present.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 4, 2012 at 7:56 pm
anwar.melh (5/4/2012)
Hello everyone, I'm a beginner in SQL and i have difficulty to find solution for the folowing problem:I'd like to do a SELECT which eliminate successive rows having the same "stat" and keep only the last occurrence.
To make myself better understood, I give you this example:
My table Results
idLigne | id | statut idLigne | id | statut
L1 1 A L2 1 A
L2 1 A L6 1 B
L3 1 B L7 1 A
L4 1 B ====> L8 1 B
L5 1 B
L6 1 B
L7 1 A
L8 1 B
Thank you
There is nothing in the data you've shown to preserve the order of the rows. idLigne will make a mess of things just as soon as it gets to "L10" for sorting purposes.
Do you have a different column that will produce the correct sort order?
--Jeff Moden
Change is inevitable... Change for the better is not.
May 4, 2012 at 7:59 pm
Jeff Moden (5/4/2012)
Lynn Pettis (5/4/2012)
Jeff Moden (5/4/2012)
Ummmm... I've not run any of the code but, just by looking, shouldn't any ordering be done in descending order so that you get the "latest" row like what the OP posted?Actually Jeff, mine does return the latest values. Probably could number the rows in descending order and with a slight change in logic get the same results.
I stand corrected. You might want to take a look at the actual execution plan, though. You have an accidental cross-join (returns 8*8 or 64 internal rows). I don't know if the optimizer will chose to make that go away when a larger number of rows is present.
Ah... got it. If you change your LEFT OUTER JOIN to a LEFT MERGE JOIN, the accidental cross join goes away, reads drop from 40 to 6, and performance becomes comparatively lightning quick. In this case, I think it's ok to use the join hint because the ROW_NUMBER() function will produce the sorted output that a MERGE JOIN actually needs.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 4, 2012 at 8:08 pm
Jeff Moden (5/4/2012)
Ah... got it. If you change your LEFT OUTER JOIN to a LEFT MERGE JOIN, the accidental cross join goes away, reads drop from 40 to 6, and performance becomes comparatively lightning quick. In this case, I think it's ok to use the join hint because the ROW_NUMBER() function will produce the sorted output that a MERGE JOIN actually needs.
I prefer OPTION (MERGE JOIN) over a join hint. Using a join hint adds an implied FORCE ORDER too. That may not matter here, but it will in other queries. That said, using MERGE here does add extra sorts to the plan, which may not be desirable. The optimizer fails to recognize that ROW_NUMBER produces a unique, ordered stream, so sorts are introduced and the merge join is implemented as many-to-many, which requires a work table. There is an open Connect item for this: http://connect.microsoft.com/SQLServer/feedback/details/347442/optimizer-does-not-remove-unnessecary-distinct-in-counjunction-with-row-count
Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi
May 4, 2012 at 9:32 pm
SQL Kiwi (5/4/2012)
Jeff Moden (5/4/2012)
Ah... got it. If you change your LEFT OUTER JOIN to a LEFT MERGE JOIN, the accidental cross join goes away, reads drop from 40 to 6, and performance becomes comparatively lightning quick. In this case, I think it's ok to use the join hint because the ROW_NUMBER() function will produce the sorted output that a MERGE JOIN actually needs.I prefer OPTION (MERGE JOIN) over a join hint. Using a join hint adds an implied FORCE ORDER too. That may not matter here, but it will in other queries. That said, using MERGE here does add extra sorts to the plan, which may not be desirable. The optimizer fails to recognize that ROW_NUMBER produces a unique, ordered stream, so sorts are introduced and the merge join is implemented as many-to-many, which requires a work table. There is an open Connect item for this: http://connect.microsoft.com/SQLServer/feedback/details/347442/optimizer-does-not-remove-unnessecary-distinct-in-counjunction-with-row-count
Damn it. I was in a hurry and didn't scroll far enough to the right on the execution plan. :blush: You're absolutely correct about the extra sorts.
I think the extra sorts are absolutely worth it, though. They get rid of the accidental cross join, nearly an order of magnitude of reads, and runs several orders of magnitude faster according to priofiler.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 4, 2012 at 9:36 pm
To continue, both queries use a "Work Table". The one with the MERGE hint uses it a whole lot less, though.
[font="Courier New"]Warning: The join order has been enforced because a local join hint is used.
(4 row(s) affected)
Table 'Worktable'. Scan count 1, logical reads 31, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTest'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(4 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'MyTest'. Scan count 2, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.[/font]
--Jeff Moden
Change is inevitable... Change for the better is not.
Viewing 15 posts - 1 through 15 (of 22 total)
You must be logged in to reply to this topic. Login to reply