March 20, 2014 at 3:12 am
Hi Friends,
Create Table Sample (ID int not null primary key, RefID int , SeqNo int , Name varchar(10) )
insert into Sample
select 1, 1000, 1, 'Mike'
union
select 2, 1000, 2, 'Mikey'
union
select 3, 1000, 3, 'Michel'
union
select 4, 1001, 1, 'Carmel'
union
select 5, 1001, 2, 'Camy'
union
select 6, 1002, 1, 'Joe'
union
select 7, 1002, 3, 'Jeff'
union
select 8, 1002, 4, 'Martin'
select * from Sample
I have here sample data given.
What I want to do is, I want to check the RefID which is not having proper order of sequence number. If you see the RefID 1000, 1001 they are having properly sequence order in SeqNo field. But it is not in RefID 1002. RefID 1002 does not have proper order. It is because user has deleted a row which was having seqno 2. So i want to get what are all the RefID's are not having properly sequenced. So that I would be able to know these are all the RefID's are affected by delete statement that was done by user.
Can any one suggest me the query? I tried with self join but it did not work....
Thanks,
Charmer
March 20, 2014 at 4:46 am
There are loads of ways of doing this. Here are two.
drop Table #Sample
Create Table #Sample (ID int not null primary key, RefID int , SeqNo int , Name varchar(10) )
insert into #Sample
select 1, 1000, 1, 'Mike' union
select 2, 1000, 2, 'Mikey' union
select 3, 1000, 3, 'Michel' union
select 4, 1001, 1, 'Carmel' union
select 5, 1001, 2, 'Camy' union
select 6, 1002, 1, 'Joe' union
select 7, 1002, 3, 'Jeff' union
select 8, 1002, 4, 'Martin' union
select 9, 1002, 5, 'Sean' union
select 10, 1002, 7, 'Philip'
CREATE INDEX ix_Stuff2 ON #Sample (RefID, SeqNo)
;WITH FullSequence AS (
SELECT RefID, SeqNo = iTally.n
FROM (
SELECT RefID, MAX_SeqNo = CAST(MAX(SeqNo) AS BIGINT)
FROM #Sample
GROUP BY RefID
) d
CROSS APPLY (
SELECT TOP (d.MAX_SeqNo) n
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) d (n) -- expand to meet your requirements
) iTally
)
SELECT *
FROM FullSequence fs
EXCEPT
SELECT RefID, SeqNo FROM #Sample
SELECT d.RefID, SeqNo = d.SeqNo-1
FROM #Sample d
CROSS APPLY (
SELECT TOP 1 si.SeqNo
FROM #Sample si
WHERE si.RefID = d.RefID AND si.SeqNo < d.SeqNo
ORDER BY si.SeqNo DESC
) x
WHERE d.SeqNo > x.SeqNo + 1
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 20, 2014 at 7:02 am
Thank you Chris. It worked. First query is much better than second query.
Thanks,
Charmer
March 20, 2014 at 7:22 am
You're welcome, thanks for the feedback.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
March 20, 2014 at 8:59 am
You could also use an offset window function like LAG or LEAD.
WITH C1 AS (
select *, LAG(SeqNo, 1, SeqNo) OVER(PARTITION BY RefID ORDER BY SeqNo, ID) AS prv_SeqNo
from [Sample]
)
SELECT *
FROM C1
WHERE (SeqNo - prv_SeqNo) > 1;
An index like (RefID, SeqNo, ID) and any other column of interest in the INCLUDE option could help with the performance.
March 20, 2014 at 7:08 pm
If all you want is the RefIDs that are missing one or more row numbers, that's really easy:
WITH SampleData (ID, RefID, SeqNo, Name) AS
(
select 1, 1000, 1, 'Mike' union
select 2, 1000, 2, 'Mikey' union
select 3, 1000, 3, 'Michel' union
select 4, 1001, 1, 'Carmel' union
select 5, 1001, 2, 'Camy' union
select 6, 1002, 1, 'Joe' union
select 7, 1002, 3, 'Jeff' union
select 8, 1002, 4, 'Martin' union
select 9, 1002, 5, 'Sean' union
select 10, 1002, 7, 'Philip'
)
-- Return only RefIDs missing one or more rows
SELECT RefID
FROM
(
SELECT *, rn=ROW_NUMBER() OVER (PARTITION BY RefID ORDER BY SeqNo)
FROM SampleData
) a
WHERE rn <> SeqNo
GROUP BY RefID;
If you also want to know the SeqNos that are missing, you can treat it as a Islands to Gaps problem (explained here: The SQL of Gaps and Islands in Sequences [/url]):
WITH SampleData (ID, RefID, SeqNo, Name) AS
(
select 1, 1000, 1, 'Mike' union
select 2, 1000, 2, 'Mikey' union
select 3, 1000, 3, 'Michel' union
select 4, 1001, 1, 'Carmel' union
select 5, 1001, 2, 'Camy' union
select 6, 1002, 1, 'Joe' union
select 7, 1002, 3, 'Jeff' union
select 8, 1002, 4, 'Martin' union
select 9, 1002, 5, 'Sean' union
select 10, 1002, 7, 'Philip'
)
-- Using islands and gaps
SELECT RefID, SeqNo=MIN(SeqNo)
FROM
(
-- Find the missing seqence numbers (the gaps) from the islands
SELECT RefID, MinSeqNo, MaxSeqNo, SeqNo
,rn=ROW_NUMBER() OVER (PARTITION BY RefID ORDER BY SeqNo)/2
FROM
(
-- Create a grouping (rn) used to calculate islands
SELECT RefID, MinSeqNo=MIN(SeqNo), MaxSeqNo=MAX(SeqNo)
FROM
(
SELECT *, rn=SeqNo-ROW_NUMBER() OVER (PARTITION BY RefID ORDER BY SeqNo)
FROM SampleData
) a
GROUP BY RefID, rn
) a
CROSS APPLY
(
VALUES (MinSeqNo-1),(MaxSeqNo+1)
) b (SeqNo)
) a
GROUP BY RefID, rn
HAVING COUNT(*) = 2;
I do believe this will be faster than any of the solutions so far and specifically it is faster than using LAG (or LEAD):
The Performance of the T-SQL Window Functions [/url]
Edit: Of course, you could always just ignore the problem by fixing it with a MERGE:
http://www.sqlservercentral.com/Forums/FindPost1540518.aspx
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 21, 2014 at 6:18 pm
Dwain,
The link to the article is broken. I am interested in reading it to find out why "specifically" it is faster than using LAG / LEAD.
One thing that is handy from using these functions is that the enumeration doesn't have to start from one. It could be any enumeration.
SET NOCOUNT ON;
USE tempdb;
GO
Create Table dbo.Sample (
ID int not null primary key,
RefID int ,
SeqNo int ,
Name varchar(10)
);
GO
insert into dbo.Sample
select 1, 1000, 101, 'Mike'
union
select 2, 1000, 102, 'Mikey'
union
select 3, 1000, 103, 'Michel'
union
select 4, 1001, 11, 'Carmel'
union
select 5, 1001, 12, 'Camy'
union
select 6, 1002, 21, 'Joe'
union
select 7, 1002, 23, 'Jeff'
union
select 8, 1002, 24, 'Martin';
GO
CREATE UNIQUE INDEX idx_uq_Sample_POC ON dbo.Sample (RefID, SeqNo);
GO
WITH C1 AS (
select RefID, SeqNo, LEAD(SeqNo, 1, SeqNo) OVER(PARTITION BY RefID ORDER BY SeqNo) AS nxt_SeqNo
from [Sample]
)
SELECT DISTINCT RefID
FROM C1
WHERE (nxt_SeqNo - SeqNo) > 1;
GO
DROP TABLE dbo.Sample;
GO
As for finding the gaps, here is a simple solution using LEAD.
WITH C1 AS (
select RefID, SeqNo, LEAD(SeqNo, 1, SeqNo) OVER(PARTITION BY RefID ORDER BY SeqNo) AS nxt_SeqNo
from [Sample]
)
SELECT RefID, SeqNo + 1 AS rangestart, nxt_SeqNo - 1 AS rangeend
FROM C1
WHERE (nxt_SeqNo - SeqNo) > 1;
GO
March 21, 2014 at 6:33 pm
I think the link is fixed now. The article doesn't say why it is faster, just provides an example that shows that it does.
If you've got a gap on the leading or following points, it is simple enough to add placeholders that are less than or greater than whatever your expected end point is.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
March 25, 2014 at 5:17 am
Thank you Hunch, The query you have given is here is really nice one.
Thanks,
Charmer
Viewing 9 posts - 1 through 8 (of 8 total)
You must be logged in to reply to this topic. Login to reply