June 6, 2011 at 2:42 am
Hello All,
I need you help for one search pattern that I want to use in the select statement. The problem is, I have table containing three values ID, Letter, and Sequence.
letter will have one letter form A-z, while Sequence will have multiple letters separated by comma. what I want is that suppose is the Sequence for the row ends with the same letter for that row then it should not be selected. but in case if the same letter is in the Sequence and another letter is followed by that letter then even after ending same as the letter that row should get selected.
I have created One sample query for this.
Create Table #Letter
(
Id INT Primary key,
Letter VARCHAR(2),
Sequence VARCHAR(250)
)
INSERT INTO #Letter Values(1,'A','A+,A+,A+,A+,A+,A,B,A+,A,A,A,A,A,A,A')
INSERT INTO #Letter Values(2,'A','A+,A+,A+,A,A,A,A,A,A,A,A,A,A,A,A,A')
INSERT INTO #Letter Values(3,'A','A,A,A+,A,A,A,A,A,A,A,A,A,A,A,A')
INSERT INTO #Letter Values(4,'A','A,A,A+,A,A,A')
SELECT *
FROM #Letter O1
WHERE Letter = 'A' AND Sequence like '%[^,A][A+B-Z,]%'
Order BY Id
DROP TABLE #Letter
The Output I want is
1AA+,A+,A+,A+,A+,A,B,A+,A,A,A,A,A,A,A
3AA,A,A+,A,A,A,A,A,A,A,A,A,A,A,A
4AA,A,A+,A,A,A
The 2nd row is not expected in the output since it has A letter at the end.
2AA+,A+,A+,A,A,A,A,A,A,A,A,A,A,A,A,A
Other rows are selected because they have 'A' followed by 'A+' or 'B' i.e different letter and then again 'A'.
Can any one please help me on this query.
Thank you
Yatish
June 6, 2011 at 2:57 pm
My solution makes use of a Tally table. If you're unfamiliar with those please have a look through this article: http://www.sqlservercentral.com/articles/T-SQL/62867/[/url]
Build Tally table:
--=============================================================================
-- Setup
--=============================================================================
USE tempdb --DB that everyone has where we can cause no harm
SET NOCOUNT ON --Supress the auto-display of rowcounts for appearance/speed
DECLARE @StartTime DATETIME --Timer to measure total duration
SET @StartTime = GETDATE() --Start the timer
--=============================================================================
-- Create and populate a Tally table
--=============================================================================
--===== Conditionally drop
IF OBJECT_ID('dbo.Tally') IS NOT NULL
DROP TABLE dbo.Tally
--===== Create and populate the Tally table on the fly
SELECT TOP 11000 --equates to more than 30 years of dates
IDENTITY(INT,1,1) AS N
INTO dbo.Tally
FROM Master.dbo.SysColumns sc1,
Master.dbo.SysColumns sc2
--===== Add a Primary Key to maximize performance
ALTER TABLE dbo.Tally
ADD CONSTRAINT PK_Tally_N
PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
--===== Let the public use it
GRANT SELECT, REFERENCES ON dbo.Tally TO PUBLIC
--===== Display the total duration
SELECT STR(DATEDIFF(ms,@StartTime,GETDATE())) + ' Milliseconds duration'
Solution query:
CREATE TABLE #Letter
(
Id INT PRIMARY KEY,
Letter VARCHAR(2),
Sequence VARCHAR(250)
)
GO
INSERT INTO #Letter
VALUES (1, 'A', 'A+,A+,A+,A+,A+,A,B,A+,A,A,A,A,A,A,A') ;
INSERT INTO #Letter
VALUES (2, 'A', 'A+,A+,A+,A,A,A,A,A,A,A,A,A,A,A,A,A') ;
INSERT INTO #Letter
VALUES (3, 'A', 'A,A,A+,A,A,A,A,A,A,A,A,A,A,A,A') ;
INSERT INTO #Letter
VALUES (4, 'A', 'A,A,A+,A,A,A') ;
GO
DECLARE @Letter VARCHAR(2) = 'A' ;
WITH cte(Id, sequence)
AS (
SELECT Id,
',' + sequence + ','
FROM #Letter
WHERE Letter = @Letter
AND sequence LIKE '%' + @Letter
),
cte2(row_num, Id, grade)
AS (
SELECT ROW_NUMBER() OVER (PARTITION BY l.Id ORDER BY t.N) AS row_num,
l.Id,
SUBSTRING(l.Sequence, N + 1, CHARINDEX(',', l.Sequence, N + 1) - N - 1)
FROM cte l
CROSS JOIN tempdb.dbo.Tally t
WHERE N < LEN(l.Sequence)
AND SUBSTRING(l.Sequence, N, 1) = ','
)
SELECT DISTINCT
cx2.Id
FROM cte2 cx2
WHERE (
SELECT MAX(row_num)
FROM cte2
WHERE Id = cx2.Id
AND grade != @Letter
) > (
SELECT MIN(row_num)
FROM cte2
WHERE Id = cx2.Id
AND grade = @Letter
) ;
DROP TABLE #Letter
GO
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
June 6, 2011 at 10:56 pm
Hi opc.three,
I am Thankful to you for your suggestion, But i would like to know if there is any other way to solve the query..i mean there is possibility that I may not use the Tally or numbers Table, so what can be other way this can be solved. Please let me know your thoughts.
if anyone know any other way to solve this please let me know.
Thank you
Yatish
June 7, 2011 at 4:46 am
Hello yatish,
I'm just curious because these kind of requirements looks more as a puzzle than a real life requirement. Could you explain briefly what's the need for this kind of filter?.
Anyway I found a solution for your query. I try to replace Letter by '?' and apply a LIKE pattern, I follow these steps:
- Double any comma in Sequence.
- Add ',,' at begin and end of Sequence.
- Now we can replace ',Letter,' by ',?,'.
- Now we can apply a LIKE pattern.
THE SQL instruction is
WITH Letter AS (
SELECT 1 AS Id, 'A' AS Letter, 'A+,A+,A+,A+,A+,A,B,A+,A,A,A,A,A,A,A' AS Sequence
UNION SELECT 2, 'A', 'A+,A+,A+,A,A,A,A,A,A,A,A,A,A,A,A,A'
UNION SELECT 3, 'A', 'A,A,A+,A,A,A,A,A,A,A,A,A,A,A,A'
UNION SELECT 4, 'A', 'A,A,A+,A,A,A'
)
SELECT * FROM Letter
WHERE RIGHT(Sequence, LEN(Letter)) <> Letter
OR REPLACE(',,' + replace(Sequence, ',', ',,') + ',,', ',' + Letter + ',', ',?,')
LIKE '%?,,[^?]%'
Regards,
Francesc
June 7, 2011 at 5:23 am
Hello frfernan,
Thank you for your reply. I need this for real life example only, I have modified the original fields names and added only sample example. I am creating this Sequence from History of records in database. the requirement is that the letter should be present in the Sequence at least once and it should not be at last i.e. that letter should not be at ending letter in Sequence. also there should be at least one different letter followed by that letter in the Sequence.
suppose in the example below if I am looking for letter 'A' then it should be as
1AA+,A+,A+,A+,A+,A,B,A+,A,A,A,A,A,A,A
3AA,A,A+,A,A,A,A,A,A,A,A,A,A,A,A
4AA,A,A+,A,A,A
since in above output in 1st row 'A' is followed by 'B' this is considered in output along with 3rd and 4th rows
2AA+,A+,A+,A,A,A,A,A,A,A,A,A,A,A,A,A
In 2nd row 'A' is followed by 'A' only and the Sequence is end with that so it is not considered in output.
I will give another example
suppose in the example below if I am looking for letter 'C' then it should be as
1CA+,A+,A+,A+,A+,A,C,A+,A,A,A,A,A,A,A
3CA,A,A+,C,A,A,A,A,A,A,A,A,A,A,A
since in above output in 1st row 'C' is followed by 'A' this is considered in output along with 3rd row.
2CA+,A+,A+,A,A,A,A,A,A,A,A,A,A,C,C,C
4CA,A,C,C,C,C
In 2nd and 4th row 'C' is followed by 'C' only and the Sequence is end with that so it is not considered in output.
Please let me know If I need to explain this more.
Thank you for your help
Thank you
Yatish
June 7, 2011 at 6:28 am
yatish.patil (6/6/2011)
Hi opc.three,I am Thankful to you for your suggestion, But i would like to know if there is any other way to solve the query..i mean there is possibility that I may not use the Tally or numbers Table, so what can be other way this can be solved. Please let me know your thoughts.
if anyone know any other way to solve this please let me know.
Thank you
Yatish
You do not need to use a static Tally table...you have the option to create an inline Tally table using a CTE. It will add a few hundred milliseconds to a query, but there are times when the performance gain is well worth the cycles. Here is the article that shows how to create a Tally table inline: http://www.sqlservercentral.com/articles/tally+table/72993/[/url]
That said, I went down the wrong rabbit hole. I was tripped up by the different Letter-lengths so went ahead in splitting the string. I think Francesc has the best presented solution. If I am understanding your last post you can simply omit the check where the Letter ends the sequence:
WITH Letter AS (
SELECT 1 AS Id, 'A' AS Letter, 'A+,A+,A+,A+,A+,A,B,A+,A,A,A,A,A,A,A' AS Sequence
UNION SELECT 2, 'A', 'A+,A+,A+,A,A,A,A,A,A,A,A,A,A,A,A,A'
UNION SELECT 3, 'A', 'A,A,A+,A,A,A,A,A,A,A,A,A,A,A,A'
UNION SELECT 4, 'A', 'A,A,A+,A,A,A'
)
SELECT * FROM Letter
WHERE REPLACE(',,' + replace(Sequence, ',', ',,') + ',,', ',' + Letter + ',', ',?,')
LIKE '%?,,[^?]%'
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
June 7, 2011 at 7:32 am
Thanks Yatish, your explanations are clear. I have only a doubt: "A" and "A+" are different letters at all?, or there is any obscure relation between these two letters?. If not, you simply need to clarify what is the condition about last letter in Sequence to put the instruction to work.
And a small correction, I commented "Add ',,' at begin and end of Sequence" but really there is no need for a double comma at begin at end, only one comma is needed.
Regards,
Francesc
June 7, 2011 at 8:38 am
frfernan (6/7/2011)
And a small correction, I commented "Add ',,' at begin and end of Sequence" but really there is no need for a double comma at begin at end, only one comma is needed.
I think you had it right the first time. I think we need it...
WITH Letter AS (
SELECT 1 AS Id, 'A' AS Letter, 'A+,A+,A+,A+,A+,A,B,A+,A,A,A,A,A,A,A' AS Sequence
UNION SELECT 2, 'A', 'A+,A+,A+,A,A,A,A,A,A,A,A,A,A,A,A,A'
UNION SELECT 3, 'A', 'A,A,A+,A,A,A,A,A,A,A,A,A,A,A,A'
UNION SELECT 4, 'A', 'A,A,A+,A,A,A'
UNION SELECT 5, 'A', 'A,C,A,A,A,A,A'
)
SELECT * FROM Letter
WHERE REPLACE(REPLACE(Sequence, ',', ',,') + ',,', ',' + Letter + ',', ',?,')
LIKE '%?,,[^?]%'
...otherwise row 5 will not be returned when I think it should.
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
June 8, 2011 at 3:57 am
Oh,
I didn't explain my idea correctly, where you understood
REPLACE(Sequence, ',', ',,') + ',,'
I tried to express
',' + REPLACE(Sequence, ',', ',,') + ','
, with this last expression your example works correctly.
Regards,
Francesc
June 8, 2011 at 9:03 am
The worded explanation makes more sense with an accompanying code example 😉
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply