April 3, 2015 at 6:51 am
I have writtena da query to search for a string in an expression by the number of it's appearance. Script is like this:
DECLARE @Expression VARCHAR(8000) = 'abcd_e_fgh',
@SearchString VARCHAR(10)= '_',
@OccuranceNumber SMALLINT = 1
DECLARE @SearchIndex INT = 0, @SearchIndexPrevious INT = 0, @Sno INT = 0
WHILE @Sno < @OccuranceNumber BEGIN
SELECT @SearchIndex = CHARINDEX(@SearchString, @Expression, @SearchIndex + 1)
IF @SearchIndexPrevious >= @SearchIndex BEGIN
SELECT @SearchIndex = -1
END ELSE BEGIN
SELECT @SearchIndexPrevious = @SearchIndex
END
SELECT @SearchIndexPrevious = @SearchIndex
SELECT @Sno = @Sno + 1
END
SELECT @SearchIndex , @SearchIndexPrevious
Here i'm trying to search "_" in expression "abcd_e_fgh" where it is appearing for first time. it gives me 5 correctly. Now when i change the @OccurenceNumber to 2 or 3, it gives correct values 7 and -1 respectively. However now when i change it to 4, it gives me 5. So when it's trying to check for fifth appearance of "_", it's not actually giving 0 or -1 but repeating the value 5. Could anyone see anything wrong in the script or is it sql's strange behaviour ?
April 3, 2015 at 7:17 am
Slightly different approach
USE tempdb;
GO
SET NOCOUNT ON;
GO
DECLARE @Expression VARCHAR(8000) = 'abcd_e_fgh';
DECLARE @SearchString VARCHAR(10) = '_';
DECLARE @OccuranceNumber INT = 1;
;WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
, NUMS(N) AS (SELECT TOP(LEN(@Expression)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4)
,PARSED_SET AS
(
SELECT
ROW_NUMBER() OVER
(
ORDER BY (SELECT NULL)
) AS OCC_NO
,NM.N
FROM NUMS NM
WHERE SUBSTRING(@Expression, NM.N, LEN(@SearchString)) = @SearchString
)
SELECT
PS.OCC_NO
,PS.N AS POS
FROM PARSED_SET PS
WHERE PS.OCC_NO = @OccuranceNumber;
Results
OCC_NO POS
------- ----
1 5
April 3, 2015 at 7:19 am
Hi Eirikur, Thanks but this was a question with just an example. Actually it is a generic one. String could be anything and not just "abcd_e_fgh".
Though i've just sorted out the issue.
April 3, 2015 at 7:20 am
sqlnaive, I have just fired your script through the sql debugger, It resets the index to -1 after reaching the end of the string, so your algorithm starts again looking for the target. As an example, try it with 7
April 3, 2015 at 7:21 am
DECLARE @Expression VARCHAR(8000) = 'abcd_e_fgh',
@SearchString VARCHAR(10)= '_',
@OccuranceNumber SMALLINT = 5
DECLARE @SearchIndex INT = 0, @SearchIndexPrevious INT = 0, @Sno INT = 0
WHILE @Sno < @OccuranceNumber BEGIN
SELECT @SearchIndex = CHARINDEX(@SearchString, @Expression, @SearchIndex + 1)
IF @SearchIndexPrevious >= @SearchIndex BEGIN
--SELECT @SearchIndex = -1
BREAK
END ELSE BEGIN
SELECT @SearchIndexPrevious = @SearchIndex
END
SELECT @SearchIndexPrevious = @SearchIndex
SELECT @Sno = @Sno + 1
END
SELECT @SearchIndex , @SearchIndexPrevious
April 3, 2015 at 7:22 am
Yeah DouglasH, you are correct. I sorted out that point and bingo. Was not that complex but seems I need a coffee break. Was stuck at this for some time.
April 3, 2015 at 7:26 am
Looks like you solved it yourself by putting the question out there. It's a pretty good strategy when you're stuck. Well done.
April 3, 2015 at 7:28 am
sqlnaive (4/3/2015)
Hi Eirikur, Thanks but this was a question with just an example. Actually it is a generic one. String could be anything and not just "abcd_e_fgh".Though i've just sorted out the issue.
Just turn the code into a inlineable table value function
CREATE FUNCTION dbo.ITVFN_EE_LOCATE_PATTERN_IN_STRING
(
@Expression VARCHAR(8000)
,@SearchString VARCHAR(10)
,@OccuranceNumber INT
)
RETURNS TABLE
WITH SCHEMABINDING
RETURN
WITH T(N) AS (SELECT N FROM (VALUES (NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL)) AS X(N))
, NUMS(N) AS (SELECT TOP(LEN(@Expression)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM T T1,T T2,T T3,T T4)
,PARSED_SET AS
(
SELECT
ROW_NUMBER() OVER
(
ORDER BY (SELECT NULL)
) AS OCC_NO
,NM.N
FROM NUMS NM
WHERE SUBSTRING(@Expression, NM.N, LEN(@SearchString)) = @SearchString
)
SELECT
PS.OCC_NO
,PS.N AS POS
FROM PARSED_SET PS
WHERE PS.OCC_NO = @OccuranceNumber;
GO
April 3, 2015 at 9:32 am
It's worth noting SQLNAIVE, that Eirikur's solution will perform about 3X faster.
I took your solution and put it into a scalar valued function like so:
CREATE FUNCTION dbo.SVF_LOCATE_PATTERN_IN_STRING
(
@Expression VARCHAR(8000),
@SearchString VARCHAR(10),
@OccuranceNumber SMALLINT
)
RETURNS INT
AS
BEGIN
DECLARE @SearchIndex INT = 0, @SearchIndexPrevious INT = 0, @Sno INT = 0
WHILE @Sno < @OccuranceNumber BEGIN
SELECT @SearchIndex = CHARINDEX(@SearchString, @Expression, @SearchIndex + 1)
IF @SearchIndexPrevious >= @SearchIndex BEGIN
--SELECT @SearchIndex = -1
BREAK
END ELSE BEGIN
SELECT @SearchIndexPrevious = @SearchIndex
END
SELECT @SearchIndexPrevious = @SearchIndex
SELECT @Sno = @Sno + 1
END
RETURN @SearchIndexPrevious
END
GO
Eirikur's excellent solution is a Table Valued function so the syntax is slightly different but the results are the same if you run both like so:
DECLARE
@Expression VARCHAR(8000) = 'xxx123dddrrreee333123',
@SearchString VARCHAR(10) = '2',
@OccuranceNumber INT = 2;
SELECT *
FROM dbo.ITVFN_EE_LOCATE_PATTERN_IN_STRING(@Expression, @SearchString, 2) ;
SELECT @OccuranceNumber AS OCC_NO, dbo.SVF_LOCATE_PATTERN_IN_STRING(@Expression, @SearchString, 2) AS POS;
GO
Results:
OCC_NO POS
----------- -----------
2 20
OCC_NO POS
----------- -----------
2 20
To demonstrate the performance advantage that Eirikur's solution provides I created a 1,000,000 row test:
IF OBJECT_ID('tempdb..#testdata') IS NOT NULL DROP TABLE #testdata;
GO
CREATE TABLE #testdata (td varchar(36) NOT NULL);
INSERT INTO #testdata SELECT TOP (1000000) newid() FROM sys.all_columns a, sys.all_columns b;
GO
SET NOCOUNT ON;
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
DECLARE @x int;
SET STATISTICS TIME ON;
PRINT 'loop version:'
SELECT @x = dbo.SVF_LOCATE_PATTERN_IN_STRING(td, 'a', 2)
FROM #testdata
PRINT 'EE version:'
SELECT @x = pos
FROM #testdata
CROSS APPLY dbo.ITVFN_EE_LOCATE_PATTERN_IN_STRING(td, 'a', 2);
SET STATISTICS TIME OFF;
GO
Results:
loop version:
SQL Server Execution Times:
CPU time = 7859 ms, elapsed time = 8761 ms.
EE version:
SQL Server Execution Times:
CPU time = 10109 ms, elapsed time = 2796 ms.
The CPU time is higher because Erikikur's function creates a parallel plan but, as you can see, it resolves the query about 3 times faster.
-- Itzik Ben-Gan 2001
April 3, 2015 at 10:02 am
Thanks Alan for this, I was about to do the same when I saw your post, good job!
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy