June 22, 2017 at 9:21 am
Hi,
I have this table, column, and string:
Table = audit_log
Column named = request_u
Example column value = http://stable-test.test.net/api/files/fs.testcollection/TP-0000000.pdf/download/
What I need to do is return the text between the 5th and 6th /. In the above example the returned text will be 'fs.testcollection'.
The column values and length may change but it will always be the text between the 5th and 6th / (forward slash) I need to return.
How do I achieve this?
Thanks in advance.
June 22, 2017 at 9:27 am
You need to use a splitter function to split the string into its individual elements.
John
June 22, 2017 at 9:30 am
You could use the Delimited8KSplit. Then it becomes as trivial as:
SELECT DS.Item
FROM YourTable YT
CROSS APPLY dbo.Delimited8KSplit (YT.YourColumn, '/') DS
WHERE DS.ItemNumber = 6;
This would return the value "fs.testcollection", using your example url.
Thom~
Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
Larnu.uk
June 22, 2017 at 9:57 am
I created a function called SubstringBetween8K which was designed for exactly this type of problem. It uses NGrams8K. Here's the function:
CREATE FUNCTION [dbo].[substringBetween8K]
(
@string varchar(8000),
@start tinyint,
@stop tinyint,
@delimiter char(1)
)
/*****************************************************************************************
Purpose:
Takes in input string (@string) and returns the text between two instances of a delimiter
(@delimiter); the location of the delimiters is defined by @start and @stop.
For example: if @string = 'xx.yy.zz.abc', @start=1, @stop=3, and @delimiter = '.' the
function will return the text: yy.zz; this is the text between the first and third
instance of "." in the string "xx.yy.zz.abc".
Compatibility:
SQL Server 2008+
Syntax:
--===== Autonomous use
SELECT sb.token, sb.position, sb.tokenLength
FROM dbo.substringBetween8K(@string, @start, @stop, @delimiter); sb;
--===== Use against a table
SELECT sb.token, sb.position, sb.tokenLength
FROM SomeTable st
CROSS APPLY dbo.substringBetween8K(st.SomeColumn1, 1, 2, '.') sb;
Parameters:
@string = varchar(8000); Input string to parse
@delimiter = char(1); this is the delimiter use to determine where the output starts/ends
@start = tinyint; the first instance of @delimiter to search for; this is where the
output should start. When @start is 0 then the function will return
everything from the beginning of @string until @end.
@stop = tinyint; the last instance of @delimiter to search for; this is where the
output should end. When @end is 0 then the function will return everything
from @start until the end of the string.
Return Types:
Inline Table Valued Function returns:
token = varchar(8000); the substring between the two instances of @delimiter defined by
@start and @stop
position = smallint; the location of where the substring begins
tokenlength = length of the return token
---------------------------------------------------------------------------------------
Developer Notes:
1. Requires NGrams8K. The code for NGrams8K can be found here:
http://www.sqlservercentral.com/articles/Tally+Table/142316/
2. This function is what is referred to as an "inline" scalar UDF." Technically it's an
inline table valued function (iTVF) but performs the same task as a scalar valued user
defined function (UDF); the difference is that it requires the APPLY table operator
to accept column values as a parameter. For more about "inline" scalar UDFs see this
article by SQL MVP Jeff Moden: http://www.sqlservercentral.com/articles/T-SQL/91724/
and for more about how to use APPLY see the this article by SQL MVP Paul White:
http://www.sqlservercentral.com/articles/APPLY/69953/.
Note the above syntax example and usage examples below to better understand how to
use the function. Although the function is slightly more complicated to use than a
scalar UDF it will yield notably better performance for many reasons. For example,
unlike a scalar UDFs or multi-line table valued functions, the inline scalar UDF does
not restrict the query optimizer's ability generate a parallel query execution plan.
3. dbo.substringBetween8K is deterministic; for more about deterministic and
nondeterministic functions see https://msdn.microsoft.com/en-us/library/ms178091.aspx
Examples:
DECLARE @string varchar(8000) = '123.ABC456.333.222.3333XXX.$$$'
-- beginning of string to 2nd delimiter, 2nd delimiter to end of the string
SELECT '0, 2', * FROM dbo.substringBetween8K(@string,0,2, '.') UNION ALL
SELECT '2, 0', * FROM dbo.substringBetween8K(@string,2,0, '.') UNION ALL
-- Between the 1st & 2nd, then 2nd & 5th delimiters
SELECT '1, 2', * FROM dbo.substringBetween8K(@string,1,2, '.') UNION ALL
SELECT '2, 5', * FROM dbo.substringBetween8K(@string,2,5, '.') UNION ALL
-- dealing with NULLS, delimiters that don't exist and when @first = @Last
SELECT '2, 10', * FROM dbo.substringBetween8K(@string,2,10,'.') UNION ALL
SELECT '1, NULL',* FROM dbo.substringBetween8K(@string,1,NULL,'.') UNION ALL
SELECT '1, 1', * FROM dbo.substringBetween8K(@string,1,NULL,'.');
---------------------------------------------------------------------------------------
Revision History:
Rev 00 - 20160720 - Initial Creation - Alan Burstein
****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
WITH
chars AS
(
SELECT instance = 0, position = 0 WHERE @start = 0
UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY position), position
FROM dbo.NGrams8k(@string,1)
WHERE token = @delimiter
UNION ALL
SELECT -1, DATALENGTH(@string)+1 WHERE @stop = 0
)
SELECT token =
SUBSTRING
(
@string,
MIN(position)+1,
NULLIF(MAX(position),MIN(position)) - MIN(position)-1
),
position = CAST(
CASE WHEN NULLIF(MAX(position),MIN(position)) - MIN(position)-1 > 0
THEN MIN(position)+1 END AS smallint),
tokenLength = CAST(NULLIF(MAX(position),MIN(position)) - MIN(position)-1 AS smallint)
FROM chars
WHERE instance IN (@start, NULLIF(@stop,0), -1);
And here's how you'd use it:
DECLARE @x varchar(100) = 'http://stable-test.test.net/api/files/fs.testcollection/TP-0000000.pdf/download/'
SELECT * FROM dbo.substringBetween8K(@x, 5, 6, '/');
-- Itzik Ben-Gan 2001
June 22, 2017 at 10:23 am
Thanks for your replies.
I'm at a basic level using SQL so forgive me for my novice questions.
The database I'll be pulling from is not mine. I only have read-access. With this said, can I still use your function Alan? or the Delimited8kSplit function?
As I'm seeking to 'SELECT' the data only and do all my functionality in the SELECT statement, I guess I cannot achieve what I wish to? Hence my aim of using SUBSTRING and CHARINDEX functions.
Thanks.
June 22, 2017 at 10:52 am
Decomposing the splitter function might not give the best performance, but it could help in your case.
--Create sample data
CREATE TABLE #Sample ( SomeColumn varchar(1000))
INSERT INTO #Sample
VALUES('Part 1/Part 2/Part 3/Part 4/Part 5/Part 6/Part 7/Part 8/Part 9');
--The solution starts here
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b) --10E+4 or 10,000 rows max
SELECT *
FROM #Sample
CROSS APPLY(
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
Item = SUBSTRING(SomeColumn,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))
FROM (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM (
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(SomeColumn,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
) t(N)
WHERE (SUBSTRING(SomeColumn,t.N,1) = '/' OR t.N = 0)
) s(N1)
)s
WHERE s.ItemNumber = 6;
June 22, 2017 at 11:58 am
If you want to do it with Substring and CHARINDEX here you go.
<SQL>
DECLARE @testVARCHAR(500);
SET @test-2 = 'http://stable-test.test.net/api/files/fs.testcollection/TP-0000000.pdf/download/';
SELECT SUBSTRING(@test,
CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2) + 1) + 1) +1) + 1) + 1,
(CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2) + 1) + 1) +1) + 1) +1)
- CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2, CHARINDEX('/', @test-2) + 1) + 1) +1) + 1)) - 1)
</SQL>
-------------------------------------------------------------
we travel not to escape life but for life not to escape us
Don't fear failure, fear regret.
June 22, 2017 at 12:10 pm
chocthree - Thursday, June 22, 2017 10:23 AMThanks for your replies.
I'm at a basic level using SQL so forgive me for my novice questions.
The database I'll be pulling from is not mine. I only have read-access. With this said, can I still use your function Alan? or the Delimited8kSplit function?
As I'm seeking to 'SELECT' the data only and do all my functionality in the SELECT statement, I guess I cannot achieve what I wish to? Hence my aim of using SUBSTRING and CHARINDEX functions.
Thanks.
If you can't create new SQL objects such as functions then you would have to extract the logic and use it inline as Luis demonstrated. Luis beat me to it - his answer is right on the money. Note that the logic he used comes from Eirikur Eiriksson's DelimitedSplit8K_LEAD a modified (and slightly faster) version of DelimitedSplit8K that uses the LEAD function. I put together how you would accomplish the same task extracting the logic from DelimitedSplit8K; I would recommend Luis' solution as it will perform a little better here's what I put together in case anyone was curious how to do this on Pre-2012 systems.
Using the sample data Luis created:SELECT *
FROM #Sample t
CROSS APPLY
(
SELECT
SUBSTRING(t.SomeColumn, position+1, CHARINDEX('/',t.SomeColumn,position+1) - (position+1))
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY iTally.N), N
FROM
(
SELECT 1 UNION ALL
SELECT TOP (ISNULL(DATALENGTH(t.SomeColumn),0))
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))+1
FROM
(
SELECT 1
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),
(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x),
(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) c(x),
(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) d(x)
) E4(N)
) iTally(N)
WHERE SUBSTRING(t.SomeColumn,iTally.N,1) = '/'
) positions(rn, position)
WHERE rn = 6
) splitter(item);
-- Itzik Ben-Gan 2001
June 23, 2017 at 7:43 am
If you are on SQL Server 2016 then use the new native STRING_SPLIT function.
😎
June 23, 2017 at 7:55 am
Eirikur Eiriksson - Friday, June 23, 2017 7:43 AM
Eirikur, I've heard tell that performance with that isn't as good as with the 8K split, although I haven't actually tested it myself. But also, STRING_SPLIT doesn't return a sequence number, so you'd have to rely on the elements always being returned in the correct order (is that guaranteed?) and do you own numbering of the result set.
John
June 23, 2017 at 7:56 am
Eirikur Eiriksson - Friday, June 23, 2017 7:43 AM
The problem with that, is that there's no item number assigned and no order guaranteed for the output of that function.
June 23, 2017 at 9:41 am
Thank you so much for all your help with this.
I actually got there with the below. I just need to replace _ with / and the 192_... with the column name. It works.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
select
a.first, a.second , a.third, a.fourth, a.fifth, a.sixth,a.seventh,
substring ('192_168_0_145_15_0_333_999' , a.fourth+1 , a.fifth-(a.fourth+1)) as actval
From
(
select
charindex( '_', '192_168_0_145_15_0_333_999' ) as first,
charindex( '_', '192_168_0_145_15_0_333_999', charindex( '_', '192_168_0_145_15_0_333_999' )+1 ) as second ,
charindex( '_', '192_168_0_145_15_0_333_999',charindex( '_', '192_168_0_145_15_0_333_999' , charindex( '_', '192_168_0_145_15_0_333_999' ) +1) +1 ) third ,
charindex( '_', '192_168_0_145_15_0_333_999',charindex( '_', '192_168_0_145_15_0_333_999' , charindex( '_', '192_168_0_145_15_0_333_999', charindex( '_', '192_168_0_145_15_0_333_999' )+1 ) +1) +1 ) fourth,
charindex( '_', '192_168_0_145_15_0_333_999',charindex( '_', '192_168_0_145_15_0_333_999' , charindex( '_', '192_168_0_145_15_0_333_999', charindex( '_', '192_168_0_145_15_0_333_999' ,charindex( '_', '192_168_0_145_15_0_333_999')+1 )+1 ) +1) +1 ) fifth,
charindex( '_', '192_168_0_145_15_0_333_999',charindex( '_', '192_168_0_145_15_0_333_999' , charindex( '_', '192_168_0_145_15_0_333_999', charindex( '_', '192_168_0_145_15_0_333_999' ,charindex( '_', '192_168_0_145_15_0_333_999' ,charindex( '_', '192_168_0_145_15_0_333_999' )+1 )+1 )+1 ) +1) +1 ) sixth ,
charindex( '_', '192_168_0_145_15_0_333_999',charindex( '_', '192_168_0_145_15_0_333_999' , charindex( '_', '192_168_0_145_15_0_333_999', charindex( '_', '192_168_0_145_15_0_333_999' ,charindex( '_', '192_168_0_145_15_0_333_999' ,charindex( '_', '192_168_0_145_15_0_333_999' ,charindex( '_', '192_168_0_145_15_0_333_999')+1 )+1 )+1 )+1 ) +1) +1 ) seventh
) a
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
June 23, 2017 at 9:53 am
Eirikur Eiriksson - Friday, June 23, 2017 7:43 AM
I wouldn't. It has no ordinal for the position of the returned elements. Really bad oversight on the part of MS since they won't openly guarantee the correct order is returned.
--Jeff Moden
Change is inevitable... Change for the better is not.
June 23, 2017 at 10:30 am
The tally table and/or spring split functions are good ways to go, but recursive CTEs are interesting to learn and I have used them in many one-offs where I can't program in the database. Here is one I have had bookmarked from stackoverflow to help me learn that I modified for here.
/*
Use a recursive CTE and the CHARINDEX function to break a URL into substrings at each "/".
A modified script found on stackoverflow: https://stackoverflow.com/questions/8726111/sql-server-find-nth-occurrence-in-a-string
*/
GO
if OBJECT_ID('tempdb..#t_Strings') is not null
BEGIN
Drop Table #t_Strings;
END;
GO
if OBJECT_ID('tempdb..#t_Interim') is not null
BEGIN
Drop Table #t_Interim;
END;
create table #t_Strings(
URL_String varchar(255) NULL
);
insert into #t_Strings (URL_String)
select 'http://stable-test.test.net/api/files/fs.testcollection/TP-0000000.pdf/download/' union all
select 'http://stable-test.test.net/api_state/file/fs.anothercollection/TP-1111111.exe/upload/' union all
select 'https://stackoverflow.com/questions/8726111/sql-server-find-nth-occurrence-in-a-string';
with cte_Strings(URL_String, Nth_Start, Nth_Found) as (
select URL_String, 1, charindex('/', URL_String) from #t_Strings
union all
select URL_String, Nth_Found + 1, charindex('/', URL_String, Nth_Found + 1)
from cte_Strings
where Nth_Found > 0
)
select URL_String, N = ROW_NUMBER() over (partition by URL_String order by Nth_Start), Nth_Start, Nth_Found,
substring(URL_String, Nth_Start, case when Nth_Found > 0 then Nth_Found - Nth_Start else len(URL_String) end) as Nth_Substring
into #t_Interim
from cte_Strings
order by URL_String, Nth_Start;
-- All substrings numbered
select *
from #t_Interim
order by URL_String, N;
-- Desired Result
select *
from #t_Interim
where N=6
order by URL_String;
June 23, 2017 at 12:36 pm
I have done some fairly extensive tests on the string splitting options and the JSON split methods and the STRING_SPLIT functions are by far the best performers if one is on 2016 or later, by far!
Encapsulating the output in a CTE with a ROW_NUMBER has proven to be accurate and consistent when it comes to the item's ordinals, hence my recommendation.
😎
Hopefully I will find the time to do a write-up soon
Viewing 15 posts - 1 through 15 (of 34 total)
You must be logged in to reply to this topic. Login to reply