September 16, 2011 at 2:03 pm
Hello,
I have this really nasty string that needs to be broken up. I've created some monstrum so far with left, len, patindex and substring and it takes care of exactly 2 first parts of the string. So tiring... :doze:
Is there anyway to use e.g. left() that would search for 2nd, 3rd, etc occurence of the deliminator? there can be variable length of characters between deliminators... I've never worked with strings on such scale before. :unsure:
September 16, 2011 at 2:29 pm
i've got a couple of suggestions for you;
here's a function CHARINDEX2, which allows you to find the nth occurrance of a string within-a-string:
/*
Example:
SELECT dbo.CHARINDEX2('a', 'abbabba', 3)
returns the location of the third occurrence of 'a'
which is 7
*/
CREATE FUNCTION CHARINDEX2(
@TargetStr varchar(8000),
@SearchedStr varchar(8000),
@Occurrence int)
RETURNS int
AS
BEGIN
DECLARE @pos int, @counter int, @ret int
SET @pos = CHARINDEX(@TargetStr, @SearchedStr)
SET @counter = 1
IF @Occurrence = 1
SET @ret = @pos
ELSE
BEGIN
WHILE (@counter < @Occurrence)
BEGIN
SELECT @ret = CHARINDEX(@TargetStr, @SearchedStr, @pos + 1)
SET @counter = @counter + 1
SET @pos = @ret
END
END
RETURN(@ret)
END
GO
another option is to Split() the string into rows, and work witht eh data that way, here is one of the better versions of a split function out there:
--usage: SELECT * FROM dbo.DelimitedSplit8K('alpha,beta,gamma,,more',',')
CREATE FUNCTION dbo.DelimitedSplit8K
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000))
FROM cteStart s
;
Lowell
September 16, 2011 at 2:35 pm
Thanks so much. I'll probably go with the split idea. I think this is going to be fun incorporating this into one query (can't create functions). π
September 16, 2011 at 2:39 pm
happiestgirl (9/16/2011)
Thanks so much. I'll probably go with the split idea. I think this is going to be fun incorporating this into one query (can't create functions). π
be sneaky and create it in tempdb database itself, then you can still use it:
USE TEMPDB
CREATE FUNCTION DelimitedSplit8K
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT 0 UNION ALL
SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT t.N+1
FROM cteTally t
WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000))
FROM cteStart s
;
GO
SELECT * FROM tempdb..DelimitedSplit8K('alpha,beta,gamma,,more',',')
Lowell
September 16, 2011 at 3:01 pm
Oh wow :hehe: I'll play with this one in my spare time :w00t:
It's puzzling to me why SQL Server doesn't have a builtin function to take care of splits....
September 16, 2011 at 10:02 pm
Lowell (9/16/2011)
be sneaky and create it in tempdb database itself, then you can still use it:
Wont objects in tempdb get dropped when the server restarts ?
< i dint read thro the entire thread, so it may just be a dumb question>
September 17, 2011 at 6:14 am
ColdCoffee you are right of course, but at least the function would be available for the duration of the migration/parsing she was doing;
i kind of assumed she was splitting data to stick into another table as a one time load, but the original poster would know better if my best guess will help;
Lowell
September 20, 2011 at 12:02 pm
Ok, I just found the parsename function. It's very close to what I'm looking for, but you know, if I have variable number of delimiters per string and there is more than 4, that will cause a problem for me. The number of delimiters is anything between 0 and 6, so I may have up to 7 columns in my result...
I could ignore (using replace) the first two parts of my string xxx.xxx. - but that leaves me with one extra part that will give me an error, and also everything gets shifted to either left of right, and I need to keep my tree order, if you know what I mean.
I'm not expecting code from anyone, i want to learn, but anyone has an idea what built-in function i could use here?
Lowell (9/17/2011)
ColdCoffee you are right of course, but at least the function would be available for the duration of the migration/parsing she was doing;i kind of assumed she was splitting data to stick into another table as a one time load, but the original poster would know better if my best guess will help;
It's one time thing, I need to keep the tree structure as such
xx.yyy.zzz. www.
xx.yyy
xx.yyy.zzz.www.ddd
ttt.hh.yyy.zzz etc
each period here closes a column, so you see it's a variable length.
September 20, 2011 at 12:17 pm
ok here's an enhanced version of a parsename function, which takes any number of items, and an optional delimiter to boot:
i think this is from a Jeff Moden post ....
--usage:
with MySAmpleData
AS
(
SELECT 'xx.yyy.zzz. www.' AS MyColumn UNION ALL
SELECT 'xx.yyy' UNION ALL
SELECT 'xx.yyy.zzz.www.ddd' UNION ALL
SELECT 'ttt.hh.yyy.zzz.etc'
)
select
dbo.fn_parsename(MyColumn,'.',1),
dbo.fn_parsename(MyColumn,'.',2),
dbo.fn_parsename(MyColumn,'.',3),
dbo.fn_parsename(MyColumn,'.',4),
dbo.fn_parsename(MyColumn,'.',5),
dbo.fn_parsename(MyColumn,'.',6),
dbo.fn_parsename(MyColumn,'.',7),
MyColumn
FROM MySAmpleData
the function:
CREATE FUNCTION dbo.fn_parsename
(
@pString VARCHAR(7999),
@pDelimiter CHAR(1),
@Occurrance int
)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE @Results VARCHAR(8000)
--===== "Inline" CTE Driven "Tally Tableβ produces values up to
-- 10,000... enough to cover VARCHAR(8000)
;WITH
E1(N) AS ( --=== Create Ten 1's
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 --10
),
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --100
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10,000
cteTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4) ,
--===== Do the split
InterResults
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY N) AS ItemNumber,
SUBSTRING(@pString, N, CHARINDEX(@pDelimiter, @pString + @pDelimiter, N) - N) AS Item
FROM cteTally
WHERE N < LEN(@pString) + 2
AND SUBSTRING(@pDelimiter + @pString, N, 1) = @pDelimiter
)
SELECT @Results = Item FROM InterResults WHERE ItemNumber = @Occurrance
return @Results
END --FUNCTION
Lowell
September 20, 2011 at 12:23 pm
I see no way of doing it without an udf, that's OK I'll ask for permission, thank you so much for tour help π
Viewing 10 posts - 1 through 9 (of 9 total)
You must be logged in to reply to this topic. Login to reply