July 17, 2020 at 12:47 pm
Ah... bugger. I thought one of Steve's submittals worked just as well but it misses correct output on some of the items I had in my test table and so had to take down my comment on it.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 17, 2020 at 12:52 pm
... and it also comes up with correct answers.
(As Brian admitted in a later post) my solution did provide the correct answers!
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
July 17, 2020 at 1:07 pm
... and it also comes up with correct answers.
(As Brian admitted in a later post) my solution did provide the correct answers!
It's definitely on the right track and does perfect on most entries but one of the requirements the OP had was that if a string had no brackets, to return the whole thing. That's the only place where your code fails.
I also had to take down my earlier comment about Steve's code because it also has a couple of failures if you do a little extreme testing with an entry like ABC[123]. I missed the errors earlier because my eyes don't appear to work correctly until I've had a bunch of coffee.
--Jeff Moden
Change is inevitable... Change for the better is not.
July 17, 2020 at 1:21 pm
Phil Parkin wrote:... and it also comes up with correct answers.
(As Brian admitted in a later post) my solution did provide the correct answers!
It's definitely on the right track and does perfect on most entries but one of the requirements the OP had was that if a string had no brackets, to return the whole thing. That's the only place where your code fails.
I also had to take down my earlier comment about Steve's code because it also has a couple of failures if you do a little extreme testing with an entry like ABC[123]. I missed the errors earlier because my eyes don't appear to work correctly until I've had a bunch of coffee.
Aha, I missed that requirement. Here's a modified version which handles it:
DROP TABLE IF EXISTS #x;
CREATE TABLE #x
(
SomeText VARCHAR(100)
);
INSERT #x
(
SomeText
)
VALUES
('A[CL]')
,('CF');
SELECT *
FROM #x x;
WITH Tally (N)
AS (
-- 8000 rows (max length of the VARCHAR string)
-- SQL Prompt formatting off
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a(n)
CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) c(n)
CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) d(n)
CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0)) e(n)
-- SQL Prompt formatting on
)
SELECT x.SomeText
,Combo = IIF(calcs.StartPos > 0
,CONCAT(LEFT(x.SomeText, calcs.StartPos - 1), SUBSTRING(x.SomeText, t.N + calcs.StartPos, 1))
,x.SomeText)
FROM #x x
CROSS JOIN Tally t
CROSS APPLY
(
SELECT StartPos = CHARINDEX('[', x.SomeText)
,EndPos = CHARINDEX(']', x.SomeText)
) calcs
WHERE t.N <= calcs.EndPos - calcs.StartPos - 1
OR
(
calcs.StartPos = 0
AND t.N = 1
)
ORDER BY x.SomeText
,t.N;
The absence of evidence is not evidence of absence.
Martin Rees
You can lead a horse to water, but a pencil must be lead.
Stan Laurel
July 17, 2020 at 2:50 pm
I have not tested this against any of the other solutions posted so far. It just looked like an interesting problem so I figured that I'd throw my hat into the ring.
Based on what I've read so far, this will handle all of the requested scenarios plus a few more, including spaces around the bracketed section and constants following the bracketed section...
IF OBJECT_ID('tempdb..#test_string', 'U') IS NOT NULL
BEGIN DROP TABLE #test_string; END;
CREATE TABLE #test_string (
id int NOT NULL IDENTITY(1,1),
string varchar(8000) NOT NULL
);
INSERT #test_string(string) VALUES
('a[abcd]'),
('ABCD[123456]'),
('just a test [a]'),
('another test [asdfghj]'),
('test [asdfghj]ing'),
('test [asdfghj] ing'),
('ZXCV'),
('[QWERTY]');
GO
-- SELECT * FROM #test_string ts
SELECT
ts.id,
ts.string,
--br.[open],
--br.[close],
--sec.beg,
--sec.mid,
--sec.[end],
--t.n,
--bv.bracket_val,
combo_value = CONCAT(sec.beg, bv.bracket_val, sec.[end])
FROM
#test_string ts
CROSS APPLY ( VALUES (
NULLIF(CHARINDEX('[', ts.string), 0),
NULLIF(CHARINDEX(']', ts.string), 0)
) ) br ([open], [close])
CROSS APPLY ( VALUES (
REPLACE(SUBSTRING(ts.string, 1, ISNULL(br.[open], 8000)), '[', ''),
REPLACE(SUBSTRING(ts.string, br.[open], br.[close] - br.[open]), '[', ''),
REPLACE(SUBSTRING(ts.string, ISNULL(br.[close], 8000), 8000), ']', '')
) ) sec (beg, mid, [end])
CROSS APPLY dbo.tfn_Tally_10k(LEN(ISNULL(sec.mid, 0)), 1) t-- this is a "tally function" similar to what Jeff Moden has already posted.
CROSS APPLY ( VALUES (SUBSTRING(sec.mid, t.n, 1)) ) bv (bracket_val);
July 17, 2020 at 4:24 pm
If tvf's are allowed then I'd borrow some of Phil's code
drop function if exists dbo.fnTIDprefix_Expand;
go
create function dbo.fnTIDprefix_Expand(
@TIDprefix varchar(20))
returns table as
return
select
iif(br.br_start>0, concat(left(@TIDprefix, br.br_start-1), substring(@TIDprefix, t.n+br.br_start, 1)), @TIDprefix) TIDprefix_Expanded
from
(select
charindex('[', @TIDprefix) br_start,
charindex(']', @TIDprefix) br_end) br
cross apply
dbo.fnTally(1, iif(br.br_start>0, (br.br_end-br.br_start-1), 1)) t;
go
/* final query */
select
t.TIDPrefix,
tx.TIDprefix_Expanded
from
dbo.test_TID t
cross apply
dbo.fnTIDprefix_Expand(t.TIDprefix) tx;
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
July 21, 2020 at 10:57 am
CREATE TABLE dbo.t
(
TxtVARCHAR(8000)
);
INSERT dbo.t (Txt) VALUES
('A[CL]'),
('B[HQUW]'),
('C[1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ]'),
('D2'),
('E[1]');
GO
WITH Result
AS
(
SELECTLEFT(Txt, 1) AS Constant,-- all strings with [...]
SUBSTRING(Txt, 3, 1) AS Variable,
SUBSTRING(Txt, 4, LEN(Txt) - 4) AS Rest
FROM dbo.t
WHERE SUBSTRING(Txt, 2, 1) = '['
UNION
SELECTLEFT(Txt, 1) AS Constant,-- all strings without [...]
SUBSTRING(Txt, 2, 1) AS Variable,
'' AS Rest
FROM dbo.t
WHERE SUBSTRING(Txt, 2, 1) <> '['
UNION ALL
SELECTConstant,
LEFT(Rest, 1) AS Variable,
SUBSTRING(Rest, 2, 8000) AS Rest
FROM Result
WHERE LEN(Rest) > 0
)
SELECT *, CONCAT(Constant, Variable) AS Col
FROM Result
ORDER BY Constant;
July 21, 2020 at 7:36 pm
CREATE TABLE dbo.t
(
TxtVARCHAR(8000)
);
INSERT dbo.t (Txt) VALUES
('A[CL]'),
('B[HQUW]'),
('C[1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ]'),
('D2'),
('E[1]');
GO
WITH Result
AS
(
SELECTLEFT(Txt, 1) AS Constant,-- all strings with [...]
SUBSTRING(Txt, 3, 1) AS Variable,
SUBSTRING(Txt, 4, LEN(Txt) - 4) AS Rest
FROM dbo.t
WHERE SUBSTRING(Txt, 2, 1) = '['
UNION
SELECTLEFT(Txt, 1) AS Constant,-- all strings without [...]
SUBSTRING(Txt, 2, 1) AS Variable,
'' AS Rest
FROM dbo.t
WHERE SUBSTRING(Txt, 2, 1) <> '['
UNION ALL
SELECTConstant,
LEFT(Rest, 1) AS Variable,
SUBSTRING(Rest, 2, 8000) AS Rest
FROM Result
WHERE LEN(Rest) > 0
)
SELECT *, CONCAT(Constant, Variable) AS Col
FROM Result
ORDER BY Constant;
CREATE TABLE dbo.t
(
TxtVARCHAR(8000)
);
INSERT dbo.t (Txt) VALUES
('A[CL]'),
('B[HQUW]'),
('C[1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ]'),
('D2'),
('E[1]');
GO
WITH Result
AS
(
SELECTLEFT(Txt, 1) AS Constant,-- all strings with [...]
SUBSTRING(Txt, 3, 1) AS Variable,
SUBSTRING(Txt, 4, LEN(Txt) - 4) AS Rest
FROM dbo.t
WHERE SUBSTRING(Txt, 2, 1) = '['
UNION
SELECTLEFT(Txt, 1) AS Constant,-- all strings without [...]
SUBSTRING(Txt, 2, 1) AS Variable,
'' AS Rest
FROM dbo.t
WHERE SUBSTRING(Txt, 2, 1) <> '['
UNION ALL
SELECTConstant,
LEFT(Rest, 1) AS Variable,
SUBSTRING(Rest, 2, 8000) AS Rest
FROM Result
WHERE LEN(Rest) > 0
)
SELECT *, CONCAT(Constant, Variable) AS Col
FROM Result
ORDER BY Constant;
While that does appear to provide the desired result, it uses a recursive cte that scans the original table twice and produces 257 logical reads. Please see the following article for why rCTEs (recursive CTEs) aren't a good thing to use for this type of thing... they're actually a form of "Hidden RBAR" on steriods.
https://www.sqlservercentral.com/articles/hidden-rbar-counting-with-recursive-ctes
Please see my previous code post for a method that only uses a single read (the one page the data lives on).
--Jeff Moden
Change is inevitable... Change for the better is not.
July 21, 2020 at 7:39 pm
Not sure if i read it correctly but look into STRING_AGG or STRING_SPLIT
July 22, 2020 at 12:21 am
Here's my attempt:
;with TestData as
(
SELECT *
FROM (VALUES ('A[kfg]'),
('B[lr]'),
('Cf'),
('D[3r]'),
('E[CL]'),
('F[HQUW]'),
('G[kfg]'),
('H][kfg]'),
('IJK[123]')
) A(Name)
)
SELECT *
,CASE WHEN L.PosLeftBracket > 1 THEN LEFT(t.Name,L.PosLeftBracket-1) + SUBSTRING(t.Name,L.PosLeftBracket+ Row.RowNum,1)
ELSE t.Name
End Output
FROM TestData t
CROSS APPLY(VALUES (CHARINDEX('[',t.Name,1))) L(PosLeftBracket)
CROSS APPLY(VALUES (CHARINDEX(']',t.Name,L.PosLeftBracket+1))) R(PosRightBracket)
OUTER APPLY(SELECT TOP(CASE WHEN R.PosRightBracket-L.PosLeftBracket > 0 THEN R.PosRightBracket-L.PosLeftBracket -1 ELSE 0 END)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RowNum
FROM sys.columns) Row
July 22, 2020 at 12:28 am
Not sure if i read it correctly but look into STRING_AGG or STRING_SPLIT
Got code?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2020 at 1:46 am
Here's my attempt:
;with TestData as
(
SELECT *
FROM (VALUES ('A[kfg]'),
('B[lr]'),
('Cf'),
('D[3r]'),
('E[CL]'),
('F[HQUW]'),
('G[kfg]'),
('H][kfg]'),
('IJK[123]')
) A(Name)
)
SELECT *
,CASE WHEN L.PosLeftBracket > 1 THEN LEFT(t.Name,L.PosLeftBracket-1) + SUBSTRING(t.Name,L.PosLeftBracket+ Row.RowNum,1)
ELSE t.Name
End Output
FROM TestData t
CROSS APPLY(VALUES (CHARINDEX('[',t.Name,1))) L(PosLeftBracket)
CROSS APPLY(VALUES (CHARINDEX(']',t.Name,L.PosLeftBracket+1))) R(PosRightBracket)
OUTER APPLY(SELECT TOP(CASE WHEN R.PosRightBracket-L.PosLeftBracket > 0 THEN R.PosRightBracket-L.PosLeftBracket -1 ELSE 0 END)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RowNum
FROM sys.columns) Row
Nicely done. My only concern would be that since the DBA won't allow the OP to create the fnTally function, will they also forbid the OP from calling a system view?
--Jeff Moden
Change is inevitable... Change for the better is not.
July 22, 2020 at 2:18 am
Nicely done. My only concern would be that since the DBA won't allow the OP to create the fnTally function, will they also forbid the OP from calling a system view?
Can be fixed:
;with TestData as
(
SELECT *
FROM (VALUES ('A[kfg]'),
('B[lr]'),
('Cf'),
('D[3r]'),
('E[CL]'),
('F[HQUW]'),
('G[kfg]'),
('H][kfg]'),
('IJK[123]')
) A(Name)
)
SELECT *
,CASE WHEN L.PosLeftBracket > 1 THEN LEFT(t.Name,L.PosLeftBracket-1) + SUBSTRING(t.Name,L.PosLeftBracket+ Row.RowNum,1)
ELSE t.Name
End Output
FROM TestData t
CROSS APPLY(VALUES (CHARINDEX('[',t.Name,1))) L(PosLeftBracket)
CROSS APPLY(VALUES (CHARINDEX(']',t.Name,L.PosLeftBracket+1))) R(PosRightBracket)
OUTER APPLY(SELECT TOP(CASE WHEN R.PosRightBracket-L.PosLeftBracket > 0 THEN R.PosRightBracket-L.PosLeftBracket -1 ELSE 0 END)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RowNum
FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) A(N),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) B(N),
(VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) C(N)) Row
July 22, 2020 at 9:46 am
I do not agree with you. We have discussed it before and your focus is on IO. Your solution is much more cpu heavy if the statement is only rarely executed and needs to be compiled every time. I think that the design will be different if the query will be executed many times each minute.
I'm editing my script to allow more prefix characters - the same as your script.
CREATE TABLE dbo.t
(
IDINTNOT NULL IDENTITY,
TxtVARCHAR(8000)
);
INSERT dbo.t (Txt) VALUES
('A[CL]'),
('B[HQUW]'),
('C[1234567890ABCDEFGHIJKLMNOPQRSTUVWXYZ]'),
('D2'),
('E[1]'),
('FGH[12345]'),
('IJKLMN');
GO
SET STATISTICS TIME, IO ON;
GO
CREATE FUNCTION dbo.fnTally
/**********************************************************************************************************************
Jeff Moden Script on SSC: https://www.sqlservercentral.com/scripts/create-a-tally-function-fntally
**********************************************************************************************************************/
(@ZeroOrOne BIT, @MaxN BIGINT)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN WITH
H2(N) AS ( SELECT 1
FROM (VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
)V(N)) --16^2 or 256 rows
, H4(N) AS (SELECT 1 FROM H2 a, H2 b) --16^4 or 65,536 rows
, H8(N) AS (SELECT 1 FROM H4 a, H4 b) --16^8 or 4,294,967,296 rows
SELECT N = 0 WHERE @ZeroOrOne = 0 UNION ALL
SELECT TOP(@MaxN)
N = ROW_NUMBER() OVER (ORDER BY N)
FROM H8;
If we execute the 2 statements
WITH ctePosit
AS
(--==== "Cheater" method to handle just about anything
SELECT Txt
,LeftBower = CHARINDEX('[', Txt + '[')
,RightBower = CHARINDEX(']', Txt + '[]')
FROM dbo.t
)--==== This gabs the "header" and each character and concatenates them together as requested.
SELECT --TIDPrefix = p.Txt,
TIDExpanded = CONCAT(LEFT(p.Txt, LeftBower - 1), SUBSTRING(p.Txt, LeftBower + t.N , 1))
FROM ctePosit AS p OUTER APPLY dbo.fnTally(1, RightBower - LeftBower - 1) AS t;
GO
-------------------------------------------------------------------------------
WITH Result
AS
(
SELECTID,
LEFT(Txt, CHARINDEX('[', Txt) - 1) AS Constant,-- all strings with [...]
SUBSTRING(Txt, CHARINDEX('[', Txt) + 1, 1) AS Variable,
SUBSTRING(Txt, CHARINDEX('[', Txt) + 2, LEN(Txt) - CHARINDEX('[', Txt) - 2) AS Rest
FROM dbo.t
WHERE CHARINDEX('[', Txt) > 0
UNION
SELECTID,
Txt AS Constant,-- all strings without [...]
NULL AS Variable,
'' AS Rest
FROM dbo.t
WHERE CHARINDEX('[', Txt) = 0
UNION ALL
SELECTID,
Constant,
LEFT(Rest, 1) AS Variable,
SUBSTRING(Rest, 2, 8000) AS Rest
FROM Result
WHERE LEN(Rest) > 0
)
SELECT CONCAT(Constant, Variable) AS Col
FROM Result;
and look at the result your solution will use much more cpu.
Your solution
SQL Server parse and compile time:
CPU time = 26 ms, elapsed time = 26 ms.
(50 rows affected)
Table 't'. Scan count 1, logical reads 1, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
My solution
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 2 ms.
(50 rows affected)
Table 'Worktable'. Scan count 2, logical reads 289, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 't'. Scan count 2, logical reads 2, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
So your solution use 26 ms, my solution only 2. And my solution have 289 logical read - not expensive reads, because data is in memory, your solution only 1. But have your system a bottleneck - is it IO or cpu!
In this situation - as always - tally is not the only way to do it! You write in the article, that you reference
"If you go back and read the paragraph above, the word "quickly" has been braced in quotes because it's not always done quickly. Judging from many of the posts on this and other SQL forums, more and more well meaning folks are counting with Recursive CTE's because they're relatively simple to write and don't require such things as a Tally table to drive them. They have no idea of the performance problem they've just instantiated with their code nor do the people who pick up on the technique. It's becoming a nearly viral problem."
Developers who ALWAYS think that the tally-solution is the right solution, make a solution that can be more difficult to write and understand/verify and maybe instantiated cpu-problems!
July 22, 2020 at 1:38 pm
... Your solution is much more cpu heavy if the statement is only rarely executed and needs to be compiled every time. I think that the design will be different if the query will be executed many times each minute.
Just to be clear, the comparison is between 1) creation of the tally function and execution of the query, versus 2) execution of the recursive CTE query? If so it's not a fair comparison imo because there's not really a need to recompile the function. Jeff recognizes this by specifying WITH SCHEMABINDING which is a performance optimization that's applicable when the function doesn't reference any physical objects. If the function doesn't reference physical objects then why recompile it?
Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können
Viewing 15 posts - 31 through 45 (of 71 total)
You must be logged in to reply to this topic. Login to reply