August 26, 2009 at 11:19 am
Would someone please doublecheck my results? For comparison, Peter's original code is shown in a slightly modified version below. It now takes the sequence string as an input variable, rather than a constant. Both versions are using dbo.Tally, and not the "virtual" tally cte. Accepting an input variable appears to slow things down quite a bit, and the presence of multiple strings of all zeroes appears (to me) to confuse Peter's results.
You're absolutely right, Bob. The change to an input variable caused wrong results, not only with the presence of strings with only zeroes. As I already noticed in my previous post, the result of the SELECT ... FOR XML may not be in the right order. When using a virtual tally table or an input variable instead of a constant value, I see the query plan is changed. In my original solution a Streaming Aggregate was used, which, I assume, preserves order. When using a virtual tally table or a input variable the query plan is changed to use a HashMatch to aggregate. I suppose the HashMatch does not preserve order, hence the incorrect results. Here's the fixed code:
set statistics time off
set nocount on;
declare @input varchar(1000)
set @input = '0000,0001,1000,0000,01010,000101,0000'
print '----- Peters Original Solution (fixed) ------'
set statistics time on;
select stuff(replace('*' +
(
select
case when t.c = '1' then ',' + t.p else '*' end
from
(
select
n.N,
substring(s.Sequence, n.N, 1) as c,
cast(n.N - max(coalesce(t.N, 0)) as varchar(10)) p
from
(
select @input as Sequence
) s
cross join
dbo.Tally n
left join
dbo.Tally t on t.N <= n.N and substring(s.Sequence, t.N, 1) = ','
where
n.N <= len(s.Sequence)
group by
n.N, substring(s.Sequence, n.N, 1)
) t
where
t.c '0'
order by -- absolutely required
t.N
for xml path('')
)
, '*,', '*'), 1, 1, '')
I'm afraid (well, not really :-)) my machine is too fast to say something useful about performance based on the timing statistics and a single input value. Both solutions run in 0ms most of the time. Only elapsed time may vary. But based on these elapsed times, it looks like your solution is slightly faster.
Peter
August 26, 2009 at 11:55 am
Thanks for posting back, Peter. Your results make sense when I look at them now. Still using my creaky old laptop, I'm seeing the following times for the new code.
----- Peters Original Solution (fixed) ------
SQL Server Execution Times:
CPU time = 265 ms, elapsed time = 813 ms.
If you are getting 0ms response times, I envy you and covet your machine. 😉
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 26, 2009 at 1:06 pm
You don't have to envy me. Based on the figures you posted, obviously this machine is making me producing very slow code, although I have not figured out yet what causes the huge difference on your machine. Maybe I will spend some more time to dig deeper into that. Anyway, a funny excercise.
Peter
August 27, 2009 at 4:13 pm
Haven't tested the following code for other scenarios, but works for the one mentioned in the post:
declare @STR varchar(50)
declare @str2 varchar(50)
declare @str3 varchar(50)
declare @str4 varchar(50)
set @str4='0001,01010,000101'
set @STR=@str4
set @str2=''
set @str2=@str2+charindex('1', @STR)
while charindex(',',@str)!=0
begin
set @str2=@str2+'*'
set @str3=substring(@str, charindex(',', @STR)+1, len(@str))
set @str2=@str2+cast(charindex('1', @str3) as varchar)+','
set @STR=@str3
set @str3=substring(@str3, charindex('1',@str3)+1, len(@str3))
set @str2=@str2+cast(charindex('1', @str3)+charindex('1',@str) as varchar)
end
select @str4,@str2
August 27, 2009 at 4:32 pm
A straightforward procedural solution, but it really needs to be tested against other patterns. Unless the OP (original poster) tells us otherwise, we can't assume that there will only be three elements, that the first element will only have a single '1', that the second element will only have two '1's, etc.
You could create a WHILE loop to build a more open ended solution, but if you did I believe you would find that performance would still be slower than either of the set based solutions. Good effort, though.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 27, 2009 at 4:42 pm
Yes Bob I did make the following assumptions while writing the code;
1. First set has one '1'
2. Every other set has 2 '1's
3. Last set doesnt have a comma
Thought not to spend more time unless the "OP" is satisfied with this... 😉
August 28, 2009 at 11:23 am
select '4*2,4*4,6'
Might as well make some more assumptions. 😉
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 28, 2009 at 2:55 pm
Friday afternoon fun with Tally tables! I decided to go the table route rather than a single variable route (As the original post seems to indicate that he needs to perform this random and puzzling action on an entire column). For that we need some test data.
Figuring out how to generate random test data without looping actually took about 10x as long as the solution. This isn't terribly efficient and could probably be improved upon, but it met my goals of making it easy to add rows or alter density.
------------ Generate some Test Data ----------------------
CREATE TABLE TD082809(
TDIDINT IDENTITY(1,1) PRIMARY KEY,
CVVARCHAR(200))
;WITH
RND2 (N,R2)AS (
SELECT N, RAND(CHECKSUM(NEWID())) R2
FROM Tally
WHERE N <=200),
RND5 (N,R5) AS (
SELECT N, RAND(CHECKSUM(NEWID())) R5
FROM Tally
WHERE N <=500)
INSERT INTO TD082809(CV)
SELECT (
SELECT ''+
CASE
WHEN (R2 + R5)/2<.65 THEN '0'
WHEN (R2 + R5)/2<.85 THEN '1'
ELSE ','
END
FROM Tally T
INNER JOIN RND2 ON T.N = RND2.N
,( SELECT RND5.N, R5
FROM RND5
WHERE TOUT.N = RND5.N) RND5
WHERE T.N <= 200
FOR XML PATH('')) CV
FROM Tally TOUT
WHERE TOUT.N <=500
UPDATE TD082809
SET CV = REPLACE(CV,',,',',0')
WHERE CV LIKE '%,,%'
UPDATE TD082809
SET CV = REPLACE(CV,',,',',0')
WHERE CV LIKE '%,,%'
-- SELECT * FROM TD082809
Then the solution. Not terribly happy with the speed of this either, the outer reference kinda kills it.
SELECT CV,(
SELECT CASEWHEN SUBSTRING(CV,N,1) = '1'
THEN CAST(N AS VARCHAR(3)) + ','
WHEN SUBSTRING(CV,N,1) = '*'
THEN '*'
END
FROM Tally,(SELECT REPLACE(CV,',','*') CV
FROM TD082809 WHERE TDID = Tout.N) TD
WHERE N < DATALENGTH(CV)
FOR XML PATH(''))
FROM Tally Tout
LEFT JOIN TD082809 ON Tout.N = TD082809.TDID
WHERE N < (SELECT COUNT(*) FROM TD082809)
I'll second the props to Jeff for learnin' me all about Tally tables.
August 28, 2009 at 5:03 pm
Hey Seth 🙂
You inspired me to convert my CTE solution to an inline table valued function (at the bottom). Use it with a cross apply and it's not too shabby.
use sandbox
set nocount on;
DECLARE @input varchar(100)
DECLARE @output varchar(100)
DECLARE @inputTbl Table (rowID int identity(1,1), Input varchar(100))
INSERT into @inputTbl
select '0000,0001,1000,0000,01010,000101,0000' union all
select '0110,0001,1000,0000,01010,000101,10000' union all
select '011011,01001,1000,0000,01010,000101,00001' union all
select '10110,01001,1000,00100,01010,000101,0000' union all
select '0110,0001,1000,011000,01010,000101,01000' union all
select '01110,0001,10010,0000,01010,000101,0000' union all
select '011000,0001,1000,001100,01010,0010101,1111'
set statistics time on;
select *
from @inputTbl
cross apply dbo.BuildString(input)
set statistics time off;
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 84 ms.
use sandbox;
GO
CREATE FUNCTION dbo.BuildString
(
@input varchar(8000)
)
RETURNS TABLE
AS
RETURN
(
-- using predefined tally table
-- ;with tally (N) as (select row_number() over (order by ID) from master..syscolumns)
with cte1 (position,value) as
(select N,substring('*'+ replace(@input,',','*'),N,1)
from tally
where substring('*'+ replace(@input,',','*'),N,1) in ('1','*')
and N<=len('*'+ replace(@input,',','*'))
)
,cte2 (c1Pos,c1Val,c2Pos,c2Val,strValue) as
(select *,
case when c.Value = '*' and c2.Value = '*' then '0*'
when c.Value = '*' and c2.Value '*' then '*'
else null end as strValue
from cte1 c
join cte1 c2 on c2.position < c.position
where c.value = '*' or (c.value '*' and c2.value = '*')
)
,cte3 as (
select c1Pos,c1Val, min(strValue) as strValue, min(c1Pos-c2Pos) as bitPos
from cte2
group by c1Pos,c1Val
)
,cte4 (c1Pos,strValue) as (
select c1pos,case when strValue is not null then strValue else cast(bitPos as varchar(3))+',' end from cte3
)
,cte5 (result) as
(select ( SELECT top 100 percent '' + strValue
FROM cte4
ORDER BY c1Pos
FOR XML PATH('')
)
)
,cte6 (result) as
(select replace(replace(result,'**','*0*'),',*','*') from cte5)
select case when right(result,1) = '*' then result+'0'
when right(result,1) = ',' then stuff(result,len(result),1,'') else result end as result
from cte6
)
GO
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
August 28, 2009 at 6:26 pm
Garadin (8/28/2009)
I'll second the props to Jeff for learnin' me all about Tally tables.
Heh... thanks, Seth. Folks like you, Bob, and a couple of others have made it a household name and have taken it to the next level, as well. I do have to remind folks, though... I didn't invent it... "I jist 'splained howzit wurks". 😀
--Jeff Moden
Change is inevitable... Change for the better is not.
August 29, 2009 at 12:59 am
This is a plain looping solution that reads the string character by character. It can handle input string of 2000 chars. String can start with 0, 1, or comma.
Bob, can you tell me how inefficient it is compared to some of your solutions that I'm trying to understand?
BTW, seems like the SQL formatting is not maintained 🙁
declare
@input nvarchar(2000),
@inputlength tinyint,
@readpos tinyint,
@readchar nvarchar(1),
@onepos tinyint,
@output nvarchar(2000)
select
@input='0001,01010,000101',
@inputlength = len(@input),
@readpos = 1,
@onepos = 1,
@output = ''
while @readpos 0 and right(@output,1) = ','
begin
set @output = substring(@output,1,len(@output)-1)
end
select
@output = @output + '*',
@onepos = 0
end
else if @readchar = '1'
begin
set @output = @output + cast(@onepos as nvarchar(4)) + ','
end
else
begin
set @output = @output + ''
end
select
@readpos = @readpos + 1,
@onepos = @onepos + 1
end
if len(@output) > 0 and right(@output,1) = ','
begin
set @output = substring(@output,1,len(@output)-1)
end
select @output as 'Output'
Riz
August 29, 2009 at 8:21 am
Glad to, Riz 🙂 On my machine, the loop is taking 29-30 milliseconds, on the average, to run. A couple of times it dropped to 16, and several times it soared to 50-60. By comparison, the inline table valued function using the revised CTE solution is doing 50 rows (of 7 elements apiece) in 155-170 milliseconds.
If the concept of a tally table is new to you, you can find Jeff's most excellent article on the subject here[/url]. Essentially, tally tables enable you to do set-based processing of tasks that would appear to require loops. They can be deadly efficient at doing it too.
Setting statistics time on for the loop, you see the following. (Yes, there really are that many lines.)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
(1 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 1 ms.
__________________________________________________
Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Viewing 12 posts - 16 through 26 (of 26 total)
You must be logged in to reply to this topic. Login to reply