November 19, 2012 at 7:09 am
Jeff Moden (11/19/2012)
@Mark,That was also pretty quick. Have you been using such a thing? If so, what's your take on performance here (And, yes, I still do the test because I said I would and because I'm insanely curious about these things. 🙂 )
Nope, I don't tend to use things like this. Although performance-wise it's probably faster than pure T-SQL,
I find the overhead of deployment and maintenance doesn't make it worthwhile. Unless this sort of function is going
to be very heavily used and is performance critical, I'd stick to a nested REPLACEs. We only use SQLCLRs for a few
functions that can't be practically implemented in T-SQL, Levenshtein distance for strings for example.
Writing a half dozen lines of C# shouldn't take long anyway and it was kinda fun 🙂
____________________________________________________
Deja View - The strange feeling that somewhere, sometime you've optimised this query before
How to get the best help on a forum
http://www.sqlservercentral.com/articles/Best+Practices/61537November 19, 2012 at 8:06 am
Mark-101232 (11/19/2012)
Jeff Moden (11/19/2012)
@Mark,That was also pretty quick. Have you been using such a thing? If so, what's your take on performance here (And, yes, I still do the test because I said I would and because I'm insanely curious about these things. 🙂 )
Nope, I don't tend to use things like this. Although performance-wise it's probably faster than pure T-SQL,
I find the overhead of deployment and maintenance doesn't make it worthwhile. Unless this sort of function is going
to be very heavily used and is performance critical, I'd stick to a nested REPLACEs. We only use SQLCLRs for a few
functions that can't be practically implemented in T-SQL, Levenshtein distance for strings for example.
Writing a half dozen lines of C# shouldn't take long anyway and it was kinda fun 🙂
I'm of the same ilk. It's frequently not worth the performance gain (if there is one) to handle separately managed code, etc.
Speaking of performance gains, there frequently either isn't one or it actually causes a performance problem. I just spent a bunch of time replacing some audit log triggers that were written as a CLR because someone wanted it so that it would automatically figure out if you added a column to the given table. It would take 4 minutes to update just 5 columns on a 137 column (don't ask... I didn't design it). I wrote some code to build hardcoded triggers from the table schema and 10K rows could suddenly be upated in milliseconds.
Things like the Levenshtein distance algorithm are a whole 'nuther story.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 19, 2012 at 9:36 am
Jeff Moden (11/19/2012)
...
To answer the rest, the only way to know for sure is to run a test or two. If someone would create a CLR to drop characters 0 thru 31 and 127, script it out and send it to me (I don't have a C# environment setup and wouldn't know what to do with it if I did), I'd be happy to test it, publish the results, and share the documented test code.
First, I would really like to mentioned that in my previous post I've tried to make the attention to a bit unfair comparison between dedicated UDF and generic CLR wrapper around RegEx.
Now looking into more details of this OP question...
I can see that to do what OP needs I wouldn't not use RegEx at all!
However I wouldn't use StringBuilder either...
Here is my version of CLR based on plain char array:
[Microsoft.SqlServer.Server.SqlFunction]
public static String RemoveCrap(String val)
{
if (val == null) return null;
char[] chrs = new char[val.Length];
int indx = 0;
foreach (char c in val)
{
if ((int) c > 31 && (int) c != 127)
{
chrs[indx] = c;
indx++;
}
}
return new string(chrs, 0, indx);
}
Now, using Jeff's test setup some performance testing:
DECLARE @Start DATETIME, @End DATETIME
DECLARE @var varchar(max)
PRINT 'T-SQL nested REPLACE'
SET @Start = GETUTCDATE()
SELECT @var = ca.CleanedString
FROM #TestTable tt
CROSS APPLY dbo.DropControlCharacters(tt.SomeString) ca
PRINT DATEDIFF(ms,@start,GETUTCDATE());
PRINT 'CLR with String Builder'
SET @Start = GETUTCDATE()
SELECT @var = dbo.RemoveCtrlChars(tt.SomeString)
FROM #TestTable tt
PRINT DATEDIFF(ms,@start,GETUTCDATE());
PRINT 'CLR plain'
SET @Start = GETUTCDATE()
SELECT @var = dbo.RemoveCrap(tt.SomeString)
FROM #TestTable tt
PRINT DATEDIFF(ms,@start,GETUTCDATE());
And here results I got:
T-SQL nested REPLACE
2503
CLR with String Builder
546
CLR plain
343
CLR with StringBuilder is five times faster than T-SQL UDF with nested REPLACE.
CLR based on char array gives another 40% of boost.
Yes, CLR functions do add a bit more to database maintenance, but their performance (eg. in this case) justify such cost easily, at least for myself...
November 15, 2016 at 12:38 am
hi i have this test code which replace our problem on production
first replace for CHAR(0) removes character 'C' --- I dont know Why
when u set input to for example 'Calling' C dont go away
Can u help me?
DECLARE @badStrings TABLE (item NVARCHAR(1))
DECLARE @input VARCHAR(250)
SET @input='Chluemcká' --COLLATE SQL_Latin1_General_CP1_CI_AS
INSERT INTO @badStrings(item)
SELECT CHAR(0) UNION ALL
SELECT CHAR(1) UNION ALL
SELECT CHAR(2) UNION ALL
SELECT CHAR(3) UNION ALL
SELECT CHAR(4) UNION ALL
SELECT CHAR(5) UNION ALL
SELECT CHAR(6) UNION ALL
SELECT CHAR(7) UNION ALL
SELECT CHAR(8) UNION ALL
SELECT CHAR(11) UNION ALL
SELECT CHAR(12) UNION ALL
SELECT CHAR(14) UNION ALL
SELECT CHAR(15) UNION ALL
SELECT CHAR(16) UNION ALL
SELECT CHAR(17) UNION ALL
SELECT CHAR(18) UNION ALL
SELECT CHAR(19) UNION ALL
SELECT CHAR(20) UNION ALL
SELECT CHAR(21) UNION ALL
SELECT CHAR(22) UNION ALL
SELECT CHAR(23) UNION ALL
SELECT CHAR(24) UNION ALL
SELECT CHAR(25) UNION ALL
SELECT CHAR(26) UNION ALL
SELECT CHAR(27) UNION ALL
SELECT CHAR(28) UNION ALL
SELECT CHAR(29) UNION ALL
SELECT CHAR(30) UNION ALL
SELECT CHAR(31)
SELECT @input
SELECT * FROM @badStrings bs
DECLARE @zank nvarchar(1)
DECLARE znaky CURSOR FAST_FORWARD READ_ONLY FOR SELECT bs.item FROM @badStrings bs
OPEN znaky
FETCH NEXT FROM znaky INTO @zank
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @input = REPLACE(@input, item, ' ') FROM @badStrings
SELECT @input
FETCH NEXT FROM znaky INTO @zank
END
CLOSE znaky
DEALLOCATE znaky
November 15, 2016 at 1:32 am
lady.konvicka (11/15/2016)
hi i have this test code which replace our problem on productionfirst replace for CHAR(0) removes character 'C' --- I dont know Why
when u set input to for example 'Calling' C dont go away
Can u help me?
DECLARE @badStrings TABLE (item NVARCHAR(1))
DECLARE @input VARCHAR(250)
SET @input='Chluemcká' --COLLATE SQL_Latin1_General_CP1_CI_AS
INSERT INTO @badStrings(item)
SELECT CHAR(0) UNION ALL
SELECT CHAR(1) UNION ALL
SELECT CHAR(2) UNION ALL
SELECT CHAR(3) UNION ALL
SELECT CHAR(4) UNION ALL
SELECT CHAR(5) UNION ALL
SELECT CHAR(6) UNION ALL
SELECT CHAR(7) UNION ALL
SELECT CHAR(8) UNION ALL
SELECT CHAR(11) UNION ALL
SELECT CHAR(12) UNION ALL
SELECT CHAR(14) UNION ALL
SELECT CHAR(15) UNION ALL
SELECT CHAR(16) UNION ALL
SELECT CHAR(17) UNION ALL
SELECT CHAR(18) UNION ALL
SELECT CHAR(19) UNION ALL
SELECT CHAR(20) UNION ALL
SELECT CHAR(21) UNION ALL
SELECT CHAR(22) UNION ALL
SELECT CHAR(23) UNION ALL
SELECT CHAR(24) UNION ALL
SELECT CHAR(25) UNION ALL
SELECT CHAR(26) UNION ALL
SELECT CHAR(27) UNION ALL
SELECT CHAR(28) UNION ALL
SELECT CHAR(29) UNION ALL
SELECT CHAR(30) UNION ALL
SELECT CHAR(31)
SELECT @input
SELECT * FROM @badStrings bs
DECLARE @zank nvarchar(1)
DECLARE znaky CURSOR FAST_FORWARD READ_ONLY FOR SELECT bs.item FROM @badStrings bs
OPEN znaky
FETCH NEXT FROM znaky INTO @zank
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @input = REPLACE(@input, item, ' ') FROM @badStrings
SELECT @input
FETCH NEXT FROM znaky INTO @zank
END
CLOSE znaky
DEALLOCATE znaky
You better start a new thread with your question, not many are going to be watching this one.
😎
November 15, 2016 at 7:21 am
Eirikur Eiriksson (11/15/2016)
lady.konvicka (11/15/2016)
hi i have this test code which replace our problem on productionfirst replace for CHAR(0) removes character 'C' --- I dont know Why
when u set input to for example 'Calling' C dont go away
Can u help me?
You better start a new thread with your question, not many are going to be watching this one.
😎
This seems like a solution to the original problem. It also does a lot more work than it should.
CHAR(0) doesn't remove character 'C'. It shouldn't as there's nothing that relates both of those characters.
Here's a short version of the code with all unnecessary steps removed.
DECLARE @badStrings TABLE (item NVARCHAR(1))
DECLARE @input VARCHAR(250)
SET @input='Chlu' + CHAR(4) + 'emcká' COLLATE SQL_Latin1_General_CP1_CI_AS
INSERT INTO @badStrings(item)
VALUES
(CHAR(0) ),(CHAR(1) ),(CHAR(2) ),(CHAR(3) ),(CHAR(4) ),(CHAR(5) ),
(CHAR(6) ),(CHAR(7) ),(CHAR(8) ),(CHAR(11)),(CHAR(12)),(CHAR(14)),
(CHAR(15)),(CHAR(16)),(CHAR(17)),(CHAR(18)),(CHAR(19)),(CHAR(20)),
(CHAR(21)),(CHAR(22)),(CHAR(23)),(CHAR(24)),(CHAR(25)),(CHAR(26)),
(CHAR(27)),(CHAR(28)),(CHAR(29)),(CHAR(30)),(CHAR(31));
SELECT @input = REPLACE(@input, item, '') FROM @badStrings;
SELECT @input;
November 15, 2016 at 8:02 am
Eirikur Eiriksson (11/15/2016)
...You better start a new thread with your question, not many are going to be watching this one.
😎
Too late, it's sitting in the 'Active Threads' queue, so everyone will drop by to take a look.
However, this should nail it shut.
create table MySample
(
col1 varchar(30) null
constraint cc_col1_InvalidData check
(case when col1 like '%['+char(0)+'-'+char(31)+']%'
then 0
else 1
end = 1)
);
insert into MySample ( col1 )
values ('John'+char(9));
Msg 547, Level 16, State 0, Line 18
The INSERT statement conflicted with the CHECK constraint "cc_col1_InvalidData".
The conflict occurred in database "Test", table "dbo.MySample", column 'col1'.
"Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho
November 15, 2016 at 8:19 am
its from our supporter and we are triing to resolve it, i dont know that its from the internet, but thanks
November 15, 2016 at 9:42 am
Here is a quick solution, quite efficient
😎
DECLARE @pString VARCHAR(8000) = 'Chlu' + CHAR(30) + CHAR(2) + CHAR(0) + CHAR(13) + CHAR(10) + CHAR(20) + CHAR(31) + 'emcká';
;WITH T(N) AS ( SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0))X(N))
, NUMS(N) AS ( SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM T T1,T T2,T T3,T T4)
SELECT
(
SELECT SUBSTRING(@pString,NM.N,1)
FROM NUMS NM
WHERE ((ASCII(SUBSTRING(@pString,NM.N,1)) - 32) & 0x7FFF) < 256
FOR XML PATH('')
) AS OUT_STR
;
November 15, 2016 at 11:12 am
Eirikur Eiriksson (11/15/2016)
Here is a quick solution, quite efficient😎
DECLARE @pString VARCHAR(8000) = 'Chlu' + CHAR(30) + CHAR(2) + CHAR(0) + CHAR(13) + CHAR(10) + CHAR(20) + CHAR(31) + 'emcká';
;WITH T(N) AS ( SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0))X(N))
, NUMS(N) AS ( SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM T T1,T T2,T T3,T T4)
SELECT
(
SELECT SUBSTRING(@pString,NM.N,1)
FROM NUMS NM
WHERE ((ASCII(SUBSTRING(@pString,NM.N,1)) - 32) & 0x7FFF) < 256
FOR XML PATH('')
) AS OUT_STR
;
Try this and see what you get...
DECLARE @pString VARCHAR(8000) = 'Chlu' + CHAR(30) + CHAR(2) + CHAR(0) + CHAR(13) + '&' + CHAR(10) + CHAR(20) + CHAR(31) + 'emcká';
;WITH T(N) AS ( SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0))X(N))
, NUMS(N) AS ( SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM T T1,T T2,T T3,T T4)
SELECT
(
SELECT SUBSTRING(@pString,NM.N,1)
FROM NUMS NM
WHERE ((ASCII(SUBSTRING(@pString,NM.N,1)) - 32) & 0x7FFF) < 256
FOR XML PATH('')
) AS OUT_STR
;
You need to use "TYPE" and that will make it about twice as slow.
--Jeff Moden
Change is inevitable... Change for the better is not.
November 15, 2016 at 11:44 am
Jeff Moden (11/15/2016)
Eirikur Eiriksson (11/15/2016)
Here is a quick solution, quite efficient😎
DECLARE @pString VARCHAR(8000) = 'Chlu' + CHAR(30) + CHAR(2) + CHAR(0) + CHAR(13) + CHAR(10) + CHAR(20) + CHAR(31) + 'emcká';
;WITH T(N) AS ( SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0))X(N))
, NUMS(N) AS ( SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM T T1,T T2,T T3,T T4)
SELECT
(
SELECT SUBSTRING(@pString,NM.N,1)
FROM NUMS NM
WHERE ((ASCII(SUBSTRING(@pString,NM.N,1)) - 32) & 0x7FFF) < 256
FOR XML PATH('')
) AS OUT_STR
;
Try this and see what you get...
DECLARE @pString VARCHAR(8000) = 'Chlu' + CHAR(30) + CHAR(2) + CHAR(0) + CHAR(13) + '&' + CHAR(10) + CHAR(20) + CHAR(31) + 'emcká';
;WITH T(N) AS ( SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0))X(N))
, NUMS(N) AS ( SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM T T1,T T2,T T3,T T4)
SELECT
(
SELECT SUBSTRING(@pString,NM.N,1)
FROM NUMS NM
WHERE ((ASCII(SUBSTRING(@pString,NM.N,1)) - 32) & 0x7FFF) < 256
FOR XML PATH('')
) AS OUT_STR
;
You need to use "TYPE" and that will make it about twice as slow.
One would use the text() function to mitigate the cost, makes a big difference by avoiding the XML reconstruction phase of the output.
😎
The cost ratio between the first and the second query is close to 1:99
DECLARE @pString VARCHAR(8000) = 'Chlu' + CHAR(30) + CHAR(2) + CHAR(0) + CHAR(13) + '&' + CHAR(10) + CHAR(20) + CHAR(31) + 'emcká';
--Better
;WITH T(N) AS ( SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0))X(N))
, NUMS(N) AS ( SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM T T1,T T2,T T3,T T4)
SELECT
(
SELECT SUBSTRING(@pString,NM.N,1)
FROM NUMS NM
WHERE ((ASCII(SUBSTRING(@pString,NM.N,1)) - 32) & 0x7FFF) < 256
FOR XML PATH(''), TYPE
).value('(./text())[1]','VARCHAR(8000)') AS OUT_STR
;
-- Bad
;WITH T(N) AS ( SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0))X(N))
, NUMS(N) AS ( SELECT TOP (LEN(ISNULL(@pString,CHAR(32)))) (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))) FROM T T1,T T2,T T3,T T4)
SELECT
(
SELECT SUBSTRING(@pString,NM.N,1)
FROM NUMS NM
WHERE ((ASCII(SUBSTRING(@pString,NM.N,1)) - 32) & 0x7FFF) < 256
FOR XML PATH(''), TYPE
).value('(.)[1]','VARCHAR(8000)') AS OUT_STR
;
Viewing 11 posts - 16 through 25 (of 25 total)
You must be logged in to reply to this topic. Login to reply