April 12, 2012 at 9:34 am
Hi,
You can also check with the code below:
DECLARE @Sql VARCHAR(MAX)
DECLARE @TableName VARCHAR(255)
SET @TableName = 'Test1'
SET @Sql = 'UPDATE ' + @TableName + '
SET PatientAddress2 = NULL
FROM ' + @TableName + '
WHERE PatientAddress2 IS NOT NULL AND
(LOWER(RTRIM(LTRIM(PatientAddress2))) NOT LIKE ''[a-z]%'' AND RTRIM(LTRIM(PatientAddress2)) NOT LIKE ''[0-9]%''
AND RTRIM(LTRIM(PatientAddress2)) NOT LIKE ''#%'')
AND LEN(LTRIM(RTRIM(PatientAddress2))) <= 1 AND ISNULL(ASCII(LEFT(LTRIM(RTRIM(PatientAddress2)), 1)), 0) = 0'
EXEC (@Sql)
April 12, 2012 at 10:18 am
...
It's entirely possible someone with better C#/VB.NET skill than mine can make a CLR function that's faster, but the options I found online for that were all much, much slower, or buggy (in one case, it scrambled character-sequences in some cases).
...
That is absolutely true! If you need a CLR function, give it to someone with good C# skills and knowledge of how CLR functions should be written to achieve the best possible performance when called from SQL. Again, it will require better-than-average C# skills.
April 12, 2012 at 11:10 am
Eugene Elutin (4/12/2012)
...
It's entirely possible someone with better C#/VB.NET skill than mine can make a CLR function that's faster, but the options I found online for that were all much, much slower, or buggy (in one case, it scrambled character-sequences in some cases).
...
That is absolutely true! If you need a CLR function, give it to someone with good C# skills and knowledge of how CLR functions should be written to achieve the best possible performance when called from SQL. Again, it will require better-than-average C# skills.
Which is yet another reason to use code like mine. It doesn't require a specialized skillset to maintain/modify/refactor mine, but CLR does require a rarer skillset. Something to keep in mind when picking solutions. Not just "does it do what I need?", but "can I understand it well enough to extend/fix it if I need to in 6 months?" That's an important factor.
That, of course, ignores the security issues with enabling CLR, and other potential drawbacks to it.
Of course, mine has some of the same drawbacks. Not portable outside of T-SQL (Oracle/DB2/MySQL, you'll have to re-write mine or CLR). Can't be used in engines prior to SQL 2005.
And mine does require a Numbers table, which does take disk space (minor), and needs to be kept intact (minimal effort, but matters if you're deploying to customer servers, et al), which a CLR solution doesn't need.
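If you don't already have one, building the Numbers table amounts to something like this (a minimal sketch; the name and row count are just placeholders):
SELECT TOP (100000)
Number = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
INTO dbo.Numbers
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2 ;
-- A clustered PK keeps lookups on Number fast and keeps the table intact
ALTER TABLE dbo.Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number) ;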
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 12, 2012 at 11:14 am
I just ran another speed test. 23,028 rows needed phone numbers cleaned. Took 734 milliseconds total, 733 ms CPU time.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 12, 2012 at 11:46 am
GSquared (4/12/2012)
Eugene Elutin (4/12/2012)
...
It's entirely possible someone with better C#/VB.NET skill than mine can make a CLR function that's faster, but the options I found online for that were all much, much slower, or buggy (in one case, it scrambled character-sequences in some cases).
...
That is absolutely true! If you need a CLR function, give it to someone with good C# skills and knowledge of how CLR functions should be written to achieve the best possible performance when called from SQL. Again, it will require better-than-average C# skills.
Which is yet another reason to use code like mine. It doesn't require a specialized skillset to maintain/modify/refactor mine, but CLR does require a rarer skillset. Something to keep in mind when picking solutions. Not just "does it do what I need?", but "can I understand it well enough to extend/fix it if I need to in 6 months?" That's an important factor.
That, of course, ignores the security issues with enabling CLR, and other potential drawbacks to it.
Of course, mine has some of the same drawbacks. Not portable outside of T-SQL (Oracle/DB2/MySQL, you'll have to re-write mine or CLR). Can't be used in engines prior to SQL 2005.
And mine does require a Numbers table, which does take disk space (minor), and needs to be kept intact (minimal effort, but matters if you're deploying to customer servers, et al), which a CLR solution doesn't need.
What happens to your code if you build it with a dynamic tally table as part of the code?
April 12, 2012 at 11:59 am
Lynn Pettis (4/12/2012)
GSquared (4/12/2012)
Eugene Elutin (4/12/2012)
...
It's entirely possible someone with better C#/VB.NET skill than mine can make a CLR function that's faster, but the options I found online for that were all much, much slower, or buggy (in one case, it scrambled character-sequences in some cases).
...
That is absolutely true! If you need a CLR function, give it to someone with good C# skills and knowledge of how CLR functions should be written to achieve the best possible performance when called from SQL. Again, it will require better-than-average C# skills.
Which is yet another reason to use code like mine. It doesn't require a specialized skillset to maintain/modify/refactor mine, but CLR does require a rarer skillset. Something to keep in mind when picking solutions. Not just "does it do what I need?", but "can I understand it well enough to extend/fix it if I need to in 6 months?" That's an important factor.
That, of course, ignores the security issues with enabling CLR, and other potential drawbacks to it.
Of course, mine has some of the same drawbacks. Not portable outside of T-SQL (Oracle/DB2/MySQL, you'll have to re-write mine or CLR). Can't be used in engines prior to SQL 2005.
And mine does require a Numbers table, which does take disk space (minor), and needs to be kept intact (minimal effort, but matters if you're deploying to customer servers, et al), which a CLR solution doesn't need.
What happens to your code if you build it with a dynamic tally table as part of the code?
Reduced CPU time 633 ms for the same records. Limited the Numbers CTE to 100 rows, because this is phone numbers and doesn't go beyond that in length.
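For reference, the 100-row inline tally amounts to something like this (a sketch of the shape, not the exact code):
;WITH Seeds (Seed) AS
 (SELECT V FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS T (V)),
Numbers (Number) AS
 (SELECT ROW_NUMBER() OVER (ORDER BY S1.Seed) -- 10 x 10 = 100 rows, no table I/O
  FROM Seeds AS S1 CROSS JOIN Seeds AS S2)
SELECT Number FROM Numbers ;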
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 12, 2012 at 12:59 pm
GSquared (4/12/2012)
Lynn Pettis (4/12/2012)
GSquared (4/12/2012)
Eugene Elutin (4/12/2012)
...
It's entirely possible someone with better C#/VB.NET skill than mine can make a CLR function that's faster, but the options I found online for that were all much, much slower, or buggy (in one case, it scrambled character-sequences in some cases).
...
That is absolutely true! If you need a CLR function, give it to someone with good C# skills and knowledge of how CLR functions should be written to achieve the best possible performance when called from SQL. Again, it will require better-than-average C# skills.
Which is yet another reason to use code like mine. It doesn't require a specialized skillset to maintain/modify/refactor mine, but CLR does require a rarer skillset. Something to keep in mind when picking solutions. Not just "does it do what I need?", but "can I understand it well enough to extend/fix it if I need to in 6 months?" That's an important factor.
That, of course, ignores the security issues with enabling CLR, and other potential drawbacks to it.
Of course, mine has some of the same drawbacks. Not portable outside of T-SQL (Oracle/DB2/MySQL, you'll have to re-write mine or CLR). Can't be used in engines prior to SQL 2005.
And mine does require a Numbers table, which does take disk space (minor), and needs to be kept intact (minimal effort, but matters if you're deploying to customer servers, et al), which a CLR solution doesn't need.
What happens to your code if you build it with a dynamic tally table as part of the code?
Reduced CPU time 633 ms for the same records. Limited the Numbers CTE to 100 rows, because this is phone numbers and doesn't go beyond that in length.
Reduced CPU time to 633 ms or by 633 ms? I'm going to go with "to 633 ms," shaving off about 100 ms from what you reported earlier.
April 12, 2012 at 1:33 pm
Lynn Pettis (4/12/2012)
GSquared (4/12/2012)
Lynn Pettis (4/12/2012)
GSquared (4/12/2012)
Eugene Elutin (4/12/2012)
...
It's entirely possible someone with better C#/VB.NET skill than mine can make a CLR function that's faster, but the options I found online for that were all much, much slower, or buggy (in one case, it scrambled character-sequences in some cases).
...
That is absolutely true! If you need a CLR function, give it to someone with good C# skills and knowledge of how CLR functions should be written to achieve the best possible performance when called from SQL. Again, it will require better-than-average C# skills.
Which is yet another reason to use code like mine. It doesn't require a specialized skillset to maintain/modify/refactor mine, but CLR does require a rarer skillset. Something to keep in mind when picking solutions. Not just "does it do what I need?", but "can I understand it well enough to extend/fix it if I need to in 6 months?" That's an important factor.
That, of course, ignores the security issues with enabling CLR, and other potential drawbacks to it.
Of course, mine has some of the same drawbacks. Not portable outside of T-SQL (Oracle/DB2/MySQL, you'll have to re-write mine or CLR). Can't be used in engines prior to SQL 2005.
And mine does require a Numbers table, which does take disk space (minor), and needs to be kept intact (minimal effort, but matters if you're deploying to customer servers, et al), which a CLR solution doesn't need.
What happens to your code if you build it with a dynamic tally table as part of the code?
Reduced CPU time 633 ms for the same records. Limited the Numbers CTE to 100 rows, because this is phone numbers and doesn't go beyond that in length.
Reduced CPU time to 633 ms or by 633 ms? I'm going to go with "to 633 ms," shaving off about 100 ms from what you reported earlier.
Yeah. I started to type, "Reduced CPU time by ..." and list a percentage. Then decided the actual number of milliseconds it ended up with would be better, and deleted "by" and forgot to put "to" in there.
Yes, reduced from 734 ms to 633 ms. About 14%.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 12, 2012 at 7:32 pm
GSquared,
It is good to know when someone uses a posted solution in a production environment, because it at least confirms that it probably works pretty well (i.e., no bugs).
The reason I seemed a little testy early on about performance is that I took your solution and stacked it next to mine to check the execution plan. In that case, I added a 100-number CTE tally. The results screamed at me that mine would perform better, as your query plan took up 97% of the cost (without the UPDATE). Obviously, I know that isn't the full story, and that's why I didn't post that result before. The speed you're reporting looks pretty good; in fact, it's probably doubtful that mine will beat it, but at least it was fun to write.
That is, until you made that ampersand bug jump out at me. It's a very irritating side effect of FOR XML, and it made the workaround ugly.
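For anyone who hasn't hit it, the gotcha in miniature (FOR XML PATH entitizes XML-special characters; the TYPE/.value() trick is the usual way around it):
SELECT 'A&B' FOR XML PATH('') ;
-- returns: A&amp;B
SELECT (SELECT 'A&B' FOR XML PATH(''), TYPE).value('.', 'varchar(100)') ;
-- returns: A&B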
Suffice it to say there are a couple of lessons learned here, and I may yet take a page from Jeff Moden, anti-RBAR Jedi, and use the 1,000,000-row test harness to see just how bad my solution is.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
April 12, 2012 at 8:48 pm
After all was said and done, I decided to add the famous 1,000,000-row test harness (though with only 100,000 rows), so I came up with the following two tests and results (run on a Dell Inspiron, Core i5, 4GB):
1st solution: Recursive CTE followed by Dynamic SQL to do the UPDATE:
CREATE TABLE #t (strings VARCHAR(25));
CREATE TABLE #b (bad CHAR(1));
DECLARE @SQL NVARCHAR(MAX), @n INT, @i INT;
DECLARE @base VARCHAR(83)
SET @base = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789~!@#$%^&*()_+-=[]{}?/'
DECLARE @StartValue INT, @EndValue INT, @Range INT
SELECT @StartValue = 1
,@EndValue = 83
,@Range = @EndValue - @StartValue + 1
INSERT INTO #t (strings)
SELECT ''
UNION ALL SELECT '@'
UNION ALL SELECT '&'
;WITH cteTally (n) AS (
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2)
INSERT INTO #t (strings)
SELECT SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
FROM cteTally
SET STATISTICS TIME ON
PRINT 'QUERY 1 START'
-- Identify the bad characters
;WITH BadStrings AS (
SELECT strings
FROM #t
WHERE strings LIKE '%[^A-Za-z0-9]%'
)
,Parse AS (
SELECT SUBSTRING(strings + '*', PATINDEX('%[^A-Za-z0-9]%',strings + '*'), 1) as bad
,SUBSTRING(strings + '*', 1+PATINDEX('%[^A-Za-z0-9]%',strings + '*'), LEN(strings + '*')) as rest
FROM BadStrings
UNION ALL
SELECT SUBSTRING(rest, PATINDEX('%[^A-Za-z0-9]%',rest), 1)
,SUBSTRING(rest, 1+PATINDEX('%[^A-Za-z0-9]%',rest), LEN(rest))
FROM Parse WHERE LEN(rest) > 0 and
SUBSTRING(rest, 1+PATINDEX('%[^A-Za-z0-9]%',rest), LEN(rest)) <> ''
)
INSERT INTO #b (bad)
SELECT DISTINCT bad
FROM Parse b
PRINT 'QUERY 1 END'
-- Now clean them up
SELECT @SQL = 'UPDATE #t SET strings = ', @i = 1, @n = (SELECT COUNT(*) FROM #b WHERE bad <> '&')
WHILE @i <= @n + 1
SELECT @SQL = @SQL + 'REPLACE(', @i = @i + 1
SELECT @SQL = @SQL + 'strings, ''' + (SELECT bad + ''', ''''), '''
FROM #b WHERE bad <> '&' FOR XML PATH(''))
SELECT @SQL = SUBSTRING(@SQL, 1, LEN(@SQL)-3) + ',''&'','''')' + 'WHERE strings LIKE ''%[^A-Za-z0-9]%'''
--SELECT @SQL
PRINT 'QUERY 2 START'
EXEC (@SQL)
SELECT * FROM #t
PRINT 'QUERY 2 END'
DROP TABLE #t
DROP TABLE #b
Results:
1. 100,003 rows updated in about 13 seconds, including data set-up time
2nd solution: Provided by GSquared:
CREATE TABLE #t (strings VARCHAR(25));
DECLARE @base VARCHAR(83)
SET @base = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789~!@#$%^&*()_+-=[]{}?/'
DECLARE @StartValue INT, @EndValue INT, @Range INT
SELECT @StartValue = 1
,@EndValue = 83
,@Range = @EndValue - @StartValue + 1
INSERT INTO #t (strings)
SELECT ''
UNION ALL SELECT '@'
UNION ALL SELECT '&'
;WITH cteTally (n) AS (
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2)
INSERT INTO #t (strings)
SELECT SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue,1)
FROM cteTally
SET STATISTICS TIME ON
PRINT 'QUERY START'
;WITH Numbers (Number) AS (
SELECT TOP 100000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2)
SELECT *,
(
SELECT Sub + ''
FROM (SELECT SUBSTRING(Strings, Number, 1) AS Sub,
Number
FROM Numbers -- table of positive integers
WHERE Number BETWEEN 1 AND LEN(Strings)) AS Parser
WHERE Sub LIKE '[0-9a-zA-Z]'
ORDER BY Number
FOR XML PATH(''),
TYPE).value('.[1]', 'varchar(100)') fixedstring
FROM #T ;
PRINT 'QUERY END'
DROP TABLE #t
--DROP TABLE #b
Results:
1. Code only SELECTs the resulting fixed strings (no UPDATE)
2. Query cancelled after 4:52 (mm:ss) with 13158 rows returned
I must have done something wrong!
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
April 13, 2012 at 6:53 am
You can't compare a Select to an Update.
Here's a test of my method, using the test data you came up with:
SET NOCOUNT ON ;
IF OBJECT_ID(N'tempdb..#t') IS NOT NULL
DROP TABLE #t ;
CREATE TABLE #t (strings VARCHAR(25)) ;
DECLARE @base VARCHAR(83)
SET @base = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789~!@#$%^&*()_+-=[]{}?/'
DECLARE @StartValue INT,
@EndValue INT,
@Range INT
SELECT @StartValue = 1,
@EndValue = 83,
@Range = @EndValue - @StartValue + 1
INSERT INTO #t
(strings)
SELECT ''
UNION ALL
SELECT '@'
UNION ALL
SELECT '&' ;
WITH cteTally(n)
AS (SELECT TOP 100000
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2)
INSERT INTO #t
(strings)
SELECT SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
FROM cteTally
PRINT 'QUERY START'
SET STATISTICS TIME ON ;
WITH Seeds(Seed)
AS (SELECT V
FROM ( VALUES ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1), ( 1) ) AS T (V)),
Numbers(Number)
AS (SELECT ROW_NUMBER() OVER (ORDER BY S1.Seed)
FROM Seeds AS S1
CROSS JOIN Seeds AS S2)
UPDATE #t
SET strings =
( SELECT Sub + ''
FROM (SELECT SUBSTRING(Strings, Number, 1) AS Sub,
Number
FROM Numbers -- table of positive integers
WHERE Number BETWEEN 1 AND LEN(Strings)) AS Parser
WHERE Sub LIKE '[0-9a-zA-Z]'
ORDER BY Number
FOR XML PATH(''),
TYPE).value('.[1]', 'varchar(100)')
SET STATISTICS TIME OFF ;
PRINT 'QUERY END'
DROP TABLE #t
--DROP TABLE #b
On my machine, results are:
QUERY START
SQL Server Execution Times:
CPU time = 5257 ms, elapsed time = 6053 ms.
QUERY END
Same machine, your version took about 12 seconds. Twice as long. Remove the output from it, and it looks like this:
SET NOCOUNT ON ;
IF OBJECT_ID(N'tempdb..#t') IS NOT NULL
DROP TABLE #t ;
IF OBJECT_ID(N'tempdb..#b') IS NOT NULL
DROP TABLE #b ;
CREATE TABLE #t (strings VARCHAR(25)) ;
CREATE TABLE #b (bad CHAR(1)) ;
DECLARE @SQL NVARCHAR(MAX),
@n INT,
@i INT ;
DECLARE @base VARCHAR(83)
SET @base = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789~!@#$%^&*()_+-=[]{}?/'
DECLARE @StartValue INT,
@EndValue INT,
@Range INT
SELECT @StartValue = 1,
@EndValue = 83,
@Range = @EndValue - @StartValue + 1
INSERT INTO #t
(strings)
SELECT ''
UNION ALL
SELECT '@'
UNION ALL
SELECT '&' ;
WITH cteTally(n)
AS (SELECT TOP 100000
ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2)
INSERT INTO #t
(strings)
SELECT SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
+ SUBSTRING(@base, ABS(CHECKSUM(NEWID())) % @Range + @StartValue, 1)
FROM cteTally
PRINT 'QUERY 1 START'
SET STATISTICS TIME ON
-- Identify the bad characters
;
WITH BadStrings
AS (SELECT strings
FROM #t
WHERE strings LIKE '%[^A-Za-z0-9]%'),
Parse
AS (SELECT SUBSTRING(strings + '*', PATINDEX('%[^A-Za-z0-9]%', strings + '*'), 1) AS bad,
SUBSTRING(strings + '*', 1 + PATINDEX('%[^A-Za-z0-9]%', strings + '*'), LEN(strings + '*')) AS rest
FROM BadStrings
UNION ALL
SELECT SUBSTRING(rest, PATINDEX('%[^A-Za-z0-9]%', rest), 1),
SUBSTRING(rest, 1 + PATINDEX('%[^A-Za-z0-9]%', rest), LEN(rest))
FROM Parse
WHERE LEN(rest) > 0
AND SUBSTRING(rest, 1 + PATINDEX('%[^A-Za-z0-9]%', rest), LEN(rest)) <> '')
INSERT INTO #b
(bad)
SELECT DISTINCT
bad
FROM Parse b
PRINT 'QUERY 1 END'
-- Now clean them up
SELECT @SQL = 'UPDATE #t SET strings = ',
@i = 1,
@n = (SELECT COUNT(*)
FROM #b
WHERE bad <> '&')
WHILE @i <= @n + 1
SELECT @SQL = @SQL + 'REPLACE(',
@i = @i + 1
SELECT @SQL = @SQL + 'strings, ''' + (SELECT bad + ''', ''''), '''
FROM #b
WHERE bad <> '&'
FOR XML PATH(''))
SELECT @SQL = SUBSTRING(@SQL, 1, LEN(@SQL) - 3) + ',''&'','''')' + 'WHERE strings LIKE ''%[^A-Za-z0-9]%'''
--SELECT @SQL
PRINT 'QUERY 2 START'
EXEC (@SQL)
SET STATISTICS TIME OFF ;
PRINT 'QUERY 2 END'
Results:
QUERY 1 START
SQL Server Execution Times:
CPU time = 8674 ms, elapsed time = 8849 ms.
QUERY 1 END
(47 repeated blocks of "SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms." from the string-building WHILE loop, trimmed here)
QUERY 2 START
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 1045 ms, elapsed time = 1039 ms.
SQL Server Execution Times:
CPU time = 1076 ms, elapsed time = 1062 ms.
QUERY 2 END
Total runtime, about 11 seconds, not including the data generation.
By the way, the two-step CTE Numbers table (Seeds then Numbers) that I used in my modification is faster than using system tables to generate it. If you're going to query a table, from disk or cache, you may as well use an actual Numbers table instead of a CTE version. The whole idea is to make it a purely RAM+CPU object with no table behind it at all. That avoids locking overhead, which is the main reason a virtual table is faster than a physical one.
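If you ever need more than 100 rows, the same no-table idea cascades. A rough sketch:
;WITH E1 (N) AS
 (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS T (N)), -- 10 rows
E2 (N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 100 rows
E4 (N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10,000 rows
Numbers (Number) AS
 (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4)
SELECT Number FROM Numbers ; -- 1 through 10,000, purely RAM+CPU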
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 13, 2012 at 7:07 am
On second runs (taking advantage of caching), my update-only version of your script took about 11 seconds, while my script took 3.8 seconds.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 13, 2012 at 7:35 am
GSquared,
I see no flaws in your comparison. Clearly the UPDATE version of yours is faster than the original one you posted with only a SELECT. Too bad I didn't take the time to make that modification myself.
I can't take credit for the SYS tables tally generator or the random numbers generator, as I appropriated them from Jeff Moden's excellent articles on the subject. I'll let him take it up with you if he disagrees on which is faster.
I would like to point out, though, that if you know in advance exactly what the "bad" characters are, you don't need to run my initial SELECT (that was really in there just for fun anyway). You can keep them stored in a table or put them into a CTE and then run only my dynamic SQL, which from your timing runs appears to take about 2 seconds.
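For instance, something like this in place of the recursive identification step (a sketch; the character list is only an example):
CREATE TABLE #b (bad CHAR(1)) ;
INSERT INTO #b (bad)
VALUES ('~'),('!'),('@'),('#'),('$'),('%'),('^'),('&'),('*'),('('),(')'),('-'),('='),('+') ;
-- ...then build and EXEC the REPLACE chain exactly as in the dynamic SQL above.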
Thanks for taking the time to get it all sorted out.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
April 13, 2012 at 8:15 am
You don't need to know the bad characters at all. White-list, don't black-list. It's more secure, easier, and faster.
You could modify my version to use a table of "allowed characters", skip the LIKE test and join to that table (virtual or real), and probably gain a few milliseconds on the whole thing.
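Roughly like so (AllowedChars is a hypothetical name; the join replaces the LIKE test, and Numbers is the same table the parser already uses):
-- Hypothetical whitelist table (binary collation so 'A' and 'a' are distinct rows):
CREATE TABLE dbo.AllowedChars (Ch CHAR(1) COLLATE Latin1_General_BIN NOT NULL PRIMARY KEY) ;
INSERT INTO dbo.AllowedChars (Ch)
SELECT SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789', Number, 1)
FROM dbo.Numbers
WHERE Number BETWEEN 1 AND 62 ;
-- Inside the UPDATE, the parser then joins instead of pattern-matching:
--   FROM (...) AS Parser
--   JOIN dbo.AllowedChars ac ON ac.Ch = Parser.Sub COLLATE Latin1_General_BIN
-- in place of:  WHERE Sub LIKE '[0-9a-zA-Z]'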
And, yes, I've discussed uses of Numbers tables with Jeff. More than once. Several times, I've helped him speed up code he was using for various things. His skillset and mine overlap considerably. I was one of the people who told him to start using a 0-based Numbers/Tally table several years ago, instead of starting it at 1, to get a performance improvement on string parsing, for example.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
April 29, 2012 at 6:06 pm
GSquared (4/13/2012)
You don't need to know the bad characters at all. White-list, don't black-list. It's more secure, easier, and faster. You could modify my version to use a table of "allowed characters", skip the LIKE test and join to that table (virtual or real), and probably gain a few milliseconds on the whole thing.
And, yes, I've discussed uses of Numbers tables with Jeff. More than once. Several times, I've helped him speed up code he was using for various things. His skillset and mine overlap considerably. I was one of the people who told him to start using a 0-based Numbers/Tally table several years ago, instead of starting it at 1, to get a performance improvement on string parsing, for example.
It turns out that the 1-based parser is faster, but it was definitely a good thought.
--Jeff Moden
Change is inevitable... Change for the better is not.