October 25, 2012 at 4:37 pm
so i went for the SQL CLR just to see if i could get it to work (been on a learning kick right now so im going with it) and here are my results.
Duration for CHARINDEX = 00:00:21:963
Duration for LEN = 00:00:16:217
Duration for CHARINDEX with COLLATE = 00:00:10:260
Duration for PARSENAME = 00:00:11:043
Duration for LIKE = 00:00:11:293
Duration for LIKE with Collate = 00:00:10:707
Duration for SQLCLR = 00:00:14:427
What i added to the run
DBCC FREEPROCCACHE WITH NO_INFOMSGS;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;
DECLARE @HOLDER INT, @Duration CHAR(12), @StartTime DATETIME;
SELECT @StartTime = GETDATE();
SELECT @HOLDER = ID
FROM #testEnvironment
WHERE dbo.RegexCLR(ISNULL(yourData,'')) = 1
SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for SQLCLR = %s',0,1,@Duration) WITH NOWAIT;
and the code for the SQLCLR
public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction()]
public static bool RegexCLR(string input)
{
return Regex.IsMatch(input,@"^[^\.]+\.[^\.]+$");
}
};
of course when we change the NTEXT to NVARCHAR(MAX) i get the following:
Duration for CHARINDEX = 00:00:06:030
Duration for LEN = 00:00:03:253
Duration for CHARINDEX with COLLATE = 00:00:00:943
Duration for PARSENAME = 00:00:01:147
Duration for LIKE = 00:00:07:253
Duration for LIKE with Collate = 00:00:01:353
Duration for SQLCLR = 00:00:04:040
they are all really close over 1 million records.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
October 26, 2012 at 7:45 am
Try the Parsename function. Not what is was made for, but works quite well for what you need and is much faster that CharIndex and Len.
Duration for CHARINDEX = 00:00:00:933
Duration for Len = 00:00:00:607
Duration for PARSENAME = 00:00:00:127
SET NOCOUNT ON;
IF object_id('tempdb..#testEnvironment') IS NOT NULL
BEGIN
DROP TABLE #testEnvironment;
END;
--1,000,000 Random rows of data
SELECT IDENTITY(INT,1,1) AS ID, CAST(LEFT(yourData, LEN(yourData) - 1) AS varchar(50)) AS yourData
INTO #testEnvironment
FROM (SELECT TOP 1000000
REPLICATE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(NEWID(),'-','')
,'0',''),'9',''),'8',''),'7',''),'6',''),'5',''),'4',''),'3',''),'2',''),'1','') + '.',
(ABS(CHECKSUM(NEWID())) % 2) + 2)
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3
)a(yourData);
--select * from #testEnvironment where PARSENAME(yourData,3) is null --00:00:48:00
DECLARE @HOLDER INT, @Duration CHAR(12), @StartTime DATETIME;
SELECT @StartTime = GETDATE();
SELECT @HOLDER = ID
FROM #testEnvironment
WHERE CHARINDEX('.', yourData, CHARINDEX('.', yourData) + 1) = 0;
SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for CHARINDEX = %s',0,1,@Duration) WITH NOWAIT;
SELECT @StartTime = SYSDATETIME();
SELECT @HOLDER = ID
FROM #testEnvironment
WHERE LEN(CAST(yourData AS NVARCHAR(MAX)))-1 = LEN(REPLACE(CAST(yourData AS NVARCHAR(MAX)) COLLATE Latin1_General_BIN2,'.',''));
SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for Len = %s',0,1,@Duration) WITH NOWAIT;
SELECT @StartTime = SYSDATETIME();
select @HOLDER = ID from #testEnvironment where PARSENAME(yourData,3) is null
SELECT @Duration = CONVERT(CHAR(12),GETDATE()-@StartTime,114);
RAISERROR('Duration for PARSENAME = %s',0,1,@Duration) WITH NOWAIT;
October 26, 2012 at 8:05 am
jdfletchr (10/26/2012)
Try the Parsename function. Not what is was made for, but works quite well for what you need and is much faster that CharIndex and Len.
Did you read the thread with the tests made and the observation on a solultion that was the same as yours?
Without an extra condition, your PARSENAME solution will throw incorrect results.
By the way, another problem is we're dealing with an ntext column.
October 26, 2012 at 8:08 am
...
they are all really close over 1 million records
...
If you want your CLR with Regex to perform well you need to declare your Regex object as static and use Compile option for the pattern. Try changing your CLR to this:
public partial class UserDefinedFunctions
{
static readonly Regex _regex = new Regex(@"^[^\.]+\.[^\.]+$", RegexOptions.Compiled);
[Microsoft.SqlServer.Server.SqlFunction()]
public static bool RegexCLR(string input)
{
return Regex.IsMatch(input);
}
};
October 26, 2012 at 4:09 pm
Eugene Elutin (10/26/2012)
...
they are all really close over 1 million records
...
If you want your CLR with Regex to perform well you need to declare your Regex object as static and use Compile option for the pattern. Try changing your CLR to this:
public partial class UserDefinedFunctions
{
static readonly Regex _regex = new Regex(@"^[^\.]+\.[^\.]+$", RegexOptions.Compiled);
[Microsoft.SqlServer.Server.SqlFunction()]
public static bool RegexCLR(string input)
{
return Regex.IsMatch(input);
}
};
Thanks for the tip on the regex im new to C# but chose to learn that language specifically for CLR's. after the code change and rerunning the tests this is what i got with NTEXT:
Duration for CHARINDEX = 00:00:14:957
Duration for LEN = 00:00:15:723
Duration for CHARINDEX with COLLATE = 00:00:09:790
Duration for PARSENAME = 00:00:10:717
Duration for LIKE = 00:00:11:087
Duration for LIKE with COLLATE = 00:00:10:683
Duration for SQLCLR = 00:00:11:427
and now for NVARCHAR(MAX)
Duration for CHARINDEX = 00:00:06:253
Duration for LEN = 00:00:03:200
Duration for CHARINDEX with COLLATE = 00:00:01:033
Duration for PARSENAME = 00:00:01:327
Duration for LIKE = 00:00:07:237
Duration for LIKE with COLLATE = 00:00:01:457
Duration for SQLCLR = 00:00:01:953
Im actually supprised that by changing the datatype we can chop a factor of 10 off the execution times. never really saw the direct impact of data types like this before.
For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]
Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]
Viewing 5 posts - 16 through 19 (of 19 total)
You must be logged in to reply to this topic. Login to reply