January 17, 2014 at 3:35 am
Some excellent work
DECLARE @TestVar VARCHAR(16);
DECLARE @StartTime DATETIME;
PRINT 'Old Method Scalar Function'
SELECT @StartTime = GETDATE();
SELECT@TestVar = Utility.f_ProperCase(PT.String)
FROM#ProperTest AS PT;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
PRINT 'Dohsan'
SELECT @StartTime = GETDATE();
SELECT@TestVar = CA1.ProperCaseString
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase(PT.String) AS CA1;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
PRINT 'Luis'
SELECT @StartTime = GETDATE();
SELECT@TestVar = CA1.ProperCased
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase1(PT.String) AS CA1;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
PRINT 'Dwain'
SELECT @StartTime = GETDATE();
SELECT@TestVar = CA1.ProperCaseStr
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase2(PT.String) AS CA1;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
PRINT 'Magoo varchar255'
SELECT @StartTime = GETDATE();
SELECT@TestVar = CA1.[text]
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase3(PT.String) AS CA1;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
PRINT 'Luis V2'
SELECT @StartTime = GETDATE();
SELECT@TestVar = CA1.ProperCaseString
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase4(PT.String) AS CA1;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
Here as the results:
300k
Old Method Scalar Function
47306 MilliSeconds
Dohsan
19113 MilliSeconds
Luis
5946 MilliSeconds
Dwain
54486 MilliSeconds
Magoo varchar255
5440 MilliSeconds
Luis V2
6433 MilliSeconds
1million
Old Method Scalar Function
160710 MilliSeconds
Dohsan
63750 MilliSeconds
Luis
15450 MilliSeconds
Dwain
175366 MilliSeconds
Magoo varchar255
15513 MilliSeconds
Luis V2
17843 MilliSeconds
How do they handle some special cases?
DECLARE @SingleTestString VARCHAR(255) = 'ándre, ándre, luís and o''brien had a low-budget 3d printer in washington d.c.',
@TestVar2 VARCHAR(255);
PRINT 'Original String'
PRINT @SingleTestString;
PRINT 'Old Method Scalar Function'
SELECT @TestVar2 = Utility.f_ProperCase(@SingleTestString);
PRINT @TestVar2
PRINT 'Dohsan'
SELECT @TestVar2 = ProperCaseString FROM dbo.ProperCase(@SingleTestString);
PRINT @TestVar2
PRINT 'Luis'
SELECT @TestVar2 = ProperCased FROM dbo.ProperCase1(@SingleTestString);
PRINT @TestVar2
PRINT 'Dwain'
SELECT @TestVar2 = ProperCaseStr FROM dbo.ProperCase2(@SingleTestString);
PRINT @TestVar2
PRINT 'Magoo varchar255'
SELECT @TestVar2 = [text] FROM dbo.ProperCase3(@SingleTestString);
PRINT @TestVar2
PRINT 'Luis V2'
SELECT @TestVar2 = ProperCaseString FROM dbo.ProperCase4(@SingleTestString);
PRINT @TestVar2
Results:
Original String
ándre, ándre, luís and o'brien had a low-budget 3d printer in washington d.c.
Old Method Scalar Function
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3d Printer In Washington D.c.
Dohsan
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3d Printer In Washington D.c.
Luis
Ándre, Ándre, Luís And O'brien Had A Low-budget 3d Printer In Washington D.c.
Dwain
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.
Magoo varchar255
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.
Luis V2
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3d Printer In Washington D.C.
January 17, 2014 at 3:56 am
Original Test strings were small (16 chars), the original function was only for 255 although a few of the functions presented have been able to handle far more. So upping it to nearer the 255 limit to see how they cope (so all functions can be run)
IF OBJECT_ID('tempdb..#ProperTest','U') IS NOT NULL
DROP TABLE #ProperTest;
SELECTREPLICATE(STUFF(STUFF(CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),N%16+1,1,' '),N%3+1,1,' '),14) AS String
INTO #ProperTest
FROMdbo.GetNums(1,300000);
Results:
Old Method Scalar Function
617393 MilliSeconds
Dohsan
162690 MilliSeconds
Luis
36680 MilliSeconds
Dwain
424590 MilliSeconds
Magoo varchar255
44063 MilliSeconds
Luis V2
49436 MilliSeconds
January 17, 2014 at 5:43 am
I ran some similar comparisons, but I had run them up to handle varchar(8000) and ran against a million-row test table with the strings filled to capacity. They were running pretty close to each other (with MM being the best at ~28 seconds), but I haven't tested everything with special characters or multiple trailing spaces. Dropping down to a max length of 255 definitely improved performance (dropping the best to ~22 seconds), but I hadn't completed testing last night before I needed to get some sleep. 😛
All in all, I think the work done on this thread has been excellent.
January 17, 2014 at 8:06 am
That is some excellent work on the testing there Dohsan - well done!:-)
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 17, 2014 at 3:52 pm
BWAAA-HAAA!!!! Dare I say IT? DARE I? Oh, what the heck...
[font="Arial Black"]"Ahhhhhhh MAGOO! You've done it again." [/font]:-P (Sorry ol' friend... I've been holding it back for more than a month and couldn't resist).
--Jeff Moden
Change is inevitable... Change for the better is not.
January 17, 2014 at 4:52 pm
Jeff Moden (1/17/2014)
BWAAA-HAAA!!!! Dare I say IT? DARE I? Oh, what the heck...[font="Arial Black"]"Ahhhhhhh MAGOO! You've done it again." [/font]:-P (Sorry ol' friend... I've been holding it back for more than a month and couldn't resist).
If I am honest, I kind of missed it :hehe:
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 17, 2014 at 5:22 pm
I agree that you've done some very thorough testing here. Excellent work!
Dohsan (1/17/2014)
Dwain
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.
I wasn't sure how you wanted to handle "Low-Budget' and "3D" but either can be handled in the opposite manner by adjusting the match pattern.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
January 17, 2014 at 5:49 pm
dwain.c (1/17/2014)
I agree that you've done some very thorough testing here. Excellent work!Dohsan (1/17/2014)
Dwain
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.
I wasn't sure how you wanted to handle "Low-Budget' and "3D" but either can be handled in the opposite manner by adjusting the match pattern.
Don't make me start with contractions or possessives. You'Ll Say It'S My Fault.:hehe:
January 18, 2014 at 9:56 am
Luis Cazares (1/17/2014)
dwain.c (1/17/2014)
I agree that you've done some very thorough testing here. Excellent work!Dohsan (1/17/2014)
Dwain
Ándre, Ándre, Luís And O'Brien Had A Low-Budget 3D Printer In Washington D.C.
I wasn't sure how you wanted to handle "Low-Budget' and "3D" but either can be handled in the opposite manner by adjusting the match pattern.
Don't make me start with contractions or possessives. You'Ll Say It'S My Fault.:hehe:
hah very true!
January 19, 2014 at 5:42 am
This thread has been fun to watch, thanks folks! Here's an option I'm surprised nobody posted up:
CREATE FUNCTION [dbo].[IF_ProperWithREPLACE]
(
@inputstring VARCHAR(8000)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
(
SELECT ProperisedString =
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
' '+@inputstring COLLATE Latin1_General_BIN,
' a',' A'),' b',' B'),' c',' C'),' d',' D'),' e',' E'),' f',' F'),' g',' G'),' h',' H'),' i',' I'),
' j',' J'),' k',' K'),' l',' L'),' m',' M'),' n',' N'),' o',' O'),' p',' P'),' q',' Q'),' r',' R'),
' s',' S'),' t',' T'),' u',' U'),' v',' V'),' w',' W'),' x',' X'),' y',' Y'),' z',' Z')
)
It's about 8x faster than any of the solutions posted so far, leaving some scope for dealing with weird stuff like accents.
For better assistance in answering your questions, please read this[/url].
Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
February 26, 2014 at 12:39 pm
I came back to this thread today because I was trying to proper case some stuff around here. I needed to exclude tokens that contain at least a single digit but all other tokens should be init caps.
I made a slight modification to Dwain's that worked well for my purposes.
CREATE FUNCTION [dbo].[ProperCaseWithNumbers]
(
@StrIn VARCHAR(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
SELECT STUFF((
SELECT CASE
WHEN split.Item LIKE '%[0-9]%'
THEN ' ' + split.Item
ELSE ' ' + Upper(LEFT(split.Item, 1)) + Lower(Substring(split.Item, 2, 255))
END AS word
FROM dbo.Delimitedsplit8k(@StrIn, ' ') split
FOR XML PATH('')
,TYPE
).value('.', 'varchar(255)'), 1, 1, '') AS ProperCased
We have some data coming from the mainframe and it is all caps but I need to make this data more "user friendly". Here is an example of this one in action.
declare @MyVals table (SeriesName varchar(25))
insert @MyVals
select '9400 SERIES' union all
select '9760 SERIES' union all
select 'ADE360 SERIES' union all
select 'CUSTOM' union all
select 'M16360 SERIES' union all
select 'N1260 SERIES' union all
select 'N3160 SERIES' union all
select 'PA360 SERIES' union all
select 'S260 SERIES'
select *
from @MyVals v
cross apply dbo.ProperCaseWithNumbers(v.SeriesName)
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 26, 2014 at 5:21 pm
Sean Lange (2/26/2014)
I made a slight modification to Dwain's that worked well for my purposes.
Can't say I see that much of a resemblance but it is always good to hear success stories.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
February 27, 2014 at 7:23 am
dwain.c (2/26/2014)
Sean Lange (2/26/2014)
I made a slight modification to Dwain's that worked well for my purposes.
Can't say I see that much of a resemblance but it is always good to hear success stories.
Doh!!! That's because it was a modification of the code Luis posted.
Thanks Luis!!!
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
February 27, 2014 at 8:32 am
Sean Lange (2/27/2014)
dwain.c (2/26/2014)
Sean Lange (2/26/2014)
I made a slight modification to Dwain's that worked well for my purposes.
Can't say I see that much of a resemblance but it is always good to hear success stories.
Doh!!! That's because it was a modification of the code Luis posted.
Thanks Luis!!!
I knew the code look familiar 😀
I guess that's the benefit of collective intelligence.
February 27, 2014 at 8:43 am
I wonder if a CLR function would be worth considering here? .NET includes the System.Globalization.TextInfo.ToTitleCase() function which seems to do a pretty good job getting things properly capitalized. Not perfect but perhaps an extension method could handle the edge cases ?
Viewing 15 posts - 16 through 30 (of 30 total)
You must be logged in to reply to this topic. Login to reply