January 16, 2014 at 8:42 am
I'm having an issue converting a scalar function into Table-Valued function and was hoping some of the minds on this forum would be able to help.
Here is the original Function:
CREATE FUNCTION [Utility].[f_ProperCase]
(
@strIn VARCHAR(255)
)
RETURNS VARCHAR(255)
AS
BEGIN
IF @strIn IS NULL
RETURN NULL
DECLARE
@strOut VARCHAR(255),
@i INT,
@Up BIT,
@C VARCHAR(2)
SELECT
@strOut = '',
@i = 0,
@Up = 1
WHILE @i <= DATALENGTH(@strIn)
BEGIN
SET @C = SUBSTRING(@strIn,@i,1)
IF @C IN (' ','-','''')
BEGIN
SET @strOut = @strOut + @C
SET @Up = 1
END
ELSE
BEGIN
IF @up = 1
SET @C = UPPER(@c)
ELSE
SET @C = LOWER(@c)
SET @strOut = @strOut + @C
SET @Up = 0
END
SET @i = @i + 1
END
RETURN @strOut
END
Here is my attempt at a Table Valued Function to perform the same task:
CREATE FUNCTION [dbo].[ProperCase]
(
@StrIn VARCHAR(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
SELECTProperCaseString = (
SELECTCASE
WHEN GN.N = 1 THEN UPPER(SUBSTRING(@StrIn,GN.N,1))
WHEN SUBSTRING(@StrIn,GN.N-1,1) IN (' ','-','''') THEN UPPER(SUBSTRING(@StrIn,GN.N,1))
ELSE LOWER(SUBSTRING(@StrIn,GN.N,1))
END AS [text()]
FROMdbo.GetNums(1,ISNULL(LEN(@StrIn),1)) AS GN
FOR XML PATH('')
);
GO
Part of the issue is that the XML concatenation is converting the blanks spaces into "& # x 2 0;" . I think I've found a way to correct this and get the output i would like (Shown in the below code):
DECLARE @StringTest TABLE
(
StrIn VARCHAR(255)
)
INSERT INTO @StringTest(StrIn)
VALUES ('thIS iS A StRing'),
('ANOTHEr STRING'),
('again another string')
--Converts special characters
SELECTCA1.ProperCase AS ProperCase
FROM@StringTest AS ST
CROSS APPLY (
SELECTCASE
WHEN GN.N = 1 THEN UPPER(SUBSTRING(ST.StrIn,GN.N,1))
WHEN SUBSTRING(ST.StrIn,GN.N-1,1) IN (' ','-','''') THEN UPPER(SUBSTRING(ST.StrIn,GN.N,1))
WHEN SUBSTRING(ST.StrIn,GN.N,1) = ' ' THEN CHAR(32)
ELSE LOWER(SUBSTRING(ST.StrIn,GN.N,1))
END AS [text()]
FROMdbo.GetNums(1,ISNULL(LEN(ST.StrIn),1)) AS GN
FOR XML PATH('')
) AS CA1 (ProperCase)
--Shows special characters i.e. space as a space
SELECTCA1.ProperCase.value('/MyString[1]','varchar(8000)') AS ProperCase
FROM@StringTest AS ST
CROSS APPLY (
SELECTCASE
WHEN GN.N = 1 THEN UPPER(SUBSTRING(ST.StrIn,GN.N,1))
WHEN SUBSTRING(ST.StrIn,GN.N-1,1) IN (' ','-','''') THEN UPPER(SUBSTRING(ST.StrIn,GN.N,1))
WHEN SUBSTRING(ST.StrIn,GN.N,1) = ' ' THEN CHAR(32)
ELSE LOWER(SUBSTRING(ST.StrIn,GN.N,1))
END AS [text()]
FROMdbo.GetNums(1,ISNULL(LEN(ST.StrIn),1)) AS GN
FOR XML PATH(''), root('MyString'), type
) AS CA1 (ProperCase)
The main part of the issue though, and the long winded point of this post, is that i cannot seem to reflect this in a function
CREATE FUNCTION [dbo].[ProperCase]
(
@StrIn VARCHAR(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
SELECTProperCaseString = (
SELECT SQ1.[text()].value('/MyString[1]','varchar(8000)')
FROM
(SELECTCASE
WHEN GN.N = 1 THEN UPPER(SUBSTRING(@StrIn,GN.N,1))
WHEN SUBSTRING(@StrIn,GN.N-1,1) IN (' ','-','''') THEN UPPER(SUBSTRING(@StrIn,GN.N,1))
WHEN SUBSTRING(@StrIn,GN.N,1) = ' ' THEN ' '
ELSE LOWER(SUBSTRING(@StrIn,GN.N,1))
END AS [text()]
FROMdbo.GetNums(1,ISNULL(LEN(@StrIn),1)) AS GN
FOR XML PATH(''), root('MyString'), type
) AS SQ1
);
GO
I get the follow error when I try to create the function:
Msg 8155, Level 16, State 2, Procedure ProperCase, Line 24
No column name was specified for column 1 of 'SQ1'.
Msg 207, Level 16, State 1, Procedure ProperCase, Line 13
Invalid column name 'text()'.
Any suggestions/Improvements welcome!
January 16, 2014 at 9:08 am
Think I may have figured it out:
CREATE FUNCTION [dbo].[ProperCase]
(
@StrIn VARCHAR(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
SELECTProperCaseString = (
(SELECTCASE
WHEN GN.N = 1 THEN UPPER(SUBSTRING(@StrIn,GN.N,1))
WHEN SUBSTRING(@StrIn,GN.N-1,1) IN (' ','-','''') THEN UPPER(SUBSTRING(@StrIn,GN.N,1))
ELSE LOWER(SUBSTRING(@StrIn,GN.N,1))
END AS [text()]
FROMdbo.GetNums(1,ISNULL(LEN(@StrIn),1)) AS GN
FOR XML PATH(''), TYPE
).value('.','varchar(8000)')
);
Now to see if it actually is an improvement!
And if anyone has anything to add please do.
January 16, 2014 at 9:19 am
You have definitely peaked my interest with this one. I've been playing with adapting Jeff Moden's DelimitedSplit8K function covered in the article at http://www.sqlservercentral.com/articles/Tally+Table/72993, but I don't have your solution yet. I'm running into the case where multiple trailing spaces is killing it. Then there's the matter of allowing for more than one delimiter at once. :w00t: I don't know if I'm running down a rabbit hole with no hope here, but it is definitely cool stuff. 😀
I'm curious as to the performance when using the XML approach versus the tally table approach, so please post your performance testing results.
January 16, 2014 at 9:24 am
Hi,
I'm not sure what your GetNums function looks like. I used the DelimitedSplit8K which you can find in the following article which explains how it works. http://www.sqlservercentral.com/articles/Tally+Table/72993
Here's my option for you that seems to be doing what you need.
CREATE FUNCTION [dbo].[ProperCase]
(
@StrIn VARCHAR(255)
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
SELECT STUFF((SELECT ' ' + UPPER( LEFT( split.Item, 1)) + LOWER(SUBSTRING( split.Item, 2, 255)) word
FROM dbo.DelimitedSplit8K( @StrIn, ' ') split
FOR XML PATH(''),TYPE).value('.', 'varchar(255)'), 1, 1, '') AS ProperCased
January 16, 2014 at 9:26 am
Can you post GetNums? There have been a few different proper case functions around here over the years. They are always a challenge because sometime you want every word capitalized and sometimes you want Title Case.
Parade Of The Horribles
Parade of the Horribles
Subtle difference but makes a huge difference.
What about things like Washington D.C.?
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
January 16, 2014 at 9:37 am
dbo.GetNums is as per below, straight from one of Jeff's discussions/articles I believe:
CREATE FUNCTION [dbo].[GetNums]
(
@low AS BIGINT,
@high AS BIGINT
)
RETURNS TABLE WITH SCHEMABINDING
AS
RETURN
WITH
L0 AS (SELECT c FROM (VALUES(1),(1)) AS D(c)),
L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),
L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),
L3 AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),
L4 AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),
L5 AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),
Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM L5)
SELECT TOP(@high - @low + 1) @low + rownum - 1 AS n
FROM Nums
ORDER BY rownum;
For our purposes capitalising each Letter of a word is acceptable.
January 16, 2014 at 9:51 am
Trying to piece together some test data, some Uppercase letters with some spaces inserted:
IF OBJECT_ID('tempdb..#ProperTest','U') IS NOT NULL
DROP TABLE #ProperTest;
SELECTSTUFF(STUFF(CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%26+65),N%16+1,1,' '),N%3+1,1,' ') AS String
INTO #ProperTest
FROMdbo.GetNums(1,300000);
Run the Code
PRINT 'Old Method Scalar Function'
SELECTPT.String,
Utility.f_ProperCase(PT.String) AS Proper
FROM#ProperTest AS PT;
PRINT 'Dohsan'
SELECTPT.String,
CA1.ProperCaseString
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase(PT.String) AS CA1;
PRINT 'Luis'
SELECTPT.String,
CA1.ProperCased
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase1(PT.String) AS CA1;
DECLARE @TestVar VARCHAR(16);
DECLARE @StartTime DATETIME;
PRINT 'Old Method Scalar Function'
SELECT @StartTime = GETDATE();
SELECT@TestVar = Utility.f_ProperCase(PT.String)
FROM#ProperTest AS PT;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
PRINT 'Dohsan'
SELECT @StartTime = GETDATE();
SELECT@TestVar = CA1.ProperCaseString
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase(PT.String) AS CA1;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
PRINT 'Luis'
SELECT @StartTime = GETDATE();
SELECT@TestVar = CA1.ProperCased
FROM#ProperTest AS PT
CROSS
APPLYdbo.ProperCase1(PT.String) AS CA1;
PRINT CAST(DATEDIFF(ms,@StartTime,GETDATE()) AS VARCHAR) + ' MilliSeconds';
Results:
Old Method Scalar Function
47843 MilliSeconds
Dohsan
18906 MilliSeconds
Luis
4710 MilliSeconds
January 16, 2014 at 9:55 am
That's nice Dohsan,
Just make sure my function gets the correct results as it won't check for characters different than spaces when trying to capitalize the first letter of each word.
January 16, 2014 at 9:57 am
Yes, I'll have to play around with it a little more when I have some time tomorrow.
I must say i hadn't thought about using the splitter to separate the words out, so +1 for you and ed wagner for having that idea!
January 16, 2014 at 12:46 pm
Luis, your function is much simpler than mine was working out to be. I was working on creating a derivative on the DelimitedSplit8K function and was most of the way there, but the presence of multiple trailing spaces was giving me fits during testing. Also, the multiple delimiters were definitely impacting performance.
I like the way you used the unaltered and well-tested function and manipulated the output of it. Mine would have required testing to make sure all bases were covered.
January 16, 2014 at 2:43 pm
Ed, you gave me an idea an here's what I came out with after playing with the Jeff's splitter. Dohsan would be able to change the word delimiters as he considers correct without much problem.
It's a simple query but can be easily converted in a iTVF.
DECLARE @pString VARCHAR(255) = 'this is a 3d-printer from wahsington d.c.';
--===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
-- enough to cover NVARCHAR(4000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each word)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) IN(' ', '.', '-', '''')
)
SELECT (SELECT CASE WHEN s.N1 IS NULL
THEN LOWER(SUBSTRING(@pString,t.N,1))
ELSE UPPER(SUBSTRING(@pString,t.N,1)) END
FROM cteTally t
LEFT JOIN cteStart s ON t.N = s.N1
ORDER BY N
FOR XML PATH(''),TYPE).value('.', 'varchar(255)') ProperCaseString
;
EDIT: I reduced the datatypes for GREAT performance improvement.
January 16, 2014 at 4:59 pm
Here's one similar to Luis, but handles detecting the start of a word differently and works on nvarchar(4000) input.
create function [dbo].[propercase](@text nvarchar(4000))
returns table with schemabinding
as
return (
with seed1 (a)
as
(
select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all
select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1
),
numbers (n) as
(
select top (datalength(@text)) row_number() over (order by (select null))
from seed1 s1, seed1 s2, seed1 s3
)
select a.b.value('(./text())[1]', 'nvarchar(4000)') as [text]
from (
select
case
when n = 1 then upper(substring(@text, n, 1))
when substring(@text, n - 1, 2) like N'[^a-z][a-z]' collate Latin1_General_CI_AI then upper(substring(@text, n, 1))
else lower(substring(@text, n, 1))
end
from numbers
for xml path (''), type
) a (b)
)
On the small test strings it performs about 50% slower than Luis original one, but that is just because of the NVARCHAR(4000) compatibility.
Below is a varchar(255) version that is comparable to Luis, but with the extra word start checks.
create function [dbo].[propercase](@text varchar(255))
returns table with schemabinding
as
return (
with seed1 (a)
as
(
select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1
),
numbers (n) as
(
select top (datalength(@text)) row_number() over (order by (select null))
from seed1 s1, seed1 s2, seed1 s3
)
select a.b.value('(./text())[1]', 'varchar(255)') as [text]
from (
select
case
when n = 1 then upper(substring(@text, n, 1))
when substring(@text, n - 1, 2) like '[^a-z][a-z]' collate Latin1_General_CI_AI then upper(substring(@text, n, 1))
else lower(substring(@text, n, 1))
end
from numbers
for xml path (''), type
) a (b)
)
EDIT: modified to cope with Accénted characters...Damn you Luis! I wanted to sleep:-D
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 16, 2014 at 5:30 pm
I just love when people continue to post to get better solutions.
Mr. Magoo, it's really nice how you handle the start of a word. However, it might generate problems with accentuated words (which are uncommon in English but might occur with words "imported" from other languages). It seems that changing the collation to Latin1_General_CI_AI can solve the problem but might reduce the performance.
January 16, 2014 at 5:52 pm
Luis Cazares (1/16/2014)
I just love when people continue to post to get better solutions.Mr. Magoo, it's really nice how you handle the start of a word. However, it might generate problems with accentuated words (which are uncommon in English but might occur with words "imported" from other languages). It seems that changing the collation to Latin1_General_CI_AI can solve the problem but might reduce the performance.
Thanks Luis, I agree there are compromises, as mine will handle O'Brien as a name, where others posted so far don't. I agree though that accented characters should probably be handled - even in English - we are a very multicultural society these days and it's not uncommon to get almost any name possible. It's just a bit late now for me to be bothered :w00t:
MM
select geometry::STGeomFromWKB(0x0106000000020000000103000000010000000B0000001000000000000840000000000000003DD8CCCCCCCCCC0840000000000000003DD8CCCCCCCCCC08408014AE47E17AFC3F040000000000104000CDCCCCCCCCEC3F9C999999999913408014AE47E17AFC3F9C99999999991340000000000000003D0000000000001440000000000000003D000000000000144000000000000000400400000000001040000000000000F03F100000000000084000000000000000401000000000000840000000000000003D0103000000010000000B000000000000000000143D000000000000003D009E99999999B93F000000000000003D009E99999999B93F8014AE47E17AFC3F400000000000F03F00CDCCCCCCCCEC3FA06666666666FE3F8014AE47E17AFC3FA06666666666FE3F000000000000003D1800000000000040000000000000003D18000000000000400000000000000040400000000000F03F000000000000F03F000000000000143D0000000000000040000000000000143D000000000000003D, 0);
January 16, 2014 at 11:44 pm
Perhaps something like this might help?
CREATE FUNCTION [dbo].[ProperCase]
(
@MyStr VARCHAR(8000) = NULL
)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
SELECT ProperCaseStr=
(
SELECT CASE WHEN [Matched] = 1 THEN STUFF(Item, 1, 1, UPPER(LEFT(Item, 1))) ELSE Item END
FROM dbo.PatternSplitCM(@MyStr, '[a-zA-Z]') b
ORDER BY ItemNumber
FOR XML PATH(''), TYPE
).value('.', 'varchar(8000)')
;
My ProperCase iTVF uses PatternSplitCM which can be found in the 4th article in my signature links.
WITH SampleData (MyStr) AS
(
SELECT 'The cat in the hat'
UNION ALL SELECT 'the mouse is in the parlour. but the dog is out to lunch'
)
SELECT MyStr, ProperCaseStr
FROM SampleData
CROSS APPLY dbo.ProperCase(MyStr);
You'd need to verify of course that it handles all of the cases you want.
My thought question: Have you ever been told that your query runs too fast?
My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.
Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St
Viewing 15 posts - 1 through 15 (of 30 total)
You must be logged in to reply to this topic. Login to reply