String Parsing UDF

Question

Post reply

String Parsing UDF

Charlottecb

Hall of Fame

Points: 3932
More actions
March 23, 2006 at 4:38 am

#114031

Hi All,
SQL2000
I am trying to create a UDF so it can be called in-line to parse field names into more friendly names.
Basically it need to strip off the first few chars then seperate any capitals with a space unless the capitals have a capital before them and like wise with numbers.
I would like it to convert as follows...
strPersonSurname.... Person Surname
intPersonID... Person ID
strAddressLine1... Address Line 1
strAddressLine45... Address Line 45
strLocationOfPDF... Location Of PDF
bByte... Byte
dtDateSent... Date Sent
Can some one help please as its driving me mad!!!
Thanks
CCB

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

RyanRandall SSChampion Points: 13623 More actions · Answer 1

Hi Charlotte,

This works for all your examples. Just run it and see what you think...

--Input examples

DECLARE @v-2 VARCHAR(100)

SET @v-2 = 'strPersonSurname'

-- SET @v-2 = 'intPersonID'

-- SET @v-2 = 'strAddressLine1'

-- SET @v-2 = 'strAddressLine45'

-- SET @v-2 = 'strLocationOfPDF'

-- SET @v-2 = 'bByte'

-- SET @v-2 = 'dtDateSent'

SET NOCOUNT ON

--Create a table to a row for each character

DECLARE @character TABLE (i INT IDENTITY(1, 1), c char(1), marker tinyint)

INSERT @character SELECT TOP 101 NULL, 0 FROM master.dbo.syscolumns

--Add characters to table

UPDATE @character SET c = SUBSTRING(@v, i, 1)

--Mark the ones which might cause a preceding space (with a 1)

UPDATE @character SET marker = 1 WHERE ASCII(c) BETWEEN ASCII('A') AND ASCII('Z') OR c LIKE '[0-9]'

--Mark the ones we don't want (with null)

UPDATE @character SET marker = NULL WHERE i < (SELECT MIN(i) FROM @character WHERE marker = 1)

--Mark the ones which will cause a preceding space (with a 2)

UPDATE a SET marker = 2

FROM @character a LEFT OUTER JOIN @character b ON a.i = b.i + 1 AND a.marker = b.marker

WHERE a.marker = 1 and b.i IS NULL

--Extract results from original string

DECLARE @s-2 VARCHAR(100)

SET @s-2 = ''

SELECT @s-2 = @s-2 + CASE marker WHEN 2 THEN ' ' ELSE '' END + c FROM @character WHERE marker IS NOT NULL

PRINT LTRIM(@s)

Ryan Randall

Solutions are easy. Understanding the problem, now, that's the hard part.

Charlottecb Hall of Fame Points: 3932 More actions · Answer 2

Ryan

Thats works really well. Can I ask one final thing.

How would I extend it to replace - and _ with spaces.

Thanks very much

CCB

RyanRandall SSChampion Points: 13623 More actions · Answer 3

No worries

I've extended the code to deal with different separators and I've tweaked it slightly. If you want to make specific replacements at various points, just use the REPLACE function where you need it.

--Input examples

DECLARE @v-2 VARCHAR(100)

DECLARE @Separator VARCHAR(1)

SET @Separator = '_'

SET @v-2 = 'strPersonSurname'

-- SET @v-2 = 'intPersonID'

-- SET @v-2 = 'strAddressLine1'

-- SET @v-2 = 'strAddressLine45'

-- SET @v-2 = 'strLocationOfPDF'

-- SET @v-2 = 'bByte'

-- SET @v-2 = 'dtDateSent'

SET NOCOUNT ON

--Create a table to a row for each character

DECLARE @character TABLE (i INT IDENTITY(1, 1), c char(1), marker tinyint)

INSERT @character SELECT TOP 101 NULL, 0 FROM master.dbo.syscolumns

--Add characters to table

UPDATE @character SET c = SUBSTRING(@v, i, 1)

--Mark the ones which might cause a preceding separator (with a 1)

UPDATE @character SET marker = 1 WHERE ASCII(c) BETWEEN ASCII('A') AND ASCII('Z') OR c LIKE '[0-9]'

--Get the first character we want

DECLARE @FirstCharacterPosition INT

SELECT @FirstCharacterPosition = MIN(i) FROM @character WHERE marker = 1

--Mark the ones we don't want (with null)

UPDATE @character SET marker = NULL WHERE i < @FirstCharacterPosition

--Mark the ones which will cause a preceding separator (with a 2)

UPDATE a SET marker = 2

FROM @character a LEFT OUTER JOIN @character b ON a.i = b.i + 1 AND a.marker = b.marker

WHERE a.marker = 1 and b.i IS NULL AND a.i > @FirstCharacterPosition

--Extract results from original string

DECLARE @s-2 VARCHAR(100)

SET @s-2 = ''

SELECT @s-2 = @s-2 + CASE marker WHEN 2 THEN @Separator ELSE '' END + c FROM @character WHERE marker IS NOT NULL

PRINT @s-2

Ryan Randall

Solutions are easy. Understanding the problem, now, that's the hard part.

Charlottecb Hall of Fame Points: 3932 More actions · Answer 4

Its ok,

I sussed it...

--Mark the ones we don't want (with null)

UPDATE @character SET marker = NULL WHERE (i < (SELECT MIN(i) FROM @character WHERE marker = 1)) or c in('-', '_')

David Burrows SSC Guru Points: 65142 More actions · Answer 5

Just as an exercise thought I'd try this as an exercise

CREATE FUNCTION dbo.udf_Test (@value varchar(100))

RETURNS varchar(100)

AS

BEGIN

DECLARE @result varchar(100)

SET @result = ''

SELECT dbo.udf_Test(REPLACE(REPLACE(@value,'-',' '),'_',' ')

SELECT @result = @result +

CASE WHEN

((ISNUMERIC(SUBSTRING(@value,n.Number,1)) = 1

AND ISNUMERIC(SUBSTRING(@value,n.Number-1,1)) = 0

AND SUBSTRING(@value,n.Number-1,1) <> ' ')

OR (ASCII(SUBSTRING(@value,n.Number,1)) BETWEEN 65 AND 90

AND ASCII(SUBSTRING(@value,n.Number-1,1)) NOT BETWEEN 65 AND 90)

AND SUBSTRING(@value,n.Number-1,1) <> ' ')

THEN ' ' ELSE '' END

+ SUBSTRING(@value,n.Number,1)

FROM (SELECT number FROM master.dbo.spt_values WHERE type = 'P') n

WHERE n.Number BETWEEN 1 AND LEN(@value)

AND CAST(LEFT(@value,n.Number) as varbinary) <> CAST(LOWER(LEFT(@value,n.Number)) as varbinary)

ORDER BY n.Number ASC

RETURN LTRIM(@result)

END

This uses an undocumented table master.dbo.spt_values which should be replaced with a permanent table especially if strings greater than 255 to be used

Far away is close at hand in the images of elsewhere.
Anon.

RyanRandall SSChampion Points: 13623 More actions · Answer 6

LOL - I thought about doing it exactly that way, but ended up going the other way to keep things clear. I guess they're really the same way, but definitely an interesting exercise, and one that I wish I'd done now

Good work!

Ryan Randall

Solutions are easy. Understanding the problem, now, that's the hard part.

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 7

Very interesting problem... thought I'd try the "exercise" a wee bit differently...

This one uses a "Tally" table... If you don't already have one, this is one way to make one...

SELECT TOP 9999

IDENTITY(INT,1,1) AS N

INTO dbo.Tally

FROM Master.dbo.SysColumns sc1,

Master.dbo.SysColumns sc2

ALTER TABLE dbo.Tally

ADD CONSTRAINT PK_Tally_N

PRIMARY KEY CLUSTERED (N)

...and here's the code I came up with for this exercise...

DECLARE @v-2 VARCHAR(7000)

DECLARE @d VARCHAR(10) --Delimiter or separator like (_)

SET @d = '_'

SET @v-2 = 'strPersonSurname'

-- SET @v-2 = 'intPersonID'

-- SET @v-2 = 'strAddressLine1'

-- SET @v-2 = 'strAddressLine45'

-- SET @v-2 = 'strLocationOfPDF'

-- SET @v-2 = 'bByte'

-- SET @v-2 = 'dtDateSent'

-- SET @v-2 = 'alllowercase'

-- SET @v-2 = 'strWith-Hyphen_and_underscore'

-- SET @v-2 = 'FirstLetterCap'

SET NOCOUNT ON

SELECT @v-2 = 'x'+@v

SELECT @v-2 = STUFF(@v,N,0,' ')

FROM dbo.Tally

WHERE N <= LEN(@v)

AND ASCII(SUBSTRING(@v,N,1)) <= ASCII('Z')

AND ASCII(SUBSTRING(@v,N-1,1)) > ASCII('Z')

ORDER BY N Desc

SELECT @v-2 = REPLACE(REPLACE(SUBSTRING(@v,CHARINDEX(' ',@v)+1,LEN(@v)),'-',''),@d,' ')

PRINT @v-2

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

RyanRandall SSChampion Points: 13623 More actions · Answer 8

All 3 methods use a tally/numbers/value table and old value setting trick, but that's definitely the neatest yet! Well done, Jeff...

Ryan Randall

Solutions are easy. Understanding the problem, now, that's the hard part.

Jeff Moden SSC Guru Points: 1004356 More actions · Answer 9

Thanks Ryan... Yep... I should'a said "too"

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)