String Parsing UDF

  • Hi All,

    SQL2000

    I am trying to create a UDF so it can be called in-line to parse field names into more friendly names.

    Basically it need to strip off the first few chars then seperate any capitals with a space unless the capitals have a capital before them and like wise with numbers.

    I would like it to convert as follows...

    strPersonSurname.... Person Surname

    intPersonID... Person ID

    strAddressLine1... Address Line 1

    strAddressLine45... Address Line 45

    strLocationOfPDF... Location Of PDF

    bByte... Byte

    dtDateSent... Date Sent

    Can some one help please as its driving me mad!!!

    Thanks

    CCB

     

  • Hi Charlotte,

    This works for all your examples. Just run it and see what you think...

    --Input examples

    DECLARE @v-2 VARCHAR(100)

     SET @v-2 = 'strPersonSurname'

    -- SET @v-2 = 'intPersonID'

    -- SET @v-2 = 'strAddressLine1'

    -- SET @v-2 = 'strAddressLine45'

    -- SET @v-2 = 'strLocationOfPDF'

    -- SET @v-2 = 'bByte'

    -- SET @v-2 = 'dtDateSent'

    SET NOCOUNT ON

    --Create a table to a row for each character

    DECLARE @character TABLE (i INT IDENTITY(1, 1), c char(1), marker tinyint)

    INSERT @character SELECT TOP 101 NULL, 0 FROM master.dbo.syscolumns

    --Add characters to table

    UPDATE @character SET c = SUBSTRING(@v, i, 1)

    --Mark the ones which might cause a preceding space (with a 1)

    UPDATE @character SET marker = 1 WHERE ASCII(c) BETWEEN ASCII('A') AND ASCII('Z') OR c LIKE '[0-9]'

    --Mark the ones we don't want (with null)

    UPDATE @character SET marker = NULL WHERE i < (SELECT MIN(i) FROM @character WHERE marker = 1)

    --Mark the ones which will cause a preceding space (with a 2)

    UPDATE a SET marker = 2

    FROM @character a LEFT OUTER JOIN @character b ON a.i = b.i + 1 AND a.marker = b.marker

    WHERE a.marker = 1 and b.i IS NULL

    --Extract results from original string

    DECLARE @s-2 VARCHAR(100)

    SET @s-2 = ''

    SELECT @s-2 = @s-2 + CASE marker WHEN 2 THEN ' ' ELSE '' END + c FROM @character WHERE marker IS NOT NULL

    PRINT LTRIM(@s)

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Ryan

    Thats works really well. Can I ask one final thing.

    How would I extend it to replace - and _ with spaces.

    Thanks very much

    CCB

  • No worries

    I've extended the code to deal with different separators and I've tweaked it slightly. If you want to make specific replacements at various points, just use the REPLACE function where you need it.

    --Input examples

    DECLARE @v-2 VARCHAR(100)

    DECLARE @Separator VARCHAR(1)

    SET @Separator = '_'

     SET @v-2 = 'strPersonSurname'

    -- SET @v-2 = 'intPersonID'

    -- SET @v-2 = 'strAddressLine1'

    -- SET @v-2 = 'strAddressLine45'

    -- SET @v-2 = 'strLocationOfPDF'

    -- SET @v-2 = 'bByte'

    -- SET @v-2 = 'dtDateSent'

    SET NOCOUNT ON

    --Create a table to a row for each character

    DECLARE @character TABLE (i INT IDENTITY(1, 1), c char(1), marker tinyint)

    INSERT @character SELECT TOP 101 NULL, 0 FROM master.dbo.syscolumns

    --Add characters to table

    UPDATE @character SET c = SUBSTRING(@v, i, 1)

    --Mark the ones which might cause a preceding separator (with a 1)

    UPDATE @character SET marker = 1 WHERE ASCII(c) BETWEEN ASCII('A') AND ASCII('Z') OR c LIKE '[0-9]'

    --Get the first character we want

    DECLARE @FirstCharacterPosition INT

    SELECT @FirstCharacterPosition = MIN(i) FROM @character WHERE marker = 1

    --Mark the ones we don't want (with null)

    UPDATE @character SET marker = NULL WHERE i < @FirstCharacterPosition

    --Mark the ones which will cause a preceding separator (with a 2)

    UPDATE a SET marker = 2

    FROM @character a LEFT OUTER JOIN @character b ON a.i = b.i + 1 AND a.marker = b.marker

    WHERE a.marker = 1 and b.i IS NULL AND a.i > @FirstCharacterPosition

    --Extract results from original string

    DECLARE @s-2 VARCHAR(100)

    SET @s-2 = ''

    SELECT @s-2 = @s-2 + CASE marker WHEN 2 THEN @Separator ELSE '' END + c FROM @character WHERE marker IS NOT NULL

    PRINT @s-2

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Its ok,

    I sussed it...

    --Mark the ones we don't want (with null)

    UPDATE @character SET marker = NULL WHERE (i < (SELECT MIN(i) FROM @character WHERE marker = 1)) or c in('-', '_')

  • Just as an exercise thought I'd try this as an exercise

    CREATE FUNCTION dbo.udf_Test (@value varchar(100))

    RETURNS varchar(100)

    AS

    BEGIN

      DECLARE @result varchar(100)

      SET @result = ''

      SELECT dbo.udf_Test(REPLACE(REPLACE(@value,'-',' '),'_',' ')

      SELECT @result = @result +

        CASE WHEN

               ((ISNUMERIC(SUBSTRING(@value,n.Number,1)) = 1

                AND ISNUMERIC(SUBSTRING(@value,n.Number-1,1)) = 0

                AND SUBSTRING(@value,n.Number-1,1) <> ' ')

             OR (ASCII(SUBSTRING(@value,n.Number,1)) BETWEEN 65 AND 90

                AND ASCII(SUBSTRING(@value,n.Number-1,1)) NOT BETWEEN 65 AND 90)

                AND SUBSTRING(@value,n.Number-1,1) <> ' ')

        THEN ' ' ELSE '' END

        + SUBSTRING(@value,n.Number,1)

      FROM (SELECT number FROM master.dbo.spt_values WHERE type = 'P') n

      WHERE n.Number BETWEEN 1 AND LEN(@value)

      AND CAST(LEFT(@value,n.Number) as varbinary) <> CAST(LOWER(LEFT(@value,n.Number)) as varbinary)

      ORDER BY n.Number ASC

      RETURN LTRIM(@result)

    END

    This uses an undocumented table master.dbo.spt_values which should be replaced with a permanent table especially if strings greater than 255 to be used

    Far away is close at hand in the images of elsewhere.
    Anon.

  • LOL - I thought about doing it exactly that way, but ended up going the other way to keep things clear. I guess they're really the same way, but definitely an interesting exercise, and one that I wish I'd done now

    Good work!

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Very interesting problem... thought I'd try the "exercise" a wee bit differently...

    This one uses a "Tally" table... If you don't already have one, this is one way to make one...

     SELECT TOP 9999

            IDENTITY(INT,1,1) AS N

       INTO dbo.Tally

       FROM Master.dbo.SysColumns sc1,

            Master.dbo.SysColumns sc2

      ALTER TABLE dbo.Tally

            ADD CONSTRAINT PK_Tally_N

                PRIMARY KEY CLUSTERED (N)

    ...and here's the code I came up with for this exercise...

    DECLARE @v-2 VARCHAR(7000)

    DECLARE @d VARCHAR(10)  --Delimiter or separator like (_)

    SET @d = '_'

     SET @v-2 = 'strPersonSurname'

    -- SET @v-2 = 'intPersonID'

    -- SET @v-2 = 'strAddressLine1'

    -- SET @v-2 = 'strAddressLine45'

    -- SET @v-2 = 'strLocationOfPDF'

    -- SET @v-2 = 'bByte'

    -- SET @v-2 = 'dtDateSent'

    -- SET @v-2 = 'alllowercase'

    -- SET @v-2 = 'strWith-Hyphen_and_underscore'

    -- SET @v-2 = 'FirstLetterCap'

    SET NOCOUNT ON

     SELECT @v-2 = 'x'+@v

     SELECT @v-2 = STUFF(@v,N,0,' ')

       FROM dbo.Tally

      WHERE N <= LEN(@v)

        AND ASCII(SUBSTRING(@v,N,1))  <= ASCII('Z')

        AND ASCII(SUBSTRING(@v,N-1,1)) > ASCII('Z')

      ORDER BY N Desc

     SELECT @v-2 = REPLACE(REPLACE(SUBSTRING(@v,CHARINDEX(' ',@v)+1,LEN(@v)),'-',''),@d,' ')

    PRINT @v-2

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • All 3 methods use a tally/numbers/value table and old value setting trick, but that's definitely the neatest yet! Well done, Jeff...

    Ryan Randall

    Solutions are easy. Understanding the problem, now, that's the hard part.

  • Thanks Ryan... Yep... I should'a said "too"

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply