Trimming unwanted characters from string

  • I have a column with 'name/dept' or 'name-dept', such as 'jones/1234' or 'jones-1234'.

    Is there a function or functions I can use in SQL Server 2005 to give me just the name? 'Jones'.

    Obviously the name is different on records, like Jones or McDonald.

  • karthur (4/9/2009)


    I have a column with 'name/dept' or 'name-dept', such as 'jones/1234' or 'jones-1234'.

    Is there a function or functions I can use in SQL Server 2005 to give me just the name? 'Jones'.

    Obviously the name is different on records, like Jones or McDonald.

    No, but you could roll your own. Is the delimiter between the name and the unwanted bit consistent, either "/" or "-"?

    How many of these "-" can you have in any one value?

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • For a one off you can use a combination of PATINDEX and LEFT for a quick and dirty way of trimming out the names you need. As below, the assumptions are the name will always be at the beginning of the string and the seperating char will always be either '-' or '/':

    --Create test table

    create table #test (NameTrim varchar(20))

    --Populate with gumpf

    insert into #test

    select 'jones/1234' union all

    select 'Jones-1234' union all

    select 'McDonald/56780' union all

    select 'Goddard-098765'

    --Extract required info

    select

    NameTrim,

    case when PATINDEX('%-%', NameTrim) > 0

    then left(NameTrim,PATINDEX('%-%', NameTrim)-1)

    when PATINDEX('%/%', NameTrim) > 0

    then left(NameTrim,PATINDEX('%/%', NameTrim)-1)

    else NameTrim

    end as NameTrimmed

    from #test

    However, if (more likely) this is something that you will need to do often, just create a UDF you pass the string and character to (or possible characters if you don't know at that stage) which returns the trimmed out portion. Hope that makes sense.

    --------
    [font="Tahoma"]I love deadlines. I like the whooshing sound they make as they fly by. -Douglas Adams[/font]

  • Hi

    @rob-2

    Thanks for sample data! πŸ˜‰

    Try this:

    DECLARE @test-2 table (NameTrim varchar(20))

    ---Populate with gumpf

    insert into @test-2

    select 'jones/1234' union all

    select 'Jones-1234' union all

    select 'McDonald/56780' union all

    select 'Goddard-098765'

    SELECT LEFT(NameTrim, PATINDEX('%[0-9_/-]%', NameTrim) - 1)

    FROM @test-2

    Greets

    Flo

  • also check SSC (this site) for scripts "split function".

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution πŸ˜€

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Nice Flo, much neater! Think I need to go read up on pattern matching... πŸ˜‰

    Both our examples fall foul of this set though (mine on the later slash in HaveNow and the other on NoSplit)

    --Create the test table

    declare @test-2 table (NameTrim varchar(20))

    --Insert gumpf and rough data

    insert into @test-2

    select 'jones/1234' union all

    select 'Jones-1234' union all

    select 'McDonald/56780' union all

    select 'Goddard-098765' union all

    select 'Gotcha?-1214/34' union all

    select 'HaveNow!/7896-7' union all

    select 'NoSplit'

    Just for SaG's, this function will return all characters of a string up to the point where they're no longer alpha.

    create function ufnTrimName (@Untrimmed varchar(50))

    returns varchar(50)

    as

    begin

    declare @Trimmed varchar(50)

    declare @CharIndex int

    set @Trimmed = ''

    set @CharIndex = 1

    while @CharIndex <= len(@Untrimmed)

    begin

    if ascii(lower(substring(@Untrimmed,@CharIndex,1))) between 97 and 122

    set @Trimmed = @Trimmed + substring(@Untrimmed,@CharIndex,1)

    else

    set @CharIndex = len(@Untrimmed)

    set @CharIndex = @CharIndex + 1

    end

    return @Trimmed

    end

    like I say, just messing around. The most efficient way is Flo's and making sure the data isn't dirty in the first place.

    --------
    [font="Tahoma"]I love deadlines. I like the whooshing sound they make as they fly by. -Douglas Adams[/font]

  • Don't forget a tally-table version!

    DECLARE @Untrimmed varchar(50), @Trimmed varchar(50)

    SET @Untrimmed = 'HaveNow!/7896-7'

    SET @Untrimmed = 'HaveNow'

    --

    SELECT @Trimmed = LEFT(@Untrimmed, (SELECT ISNULL(MIN(n.number)-1, LEN(@Untrimmed))

    FROM Numbers n

    WHERE n.number <= LEN(@Untrimmed)

    AND NOT ascii(lower(substring(@Untrimmed, n.number,1))) between 97 and 122))

    --

    SELECT @Trimmed

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Rob Goddard (4/9/2009)


    Nice Flo, much neater! Think I need to go read up on pattern matching... πŸ˜‰

    I'm also learning every day here... πŸ˜‰

    while @CharIndex <= len(@Untrimmed)

    I would try to avoid a loop. See below solution.

    Both our examples fall foul of this set though (mine on the later slash in HaveNow and the other on NoSplit)

    Thanks for the hint! So I think the most simple and efficient solution would be to just reverse the check from special characters to A-Z:

    DECLARE @test-2 table (NameTrim varchar(20))

    ---Populate with gumpf

    insert into @test-2

    select 'jones/1234' union all

    select 'Jones-1234' union all

    select 'McDonald/56780' union all

    select 'Goddard-098765' union all

    select 'Gotcha?-1214/34' union all

    select 'HaveNow!/7896-7' union all

    select 'NoSplit'

    SELECT

    NameTrim,

    CASE WHEN PATINDEX('%[^A-Z]%', NameTrim) != 0

    THEN LEFT(NameTrim, PATINDEX('%[^A-Z]%', NameTrim) - 1)

    ELSE NameTrim END Trimmed

    FROM @test-2

    Greets

    Flo

  • thanks for the suggestions. I'll give this a try.

  • Chris Morris (4/9/2009)


    Don't forget a tally-table version!

    DECLARE @Untrimmed varchar(50), @Trimmed varchar(50)

    SET @Untrimmed = 'HaveNow!/7896-7'

    SET @Untrimmed = 'HaveNow'

    --

    SELECT @Trimmed = LEFT(@Untrimmed, (SELECT ISNULL(MIN(n.number)-1, LEN(@Untrimmed))

    FROM Numbers n

    WHERE n.number <= LEN(@Untrimmed)

    AND NOT ascii(lower(substring(@Untrimmed, n.number,1))) between 97 and 122))

    --

    SELECT @Trimmed

    Hi Chris

    I'm not sure if the Tally is needed in this case. I didn't try, but I think the simple PATINDEX check I previously posted might be faster. Please correct me if I'm wrong.

    Greets

    Flo

  • You're totally right Flo, it's overkill, all you need is patindex. It was just for fun.

    Try this:

    ISNULL(NULLIF(PATINDEX('%[^A-Z]%', NameTrim),0), LEN(NameTrim))

    which means: if patindex returns 0, replace it with null, so you can then use isnull to switch between [the return value of patindex] and [some other value]

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris Morris (4/9/2009)


    You're totally right Flo, it's overkill, all you need is patindex. It was just for fun.

    Always a good reason for a post! πŸ˜‰

    Try this:

    ISNULL(NULLIF(PATINDEX('%[^A-Z]%', NameTrim),0), LEN(NameTrim))

    which means: if patindex returns 0, replace it with null, so you can then use isnull to switch between [the return value of patindex] and [some other value]

    So finally your solution seems to be the best. I just added the LEFT to get the requested result:

    LEFT(NameTrim, ISNULL(NULLIF(PATINDEX('%[^A-Z]%', NameTrim), 0) - 1, LEN(NameTrim)))

    Greets

    Flo

  • Florian Reischl (4/9/2009)


    So I think the most simple and efficient solution would be

    Oh wow, so simple... :w00t:

    (Obviously k.i.s.s. has stopped ringing in my ears for some reason)

    I was looking at the loop and couldn't see how to do it without either. So I'll now be spending some coffee time this avo not just playing with pattern matching but looking at the merits of a tally table as well.. Thanks guys!

    --------
    [font="Tahoma"]I love deadlines. I like the whooshing sound they make as they fly by. -Douglas Adams[/font]

  • Does your data contain (or will it potentially contain) hyphenated names? This could pose a problem.

  • try this code...

    suppose you have the table 'tblNames' with a column 'name'

    select left(name,charindex('-', name)-1)

    from tblNames union all

    select left(name,charindex('/', name)-1)

    from tblNames

    havent tried it... i'll try this code at home...

    but i think it should work..

    br

    jon

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply